[If this post seems a bit more basic than my usual in-the-weeds blather, it’s because this is taken from a CLE article I wrote for an upcoming panel discussion on “E-Discovery on a Budget.”  I’m particularly pleased with tips 7 and 8, and hope you’ll please share some of your own tips as comments.]

This really happened:
Opposing counsel supplied an affidavit stating it would take thirteen years to review 33 months of e-mail traffic for thirteen people.  Counsel averred there would be about 950,000 messages and attachments after keyword filtering.  Working all day, every day reviewing 40 documents per hour, they expected first level review to wrap up in 23,750 hours.  A more deliberate second level review of 10-15% of the items would require an additional two years.  Finally, counsel projected another year to prepare a privilege log.  Cost: millions of dollars.

The arithmetic was unassailable, and a partner in a prestigious law firm swore to its truth under oath.
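For anyone who wants to retrace the affidavit's numbers, here is a minimal sketch of the arithmetic. The item count, review rate and second-level percentages are the figures quoted above. The 2,000-hour work year in the example call is purely an illustrative assumption, since the affidavit's staffing assumptions aren't reproduced here.

```python
# Figures quoted from the affidavit described above.
ITEMS = 950_000        # messages and attachments after keyword filtering
DOCS_PER_HOUR = 40     # first-level review rate

# First-level review: every item, at 40 documents per hour.
first_level_hours = ITEMS / DOCS_PER_HOUR

# Second-level review was projected to cover 10-15% of the items.
second_level_items = (ITEMS * 10 // 100, ITEMS * 15 // 100)


def years_of_review(hours: float, hours_per_year: float) -> float:
    """Convert review hours into years for a single reviewer."""
    return hours / hours_per_year


print(f"First-level review: {first_level_hours:,.0f} hours")
print(f"Second-level items: {second_level_items[0]:,} to {second_level_items[1]:,}")

# ASSUMPTION: a 2,000-hour work year, chosen only for illustration.
print(f"At 2,000 hours/year: {years_of_review(first_level_hours, 2_000):.1f} years")
```

Change the hours-per-year figure and the timeline moves, but the 23,750-hour total does not. That total is the product of reading every item by hand.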

This could have happened:
On Monday afternoon, an associate attached a hard drive holding 33 months of e-mail for thirteen custodians to the USB port of her computer and headed home.   Overnight, e-discovery review software churned through the messages and attachments indexing their contents for search and de-duplicating redundant data.  The next morning, the associate identified responsive documents using keywords and concept clustering.  She learned the lingo, mastered the acronyms and identified common misspellings.  She found large swaths of irrelevant data that could be safely eliminated from the collection and began segregating responsive and non-responsive items.  By lunchtime on Wednesday, the software started asking whether particular items were responsive.  Before she called it a day, the associate ceded much of the heavy lifting to the program’s technology-assisted review capabilities and shifted her attention to searching for lawyers’ names and e-mail domains to flag privileged communications.  She spent Thursday afternoon sampling items the computer identified as non-responsive to be assured of the quality of review.  Before she called it a day, the associate tasked the software to generate a production set and a privilege log for partner review on Friday and wondered if it might be a good weekend to head to the beach.  Cost: 40 associate hours.

These two scenarios illustrate the gross disparity in review costs and time between lawyers who approach e-discovery in ignorance and those who do so with skill.  The Luddite lawyer who knows nothing of modern methods misleads the court and cheats the client.  The adept associate proves that e-discovery is fast and affordable when the right tools and talents are brought to bear.  Electronically stored information (ESI) serves us in all our day-to-day endeavors.  ESI can and should serve us just as well in our search for probative evidence and in the resolution of disputes.

You Must Make It Happen
Finding efficiencies and avoiding dumb decisions in electronic discovery isn’t someone else’s responsibility.  It’s yours.  If someone else must perennially whisper in your ear, articulating the issues and answering the questions you should be competent to address, you aren’t serving your client.

ESI isn’t going away, nor will it wane in quantity, variety or importance as evidence.  Each day you fail to hone your e-discovery skills is a day closer to losing a case or losing a client.  Each day you learn something new about ESI and better appreciate how to request, find, cull, review and produce it at lowest cost is a day that cements your worth to your clients and makes you a more effective counselor and advocate.

Eight Tips to Quash the Cost of E-Discovery
The following tips are offered to help you slash the outsize cost of e-discovery:

  1. Eliminate Waste
  2. Reduce Redundancy and Fragmentation
  3. Don’t Convert ESI
  4. Review Rationally
  5. Test Your Methods and Know Your ESI
  6. Use Good Tools
  7. Communicate and Cooperate
  8. Price is What the Seller Accepts

1.    Eliminate Waste
I once polled thought leaders in electronic discovery about costs.  They uniformly agreed that about half of every e-discovery dollar is expended unnecessarily as a consequence of counsel lacking competence with respect to ESI.  Half was kind.

Every time you over-preserve or over-collect ESI, every time you convert native data to alternate forms or fail to deduplicate ESI before review and every time you otherwise review information that didn’t warrant “eyes on,” you add cost without benefitting your client.  It’s money wasted.  Poor e-discovery choices tend to be driven by irrational fears, and irrational fears flow from lack of familiarity with systems, tools and techniques that achieve better outcomes at lower cost.  The consequences of poor e-discovery decisions prompt motions to compel or for sanctions, further ratcheting up the cost of incompetence.

2.    Reduce Redundancy and Fragmentation
Many complain that electronic discovery has made litigation more costly because there is so much more information available today.  Certainly, there are more channels of information available today, allowing an enlightened advocate more probative evidence.  Much of what evaporated as a phone conversation now endures as a writing.  There is more temporal, photographic and geolocation data to draw on, and more “persons with knowledge of relevant facts” who are privy to revealing information.

Despite there being more, the increase doesn’t reflect the dire exponential leap in data volume some suggest.   Much of the growth is attributable to replication and fragmentation.  Put simply, human beings don’t create that much more unique information; they mostly make more copies of the same information and break it into smaller pieces.  Yesterday’s memo sent to three people is today’s 30-message thread sent to the whole department and retrieved on multiple devices.  These iterations add a lot to the quantity of ESI, but little in the way of truly unique evidence.  Thus, the burden and cost of e-discovery fall as a litigant’s ability to reduce redundancy and fragmentation rises.  There are many ways to minimize redundancy and fragmentation.  Some entail sensible choices during identification and collection; others involve the application of tools and techniques geared to eliminating replication and organizing fragmented information for efficient review.

Anyone who has done a document review can attest to the tedium of seeing the same documents over and over again.  Messages repeat within threads or across recipients, and attachments to messages mirror documents from file servers.  Some of this can be readily eliminated by simple hash-based deduplication, which costs very little and reliably eliminates documents that are duplicates in all respects.  Hash-based deduplication calculates a “digital fingerprint” (a hash value, typically computed with the MD5 or SHA-1 algorithm) for each document, allowing redundant documents to be excluded from review.
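To make the "digital fingerprint" idea concrete, here is a minimal sketch using Python's standard hashlib module. The file names and contents are hypothetical, and the collection is held in memory only to keep the example short; a real tool would stream files from disk and log what it suppressed.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-1 hex digest of a document's raw bytes."""
    return hashlib.sha1(data).hexdigest()


def deduplicate(documents: dict[str, bytes]) -> tuple[dict[str, str], dict[str, str]]:
    """Separate unique documents from exact duplicates.

    Returns (unique, duplicates):
      unique     maps hash -> name of the first document seen with that hash
      duplicates maps suppressed name -> name of the retained copy
    """
    unique: dict[str, str] = {}
    duplicates: dict[str, str] = {}
    for name, data in documents.items():
        digest = fingerprint(data)
        if digest in unique:
            duplicates[name] = unique[digest]   # identical in every byte
        else:
            unique[digest] = name
    return unique, duplicates


# Hypothetical collection: two identical memos and one with a one-character change.
collection = {
    "custodian_a/memo.txt": b"Q3 forecast attached.",
    "custodian_b/memo_copy.txt": b"Q3 forecast attached.",
    "custodian_b/memo_v2.txt": b"Q3 forecast attached!",
}
unique, duplicates = deduplicate(collection)
print(f"{len(unique)} unique, {len(duplicates)} suppressed")
```

Note that the one-character variant survives as a unique document. A hash only matches items that are identical in all respects.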

Nothing offers a more cost-effective means to reduce the cost of document review than deduplication; consequently, no one should undertake a document review without minimally running a simple hash-based deduplication to eliminate replication.

Unfortunately, simple hash-based deduplication doesn’t work for e-mail messages (which necessarily reflect different routing information for different recipients) or for documents with minor variations that don’t signify material differences in content.  For these items, more advanced near-deduplication techniques are needed to eliminate redundancy without increasing the risk that unique documents will be overlooked.
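One common workaround for e-mail, sketched below, is to hash only the fields that define the message and ignore routing headers. This is an illustration under stated assumptions, not a description of how any particular product works. The choice of fields is a policy decision assumed for the example, the sample messages are hypothetical, and the sketch handles only simple single-part text messages.

```python
import hashlib
from email import message_from_string

# ASSUMPTION: these fields define "the same message" for this illustration.
FIELDS = ("From", "To", "Cc", "Subject", "Date")


def message_key(raw: str) -> str:
    """Hash selected headers plus the body, ignoring routing headers."""
    msg = message_from_string(raw)
    parts = [" ".join((msg.get(name) or "").split()).lower() for name in FIELDS]
    body = msg.get_payload()
    if not isinstance(body, str):
        raise ValueError("sketch supports single-part messages only")
    parts.append(" ".join(body.split()))      # normalize whitespace in the body
    return hashlib.sha1("\n".join(parts).encode("utf-8")).hexdigest()


COMMON = (
    "From: Pat <pat@example.com>\n"
    "To: lee@example.com, sam@example.com\n"
    "Subject: Budget\n"
    "Date: Mon, 01 Jan 2024 09:00:00 -0600\n"
    "\n"
    "Numbers attached.\n"
)
# The same message as delivered to two recipients: only routing headers differ.
copy_lee = "Received: from mx1.example.com\nDelivered-To: lee@example.com\n" + COMMON
copy_sam = "Received: from mx2.example.com\nDelivered-To: sam@example.com\n" + COMMON

whole_file_match = (hashlib.sha1(copy_lee.encode()).hexdigest()
                    == hashlib.sha1(copy_sam.encode()).hexdigest())
field_match = message_key(copy_lee) == message_key(copy_sam)
print(f"whole-file hashes match: {whole_file_match}; field hashes match: {field_match}")
```

The whole-file hashes differ because of the routing lines, while the field-based keys agree, so only one copy needs to be reviewed.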

Deduplication is a mechanical process requiring little, if any, human intervention or costly programming.  Accordingly, its cost should always be a nominal component of an e-discovery effort.  If a service provider attempts to charge princely sums for deduplication, consider it a sign that it’s time to find a new vendor.  When the volume of information to be deduplicated is modest (e.g., less than 10-15 GB), low-cost tools are available to deduplicate without the need to engage a service provider.[1]

3.    Don’t Convert ESI
It’s criminal how much money is wasted converting electronic information into paper-like forms just so lawyers don’t have to update workflows or adopt contemporary review tools.  Clients work with native forms of ESI because native forms are the most utile, complete and efficient forms in which to store and access data.  Clients don’t print their e-mail before reading it.  Clients don’t emboss a document’s name on every page.  Clients communicate and collaborate using tracked changes and embedded comments, yet many lawyers intentionally or unwittingly purge these changes and comments in e-discovery and fail to disclose such redaction.  They do it by converting native forms to images, like TIFF.

Converting a client’s ESI from its natural state as kept “in its ordinary course of business” to TIFF images injects needless expense in at least half a dozen ways.  First, you must pay someone to convert native forms to TIFF images and emboss Bates numbers.  Second, you must pay someone to generate load files containing extracted text and application metadata from the native ESI.  Third, you must produce multiple copies of certain documents (like spreadsheets) that are virtually incapable of being produced as TIFF images.  Fourth, because TIFF images paired with load files are much “fatter” files than their native counterparts, you pay much more for vendors to ingest and host them by the gigabyte.  Fifth, it’s very difficult to reliably deduplicate documents once they have been converted to TIFF images.  Sixth, you may have to reproduce everything when your opponent wises up to the fact that you’ve substituted cumbersome TIFF images and load files for the genuine, efficient evidence.

4.    Review Rationally
Recently, an opponent advised the Court that their projected cost of review encompassed the obligation to look at every e-mail attachment when the body of the e-mail message contained a keyword hit, even when none of the attachments contained a hit.  They made this representation knowing that the majority of the hits would prove to be noise hits, that is, keywords in a context that doesn’t denote responsiveness.  Why would a party incur the expense to review the attachments to a message they’d determined was non-responsive when the attachments contained no keywords?  It turned out they had separated attachments from e-mail transmittals, surrendering the ability to know which attachments could be eliminated from review because the transmitting message was eliminated from review.  That’s not a rational approach to review.
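The point about keeping attachments tied to their transmittals can be sketched in a few lines. The record layout below (an `Item` with a `parent_id`) is hypothetical and far simpler than a review platform's database, but it shows why preserving the family relationship matters: without `parent_id`, the function has nothing to work with.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    item_id: str
    has_keyword_hit: bool
    parent_id: Optional[str] = None   # None for a transmitting message


def attachments_to_skip(items: list[Item], nonresponsive_parents: set[str]) -> list[str]:
    """Attachments with no keyword hit whose transmitting message
    was already reviewed and found non-responsive."""
    return [
        item.item_id
        for item in items
        if item.parent_id in nonresponsive_parents and not item.has_keyword_hit
    ]


# Hypothetical families: msg1 was a noise hit, msg2 is still under review.
items = [
    Item("msg1", has_keyword_hit=True),
    Item("msg1-att-a", has_keyword_hit=False, parent_id="msg1"),
    Item("msg1-att-b", has_keyword_hit=True, parent_id="msg1"),
    Item("msg2", has_keyword_hit=True),
    Item("msg2-att-a", has_keyword_hit=False, parent_id="msg2"),
]
skipped = attachments_to_skip(items, nonresponsive_parents={"msg1"})
print(skipped)
```

The attachment with its own keyword hit stays in the review queue. Only the attachment with no hit, riding on a message already judged non-responsive, drops out.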

A common irrational approach to review is to treat information in any form from any source as requiring privilege review when even a dollop of thought would make clear that not all forms or sources of ESI are created equal when it comes to their potential to hold privileged content.  The cost of review accounts for anywhere from 60% to 90% of the total cost of e-discovery, so anything that defensibly narrows the scope of review prompts maximum savings.  Almost any time you can use technology to isolate privileged content and prudently employ a clawback agreement or Federal Rule of Evidence 502 to guard against inadvertent disclosure, you can slash the cost of privilege review.
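One simple way to use technology to isolate potentially privileged content is the approach the associate took in the opening scenario: search for lawyers' names and e-mail domains. Below is a rough sketch. The domain and the attorney name are hypothetical placeholders, and a flag here means only "route to privilege review," not a privilege determination.

```python
import re

# HYPOTHETICAL values for illustration; a real list comes from the client.
COUNSEL_DOMAINS = {"lawfirm.example"}
COUNSEL_NAMES = ("jane roe",)

# Captures the domain portion of anything that looks like an e-mail address.
DOMAIN_PATTERN = re.compile(r"@([A-Za-z0-9.-]+)")


def flag_for_privilege_review(headers: str, body: str) -> bool:
    """True if the message involves a counsel domain or names a listed lawyer."""
    domains = {d.lower().rstrip(".") for d in DOMAIN_PATTERN.findall(headers)}
    if domains & COUNSEL_DOMAINS:
        return True
    text = f"{headers}\n{body}".lower()
    return any(name in text for name in COUNSEL_NAMES)


by_domain = flag_for_privilege_review(
    "From: ceo@client.example\nTo: Jane <jroe@lawfirm.example>", "See attached.")
by_name = flag_for_privilege_review(
    "From: ceo@client.example\nTo: cfo@client.example", "Jane Roe says we should wait.")
neither = flag_for_privilege_review(
    "From: ceo@client.example\nTo: cfo@client.example", "Lunch on Friday?")
print(by_domain, by_name, neither)
```

Items that are not flagged are the ones where a clawback agreement or Rule 502 order does the protective work, which is what lets you skip exhaustive eyes-on privilege review.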

5.    Test your Methods and Know your ESI
Staggering sums are spent in e-discovery to collect and review data that would never have been collected if only someone had run a small scale test before deploying an enterprise search.  It’s easy and inexpensive to test proposed searches against representative samples of data (e.g., one key custodian’s mailbox) so as to identify outcomes that will unduly drive up the cost of ingestion, hosting and review.  This entails more than simply eliminating queries with large numbers of hits; it requires modifying them to balance the incidence of noise hits against hits on responsive data.
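A small-scale test of this kind can be as simple as the sketch below. It assumes a reviewer has already marked each item in a small sample as responsive or not. The sample text, the labels and the search terms are all invented for illustration.

```python
def evaluate_terms(sample: list[tuple[str, bool]], terms: list[str]) -> dict[str, dict]:
    """For each term, count hits in the sample and how many hits were responsive.

    sample is a list of (text, is_responsive) pairs labeled by a reviewer.
    """
    report = {}
    for term in terms:
        needle = term.lower()
        labels = [responsive for text, responsive in sample if needle in text.lower()]
        responsive_hits = sum(labels)
        report[term] = {
            "hits": len(labels),
            "responsive": responsive_hits,
            "noise": len(labels) - responsive_hits,
            # Share of hits that were responsive; 0.0 when the term never hit.
            "precision": responsive_hits / len(labels) if labels else 0.0,
        }
    return report


# Hypothetical sample drawn from one key custodian's mailbox.
sample = [
    ("Please review the merger agreement draft", True),
    ("Merger of the two softball teams this Friday", False),
    ("Project Falcon merger timeline attached", True),
    ("Lunch order for Friday", False),
    ("Falcon term sheet comments", True),
]
report = evaluate_terms(sample, ["merger", "falcon", "friday"])
for term, stats in report.items():
    print(f"{term:8} hits={stats['hits']} noise={stats['noise']} "
          f"precision={stats['precision']:.2f}")
```

A term that yields nothing but noise in the sample is a candidate for revision before it is run across the enterprise. A term with few hits but high precision may deserve to be broadened.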

A lot of money gets wasted in e-discovery over disputes that could be quickly resolved if someone simply knew more about the ESI, i.e., if someone simply looked.  Here again, the software and file types used, the nature and configuration of the e-mail system, the retention scheme for backup media and whether a key custodian used a home system for business are all examples of information that can facilitate decisions that narrow the scope of collection and review, with consequent cost savings.

6.    Use Good Tools
If you needed to dig a big hole, you wouldn’t use a teaspoon, nor would you hire a hundred people with teaspoons.  You’d use the right power tool and a skilled operator.

You can’t efficiently collect or review ESI without using good tools.  Anyone engaging in e-discovery should be able to answer the question, “What’s your review platform?”  They should be able to articulate why they use one review platform over another, and “because we already owned a copy” is not the best reason.

A review platform is the software tool used to index, sort, search, view, organize and tag ESI.  Choosing the right review platform for your practice requires understanding your workflow, personnel, search needs and forms in which ESI will be ingested and produced.  Review platforms can be cost-prohibitive for some practitioners, but it’s untenable to manage ESI in discovery without a capable review platform.

There are many review platforms on the market, including familiar names like Relativity, Concordance and Summation.  There are also Internet-accessible “hosted” review environments and many proprietary review tools touting more bells and whistles than a Mardi Gras parade.   Among the most important considerations in selecting a review platform is its ability to accept data in forms that do not require costly conversion to TIFF images.  Additionally, you may want the platform you select to support the most advanced forms of technology-assisted search and review that your budget allows, including predictive coding capabilities.

7.    Communicate and Cooperate
Poor communication and lack of cooperation between parties on e-discovery issues contribute markedly to increased cost.  The incentives driving transparency and cooperation in e-discovery are often misunderstood.  You don’t communicate or cooperate with an opponent to help them win their case on the merits; you do it to permit the case to be resolved on its merits and not be derailed or made more expensive by e-discovery disputes.

Much of the waste in e-discovery grows out of apprehension and uncertainty.  Litigants often over-collect and over-review, preferring to spend more than necessary instead of giving the transparency needed to secure a crucial concession on scope or methodology.

Communication and cooperation in e-discovery are not signs of weakness but of strength.  Cooperation is a means to demonstrate that your client understands its e-discovery obligations and is meeting them.  More, it’s a means to build trust in the scope and methods of discovery so as to forestall challenges that may prove disruptive to the case and the client’s operations.  It’s even possible that your opponent understands e-discovery or your client’s systems better than you do and can propose more efficient ways to scope and complete the effort.  What an opponent will accept in a cooperative give-and-take is often less onerous than what you were planning to produce.

Put simply: the more you seek to hide the ball, the more likely a savvy opponent will dig deeper and find something your side missed.  Because there are no perfect e-discovery efforts, there are none that can withstand the heightened scrutiny invited by shortsighted stonewalling.

Hubris doesn’t help.  Most flaws in e-discovery processes can be rectified quickly and cheaply when they surface early.  An overlooked variant on a keyword or a missed file type is easy to fix at the outset, but can prove costly or irreparable when discovered months or years later.  Moreover, disclosure tends to shift the burden to act.  Courts are loath to entertain belated objections from parties who’d been supplied sufficient information to act promptly.

8.    Price is What the Seller Accepts
I’ve haggled in bazaars and markets from Cairo to Kowloon, but I’ve never seen more pliant pricing than among those hawking e-discovery tools and services in the United States.

A famous/infamous e-discovery vendor once quoted $43.5 million for a six-week engagement processing a very large volume of data on an expedited basis.  The customer was desperate, but not insane.  Rebuffed, the vendor re-quoted the job the next day for several million dollars less.  They “sharpened their pencil” again the next day…and the next.  Before the week was out, the vendor was proposing to do the job for $3.5 million.  They didn’t get the work.

Service providers have to pay staff and keep the lights on. So, almost any work beats no work at all.  Many will accept work that isn’t profitable, if it keeps a competitor from getting the business.  Shop around.  Make an offer.  Only a sucker pays rack rate.

Make yourself sheep and the wolves will eat you. Benjamin Franklin

[1] One of the finest tools for deduplicating collections less than 15GB is called Prooffinder (www.prooffinder.com).  It costs $100.00 for an annual license, and all proceeds from its sale go to support child literacy.