One of the conceits of writing is the perception that when you’ve written on something, it’s behind you. Not that nothing else need be said on the topic, but only that it need not be said by you. That’s silly for a host of reasons. I started writing the print version of Ball in Your Court ten years ago–before the 2006 Federal Rules amendments and before the EDRM. Half my readers weren’t in the field then, and veteran readers surely missed a few missives. Plus, if the point was worth making, perhaps it bears repeating. So, I now revisit columns and posts from the primordial past of e-discovery–starting over as it were, updating and critiquing in places, and hopefully restarting a few conversations. As always, your comments are gratefully solicited.
The DNA of Data
[2005: the very first Ball in Your Court]
Discovery of electronic data compilations has been part of American litigation for two generations, during which time we’ve seen nearly all forms of information migrate to the digital realm. Statisticians posit that only five to seven percent of all information is “born” outside of a computer, and very little of the digitized information ever finds its way to paper. Yet, despite the central role of electronic information in our lives, electronic data discovery (EDD) efforts are either overlooked altogether or pursued in such epic proportions that discovery dethrones the merits as the focal point of the case. At each extreme, lawyers must bear some responsibility for the failure. Few of us have devoted sufficient effort to learning the technology, instead deluding ourselves that we can serve our clients by continuing to focus on the smallest, stalest fraction of the evidence: paper documents. When we do garner a little knowledge, we abuse it like the Sorcerer’s Apprentice, by demanding production of “any and all” electronic data and insisting on preservation efforts sustainable only through operational paralysis. We didn’t know how good we had it when discovery meant only paper.
However, electronic evidence isn’t going away. It’s growing…exponentially, and some electronic evidence items, like databases, spreadsheets, voice mail and video, bear increasingly less resemblance to paper documents. Proposed changes in the rules of procedure wending their way through the system require lawyers to discuss ways to preserve electronic evidence, select formats in which to produce it and manage volumes of information dwarfing the Library of Congress. Litigators must learn it or find a new line of work. Here, I was referring to the 2006 amendments; but, with new proposed amendments in play, it still rings true.
My goal for this column is to help make electronic discovery and computer forensics a little easier to understand, never forgetting that this is exciting, challenging—and very cool—stuff.
Accessible versus Inaccessible
You can’t talk about EDD today without using the “Z” word: Zubulake (pronounced “zoo-boo-lake”). Judge Shira Scheindlin’s opinions in Zubulake v. UBS Warburg, L.L.C., 217 F.R.D. 309 (S.D.N.Y. 2003) triggered a whirlwind of discussion about EDD. Judge Scheindlin cited the “accessibility” of data as the threshold for determining issues of what must be produced and who must bear the cost of production. Accessible data must be preserved, processed and produced at the producing party’s cost, while inaccessible data is available for good cause and may trigger cost shifting.
But what makes data “inaccessible”? Is it a function of the effort and cost required to make sense of the data? If so, do the boundaries shift with the skill and resources of the producing party such that ignorance is rewarded and knowledge penalized? To understand when data is truly inaccessible requires a brief look at the DNA of data.
Computer data is simply a sequence of ones and zeroes. Data is only truly inaccessible when you can’t read the ones and zeroes or figure out where the sequence starts. To better grasp this, imagine you had the unenviable responsibility of typing the complete works of Shakespeare on a machine with only two keys, “A” and “B,” and if you failed, all the great works of the Bard would be lost forever. As you pondered this seemingly impossible task, you’d figure out that you could encode the alphabet using sequences of As and Bs to represent each of the twenty-six capital letters, their lower case counterparts, punctuation and spaces. The uppercase “W” might be “ABABABBB” and the uppercase “S,” “ABABAABB.” Cumbersome, but feasible. Armed with the code and knowing where the sequence begins, a reader can painstakingly reconstruct every lovely foot of iambic pentameter.
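For readers who like to see such things worked out, here is a minimal Python sketch of the two-key scheme. It assumes the code is plain 8-bit ASCII with “A” standing in for zero and “B” for one, which is consistent with the “W” and “S” examples above. The function names encode_ab and decode_ab are mine, purely for illustration.

```python
def encode_ab(text: str) -> str:
    """Type text on the two-key machine: eight keystrokes per character.

    Assumption: each character is its 8-bit ASCII value, with A = 0 and B = 1.
    """
    bits = "".join(format(byte, "08b") for byte in text.encode("ascii"))
    return bits.replace("0", "A").replace("1", "B")


def decode_ab(code: str, start: int = 0) -> str:
    """Read the As and Bs back into text, beginning at keystroke `start`."""
    bits = code[start:].replace("A", "0").replace("B", "1")
    usable = len(bits) - len(bits) % 8  # ignore a trailing partial character
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    # Values that are not valid ASCII show up as replacement characters.
    return data.decode("ascii", errors="replace")


print(encode_ab("W"))                    # ABABABBB
print(encode_ab("S"))                    # ABABAABB
print(decode_ab(encode_ab("To be")))     # To be
```

Calling decode_ab with start=1 on the same keystrokes no longer returns the original text, which is why knowing where the sequence begins matters as much as knowing the code.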
This is just what a computer does when it stores data in ones and zeroes, except computers encode many “alphabets” and work with sequences billions of characters long. Computer data is only “gone” when the media that stores it is obliterated, overwritten or strongly encrypted without a key. This is true for all digital media, including backup tapes and hard drives. But, inaccessibility due to damage, overwriting or encryption is rarely raised as grounds for limiting e-discovery or shifting costs.
Just Another Word for Burdensome?
Frequently, lawyers will couch a claim of undue burden in terms of inaccessibility, arguing that it’s too time-consuming or costly to restore the data. But, burden and inaccessibility are opposite sides of the same coin, and “inaccessibility” adds nothing to the mix but confusion. Arguing both burden and inaccessibility is two bites at the apple.
Worse, there is a risk in branding particular media as “inaccessible.” Parties resisting discovery shouldn’t be relieved of the obligation to demonstrate undue burden simply because evidence resides on a backup tape. We must be vigilant to avoid a reflexive calculus like:
All backup tapes are inaccessible
Inaccessible means undue burden presumed
Good cause showing required for production
Requesting party pays cost of conversion to “accessible” form.
Zubulake put EDD on every litigator’s and corporate counsel’s radar screen and proved invaluable as a provocateur of long-overdue debate about electronic discovery. Still, its accessibility analysis is not a helpful touchstone, especially in a fast-moving field like computing. Codifying it in proposed amendments to F.R.C.P. Rule 26(b)(2) would perpetuate a flawed standard. Even if that occurs, don’t be cowed by the label, “inaccessible,” and don’t shy away from seeking discovery of relevant media just because it’s cited as an example of something inaccessible. Instead, require the producing party to either show that the ones and zeroes can’t be accessed or demonstrate that production entails an undue burden.
A decade later, credible claims of inaccessibility grounded on technical hurdles and balky backup media have largely disappeared, relegated to boilerplate objections. My urging litigants to push back on inaccessibility claims seems quaint now; but in 2005, the prevailing view was that backups were ipso facto inaccessible.
Modern backups tend to be as easy to access and search as active data stores–easier in ways, because they consolidate information across custodians, facilitating collection from a single source and permitting search and deduplication across what would otherwise be separately siloed sources. Moreover, software tools and vendor services make it possible to search, cull, filter, deduplicate and extract compressed information on multivolume tape backup sets without fully restoring same. Finally, companies have gotten smarter in their refusal to retain vast legacy tape collections, and courts have awakened to the fact that tape media kept longer than a few weeks isn’t for disaster recovery; it’s an archive.
Today, we should expect claims of inaccessibility to reappear as we grapple with the smart phones and tablets that so captivate us–these sources are indeed harder and slower to preserve and process than PCs and servers, and encryption renders some content functionally inaccessible. As well, the burgeoning volume of data going to the Cloud may foster the epiphany that data in an online repository is lots easier to populate than to repatriate.
PHILLIP A RODOKANAKIS said:
So Craig, are you saying that the costs of accessing data from outdated tapes or other legacy storage mediums don’t come into play when considering accessibility?
Phil: Cost clearly comes into play respecting the question of proportionality, but the caselaw tends to confine the question of accessibility to technical impediments, not financial ones. This is one reason why I’ve long been troubled by the emphasis placed on accessibility as a touchstone for cost shifting. It’s all just ones and zeroes, but reported cases, the comments to the Rules and secondary sources all tend to treat backup media as the poster child for inaccessible content presumed to be non-discoverable absent good cause shown and cost shifting. The march of technology has shown us that backup media need not be all that costly to include within the scope of discovery, particularly if sound information governance practices have been observed and modern mechanisms put in place. Is a VTL really all that much harder to access and search than a local hard drive? Yet, the former tends to be reflexively deemed “not reasonably accessible” (as a backup) and the latter an archetype of accessibility. Is it more efficient to recover e-mail by restoring a single backup set or collect local PSTs from twenty custodians? Cost and accessibility are only loosely coupled in my thinking.
I prefer to keep the focus principally on relevance and materiality when it comes to determining whether information should be discovered or costs allocated.
Celia C. Elwell, RP said:
Reblogged this on The Researching Paralegal and commented:
I am delighted that Craig Ball is taking this new approach to his blog, and looking forward to his posts. -CCE