I took Introductory Psychology with a phalanx of freshmen in the cavernous Hamman Hall amphitheater at Rice University. Thirty-five years later, I best remember the astounding experimental work of Cambridge researchers, Colin Blakemore and G.F. Cooper, proving the ability to see isn’t born in us, but must be learned. Blakemore and Cooper reared kittens in darkness save for five hours a day when the kittens were placed in environments rigged so they could see only horizontal or vertical stripes. When later exposed to a dangling black rod, the felines reared with horizontal stripes could see the rod only when it was positioned horizontally. As the rod was turned vertically, only the vertical world kittens saw it. The rod “disappeared” in the eyes of the horizontal world kittens. Deprived of experience with the other plane, each group of kittens was incapable of seeing it. Their visual cortices didn’t develop the cells to see the horizontals or verticals they’d never experienced.
I think of those poor kittens as I ponder the relentless pushback I face trying to help lawyers see the unmistakable advantages of native review and production of ESI versus TIFF image and load file productions. I’m starting to appreciate that what strikes me as pig headedness may just be kitten headedness.
Deprived of experience with native forms in production, lawyers lack the “cells” to see how very much cheaper and easier native forms are to use in e-discovery. They can’t picture a world without pagination and embossed Bates numbers, and lack a context to envision the simpler, faster and more accurate tools available to safeguard evidence integrity when produced natively. They are content to keep stumbling into chair legs like vertically-deprived kittens.
Once, I asked a partner at my first law firm why there were no women lawyers in the firm. I swear he replied, “We tried a woman lawyer once, and it didn’t work out.” Likewise, I meet the odd litigator who claims to have tried native production, and says it didn’t work out.
Digging deeper, those lawyers’ notion of native production involved opening each file produced in its native application for review—something you just don’t do. Else, their vendor or lit support folks converted native forms into images behind-the-scenes; so, of course they wouldn’t see the advantages or savings. The dismissive lawyers still inhabit a world devoid of verticals.
I just penned a Ball in Your Court column for the 2/13 print version of Law Technology News called Debunking the Case against Native Production. I discuss the arguments parties use in support of TIFF and load file productions and their “usual suspects” attack on native production. The case against native principally hinges on four bunk-laden claims:
- You can’t Bates label native files;
- Opponents will alter the evidence;
- Native production requires broader review; and
- Redacting native files changes them.
My goal for the column was to equip those who seek native production with facts to counter arguments supporting the wasteful status quo.
But I confess it feels like a waste of time because the facts don’t seem to matter much. The reply remains, “everything you say about native makes perfect sense, but we always do TIFFs and load files. It’s good enough.”
But it’s not good enough. Not hardly. And it’s a whole lot more expensive. If you’re one of the tiny minority of lawyers who’ve experienced the ease of e-discovery with tools purpose-built for native review, you know what I’m talking about. Once you “go native,” you never go back!
By “native,” I mean data in the original electronic formats the producing party uses for, e.g., e-mail, word processing, spreadsheets and presentations. A native file is inherently electronically searchable and functional until it’s converted to TIFF images, when it loses both searchability and functionality.
Because converting to TIFF takes so much away, parties producing TIFF images attempt to restore a measure of electronic searchability by extracting text from the electronic document and supplying it in a load file accompanying the TIFF images. A recipient must then run searches against the extracted text file and seek to correlate the hits in the text to the corresponding page image. It’s clunky, costly and incomplete.
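To make that correlation step concrete, here is a minimal Python sketch. It assumes a hypothetical production in which the extracted text is stored per page alongside the name of its page image; the file names, the `pages` structure and the `find_hits` function are illustrative assumptions, not any vendor's actual load file layout.

```python
# Hypothetical production: each page image has a separately stored run of
# extracted text. The image itself is just a picture and cannot be searched.
pages = [
    {"image": "ABC0000001.tif", "text": "Quarterly forecast attached for review."},
    {"image": "ABC0000002.tif", "text": "Revenue figures appear in the spreadsheet."},
    {"image": "ABC0000003.tif", "text": "Please review the forecast before Friday."},
]


def find_hits(pages, term):
    """Return the page images whose extracted text contains the search term."""
    needle = term.lower()
    return [p["image"] for p in pages if needle in p["text"].lower()]


# The search runs against the text; the hit is then mapped back to an image.
print(find_hits(pages, "forecast"))
```

Note that the search never touches the evidence itself. It can only find what made it into the extracted text, which is one reason the approach is incomplete.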
The irony of TIFF and load file productions is that they were cutting-edge technology before the turn of the century. In my column, I explore that history, noting that, back when discovery denoted paper, TIFF imaging, optical character recognition, manual coding and load files imbued productions with a crude electronic searchability.
The coding and OCR text had to be stored in separate files because TIFF images are just pictures of pages, incapable of carrying added content. So, in “single page TIFF” productions, each page of a document became an image file, another file held aggregate extracted OCR text and a third held the coded data about the data, i.e., its metadata. The metadata would include information about the content and origin of the paper evidence along with the names and locations of the various image and text files on the media (i.e., CD or DVD) used to transmit same. Thus, adding a measure of searchability yielded a dozen or more electronic files to carry the pieces of a ten-page document.
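The arithmetic behind that "dozen or more" can be sketched in a few lines of Python. The function name and its default arguments are assumptions for illustration; the counts follow the description above of one image per page, one text file and one metadata file.

```python
def files_in_single_page_tiff_production(page_count, text_files=1, metadata_files=1):
    """Count the files needed to carry one document as single-page TIFFs."""
    # One image file per page, plus the separate text and metadata files.
    return page_count + text_files + metadata_files


# A ten-page document: ten images, one text file, one metadata file.
print(files_in_single_page_tiff_production(10))
```

By comparison, the same document produced natively is a single file.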
To put Humpty Dumpty back together again demanded a database and picture viewer capable of correlating the extracted text to its respective page image and running word searches. Thus was born a new category of document management software called “review platforms.” Because the files holding OCR text and metadata were slated to be loaded onto a review platform, they were dubbed “load files.”
Then, as now, load files sucked; but, we put up with them because adding searchability to unsearchable paper documents was worth it. A stone axe is better than no axe at all.
But today, electronic documents are inherently searchable. They’re layered, multi-media and multi-dimensional. Much ESI defies characterization as a document.
Replete with embedded formulae, appended comments, tracked changes and animated text, ESI thumbs its nose at the printed page. Load files can’t begin to hold the myriad pieces and layers of information so as to faithfully mirror the efficiency, completeness or functionality of native evidence when reviewed using tools purpose-built for native review. Yet, lawyers cling to TIFF imaging and load files, downgrading ESI’s inherent searchability and eviscerating the multi-dimensional character of ESI. As I say in the column, “an obsolete technology that once made evidence easier to find now deep sixes probative content.”
But as frustrating as it is to dangle the rod of native production in front of a bunch of cats that see nothing but TIFFs and load files, there’s room for optimism. Blakemore and Cooper’s kittens learned to see both verticals and horizontals when they entered the real world. Inevitably, lawyers will come around to native productions; but until then, it’s just sad to watch them stumbling into table legs, little knowing what’s causing them so much pain.
Adam Wells said:
Well argued, Craig. In my experience authoring joint ESI production agreements, I’ve found that so long as there is a provision addressing the display of natively produced documents during depositions and trials, attorneys are more likely to sign on.
I’d also argue that aside from reduced cost and increased efficiency, attorneys truly appreciate being able to review recognized file types as they would in their office setting. That is, when an Excel file looks and feels like an Excel file, their appreciation for the native approach increases dramatically. It allows them to focus on content and legal argument, and not as much on the technology that mediates their visibility into a document.
Excellent post Craig.
Jeff Johnson said:
Good stuff Craig. Many companies in the ediscovery space with copy/scan roots have a financial incentive not to educate clients away from tiff. Furthermore, sales people across the industry are compensated per GB or per custodian…so there is little motivation to change or educate. Flat fee arrangements based on the needs of the case seem more in line with case realities.
Mike McBride said:
Good stuff Craig. One of the real challenges in moving firms to native production is the Bates Label, so I’ll be interested in seeing how you deal with that. Discussions I’ve had about native production typically start and end with “how would I keep track of which document is which if it’s not labeled?” Unfortunately, that question exposes the larger issue behind it, namely that attorneys are still printing their documents and reviewing them in that format, even when they have a sophisticated tool available to them to do review. When you print out a copy of a document, removing all of the available metadata, the label becomes the only way to truly tie it back to its electronic version. Until we get them to stop printing, I’m afraid native production is still a step too far.
Jill McIntyre said:
TIFFing on the fly and embossing with a production number and/or exhibit number can work, say, if you are printing ten of twenty thousand documents produced to use during a deposition.
Mike McBride said:
Jill, sure for small documents to be used at deposition, printing is perfectly fine, but the discussions I’m having revolve around reviewing documents. For example, having a paralegal or Lit Support person running keyword searches and printing all the documents that are hits, basically allowing the attorneys to never touch the technology. If that’s the goal, there are larger problems than native review versus TIFF review. 😉
Joe Howie said:
Tigers are of course among the world’s largest cats. If you know of lawyers who are still printing to paper for review, please consider nominating them for the First Annual Paper Tiger Worst Practices Award. Details at http://howieconsulting.com/articles/FirstAnnualPaperTigerAward04.pdf
Kelly Twigger said:
Great post, Craig. I run into exactly the same problems in trying to convince other lawyers to work with native review, and like Mike McBride said, the biggest obstacle is the bates labeling that they rely on so readily. I too look forward to seeing your take on how to get them past that issue. As I generally manage the eDiscovery for a client, I work to persuade the client on the cost savings, and then impose on the outside counsel, but that is not always successful. Maybe as cooperation and trust are developed on the eDiscovery side in addition to the lack of need for printing, we’ll find less need for the tracking and control of each page.
Pingback: Master Bates Numbers in E-Discovery « Ball in your Court
Andy Wilson (Logik) said:
Sure, TIFF alone is just as bad as sending banker boxes full of paper. I think most sane people agree with that argument. But image (.tif or .pdf) + native appears to be the happy middle ground when it comes to producing documents. At least that’s what we’ve seen over the past four years and what we’re seeing today.
In 2008, before the recession hit, all-TIFF-all-the-time reviews were standard operating procedure, at least for us. Natives were provided for review, but TIFF was the default. We rendered over 100M pages to TIFF in 2008 primarily for document review. After the recession there was an increase in native review with TIFF or PDF productions sometimes accompanied by native files, but not always. The following year, 2009, we rendered ~20M pages to TIFF, but provided a similar amount of documents for review as we did in 2008. The large majority of the TIFFing was done after the review for productions. This was a major shift in review workflow and it hasn’t changed.
Nowadays the standard is to almost always include natives along with their TIFF or PDF representations when producing documents after review. This appears to be the happy middle ground. Native-only productions are increasingly rare. So at least from my perspective this change in behavior has already happened. The cats have learned to see (somewhat). And from our observations the change was based largely on the overall cost to TIFF upfront because native-review was/is much cheaper and faster.
The next behavior to change is the FUD (Fear Uncertainty & Doubt) with producing documents. Producing documents in native, tiff, pdf, etc. with complicated load files is just silly in this day and age. Take a step back and think about it. The word “production” doesn’t make much sense because nothing is really being manufactured. What is really happening when producing documents for discovery is the sharing of information.
I highly doubt the legal community would ever adopt the word “sharing” (too friendly) as an alternative to “producing”, but wouldn’t it be awesome if you could simply “share” a folder of documents ready to “produce” to the people you are “producing” to? Just like you share a folder in your Dropbox account (obviously with better restrictions). There would be no need for a load file. The ridiculous amount of time it takes (sometimes days) to make a production would be dramatically diminished if not eliminated entirely.
Scary? Sure it is, but that’s because it’s something new so naturally lawyers will initially have tons of FUD. But once they see the benefits of sharing it will become the standard method for producing.
When that happens this argument of native or tiff, endorse or not endorse, etc. will be moot. I’m optimistic this will happen in the next five years. For now though producing in native with or without TIFFs or PDFs is definitely the way to go. And it appears that people are listening to sound advice like yours. Keep it coming Craig!
Craig Ball said:
Thank you, Andy. Very useful insight. Perhaps it’s my many years as a plaintiffs’ lawyer in a prior life that compels me, despite most of my work being on behalf of producing parties, to always consider the perspective and needs of requesting parties. From that vantage point, if an opponent wants to furnish both the native data AND, at their sole cost, corresponding TIFFs and load files, I have no problem with that. That assumes, of course, that the TIFFing and load file generation doesn’t prompt delay, add unnecessary complexity or adversely impact the utility or completeness of the native items produced.
Unfortunately, I have not seen that in most of the cases in which I consult. At best–and inadequately–producing parties grudgingly offer to supply native (usually only for spreadsheets) on an ad hoc basis, shifting the burden to secure a native counterpart to the requesting party, who they demand show ‘special’ need for same.
You’re right that the TIFF-everything-before-review madness has subsided. Producing parties are indeed availing themselves of the efficiencies and economies of native forms in review. But damned if they don’t blithely deprive the other side of same, downgrading evidence for production and–despite your heartening observation–saying “let them eat TIFF” to the peasants on the other side.
I aspire to be Mme Defarge as the sanctions tumbrels roll. “Off with their heads!” I’ll cry. “D’avec leurs têtes! Décapitez-les!” New Year’s resolution: Learn to knit…and cackle.
Andy Wilson (Logik) said:
HA! “Learn to knit…and cackle.” Craig, you really are becoming a “cat person”. =P
Jim Shook said:
Craig, spot on as usual, thank you. In reading through the comments, I agree that the biggest concern seems to be in having an identifier for the document. People remain comfortable with the Bates Stamp, even though we’ve all had cases where the same document has multiple IDs, there are conflicting IDs, no IDs, etc. Hash values are clearly the way to go in that area. But it seems we need a default method to translate that (long) value into a more human-friendly length before it can take off.
Craig Ball said:
Thanks, Jim. Though hash values have a helpful role in all of this, they can play that role behind the scenes. By that I mean, the hash doesn’t have to *be* the identifier so long as a hash value is tied to the identifier in some manner.
Once lawyers get it firmly in their minds that the sources of most records sought in discovery are no longer existing “documents” but, instead, data stored in a database (file systems and e-mail systems both being, essentially, databases), they will better appreciate sensible ways to go about uniquely identifying items produced in e-discovery. MS Word documents aside, the data we address in discovery today is, for the most part, not “paged” until and unless we force it into a paged format, a process that jettisons a measure of content and functionality for most forms.
Because ALL ESI requires an intervening technology to be accessible to humans (who cannot, regrettably, read digitally encoded information right from the storage medium, as we could do with paper), our access for litigation should occur through the auspices of tools designed for the task (i.e., for native review) and which will routinely track many fields of data about the evidence item, including its hash value. Thus, a people-friendly Bates-style identifier can still be employed without requiring that a hash value be in our face.
As we move toward native file-friendly containers for use in e-discovery, these will have “pockets” (metaphorically speaking) to carry hash values, Bates identifiers and other system and litigation metadata. For the moment, the more familiar mechanism of the load file will serve that purpose.
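As a rough illustration of tying a hash to a friendlier identifier behind the scenes, here is a minimal Python sketch. The `ABC` prefix, the seven-digit numbering, the choice of SHA-256 and the `build_index` and `verify` names are all assumptions made for the example; it is not a description of any particular review tool.

```python
import hashlib


def build_index(items, prefix="ABC", width=7):
    """Assign sequential Bates-style identifiers and record each item's hash."""
    index = {}
    for n, (name, content) in enumerate(items, start=1):
        identifier = f"{prefix}{n:0{width}d}"  # e.g. ABC0000001
        index[identifier] = {
            "name": name,
            "sha256": hashlib.sha256(content).hexdigest(),
        }
    return index


def verify(index, identifier, content):
    """Check that content still matches the hash recorded for the identifier."""
    return hashlib.sha256(content).hexdigest() == index[identifier]["sha256"]


# Hypothetical native items, represented here as raw bytes.
items = [("memo.docx", b"draft memo"), ("budget.xlsx", b"budget figures")]
index = build_index(items)

print(sorted(index))                                 # the people-friendly identifiers
print(verify(index, "ABC0000001", b"draft memo"))    # unchanged item
print(verify(index, "ABC0000001", b"altered memo"))  # altered item
```

People cite the short identifier; the long hash rides along in the record and is consulted only when someone needs to confirm an item hasn't changed.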
Thanks for the comment.
Happy new year. Craig