Poring over Requests for Production this morning, I was gratified to see the client sought native forms of electronically stored information; but the request said only, “All documents shall be Bates stamped and provided in native format.”  Is that sufficient? To me, specifying forms of production is best done via an agreed ESI production protocol, but failing that, requesting parties should supply more detail than simply asking for “native format.”  I believe requests need to lay out the forms sought for particularized types of ESI and specify the essential ancillary metadata to be produced in load files.

Requesting native forms in discovery demands a few adaptations versus the way hard copy documents were sought in years past.  Take that request, “All documents shall be Bates stamped and provided in native format.”  If a document is supplied natively and not printed out or “flattened” to a static TIFF, where do you “stamp” the Bates number?  The solution is simple (in the file name and load file), but not obvious to lawyers unschooled in e-discovery.

Specifying more than “native format” in the request is sensible because much ESI doesn’t lend itself to production in its “true” native forms.  The “true” native form of email is typically a database of multiple user accounts holding messages, calendars, contacts, to-do lists, etc.  An opponent need not (and won’t) produce such a massive, undifferentiated blob of data.  So the better practice is to specify preferred near-native forms be produced; that is, forms that preserve the integrity and utility of the evidence and support the granularity needed for discovery of only relevant, non-privileged material.  As well, providing a load file specification ensures you obtain metadata values that only the producing party can supply (like Bates numbers, originating hash values, source paths and custodians). Too, you want that metadata in a structure suited to your needs and tools.

Native productions are more utile and cost-effective, but only to requesting parties prepared to reap their superior utility and savings.  One reason producing parties have gotten away with producing inefficient and unsearchable static image formats (TIFFs) for so long is that TIFF images can be viewed in a browser; hence, recipients of TIFF productions can read documents page-by-page without review software.  Yet, that easy access comes at a perilous cost.  TIFF productions are many times larger in byte volume than native production of the same material, making it significantly more costly for requesting parties to ingest and host the evidence.  Moreover, TIFF images tend not to work well for common formats like spreadsheets and PowerPoint presentations, and don’t work at all for, e.g., video and sound files.  Finally, evidence produced as TIFF images gets shorn of metadata and searchable electronic content, requiring that the stripped metadata and searchable content be produced separately and reconstructed using software to comprise, at best, a degraded “TIFF Plus” facsimile of the evidence.

For these reasons and more, either requests for production must follow the entry of an agreed or court-ordered production protocol, or requesting parties must include useful and practical instructions about the forms of production right in the body of the Request.

To simplify my client’s task, I drafted an Appendix to be grafted onto the Requests for Production and suggested my client take out “All documents shall be Bates stamped and provided in native format” and substitute the phrase: “All production should be produced in accordance with the instructions contained in Appendix A to this Request.” It’s not perfect, but it should get the job done.

The Appendix I supplied reads as follows, and I don’t offer it as a paragon of legal draftsmanship.  Each time I create something like this, it’s a struggle deciding what details to omit versus supplying all features of a full-fledged production protocol.  I’ve kept it to about 1,000 words, and a tad verbose at that.  It’s for you to decide if it adds substantial value over simply asking for “native format.”  Tell me what you think in the comments. If you’d like a Microsoft Word version of Appendix A to play with, you can download it from this link: http://craigball.com/Request_for_Native_Production-Appendix_A.docx

Appendix A: Forms of Production

I. Definitions

“Electronically Stored Information” or “ESI” includes communications, presentations, writings, drawings, graphs, charts, photographs, posts, video and sound recordings, images, and other data or data compilations existing in electronic form on any medium including, but not limited to: (i) e-mail, texting, social media or other means of electronic communications; (ii) word processing files (e.g., Microsoft Word); (iii) computer presentations (e.g., Microsoft PowerPoint); (iv) spreadsheets (e.g., Microsoft Excel); (v) database content and (vi) media files (e.g., jpg, wav).

“Metadata” means and refers to (i) structured (fielded) information embedded in a native file which describes the characteristics, origins, usage, and/or validity of the electronic file; (ii) information generated automatically by operation of a computer or other information technology system when a native file is created, modified, transmitted, deleted, or otherwise manipulated by a user of such system; (iii) information, such as Bates numbers, created during the course of processing documents or ESI for production; and (iv) information collected during the course of collecting documents or ESI, such as the name of the media device, or the custodian or non-custodial data source from which it was collected.

“Native Format” means and refers to the format of ESI in which it was generated and/or as used by the producing party in the usual course of its business and in its regularly conducted activities. For example, the native format of an Excel workbook is a .xls or .xlsx file and the native format of a Microsoft Word document is a .doc or .docx file.

“Near-Native Format” means and refers to a form of ESI production that preserves the functionality, searchability and integrity of a Native Format item when it is infeasible or unduly burdensome to produce the item in Native Format.  For example, an MBOX is a suitable near-native format for production of Gmail, an Excel spreadsheet is a suitable near-native format for production of Google Sheets, and EML and MSG files are suitable near-native formats for production of e-mail messages.  Static images are not near-native formats for production of any form except Hard Copy Documents.

II. Production

1. Responsive electronically stored information (ESI) shall be produced in its Native Format with Metadata.

2. If it is infeasible to produce an item of responsive ESI in its Native Format, it may be produced in a Near-Native Format with options for same set out in the table below:

Source ESI | Native or Near-Native Form or Forms Sought
Microsoft Word documents | .DOC, .DOCX
Microsoft Excel Spreadsheets | .XLS, .XLSX
Microsoft PowerPoint Presentations | .PPT, .PPTX
Microsoft Access Databases | .MDB, .ACCDB
WordPerfect documents | .WPD
Adobe Acrobat Documents | .PDF
Photographs | .JPG, .PDF
E-mail | Messages should be produced in a form or forms that readily support import into standard e-mail client programs; that is, the form of production should adhere to the conventions set out in RFC 5322 (the internet e-mail standard).  For Microsoft Exchange or Outlook messaging, .PST format will suffice.  Single message production formats like .MSG or .EML may be furnished, if source foldering data is preserved and produced.  If your workflow requires that attachments be extracted and produced separately from transmitting messages, attachments should be produced in their native forms with parent/child relationships to the message and container(s) preserved and produced in a delimited text file.
Social Media | Social media content should be collected using industry standard practices incorporating reasonable methods of authentication, including but not limited to MD5 hash values.  Social media and webpages should be produced as HTML faithful to the content and appearance of the native source, or as JPG images with searchable, document-level text files containing textual content and delimited metadata (including “likes” and comments).
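
By way of illustration only (this is not part of the Appendix text): an .EML file is essentially the text of an RFC 5322 message, so general-purpose tools can read it. The sketch below uses Python's standard email package to pull a few of the values that the load file in Section III calls for. The function name email_fields and the sample message are hypothetical, and the mapping of headers to field names is my assumption about how a vendor might populate them, not a description of any particular processing tool.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser


def email_fields(raw: bytes) -> dict:
    """Extract a few load-file style values from the bytes of an .EML file."""
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    # iter_attachments() skips the message body and yields the attached parts.
    names = [part.get_filename() or "" for part in msg.iter_attachments()]
    return {
        "SUBJECT": str(msg["Subject"] or ""),
        "CC": str(msg["Cc"] or ""),
        "MSGID": str(msg["Message-ID"] or ""),
        "IRTID": str(msg["In-Reply-To"] or ""),
        "ATTACHCOUNT": len(names),
        "ATTACHNAMES": ";".join(names),
    }


# Build a small, made-up message so the example runs without any files on disk.
sample = EmailMessage()
sample["From"] = "sender@example.com"
sample["To"] = "recipient@example.com"
sample["Subject"] = "Q3 forecast"
sample["Message-ID"] = "<msg-2@example.com>"
sample["In-Reply-To"] = "<msg-1@example.com>"
sample.set_content("See attached.")
sample.add_attachment(
    b"placeholder bytes",
    maintype="application",
    subtype="octet-stream",
    filename="forecast.xlsx",
)

fields = email_fields(sample.as_bytes())
print(fields)
```

The point is simply that a near-native message keeps its headers and attachments in a form software can read directly, which a page image cannot do.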

3. Paper (Hard-Copy) documents or items requiring redaction shall be produced in static image formats scanned at 300 dpi, e.g., single-page Group IV .TIFF or multipage PDF images. If an item uses color to convey information and not merely for aesthetic reasons, the producing party shall not produce the item in a form that does not display color. The full content of each document will be extracted directly from the native source where feasible or, where infeasible, by optical character recognition (OCR) or other suitable method to a searchable text file produced with the corresponding page image(s) or embedded within the image file.  Redactions shall be logged along with other information items withheld on claims of privilege.

4. Each item produced shall be identified by naming the item to correspond to a Bates number according to the following protocol:

i. The first three (3) characters of the filename will reflect a unique alphanumeric designation identifying the party making production.

ii. The next eight (8) characters will be a unique, consecutive numeric value assigned to the item by the producing party. This value shall be padded with leading zeroes as needed to preserve its length.

iii. The final six (6) characters are reserved to a sequence consistently beginning with a dash (-) or underscore (_) followed by a five-digit number reflecting pagination of the item when printed to paper or converted to an image format for use in proceedings or when attached as exhibits to pleadings.

iv. This format of the Bates identifier must remain consistent across all productions. The number of digits in the numeric portion and characters in the alphanumeric portion of the identifier should not change in subsequent productions, nor should spaces, hyphens, or other separators be added or deleted except as set out above.
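
By way of illustration only (this is not part of the Appendix text): the naming protocol in item 4 is mechanical enough to express in a few lines of code. The sketch below is a minimal Python rendering under my own assumptions; the names bates_id, native_filename and BATES_PATTERN are hypothetical, and I have assumed the native file keeps its original extension after the Bates identifier.

```python
import os
import re
from typing import Optional

# Three alphanumeric characters, eight digits, then an optional six-character
# page suffix: a dash or underscore followed by five digits.
BATES_PATTERN = re.compile(r"[A-Za-z0-9]{3}[0-9]{8}(?:[-_][0-9]{5})?")


def bates_id(party: str, item: int, page: Optional[int] = None, sep: str = "_") -> str:
    """Build a Bates identifier following items i-iii of the protocol."""
    if len(party) != 3 or not (party.isascii() and party.isalnum()):
        raise ValueError("party designation must be exactly 3 alphanumeric characters")
    if not 1 <= item <= 99_999_999:
        raise ValueError("item number must fit in 8 digits")
    if sep not in ("-", "_"):
        raise ValueError("separator must be a dash or an underscore")
    base = f"{party}{item:08d}"  # zero-padded to preserve length
    if page is None:
        return base
    if not 1 <= page <= 99_999:
        raise ValueError("page number must fit in 5 digits")
    return f"{base}{sep}{page:05d}"


def native_filename(party: str, item: int, original_name: str) -> str:
    """Name a native file by its Bates identifier, keeping the original extension."""
    extension = os.path.splitext(original_name)[1]
    return bates_id(party, item) + extension


print(native_filename("ABC", 123, "Q3 forecast.xlsx"))  # ABC00000123.xlsx
print(bates_id("ABC", 123, page=7))                     # ABC00000123_00007
```

The page suffix only comes into play when an item is printed or imaged for use in proceedings; the native file itself carries the eleven-character identifier.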

5. If a response to discovery requires production of discoverable electronic information contained in a database, you may produce standard reports; that is, reports that can be generated in the ordinary course of business and without specialized programming.  All such reports shall be produced in a delimited electronic format preserving field and record structures and names.  If the request cannot be fully answered by production of standard reports, Producing Party should advise the Requesting Party of same so the parties may meet and confer regarding further programmatic database productions.
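
By way of illustration only (this is not part of the Appendix text): a "delimited electronic format preserving field and record structures and names" can be as simple as a tab-separated file whose first row carries the field names. The sketch below assumes a hypothetical invoices table held in an in-memory SQLite database; the table, the query and the function name report_to_delimited are all invented for the example, and a real system's standard reports would come from whatever reporting features that system offers.

```python
import csv
import io
import sqlite3


def report_to_delimited(conn, query, out, delimiter="\t"):
    """Write query results as delimited text with a header row of field names."""
    cursor = conn.execute(query)
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    # cursor.description holds one tuple per column; element 0 is the name.
    writer.writerow([column[0] for column in cursor.description])
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count


# Made-up data standing in for a business database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (invoice_id INTEGER, customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [(1001, "Acme Corp", 250), (1002, "Beta LLC", 75)],
)

buffer = io.StringIO()
rows_written = report_to_delimited(
    conn, "SELECT invoice_id, customer, amount FROM invoices ORDER BY invoice_id", buffer
)
report_text = buffer.getvalue()
print(report_text)
```

Each record stays on its own line and each value stays under its original field name, which is what lets the recipient load the report into a spreadsheet or database rather than retyping it from a page image.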

III. Load Files

Producing party shall furnish a delimited load file in industry-standard Opticon and Concordance formats supplying the metadata field values listed below for each item produced (to the extent the values exist and as applicable):

CUSTODIAN | Name of person or source from which data was collected.  Where redundant names occur, individuals should be distinguished by an initial which is kept constant throughout productions (e.g., Smith, John A. and Smith, John B.)
ALL_CUSTODIANS | If deduplication employed, name(s) of any person(s) from whom the identical item was collected and deduplicated.
BEGBATES | Beginning Bates Number (production number)
ENDBATES | End Bates Number (production number)
BEGATTACH | First Bates number of first attachment in family range
ENDATTACH | Last Bates number of last attachment in family range (i.e. Bates number of the last page of the last attachment).
ATTACHCOUNT | Number of attachments to an e-mail.
ATTACHNAMES | Name of each individual attachment, separated by semi-colons.
PARENTBATES | BEGBATES number for the parent email of a family (will not be populated for documents that are not part of a family)
ATTACHBATES | Bates number from the first page of each attachment
PGCOUNT | Number of pages in the document
FILENAME | Original filename at the point of collection, without extension of native file
FILEEXTENSION | File extension of native file
FILEPATH | File source path for all electronically collected documents and emails, which includes location, folder name, file name, and file source extension.
NATIVEFILELINK | For documents provided in native format only
TEXTPATH | File path for OCR or Extracted Text files
CC | Additional Recipients
BCC | Blind Additional Recipients
SUBJECT | Subject line of e-mail.
DATESENT (mm/dd/yyyy hh:mm:ss AM) | Date Sent
EMAILDATSORT (mm/dd/yyyy hh:mm:ss AM) | Sent Date of the parent email (physically top email in a chain, i.e. immediate/direct parent email)
MSGID | Email system identifier assigned by the host email system.
IRTID | E-mail In-Reply-To ID assigned by the host e-mail system.
CONVERSATIONID | E-mail thread identifier.
HASHVALUE | MD5 Hash Value of production item
TITLE | Title provided by user within the document
AUTHOR | Creator of a document
DATECRTD (mm/dd/yyyy hh:mm:ss AM) | Creation date
LASTMODD (mm/dd/yyyy hh:mm:ss AM) | Last Modified Date

The chart above describes the metadata fields to be produced in generic, commonly used terms.  You should adapt these to the specific types of electronic files you are producing to the extent such metadata fields exist in the original ESI and can be extracted as part of the electronic data discovery process. Any ambiguity about a metadata field should be discussed with the Requesting Party prior to processing and production.
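
By way of illustration only (this is not part of the Appendix text): the sketch below shows what a few rows of a Concordance-style delimited load file might look like when built in Python. It assumes the delimiters commonly associated with that format (the þ character as text qualifier, ASCII 20 as field separator, and ® standing in for line breaks inside a value); confirm the delimiters your review platform expects before relying on them. The function names, the record and the five-field subset are hypothetical, chosen only to show the date layout and MD5 hash called for in the chart above.

```python
import hashlib
from datetime import datetime

FIELD_SEP = "\x14"      # ASCII 20, assumed field separator
QUOTE = "\u00fe"        # þ, assumed text qualifier
NEWLINE_SUB = "\u00ae"  # ®, assumed stand-in for line breaks within a value


def md5_hex(data: bytes) -> str:
    """MD5 hash of an item's bytes, as lowercase hexadecimal."""
    return hashlib.md5(data).hexdigest()


def load_file_date(moment: datetime) -> str:
    """Format a timestamp as mm/dd/yyyy hh:mm:ss AM (AM/PM text is locale-dependent)."""
    return moment.strftime("%m/%d/%Y %I:%M:%S %p")


def dat_line(values) -> str:
    """Qualify each value and join the values with the field separator."""
    cells = []
    for value in values:
        text = str(value).replace("\r\n", NEWLINE_SUB).replace("\n", NEWLINE_SUB)
        cells.append(QUOTE + text + QUOTE)
    return FIELD_SEP.join(cells)


def build_dat(field_names, records) -> str:
    """Header row of field names, then one row per record; missing values stay empty."""
    lines = [dat_line(field_names)]
    for record in records:
        lines.append(dat_line(record.get(name, "") for name in field_names))
    return "\n".join(lines) + "\n"


FIELDS = ["BEGBATES", "CUSTODIAN", "DATESENT", "HASHVALUE", "NATIVEFILELINK"]
record = {
    "BEGBATES": "ABC00000123",
    "CUSTODIAN": "Smith, John A.",
    "DATESENT": load_file_date(datetime(2024, 3, 5, 14, 7, 9)),
    "HASHVALUE": md5_hex(b"hello"),  # in practice, hash the produced file's bytes
    "NATIVEFILELINK": "NATIVES/ABC00000123.xlsx",
}
dat_text = build_dat(FIELDS, [record])
print(dat_text)
```

Leaving a value empty when it does not exist, rather than dropping the column, keeps every row aligned with the header, which is what "to the extent the values exist and as applicable" requires in practice.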