Earlier this week, I did a webcast on “data mapping.” Data mapping is one of those nimble e-discovery buzzwords–like ECA and Predictive Coding–that take on any meaning the fertile minds in the Marketing Department care to ascribe.

I use “data mapping” to encompass methods used to memorialize the identification of ESI–an essential prerequisite to everything in the EDRM east of Information Management. Of course, like Nessie and Bigfoot, Information Management is something many believe exists but no one has ever shown to be anything but a myth. Consequently, identification of ESI, viz. data mapping, is the de facto entry point for all things e-discovery.

Data mapping is an unfortunate moniker because it suggests the need to generate a graphical representation of ESI sources, leading many to assume a data map is synonymous with those Visio-style network diagrams IT departments use to depict, inter alia, hardware deployments and IP addresses.

Unless created expressly for e-discovery, few companies have any diagram approaching what’s required to serve as an EDD data map. Neither network diagrams from IT nor retention schedules from Records and Information Management are alone sufficient to serve as an EDD data map, but they contribute valuable information; clues, if you will, to where the ESI resides.

Thus, a data “map” isn’t often a map or diagram, though both are useful ways to organize the information. A data map is likely a list, table, spreadsheet or database. I tend to use Excel spreadsheets because it’s easier to run totals. A data map can also be a narrative. John Collins, a J.D. with The Ingersoll Firm in Illinois, recently shared a data map in the form of a 64-page narrative report describing an enterprise e-mail environment in exacting detail. Whatever the form employed, your client doesn’t have a data map lying around somewhere. It’s got to be built, usually from scratch.

What your data map looks like matters less than the information it contains. Again, don’t let the notion of a “map” mislead. The data map is as much about what as where. If the form chosen enables you to quickly and clearly access the information needed to implement defensible preservation, reliably project burden and accurately answer questions at meet-and-confer and in court, then it’s the right form even if it isn’t a pretty picture.


The duty to identify ESI is the most encompassing obligation in e-discovery. Think about it: You can’t act to preserve sources you haven’t found. You certainly can’t collect, review or produce them. The Federal Rules of Civil Procedure expressly impose a duty to identify all potentially responsive sources of information deemed “not reasonably accessible.” So even if you won’t search potentially responsive ESI, you’re bound to identify it.

A “data map” might be better termed an “Information Inventory.” It’s very much like the inventories that retail merchants undertake to know what’s on their shelves by description, quantity, location and value.

Creating a competent data map is also akin to compiling a history of:

  • Human resources and careers (after all, cases are still mostly about people);
  • Information systems and their evolution; and
  • Projects, facilities and tools.

A data map spans both logical and physical sources of information. Bob’s e-mail is a logical collection that may span multiple physical media. Bob’s hard drive is a physical collection that may hold multiple logical sources. Logical and physical sources may overlap, but they are rarely exactly the same thing.
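The logical/physical distinction can be pictured as a many-to-many relationship. The sketch below is only an illustration: the source names, media names and the `invert` helper are all hypothetical, invented here to show how one might flip a "logical source to physical media" listing into a "physical medium to logical sources" listing.

```python
from collections import defaultdict

# Hypothetical example data: each logical source is listed with the
# physical media on which some part of it resides.
logical_to_physical = {
    "Bob's e-mail": ["Exchange server", "Bob's laptop", "Backup tape set"],
    "Bob's documents": ["Bob's laptop", "File share"],
}


def invert(mapping):
    """Return a dict of physical medium -> sorted list of logical sources on it."""
    inverted = defaultdict(list)
    for logical, media in mapping.items():
        for medium in media:
            inverted[medium].append(logical)
    return {medium: sorted(sources) for medium, sources in inverted.items()}


physical_to_logical = invert(logical_to_physical)

# One physical device can hold several logical sources.
print(physical_to_logical["Bob's laptop"])
```

Keeping both views is handy because preservation tends to happen device by device, while relevance is usually judged source by source.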

As needed, a data map might encompass:

  1. Custodian and/or source of information;
  2. Location;
  3. Physical device or medium;
  4. Currency of contents;
  5. Volume (e.g., in bytes);
  6. Numerosity (e.g., how many messages and attachments?);
  7. Time span (including intervals and significant gaps);
  8. Purpose (How is the ESI resource tasked?);
  9. Usage (Who uses the resource and when?);
  10. Form; and
  11. Fragility (What are the risks it may go away?).

This isn’t an exhaustive list because the information implicated changes with the nature of the sources being inventoried. That is, you map different data for e-mail than for databases.
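For readers who prefer a script to a spreadsheet, here is a minimal sketch of how the fields listed above might be held as records and totaled. The `DataMapEntry` class, its field names, the `totals_by_custodian` helper and the sample rows are all assumptions made for illustration; none of them comes from a standard or from any particular tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DataMapEntry:
    """One row of a hypothetical data map; fields mirror the list above."""
    custodian: str      # custodian and/or source of information
    location: str
    medium: str         # physical device or medium
    currency: str       # e.g., "active" or "archived"
    volume_bytes: int
    item_count: int     # numerosity, e.g., messages and attachments
    time_span: str      # including intervals and significant gaps
    purpose: str
    usage: str
    form: str
    fragility: str      # risks that the source may go away


def totals_by_custodian(entries):
    """Sum volume and item counts per custodian, like a spreadsheet subtotal."""
    totals = defaultdict(lambda: {"volume_bytes": 0, "item_count": 0})
    for entry in entries:
        totals[entry.custodian]["volume_bytes"] += entry.volume_bytes
        totals[entry.custodian]["item_count"] += entry.item_count
    return dict(totals)


# Invented sample rows, for illustration only.
entries = [
    DataMapEntry("Bob", "HQ server room", "Exchange server", "active",
                 2_000_000_000, 15_000, "2008-2010", "corporate e-mail",
                 "daily", "mailbox", "90-day auto-purge"),
    DataMapEntry("Bob", "Bob's office", "laptop hard drive", "active",
                 500_000_000, 3_000, "2009-2010", "local e-mail archive",
                 "weekly", "PST file", "device refresh cycle"),
]

totals = totals_by_custodian(entries)
print(totals["Bob"])
```

A flat record like this is easy to export to a spreadsheet later, and fields that don't apply to a given source type (a database has no "messages") can simply be adapted or dropped.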

A data map isn’t a mindless exercise in minutiae. The level of detail is tailored to the likely relevance and materiality of the information.

Tips for Better Data Mapping

  • Custodial interviews are an essential component of a sound data mapping methodology, but they are also an unreliable (and occasionally even counterproductive) facet of it. Custodians know a lot about their data that is hard to ferret out except by questioning them. They also don’t know (or will misstate) a lot about their data, and those gaps and errors must be supplemented (or corrected) objectively through, e.g., search or sampling.
  • Do not become so wedded to a checklist when conducting custodial interviews that you fail to listen to the subject or use common sense. When a custodian claims they have no thumb drives or web mail accounts, don’t just move on. It’s just not so. When a custodian claims they’ve never used a home computer for work, don’t believe it without eliciting a reason to trust their statement. Remember: custodians want you out of their stuff and out of their hair. Even those acting in complete good faith will say what promotes that end. Trust, but verify.
  • Don’t be so intent on minimizing sources that you foster reticence. If you really want to find ESI, use open-ended language that elicits candor. Avoid leading questions. “You didn’t take any confidential company data home, did you?” isn’t likely to stir a reply of “Sure, I did!” Offer an incentive to disclose (“It would really help us if you had your e-mail from 2009”).
  • Legacy hardware grows invisible, even when it’s right in front of you. A custodian can’t see the old CPU in the corner. The IT guy can’t see the box under his desk filled with backup tapes. You must bring a fresh set of eyes to the effort, and can’t be reluctant to say, “What’s in there?” or “Let me see, please.” Don’t be the blind leading the blind.
  • Companies don’t just buy costly systems and software and expense them. They have to amortize the cost over time and maintain amortization and depreciation schedules. Accordingly, the accounting department’s records can be a ready means to identify systems, mobile devices and even pricey software applications that are all paths to ESI sources.

Three Pressing Points to Ponder

If you take nothing else away from this post, please consider these three closing comments:

  1. Accountability is key every step of the way. If someone says, “that’s gone,” be sure to note who made the representation and test its accuracy. Get their skin in the game. Ultimately, building the data map needs to be one person’s hands-on, buck-stops-here responsibility, and that person needs to give a hot damn about the quality of their work. Make it a boots-on-the-ground duty devolving on someone with the ability, curiosity, diligence and access to get the job done.
  2. Where you start matters less than when and with whom. Don’t dither! Dive in the deep end! Go right to the über key custodians and start digging. Get eyes on offices, storerooms, closets, servers and C: drives, and go where the evidence leads.
  3. Just because your data map can’t be perfect doesn’t mean it can’t be great. Don’t fall into the trap of thinking that, because no data mapping effort can be truly complete and current, the quality of the data map doesn’t matter. Effective data mapping is the bedrock on which any sound e-discovery effort is built.

For more about data mapping in e-discovery, check out:
The Quest for eDiscovery: Creating a Data Map, by Ganesh Vednere and this handy list of data mapping resources.