Ball in your Court

~ Musings on e-discovery & forensics.

Category Archives: Computer Forensics

Deduplication: Why Computers See Differences in Files that Look Alike

08 Wednesday Jul 2015

Posted by craigball in Computer Forensics, E-Discovery

≈ 21 Comments

An employee of an e-discovery service provider asked me to help him explain to his boss why deduplication works well for native files but frequently fails when applied to TIFF images.  The question intrigued me because it requires we dip our toes into the shallow end of cryptographic hashing and dispel a common misconception about electronic documents.

Most people regard a Word document file, a PDF or TIFF image made from the document file, a printout of the file and a scan of the printout as being essentially “the same thing.”  Understandably, they focus on content and pay little heed to form.  But when it comes to electronically stored information, the form of the data (the structure, encoding and medium employed to store and deliver content) matters a great deal.  As data, a Word document and its imaged counterpart are radically different data streams from one another and from a digital scan of a paper printout.  Visually, they are alike when viewed as an image or printout; but digitally, they bear not the slightest resemblance.
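
To make the point concrete, here is a minimal Python sketch of hash-based comparison. It is an illustration under stated assumptions, not any vendor's deduplication method: SHA-256 is chosen arbitrarily, the sample sentence is made up, and a change of text encoding merely stands in for a change of file format (a real Word file and a TIFF of the same page differ far more than this).

```python
import hashlib

def file_hash(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte stream."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical content; any text would do.
text = "The quick brown fox jumps over the lazy dog."

# Two byte-identical copies of the same "native file" (stand-in bytes).
native_copy_1 = text.encode("utf-8")
native_copy_2 = text.encode("utf-8")

# The same visible words stored in a different form. A different encoding
# stands in here for a different format, such as an image of the page.
other_form = text.encode("utf-16")

same_bytes_match = file_hash(native_copy_1) == file_hash(native_copy_2)
different_form_match = file_hash(native_copy_1) == file_hash(other_form)

print(same_bytes_match)      # True
print(different_form_match)  # False
```

The hash is computed over the bytes, not over what a reader sees, so identical bytes always match and content that merely looks alike does not.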

The Virtues of Fielding

29 Monday Jun 2015

Posted by craigball in Computer Forensics, E-Discovery

≈ 11 Comments

I am a member of the Typewriter Generation.  With pencil and ink, we stored information on paper and termed the results “documents.”  Not surprisingly, members of my generation tend to think of stored information in terms of tangible and authoritative things we persist in calling “documents.”  But unlike use of the word “folder” to describe a data directory (despite the absence of any folded thing) or the quaint shutter click made by camera phones (despite the absence of shutters), couching requests for production as demands for documents is not harmless skeuomorphism.  The outmoded thinking that electronically stored information items are just electronic paper documents makes e-discovery more difficult and costly.  It’s a mindset that hampers legal professionals as they strive toward competence in e-discovery.

Does clinging to the notion of “document” really hold us back?  I think so, because continuing to define what we seek in discovery as “documents” ties us to a two-dimensional view of four-dimensional information.  The first two dimensions of a “document” are its content, essentially what emerges when you print it to paper or an image format like TIFF.  But, ESI always implicates a third dimension, metadata and embedded content, and sometimes a fourth, temporal dimension, as we often discover different versions of information items over time.

The distinction becomes crucial when considering suitable forms of production and prompts a need to understand the concept of Fielding and Fielded Data, as well as recognize that preserving the fielded character of data is essential to preserving its utility and searchability.
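
A small Python sketch may help show why fielded data is more searchable than flattened text. Everything in it is hypothetical: the two messages, the field names (`from`, `to`, `subject`, `body`) and the helper functions are invented for illustration and do not reflect any particular review platform or load file format.

```python
# Hypothetical messages; field names and values are illustrative only.
messages = [
    {"from": "alice@example.com", "to": "bob@example.com",
     "subject": "Lunch", "body": "Bob, are you free at noon?"},
    {"from": "bob@example.com", "to": "carol@example.com",
     "subject": "Re: Budget", "body": "Alice asked me to forward this."},
]

def search_field(items, field, term):
    """Return items whose named field contains the term (case-insensitive)."""
    return [m for m in items if term.lower() in m[field].lower()]

def flatten(message):
    """Collapse a fielded record into undifferentiated text, as printing would."""
    return " ".join(message.values())

def search_flat(items, term):
    """Search the flattened text, where field boundaries no longer exist."""
    return [m for m in items if term.lower() in flatten(m).lower()]

sent_by_alice = search_field(messages, "from", "alice")
mentions_alice = search_flat(messages, "alice")

print(len(sent_by_alice))   # 1
print(len(mentions_alice))  # 2
```

With fields intact, "messages sent by Alice" is a precise question. Once the record is flattened, the same search also returns every message that merely mentions her.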

ESI Observations on a Pretty Good Friday

03 Friday Apr 2015

Posted by craigball in Computer Forensics, E-Discovery

≈ 8 Comments

Though each merits its own post, I’ve lumped two short topics together.  The first concerns a modest e-discovery headache, being the cost, friction and static posed by GIF logos in e-mail. The second is a much uglier vulnerability hoppin’ down the bunny trail toward you right now; but rejoice, because you may still have time to avert disaster.

The Conundrum of Competence in E-Discovery: Need Input

07 Saturday Mar 2015

Posted by craigball in Computer Forensics, E-Discovery

≈ 24 Comments

I frequently blast lawyers for their lack of competence when it comes to electronic evidence.  I’m proud to be a lawyer and admire all who toil in the fields of justice; but I cannot hide my shame at how my brilliant colleagues have shirked and dodged their duty to master modern evidence.

So, you might assume I’d be tickled by the efforts of the American Bar Association and the State Bar of California to weave technical competency into the rules of professional conduct.  And I am, a little. Requiring competence is just part of the solution to the competence crisis.  The balance comes from supplying the education and training needed to become competent.  You can’t just order someone who’s lost to “get there;” you must show them the way.  In this, the bar associations and, to a lesser extent, the law schools have not just failed; they’ve not tried to succeed.

The legal profession is dominated by lawyers and judges.  I state the obvious to expose the insidious: the profession polices itself.  We set the standards for our own, and our standard setters tend to be our old guard.  What standard setter defines himself out of competence?  Hence, it’s extraordinary that the ABA commentary to Model Rule 1.1 and the proposed California ethics opinion have emerged at all.

These laudable efforts just say “get there.”  They do not show us the way.

Do-It-Yourself Digital Discovery, Revisited

07 Saturday Feb 2015

Posted by craigball in Computer Forensics, E-Discovery

≈ 3 Comments

This is the thirteenth in a series revisiting Ball in Your Court columns and posts from the primordial past of e-discovery, updating and critiquing in places, and hopefully restarting a few conversations.  As always, your comments are gratefully solicited.

Do-It-Yourself Digital Discovery

[Originally published in Law Technology News, May 2006]

Recently, a West Texas firm received a dozen Microsoft Outlook PST files from a client.  Like the dog that caught the car, they weren’t sure what to do next.  Even out on the prairie, they’d heard of online hosting and e-mail analytics, but worried about the cost.  They wondered: Did they really need an e-discovery vendor?  Couldn’t they just do it themselves?

As a computer forensic examiner, I blanch at the thought of lawyers harvesting data and processing e-mail in native formats.  “Guard the chain of custody,” I want to warn.  “Don’t mess up the metadata!  Leave this stuff to the experts!”  But the trial lawyer in me wonders how a solo/small firm practitioner in a run-of-the-mill case is supposed to tell a client, “Sorry, the courts are closed to you because you can’t afford e-discovery experts.”

Most evidence today is electronic, so curtailing discovery of electronic evidence isn’t an option, and trying to stick with paper is a dead end.  We’ve got to deal with electronic evidence in small cases, too.  Sometimes, that means doing it yourself.

Data Recovery: Lessons from Katrina, Revisited

31 Saturday Jan 2015

Posted by craigball in Computer Forensics, General Technology Posts

≈ 2 Comments

This is the twelfth in a series revisiting Ball in Your Court columns and posts from the primordial past of e-discovery, updating and critiquing in places, and hopefully restarting a few conversations.  As always, your comments are gratefully solicited.

Data Recovery: Lessons from Katrina

[Originally published in Law Technology News, April 2006]

When the sea reclaimed New Orleans and much of the Gulf Coast, hundreds of lawyers saw their computers and networks submerged.  Rebuilding law practices entailed Herculean efforts to resurrect critical data stored on the hard drives in sodden machines.

Hard drives operate within such close tolerances that a drop of water or particle of silt that works its way inside can cripple them; yet, drives aren’t sealed mechanisms.  Because we use them from the beach to the mountains, drives must equalize air pressure through filtered vents called “breather holes.”  Under water, these breather holes are like screen doors on a submarine.  When Hurricane Katrina savaged thousands of systems, those with the means and motivation turned to data recovery services for a second chance.

Locard’s Principle, Revisited

27 Tuesday Jan 2015

Posted by craigball in Computer Forensics, E-Discovery, General Technology Posts

≈ 3 Comments

This is the tenth in a series revisiting Ball in Your Court columns and posts from the primordial past of e-discovery, updating and critiquing in places, and hopefully restarting a few conversations.  As always, your comments are gratefully solicited.

Locard’s Principle

[Originally published in Law Technology News, February 2006]

Devoted viewers of the TV show “CSI” know about Locard’s Exchange Principle: the theory that anyone entering a crime scene leaves something behind or takes something away.  It’s called cross-transference, and though it brings to mind fingerprints, fibers and DNA, it applies to electronic evidence, too.  The personal computer is Grand Central Station for smart phones, thumb drives, MP3 players, CDs, floppies, printers, scanners and a bevy of other gadgets.  Few systems exist in isolation from networks and the Internet.  When these connections are used for monkey business like stealing proprietary data, the electronic evidence left behind or carried away can tell a compelling story.

The Path to E-Mail Production IV, Revisited

22 Thursday Jan 2015

Posted by craigball in Computer Forensics, E-Discovery, General Technology Posts

≈ 1 Comment

This is the ninth in a series revisiting Ball in Your Court columns and posts from the primordial past of e-discovery, updating and critiquing in places, and hopefully restarting a few conversations.  As always, your comments are gratefully solicited.

The Path to Production: Are We There Yet?

(Part IV of IV)

[Originally published in Law Technology News, January 2006]

The e-mail’s assembled and accessible.  You could begin review immediately, but unless your client has money to burn, there’s more to do before diving in: de-duplication. When Marge e-mails Homer, Bart and Lisa, Homer’s “Reply to All” goes in both Homer’s Sent Items and Inbox folders, and in Marge’s, Bart’s and Lisa’s Inboxes.  Reviewing Homer’s response five times is wasteful and sets the stage for conflicting relevance and privilege decisions.

Duplication problems compound when e-mail is restored from backup tape.  Each tape is a snapshot of e-mail at a moment in time.  Because few users purge mailboxes month-to-month, one month’s snapshot holds nearly the same e-mail as the next.  Restore a year of e-mail from monthly backups, and identical messages multiply like rabbits.
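
The Marge-and-Homer scenario above can be sketched in a few lines of Python. This is a hedged illustration, not a description of how any e-discovery tool de-duplicates: the choice of fields to hash (sender, sorted recipients, sent time, subject, body), the `message_key` and `deduplicate` helpers, and the sample addresses and dates are all assumptions made for the example.

```python
import hashlib

def message_key(msg):
    """Hash selected, normalized fields so copies of one message share a key."""
    parts = [
        msg["from"].strip().lower(),
        ";".join(sorted(addr.strip().lower() for addr in msg["to"])),
        msg["sent"],
        msg["subject"].strip(),
        msg["body"].strip(),
    ]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()

def deduplicate(collected):
    """Keep the first copy of each message and note every place it turned up."""
    unique = {}
    found_in = {}
    for location, msg in collected:
        key = message_key(msg)
        unique.setdefault(key, msg)
        found_in.setdefault(key, []).append(location)
    return unique, found_in

# Homer's "Reply to All" (addresses, date and text are made up).
reply = {
    "from": "homer@example.com",
    "to": ["marge@example.com", "homer@example.com",
           "bart@example.com", "lisa@example.com"],
    "sent": "2006-01-05T09:30:00",
    "subject": "RE: Dinner",
    "body": "Sounds good to me.",
}

# The five places the same reply lands, per the scenario above.
collected = [
    ("Homer/Sent Items", dict(reply)),
    ("Homer/Inbox", dict(reply)),
    ("Marge/Inbox", dict(reply)),
    ("Bart/Inbox", dict(reply)),
    ("Lisa/Inbox", dict(reply)),
]

unique, found_in = deduplicate(collected)
print(len(unique))                         # 1
print(len(next(iter(found_in.values()))))  # 5
```

Recording every location alongside the single retained copy means one review decision can be applied to all five copies without losing track of who held the message. The same approach collapses the repeats that pile up when successive monthly snapshots are restored.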

The Path to E-Mail Production III, Revisited

21 Wednesday Jan 2015

Posted by craigball in Computer Forensics, E-Discovery, General Technology Posts

≈ 1 Comment

This is the eighth in a series revisiting Ball in Your Court columns and posts from the primordial past of e-discovery, updating and critiquing in places, and hopefully restarting a few conversations.  As always, your comments are gratefully solicited.

The Path to Production: Harvest and Population

(Part III of IV)

[Originally published in Law Technology News, December 2005]

On the path to production, we’ve explored e-mail’s back alleys and trod the mean streets of the data preservation warehouse district.  Now, let’s head to the heartland for harvest time.  It’s data harvest time.

After attorney review, data harvest is byte-for-byte the costliest phase of electronic data discovery.  Scouring servers, local hard drives and portable media to gather files and metadata is an undertaking no company wants to repeat because of poor planning.

The Harvest
Harvesting data demands a threshold decision: Do you collect all potentially relevant files, then sift for responsive material, or do you separate the wheat from the chaff in the field, collecting only what reviewers deem responsive?  When a corporate defendant asks employees to segregate responsive e-mail, (or a paralegal goes from machine-to-machine or account-to-account selecting messages), the results are “field filtered.” Today, we’d call this “targeted collection.”

Field filtering holds down cost by reducing the volume for attorney review, but it increases the risk of repeating the collection effort, loss or corruption of evidence and inconsistent selections.  If keyword or concept searches alone are used to field filter data, the risk of under-inclusive production skyrockets.

Initially more expensive, comprehensive harvesting (unfiltered but defined by business unit, locale, custodian, system or medium), saves money when new requests and issues arise.  A comprehensive collection can be searched repeatedly at little incremental expense, and broad preservation serves as a hedge against spoliation sanctions.  Companies embroiled in serial litigation or compliance production benefit most from comprehensive collection strategies.

A trained reviewer “picks up the lingo” as review proceeds, but a requesting party can’t frame effective keyword searches without knowing the argot of the opposition.  Strategically, a producing party requires an opponent to furnish a list of search terms for field filtering and seeks to impose a “one list, one search” restriction.  The party seeking discovery must either accept inadequate production or force the producing party back to the well, possibly at the requesting party’s cost.

The Path to E-Mail Production I, Revisited

19 Monday Jan 2015

Posted by craigball in Computer Forensics, E-Discovery, General Technology Posts

This is the sixth in a series revisiting Ball in Your Court columns and posts from the primordial past of e-discovery, updating and critiquing in places, and hopefully restarting a few conversations.  As always, your comments are gratefully solicited.

The Path to E-Mail Production

(Part I of IV)

[Originally published in Law Technology News, October 2005]

Asked, “Is sex dirty,” Woody Allen quipped, “Only if it’s done right.”  That’s electronic discovery: if it’s ridiculously expensive, enormously complicated and everyone’s lost sight of the merits of the case, you’re probably doing it right.

But it doesn’t have to be that way.  Over the next four days, we’ll walk a path to production of e-mail — perhaps the trickiest undertaking in EDD.  The course we take may not be the shortest or easiest, but that’s not the point.  We’re trying to avoid stepping off a cliff.  Not every point is suited to every production effort, but all deserve consideration.

Think Ahead
EDD missteps are painfully expensive, or even unredeemable, if data is lost. Establish expectations at the outset.

Will the data produced:

  • Integrate paper and electronic evidence?
  • Be electronically searchable?
  • Preserve all relevant metadata from the host environment?
  • Be viewable and searchable using a single application, such as a web browser?
  • Lend itself to Bates numbering?
  • Be easily authenticable for admission into evidence?

Meeting these expectations hinges on what you collect along the way through identification, preservation, harvest and population.
