Dem Phones, Dem Phones, Dem iPhones

29 Monday Sep 2014

Posted by craigball in Computer Forensics, E-Discovery

I am not a dinosaur. Except that I prefer e-mail to texting, and I forget that my students have never used a record player or lived without the Internet. I’m not near the national average of 14 daily visits to Facebook, and I’ve yet to text a photo of my genitals–a practice so routine that it has a name, “junk shots,” and its very own app, “Snapchat.” When I need to know how to turn off a nagging dashboard light, I prefer written instructions over YouTube, and I do not video every concert and papal investiture I attend. I still have two landline phone numbers.

Omigosh! That last one. I AM a dinosaur!

According to the U.S. Centers for Disease Control and Prevention, more than 41% of American households have no landline phone, relying on wireless service alone. For those between the ages of 25 and 29, two-thirds are wireless-only. Per an IDC report sponsored by Facebook, four out of five people start using their smartphones within 15 minutes of waking up and, for most, it’s the very first thing they do, ahead of brushing their teeth or answering nature’s call.

I cite these astonishing statistics to underscore a tendency in e-discovery to seek information in those places where we’ve grown comfortable despite compelling evidence that relevant information is elsewhere. I’ve written on this “Streetlight Effect” before (at p. 252 of this collection of articles), in the context of the blind eye long turned to shortcomings of keyword search. The latest manifestation is graver still, and will make for a perilous future if we do not rise to the challenge now.

I speak of the rapid accretion of unique, relevant data on mobile devices that has greatly outstripped our ability (or willingness) to preserve and process same. Look around you. Do you see the look-down generation out there? Why do you suppose the person in front of you on the jetway is walking so #$%^& slowly?

Apple just sold ten million units of its latest iPhone. Ten million. In a week. How many of those purchasers sought a better device for making phone calls? Did Apple even hint it had improved the phone as a phone? No siree, Bob! Continue reading →

A Guide to Forms of Production

19 Monday May 2014

Posted by craigball in Computer Forensics, E-Discovery, Uncategorized

Semiannually, I compile a primer on some key aspect of electronic discovery. In the past, I’ve written on computer forensics, backup systems, metadata and databases. For 2014, I’ve completed the first draft of the Lawyers’ Guide to Forms of Production, intended to serve as a primer on making sensible and cost-effective specifications for production of electronically stored information. It’s the culmination and re-purposing of much that I’ve written on forms heretofore, along with new material extolling the advantages of native and near-native forms.

Reviewing the latest draft, I see much I want to add and reorganize; accordingly, it will be a work-in-progress for months to come. Consider it a “public comment” version. The linked document includes exemplar verbiage for requests and model protocols for your adaption and adoption. I plan to add more forms and examples. Continue reading →

Becoming a Better Digital Forensics Witness

03 Monday Mar 2014

Posted by craigball in Computer Forensics

I love to testify—in court, at deposition, in declarations and affidavits—and I even like writing reports about my findings in forensic exams.

I love the challenge—the chance to mix it up with skilled interrogators, defend my opinions and help the decision makers hear what the electronic evidence tells us. There is a compelling human drama being played out in those bits and bytes, and computer forensic examiners are the fortunate few who get to tell the story. It’s our privilege to help the finders of fact understand the digital evidence.[1]

This post is written for computer forensic examiners and outlines ways to become a more effective witness and avoid common pitfalls. But the advice offered applies as well to almost anyone who takes the stand. Continue reading →

Query the Quintessential Quintet

20 Monday Jan 2014

Posted by craigball in Computer Forensics, E-Discovery, General Technology Posts

On Wednesday, February 5, 2014 at 9:00am, I’m moderating a plenary session at LegalTech New York where the panelists are a veritable Mount Olympus of e-discovery leaders from the federal bench: John Facciola, James Francis, Andrew Peck, Lee Rosenthal and Shira Scheindlin. I can hardly imagine a more quintessential quintet of rare knowledge and eloquence! Kudos to ALM educational coordinator, Judy Kelly, for deftly getting them all to commit.

The judges will be discussing some of what you might expect, e.g., proposed Rules amendments, predictive coding, Rule 502 and expectations of lawyer technical competence. We will also be exploring a few fresh issues, like the impact all those little screens are having on everyone in and out of court.

There’s still time to add topics and questions of interest to you to the program; so, if you have questions you’d pose or topics you’d explore, please share them here as a comment (or e-mail them to me: craig at ball dot net), and I’ll try to work them in. Hope to see you in New York!

Thanks. Can You Do Me a Favor Please?

19 Sunday Jan 2014

Posted by craigball in Computer Forensics, E-Discovery, General Technology Posts, Personal, Uncategorized

Sorry to take your time asking for help, so I’ll be quick about it.

But first, thank you. Thanks to you, dear reader, this blog and its 85 posts reached 100,000 views a few days ago. That’s nothing compared to the millions of page views others see, but it’s very gratifying to me because I launched this blog without saying a word to anyone. Somehow, you just found it. Ball in Your Court is an outlet born of frustration with the two-month publication lag attendant to my former print column and the sudden shuttering of an American Lawyer Media blog where I’d previously posted. I wanted a place where no one could pull the plug but you or me. This blog is a very personal connection to you.

The favor I ask is this: if you like the content here or find it of some value, please share it with someone you think might be interested. If you have a blog or site with a blogroll, please consider adding Ball in Your Court to your blogroll. I will try to earn my place on your page and in your day. Thanks.

Revisiting ‘How Many Documents in a Gigabyte?’

15 Wednesday Jan 2014

Posted by craigball in Computer Forensics, E-Discovery

I once wrote a column titled “Page Equivalency and Other Fables.” It lambasted lawyers who larded their burden arguments with bogus page equivalencies like, “everyone knows a gigabyte of data equates to a pile of printed pages that would reach from Uranus to Earth.” We still see wacky page equivalencies, and “from Uranus” still aptly describes their provenance.

Back in 2007, I wrote, “It’s comforting to quantify electronically stored information as some number of pieces of paper or bankers’ boxes. Paper and lawyers are old friends. But you can’t reliably equate a volume of data with a number of pages unless you know the composition of the data. Even then, it’s a leap of faith.”

So, I’m happy to point you to some notable work by my friend, John Tredennick. I’ve known John since the emerging technology was fire and watched with awe and admiration as John transitioned from old-school trial lawyer to visionary forensic technology entrepreneur running e-discovery service provider, Catalyst. John is as close to a Renaissance man as anyone I know in e-discovery, and when John speaks, I listen.

Lately, John Tredennick shared some revealing metrics on the Catalyst blog looking at the relationship between data and document volumes, an update to his 2011 article called, How Many Documents in a Gigabyte? John again examines document volumes seen in the data that Catalyst receives and processes for its customers and, crucially, parses the data by file type. As the results bear out, the forms of the data still make an enormous difference in terms of data volume. Even as between documents we think of as being “the same” (like Word .doc and .docx formats), the differences are striking.

For example, John’s data suggests that there are almost 60% more documents in a gigabyte of Word files in the .docx format (7,085) than in a gigabyte of files stored in the predecessor .doc format (4,472). This makes sense because the newer .docx format incorporates zip compression, and text is highly compressible data.

[One exercise I require of the law students in my E-discovery class is to look at the file header of a Word .docx file to note its binary signature, PK, characteristic of a zip-compressed file and short for Phil Katz, creator of the zip file format. For grins, you can change the file extension of a .docx file to .zip and open it to see what a Word document really looks like under the hood. Hint: it’s in XML.]
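For readers who would rather script the exercise than open a hex viewer, here is a minimal Python sketch. It does not touch a real Word file: it builds a tiny stand-in zip in memory, laid out loosely like a .docx, so the sample runs anywhere. The function names and the stand-in contents are my own illustrative assumptions, not anything Word produces.

```python
import io
import zipfile

# The four bytes that open the first local file header of a non-empty zip archive.
ZIP_SIGNATURE = b"PK\x03\x04"


def looks_like_zip(data):
    """Return True if the data begins with the zip local file header signature."""
    return data[:4] == ZIP_SIGNATURE


def build_stand_in_docx():
    """Build a small zip in memory shaped like a .docx (NOT a valid Word document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<w:document>Hello</w:document>")
    return buffer.getvalue()


def list_parts(data):
    """List the names of the parts stored inside the zip container."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


sample = build_stand_in_docx()
print(sample[:2])              # the two signature letters
print(looks_like_zip(sample))  # whether the header matches
print(list_parts(sample))      # the XML parts inside the container
```

To try it on a genuine file, read the file's bytes with open(path, "rb") and pass them to the same two functions.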

John reports a similar discrepancy between new and old Excel spreadsheet formats (1,883 .xlsx files per gigabyte versus 1,307 for .xls). Here again, the .xlsx format builds in zip compression.
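To make the arithmetic behind these comparisons concrete, here is a minimal Python sketch using only the per-gigabyte counts quoted above. It assumes a binary gigabyte (1,024 cubed bytes), which John’s post may or may not use, so treat the average sizes as rough. The helper names are mine, purely for illustration.

```python
# Assumption: a gigabyte is 1024**3 bytes; the source figures do not say which definition applies.
GIGABYTE = 1024 ** 3

# Files per gigabyte as quoted in this post.
files_per_gb = {".doc": 4472, ".docx": 7085, ".xls": 1307, ".xlsx": 1883}


def average_size_kb(count):
    """Average file size in kilobytes implied by a files-per-gigabyte count."""
    return GIGABYTE / count / 1024


def percent_more(newer, older):
    """How many percent more files fit in a gigabyte in the newer format."""
    return (newer - older) / older * 100


for old, new in [(".doc", ".docx"), (".xls", ".xlsx")]:
    gain = percent_more(files_per_gb[new], files_per_gb[old])
    print(
        f"{new}: {gain:.0f}% more files per GB than {old}; "
        f"average size about {average_size_kb(files_per_gb[new]):.0f} KB "
        f"versus {average_size_kb(files_per_gb[old]):.0f} KB"
    )
```

The Word figures work out to about 58% more files per gigabyte, which squares with “almost 60%,” and the Excel figures to about 44%.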

But the results are reversed when it comes to PowerPoint presentations, with John finding that there are marginally fewer of the newer .pptx files in a gigabyte (505) than the older .ppt format files (580). This makes sense to me because Microsoft phased out the .ppt format as the default roughly seven years ago, with Office 2007. Since then, presenters have gotten better about adding visual enhancements to deadly-dull PowerPoints, and they tend to add ‘fatter’ components like video clips. The biggest factor is that pictures in common image formats (e.g., .jpg images) have always been compressed, so they are highly incompressible. Compressing data that’s already compressed tends to increase, not decrease, its size.
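The point about compressing already-compressed data is easy to check for yourself. The following Python sketch is only an illustration under stated assumptions: it uses random bytes as a stand-in for the contents of a compressed image (real .jpg data is not truly random, but it behaves similarly under zip-style compression), and it uses Python’s zlib module, which implements the same DEFLATE method commonly used inside zip files.

```python
import os
import zlib

# Repetitive plain text: the kind of content that compresses very well.
text = ("The quick brown fox jumps over the lazy dog. " * 2000).encode("utf-8")

# Assumption: random bytes stand in for already-compressed image data.
already_compressed = os.urandom(len(text))


def compression_ratio(data):
    """Compressed size divided by original size; below 1.0 means the data shrank."""
    return len(zlib.compress(data, 9)) / len(data)


print(f"text ratio:               {compression_ratio(text):.3f}")
print(f"already-compressed ratio: {compression_ratio(already_compressed):.3f}")
```

The text shrinks to a small fraction of its size, while the stand-in for compressed data comes out slightly larger than it went in, because the compressor can find nothing to squeeze and still has to add its own bookkeeping bytes.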

Wisely, John speaks only of document volumes and makes no effort to project page equivalencies, not even by extrapolating some postulated ‘average-pages-per-file type.’ Anything like that would be as insupportable today as it was when I wrote about it in 2007. Also, when you look at John’s post, note that there is no data supplied concerning TIFF images. I’m not sure why, but I can promise you this: TIFF images are MUCH fatter files, costing far more in terms of storage space and ingestion costs than their native counterparts. Had John added TIFF to the mix, I’m confident his weighted averages would have been much different…and far less useful–much like TIFF images as a form of production. 😉

Transparency of Process No Peril to Work Product

16 Monday Dec 2013

Posted by craigball in Computer Forensics, E-Discovery, Uncategorized

I’m rarely moved to criticize the work of other commentators because, even when I don’t share their views, I applaud the airing of the issues their efforts bring. But sometimes a proposition is just so blatantly ill-advised, so prone to unfairly tilt the litigation playing field, that any reader and every writer should stop and say, “Wait a second….” One such article, currently running in the New York Law Journal and called No Disclosure: Why Search Terms Are Worthy of Court’s Protection, charges that judges who require disclosure of search terms “discount or misunderstand” what the authors term the “protected nature of key aspects of the e-discovery process,” namely filtering of data by use of search terms. The authors think that disclosure of search terms used to exclude data from disclosure compromises the work product privilege and argue that judges should “recognize that a search term is more than a collection of words, rather, the culmination of an attorney’s interaction with the facts of the case.”

Espousing the sanctity of work product privilege to an audience of litigators is like saying, “I support our troops.” It’s mom, baseball and apple pie. It’s also popular to paint judges as addled abusers of discretion. But let’s not let jingoism displace judgment. Search terms are precisely what the authors claim they are not: search terms are a collection of words. They are lexical filters. Nothing more.

Search terms deserve no more protection from disclosure than date ranges, file types and other mechanical means employed to exclude data from scrutiny. Search terms strip out information that will never see the light of day nor benefit from the application of lawyer judgment as to its relevance. In that sense, search terms are anathema to the core principles of work product and warrant more, not less, scrutiny. Continue reading →

It’s the Parties’ Data, Stupid!

03 Tuesday Dec 2013

Posted by craigball in Computer Forensics, E-Discovery

As the curtain comes down on 2013, I’m reflecting on where the weeks went. This was the year of fights about forms; months spent endeavoring to persuade courts, opponents (and even my clients) that lawyers and judges have been peering into the wrong end of the telescope when it comes to forms of production. We must stop focusing on the feeble forms lawyers use for review, and concentrate on the robust forms that parties use for everything else.

In discovery and disclosure we seek information from parties and third parties. We want the data used and created by, for and about parties and third parties relating to the actions they took or didn’t take. We don’t pursue discovery/disclosure against the lawyers in the case. If we tried, our efforts would be confounded by claims of attorney-client privilege and attorney work product. Apart from pro se lawyers with fools for clients, attorneys aren’t parties, and attorneys aren’t witnesses. The forms your opposing counsel uses for review shouldn’t matter. Discovery and disclosure are party-centric, not attorney-centric.

Ask parties about the forms of ESI they use daily and it’s doubtful you’ll hear a peep about TIFF images or load files. Parties don’t use that junk; only Luddite lawyers do. Clients use spreadsheet programs, word processors, mail and messaging applications and databases, to name a few. When they create, communicate and collaborate, they do it using forms geared to native applications with file extensions like .XLSX, .DOCX, .PPTX, .MSG, etc. They choose and use functional and complete native and near-native forms. Those are the forms witnesses consult to reconstruct events and refresh their memories. Those are the forms witnesses recognize at deposition and in trial. Continue reading →

Collecting Gmail for Preservation

29 Tuesday Oct 2013

Posted by craigball in Computer Forensics, E-Discovery

I’m surprised how frequently I’m engaged to collect the contents of Gmail accounts in e-discovery, especially when the account is being collected solely for preservation, and there’s no compelling reason to entrust the task to a neutral. I appreciate that hiring an expert offers greater assurance that the task will be approached with skill and experience, as well as that integrity of process can be supported by the testimony of someone unconnected with the client or law firm. But, though collecting and validating the complete contents of a Gmail account can be tricky and tedious, it’s not all that difficult to do. Happily, unless you do something really dumb, it’s unlikely that even a botched Gmail collection effort will harm the contents of the account.

For those seeking a low-cost, defensible mechanism to preserve Gmail content, this (long, dry) post lays out a detailed methodology for collection and preservation of the contents of a Gmail webmail account in the static form of a standard Outlook PST container file. I will address various technical considerations, but few legal ones. Whether or not the methods described in this post are legally sufficient in your case or compliant with Gmail’s terms of service is not my call, and I offer no opinions about same.

[NOTE TO READERS 10/14/14: When I wrote this post, there was not yet a backup capability built into Gmail. Google now makes data tools available that support the creation of a rich archive of a user’s Google content, including Gmail, Contacts, Calendar and Google Drive. You can find it in the Archive section of https://www.google.com/settings/datatools when logged into Google and can read more about it here.]

Continue reading →

4 Sale: Fixer Upper in Potemkin Village

13 Friday Sep 2013

Posted by craigball in Computer Forensics, E-Discovery, General Technology Posts

This morning, as I so often do, I met with some nice folks touting a new e-discovery product. As we talked, I couldn’t help but recall Lover Come Back, a goofy Mad Men-era flick about an ad executive who mounts a glitzy campaign for a product that doesn’t exist. The movie starred Rock Hudson, Doris Day and Tony Randall, and was fun; the product briefing less so.

Without offering sufficient detail to identify the product, let me say that it’s one of those that come on the scene before every ILTA or LegalTech, with catchy names, slick brochures and ambitious development timelines. These upstarts claim to offer groundbreaking features and pricing that always turn out to be much the same groundbreaking features and pricing offered by last year’s new kid on the block. Names we recognize from other products and vendors attach themselves to these ventures, and it all seems like an honest-to-goodness business save for one teeny tiny wrinkle: the promised product doesn’t exist.

Behind the scenes of this powerful end-to-end dynamo are people using a competitor’s tool and painstakingly positioning the output so that it seems like the product really delivers. It’s not meant to deceive because beneath the marketing lies a heartfelt intent to build the product as soon as enough people commit to buy it and cash begins to flow. In this field of dreams, if they come, we will build it.

I don’t know. Maybe this is how great products are born nowadays. Perhaps it’s all about hype, and it doesn’t matter if the product follows the deal or the deal follows the product. But, I don’t think a product pitch should recall Empress Catherine II admiring the false fronts of Disneyesque villages erected by her lover, Potemkin, or of late, the photos of thriving businesses placed in vacant storefronts to downplay economic doldrums to those attending the 2013 G8 Summit in Enniskillen, Northern Ireland.

Vendors: I like to look at your products, I really do. I ask this of you in return. If you are going to show me something, it should exist now, not “maybe in the next release.” If you claim your product can do something, it should be able to do it, and not only in a contrived demo against a handful of sanitized Enron documents. Your pricing should be clear and reflect real world experience, not the costs paid by those who don’t need you to actually do anything. And if you can’t direct me to a satisfied customer who regularly uses your product, don’t tell me it’s because you’re guarding client confidentiality. Instead, please change my litter, fill my water bottle and put pellets in my dish, so I can get back to being a guinea pig.

Ball in your Court

~ Musings on e-discovery & forensics.

Category Archives: Computer Forensics