Ball in your Court

~ Musings on e-discovery & forensics.

Monthly Archives: January 2014

Query the Quintessential Quintet

20 Monday Jan 2014

Posted by craigball in Computer Forensics, E-Discovery, General Technology Posts

≈ 1 Comment

On Wednesday, February 5, 2014 at 9:00am, I’m moderating a plenary session at LegalTech New York where the panelists are a veritable Mount Olympus of e-discovery leaders from the federal bench: John Facciola, James Francis, Andrew Peck, Lee Rosenthal and Shira Scheindlin.  I can hardly imagine a more quintessential quintet of rare knowledge and eloquence!  Kudos to ALM educational coordinator, Judy Kelly, for deftly getting them all to commit.

The judges will be discussing some of what you might expect, e.g., proposed Rules amendments, predictive coding, Rule 502 and expectations of lawyer technical competence.  We will also be exploring a few fresh issues, like the impact all those little screens are having on everyone in and out of court.

There’s still time to add topics and questions of interest to you to the program; so, if you have questions you’d pose or topics you’d explore, please share them here as a comment (or e-mail them to me: craig at ball dot net), and I’ll try to work them in.  Hope to see you in New York!

Thanks. Can You Do Me a Favor Please?

19 Sunday Jan 2014

Posted by craigball in Computer Forensics, E-Discovery, General Technology Posts, Personal, Uncategorized

≈ 5 Comments

Sorry to take your time asking for help, so I’ll be quick about it.

But first, thank you.  Thanks to you, dear reader, this blog and its 85 posts reached 100,000 views a few days ago.  That’s nothing compared to the millions of page views others see, but it’s very gratifying to me because I launched this blog without saying a word to anyone.  Somehow, you just found it.  Ball in Your Court is an outlet born of frustration with the two-month publication lag attendant to my former print column and the sudden shuttering of an American Lawyer Media blog where I’d previously posted.  I wanted a place where no one could pull the plug but you or me.  This blog is a very personal connection to you.

The favor I ask is this:  if you like the content here or find it of some value, please share it with someone you think might be interested.  If you have a blog or site with a blogroll, please consider adding Ball in Your Court to your blogroll.  I will try to earn my place on your page and in your day.  Thanks.

Forms that Function

16 Thursday Jan 2014

Posted by craigball in E-Discovery

≈ 3 Comments

Over the course of the last decade, it’s been a Sisyphean task to get lawyers to lay aside rigid ideas about forms of production in e-discovery and focus on selecting forms that function.

“Forms that function.”  Forms of production that work.

Ever since the demanding class, “Architecture for Non-Architects” at Rice University, I’ve been a wannabe architect, and the battle cry, “form follows function,” my mantra.  It’s ascribed to Louis Sullivan, legendary American architect and Father of the Skyscraper.  “Form follows function” fairly defines what we think of as “modern,” and it’s a credo at the heart of the clearest idea I’ve had in a while, being that we should produce e-mail in forms that can be made to function in common e-mail client programs like Microsoft Outlook.

I don’t point to Outlook because I think it a suitable review platform for ESI (I don’t, though many use it that way).  I point to Outlook because it’s ubiquitous and, if a message is produced in a form that can be imported into Outlook, it’s a form likely to be searchable, sortable, utile and complete.  More, it’s a form that anyone can assimilate into whatever review platform they wish at lowest cost.

The criterion, “Will the form produced function in an e-mail client?” enables parties to explore a broad range of functional native and near-native forms, not just PSTs.  It’s an objective “acid test” to determine if e-mail will be produced in a reasonably usable form; that is, a form not too far degraded from the way the data is used by the parties and witnesses in the ordinary course.

Forms that Function retain essential features like Fielded Data, allowing users to reliably sort messages by date, sender, recipients and subject, as well as Message IDs, supporting the threading of messages into coherent conversations.  Forms that Function supply the UTC Offset Data within e-mails that allows messages originating from different time zones and using different Daylight Saving Time settings to be normalized across an accurate timeline.  Forms that Function don’t disrupt the Family Relationships between messages and attachments.  Forms that Function are inherently electronically searchable.
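To make the point about fielded data concrete, here is a minimal sketch in Python using only the standard library. The two messages, their addresses and their Message-IDs are invented for illustration; the assumption is simply that a functional form preserves the standard Date, Message-ID and In-Reply-To header fields, so a few lines of code can normalize send times to UTC and tie a reply to its parent.

```python
from datetime import timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

# Two hypothetical messages (invented for this sketch). The reply carries an
# EARLIER wall-clock time than the original because it was sent from a
# different time zone.
RAW_MESSAGES = [
    "From: bob@example.com\n"
    "To: alice@example.com\n"
    "Subject: Re: Lunch?\n"
    "Date: Mon, 20 Jan 2014 09:20:00 -0600\n"
    "Message-ID: <b2@example.com>\n"
    "In-Reply-To: <a1@example.com>\n"
    "\n"
    "Noon works.\n",
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Lunch?\n"
    "Date: Mon, 20 Jan 2014 16:15:00 +0100\n"
    "Message-ID: <a1@example.com>\n"
    "\n"
    "Does noon work?\n",
]


def summarize(raw):
    """Pull the fielded data out of one raw message."""
    msg = message_from_string(raw)
    sent = parsedate_to_datetime(msg["Date"])  # keeps the UTC offset
    return {
        "message_id": msg["Message-ID"],
        "in_reply_to": msg["In-Reply-To"],  # None when the header is absent
        "sender": msg["From"],
        "subject": msg["Subject"],
        "sent_utc": sent.astimezone(timezone.utc),
    }


# Sorting on the normalized time puts the original ahead of its reply,
# which sorting on the local wall-clock times would not.
records = sorted((summarize(r) for r in RAW_MESSAGES), key=lambda r: r["sent_utc"])

for rec in records:
    parent = rec["in_reply_to"] or "(starts thread)"
    print(rec["sent_utc"].isoformat(), rec["sender"], rec["subject"], "reply to:", parent)
```

Strip the UTC offset or the Message-ID fields, as flattened image-and-load-file productions often do, and neither the sort nor the threading can be done reliably.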

Best of all, producing Forms that Function means that all parties receive data in a form that anyone can use in any way they choose, visiting the costs of converting to alternate forms on the parties who want those alternate forms and not saddling parties with forms so degraded that they are functionally fractured and broken.

If you are a requesting party, don’t be bamboozled by an alphabet soup of file extensions when it comes to e-mail production (PST, OST, MSG, EML, DBX, NSF, MHTML, TIFF, PDF, RTF, TXT, DAT, XML).  Instead, tell the other side, “I want Forms that Function.  If it can be imported into Microsoft Outlook and work, that form will be fine by me.”

If the other side says, “We will pull all that information out of the messages and give it to you in a load file,” say, “No thanks, leave it where it lays, and give it to me in a Form that Functions!”

Revisiting ‘How Many Documents in a Gigabyte?’

15 Wednesday Jan 2014

Posted by craigball in Computer Forensics, E-Discovery

≈ 6 Comments

I once wrote a column titled “Page Equivalency and Other Fables.”  It lambasted lawyers who larded their burden arguments with bogus page equivalencies like, “everyone knows a gigabyte of data equates to a pile of printed pages that would reach from Uranus to Earth.”  We still see wacky page equivalencies, and “from Uranus” still aptly describes their provenance.

Back in 2007, I wrote, “It’s comforting to quantify electronically stored information as some number of pieces of paper or bankers’ boxes.  Paper and lawyers are old friends.  But you can’t reliably equate a volume of data with a number of pages unless you know the composition of the data.  Even then, it’s a leap of faith.”

So, I’m happy to point you to some notable work by my friend, John Tredennick.  I’ve known John since the emerging technology was fire and watched with awe and admiration as John transitioned from old-school trial lawyer to visionary forensic technology entrepreneur running e-discovery service provider, Catalyst.  John is as close to a Renaissance man as anyone I know in e-discovery, and when John speaks, I listen.

Lately, John Tredennick shared some revealing metrics on the Catalyst blog looking at the relationship between data and document volumes, an update to his 2011 article called “How Many Documents in a Gigabyte?”  John again examines document volumes seen in the data that Catalyst receives and processes for its customers and, crucially, parses the data by file type.  As the results bear out, the forms of the data still make an enormous difference in terms of data volume.  Even as between documents we think of as being “the same” (like Word .doc and .docx formats), the differences are striking.

For example, John’s data suggests that there are almost 60% more documents in a gigabyte of Word files in the .docx format (7,085) than in a gigabyte of files stored in the predecessor .doc format (4,472).  This makes sense because the newer .docx format incorporates zip compression, and text is highly compressible data.

[One exercise I require of the law students in my E-discovery class is to look at the file header of a Word .docx file to note its binary signature, PK, characteristic of a zip-compressed file and short for Phil Katz, author of the zip compression algorithm.  For grins, you can change the file extension of a .docx file to .zip and open it to see what a Word document really looks like under the hood.  Hint: it’s in XML].
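The same exercise can be sketched in a few lines of Python. This is a sketch under stated assumptions, not a forensic tool: rather than opening a real Word file, it builds a tiny stand-in zip archive in memory containing a single hypothetical part named word/document.xml, then checks for the PK signature and lists what is inside. The helper names are invented for illustration.

```python
import io
import zipfile

ZIP_SIGNATURE = b"PK\x03\x04"  # local file header signature at the start of a zip


def looks_like_zip(data):
    """True if the bytes begin with the PK zip signature."""
    return data[:4] == ZIP_SIGNATURE


def list_parts(data):
    """Return the names of the files stored inside a zip container."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


def build_stand_in():
    """Build a small in-memory zip that mimics the shape of a .docx.

    This is NOT a valid Word document, only a stand-in for the demo.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/document.xml", "<document><body>Hello</body></document>")
    return buf.getvalue()


sample = build_stand_in()
print(sample[:2])              # the two signature bytes
print(looks_like_zip(sample))
print(list_parts(sample))
```

To try it on an actual document, read the file's bytes with open(path, "rb").read() and pass them to the same two functions. An older .doc file should fail the signature check, since that format is not a zip container.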

John reports a similar discrepancy between new and old Excel spreadsheet formats (1,883 .xlsx files per gigabyte versus 1,307 for .xls).  Here again, the .xlsx format builds in zip compression.
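For anyone who wants to check the arithmetic, a quick sketch follows. It uses only the per-gigabyte counts quoted above; the function name is invented for illustration.

```python
def percent_more(newer, older):
    """Percentage by which the newer per-gigabyte count exceeds the older one."""
    return (newer / older - 1) * 100


# (newer format count, older format count) per gigabyte, as quoted above
counts = {
    "Word (.docx vs .doc)": (7085, 4472),
    "Excel (.xlsx vs .xls)": (1883, 1307),
}

for label, (newer, older) in counts.items():
    print(f"{label}: {percent_more(newer, older):.1f}% more files per gigabyte")
```

The Word figure comes out a little under 60%, and the Excel gap is narrower, in the mid-40s.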

But the results are reversed when it comes to PowerPoint presentations, with John finding that there are marginally fewer of the newer .pptx files in a gigabyte (505) than the older .ppt format files (580).  This makes sense to me because Microsoft began phasing out the .ppt format about seven years ago, with Office 2007.  Since then, presenters have gotten better about adding visual enhancements to deadly-dull PowerPoints, and they tend to add ‘fatter’ components like video clips.  The biggest factor is that pictures compress poorly, because common image formats (e.g., .jpg images) have always been compressed.  Compressing data that’s already compressed tends to increase, not decrease, its size.
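The effect is easy to demonstrate with Python's zlib module, which implements the same DEFLATE method that zip commonly uses. In this sketch, pseudo-random bytes stand in for already-compressed image data (an assumption, since a real .jpg is not literally random, but it is similarly hard to squeeze further), and a made-up run of words stands in for document text.

```python
import random
import zlib

rng = random.Random(42)  # fixed seed so the demo is repeatable

# Stand-in for document text: words drawn from a small, made-up vocabulary.
vocabulary = ["form", "follows", "function", "native", "production",
              "message", "attachment", "review", "gigabyte", "document"]
text = " ".join(rng.choice(vocabulary) for _ in range(20000)).encode("ascii")

# Stand-in for already-compressed data such as a .jpg: pseudo-random bytes.
noise = bytes(rng.getrandbits(8) for _ in range(len(text)))

for label, data in (("text", text), ("already-compressed stand-in", noise)):
    packed = zlib.compress(data, 9)
    ratio = len(packed) / len(data)
    print(f"{label}: {len(data)} bytes -> {len(packed)} bytes (ratio {ratio:.2f})")
```

The text shrinks dramatically, while the stand-in for compressed data comes out slightly larger than it went in, because the compressor finds no redundancy to remove and still has to add its own bookkeeping bytes.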

Wisely, John speaks only of document volumes and makes no effort to project page equivalencies, not even by extrapolating some postulated ‘average-pages-per-file type.’  Anything like that would be as insupportable today as it was when I wrote about it in 2007.  Also, when you look at John’s post, note that there is no data supplied concerning TIFF images.  I’m not sure why, but I can promise you this: TIFF images are MUCH fatter files, costing far more in terms of storage space and ingestion costs than their native counterparts.  Had John added TIFF to the mix, I’m confident his weighted averages would have been much different…and far less useful–much like TIFF images as a form of production. 😉
