Don’t Bet the Farm on Slack Space

A depiction of file slack from Ball, E-Discovery Workbook © 2020

A federal court appointed me Special Master, tasked in part with searching the file slack space of a party’s computers and storage devices.  The assignment prompted me to reconsider the value of this once-important forensic artifact.

Slack space is the area between the end of a stored file and the end of its concluding cluster: the difference between a file’s logical and physical size. It’s wasted space from the standpoint of the computer’s file system, but it has forensic significance by virtue of its potential to hold remnants of data previously stored there.  Slack space is often confused with unallocated clusters or  free space, terms describing areas of a drive not currently used for file storage (i.e., not allocated to a file) but which retain previously stored, deleted files. 

A key distinction between unallocated clusters and slack space is that unallocated clusters can hold the complete contents of a deleted file whereas slack space cannot.  Data recovered (“carved”) from unallocated clusters can be quite large—spanning thousands of clusters—where data recovered from a stored file’s slack space can never be larger than one cluster minus one byte.  Crucially, unallocated clusters often retain a deleted file’s binary header signature serving to identify the file type and reveal the proper way to decode the data, whereas binary header signatures in slack space are typically overwritten.

A little more background in file storage may prove useful before I describe the dwindling value of slack space in forensics.

Electronic storage media are physically subdivided into millions, billions or trillions of sectors of fixed storage capacity.  Historically, disk sectors on electromagnetic hard drives were 512 bytes in size.  Today, sectors may be much larger (e.g., 4,096 bytes).  A sector is the smallest physical storage unit on a disk drive, but not the smallest accessible storage unit.  That distinction belongs to a larger unit called the cluster, a logical grouping of sectors and the smallest storage unit a computer can read from or write to.  On Windows machines, clusters are 4,096 bytes (4 KB) by default for drives up to 16 terabytes.  So, when a computer stores or retrieves data, it must do so in four-kilobyte clusters.

File storage entails allocation of enough whole clusters to hold a file.  Thus, a 2 KB file will fill only half a 4 KB cluster, the balance being slack space.  A 13 KB file will tie up four clusters, although just a fraction of the final, fourth cluster is occupied by the file.  The balance is slack space, and it could hold fragments of whatever was stored there before.  Because it’s rare for files to be perfectly divisible by 4 kilobytes and many files stored are tiny, much drive space is lost to slack space.  Using smaller clusters would mean less slack space, but any efficiencies gained would come at the cost of unwieldy file tracking and retrieval.
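To make the arithmetic concrete, here is a minimal Python sketch of the allocation just described. It assumes the 4,096-byte default cluster and ignores file-system specifics; the function names clusters_needed and slack_bytes are mine, chosen for illustration only.

```python
CLUSTER_SIZE = 4096  # bytes; the Windows default cluster size cited above


def clusters_needed(logical_size, cluster_size=CLUSTER_SIZE):
    """Whole clusters allocated to hold a file of logical_size bytes."""
    # Ceiling division: any partial cluster still ties up a whole one.
    return -(-logical_size // cluster_size)


def slack_bytes(logical_size, cluster_size=CLUSTER_SIZE):
    """Bytes between the end of the file and the end of its last cluster."""
    physical_size = clusters_needed(logical_size, cluster_size) * cluster_size
    return physical_size - logical_size


# Logical size, clusters allocated, and slack for a few sample files.
for size in (2 * 1024, 13 * 1024, 1, CLUSTER_SIZE):
    print(size, clusters_needed(size), slack_bytes(size))
```

The 13 KB file occupies four clusters and leaves 3,072 bytes of slack, while a one-byte file leaves 4,095 bytes: the "one cluster minus one byte" ceiling on slack noted earlier.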

So, slack space holds forensic artifacts and those artifacts tend to hang around a long time.  Unallocated clusters may be called into service at any time and their legacy content overwritten.  But data lodged in slack space endures until the file allocated to the cluster is deleted–on conventional “spinning” hard drives at any rate.

When I started studying computer forensics in the MS-DOS era, slack space loomed large as a source of forensic intelligence.  Yet, apart from training exercises where something was always hidden in slack, I can’t recall a matter I’ve investigated this century which turned on evidence found in slack space.  The potential is there, so when it makes sense to do it, examiners search slack using unique phrases unlikely to throw off countless false positives.

But how often does it make sense to search slack nowadays?

I’ve lately grappled with that question because it seems to me that the shopworn notions respecting slack space must be re-calibrated.  

Keep in mind that slack space holds just a shard of data with its leading bytes overwritten.  It may be overwritten minimally or overwritten extensively, but some part is obliterated, always.  Too, slack space may hold the remnants of multiple deleted files; that is, as overlapping artifacts: files written, deleted, overwritten by new data, deleted again, then overwritten again (just less extensively so).  Slack can be a real mess.

Fifteen years ago, when programs stored text in ASCII (i.e., encoded using the American Standard Code for Information Interchange or simply “plain text”), you could find intelligible snippets in slack space.  But since 2007, when Microsoft changed the format of Office productivity files like Word, PowerPoint and Excel files to Zip-compressed XML formats, there’s been a sea change in how Office applications and other programs store text.  Today, if a forensic examiner looks at a Microsoft Office file as it’s written on the media, the content is compressed.  You won’t see any plain text.  The file’s contents resemble encrypted data.  The “PK” binary header signature identifying it as compressed content is gone, so how will you recognize zipped content?  What’s more, the parts of the Zip file required to decompress the snippet have likely been obliterated, too. How do you decode fragments if you don’t know the file type or the encoding schema?
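A short Python sketch illustrates why a plain-text search comes up empty against compressed content. It is a demonstration under stated assumptions, not a forensic tool: the XML is a hypothetical stand-in for a Word document part (not a valid Office file), the search phrase is invented, and a slice from the middle of the container merely imitates a slack remnant whose leading bytes are gone.

```python
import io
import zipfile

PHRASE = b"ignition switch defect"  # hypothetical search term

# Hypothetical stand-in for a Word document's XML part: routine filler plus
# one paragraph containing the phrase of interest.
filler = b"".join(
    b"<w:p><w:r><w:t>Paragraph %d: routine status update.</w:t></w:r></w:p>" % i
    for i in range(50)
)
document_xml = (
    b"<w:document><w:body>" + filler
    + b"<w:p><w:r><w:t>" + PHRASE + b"</w:t></w:r></w:p>"
    + b"</w:body></w:document>"
)

# Store it the way Zip-based Office formats do: Deflate-compressed in a Zip.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("word/document.xml", document_xml)
container = buffer.getvalue()

# A slice from the middle stands in for a slack remnant that has lost the
# start of the file, and with it the leading "PK" header signature.
fragment = container[len(container) // 3 : 2 * len(container) // 3]

print("phrase in raw XML:        ", PHRASE in document_xml)
print("container starts with PK: ", container.startswith(b"PK"))
print("phrase in Zip container:  ", PHRASE in container)
print("phrase in middle fragment:", PHRASE in fragment)
```

The phrase is present in the uncompressed XML but appears nowhere in the bytes as stored, so a byte-level search of the container, or of any fragment of it, has nothing to match.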

The best answer I have is you throw common encodings against the slack and hope something matches up with the search terms.  More-and-more, nothing matches, even when what you seek really is in the slack space. Searches fail because the data’s encoded and invisible to the search tool.  I don’t know how searching slack stacks up against the odds of winning the lottery, but a lottery ticket is cheap; a forensic examiner’s time isn’t.
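In code, "throwing common encodings at the slack" amounts to something like the following Python sketch. The function find_term, the two encodings chosen and the sample bytes are all illustrative assumptions; real forensic suites try many more encodings and handle case and code pages.

```python
import zlib

ENCODINGS = ("utf-8", "utf-16-le")  # illustrative, not exhaustive


def find_term(blob, term, encodings=ENCODINGS):
    """Return {encoding: [byte offsets]} for each encoding where term is found."""
    hits = {}
    for encoding in encodings:
        needle = term.encode(encoding)
        offsets = []
        position = blob.find(needle)
        while position != -1:
            offsets.append(position)
            position = blob.find(needle, position + 1)
        if offsets:
            hits[encoding] = offsets
    return hits


# Hypothetical slack remnant: a UTF-16 snippet, some junk, then plain text.
slack = (
    b"\x00\xff"
    + "ignition switch".encode("utf-16-le")
    + b"junk"
    + b"ignition switch"
)
print(find_term(slack, "switch"))

# The same words, Deflate-compressed, searched the same way.
compressed = zlib.compress(("ignition switch " * 40).encode("utf-8"))
print(find_term(compressed, "switch"))
```

The first search reports one hit under each encoding; the second reports none, even though the term occurs forty times in the text that was compressed.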

That’s just the software.  Storage hardware has evolved, too.  Drives are routinely encrypted, and some oddball encryption methods make it difficult or impossible to explore the contents of file slack.  The ultimate nail in the coffin for slack space will be solid state storage devices and features like wear leveling and TRIM, which routinely reposition data and promise to relegate slack space and unallocated clusters to the digital dung heap of history.

Taking a fresh look at file slack persuades me that it still belongs in a forensic examiner’s bag of tricks when it can be accomplished programmatically and with little associated cost.  But, before an expert characterizes it as essential or a requesting party offers it as primary justification for an independent forensic examination, I’d urge the parties and the Court to weigh cost versus benefit; that is, to undertake a proportionality analysis in the argot of electronic discovery.  Where searching slack space was once a go-to for forensic examination, it’s an also-ran now. Do it, when it’s an incidental feature of a thoughtfully composed examination protocol; but don’t bet the farm on finding the smoking gun because the old gray mare, she ain’t what she used to be!
See? I never metaphor I didn’t like.


Postscript: A question came up elsewhere about solid state drive forensics. Here was my reply:

The paradigm-changing issue with SSD forensic analysis versus conventional magnetic hard drives is the relentless movement of data by wear leveling protocols and a fundamentally different data storage mechanism. Solid state cells have a finite life measured in the number of write-rewrite cycles.

To extend their useful life, solid state drives move data around to ensure that all cells are written with roughly equal frequency. This is called “wear leveling,” and it works. A consequence of wear leveling is that unallocated cells are constantly being overwritten, so SSDs do not retain deleted data as electromagnetic drives do. Wear leveling (and the requisite remapping of data) is handled by an SSD’s onboard electronics and isn’t something users or the operating system control or access.

Another technology, an ATA command called TRIM, is controllable by the operating system and serves to optimize drive performance by disposing of the contents of storage cell groups called “pages” that are no longer in use. Oversimplified, it’s faster to write to an empty memory page than to initiate an erasure first; so, TRIM speeds the write process by clearing contents before they are needed, in contrast to an electromagnetic hard drive which overwrites clusters without need to clear contents beforehand.

The upshot is that resurrecting deleted files by identifying their binary file signatures and “carving” their remnant contents from unallocated clusters isn’t feasible on SSD media. Don’t confuse this with forensically-sound preservation and collection. You can still image a solid state drive, but you’re not going to get unallocated clusters. Too, you won’t be interfacing with the physical media grabbing a bitstream image. Everything is mediated by the drive electronics.


Dear Reader, Sorry I’ve been remiss in posting here during the COVID crisis. I am healthy, happy and cherishing the peace and quiet of the pause, hunkered down in my circa-1880 double shotgun home in New Orleans, enjoying my own cooking far too much. Thanks to Zoom, I completed my Spring Digital Evidence class at the University of Texas School of Law, so now one day just bubbles into the next, and I’m left wondering, Where did the day go? Every event where I was scheduled to speak or teach cratered, with no face-to-face events sensibly in sight for 2020. One possible exception: I’ve just joined the faculty of the Tulane School of Law ten minutes upriver for the Fall semester, and plan to be back in Austin teaching in the Spring. But, who knows, right? Man plans and gods laugh.

We of a certain age may all be Zooming and distancing for many months. As one who’s bounced around the world peripatetically for decades, not being constantly on airplanes and in hotels is strange…and stress-relieving. While I miss family, friends and colleagues and mourn the suffering others are enduring, I’ve benefited from the reboot, ticking off household projects and kicking the tires on a less-driven day-to-day. It hasn’t hurt that it’s been the best two months of good weather I’ve ever seen, here or anywhere. The prospect of no world travel this summer–and no break from the soon-to-be balmy Big Easy heat–is disheartening, but small potatoes in the larger scheme of things.

Be well, be safe, be kind to yourself. This, too, shall pass and as my personal theme song says, There's a Great Big Beautiful Tomorrow. Just a Dream Away.

Protect your Meetings From Zoom Bombers

Distanced by Coronavirus, lawyers and teachers are flocking to the teleconferencing platform Zoom to meet and share screens.  Zoom is also turning up as a way to emulate face-to-face social interactions ranging from AA meetings and book clubs to happy hours and rock concerts.  Last week, the Chipotle fast food chain sought to bring a little joy to COVID-stressed customers by hosting an online concert with singer/songwriter Lauv. Things didn’t go as planned, and there’s a lesson there for lawyers and others needing meeting security.

Per Tressie Lieberman, Chipotle’s VP of digital and off-premise, “As we saw large scale events begin to get cancelled, we wanted to act fast and give our fans something to get excited about despite being surrounded by negative news.”  Chipotle acted fast, too fast it seems, and assuredly gave viewers something to get excited about, though not as intended.  Chipotle was forced to pull the plug after one attendee used Zoom’s Screen Share feature to broadcast pornography to hundreds of other attendees.  (‘Zoombombing’: When Video Conferences Go Wrong, New York Times, March 22, 2020.)

Whoever configured the Zoom meeting apparently failed to select the option that limits the ability of any meeting participant other than the host to share screens.  As a result, any attendee—including any troll logging in anonymously—could share any content they like with all other attendees.  It’s called Zoom bombing (like Photobombing) and it’s a growing disruption.  If a Zoom bomber logs in multiple times, stopping the interloper is like playing Whack-a-Mole.  The host shuts down one Zoom bombing instance only to push the Zoom bomber to the next and the next.

It’s an embarrassment that could have been avoided had the individual setting up the Zoom meeting changed a Screen Sharing option buried in the program’s settings menu, eschewing the default “All Participants” in favor of the considerably safer “Host Only” as seen below.

This unfortunate intrusion was caused by user error, not a vulnerability in the tool.  But I’d been expecting something of a similar nature to occur since I noticed that Zoom issues every subscriber a personal Zoom meeting ID as an alternative to generating a one-time use meeting ID for every meeting. That’s a vulnerability. What it means is, if anyone learns the host’s personal Zoom meeting ID (hint: it’s the meeting number contained in the meeting invitation), anyone can attend the host’s personal meetings whether invited or not.  Of course, if the host is managing participants and keeping a close eye on headcounts, an uninvited lurker may be spotted.  If it were a meeting of many counsel in multidistrict litigation or other matters characterized by large teams, it would be easy for an opponent to log in and listen undetected. 

Here are other simple tips to secure your Zoom meetings against Zoom bombers and eavesdroppers:

1. Protect your personal Zoom meeting ID as you would your personal passwords. Never use your personal Zoom meeting ID to host a meeting.   Instead, have Zoom automatically generate a unique meeting ID for your invitations.

2. Require a meeting password.  Zoom will generate one for your invitees when you check the box.

3. Allow only authenticated users to join.  To gain entry, invited users will need to have a Zoom user account (they’re free) and log into Zoom.

4. Require participants attend with video cameras turned on, at least until the host can identify all the participants in the meeting and confirm they were invited.

5. Lock the meeting after all invited attendees have joined and prevent latecomers. To lock an ongoing meeting, click “Manage Participants,” then click “More” at the bottom of the Participants screen.  Finally, choose “Lock Meeting.”

Zoom ‘Cheat Sheet’

Thanks to the Coronavirus crisis, my 280-odd colleagues on the University of Texas Law School faculty are valiantly struggling to transpose their years of classroom skill and content to the daunting digital realm of remote instruction using Zoom teleconferencing. Zoom has been a part of the University of Texas’ Canvas learning platform for less than 48 hours, and over 3,000 professors at UT Austin have just two weeks to be ready to teach via Zoom when some 40,000 students return from an extended Spring Break. That’s just the UT Austin campus. It’s closer to a quarter of a million students and 21,000 faculty in the whole U.T. system who face this unprecedented test of their resiliency. I’m deeply proud of how hard everyone is trying to rise to the challenge.

I’ve taught classes with Zoom for years, so apart from misplacing a window now-and-then, I find Zoom simple to use and navigate. In a modest effort to help my colleagues, I prepared a one-page cheat sheet. It might help anyone trying to use Zoom to navigate Law in the Time of Cholera, I mean, Coronavirus. You can download it below, and its text follows:

HOW DO I: | Keyboard Shortcut – PC | Keyboard Shortcut – Mac
Mute All Students’ Microphones | ALT+M | Command⌘+Control+M
Unmute All Students’ Microphones | ALT+M | Command⌘+Control+U
Mute Instructor’s Microphone | ALT+A | Command⌘+Shift⇧+A
Push to Talk When Muted | Spacebar | Spacebar
Pause or Resume Recording | ALT+P | Command⌘+Shift⇧+P
Begin Screen Sharing | ALT+Shift+S | Command⌘+Shift⇧+S
Pause or Resume Screen Sharing | ALT+T | Command⌘+Shift⇧+T
Toggle Instructor’s Video On/Off | ALT+V | Command⌘+Shift⇧+V
Switch to Gallery View | ALT+F2 | Command⌘+Shift⇧+W
Previous/Next Group in Gallery View | PageUp/PageDown | Control+P/Control+N
End or Leave a Zoom Meeting | ALT+Q | Command⌘+W
Switch Between Open Applications* | ALT+Tab | Command⌘+Tab
*Switching between open applications with the last shortcut is a quick way to get your bearings.  For a complete list of shortcuts, click your profile picture in Zoom, then Settings>Keyboard Shortcuts. NOTE: Zoom Shortcuts work when a Zoom screen is in focus.  To enable a shortcut to work globally (from any application screen), check the box “Enable Global Shortcut” alongside that shortcut in Keyboard Shortcuts.

To Begin Screen Sharing: Click the green “Share” button on the meeting menu bar or type Alt+Shift+S (PC) or Command+Shift+S (Mac).  When the Share window appears, select the source you wish to share.  You can choose from among any screen (monitor), any running application, a whiteboard or your iPhone/iPad. 

If you want to share a PowerPoint  presentation:

  1. Launch the PowerPoint slide show presentation
  2. ALT+Tab (PC) or Command⌘+Tab (Mac) to the Zoom meeting window (with the menu bar at the bottom) and click “Share.”
  3. Check “Share Computer Sound” at the bottom left of the Share window if you want students to hear sound in your PowerPoint presentation.
  4. Select “PowerPoint Slide Show,” then click the blue “Share” button.
  5. To stop sharing, return to the Zoom meeting window and click “Stop Share” or type ALT+S (PC) or Command⌘+Shift⇧+T (Mac).

If you want to share an iPhone or iPad screen:

  1. On your iPhone or iPad, connect to the same Wi-Fi network as the computer running Zoom.
  2. In Zoom, select Share>iPhone/iPad>Share
  3. On your iPhone or iPad, select AirPlay (swipe down from top right corner for iOS 12 or newer or up from bottom for iOS 11 or older).  Select Screen Mirroring>Zoom.

HINT: Share your iPhone or iPad camera screen when you need an impromptu document camera or to show a place or object or conduct an interview.

Teleconferencing Tips: Are You Ready for your Closeup?

  • “Is Bob on the call? Will someone PLEASE e-mail Bob?”
  • “Everyone, everyone, PLEASE mute your #$%^& line!”
  • “THERE’S there’s, AN an, ECHO, Echo, echo, ech….”
  • “How do we share our screen again? Wait I see it. No, that’s not it.”

It’s 2020; one year AFTER the events of the film Blade Runner. Still no flying cars. No androids. And apparently no lawyers capable of carrying off a flawless video conference.

COVID-19 is pushing everyone to videoconferencing. I’ve long used it to webcast and teach law classes, so thought I’d share a few tips to exorcise the gremlins.

SOUND: While the microphone on your laptop may suffice, a quality microphone makes a big difference in sound quality, especially amidst ambient noise. My buddy Ernie “the Attorney” Svenson has a quality microphone and scissors-arm stand in his office/studio. It’s great, and you could probably rig up something similar for under a hundred dollars. For my money, I adore my $50 Blue Snowball microphone and stand. Great pickup and timbre. It plugs into any USB port (no fumbling for a mike jack) and just works every time. Bulletproof.

LIGHTING: There’s a reason cinematographers spend so much time fussing over lighting. It’s important because much of what we “say” in teleconferences is conveyed by facial expression and small gestures. Overhead lighting casts ghoulish shadows. The shadows caused by back lighting (e.g., a window behind you) make everyone look like they’re in witness protection. Your face needs to be brightly and evenly lit, best accomplished by diffused and/or reflected light.

I’ve struggled to rig up suitable webcast lighting. I even had studio lights on tripods flanking my desk and a big overhead hair light on a boom balanced by a sandbag. Not quite law office. Not quite sound stage. All quite hideous.

I found a better way. My desk faces a white wall, so my compromise solution was to position a single $39 softbox studio light behind my center monitor and bounce the light off the wall and ceiling. I only turn it on for conferences, but it would be great for those struggling with Seasonal Affective Disorder. Videoconferencing is the new normal so invest in purpose-built lighting. There are loads of low-cost options designed for the task, from LED light rings to studio setups worthy of Steven Spielberg.

CAMERA: If you’re going to be working from a desktop machine, get a decent camera. You needn’t spend a fortune. I’m currently content with the Logitech C922 USB webcam. It has 1080p resolution and is sturdy and stable perched atop my monitor. I can adjust it easily, and its built-in microphone is a solid backup for my (never fail) Blue Snowball.

SCREENS: I use and love Zoom as my teleconferencing platform. In conjunction with PowerPoint, I regularly hold classes on Zoom ranging from 90 minutes to three hours. Zoom offers loads of features and flexibility, but it also dumps three windows across my screens. Alongside PowerPoint, an active presentation window, plus chat and question boxes, I’m frequently shifting and sizing six or more windows in search of an optimum layout. So, if you’ve not yet embraced the convenience of multiple monitors, make the coronavirus your excuse to upgrade. I position whatever content my students see via screen share to be as closely aligned below the camera as possible. That way, I can face the camera and not appear to be looking sideways.

BACKGROUND: I’ve tried professional draping and chromakey backgrounds. They just got in the way, and they were a pain to put up, take down and stow. In the end, I just cleaned up the room and assembled a wall of New Orleans art, photos and mementos behind me. My advice is to minimize distractions.

You Don’t Want to Know What It Means to Miss New Orleans this May 7-8

My bosom buddy and lifestyle mentor, Ernie “the Attorney” Svenson, has spent much of his career trying to share the smart stuff he’s learned with other lawyers. The last few years, aided by his wonderful wife, Donna, he’s focused on lawyer marketing and systematized practice efficiency. Ernie has a large cadre of avid followers who periodically convene at the feet of the master to learn the Tao of perfected practice and taste the sweetness of New Orleans. It’s always a great group and this May, the conclave will be bigger than any before. Ever dedicated to labor saving, Ernie drafted copy to help me invite you to join our merry band. It’s not my voice, but it’s an excellent voice; so, I share it here verbatim:

I want to let you know about a special conference for solo and small firm lawyers (which I’m speaking at)…
It’s a two-day conference for lawyers who want to make big improvements in their practices, specifically…

—More streamlined workflows

—Less email overload

—More document automation

—Less paper and less disorganization

—More clients (good ones, not just anything that walks in the door)

—More profit & more steady cashflow

—Less overhead & fewer worries

—More clarity about exactly how to simplify, automate and outsource the complex workload in a busy small firm practice.

Folks who register will get immediate access to online training so they can start making those improvements right away. And the conference organizer (my good friend Ernie Svenson) is also doing free weekly webinars leading up to the event.
The full price of the conference, with all the bonuses, is $1,295 but the special pricing is still in effect and so if you go to the website you can register for only $850.
Ernie gave the speakers a limited number of “speakers’ discount” tickets and so I wanted to give you the opportunity to use one that I was given.
It will give you an additional $200 off the $850 discount. Go check out all the details here.
In other words, you can register with this link for $649
And use this discount code when you decide to register so you get that extra discount. But don’t procrastinate in using the special speakers’ discount.
There is only a limited number of these speakers’ discounts and they are available on a first-come-first-served basis.
So check it out and see if it’s something you can do, and will find helpful to your practice.
Best, Craig

P.S. here’s a detailed agenda of topics and times.

So, (the real me, again) what’s the worst that could happen here? You come, meet some great people, listen to good music, dance in the streets behind a second line brass band, eat delicious food, maybe laugh and drink a wee bit more than your norm? Too, you’re sure to leave with some splendid ideas for your law practice and broaden your network of like-minded solo and small firm practitioners.

We don’t call New Orleans “The City That Care Forgot” and “The Big Easy” for nothing. If you can’t have a wonderful time in NOLA, you can’t have one anywhere. Pair that with some practical strategies to improve the efficiency and profitability of your practice, along with a hefty 50% discount. Now, how can you NOT come? Trust me, you don’t want to know what it means to miss Ernie and Donna’s New Orleans, May 7-8.

Degradation: How TIFF+ Disrupts Search

Recently, I wrote on the monstrous cost of TIFF+ productions compared to the same data produced as native files.  I’ve wasted years trying to expose the loss of utility and completeness caused by converting evidence to static formats.  I should have recognized that no one cares about quality in e-discovery; they only care about cost.  But I cannot let go of quality because one thing the Federal Rules make clear is that producing parties are not permitted to employ forms of production that significantly impair the searchability of electronically stored information (ESI).

In the “ordinary course of business,” none but litigators “ordinarily maintain” TIFF images as substitutes for native evidence.  When requesting parties seek production in native forms, responding parties counter with costly static image formats by claiming they are “reasonably usable” alternatives.  However, the drafters of the 2006 Rules amendments were explicit in their prohibition:

[T]he option to produce in a reasonably usable form does not mean that a responding party is free to convert electronically stored information from the form in which it is ordinarily maintained to a different form that makes it more difficult or burdensome for the requesting party to use the information efficiently in the litigation. If the responding party ordinarily maintains the information it is producing in a way that makes it searchable by electronic means, the information should not be produced in a form that removes or significantly degrades this feature.

 FRCP Rule 34, Committee Notes on Rules – 2006 Amendment.

I contend that substituting a form that costs many times more to load and host counts as making the production more difficult and burdensome to use.  But what is little realized or acknowledged is the havoc that so-called TIFF+ productions wreak on searchability, too.  It boggles the mind, but when I share what I’m about to relate below with opposing counsel, they immediately retort, “that’s not true.”  They deny the reality without checking its truth, without caring whether what they assert has a basis in fact.  And I’m talking about lawyers claiming deep expertise in e-discovery.  It’s disheartening, to say the least.

A little background: We all know that ESI is inherently electronically searchable.  There are quibbles to that statement but please take it at face value for now.  When parties convert evidence in native forms to static image forms like TIFF, the process strips away all electronic searchability.  A monochrome screenshot replaces the source evidence.  Since the Rules say you can’t remove or significantly degrade searchability, the responding party must act to restore a measure of searchability.  They do this by extracting text from the native ESI and delivering it in a “load file” accompanying the page images.  This is part of the “plus” when people speak of TIFF+ productions.

E-discovery vendors then seek to pair the page images with the extracted text in a manner that allows some text searchability.  Vendors index the extracted text to speed search, a mapping process intended to display the page where the text was located when mapped.  This is important because where the text appears in the load file dictates what page will be displayed when the text is searched and determines whether features like proximity search and even predictive coding work as well as we have a right to expect.  Upshot: The location and juxtaposition of extracted text in the load file matters significantly in terms of accurate searchability.  If you don’t accept that, you can stop reading.

Now, let’s consider the structure of modern electronic evidence.  We could talk about formulae in spreadsheets or speaker notes in presentations, but those are not what we fight over when it comes to forms of production. Instead, I want to focus on Microsoft Word documents and those components of Word documents called Comments and Tracked Changes; particularly Comments because these aren’t “metadata” by any stretch.  Comments are user-contributed content, typically communications between collaborators.  Users see this content on demand and it’s highly contextual and positional because it is nearly always a comment on adjacent body text.  It’s NOT the body text, and it’s not much use when it’s separated from the body text.  Accordingly, Word displays comments as marginalia, giving them the power of place but not enmeshing them with the body text.

But what happens to these contextual comments when you extract the text of a Word document to a load file and then index the load files?

There are three ways I’ve seen vendors handle comments and all three significantly degrade searchability:

First, they suppress comments altogether and do not capture the text in the load files.  This is content deletion.  It’s like the content was never there and you can’t find the text using any method of electronic search.  Responding parties don’t disclose this deletion nor is it grounded on any claim of privilege or right.  Spoliation is just S.O.P.

Second, they merge the comments into the adjacent body text. This has the advantage of putting the text more-or-less on the same page where it appears in the source, but it also serves to frustrate proximity search and analytics.  The injection of the comment text between a word combination or phrase causes searches for that word combo or phrase to fail.  For example, if your search was for ignition w/3 switch and a four-word comment comes between “ignition” and “switch,” the search fails.

Third, and frequently, vendors aggregate comments and dump them at the end of the load file with no clue as to the page or text they reference.  No links.  No pointers.  Every search hitting on comment text takes you to the wrong page, devoid of context.
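The three outcomes above can be imitated in a few lines of Python. This is a toy model under stated assumptions, not any vendor's actual indexing: I treat w/N as "within N word positions, in either order" (real search tools may count differently), and the sample page text, the comment and the helper names are all hypothetical.

```python
import re


def words(text):
    """Lowercased word tokens; a crude stand-in for a real indexer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def within(text, first, second, max_distance):
    """Assumed w/N semantics: the two terms sit within N word positions."""
    tokens = words(text)
    first_at = [i for i, word in enumerate(tokens) if word == first]
    second_at = [i for i, word in enumerate(tokens) if word == second]
    return any(abs(i - j) <= max_distance for i in first_at for j in second_at)


def pages_hit(pages, term):
    """1-based page numbers whose extracted text contains the term."""
    return [n for n, text in enumerate(pages, start=1) if term in words(text)]


BODY_PAGE_1 = "The ignition switch failed during testing."
BODY_PAGE_2 = "Closing remarks on an unrelated topic."
COMMENT = "did legal approve this"  # hypothetical comment anchored on page 1

# The three load-file treatments described above.
suppressed = [BODY_PAGE_1, BODY_PAGE_2]
merged = ["The ignition " + COMMENT + " switch failed during testing.", BODY_PAGE_2]
appended = [BODY_PAGE_1, BODY_PAGE_2 + " " + COMMENT]

print("native body, ignition w/3 switch:", within(BODY_PAGE_1, "ignition", "switch", 3))
print("suppressed, pages hit on 'approve':", pages_hit(suppressed, "approve"))
print("merged, ignition w/3 switch:", within(merged[0], "ignition", "switch", 3))
print("appended, pages hit on 'approve':", pages_hit(appended, "approve"))
```

In this model the suppressed comment is unfindable, the merged comment pushes "ignition" and "switch" five positions apart so the w/3 search fails, and the appended comment hits on page 2 although it annotates text on page 1.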

Some of what I describe are challenges inherent to dealing with three-dimensional data using two-dimensional tools.  Native applications deal with Comments, speaker notes and formulae three-dimensionally.  We can reveal that data as needed, and it appears in exactly the way witnesses use it outside of litigation.  But flattening native forms to static images and load files destroys that multidimensional capability.   Vendors do what they can to add back functionality; but we should not pretend the results are anything more than a pale shadow of what’s possible when native forms are produced.  I’d call it a tradeoff, but that implies requesting parties know what’s being denied them.  How can requesting party’s counsel know what’s happening when responding parties’ counsel haven’t a clue what their tools do, yet misrepresent the result?

But now you know.  Check it out.  Look at the extracted text files produced to accompany documents with comments and tracked changes.  Ask questions.  Push back.  And if you’re producing party’s counsel, fess up to the evidence vandalism you do.  Defend it if you must but stop denying it.  You’re better than that.

Don’t Let Plaintiffs’ Lawyers Read This!!

Be honest.  Wouldn’t you love to stick it to the plaintiffs?  Wouldn’t your corporate client or carrier be ecstatic if you could make litigation much more expensive for those greedy opportunists bringing frivolous suits and demanding discovery?  What if you could make discovery not just more costly, but make it, say, five times more costly, ten times more costly, than it is for you?  Really bring the pain.  Would you do it?

Now that I have your attention–and the attention of plaintiffs’ counsel wondering if they’ve stumbled into a closed meeting at a corporate counsel retreat—I want to show you this is real.  Not just because I say so, but because you prove it to yourself.  You do the math.

Math!  You didn’t say there would be math!

Stop.  You know you’re good at math when the numbers come with dollar signs.  Legendary Texas trial lawyer W. James Kronzer used to say to me, “I’m no good at math, Herman; but I can divide any number by three.”  That was back when a third was the customary contingent fee.

Even after you do the math, you’re not going to believe it; instead, you’ll conclude it can’t be true.  Surely nothing so unjust could have escaped my notice.  Why would Courts allow this?  How can I be such a sap?

The real question is this: What am I going to do about it?

Preserving Social Media Content: DIY

Social Media Content (SMC) is a rich source of evidence.  Photos and posts shed light on claims of disability and damages, establish malicious intent and support challenges to parental fitness–to say nothing of criminals who post selfies at crime scenes or holding stolen goods, drugs and weapons.  SMC may expose propensity to violence, hate speech, racial animus, misogyny or mental instability (even at the highest levels of government).  SMC is increasingly a medium for business messaging and the primary channel for cross-border communications.  In short, SMC and messaging are heirs-apparent to e-mail in their importance to e-discovery.

Competence demands swift identification and preservation of SMC.

Screen shots of SMC are notoriously unreliable, tedious to collect and inherently unsearchable.  Applications like X1 Social Discovery and service providers like Hanzo can help with SMC preservation; but frequently the task demands little technical savvy and no specialized tools.  Major SMC sites offer straightforward ways users can access and download their content.  Armed with a client’s login credentials, lawyers, too, can undertake the ministerial task of preserving SMC without greater risk of becoming a witness than if they’d photocopied paper records.

Collecting your Client’s SMC
Collecting SMC is a two-step process of requesting the data followed by downloading.  Minutes to hours or longer may elapse between a request and download availability. Having your client handle collection weakens the chain of custody; so, instruct the client to forward download links to you or your designee for collection.  Better yet, do it all yourself.

Obtain your client’s user ID and password for each account and written consent to collect. Instruct your client to change account passwords for your use, re-enabling customary passwords following collection.  Clients may need to temporarily disable two-factor account security.  Download data promptly as downloads are available briefly.

Collection Steps for Seven Social Media Sites
Facebook: After login, go to Settings>Your Facebook Information>Download Your Information.  Select the data and date ranges to collect (e.g., Posts, Messages, Photos, Comments, Friends, etc.).  Facebook will e-mail the account holder when the data is ready for download (from the Available Copies tab on the user’s Download Your Information page). Facebook also offers an Access Your Information link for review before download.

Privacy: A Wolf in Sheep’s Clothing?

Next week is Georgetown Law Center’s sixteenth annual Advanced E-Discovery Institute.  Sixteen years of a keen focus on e-discovery; what an impressive, improbable achievement!  Admittedly, I’m biased by longtime membership on its advisory board and my sometime membership on its planning committees, but I regard the GTAEDI confab of practitioners and judges as the best e-discovery conference still standing.  So, it troubles me how much of the e-discovery content of the Institute and other conferences is ceded to other topics, and that one topic in particular, privacy, is being pushed to become the focus of the Institute in future.

This is not a post about the Georgetown Institute, but about privacy, particularly whether our privacy fears are stoked and manipulated by companies and counsel as an opportunistic means to beat back discovery.  I ask you: Is privacy a stalking horse for a corporate anti-discovery agenda?

A Primer on Processing and a Milestone

Today, I published my primer on processing.  It’s fifty-odd pages on a topic that’s warranted barely a handful of paragraphs anywhere else.  I wrote it for the upcoming Georgetown Law Center Advanced E-Discovery Institute and most of the material is brand new, covering a stage of e-discovery–a “black box” stage–where a lot can go quietly wrong.  Processing is something hardly anyone thinks about until it blows up.

Laying the foundation for a deep dive on processing required I include a crash course on the fundamentals of digitization and encoding.  My students at the University of Texas and at the Georgetown Academy have had to study encoding for years because I see it as the best base on which to build competency on the technical side of e-discovery.

The research for the paper confirmed what I’d long suspected about our industry.  Despite winsome wrappers, all the leading e-discovery tools are built on a handful of open source and commercial codebases, particularly for the crucial tasks of file identification and text extraction.  Nothing evil in that, but it does make you think about cybersecurity and pricing.  In the process of delving deeply into processing, I gained greater respect for the software architects, developers and coders who make it all work.  It’s complicated, and there are countless ways to run off the rails.  That the tools work as well as they do is an improbable achievement.  Still, there are ingrained perils you need to know, and tradeoffs to be weighed.

Working from so little prior source material, I had to figure a lot out by guess and by gosh.  I have no doubt I’ve misunderstood points and could have explained topics more clearly.  Please don’t hesitate to weigh in to challenge or correct.  Regular readers know I love to hear your thoughts and critiques.

I’ll be talking about processing in an ACEDS/Logikcull webcast tomorrow (Tuesday, November 5, 2019) at 1:00pm EST/10:00am PST.  I expect it’s not too late to register.

The milestone of the title is that this is my 200th blog post and it neatly coincides with my 200,000th unique visitor to the blog (actually 200,258, but who’s counting?).  When I started blogging here on August 20, 2011, I honestly didn’t know if anyone would stop by.  Two hundred thousand kind readers have rung the bell (and that’s excluding the many more spammers turned away).  I hope something I wrote along the way gave you some insight or a chuckle.  I’m intensely grateful for your attention.

By the way, if you’d like to come to the Georgetown Advanced E-Discovery Institute in Washington, D.C. on November 21-22, 2019, please use my speaker’s discount code to save $100.00.  The discount code is BALL (all caps).  Hope to see you!