I’ve been skeptical of predictive coding for years, even before I wrote my first column on it back in 2005. Like most, I was reluctant to accept that a lifeless mass of chips and wires could replicate the deep insight, the nuanced understanding, the sheer freaking brilliance that my massive lawyer brain brings to discovery. Wasn’t I the guy who could pull down that one dusty box in a cavernous records repository and find the smoking gun everyone else overlooked? Wasn’t it my rarefied ability to discern the meaning lurking beneath the bare words that helped win all those verdicts?
Well, no, not really. But, I still didn’t trust software to make the sort of fine distinctions I thought assessing relevance required.
So, as others leapt aboard the predictive coding bandwagon, I hung back, uncertain. I felt not enough objective study had been done to demonstrate the reliability and superiority of predictive coding. I well knew the deep flaws of mechanized search, and worried that predictive coding would be just another search tool tarted up in the frills and finery of statistics and math. So, as Herb and Ralph, Maura and Gordon and Karl and Tom sang Hosannas to TAR and CAR from Brooklyn Heights to Zanzibar, I was measured in my enthusiasm. With so many smart folks in thrall, there had to be something to it, right? Yet, I couldn’t fathom how the machine could be better at the fine points of judging responsiveness than I am.
Then, I figured it out: The machine’s not better at fine judgment. I’m better at it, and so are you.
So why, then, have I now drunk the predictive coding Kool-Aid and find myself telling anyone who will listen that predictive coding is the Way and the Light?
It’s because I finally grasped that, although predictive coding isn’t better at dealing with the swath of documents that demand careful judgment, it’s every bit as good (and actually much, much better) at dealing with the overwhelming majority of documents that don’t require careful judgment—the very ones where keyword search and human reviewers fail miserably.
Let me explain.
For the most part, it’s not hard to characterize documents in a collection as responsive or not responsive. The vast majority of documents in review are either pretty obviously responsive or pretty obviously not. Smoking guns and hot docs are responsive because their relevance jumps out at you. Most irrelevant documents get coded quickly because one can tell at a glance that they’re irrelevant. There are close calls, but overall, not a lot of them.
If you don’t accept that proposition, you might as well not read further; and if you don’t accept it, I question whether you’ve done much document review.
It turns out that well-designed and well-trained software also has little difficulty distinguishing the obviously relevant from the obviously irrelevant. And, again, there are many, many more of these clear-cut cases in a collection than ones requiring judgment calls.
So, for the vast majority of documents in a collection, the machines are every bit as capable as human reviewers. A tie. But giving the extra point to humans as better at the judgment call documents, HUMANS WIN! Yeah! GO HUMANS! Except….
Except, the machines work much faster and much cheaper than humans, and it turns out that there really is something humans do much, much better than machines: they screw up.
The biggest problem with human reviewers isn’t that they can’t tell the difference between relevant and irrelevant documents; it’s that they often don’t. Human reviewers make inexplicable choices and transient, unwarranted assumptions. Their minds wander. Brains go on autopilot. They lose their place. They check the wrong box. There are many ways for human reviewers to err and just one way to perform correctly.
The incidence of error and inconsistent assessments among human reviewers is mind boggling. It’s unbelievable. And therein lies the problem: it’s unbelievable. People I talk to about reviewer error might accept that some nameless, faceless contract reviewer blows the call with regularity, but they can’t accept that potential in themselves. “Not me,” they think, “If I were doing the review, I’d be as good as or better than the machines.” It’s the “Not Me” Factor.
Indeed, there is some cause to believe that the best-trained reviewers on the best-managed review teams get very close to the performance of technology-assisted review. A chess grandmaster has been known to beat a supercomputer (though not in quite some time).
But so what? Even if you are that good, you can only achieve the same result by reviewing all of the documents in the collection, instead of the 2%-5% of the collection needed to be reviewed using predictive coding. Thus, even the most inept, ill-managed reviewers cost more than predictive coding; and the best trained and best managed reviewers cost much more than predictive coding. If human review isn’t better (and it appears to generally be far worse) and predictive coding costs much less and takes less time, where’s the rational argument for human review?
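The arithmetic behind that cost claim is easy to sketch. The figures below (collection size, review speed, hourly rate) are purely hypothetical inputs chosen to show the shape of the comparison, not numbers from any actual matter:

```python
# Rough cost model: linear human review of everything vs. predictive
# coding, where humans put eyes on only the ~2%-5% of the collection
# needed for training, validation and close calls. All inputs are
# hypothetical, illustrative values.

def review_cost(num_docs, fraction_reviewed, docs_per_hour=50.0,
                rate_per_hour=60.0):
    """Dollar cost of human review of the given fraction of a collection."""
    hours = (num_docs * fraction_reviewed) / docs_per_hour
    return hours * rate_per_hour

collection = 1_000_000                       # hypothetical collection size
linear = review_cost(collection, 1.00)       # every document reviewed
predictive = review_cost(collection, 0.05)   # ~5% seen by human eyes
```

Even at the high end of the 2%-5% range, the machine-assisted workflow in this toy model runs a twentieth the cost of linear review, before the tool’s own fees enter the picture.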
What’s that? “My client wants to wear a belt AND suspenders?” Oh, PLEASE.
What about that chestnut that human judgment is superior on the close calls? That doesn’t wash either. First–and being brutally honest–quality is a peripheral consideration in e-discovery. I haven’t met the producing party who loses sleep worrying about whether their production will meet their opponent’s needs. Quality is a means to avoid sanctions, and nothing more.
Moreover, predictive coding doesn’t try to replace human judgment when it comes to the close calls. Good machine learning systems keep learning. When they run into one of those close call documents, they seek guidance from human reviewers. It’s the best of both worlds.
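At bottom, that routing of close calls to humans is a confidence threshold. Here’s a minimal sketch in Python, with a toy scoring function standing in for a trained classifier; the term list, thresholds, and function names are all hypothetical, not drawn from any real review tool:

```python
# Sketch of a predictive coding loop that auto-codes confident calls
# and queues the uncertain middle for human review. The scoring
# function is a toy stand-in for a trained classifier's probability.

def relevance_score(doc, hot_terms):
    """Toy model: fraction of hot terms appearing in the document."""
    words = set(doc.lower().split())
    return sum(term in words for term in hot_terms) / len(hot_terms)

def triage(docs, hot_terms, low=0.2, high=0.8):
    """Auto-code confident calls; route close calls to a reviewer."""
    auto_relevant, auto_irrelevant, needs_human = [], [], []
    for doc in docs:
        score = relevance_score(doc, hot_terms)
        if score >= high:
            auto_relevant.append(doc)      # machine confident: responsive
        elif score <= low:
            auto_irrelevant.append(doc)    # machine confident: not responsive
        else:
            needs_human.append(doc)        # close call: ask a human
    return auto_relevant, auto_irrelevant, needs_human
```

In a real system the human’s answers on those queued close calls feed back into the next round of training, so the machine keeps learning where its confidence was lowest.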
So why isn’t everyone using predictive coding? One reason is that the pricing has not yet shifted from exploitive to rational. It shouldn’t cost substantially more to expose a collection to a predictive coding tool than to expose it to a keyword search tool; yet, it does. That will change and the artificial economic barriers to realizing the benefits of predictive coding will soon play only a minor role in the decision to use the technology.
Another reason predictive coding hasn’t gotten much traction is that Not Me Factor. To that I say this: Believe what you will about your superior performance, tenacity and attention span (or that of your team or law firm), but remember that you’re spending someone else’s money on your fantasy. When the judge, the other side or (shudder) the client comes to grips with the exceedingly poor value proposition that is large-scale human review, things are going to change…and, Lucy, there’s gonna be some ‘splainin to do!
Shae Thurman said:
IPRO Eclipse offers TAR aka “Predictive Coding” at no charge through its embedded Content Analyst CAAT tool. I’m wondering when other companies will stop charging for this feature!
Joe Treese said:
While many of your followers will dig in and extol (self-extol) the superiority of humans (lawyers) in the process, let me add just a few virtues of the machines which might help round out your list.
-Machines don’t complain about Saturday work, and to all-nighters it’s a digital chorus of “Bring it on”;
-Machines don’t expect bonuses or extra recognition for a job well done – the 120 volts will do just fine;
-Machines will never make fun of the managing partner’s techno-illiteracy or the lead litigator’s toupee;
-Machines do it the same way, every time, exactly as instructed (this includes doing it wrong, if the humans (lawyers) provide incomplete or erroneous criteria);
-Machines can blast thru an ESI mountain, honestly and completely lay open the results (good AND bad) and won’t be offended when told how to do it better – then they’ll blast thru it again, and again, and…generally, before lunch.
Finally, there is no machine “hubris”, nor is there a machine bias for or against the professional qualifications of any team member – the JD’s count no more than the BS, MBA and paralegal qualifications.
Thanks for the boil-down, looking forward to the simmer.
ESIDence said:
Reblogged this on ESIdence and commented:
Who else but @craigball can work predictive coding, gentle (but forceful) peer goading and an “I Love Lucy” reference, all in the same blog post?
Andy Wilson (logikcull.com) said:
Nice post Craig. Gordon Moore welcomes you to the predictable Early Majority. Looks like predictive coding has crossed the chasm.
Andy Wilson (logikcull.com) said:
Whoops. Meant Geoffrey Moore. I’m in the “early-morning-haven’t-had-coffee-yet-phase-of-development”.
craigball said:
So I’ve crossed the chasm, Andy? 😉
I’ll add:
All appearances to the contrary, I am not pro-plaintiff so much as I am pro-evidence. I believe that evidence has as great a (or greater) propensity to exonerate as to implicate. So, I believe in getting the relevant evidence out there, whether it helps or hurts one side or the other. That is not a shared goal. In truth, the predominant view among responding parties is, “if we give opponents as little as possible, we give them little to use against us.” That’s the real world, and anyone who pretends it’s not is either not telling the truth or not paying attention.
But I think that stinks, and I hope judges, Bar disciplinary boards and law schools will strive to eradicate such rampant obstructionist behavior.
I’ve heard it said that requesting parties (especially those greedy Davids asymmetrically suing data-rich Goliaths) will fight the use of predictive coding solely to increase litigation costs to their opponent. In the end, that won’t happen (or won’t last) because predictive coding isn’t just cheaper when done right, it’s better. There’s less junk in the production and, done right, the result is a more thorough and attentive exploration of the content. Much cheaper means that more sources can be explored without disproportionality. When something gives you more and better for less, it’s not a value proposition that can be suppressed indefinitely…by those north or south of the V.
Andy Wilson (logikcull.com) said:
So…what you’re telling me is that predictive coding = less junk in the data trunk? We need to bottle this up.
Right.
Now!
We’ll be RICH!! Oh…wait. I think Recommind patented the machine learning/predictive coding/data weight loss drug. Waa-waa.
I guess there’s always Predictive Culling: the art of systematically and proactively removing dumb sheep from the herd based on frequency of bahhs/min and total blades of grass consumed/hour.
Matt nelson said:
Great post Craig. I think lower prices, more transparent tools, and built-in sampling are among the remaining barriers to broader PC adoption, but we are on our way….
Ralph Losey said:
Good post!
Glad the Kool-Aid I’ve been secreting into your water supply has finally taken effect. You got it right.
Still, the problem, as usual, is the attorney (SME) skill level needed to do this new kind of review correctly, and the significant differences in the many software programs out there that claim to have active machine learning capacities.
Craig Ball said:
Maybe it was Kool-Aid, maybe it was Canadian Club. I tend to credit (blame?) Gordon and Maura for much of my consciousness raising.
I’m not especially concerned about the SME issue because I credit and charge counsel with the ability and duty to know their case sufficiently to be able to distinguish relevant documents from the rest once in front of a review tool. Else, what is counsel for? Are we reduced to scriveners?
As to the differences in the software, I agree. I see that as being the next big hurdle. Just as predictive coding starts to be broadly embraced, the marketplace will be awash with tools called machine learning/TAR/predictive coding that are really just the old pigs in lipstick. Caveat emptor instanter.
Pingback: Craig Ball, Predictive Coding, and Wordsmithing | Part of the Solution
Pingback: Tolson’s Three Laws of Machine Learning | eDiscovery101
Pingback: “Not Me”, The Fallibility of Human Review – eDiscovery Best Practices | eDiscoveryDaily