Reports of human document review's death have been greatly exaggerated

Predictive Coding, or Technology Assisted Review (TAR), has been “the next big thing” in ediscovery for so long that it’s now making the jet pack industry jealous. 

Well, drop your vintage Popular Mechanics magazine because DISCO is entering this “tired” emerging TAR market in 2016, with a more formal announcement scheduled for later this quarter.

If you’re still reading this and are neither my close personal friend nor related to me, my own prediction is that you’re not likely a litigator.  It’s okay if you’re not, of course...it’s just a shame that more litigators have already written off TAR (or worse, are just not reading my blog).  Before coming to DISCO, I was a litigator for fifteen years (at an AmLaw 100 firm, a litigation boutique, and a two-person shop), and can attest that very few attorneys at any size firm use -- or at this point want to hear more about -- TAR.

To understand the 'why' behind the lack of interest, first consider that TAR systems should accomplish two things: reduce costs and improve accuracy. Good TAR, like all good software, must also accomplish those goals simply. It should not require complex processes, hiring a mathematician, or learning an unfamiliar language. That sounds easy enough, so what are some of the pitfalls with TAR today?

Let’s start with accuracy.  

Fear of an unknown algorithm ultimately making tagging and production decisions with minimal attorney oversight is one of the biggest hurdles.  The TAR decision process typically involves several iterations of seed set creation and review, usually by “trusted” (i.e., senior-level) attorneys.  Even if this cumbersome process is done correctly, attorneys cannot easily discern how the software learns from the seed sets and makes its decisions (even if otherwise decipherable, it may be proprietary).  In short, this process creates understandable anxiety surrounding the results.   

Predictive coding advocates correctly note that errors also occur in traditional review -- more often, they claim -- and imply that TAR users should therefore tolerate errors. But in a traditional review, lawyers can diagnose those errors and either fix them or assign blame. Having to pinpoint mistakes made by an algorithmic black box following a complex seed set process, and then explain them to courts, opposing parties, and clients, such as when clawback is sought (or the CEO’s grocery list is unveiled at her deposition), gives any litigator butterflies in the stomach.

Admittedly, many of the so-called TAR 2.0 systems have alleviated some of these accuracy concerns. The principal shift from TAR 1.0 to 2.0 was an emphasis on prioritizing documents according to “predicted” decisions rather than tagging them. Also common in TAR 2.0 approaches is the ability of the software to “learn” continuously as the review progresses. Although this approach relieves some of the anxiety, since the seed set process is less critical, it still requires a fairly intricate setup and negates some of the promised time savings.

It'll cost you.

On the cost front, resistance to predictive coding comes from several areas. The software price itself can be substantial. Additionally, the learning curve, the time spent setting up the seed sets (even in TAR 2.0 systems), senior-level review of them, and the complex processes involved (some requiring specialists) are no doubt costly. These costs are also difficult to quantify, which is itself a problem. And if those set-up expenses and doubts are present even in “good” TAR software, they are magnified by more complex and onerous systems (does any lawyer or judge know or want to know what an “f-score” should be?).

Whether the largely front-loaded costs will actually translate into future savings is also difficult to weigh. Every litigator knows that settlements and dismissals are commonplace throughout the life of a case. Consequently, front-loaded litigation costs are never preferred, because the benefits may never be fully realized. Other costs include negotiating with opposing counsel or convincing the judge of the propriety and parameters of the anticipated use of TAR, potentially requiring expert testimony. The idea of Daubert hearings on TAR discovery is surely relished by only a tiny minority of attorneys (and even fewer clients).

User-friendly? Not so much.

Finally, the lack of simplicity (or, put another way, the opacity) is a clear source of current resistance and frustration. Lawyers complain that they don’t understand the decisions, the process, or even the terms (e.g., richness, recall or precision) employed by TAR systems. And even if they satisfactorily grasp the details, the “user interface” is often not one that users or managers find easy to operate. These problems also seem to be blocking more universal TAR adoption.
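For readers who do want to peek inside the jargon, here is a minimal sketch of how richness, recall, precision, and the “f-score” are conventionally computed. These are the standard textbook definitions, not any particular vendor's implementation; the function name and the sample numbers are hypothetical and chosen only for illustration.

```python
def review_metrics(total_docs, total_relevant, retrieved, retrieved_relevant):
    """Standard definitions of common TAR terms (illustrative sketch only).

    total_docs         -- documents in the whole collection
    total_relevant     -- documents in the collection that are truly relevant
    retrieved          -- documents the system flagged as relevant
    retrieved_relevant -- flagged documents that are truly relevant
    """
    # Richness: share of the whole collection that is relevant.
    richness = total_relevant / total_docs if total_docs else 0.0
    # Recall: share of the relevant documents that the system found.
    recall = retrieved_relevant / total_relevant if total_relevant else 0.0
    # Precision: share of the flagged documents that are actually relevant.
    precision = retrieved_relevant / retrieved if retrieved else 0.0
    # F1 score: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"richness": richness, "recall": recall,
            "precision": precision, "f1": f1}


# Hypothetical numbers: 10,000 documents, 1,000 of them relevant;
# the system flags 1,500 documents, 800 of which are truly relevant.
metrics = review_metrics(10_000, 1_000, 1_500, 800)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Note that in a live matter the true number of relevant documents is not known in advance, so these figures are typically estimated from samples. That estimation step is part of the complexity lawyers complain about.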

In sum, when litigators are actually faced with a live case and a decision about whether to use TAR, they must weigh the very real burdens of cost, effort, complexity, and accuracy against largely uncertain benefits. Further complication arises from the anticipation of selling this new system to clients, other counsel, and unfamiliar courts. Strapping on a backpack jet engine and commuting to work can seem less intimidating. This fear and uncertainty make it easy to see why TAR adoption has not reached the levels the ediscovery industry had anticipated.

But wait, there is hope!

DISCO wants to make TAR better and easier to use. To do so, DISCO has teamed Dr. Alan Lockett, an expert in deep neural networks with a Ph.D. in Artificial Intelligence, with our full-time staff litigators, who collectively have over 60 years of litigation experience, and our thirty-plus world-class software and systems engineers. In the next part of this blog, I will explain how the approach we are taking at DISCO is different, and why we believe it will drive the adoption of TAR closer to where the industry has long predicted.
