Virtual Canary in the Digital Mine #9: Visualize Targeted Review and Instant Case Enlightenment!

Dear Reader, we’ve covered many topics over the past months: iPhones, Octonauts, Justins. We’ve covered TAR, cyber-security and playing well in the schoolyard (OK, I’m double-dipping, that was the Octonauts too). Surprisingly, one thing I haven’t covered, really, is probably only the biggest topic of them all in e-discovery: we have too much electronically stored information.

As I’ve detailed before, pre-TAR review workflows did not really fit the work in front of us or the skills we had with which to do it. We have been trying to review document collections as if we were reading novels: beginning with the first document and reading our way through until the end, hoping that we captured enough details to pass the triers’-of-fact exam at the end of the process. At best, we could be a bit more skillful when we had an index to guide us but an index is only as reliable as its input and, actually, we really needed to read the whole stack to create the index in the first place.

But, things are beginning to change. We have TAR now, the Indexobot 3000 that will crawl through your collection and attempt to predict, at least, which documents are not responsive. We have increasingly “intelligent” OCR technologies. We have indexes automatically generated by our databases. So, yeah, we have some assistance. But, we still have too many documents that are being generated too fast for us to make sense of in any meaningful way without expending increasingly onerous billable hours of review time.

And all of this is just to get a sense of what’s in the collection. We’re not even looking for the needles yet. We’re just trying to find out how much hay there is and what state it’s in.

Well, today, I’d like to introduce you to the visualization modules . . . your best shortcuts to the best stuff!

While we ediscoverers like to fancy ourselves cutting-edge, I’d like to acknowledge that, for decades now, at the intersection of neuroscience, psychology, sociology and computer science, a community of researchers has been working to determine how humans take in new information and how best to deliver that information to them. The results of their studies have collectively come to be known as visual analytics. These are the researchers responsible for colorful maps depicting world-wide pandemics, economic disparities, species extinctions, hunger rates. They collect or acquire datasets from national archives and multinational organizations and hash them out into meaningful statistical pronouncements.

Also, they study humans and attempt to discern the most efficient means of delivering information to their brains in a format that can be quickly digested without losing any essential details along the way. For instance: they use pretty shapes and colors. They produce pie charts, bar and line graphs. They make learning simple, fun and – ideally – instantaneous!

Ok, maybe now you’re thinking, thanks Canary, why are you wasting our time here . . . we have documents to review!

Well, folks, visual analytics has arrived in full in the e-discovery market. Most providers offer some sort of it, most trade show booths have giant monitors showcasing it, and, if they want, most users now have the opportunity to use it.

Summation 5.0, AccessData’s document review platform, has an exciting array of visualization tools.

With Email Visualization you can choose from a variety of graphical formats. You can construct a timeline to graphically illustrate relationships within your data set based on emails in a certain date range and, once your graph is rendered by the system, you can easily identify the frequency of communication between your custodian and different parties.
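For the curious, here is roughly what sits underneath a picture like that. This is only a minimal sketch of the general idea, not how Summation does it: the sample emails, the addresses and the `contact_frequency` helper are all made up for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical sample: (sent date, sender, recipient). Not real case data.
EMAILS = [
    (date(2014, 1, 6), "custodian@acme.example", "cfo@acme.example"),
    (date(2014, 1, 9), "custodian@acme.example", "broker@outside.example"),
    (date(2014, 2, 3), "broker@outside.example", "custodian@acme.example"),
    (date(2014, 2, 20), "custodian@acme.example", "cfo@acme.example"),
    (date(2014, 5, 1), "custodian@acme.example", "broker@outside.example"),
]


def contact_frequency(emails, custodian, start, end):
    """Count emails between the custodian and each other party, start..end inclusive."""
    counts = Counter()
    for sent, sender, recipient in emails:
        if not (start <= sent <= end):
            continue  # outside the chosen date range
        if sender == custodian:
            counts[recipient] += 1
        elif recipient == custodian:
            counts[sender] += 1
    return counts


freq = contact_frequency(
    EMAILS, "custodian@acme.example", date(2014, 1, 1), date(2014, 3, 31)
)

# A crude text "graph": one bar per party, longest first.
for party, n in freq.most_common():
    print(f"{party:28} {'#' * n} ({n})")
```

Swap the date range and the bars change, which is really all a timeline filter is doing before the pretty colors get applied.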

With the Social Analyzer, you can see an overall view of communication at a domain level, identifying custodian interactions and revealing connections that make literal the idea of a “document universe”.

And if you want to know just how many of your documents are – the horror – spreadsheets, you can click quickly on the Filetype Analyzer.
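The arithmetic behind a file type breakdown is about as simple as it gets. A minimal sketch, assuming nothing more than a list of file names (the names and the `filetype_counts` helper are hypothetical, and a real platform would likely identify types by more than the extension):

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical file names standing in for a collection.
FILES = [
    "budget.xlsx", "memo.docx", "forecast.XLSX",
    "scan001.pdf", "q3.xls", "notes.docx", "README",
]


def filetype_counts(names):
    """Tally files by lower-cased extension; files without one go under '(none)'."""
    counts = Counter()
    for name in names:
        ext = PurePath(name).suffix.lower() or "(none)"
        counts[ext] += 1
    return counts


counts = filetype_counts(FILES)
total = sum(counts.values())

# Each row is one slice of the pie: extension, count, share of the collection.
for ext, n in counts.most_common():
    print(f"{ext:8} {n:3}  {n / total:5.1%}")
```

Each row of that output is a slice of pie waiting to happen.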

Your initial reaction to all of this might be, well, skepticism about the accuracy or utility of the tools; or perhaps you just think that “work” should not be so easy on your eyes. That’s fine. But remember, like predictive coding or other assistive technologies, this is just a tool. I’m not suggesting that you go to a picture-only model of review. But, look above: why don’t you start this review with the orange slice of pie? It has all the documents. Or further up above, what’s going on with those outlying circles? They’re large because there’s a lot of activity but they’re far away from the more populated clusters because they come from a different domain. What’s that domain? Did you know about it before you quickly peeked at this image?

Ok, you’re still not convinced? Then go ahead and click on the X in the corner. Close the picture. You still have all the data underneath.

And all the time you need, right?

For more info on Visual Analytics, catch me and a panel of experts here, at last week’s EDRM webinar . . .

Eric Killough is the virtual canary AccessData has released into your digital mine. He is a JD, a CEDS, and a librarian. He thinks about electronic discovery probably more than he should. Please join him here, at Twitter, at LinkedIn, and at his own blog. He’ll be happy to meet you.