Man vs. Machine: Google Translate Jeopardizes Client Confidentiality, eDiscovery

While machine translation can and does play a significant role in reducing costs and speeding up the eDiscovery process, legal professionals should be careful when using any translation service that does not specifically protect data from uses the client has not directly authorized. After careful examination, it seems that we “are not quite there yet” with web-based translation technologies. Instead, deploying machine translation behind a secure firewall, under a terms of service agreement tailored to the legal industry, appears to be the most secure way to leverage machine translation without jeopardizing client confidentiality.

Translating foreign language documents during eDiscovery can be an expensive and time-consuming process. As today’s law firms increasingly gravitate towards the seemingly faster, cheaper web-based machine translation capabilities of products like Google Translate (and other free platforms), questions about accuracy and defensibility continue to surface, primarily because free online translation solutions do not have a proven record of the high accuracy rates required for many legal situations.

A significant red flag for these free online translation solutions is that they offer no method of determining translation accuracy across the body of documents for a specific case, a question most opposing counsel will certainly ask. Translation accuracy and defensibility are important considerations, but there is another potential issue: is a free online translation service appropriate for translating privileged or confidential information during discovery? To date, it seems many in the legal profession have not truly considered the associated risks, deferring instead to the benefits of high speed and minimal (if any) cost.

What is Google Translate?

Google Translate is a free “machine translation” service that, as of the date of this writing, supports 80 languages. From the Google Translate web page:

“Google Translate is a free translation service that provides instant translations between dozens of different languages. It can translate words, sentences and web pages between any combination of our supported languages.”

There are three types of machine translation technologies: rules-based, statistical and hybrid. Rules-based systems use a combination of language and grammar rules plus dictionaries. Statistical systems do not rely on language rules; instead, they “learn” to translate by analyzing large amounts of data for each language pair. Hybrid systems use both statistical and rules-based technologies. Google Translate uses the statistical approach, which can become more accurate over time as it “learns.”
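
To make the distinction concrete, here is a toy Python sketch of the three approaches. It is purely illustrative and is not how Google Translate or any production system works: the Spanish-English word list, the single reordering rule, the four-sentence parallel corpus and all function names are assumptions made up for this example.

```python
from collections import Counter, defaultdict

# --- Rules-based: hand-written dictionary plus one grammar rule (all hypothetical) ---
DICTIONARY = {"el": "the", "un": "a", "gato": "cat", "perro": "dog", "negro": "black"}
ADJECTIVES = {"negro"}

def reorder(words):
    """Grammar rule: Spanish noun + adjective becomes English adjective + noun."""
    words = list(words)
    i = 0
    while i < len(words) - 1:
        if words[i + 1] in ADJECTIVES and words[i] not in ADJECTIVES:
            words[i], words[i + 1] = words[i + 1], words[i]
            i += 2
        else:
            i += 1
    return words

def rules_based_translate(sentence):
    words = reorder(sentence.lower().split())
    return " ".join(DICTIONARY.get(w, w) for w in words)

# --- Statistical: no rules, only patterns "learned" from example translations ---
PARALLEL_CORPUS = [
    ("el gato", "the cat"),
    ("el perro", "the dog"),
    ("un gato", "a cat"),
    ("un perro", "a dog"),
]

def train_statistical(parallel_corpus):
    """Pair each source word with the target word it co-occurs with most (Dice score)."""
    src_count, tgt_count = Counter(), Counter()
    cooc = defaultdict(Counter)
    for src, tgt in parallel_corpus:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        for s in src_words:
            cooc[s].update(tgt_words)
    return {
        s: max(targets, key=lambda t: 2 * targets[t] / (src_count[s] + tgt_count[t]))
        for s, targets in cooc.items()
    }

def statistical_translate(sentence, table):
    # Words never seen in training are passed through untranslated.
    return " ".join(table.get(w, w) for w in sentence.lower().split())

# --- Hybrid: grammar rule for word order, learned table first, dictionary as fallback ---
def hybrid_translate(sentence, table):
    words = reorder(sentence.lower().split())
    return " ".join(table.get(w, DICTIONARY.get(w, w)) for w in words)

if __name__ == "__main__":
    table = train_statistical(PARALLEL_CORPUS)
    print(rules_based_translate("el gato negro"))
    print(statistical_translate("un perro", table))
    print(hybrid_translate("el gato negro", table))
```

Even in this toy form, the statistical translator is only as good as the example translations it has seen, which is why accuracy depends on the volume and quality of data available for each language pair.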

Like many of the free Google services, Google Translate offers a great deal of functionality for many day-to-day tasks, but what are the risks of using it over the course of eDiscovery?

When is confidential information no longer confidential?

Should attorneys rethink using Google Translate to translate foreign language documents in order to determine their meaning and status during eDiscovery? Could opposing counsel argue that using Google Translate, given its published terms of service, unintentionally voided a protective order covering confidential information?

A release of confidential business information by the holder of the information to someone without a fiduciary responsibility or confidential relationship can call into question the true sensitivity of the information and potentially void a protective order. According to many legal experts, both the actual sensitivity of the information and ongoing efforts to keep it undisclosed are necessary to keep information confidential. If the owner of the confidential information is reckless with it, is it truly confidential?

The reason for the legal industry to be cautious is reflected in the Google terms of service, which spell out what Google can do with information, not necessarily what it will do. A key section of the Google Terms of Service, as of April 14, 2014:

“Your Content in our Services

Some of our Services allow you to upload, submit, store, send or receive content. You retain ownership of any intellectual property rights that you hold in that content. In short, what belongs to you stays yours.

When you upload, submit, store, send or receive content to or through our Services, you give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content.”

Confidential information that has been uploaded (including cut/paste) to Google Translate also raises a troubling question about information confidentiality during eDiscovery: Can information that has been translated using the Google Translate web site still be considered confidential if the possibility exists that Google can freely use and distribute the content in any way it chooses?

This raises another Google Translate eDiscovery concern: you won’t know what is actually in a document (whether it is confidential or not) until it has been uploaded to the Google Translate service and translated, at which point any confidentiality may already have been voided.