In Your Face: How Facial Recognition Databases See Copyright Law But Not Your Privacy
Like many legal issues involving evolving technology, there is more here than meets the eye.
As technology evolves, it seems that more than just people are recognizing other people’s faces. From cellphones using facial identification as an unlocking mechanism to Facebook’s detection of faces in photos on its platform, facial recognition is quickly becoming a part of everyday life. Unfortunately, the way these datasets have been compiled and are being used is starting to draw scrutiny, and rightly so. More importantly, the interplay between copyright law and your privacy is a lot less recognizable than you may think.
Whether you realize it or not, your face may not be “yours” anymore. In a report from Georgetown Law’s Center on Privacy and Technology, the Center found more than 117 million adults are part of a “virtual, perpetual lineup,” accessible to law enforcement nationwide. Yep — even though you may not have ever gotten anything more than a speeding ticket, your photo may be accessible to law enforcement professionals as part of a surveillance database. Some of these photos are taken from surveillance cameras being used by city governments across the country, while others seem to have been compiled from less obvious sources. For example, certain companies engaging in facial recognition research (like IBM) obtain photos from publicly available collections for research purposes to “train” their algorithms. Unfortunately, this seems to have been done without the consent of the people whose photos are being used in this manner, and what’s worse, it may not matter.
How can IBM do this? From a copyright perspective, they are covered on at least two fronts. First, Section 107 of the Copyright Act actually permits the “fair use” of copyrighted works for “purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research….” On the basis of research alone, it seems that IBM’s use of such photos would be permissible. Secondly, the photos used by IBM appear to derive from a collection of 99.2 million photos known as the YFCC100M compiled by Yahoo (the former owner of Flickr) for research purposes, and subject to the Creative Commons license. That said, it seems that the people whose photographs have been used did not consent to such use (according to NBC News).
The Creative Commons is a public license framework that essentially permits the “creator” of the work to “retain copyright while allowing others to copy, distribute, and make some uses of their work — at least non-commercially,” as well as maintain attribution to their works. There are different types of licenses available through the Creative Commons, so the question becomes whether Creative Commons permits such use. According to Creative Commons CEO Ryan Merkley, it seems that the answer is, well, a “fair” one:
[C]opyright is not a good tool to protect individual privacy, to address research ethics in AI development, or to regulate the use of surveillance tools employed online. Those issues rightly belong in the public policy space, and good solutions will consider both the law and the community norms of CC licenses and content shared online in general.
Where the use is “fair use,” then the Creative Commons license will not prevent use of the photos in the dataset. Further, “[i]f someone uses a CC-licensed work with any new or developing technology, and if copyright permission is required, then the CC license allows that use without the need to seek permission from the copyright owner so long as the license conditions are respected.” Although most people take this response as a “yes,” the real takeaway is that the Creative Commons license will permit the fair use of such works insofar as AI training datasets are concerned, but is simply not operative as a mechanism to protect individual privacy.
And therein lies the rub, so to speak. For most individuals, having their photo in an AI-training dataset for research may not be an issue, but where that dataset is being used for a commercial purpose, things get a bit more murky. I cannot disagree with the position of Creative Commons — copyright law is not designed to protect individual privacy because at its core, copyright is designed to protect a creator’s rights in and to original works and prevent unauthorized uses of such works. That said, copyright law specifically preserves certain exclusive rights in the creator of such works. It does not automatically follow that a research dataset that uses photos under the fair use exception to copyright can then be used as the basis for a commercial enterprise’s facial recognition software. Thankfully, whether such use constitutes fair use requires the consideration of at least the following factors: “(1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and (4) the effect of the use upon the potential market for or value of the copyrighted work.”
So what does all this mean? First, the use of the YFCC100M photos in AI-training datasets appears to be permissible copyright fair use. To the extent an individual objects to the inclusion of their photo in such a dataset, the issue becomes more complicated depending upon how the dataset is to be used. For internal research, there is little recourse under copyright or state privacy statutes. From a privacy perspective, state privacy tort claims may not work — misappropriation of likeness usually requires an exploitation of some aspect of the person’s reputation, prestige, or commercial or social standing, or some other value associated with the likeness (which is not likely in most cases). Other state privacy torts — intrusion upon seclusion, public disclosure of private facts, and false light — are equally ill-suited to protect against the use of such photos within facial recognition datasets for most people. This becomes even more apparent for commercial uses of the datasets — whether such uses stay within the confines of fair use requires a balancing of factors, including whether the use of the photos is primarily for commercial purposes. Given the situation, it seems that regulation may be the best option. In fact, the Commercial Facial Recognition Privacy Act was recently introduced in the Senate “[t]o prohibit certain entities from using facial recognition technology to identify or track an end user without obtaining the affirmative consent of the end user.” Whether this legislation will become law is anyone’s guess.
Like many legal issues involving evolving technology, there is more here than meets the eye. Unlike your face, however, the answers are simply not as easy to recognize.
Tom Kulik is an Intellectual Property & Information Technology Partner at the Dallas-based law firm of Scheef & Stone, LLP. In private practice for over 20 years, Tom is a sought-after technology lawyer who uses his industry experience as a former computer systems engineer to creatively counsel and help his clients navigate the complexities of law and technology in their business. News outlets reach out to Tom for his insight, and he has been quoted by national media organizations. Get in touch with Tom on Twitter (@LegalIntangibls) or Facebook (www.facebook.com/technologylawyer), or contact him directly at [email protected].