Automated Copyright Filtering Removes Public Domain Mueller Report From Platform

Automated upload filters are ripe for abuse, or at least significant confusion.

Robert Mueller (Photo by SAUL LOEB/AFP/Getty Images)

Rightholders often argue that the digital era creates outsized piracy problems, given how quickly information can be copied and spread. As a result, many have proposed solutions such as “notice and staydown” (instead of “notice and takedown”) or advocate for automatic copyright filters (a proposal that ultimately made it into the EU’s Copyright Directive). While it is easy to sympathize with rightholders’ concerns about actual piracy, upload filters, which screen content before it even appears on a platform, pose problems of their own and can block or remove completely non-infringing material. User-generated content and public domain works are particularly vulnerable. In short, automated copyright filters are not a perfect solution.

One obvious problem with automated filters is that it is difficult, if not impossible, for an algorithm to account for fair use. In the famous “Dancing Baby” case, Lenz v. Universal, a mother uploaded a video of her toddler dancing to a clip of a Prince song (the audio of which was poor). The video was hit with a takedown notice, and the courts ultimately held that a rightholder must consider fair use before issuing one. That is all well and good for takedown notices, but how does an algorithm account for fair use in automated filtering? There are various approaches, such as measuring what portion of a copyrighted work appears in the uploaded material, but they remain problematic. Fair use is determined case by case by weighing four statutory factors, including the purpose of the use and its effect on the market, and a matching algorithm cannot properly weigh them.
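To make the difficulty concrete, here is a hypothetical sketch of a “portion used” heuristic in Python, with difflib standing in as a crude substitute for a real matching engine. The threshold, names, and sample text are all invented for illustration; the point is that a reuse percentage says nothing about purpose, transformativeness, or market effect.

```python
# Hypothetical "portion used" heuristic. All names, sample text, and the
# threshold are invented; difflib stands in for a real matching engine.
from difflib import SequenceMatcher

def portion_reused(reference: str, upload: str) -> float:
    """Fraction of the reference work that also appears in the upload."""
    matcher = SequenceMatcher(None, reference, upload)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(reference)

reference = "lyrics of a copyrighted song " * 10
upload = reference[:60] + "... followed by critical commentary on the song"

ratio = portion_reused(reference, upload)
print(f"{ratio:.0%} of the reference reused")

# A fixed threshold flags this upload even though quoting a work in order
# to criticize it is a classic fair use; the ratio is blind to purpose,
# market effect, and the nature of the work.
THRESHOLD = 0.2
print("flagged" if ratio >= THRESHOLD else "allowed")
```

Run as written, the excerpt trips the threshold: the heuristic sees only how much was copied, not why.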

Automated upload filters are also ripe for abuse, or at least significant confusion. Automated filters depend on a database of copyrighted content against which uploads are matched. If an upload matches content in the database, it is flagged and automatically filtered out. These databases are populated by rightholder submissions, which means they can include material that is not under copyright at all, or that the submitter does not actually own.
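As a rough illustration of that honor-system design, here is a minimal sketch of a content-ID-style filter. The exact-hash “fingerprint” and every name here are hypothetical stand-ins (real systems use robust perceptual fingerprints, not exact hashes); what matters is that nothing in the pipeline checks whether a submitter actually owns what it submits.

```python
# Minimal sketch of a content-ID-style filter (hypothetical; real systems
# use perceptual fingerprints rather than exact hashes).
import hashlib

def fingerprint(content: bytes) -> str:
    """Stand-in for a perceptual fingerprint: here, just a SHA-256 hash."""
    return hashlib.sha256(content).hexdigest()

# Rightsholders populate the reference database on an honor system;
# nothing verifies that the submitter owns the copyright.
reference_db: dict[str, str] = {}

def submit_reference(claimed_owner: str, content: bytes) -> None:
    reference_db[fingerprint(content)] = claimed_owner

def check_upload(content: bytes) -> str | None:
    """Return the claimant's name if the upload matches, else None."""
    return reference_db.get(fingerprint(content))

# A publisher submits a public domain work (say, a government report)...
submit_reference("Example Publisher", b"full text of a public domain report")
# ...and every later upload of the same work is flagged, ownership or not.
print(check_upload(b"full text of a public domain report"))  # Example Publisher
```

Once the public domain text is in the database under someone’s name, the filter cannot distinguish that claim from a legitimate one.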

One of the best-known examples of material mistakenly flagged for copyright infringement came in 2012, when NASA’s own footage of the Curiosity rover’s Mars landing was blocked on YouTube. That’s right: NASA was blocked from sharing video it created itself, video in which no one owns a copyright in the United States (as a work of the U.S. government, it is in the public domain). How could this possibly happen? NASA’s footage had been used in various news broadcasts, and the broadcasters submitted their newscasts to YouTube’s Content ID database. Because the NASA footage was incorporated in those broadcasts, it was flagged as a copyrighted work. NASA is a prominent example, but others face the same problem: when news programs run bystander clips of newsworthy events, the actual copyright owners of that underlying footage may later find their own uploads flagged.

More recently, automated filters ensnared the Mueller report, another U.S. government work and therefore also in the public domain. Users uploaded the report, which is freely available in a number of places online, to Scribd. Despite its public domain status, Scribd began mass-removing the uploads. Again, one wonders how this could happen, but databases that simply depend on rightholder contributions operate, in essence, on an honor system. Publishers have started selling bound copies of the Mueller report (as they did with the 9/11 Commission report, which shot up the bestseller lists on day one, though I have yet to meet anyone who actually read the whole thing). Those publishers submitted copies of the Mueller report to Scribd’s content ID database, despite owning no copyright in the underlying work.

The removals of NASA’s footage and of the Mueller report are clear cases where content ID failed, because these works are in the public domain and free to all in the United States. And they don’t even reach the trickier questions of fair use, where copyrighted content (a song, a poster, a TV show) may appear in the background of what a user-generated video is actually featuring, or of satire and political commentary, which may incorporate a copyrighted work and thus trip the content ID system.

As the clamor for automated filtering through content ID systems grows, particularly with the passage of the EU Copyright Directive earlier this year, the United States should tread carefully before imposing upload filters on platforms. The potential for abuse and mistaken matches is clear, and the unintended consequences must be taken into account.


Krista L. Cox is a policy attorney who has spent her career working for non-profit organizations and associations. She has expertise in copyright, patent, and intellectual property enforcement law, as well as international trade. She currently works for a non-profit member association advocating for balanced copyright. You can reach her at kristay@gmail.com.
