Nanney: Clearview AI demonstrates online users’ lack of control over their privacy

Clearview AI's facial recognition technology is able to match faces to a database of over 3 billion images.

Clearview AI [Public domain]

Andie Nanney, Guest Contributor

How many times have you been told to be careful about the things you post online? That everything you put onto the internet is permanent and public no matter what you do?

With the rise of increasingly invasive technology, it’s getting wiser to heed this advice. Thanks to an explosive New York Times article published early this January, the technology most prominent in the public eye right now is that of Clearview AI, a tech startup with a mission to aid law enforcement through facial recognition technology that will “identify child molesters, murderers, suspected terrorists, and other dangerous people quickly, accurately, and reliably to keep our families and communities safe.” The company uses a self-reported database of over 3 billion images, scraped from social media websites such as Facebook and Twitter, to find criminals who had previously gone unidentified. To reiterate: this private company has collected three billion images from social media websites (alongside videos, addresses, contact information, and even the names and respective identifying information of people one may associate with), stored them all in a database, and developed a tool to match them to images taken at crime scenes and from surveillance cameras. According to the New York Times, law enforcement agencies have used similar technology, but with databases of up to 411 million photos that are usually passport photos, driver’s license photos, and mugshots, making Clearview’s mountain of data nearly unprecedented.

The Clearview AI website makes clear that it is not a consumer application, but sources have reported that licenses have been sold to private companies alongside law enforcement. There is also a discrepancy in just how many agencies are currently using Clearview AI. Clearview first claimed the number was around 600, but an investor in Clearview said in a media interview that the company was working with “over one thousand independent law enforcement agencies.”

Depending on the way you approach privacy online, this may sound alarming, but perhaps like a necessary evil–if it’s helping to catch child molesters, murderers, suspected terrorists, and other dangerous people, isn’t the sacrifice of privacy worth it? Well, that’s if it’s actually doing anything to help. The company has claimed in advertisements that it played a role in a number of cases, ranging from those of an assailant in Brooklyn and a groper on an NYC subway to a vague 40 cold cases. However, nothing verifies these claims. The NYPD has denied that Clearview played a role in the first two cases, to which Clearview AI responded by saying the information it contributed came through an anonymous tip line.

Data privacy advocates, such as the Electronic Frontier Foundation, have been quick to speak out about Clearview–and rightfully so. Currently, there isn’t any concrete law deeming what Clearview is doing illegal, but data privacy laws in the US have proven to be phenomenally outdated, doing little to protect the information of American citizens. Michael Chertoff, former Secretary of Homeland Security and co-author of the Patriot Act, details a scenario that current data laws allow for in his book Exploding Data: Reclaiming Our Cybersecurity in the Digital Age. In it, he describes a children’s toy that records its conversations with the child to whom it was given, and collects information about that child, in order to become a better ‘companion.’ The toy is described as being realistic in its dialogue, eventually being able to initiate a conversation with the child.

In all fairness, if this sounds dystopian, that’s because it is. It’s only a hypothetical scenario, based on real events, that Chertoff describes at the beginning of the book as a lead-in to why better data laws are so important–and why opt-out policies aren’t enough.

These opt-out policies are what have allowed things like Clearview to exist. Again, all of the information the company collected was from technically public sources–but should things such as personal photos, videos, addresses, contact information, and social circles be treated as public by default? Especially when the misuse of this information could lead to catastrophic consequences?

In China, information like this is already being misused. It is estimated that the country has more than 200 million surveillance cameras, a vast but unknown number of which are used to feed information to a dataset detailing the movement, gas and electricity usage, and facial recognition data of mainly Uighur citizens–an ethnic minority in China which is largely Muslim. This data is then being used to place them into what the Chinese government calls “re-education camps.” Sophie Richardson, the China director for Human Rights Watch, tells FRONTLINE, “The kinds of behavior that’s now being monitored — you know, which language do you speak at home, whether you’re talking to your relatives in other countries, how often you pray — that information is now being Hoovered up and used to decide whether people should be subjected to political reeducation in these camps.” It is perhaps one of the worst-case scenarios for what could be done with the disaster soup of astronomical amounts of personal data being publicly available, easy access to it, and the strong capabilities of modern technology. Clearview is not the software China is using, but it could be what other countries with histories of infringing on human rights use: “A document obtained via a public records request reveals that Clearview has been touting a ‘rapid international expansion’ to prospective clients using a map that highlights how it either has expanded, or plans to expand, to at least 22 more countries, some of which have committed human rights abuses,” writes Buzzfeed News.

It is unlikely that something similar to what is happening in China will happen in America, but similar misuses have undoubtedly already occurred. An email from Clearview sent to a police lieutenant in Green Bay, Wisconsin read, “Have you tried taking a selfie with Clearview yet? It’s the best way to quickly see the power of Clearview in real-time. Try your friends or family. Or a celebrity like Joe Montana or George Clooney. Your Clearview account has unlimited searches. So feel free to run wild with your searches.” There is a perfectly innocent motive for sending this email: the more times any AI practices its function, the better it gets. Encouraging people to use Clearview as much as possible will only benefit the company’s product. However, this email also provides an easy excuse for any law enforcement official using the software for less innocent purposes, such as stalking someone or selling the information it finds on someone to an unauthorized third party. The possibilities are endless, and the company’s terms of service contradict what it has told clients. There is one line addressing misuse:

“Briefly, the User Code of Conduct requires that all Users maintain the security of their own account, only use the Services for law enforcement or security purposes that are authorized by their employer and conducted pursuant to their employment, and independently support and verify all image search results.”

What exactly is the law enforcement or security purpose of searching for George Clooney? Of running wild with searches?

And, of course, there is the issue of the inherent dangers of any facial recognition tool in law enforcement. Study after study has suggested a disproportionately lower accuracy rate for people of color in popular facial recognition tools. Most would agree that the current criminal justice system in America is already unfair enough to minorities, and the only assurance that the software will not add to that unfairness comes from Clearview’s own claims and studies. Additionally, with AI, there is a danger of corruption because its behavior is based entirely on what it has been trained on. An infamous but fantastic example of an AI with a perfectly innocent function becoming corrupted is Microsoft’s chatbot, Tay, which was meant to get better at speaking the way people online tend to by talking with human Twitter users. However, Microsoft was forced to take it down in less than a day–it quickly began tweeting worryingly offensive and conspiratorial comments.

The company has also had ties to politically far-right individuals, which, given the previously explained impressionability of AI, is alarming. An article from BuzzFeed News reads,

“While there’s little left online about [Smartcheckr, Clearview AI’s original name], BuzzFeed News obtained and confirmed a document, first reported by the Times, in which the company claimed it could provide voter ad microtargeting and ‘extreme opposition research’ to Paul Nehlen, a white nationalist who was running on an extremist platform to fill the Wisconsin congressional seat of the departing speaker of the House, Paul Ryan.”

Adding to this, despite claims of high accuracy from Clearview itself, there is no third-party verification confirming them. Therefore, trusting the results of Clearview AI’s software requires blind trust in the company–which, by using it, law enforcement officials across the U.S. are giving. The trust they are placing in Clearview AI is strong enough that its results are being used as a basis for investigations into American citizens.

There are all kinds of alarming ways in which information made accessible through one’s regular online usage could be used, and Clearview AI is a powerful example. As of now, Google, Facebook, YouTube, and Twitter have all sent cease-and-desist letters to Clearview in an attempt to stop it from scraping data from their platforms, and the company is facing various lawsuits. What’s next for Clearview isn’t so clear, but hopefully, as more groups and individuals take notice and talk about the ethics of so much information on the internet being treated as public by default, internet privacy laws will begin to catch up with the capabilities and applications of modern technology.