One code to rule them all: How big data could help the 1 percent and hurt the little guy

Computer algorithms could run business, law enforcement and more. But what happens when they get it wrong?

Published January 3, 2014 12:43PM (EST)

(Ryan Rodrick Beiler, jeka84 via Shutterstock/Salon)

Visual artist Adam J. Manley says December is his favorite month. But this year's holiday season proved bumpy. On Dec. 21, moments after Manley uploaded a video titled "Winter Solstice" to YouTube, he was hit by a copyright claim delivered by YouTube's automated Content ID system. His violation? An a cappella version of "Silent Night."

When a Content ID copyright claim kicks in, the revenue from any ads that appear on a video is diverted from the creator of the video to the copyright claimant. Since as far back as 2007, Content ID has been the clever, innovative way YouTube has resolved the clash between anarchic user-generated content and corporate concerns about piracy and copyright violation.

But in this particular instance, the money grab was fraudulent. "Silent Night," originally written in 1818, is in the public domain. Manley immediately disputed the claim. It was promptly dropped, after which Manley "re-monetized" the video. Time was of the essence! The window to cash in on a Winter Solstice-themed video closes pretty quickly once the days start getting longer again.

But that wasn't the end of it. The next morning --

...I woke to find my video had been de-monetized again. Once again, YouTube’s automatic Content ID system had decided that my rendition of a public domain song belonged to someone else. This time, it was three claims at once, all from major record labels: BMG, Warner/Chappell, and Universal Music Publishing Group.

Three of the biggest music publishers in the world had all made completely bogus claims on Manley's rendition of "Silent Night." Google's much-vaunted system for policing copyright violations was being systematically abused by agile, well-financed entertainment corporations.

And in that anecdote there is a warning we need to hear. In a world where every step we take is increasingly mediated by digital networks and devices, we are going to increasingly find ourselves governed by automated software regimes. Call it "algorithmic regulation" or "embedded governance" or "automated law enforcement," these built-in systems are sure to become ubiquitous. They will be watching for stock market fraud and issuing speeding tickets. They will doubtless be quicker to act, more all-seeing and less forgiving than the human-populated bureaucracies that preceded them.

Advocates of greater bureaucratic efficiency may well be happier in an algorithmically regulated future. But Adam Manley's example raises a serious question that has a pretty obvious answer. When the network automatically delivers its ruling, who will be better positioned to contest the inevitable miscarriages of justice sure to follow? The little guy, or the well-capitalized corporation?

* * *

Algorithmic regulation fans look at our current regulatory system and law enforcement regimes and see a broken mishmash. Laws are outdated, and their application is inadequately funded and clumsily enforced. Google's Content ID is a promising example, they argue, of a better way forward. Imagine, suggests computer books publisher and Web 2.0 evangelist Tim O'Reilly, a financial system regulatory scheme that automatically punished fraud, just as Content ID spotlights copyright violation or Gmail targets spam.

Consider financial markets. New financial instruments are invented every day and implemented by algorithms that trade at electronic speed. How can these instruments be regulated except by programs and algorithms that track and manage them in their native element in much the same way that Google’s search quality algorithms, Google’s “regulations”, manage the constant attempts of spammers and black hat SEO experts to game the system?

...[W]hen Google discovers via algorithmic means that a new kind of spam is damaging search results, they quickly change the rules to limit the effect of those bad actors. We need to find more ways to make the consequences of bad action systemic, rather than subject to haphazard enforcement.

As examples of already existing algorithmic regulation, O'Reilly cites "congestion pricing" to reduce traffic in city downtowns, "smart" parking meters that raise or lower their tolls according to supply and demand, and the "reputational" systems in which drivers and passengers rate each other on services like Uber and Lyft. "As users of these services can attest," he writes, "reputation does a better job of ensuring a superb customer experience than any amount of government regulation."

There seems little question that such models will spread. As a group of academics wrote in the illuminating article "Confronting Automated Law Enforcement," recent technological developments all but mandate the rollout of algorithmic regulation.

The ubiquity of sensors, advances in computerized analysis and robotics, and widespread adoption of networked technologies have paved the way for the combination of sensor systems with law enforcement algorithms and punishment feedback loops. While in the past, law enforcement was manpower intensive and moderated by the discretion of the police officer on the beat, automated systems scale efficiently, allow meticulous enforcement of the law, provide rapid dispatch of punishment and offer financial incentives to law enforcement agencies, governments, and purveyors of these systems.

A white paper on "Embedded Governance" from the Institute for the Future is even more direct:

Laws, now written on paper and enforced by people, will be carried on software and enforced through electronically updated and immediately downloadable rules woven into the fabric of our environment. Governance will become automatic, and lawbreaking much more difficult.... Embedded governance will prevent many of the crimes and violations we see today from happening. Firearms will work only when operated by their rightful, registered owners. Office computers will shut down after 40 hours of work unless overtime has been authorized. Disasters and quarantines could also be managed more effectively if information about citizens were known and if laws were downloaded to change behaviors immediately.

Cue the backlash!

One of the most vociferous critics of algorithmic regulation is Evgeny Morozov, a writer who has made a career out of mocking technological "solutionism." In his seminal Technology Review article, "The Real Privacy Problem," Morozov warned of the possibility that our new overlord algorithms will "do the moral calculus on their own." Say goodbye to human autonomy!

... [T]he new digital infrastructure, thriving as it does on real-time data contributed by citizens, allows the technocrats to take politics, with all its noise, friction, and discontent, out of the political process. It replaces the messy stuff of coalition-building, bargaining, and deliberation with the cleanliness and efficiency of data-powered administration....

Reaching back to a 1985 essay by the German privacy scholar Spiros Simitis, Morozov outlines a foreboding future:

Instead of getting more context for decisions, we would get less; instead of seeing the logic driving our bureaucratic systems and making that logic more accurate and less Kafkaesque, we would get more confusion because decision making was becoming automated and no one knew how exactly the algorithms worked. We would perceive a murkier picture of what makes our social institutions work; despite the promise of greater personalization and empowerment, the interactive systems would provide only an illusion of more participation.

It's important to note that Tim O'Reilly's vision of algorithmic regulation is predicated on openly accessible data and transparency, along with a clear social consensus on what the algorithm is supposed to achieve. O'Reilly's belief that Morozov consistently misrepresents his views to cast them in their worst possible light is merited. But Adam Manley's experience with Content ID -- an incident that is hardly unique -- provides a perfect illustration of Morozov's "murkier picture." Imagine if, instead of being accused of a bogus copyright infringement, Manley had gotten a speeding ticket when he wasn't even driving a car, or been flagged by his Internet service provider for trafficking in child pornography when he was completely innocent. Anyone who has dealt with the user support systems for large Internet providers or major Web service companies like Google or Facebook or Yahoo knows how difficult it can be to reach the humans hiding behind the shields of the all-powerful algorithm.

But what's equally likely to be true is that entities with significant legal and financial resources will be on much more equal footing with their algorithmic regulators than ordinary people. The notion that an automated fraud punishment regime won't be contested at every possible juncture by the likes of Goldman Sachs and JPMorgan Chase is just ludicrous. You and I will find it mind-numbingly difficult to contest that speeding ticket, but the big banks will tie their computer regulators up in knots just as easily as they stymie their human regulators.

I asked Tim O'Reilly if the Content ID "Winter Solstice" incident poked any holes in the promise of algorithmic regulation. He didn't think so.

"I would bet that Google will get much more sophisticated over time, and will learn when the takedown requests are false," he said. "The general lesson from algorithmic regulation systems is that you focus on the outcome and you continue to tweak the algorithm to achieve that outcome, precisely because people do try to game the system. At least with an algorithmic regulation system, you have a chance of adapting more quickly. With an old-fashioned paper regulatory system, people game the system too and they go on doing it for years."

Fair enough. In a perfect world, we will always be tweaking the algorithm. But the nature of those tweaks will be just as contested as the writing of existing regulations is contested in Congress. And the same power law is likely to be as true in the black boxes of software as in the legislative sausage factory. Capital writes the rules. And the more we take humans out of the picture, the harder it will be for real people to fight the power.

By Andrew Leonard

Andrew Leonard is a staff writer at Salon. On Twitter, @koxinga21.
