The United States jails more of its citizens, by percentage and in raw numbers, than any other country on earth, including those we label dictatorships and criticize as human rights violators. Judge, jury and parole board verdicts are influenced by everything from lived experience to momentary mood to how recently participants have had a food break. Studies consistently show that being black counts against defendants, resulting in far longer, harsher penalties than white offenders get for the same crimes.
So what solution are courts now employing in order to overcome those biases? Let computers make sentencing decisions.
Correctional Offender Management Profiling for Alternative Sanctions, or COMPAS, is perhaps the most widely used risk-assessment algorithm. The program, distributed by Northpointe Inc., uses data to make predictions about the likelihood that a criminal defendant will reoffend. Essentially a digital questionnaire, COMPAS poses 137 queries, then uses the answers to determine, on a scale from 1 to 10, whether a defendant is at a high or low risk of committing more crimes. (No one, save for the manufacturer, knows precisely how COMPAS’ proprietary algorithm works, and Northpointe has repeatedly declined to offer greater transparency.)
Risk scores are supposed to be just one of a constellation of factors that inform sentencing, but research has found that those numbers often weigh heavily on the outcome. In effect, artificial intelligence programs are now the basis of critical life decisions for already vulnerable humans.
As you might guess, the problems with this practice have proven myriad. The most glaring issue relates to the tendency of computer programs to replicate the biases of their designers. That means that along with, say, the ability to crunch data in the blink of an eye, racism and sexism are also built into our AI programs. A 2016 ProPublica study found that COMPAS is “particularly likely to falsely flag black defendants as future criminals, wrongly labeling them this way at almost twice the rate as white defendants.” The analysis also determined that white offenders were wrongly given particularly low scores that were poor predictors of their real rates of recidivism. Ellora Thadaney Israni, a former software engineer and current Harvard Law student, notes that without constant corrective upkeep to make AI programs like COMPAS unlearn their bigotry, those biases tend to be further compounded. “The computer is worse than the human,” Israni writes in the New York Times. “It is not simply parroting back to us our own biases, it is exacerbating them.”
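To make the idea of "falsely flagging" concrete, here is a minimal Python sketch of how a false positive rate could be compared across two groups. The records are invented toy data, not ProPublica's dataset, and the function name, the group labels "A" and "B", and the record layout are all assumptions made for illustration.

```python
from collections import defaultdict

def false_positive_rates(records):
    """Return {group: share of non-reoffenders who were flagged high risk}."""
    flagged = defaultdict(int)  # non-reoffenders who were flagged high risk
    total = defaultdict(int)    # all non-reoffenders
    for group, flagged_high, reoffended in records:
        if reoffended:
            continue  # a false positive can only involve someone who did not reoffend
        total[group] += 1
        if flagged_high:
            flagged[group] += 1
    return {g: flagged[g] / total[g] for g in total}

# Hypothetical toy records: (group, flagged_high_risk, reoffended)
toy = (
    [("A", True, False)] * 4 + [("A", False, False)] * 6
    + [("B", True, False)] * 2 + [("B", False, False)] * 8
    + [("A", True, True)] * 5 + [("B", True, True)] * 5
)
rates = false_positive_rates(toy)
print(rates)  # {'A': 0.4, 'B': 0.2}
```

In this made-up example, group A is wrongly flagged at twice the rate of group B even though both groups contain the same number of people who actually reoffended. That is the shape of disparity the ProPublica quote describes.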
Beyond helping an already racist system perpetuate inequalities, risk assessments reduce a defendant to a series of facts and data points, without nuance or human understanding, and so miss mitigating factors that offer a fuller picture. Israni notes that while judges and juries are notoriously prone to human failures in reason, it remains true that a “computer cannot look a defendant in the eye, account for a troubled childhood or disability, and recommend a rehabilitative sentence.” The reverse is true as well. Computers can miss red flags, while traits that look good on paper can outweigh more serious issues, favorably skewing a defendant’s score.
“A guy who has molested a small child every day for a year could still come out as a low risk because he probably has a job,” Mark Boessenecker, a Superior Court judge in California’s Napa County, told ProPublica. “Meanwhile, a drunk guy will look high risk because he’s homeless. These risk factors don’t tell you whether the guy ought to go to prison or not; the risk factors tell you more about what the probation conditions ought to be.”
At the end of the day, the ProPublica investigation found that COMPAS in particular, and risk assessment programs in general, are not very good at their jobs.
Only 20 percent of the people predicted to commit violent crimes actually went on to do so. When a full range of crimes was taken into account (including misdemeanors such as driving with an expired license), the algorithm was somewhat more accurate than a coin flip. Of those deemed likely to re-offend, 61 percent were arrested for any subsequent crimes within two years.
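Both percentages measure the same thing: of the people the program flagged, what share went on to be rearrested. A minimal Python sketch of that arithmetic follows. The group size of 100 is a hypothetical chosen only so the counts match the 61 percent and 20 percent figures reported above, and the function name is an assumption.

```python
def share_correct(flagged_outcomes):
    """Share of flagged people who were later rearrested (True = rearrested)."""
    return sum(flagged_outcomes) / len(flagged_outcomes)

# Hypothetical: 100 people flagged in each category
any_crime = [True] * 61 + [False] * 39  # flagged as likely to reoffend at all
violent = [True] * 20 + [False] * 80    # flagged as likely to commit a violent crime

print(share_correct(any_crime))  # 0.61
print(share_correct(violent))    # 0.2
```

How good or bad 61 percent is depends on how common rearrest is among everyone being scored, which this sketch does not model.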
Risk assessment tools continue to be used in courtrooms around the country, despite so much troubling evidence and a recent court challenge. A Wisconsin man named Eric Loomis was sentenced to six years in prison for driving a stolen car and fleeing police, with the judge in the case citing Loomis’ high COMPAS score during sentencing. Loomis appealed the ruling up to the U.S. Supreme Court, which declined to hear the case. In doing so, the court essentially (though not explicitly) gave its blessing to the program’s use.
In an era in which the Trump Department of Justice has repeatedly promised to push policies that make the justice system fail at even more turns, the use of AI programs in our courts is all the more dangerous. At the very least, courts, which don’t understand how the programs they use arrive at the assessments they consider, should attempt to find more transparent systems and to mandate oversight that keeps those systems functioning at an optimal level. But that would actually be a departure from the way the courts have always functioned in this country, and it would require the U.S. to develop a real commitment to justice.