In one large metropolitan area, arraignment decisions made with the assistance of machine learning cut new domestic violence incidents by half, leading to more than 1,000 fewer such post-arraignment arrests annually, according to new findings.
In the United States, the pretrial process typically moves from arrest to a preliminary arraignment and then, when appropriate, to a court appearance. At the preliminary arraignment, a judge or magistrate decides whether to release or detain the suspect, a decision intended to account for the likelihood that the person will return to court or commit new crimes. The stakes are especially high in domestic violence cases, which often involve repeated offenses directed at a particular individual.
Arraignments are usually brief, with outcome projections based on limited data. However, Richard Berk, a professor of criminology and of statistics at the Wharton School, and Susan B. Sorenson, a professor of social policy in the School of Social Policy and Practice, found that using machine-learning forecasts at these proceedings can dramatically reduce subsequent domestic violence arrests.
“A large number of criminal justice decisions by law require projections of the risk to society. These threats are called ‘future dangerousness,’” Berk says. “Many decisions, like arraignments, are kind of seat-of-the-pants. The question is whether we can do better than that, and the answer is yes we can. It’s a very low bar.”
For domestic violence crimes between intimate partners, parents and children, or even siblings, there’s typically a threat to one particular person, says Sorenson, who directs Penn’s Evelyn Jacobs Ortner Center on Family Violence.
“It’s not a general public safety issue,” she says. “With a domestic violence charge, let’s say a guy—and it usually is a guy—is arrested for this and is awaiting trial. He’s not going to go assault some random woman. The risk is for a re-assault of the same victim.”
To understand how machine learning could help in domestic violence cases, Berk and Sorenson obtained data from more than 28,000 domestic violence arraignments between January 2007 and October 2011. They also looked at a two-year follow-up period after release that ended in October 2013.
A computer can “learn” from training data which kinds of individuals are likely to re-offend. For this research, the 35 initial inputs included age, gender, prior warrants and sentences, and even residential location. From these inputs the algorithm learns associations between an individual’s characteristics and the risk of re-offense, giving a court official extra information when deciding whether to release the person.
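As a rough illustration of that workflow, the sketch below fits a generic classifier to synthetic records with arraignment-style inputs (age, gender, prior warrants, prior sentences, a coarse residential code) and produces a re-arrest risk score for each case. The feature names, the random-forest choice, and every number here are assumptions for illustration only, not the study’s actual data or model.

```python
# Illustrative sketch only: train a generic classifier on synthetic
# arraignment-style features to forecast re-arrest. Not the study's model or data.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
n = 5000  # synthetic stand-in for the ~28,000 real arraignment records

X = pd.DataFrame({
    "age": rng.integers(18, 70, n),
    "male": rng.integers(0, 2, n),
    "prior_warrants": rng.poisson(1.5, n),
    "prior_sentences": rng.poisson(0.8, n),
    "residence_zone": rng.integers(0, 20, n),  # coarse residential-location code
})
# Synthetic outcome: 1 = re-arrested for domestic violence during follow-up
y = (rng.random(n) < 0.1 + 0.02 * X["prior_warrants"]).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestClassifier(n_estimators=500, random_state=0)
model.fit(X_train, y_train)

# The kind of forecast a magistrate might see: estimated probability of re-arrest.
risk = model.predict_proba(X_test)[:, 1]
pred = (risk > 0.5).astype(int)
print(confusion_matrix(y_test, pred))
```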
“In all kinds of settings, having the computer figure this out is better than having us figure it out,” Berk says.
That’s not to say there aren’t obstacles to its use. The number of mistaken predictions can be unacceptably high, and some people object in principle to using data and computers in this manner. To both of these points, the researchers respond that machine learning is simply a tool.
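To make concrete what “mistaken predictions” means here, the hedged sketch below scores synthetic cases and shows how moving the decision threshold trades missed re-offenders (false negatives) against people needlessly flagged as high risk (false positives). All values are invented for illustration.

```python
# Illustrative sketch: the same risk scores, judged at different thresholds,
# trade false negatives against false positives. Synthetic data only.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
truth = rng.random(n) < 0.15  # 1 = would re-offend if released (synthetic)
scores = np.clip(0.15 + 0.5 * truth + rng.normal(0, 0.25, n), 0, 1)  # noisy risk scores

for threshold in (0.3, 0.5, 0.7):
    flagged = scores >= threshold
    false_neg = np.sum(~flagged & truth)   # missed re-offenders
    false_pos = np.sum(flagged & ~truth)   # needlessly flagged as high risk
    print(f"threshold={threshold:.1f}  missed re-offenders={false_neg}  "
          f"over-flagged={false_pos}")
```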
“It doesn’t make the decisions for people by any stretch,” Sorenson says. These choices “might be informed by the wisdom that accrues over years of experience, but it’s also wisdom that has accrued only in that courtroom. Machine learning goes beyond one courtroom to a wider community.”
In some criminal justice settings, use of machine learning is already routine, although different kinds of decisions require different datasets from which the computer must learn. The underlying statistical techniques, however, remain the same.
Berk and Sorenson contend the new system can improve current practices.
“The algorithms are not perfect. They have flaws, but there are increasing data to show that they have fewer flaws than existing ways we make these decisions,” Berk says. “You can criticize them—and you should, because we can always make them better—but, as we say, you can’t let the perfect be the enemy of the good.”
The Penn researchers published their work in the Journal of Empirical Legal Studies.