Social algorithms and the Weapons of Math Destruction
September 14, 2016
I owe much of my knowledge about the (negative) effects of algorithms on society to the writings of Cathy O’Neil. I highly recommend her blog mathbabe.org. A couple of months ago, she shared the proofs of her book Weapons of Math Destruction with me, and given that the book came out last week, I wanted to share some of my impressions. In this post, I want to summarize what makes a social algorithm into a weapon of math destruction, and share the example of predictive policing.
To start us off, let me sketch what I mean by a social algorithm. I will be borrowing the definition, although not the name, from Cathy. A social algorithm is a mathematical model, usually embedded in computer code, that often takes the form of a scoring system that evaluates a person on some criteria. The results are used to determine (sometimes directly and automatically, sometimes as a central point of consideration for a human decision maker) access to or denial of certain resources to that person. The social algorithm acts as a manager or advisor in place of a role typically occupied by a human decision maker or case worker. More often than not, these algorithms are not (completely) hard-coded but trained on population (and individual) level data.
In a world where prominent economists like Noah Smith believe that “[t]echnological progress (usually) causes moral progress”, it is easy to imagine social algorithms as improvements over human decision making. After all, humans are faulty machines full of biases and deviations from the perfect rationality of homo economicus. People will be racist, sexist, homophobic, xenophobic, Islamophobic; you name it. Computers will not. So why leave decisions about who to lend to, who to hire, who to release for early parole, who to admit to college, etc. to this basket of deplorables when instead they can be sorted by an objective algorithm? In this rhetorical world, social algorithms are fairer because they eliminate the possibility for individual bias and provide a uniform and mechanised procedure for evaluating all.
In the real world, this rhetoric remains, but the true motivation is usually efficiency. It is cheaper to have a computer evaluate credit scores, grade personality traits for hiring, or target internet ads than to have endless case workers. This efficiency, combined with the portability of algorithms — just copy the code and run on more machines — leads to scale. A single social algorithm can affect millions of people.
In the rhetorical world, since we design social algorithms, we can examine them in ways that we cannot examine human biases. In the real world, many of the widest reaching social algorithms are controlled by private entities that keep their source code and training data secret. But even in the cases where the source and data are publicly accessible, understanding the algorithm can still be inaccessible to those it affects. Most people are not statisticians, programmers, or data scientists. Most people cannot easily decipher computer code or check a dataset for validity and collection biases. In fact, the latter task requires a training in the quantitative social sciences that even the typical designers of social algorithms sometimes lack. So in reality, social algorithms can be just as opaque as human minds, if not more so for the average person. Combined with their scale, these algorithms can start to function like secret laws.
Finally, in the rhetorical world, a real-time data-informed social algorithm can adapt to each individual and society much faster than traditions or laws. It can keep up with the times and provide individual service. But in the real world, this allows social algorithms to get caught in vicious cycles of self-reinforcement and self-fulfilling prophecies. After all, social algorithms don’t aim to only interpret the world in various ways; their point is to change it. And their scale allows this. The scale of social algorithms allows them to have a significant effect on the datasets that inform or train them.
For Cathy O’Neil, three features are the hallmarks of weapons of math destruction (WMDs): (1) scale, (2) opacity, and (3) vicious self-reinforcement. WMDs are the social algorithms that produce negative outcomes for society and, in particular, for those who are already marginalized. She illustrates this with a number of case studies of algorithms that can affect us throughout our lives: algorithms to assess teachers based on their students’ standardized scores, to filter out applicants from minimum-wage jobs, to advise judges in sentencing or parole decisions, to micro-target online ads, monitor our health, score our credit, and more.
Consider, for example, predictive policing: the use of algorithms to determine where limited police resources should be deployed in order to minimize crime. Let’s check the hallmarks. By design, these algorithms regulate the deployment of many police officers and through them affect whole precincts, cities, and counties, or sometimes several counties when a particular algorithm is marketed to many police departments. The algorithm itself is usually inaccessible to both the police officers and the people they police. In the worst case, it is a proprietary black box implemented by a private contractor using nonpublic data. In the best case, it might use public data but still have an unreasonably high barrier to understanding, requiring a level of training in statistics or computer science that most citizens (and police officers) lack. The algorithm is opaque.
Most importantly, predictive policing algorithms are hungry. They take in all the data available to them: not just the occurrence of violent crimes like murder and assault, which are relatively rare, but also nuisance crimes (loitering, possession of small amounts of drugs, aggressive panhandling, etc.), which are relatively common. This is justified by the neoliberal mentality that more data can’t be worse (after all, the algorithm would just ignore it if it isn’t relevant) and often by a tacit belief in the broken windows theory: if there are minor crimes happening, then surely major crimes are more likely, too. So more police officers get deployed to primarily poor neighbourhoods where nuisance crimes are higher. But unlike violent crimes such as murder, nuisance crimes are often not reported or recorded unless there is a police officer to notice them. With more boots on the ground, more nuisance crimes are noticed, more citations are written, and more tickets are given. If a kid in a poor neighbourhood smokes pot on his porch, he is immediately noticed by a passing police officer. An analogous kid in a rich neighbourhood has no police officer around to notice. The recorded crime rate goes up, mostly from nuisance crimes; fewer wealthy people move to the neighbourhood, and the neighbourhood becomes even poorer and even more heavily policed. The algorithm reinforces itself, and the policing and enforcement of laws become more and more unequal.
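To make the self-reinforcement concrete, here is a toy sketch of the loop in Python. It is not any real predictive policing system and it is not from Cathy's book; the function name, the parameters, and the allocation rule are all my own illustrative assumptions. Two neighbourhoods have exactly the same true rate of nuisance crime, an incident is only recorded if a patrol is around to see it, and the algorithm sends patrols in proportion to a power (`sharpness`) of the cumulative recorded incidents.

```python
def simulate_patrol_feedback(initial_records, true_rate=100.0,
                             sharpness=2.0, periods=20):
    """Toy model: return neighbourhood A's share of patrols in each period.

    All names and numbers here are hypothetical, chosen for illustration.
    Both neighbourhoods have the SAME true nuisance-crime rate; they differ
    only in how many incidents happen to be on record at the start.
    """
    rec_a, rec_b = map(float, initial_records)
    shares = []
    for _ in range(periods):
        # Assumed allocation rule: patrols follow recorded crime, and the
        # rule favours the apparent hot spot (sharpness > 1).
        weight_a = rec_a ** sharpness
        weight_b = rec_b ** sharpness
        share_a = weight_a / (weight_a + weight_b)
        shares.append(share_a)
        # Nuisance crimes are recorded only where officers are present,
        # so new records scale with patrol share, not with true crime.
        rec_a += true_rate * share_a
        rec_b += true_rate * (1.0 - share_a)
    return shares


if __name__ == "__main__":
    # Neighbourhood A starts with one more recorded incident than B.
    for period, share in enumerate(simulate_patrol_feedback((11, 10)), 1):
        print(f"period {period:2d}: share of patrols sent to A = {share:.3f}")
```

In this sketch, a one-incident head start in the records is enough for neighbourhood A's share of patrols to grow period after period, even though nothing about the underlying behaviour differs between the two. If the two neighbourhoods start with identical records, the split stays at one half. The amplification depends on the assumed `sharpness` being above 1; with a purely proportional rule the initial imbalance would persist rather than grow.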
For Cathy, the efficiency, scale, and ‘objectivity’ offered by WMDs come at the expense of fairness, equal opportunity, and social justice. Social algorithms might be designed (or, at least, touted) to reduce injustice, but the ones that become WMDs instead increase injustice and inequality. In this way, social algorithms today are not all that different from previous technological advances. Christopher Jones, writing about the history of infrastructure (rail, roads, electricity, running water) in the US, points out that new technology “usually benefitted small groups and exacerbated social inequality.” It is only through the concentrated and conscious action of the public that technology can become a force for justice. The moral progress that Noah Smith associates with technology does not come from the technical innovations themselves; it comes from the community organizers and everyday people who learn how to cope with and co-opt the technology and so transform it into a force for justice and the public good. “Citizen engagement, not technical ingenuity, deserves credit” for that progress. Although Cathy does not make this historical connection, she does realize that the solution to Weapons of Math Destruction depends less on technical innovation than on social and policy innovation.
It is important to recognize that just being mathematical does not make algorithms inherently fair. For Cathy, they are just opinions embedded in code. But algorithms are also not inherently bad. We can have good opinions. But to turn social algorithms into good opinions, instead of WMDs, we need to better train those who design them. Cathy suggests a data scientist’s version of a Hippocratic oath for this. We need to enact and update the laws that protect workers, citizens, and consumers. And we need to eliminate the current levels of opacity in big data. It is possible to build and test algorithms for fairness, but we have to actually make that effort.
We have to stop worrying about the oppressive super-intelligent AI of the future and the tin-men that we imagine it’ll run in. Instead, we need to look at the flesh-and-blood men (and institutions) that are already using WMDs to oppress the masses for profit and power.