Much has been written about how big data can help to eradicate poverty, and significant analyses have been carried out to gauge the scale of the problem and determine its causes. One such project, run by Harvard, analyzed 1.4 billion federal tax records on income and life expectancy. It found that the average life expectancy of the lowest-income classes in America is now equal to that of Sudan or Pakistan, and that the richest men in the US now outlive the poorest by an appalling 15 years.
While you can thank big data for highlighting this terrible state of affairs, you can, according to some, also hold it partially responsible for creating it. In a new book titled ‘Weapons of Math Destruction’, data scientist and former Wall Street banker Cathy O'Neil details the many ways in which the powerful use mathematical models and big data as ideological tools. She argues that they exacerbate oppression and inequality by using these models to justify their decisions, ‘deliberately [wielding] formulas to impress rather than clarify.’
O'Neil argues that big data isn't always better data and is often biased against women and the poor. Those living in poor neighborhoods are, for example, often targeted with ads for predatory payday lenders because the data suggests that they are the most likely to take out such loans, perpetuating the cycle of poverty. She also cites examples of HR teams using credit scores in recruitment and overlooking potentially talented candidates on the misguided assumption that a poor score correlates with weaker job performance.
O’Neil is not the first to notice the relationship between data and poverty. Michele Gilman, a law professor at the University of Baltimore and a former civil-rights attorney at the Department of Justice, has also noted that standards of privacy around data collection are far lower for the poor. Not only is this a gross indignity, but the large volume of data collected often obstructs attempts to escape poverty. Gilman goes so far as to say that ‘data collection is where the poor are most stigmatized and humiliated.’
When we talk about protecting individuals’ right to privacy, poor people on welfare are rarely, if ever, considered. In many states, applicants for food stamps even have to undergo fingerprinting and drug testing, and they are constantly checked up on to ensure they are as poor as they say they are. The data that’s gathered also often ends up feeding back into police systems, further perpetuating the cycle of surveillance and limiting recipients’ opportunities. This data is also handled with less care. Gilman notes that welfare programs collect massive amounts of data, which are often stored in potentially insecure databases for unknown lengths of time, with no specified permission controls or criteria for caseworker access, leaving the records extremely vulnerable to rogue actors.
This higher level of surveillance for the poor engenders an atmosphere of distrust between the lower classes and the authorities and reinforces stereotypes, further marginalizing poorer communities. The negative impacts are, perhaps, most profound in the way law enforcement uses data collection to target the poor. Data often reinforces, and helps to justify, existing prejudices among police officers that the poor are more likely to commit crimes. O’Neil’s prime example is recidivism models, which judges across the country use when sentencing. She notes that, ‘People are being labeled high risk by the models because they live in poor neighborhoods and therefore, they’re being sentenced longer. That exacerbates that cycle. People are like, “Damn, there are some racist practices going on.” What they don’t understand is that that’s never going to change because policemen do not want to examine their own practices. What they want, in fact, is to get the scientific objectivity to cover up any kind of acts for condemning their practices.’
In a National Review article, David French argued that math is not racist and that O’Neil is just a ‘social justice warrior’ with funny ideas about what is ‘fair’. However, her ideas are reinforced by Mathematics PhD Jeremy, who found that machine learning algorithms applied to crime data can produce racially skewed outcomes. He argues that data mining looks for patterns in data, so if ‘race is disproportionately (but not explicitly) represented in the data fed to a data-mining algorithm, the algorithm can infer race and use race indirectly to make an ultimate decision.’ The defining factor of crime is poverty, and poverty still disproportionately affects black people. Therefore, not only is the data helping to keep people in a cycle of crime, and so perpetuating poverty, it is doing so along racial lines. This is particularly incendiary at a time when tensions between minorities and authorities are at an all-time high.
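The proxy effect described above can be illustrated with a small sketch. Everything in it is hypothetical: the records are synthetic, the ‘neighborhood’, ‘group’ and ‘flagged’ fields are invented for illustration, and the per-neighborhood rate scorer is a deliberately crude stand-in for a real risk model, not any system O’Neil or others describe. The scorer never sees the group label, yet its scores differ by group because neighborhood carries that information.

```python
from collections import defaultdict

# Hypothetical synthetic records: (neighborhood, group, flagged).
# "group" stands in for a protected attribute such as race.
# Within each neighborhood, both groups are flagged at the same rate.
RECORDS = (
    [("north", "A", 1)] * 6 + [("north", "A", 0)] * 2
    + [("north", "B", 1)] * 3 + [("north", "B", 0)] * 1
    + [("south", "A", 1)] * 1 + [("south", "A", 0)] * 3
    + [("south", "B", 1)] * 2 + [("south", "B", 0)] * 6
)


def fit_neighborhood_rates(records):
    """'Train' a toy risk model: the historical flag rate per neighborhood.

    The group label is ignored entirely, so the model is nominally group-blind.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for neighborhood, _group, flag in records:
        totals[neighborhood] += 1
        flagged[neighborhood] += flag
    return {n: flagged[n] / totals[n] for n in totals}


def mean_score_by_group(records, rates):
    """Average risk score each group receives from the group-blind model."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for neighborhood, group, _flag in records:
        sums[group] += rates[neighborhood]
        counts[group] += 1
    return {g: sums[g] / counts[g] for g in counts}


def proxy_accuracy(records):
    """Share of records whose group is guessed correctly from neighborhood
    alone, by always guessing the majority group of that neighborhood."""
    counts = defaultdict(lambda: defaultdict(int))
    for neighborhood, group, _flag in records:
        counts[neighborhood][group] += 1
    correct = sum(max(by_group.values()) for by_group in counts.values())
    return correct / len(records)


if __name__ == "__main__":
    rates = fit_neighborhood_rates(RECORDS)
    print("score per neighborhood:", rates)
    print("mean score per group:", mean_score_by_group(RECORDS, rates))
    print("group guessed from neighborhood:", proxy_accuracy(RECORDS))
```

In this toy data the two groups are flagged at identical rates within each neighborhood, so the gap in average scores comes entirely from where people live. Removing the protected attribute from the inputs does not remove it from the outcome.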
O’Neil acknowledges that data is not inherently bad, but argues that individuals and society misuse it to draw so-called ‘natural conclusions.’ She says, ‘If we hand over our decision-making processes to computers that use historical data, it will just repeat history. And that simply is not okay.’ However, it is not just a case of misuse; we also collect too much. The way we both collect data about the poor and use it must be entirely re-examined. The digital age has brought with it many benefits, and its tools have been used to streamline many aspects of the welfare system. Racism and class politics have long been built into the assumptions and prejudices of people in power, and it was hoped that big data would eradicate these. However, it seems that it has simply helped to build on historical biases. Data should be used as a tool to liberate the disenfranchised. To do this, ethical considerations need to be built into the processes around data analysis, and we need to place more emphasis on the privacy rights of the poor, or at least as much as we place on those of the rich. Otherwise, the machines that learn from the past are doomed to repeat it.