Data and Racism in Machine Learning?

We often hear stories these days about racism in machine learning algorithms, but these stories often miss the underlying subtlety. I've been reading about this recently and found this quote very telling:
A wave of scholarship, triggered by the ProPublica report, illuminated the statistical challenge at the heart of the argument: Given that the underlying “base rate” of rearrest is higher for blacks than for whites, it is mathematically inevitable that the burden of false positives will fall more heavily on black defendants than on white ones. In other words, given that more black defendants than white defendants actually do have a high risk of reoffending, a “high risk” label that is correct 70% of the time for both white and black defendants will still mis-label more black than white defendants as high risk. A study titled “Inherent Tradeoffs in the Fair Determination of Risk Scores” proved mathematically that when rearrest rates are not equal between races, a well-calibrated tool like Northpointe’s – that is, a tool that is mistaken equally often about whites and about blacks – will inevitably have more false positives for blacks. As the authors of a second study explained, to equalize the error rates, one would have to make the tool itself race conscious, and “set multiple, race-specific [risk] thresholds.” 
In short, an intuitive understanding of equal protection cannot square with the mathematics of predictive risk scoring. Under real conditions, a tool that is equally often mistaken about white and black defendants will more often send blacks to jail by mistake than send whites to jail by mistake. But an explicitly race-conscious risk assessment tool, that predicted scores differently for whites than for blacks, would itself face serious constitutional challenges. An understanding of “equal protection” that would require race-blindness, and simultaneously require that races are burdened equally by prediction errors, simply does not leave room for risk assessment tools to operate.
The quote is from "The Challenges of Prediction: Lessons from Criminal Justice."
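The arithmetic behind the quote is worth seeing concretely. The sketch below is a minimal, hypothetical illustration (the base rates of 0.5 and 0.3, the true positive rate of 0.8, and the 70% precision figure are all illustrative assumptions, not figures from the ProPublica analysis): if a "high risk" label is correct equally often for both groups, the group with the higher underlying rearrest rate necessarily ends up with a higher false positive rate.

```python
def false_positive_rate(base_rate, tpr=0.8, ppv=0.7):
    """False positive rate implied by a tool that is equally 'correct'
    (same precision, ppv) and equally sensitive (same tpr) for every group.

    Per unit of population:
      true positives  = tpr * base_rate
      total labeled   = true positives / ppv      (since ppv = TP / labeled)
      false positives = total labeled - true positives
      FPR             = false positives / (1 - base_rate)
    """
    tp = tpr * base_rate
    fp = tp * (1 - ppv) / ppv
    return fp / (1 - base_rate)

# Hypothetical groups with different base rates of rearrest:
fpr_high_base_rate = false_positive_rate(0.5)  # group with more reoffenders
fpr_low_base_rate = false_positive_rate(0.3)   # group with fewer reoffenders

# Same precision, same sensitivity -- yet the higher-base-rate group
# bears a heavier burden of false positives.
print(fpr_high_base_rate, fpr_low_base_rate)
```

Running this shows the higher-base-rate group's false positive rate is more than double the other group's, even though the tool is "mistaken equally often" about both in the calibration sense. This is the tradeoff the "Inherent Tradeoffs" paper proves is unavoidable whenever base rates differ.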
