Data and Racism in Machine Learning?

We often hear stories these days about racism in machine learning algorithms, but the subtlety behind them is often missing. I've been reading about this recently and found this quote very telling:
A wave of scholarship, triggered by the ProPublica report, illuminated the statistical challenge at the heart of the argument: Given that the underlying “base rate” of rearrest is higher for blacks than for whites, it is mathematically inevitable that the burden of false positives will fall more heavily on black defendants than on white ones. In other words, given that more black defendants than white defendants actually do have a high risk of reoffending, a “high risk” label that is correct 70% of the time for both white and black defendants will still mis-label more black than white defendants as high risk. A study titled “Inherent Tradeoffs in the Fair Determination of Risk Scores” proved mathematically that when rearrest rates are not equal between races, a well-calibrated tool like Northpointe’s – that is, a tool that is mistaken equally often about whites and about blacks – will inevitably have more false positives for blacks. As the authors of a second study explained, to equalize the error rates, one would have to make the tool itself race conscious, and “set multiple, race-specific [risk] thresholds.” 
In short, an intuitive understanding of equal protection cannot square with the mathematics of predictive risk scoring. Under real conditions, a tool that is equally often mistaken about white and black defendants will more often send blacks to jail by mistake than send whites to jail by mistake. But an explicitly race-conscious risk assessment tool, that predicted scores differently for whites than for blacks, would itself face serious constitutional challenges. An understanding of “equal protection” that would require race-blindness, and simultaneously require that races are burdened equally by prediction errors, simply does not leave room for risk assessment tools to operate.
The quote is from "The Challenges of Prediction: Lessons from Criminal Justice."
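The tradeoff the quote describes can be made concrete with a small calculation. Chouldechova's identity relates a classifier's false positive rate (FPR) to the group base rate p, its precision (PPV), and its false negative rate (FNR): FPR = p/(1-p) · (1-PPV)/PPV · (1-FNR). The sketch below (my illustration, not from the article; the base rates are hypothetical placeholders) shows that if a tool is equally well calibrated for both groups (same PPV) and misses the same fraction of truly high-risk people (same FNR), the group with the higher base rate of rearrest necessarily has the higher false positive rate:

```python
def false_positive_rate(base_rate, ppv, fnr):
    """FPR implied by Chouldechova's identity:
    FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR)."""
    return (base_rate / (1 - base_rate)) * ((1 - ppv) / ppv) * (1 - fnr)

# Same calibration (the "70% correct" high-risk label) and same miss
# rate for both groups -- only the base rates differ (hypothetical values).
ppv, fnr = 0.7, 0.3
fpr_high = false_positive_rate(0.51, ppv, fnr)  # higher-base-rate group
fpr_low = false_positive_rate(0.39, ppv, fnr)   # lower-base-rate group

print(round(fpr_high, 3), round(fpr_low, 3))  # → 0.312 0.192
assert fpr_high > fpr_low
```

Because PPV and FNR are held fixed, the only free term is p/(1-p), which grows with the base rate, so the heavier false-positive burden on the higher-base-rate group is forced by arithmetic, not by any flaw in the tool. Equalizing the FPRs would require letting PPV or the decision threshold vary by group, which is exactly the race-conscious design the quote says would face constitutional challenges.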

