When equal risk scores hide unequal illness

Introduction

One of the most important lessons in understanding artificial intelligence is that equal scores do not necessarily mean equal outcomes. In the widely discussed healthcare risk-scoring case, researchers discovered that Black patients and White patients who received the same algorithmic risk score often had very different levels of illness. The score appeared neutral, but patients with identical ratings were not equally sick: Black patients typically carried a substantially greater disease burden than White patients at the same predicted risk level. The finding showed how an AI system can seem accurate overall while still producing unequal consequences for different groups. [Chicago Booth]

What equal scores revealed about sickness

The most striking evidence emerged when researchers stopped looking at overall prediction accuracy and instead compared patients who received the same risk score.

The healthcare algorithm was designed to identify patients who might benefit from additional care-management services. When researchers examined people with identical scores, they found that Black patients consistently had more chronic illness, more severe health conditions, and more signs of uncontrolled disease than White patients assigned the same level of risk. [Data 6]

This observation mattered because a risk score is supposed to summarise underlying need. If two patients receive the same score, decision-makers generally assume they have roughly similar health risks. The study showed that this assumption was false: the algorithm’s rankings systematically understated the health needs of many Black patients. [PubMed]

Researchers measured illness burden in several ways, including the number of chronic conditions and broader indicators of health status. Across these measures, the pattern remained consistent: equal algorithmic scores did not correspond to equal levels of sickness. Black patients had to be substantially sicker before receiving the same risk rating as White patients. [Data 6]
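The equal-score comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researchers' code: the records, the group labels "A" and "B", and the function name illness_by_score_bin are all hypothetical, and the numbers are invented. The idea is simply to bucket patients by risk score and compare the average number of chronic conditions per group inside each bucket.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (group, risk_score in [0, 1), chronic_conditions).
# Invented values for illustration only; this is not the study's data.
PATIENTS = [
    ("A", 0.82, 5), ("A", 0.85, 6), ("B", 0.81, 3), ("B", 0.88, 4),
    ("A", 0.41, 3), ("A", 0.47, 2), ("B", 0.43, 1), ("B", 0.49, 2),
]


def illness_by_score_bin(patients, n_bins=10):
    """Return {bin_index: {group: mean chronic conditions}}."""
    buckets = defaultdict(lambda: defaultdict(list))
    for group, score, conditions in patients:
        # Map the score to a bin; clamp so a score of exactly 1.0 stays in range.
        bin_index = min(int(score * n_bins), n_bins - 1)
        buckets[bin_index][group].append(conditions)
    return {
        b: {g: mean(vals) for g, vals in groups.items()}
        for b, groups in sorted(buckets.items())
    }


result = illness_by_score_bin(PATIENTS)
for bin_index, groups in result.items():
    print(bin_index, groups)
```

If the score summarised need equally well for both groups, the per-group averages inside each bin would be close. In this invented sample, group A is sicker than group B in every bin, which is the shape of the disparity the study reported.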

This finding is especially important because the disparity was not obvious from conventional performance statistics. The model could appear successful when evaluated against its training target while still producing unequal rankings when used to guide care decisions. [Chicago Booth]

How under-ranking affected programme eligibility

The unequal scoring translated directly into access to support programmes.

Many health systems use risk thresholds to determine who receives additional monitoring, preventive interventions, case management, or specialised care coordination. Patients above a certain score are enrolled; those below it are not. Because Black patients were often assigned lower scores than their actual illness burden justified, they were less likely to qualify for these programmes. [Chicago Booth]

The researchers estimated that if the bias were removed, the share of Black patients identified for extra care would rise dramatically, from 17.7% to 46.5%. This was not a small statistical adjustment. It represented a major change in who would receive healthcare resources and support. [Chicago Booth]

The practical consequence was that two patients with similar medical needs could be treated differently by the allocation system. A White patient might cross the eligibility threshold while a Black patient with comparable or greater illness remained below it. The algorithm therefore influenced not only rankings on a screen but also real-world access to healthcare services. [Chicago Booth]
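To make the threshold effect concrete, here is a small sketch that enrols patients by a score cut-off and then, for comparison, fills the same number of places by illness burden instead. It is only loosely analogous to the counterfactual the researchers ran and should not be read as their method. The patient records, the groups "A" and "B", the 0.55 threshold, and the helper names are all hypothetical.

```python
# Hypothetical patients: (patient_id, group, cost_based_score, chronic_conditions).
# Invented values for illustration only.
PATIENTS = [
    ("p1", "A", 0.62, 6), ("p2", "A", 0.48, 5), ("p3", "A", 0.30, 2),
    ("p4", "B", 0.70, 4), ("p5", "B", 0.58, 3), ("p6", "B", 0.20, 1),
]


def enrolled_by_threshold(patients, threshold):
    """Patients whose score meets or exceeds the programme threshold."""
    return [p for p in patients if p[2] >= threshold]


def top_k_by_need(patients, k):
    """The k patients with the most chronic conditions."""
    return sorted(patients, key=lambda p: p[3], reverse=True)[:k]


def share_of_group(selected, group):
    """Fraction of the selected patients who belong to `group`."""
    if not selected:
        return 0.0
    return sum(1 for p in selected if p[1] == group) / len(selected)


by_score = enrolled_by_threshold(PATIENTS, threshold=0.55)
by_need = top_k_by_need(PATIENTS, k=len(by_score))

print("share of group A, enrolled by score:", share_of_group(by_score, "A"))
print("share of group A, enrolled by need: ", share_of_group(by_need, "A"))
```

In this toy sample the number of programme places is the same under both rules, but the under-scored group's share of them doubles when selection follows illness burden rather than the score.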

The pattern reflected a broader structural problem. Historical healthcare spending on Black patients was often lower than spending on White patients with similar health conditions. Because the algorithm learned from spending data, using predicted cost as a stand-in for health need, it inherited those patterns and reproduced them in its predictions. [Chicago Booth, PubMed]
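A toy calculation shows why a cost label can mis-rank patients even when cost is predicted perfectly. The sketch below assumes, purely for illustration, that spending equals need multiplied by an "access" factor; that formula, the factor values, and the two patients are hypothetical and are not taken from the study.

```python
# Assumed toy model: spending scales with need, reduced by an access factor.
# The formula and every number here are illustrative assumptions.
def expected_cost(need, access):
    return 1000.0 * need * access


patients = [
    {"id": "a1", "group": "A", "need": 5, "access": 0.7},  # sicker, lower spending
    {"id": "b1", "group": "B", "need": 4, "access": 1.0},  # less sick, higher spending
]

costs = {p["id"]: expected_cost(p["need"], p["access"]) for p in patients}

# A model that predicts cost perfectly ranks patients by these costs.
rank_by_cost = sorted(patients, key=lambda p: costs[p["id"]], reverse=True)
# A ranking by the quantity decision-makers care about orders them by need.
rank_by_need = sorted(patients, key=lambda p: p["need"], reverse=True)

print("ranked by cost:", [p["id"] for p in rank_by_cost])
print("ranked by need:", [p["id"] for p in rank_by_need])
```

The cost ranking puts the less sick patient first because less was spent on the sicker one. The prediction is not wrong about cost; the label is a poor proxy for need.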

Why group-level checks matter before deployment

The equal-score finding became influential because it demonstrated a limitation of standard AI evaluation.

A model can perform well on average while still treating groups differently. If evaluators only examine aggregate accuracy, they may miss important disparities hidden inside the rankings. The healthcare case showed that an algorithm can correctly predict its chosen target yet still fail at the task decision-makers actually care about. [Chicago Booth]

Group-level analysis helped expose the problem. Instead of asking whether the model predicted future costs accurately, researchers asked a different question: when Black and White patients receive the same score, do they have similar levels of illness? The answer was clearly no. That comparison revealed a disparity invisible in many conventional performance reports. [Data 6]
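One way to turn that question into a routine pre-deployment check is a small audit function that flags score bins where two groups differ in an independent health outcome by more than a chosen tolerance. The following is a sketch under assumptions, not an established standard: the function name audit_equal_score_gap, the tolerance of 0.5, the bin count, and the sample records are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def audit_equal_score_gap(records, reference, comparison, n_bins=10, tolerance=0.5):
    """Flag score bins where the mean outcome differs between two groups.

    records: iterable of (group, score in [0, 1], outcome) tuples.
    Returns a list of (bin_index, gap), where gap is the comparison group's
    mean outcome minus the reference group's, for bins with |gap| > tolerance.
    Bins that do not contain both groups are skipped.
    """
    by_bin = defaultdict(lambda: defaultdict(list))
    for group, score, outcome in records:
        bin_index = min(int(score * n_bins), n_bins - 1)
        by_bin[bin_index][group].append(outcome)

    flagged = []
    for bin_index in sorted(by_bin):
        groups = by_bin[bin_index]
        if reference in groups and comparison in groups:
            gap = mean(groups[comparison]) - mean(groups[reference])
            if abs(gap) > tolerance:
                flagged.append((bin_index, gap))
    return flagged


# Invented records: (group, score, chronic_conditions).
RECORDS = [
    ("A", 0.91, 6), ("B", 0.93, 4),  # same bin, large gap
    ("A", 0.52, 3), ("B", 0.55, 3),  # same bin, no gap
    ("A", 0.12, 1),                  # only one group in this bin
]

flagged = audit_equal_score_gap(RECORDS, reference="B", comparison="A")
print(flagged)
```

An empty result would not prove the score is fair, since it depends on the outcome measure, the bin width, and the tolerance chosen. A non-empty result is a signal to investigate before the score is used to allocate anything.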

The lesson extends beyond healthcare. Whenever an AI system ranks people for opportunities, services, or interventions, developers must examine how scores relate to real outcomes across different populations. Equal numerical outputs do not automatically imply equal treatment. A score may encode hidden assumptions inherited from the data used to train the model. [Chicago Booth]

For students of artificial intelligence, this case is a reminder that fairness questions often emerge not from the algorithm’s mathematics alone but from the relationship between the score and the reality it is meant to represent. The discovery that Black patients were sicker at the same risk score became one of the clearest demonstrations that AI systems should be evaluated for how their rankings affect different groups in practice, not only for predictive accuracy. [Chicago Booth]

Endnotes

Source: chicagobooth.edu
Title: Dissecting racial bias in an algorithm used to manage the health of populations - Tolan Center | Chicago Booth
Link: https://www.chicagobooth.edu/research/tolan/research/2019/dissecting-racial-bias-in-an-algorithm-used-to-manage-the-health-of-populations

Source: pubmed.ncbi.nlm.nih.gov
Title: Dissecting racial bias in an algorithm used to manage the health of populations - PubMed
Link: https://pubmed.ncbi.nlm.nih.gov/31649194/

Source: data6.org
Link: https://data6.org/su22/assignments/DissectingRacialBias.pdf
Published: June 24, 2021

Source: ouci.dntb.gov.ua
Title: Dissecting racial bias in an algorithm used to manage the health of populations
Link: https://ouci.dntb.gov.ua/en/works/7P2G12Gl/
