Why facial recognition errors are not evenly shared

Facial recognition studies show how high average accuracy can hide larger false-match risks for some demographic groups.

Introduction

Facial recognition became one of the most widely discussed examples of AI bias because it exposed a crucial lesson: a system can appear highly accurate overall while still making substantially more mistakes for some groups of people than for others. Early evaluations often reported a single accuracy figure, but later research showed that performance could vary across race, sex, age, and combinations of those characteristics. In practical terms, the risks of being wrongly identified, wrongly excluded, or subjected to additional scrutiny may not be shared equally across a population. Studies by academic researchers, independent auditors, and the US National Institute of Standards and Technology (NIST) helped turn facial recognition into a defining case study of how biased data and uneven learned patterns can emerge in AI systems. [NIST]

What demographic testing revealed

For many years, facial recognition systems were primarily evaluated using overall accuracy scores. Those averages often hid important differences between demographic groups. When researchers began testing systems separately by race, sex, and age, a more complex picture emerged.

One influential example was the 2018 Gender Shades study by Joy Buolamwini and Timnit Gebru. The researchers evaluated commercial face-analysis systems that classify gender from a face image and found dramatic differences in error rates across groups. Lighter-skinned men were classified most accurately, with error rates below 1% in some systems, while darker-skinned women experienced error rates as high as 34.7%. The study became influential because it demonstrated that performance gaps could be extremely large even when vendors advertised high overall accuracy. [Proceedings of Machine Learning Research]

The findings were reinforced by larger evaluations. In 2019, NIST tested 189 facial recognition algorithms from 99 developers and reported that the majority exhibited what it called “demographic differentials”: measurable differences in performance across demographic groups. The results varied by algorithm, but many systems showed substantially higher error rates for certain populations. False-positive rates were often elevated for people of African and East Asian ancestry compared with some European-origin groups, while women, children, and older adults frequently experienced different error patterns than middle-aged men. [NIST, NIST Publications, CSIS]

Importantly, these disparities were not identical across all systems. Some algorithms performed far better than others, demonstrating that uneven outcomes were not an unavoidable property of facial recognition itself. Algorithm design, training data, and evaluation practices all influenced the results. [ASIS International, Security Industry Association]

Why average accuracy can be misleading

A common misunderstanding is that a facial recognition system with 99% accuracy must work equally well for everyone. In reality, averages can conceal large differences between groups.

Imagine a system used on millions of people. If the overall error rate is low but one demographic group experiences ten times more false matches than another, the burden of mistakes becomes concentrated rather than evenly distributed. NIST’s demographic analysis found that false-positive rates in some algorithms differed by factors ranging from tenfold to more than one hundredfold across demographic groups. [PMC]

This is why researchers increasingly insist on reporting disaggregated results rather than a single headline accuracy figure. Looking only at overall performance can make a system appear fair even when particular groups face substantially higher risks. The lesson extends beyond facial recognition and applies broadly across AI systems: averages do not automatically reveal who bears the cost of mistakes. [ResearchGate]
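
The arithmetic behind this point can be sketched in a few lines of Python. The group names and counts below are invented purely for illustration and are not taken from NIST or any other study; the sketch only shows how a low overall false-match rate can sit alongside a tenfold gap between groups.

```python
# Hypothetical counts, invented purely to illustrate the arithmetic.
GROUPS = {
    "group_a": {"impostor_comparisons": 900_000, "false_matches": 90},
    "group_b": {"impostor_comparisons": 100_000, "false_matches": 100},
}


def summarize(groups):
    """Return overall and per-group false-match rates plus two gap measures."""
    total_comparisons = sum(g["impostor_comparisons"] for g in groups.values())
    total_false_matches = sum(g["false_matches"] for g in groups.values())

    per_group_rate = {
        name: g["false_matches"] / g["impostor_comparisons"]
        for name, g in groups.items()
    }
    lowest = min(per_group_rate.values())
    highest = max(per_group_rate.values())

    return {
        "overall_rate": total_false_matches / total_comparisons,
        "per_group_rate": per_group_rate,
        # How many times higher the worst group's rate is than the best group's.
        "rate_ratio": highest / lowest if lowest > 0 else float("inf"),
        # Share of all false matches that falls on each group.
        "share_of_errors": {
            name: g["false_matches"] / total_false_matches
            for name, g in groups.items()
        },
    }


if __name__ == "__main__":
    report = summarize(GROUPS)
    print(f"overall false-match rate: {report['overall_rate']:.4%}")
    for name, rate in report["per_group_rate"].items():
        share = report["share_of_errors"][name]
        print(f"{name}: rate {rate:.4%}, share of all false matches {share:.1%}")
    print(f"worst-to-best rate ratio: {report['rate_ratio']:.1f}x")
```

With these made-up counts the overall false-match rate is about 0.019%, yet the second group's rate is ten times the first group's, and that group absorbs a little over half of all false matches while accounting for only a tenth of the comparisons. A single headline figure would show none of this.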

False matches and false non-matches as real harms

Understanding facial recognition bias requires distinguishing between two major categories of error.

False matches

A false match occurs when a system incorrectly decides that two images belong to the same person. In identification settings, this can cause an innocent individual to be linked to someone else.

Researchers and regulators pay particular attention to false matches because they can have serious consequences in policing, border control, security screening, and other identity-sensitive applications. NIST’s demographic testing found that many algorithms produced higher false-positive rates for some racial and ethnic groups than for others. In operational settings, this means members of certain groups may face a greater chance of being incorrectly flagged. [NIST Publications]

False non-matches

A false non-match occurs when a system fails to recognise that two images belong to the same person. This error can prevent legitimate access to services, devices, or secure locations.

Although false non-matches often receive less public attention than false matches, they can create unequal burdens. A traveller may be delayed at an automated border gate, or a user may repeatedly fail an identity verification process. If these failures occur disproportionately for particular demographic groups, the convenience promised by automation becomes unevenly distributed. [NIST]

The key point is that different applications make different errors more important. A phone-unlocking system and a police search system may both use facial recognition, but the social consequences of their mistakes are very different.
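
As a rough sketch of how the two error types are measured, the Python below computes a false-match rate and a false non-match rate at a chosen decision threshold, both overall and per group. The `Comparison` record, the similarity scores, and the group labels "a" and "b" are all hypothetical; real evaluations use far larger test sets and their own conventions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    score: float        # similarity score from a hypothetical matcher, 0 to 1
    same_person: bool   # ground truth: do both images show the same person?
    group: str          # hypothetical demographic label


def error_rates(comparisons, threshold):
    """Return (false_match_rate, false_non_match_rate) at a decision threshold."""
    impostor = [c for c in comparisons if not c.same_person]
    genuine = [c for c in comparisons if c.same_person]
    # False match: different people, but the score clears the threshold.
    false_matches = sum(c.score >= threshold for c in impostor)
    # False non-match: same person, but the score falls below the threshold.
    false_non_matches = sum(c.score < threshold for c in genuine)
    fmr = false_matches / len(impostor) if impostor else None
    fnmr = false_non_matches / len(genuine) if genuine else None
    return fmr, fnmr


def error_rates_by_group(comparisons, threshold):
    """Compute the same two rates separately for each group label."""
    groups = sorted({c.group for c in comparisons})
    return {
        g: error_rates([c for c in comparisons if c.group == g], threshold)
        for g in groups
    }


# Tiny invented sample: two genuine and two impostor comparisons per group.
SAMPLE = [
    Comparison(0.90, True, "a"), Comparison(0.80, True, "a"),
    Comparison(0.20, False, "a"), Comparison(0.30, False, "a"),
    Comparison(0.90, True, "b"), Comparison(0.55, True, "b"),
    Comparison(0.20, False, "b"), Comparison(0.70, False, "b"),
]

if __name__ == "__main__":
    for threshold in (0.60, 0.75):
        print(threshold, error_rates(SAMPLE, threshold),
              error_rates_by_group(SAMPLE, threshold))
```

In this toy sample, raising the threshold from 0.60 to 0.75 removes the false match in group "b" but leaves its false non-match in place. That trade-off is one reason the choice of threshold depends on which error matters more in a given application.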

Why representation in image data matters

One major explanation for unequal error rates involves the data used to train and test AI systems.

Machine-learning models learn patterns from examples. If certain groups appear less frequently in training datasets, the model may have fewer opportunities to learn reliable representations of those faces. The Gender Shades researchers found that two prominent face benchmark datasets, IJB-A and Adience, were composed overwhelmingly of lighter-skinned subjects (79.6% and 86.2% respectively), raising concerns about how well systems would generalise to more diverse populations. [Proceedings of Machine Learning Research]

Representation affects more than simple counts. Differences in image quality, lighting conditions, camera equipment, pose, age distribution, and collection practices can all influence model performance. Researchers have also shown that image quality and demographic composition interact in complex ways, meaning that performance gaps cannot always be explained by a single factor. [arXiv]

As a result, improving fairness is not simply a matter of adding more images. Developers increasingly focus on collecting more representative datasets, testing systems across multiple demographic categories, and measuring performance separately for different groups before deployment. [ResearchGate]
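
A first-pass representation check can be sketched as follows. It only counts labels, so it captures the simplest part of the problem described above and says nothing about image quality, lighting, or pose. The labels "lighter" and "darker", the 80/20 split, the reference shares, and the `min_ratio` cut-off are all assumptions made for illustration.

```python
from collections import Counter


def group_shares(labels):
    """Return each group's share of the dataset as a fraction of all labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {group: n / total for group, n in counts.items()}


def underrepresented(labels, reference_shares, min_ratio=0.5):
    """List groups whose dataset share is below min_ratio times a reference share.

    reference_shares is a hypothetical target, for example the expected share
    of each group among the people the system will be used on.
    """
    shares = group_shares(labels)
    return sorted(
        group
        for group, expected in reference_shares.items()
        if shares.get(group, 0.0) < min_ratio * expected
    )


# Invented example: a dataset that is 80% one group and 20% another.
TRAINING_LABELS = ["lighter"] * 80 + ["darker"] * 20
REFERENCE = {"lighter": 0.5, "darker": 0.5}

if __name__ == "__main__":
    print(group_shares(TRAINING_LABELS))
    print(underrepresented(TRAINING_LABELS, REFERENCE))
```

A check like this can flag an obvious imbalance early, but passing it does not show that a system performs equally well across groups. That still has to be measured directly on disaggregated test results.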

Why this case became a landmark example of AI bias

Facial recognition attracted unusual attention because the evidence was measurable and concrete. Researchers could compare error rates across groups, identify disparities, and independently test commercial systems. The findings transformed public discussions about AI fairness from abstract concerns into observable performance differences. [MIT News]

The debate also demonstrated that bias is not always visible in headline performance numbers. A system can perform well overall while imposing greater risks on particular populations. That insight has influenced how researchers evaluate many other forms of AI, encouraging demographic testing, subgroup analysis, and fairness audits as standard parts of system assessment. Facial recognition therefore became more than a controversy about one technology; it became a widely cited example of how biased data and uneven learned patterns can produce unequal outcomes in real-world AI systems. [NIST, PMC]

Endnotes

  1. Source: nist.gov
    Title: study evaluates effects race age sex face recognition software
    Link: https://www.nist.gov/news-events/news/2019/12/nist-study-evaluates-effects-race-age-sex-face-recognition-software
    Source snippet

    NIST Study Evaluates Effects of Race, Age, Sex on Face... · 19 Dec 2019 · A new NIST study examines how accurately face recognition sof...

  2. Source: news.mit.edu
    Title: study finds gender skin type bias artificial intelligence systems
    Link: https://news.mit.edu/2018/study-finds-gender-skin-type-bias-artificial-intelligence-systems-0212
    Source snippet

    MIT News · Study finds gender and skin-type bias in commercial... · 11 Feb 2018 · For darker-skinned women, those assigned scores of IV, V, o...

  3. Source: nvlpubs.nist.gov
    Link: https://nvlpubs.nist.gov/nistpubs/ir/2019/nist.ir.8280.pdf
    Source snippet

    NIST Publications · Face Recognition Vendor Test (FRVT), Part 3: Demographic... · by P Grother · 2019 · False positives: Using t...

  4. Source: csis.org
    Title: problem bias facial recognition
    Link: https://www.csis.org/blogs/strategic-technologies-blog/problem-bias-facial-recognition
    Source snippet

    The Problem of Bias in Facial Recognition · 1 May 2020 · NIST found that Asians, African Americans, and American Indians generally had highe...

    Published: May 2020

  5. Source: pmc.ncbi.nlm.nih.gov
    Title: Beating the bias in facial recognition technology
    Link: https://pmc.ncbi.nlm.nih.gov/articles/PMC7575263/
    Source snippet

    NIH · by J Lunter · 2020 · NIST recently ran a large-scale test focused on identifying bias in FRT, with a particular em...

  6. Source: researchgate.net
    Link: https://www.researchgate.net/publication/323722163_Gender_shades_intersectional_phenotypic_and_demographic_evaluation_of_face_datasets_and_gender_classifiers
    Source snippet

    Gender shades: intersectional phenotypic and... · For example, Buolamwini (2017) found that facial recognition technology is m...

  7. Source: arxiv.org
    Link: https://arxiv.org/abs/1904.07325
    Source snippet

    Characterizing the Variability in Face Recognition Accuracy Relative to Race

    Published: April 15, 2019

  8. Source: researchgate.net
    Link: https://www.researchgate.net/publication/224238108_Demographic_effects_on_estimates_of_automatic_face_recognition_performance
    Source snippet

    Demographic effects on estimates of automatic face... · Specifically, these studies suggested that face recognition is less acc...

  9. Source: nist.gov
    Link: https://www.nist.gov/
    Source snippet

    National Institute of Standards and Technology · NIST promotes U.S. innovation and industrial competitiveness by advancing measurement scien...

  10. Source: pages.nist.gov
    Title: frvt demographics
    Link: https://pages.nist.gov/frvt/html/frvt_demographics.html
    Source snippet

    False positives can in principle occur...

  11. Source: arxiv.org
    Link: https://arxiv.org/html/2502.02309v1
    Source snippet

    Review of Demographic Bias in Face Recognition · 4 Feb 2025 · The Face Recognition Vendor Test (FRVT) conducted by NIST [15] substantiated t...

  12. Source: researchgate.net
    Title: Review of Demographic Bias in Face Recognition
    Link: https://www.researchgate.net/publication/388685657_Review_of_Demographic_Bias_in_Face_Recognition
    Source snippet

    Review of Demographic Bias in Face Recognition · 4 Feb 2025 · Demographic bias in face recognition (FR) has emerged as a critical area...

  13. Source: proceedings.mlr.press
    Link: https://proceedings.mlr.press/v81/buolamwini18a.html
    Source snippet

    Proceedings of Machine Learning Research · Gender Shades: Intersectional Accuracy Disparities in... · by J Buolamwini · 2018

  14. Source: proceedings.mlr.press
    Link: https://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a.pdf
    Source snippet

    Proceedings of Machine Learning Research · Gender Shades: Intersectional Accuracy Disparities in... · by J Buolamwini · 2018 · Darker females have the highest error rates for all gender...

  15. Source: asisonline.org
    Title: facial recognition error rates vary by demographic
    Link: https://www.asisonline.org/security-management-magazine/articles/2020/05/facial-recognition-error-rates-vary-by-demographic/
    Source snippet

    ASIS International · Facial Recognition Error Rates Vary by Demographic · 1 May 2020 · In the NIST study, not all algorithms gave these high ra...

    Published: May 2020

  16. Source: securityindustry.org
    Title: what nist data shows about facial recognition and demographics
    Link: https://www.securityindustry.org/report/what-nist-data-shows-about-facial-recognition-and-demographics/
    Source snippet

    Security Industry Association · What NIST Data Shows About Facial Recognition and... · 6 Feb 2020 · The report specifically identifies six sup...

  17. Source: Wikipedia
    Title: National Institute of Standards and Technology
    Link: https://en.wikipedia.org/wiki/National_Institute_of_Standards_and_Technology
    Source snippet

    The National Institute of Standards and Technology (NIST) is an agency of the United Sta...

  18. Source: pmc.ncbi.nlm.nih.gov
    Link: https://pmc.ncbi.nlm.nih.gov/articles/PMC7879975/
    Source snippet

    This is the only study...
