What AI labels should tell us before launch

Datasheets and model cards make training choices, intended uses, and performance limits easier to inspect before an AI system is trusted.

On this page

  • What datasheets disclose about datasets
  • What model cards disclose about trained systems
  • Where documentation helps and where it cannot guarantee safety

Introduction

Before an AI system is deployed, one of the most important questions is not how powerful the model is, but whether anyone can clearly explain where its data came from, how it was trained, what it was tested on, and where it is likely to fail. Dataset datasheets and model cards were developed to answer those questions. They are documentation standards designed to make AI systems more transparent, helping organisations decide whether a model is suitable for a particular use before it affects real people. Rather than treating AI as a black box, these documents expose key design choices, assumptions, and limitations that might otherwise remain hidden. [arXiv: Datasheets for Datasets]

For governance and risk management, this matters because many deployment failures arise not from a lack of technical sophistication but from a mismatch between how a system was built and how it is actually used. Documentation cannot guarantee that an AI system is safe or fair, but it can make important risks visible before deployment decisions are made. [NIST: AI Risk Management Framework]

What AI labels should tell us before launch

The idea behind datasheets and model cards is similar to the labels that accompany medicines, electrical equipment, or industrial components. Users need information about intended use, testing conditions, limitations, and known risks before deciding whether a system can be trusted in a particular setting.

The proposal for “Datasheets for Datasets” emerged from concerns that machine-learning datasets were often shared without adequate information about their origins, collection methods, composition, or intended uses. The authors argued that every dataset should be accompanied by structured documentation covering why it was created, how it was assembled, what populations it contains, and where it should or should not be used. [arXiv: Datasheets for Datasets; Microsoft]

A related proposal, “Model Cards for Model Reporting”, focused on trained models rather than datasets. Model cards were designed to describe a model’s intended uses, evaluation procedures, performance characteristics, limitations, and known risks. The goal was to help decision-makers understand not merely that a model works, but under what conditions it works and for whom. [arXiv: Model Cards for Model Reporting; ACM Digital Library]

What datasheets disclose about datasets

Datasets are often treated as raw inputs, yet they encode many of the assumptions that later shape model behaviour. A datasheet aims to expose those assumptions before deployment.

Where the data came from

A useful datasheet explains the motivation for creating a dataset, who collected it, how the information was obtained, and whether individuals consented to its use where relevant. It should also identify funding sources, collection methods, and any significant preprocessing or filtering steps. [arXiv: Datasheets for Datasets; mlr3 Fairness]

This information matters because two datasets may appear similar while reflecting very different populations or collection practices. Without documentation, organisations may unknowingly deploy models trained on data that does not resemble their target environment.

Who and what the dataset represents

Datasheets typically describe dataset composition, including the types of examples included, the size of the dataset, and any known gaps or imbalances. This helps reviewers determine whether important groups, locations, languages, or scenarios are under-represented. [Microsoft: Datasheets for Datasets; MDSD4Health]

For example, a model intended for global deployment may have been trained primarily on data from a small number of countries. A deployment team that sees this information before launch can investigate whether additional testing or retraining is necessary.
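A composition check of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not something prescribed by the datasheet proposal: the country codes, the counts, and the 5% threshold are made up, and `underrepresented_groups` is a hypothetical helper.

```python
from collections import Counter

def underrepresented_groups(labels, min_share=0.05):
    """Return {group: share} for groups whose share of examples is below min_share."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        group: count / total
        for group, count in counts.items()
        if count / total < min_share
    }

# Hypothetical country-of-origin labels for 1,000 training examples.
country_labels = ["US"] * 700 + ["GB"] * 260 + ["IN"] * 30 + ["NG"] * 10

# The 5% threshold is an arbitrary assumption; a real review would justify its own.
flagged = underrepresented_groups(country_labels, min_share=0.05)
for country, share in sorted(flagged.items()):
    print(f"{country}: {share:.1%} of examples")
```

A group with no examples at all never appears in the counts, so a review along these lines would also need an explicit list of the groups it expects to see.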

A dataset may be appropriate for one task and inappropriate for another. Datasheets encourage creators to document recommended uses and known limitations. This helps prevent datasets from drifting beyond their original purpose, so that information collected for one task is not later reused in contexts for which it was never designed. [arXiv: Datasheets for Datasets; Overleaf]

From a governance perspective, this is valuable because it shifts attention from raw accuracy claims to questions of fitness for purpose.
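The kinds of information described in this section can be held in a simple structured record. The sketch below is only an illustration: the seven field names follow the question categories of the "Datasheets for Datasets" paper as commonly summarised (motivation, composition, collection process, preprocessing, uses, distribution, maintenance), while the `Datasheet` class, the `missing_sections` helper, and the example dataset are hypothetical.

```python
from dataclasses import dataclass, fields

@dataclass
class Datasheet:
    """Free-text answers grouped by the kind of question a datasheet asks."""
    motivation: str = ""
    composition: str = ""
    collection_process: str = ""
    preprocessing: str = ""
    uses: str = ""
    distribution: str = ""
    maintenance: str = ""

def missing_sections(sheet):
    """List the sections that have been left blank."""
    return [f.name for f in fields(sheet) if not getattr(sheet, f.name).strip()]

# Hypothetical, partly completed datasheet for an imaginary dataset.
sheet = Datasheet(
    motivation="Built to study customer-support ticket routing.",
    composition="120,000 English-language tickets from 2021 to 2023.",
    collection_process="Exported from an internal helpdesk with customer consent.",
    uses="Ticket routing research; not suitable for evaluating individual staff.",
)

print(missing_sections(sheet))
```

A completeness check like this only shows that something was written in each section. Whether the answers are accurate and sufficient still needs human review.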

What model cards disclose about trained systems

While datasheets focus on training data, model cards focus on the behaviour of the trained system itself.

Intended use and deployment boundaries

A model card should explain what the model was designed to do and, equally importantly, what it was not designed to do. Intended-use statements help organisations avoid deploying systems in settings that differ substantially from the conditions under which they were developed and tested. [arXiv: Model Cards for Model Reporting; IAPP.org]

This may seem straightforward, but many deployment problems stem from using a model outside its validated scope. A model built to assist human reviewers, for example, may be unsuitable as a fully automated decision-maker.
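One way to make intended-use boundaries checkable is to record them as data and compare a proposed deployment against them. The following is a rough sketch under stated assumptions: `ModelCard`, `check_use`, the model name, and the use descriptions are all hypothetical, and exact string matching stands in for what would really be a human judgement.

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    """Minimal, hypothetical model card holding only use boundaries."""
    name: str
    intended_uses: set = field(default_factory=set)
    out_of_scope_uses: set = field(default_factory=set)

def check_use(card, proposed_use):
    """Classify a proposed use against what the model card declares."""
    if proposed_use in card.out_of_scope_uses:
        return "out of scope"
    if proposed_use in card.intended_uses:
        return "intended"
    # Neither endorsed nor excluded by the card: flag for review.
    return "undocumented"

card = ModelCard(
    name="claims-triage-v1",
    intended_uses={"rank claims for human review"},
    out_of_scope_uses={"automatically deny claims"},
)

for use in ["rank claims for human review", "automatically deny claims", "detect fraud rings"]:
    print(f"{use}: {check_use(card, use)}")
```

The third category is the useful one in practice: a use the card neither endorses nor excludes is a prompt for further validation, not an implicit approval.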

How performance was measured

Aggregate accuracy figures can hide important weaknesses. Model cards therefore encourage disclosure of evaluation procedures, benchmark datasets, testing environments, and performance metrics. They also promote reporting across different demographic and contextual groups rather than relying on a single headline score. [arXiv: Model Cards for Model Reporting; ResearchGate]

This information allows deployment teams to ask practical questions: Was the model tested on populations similar to ours? Were edge cases evaluated? Were error rates consistent across groups?
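The gap between a headline score and per-group results is easy to show with a small calculation. The sketch below uses invented evaluation records for two hypothetical groups, A and B; `accuracy_by_group` is an illustrative helper and not part of any model card tooling.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, prediction in records:
        total[group] += 1
        correct[group] += int(truth == prediction)
    return {group: correct[group] / total[group] for group in total}

# Hypothetical evaluation results: 90 examples from group A, 10 from group B.
records = (
    [("A", 1, 1)] * 88 + [("A", 1, 0)] * 2
    + [("B", 1, 1)] * 6 + [("B", 1, 0)] * 4
)

overall = sum(truth == pred for _, truth, pred in records) / len(records)
per_group = accuracy_by_group(records)

print(f"overall: {overall:.2f}")
for group, acc in sorted(per_group.items()):
    print(f"group {group}: {acc:.2f}")
```

With these made-up numbers the overall accuracy is 0.94, while group B, which supplies only a tenth of the test set, is classified correctly 60% of the time. A single headline score would not reveal that.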

Known limitations and failure modes

A well-designed model card documents situations in which performance degrades or uncertainty increases. This may include limitations related to language coverage, environmental conditions, demographic variation, data quality, or adversarial inputs. [Alan Turing Institute: Model Cards; Practical AI Act]

For governance purposes, these disclosures help organisations design safeguards, monitoring procedures, and human oversight mechanisms before deployment rather than after a failure occurs.

Why documentation changes deployment decisions

Documentation is often viewed as an administrative exercise, but its real value lies in supporting decision-making.

Before deployment, reviewers typically need to answer questions such as:

  • Does the training data resemble the environment where the system will be used?
  • Were relevant populations represented during development and testing?
  • Has the model been evaluated under realistic operating conditions?
  • What kinds of errors are expected?
  • Are there contexts in which deployment should be restricted or prohibited?

Datasheets and model cards provide structured evidence that helps answer these questions. They transform deployment reviews from informal trust in a developer’s claims into a more auditable process based on documented information. [arXiv: Datasheets for Datasets]
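As a rough illustration of what "auditable" can mean here, the review questions above can be tracked alongside the documentation that answers them. Everything in this sketch is an assumption made for the example: the wording of `REVIEW_QUESTIONS`, the `open_items` helper, and the evidence strings are hypothetical.

```python
# The review questions from the list above, restated as checklist items.
REVIEW_QUESTIONS = [
    "Training data resembles the deployment environment",
    "Relevant populations were represented in development and testing",
    "Model was evaluated under realistic operating conditions",
    "Expected error types are documented",
    "Restricted or prohibited contexts are documented",
]

def open_items(answers):
    """Return the questions that lack a recorded answer with supporting evidence."""
    unresolved = []
    for question in REVIEW_QUESTIONS:
        entry = answers.get(question)
        if not entry or not entry.get("evidence"):
            unresolved.append(question)
    return unresolved

# Hypothetical review record: evidence points at sections of the documentation.
answers = {
    REVIEW_QUESTIONS[0]: {"evidence": "datasheet, composition section"},
    REVIEW_QUESTIONS[2]: {"evidence": "model card, evaluation section"},
    REVIEW_QUESTIONS[3]: {"evidence": ""},  # answered, but nothing cited
}

for question in open_items(answers):
    print("UNRESOLVED:", question)
```

Requiring a pointer to evidence for every answer is the design choice that matters: a question answered without a citation is treated the same as a question left unanswered.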

Their importance has grown alongside broader AI governance frameworks. The NIST AI Risk Management Framework, for example, emphasises documentation, transparency, measurement, and traceability as part of responsible AI risk management. Documentation helps organisations map risks, evaluate evidence, and justify deployment decisions in a systematic way. [NIST: AI Risk Management Framework; ETO AGORA]

Where documentation helps and where it cannot guarantee safety

Datasheets and model cards improve transparency, but they are not a substitute for rigorous testing, monitoring, or governance.

A well-written model card may reveal that a model performs poorly in certain conditions, but the document itself does not fix the problem. Likewise, a datasheet can disclose sampling biases without eliminating them. Documentation helps organisations recognise risks; it does not automatically mitigate them. [arXiv: Datasheets for Datasets]

There is also a risk of treating documentation as a compliance checklist rather than a meaningful governance tool. Researchers studying AI risk management have warned that documentation practices can become superficial if organisations focus on appearances rather than substantive evaluation and risk reduction. [arXiv: Evolving AI Risk Management: A Maturity Model based on the NIST AI Risk Management Framework]

Another limitation is that documentation depends on truthful and complete reporting. A model card is only as useful as the evidence behind it. For this reason, many governance discussions increasingly emphasise audits, reproducible testing, and traceable records alongside documentation requirements. Emerging assurance approaches seek to supplement descriptive reports with stronger forms of verifiable evidence. [arXiv: AI Bill of Materials and Beyond: Systematizing Security Assurance through the AI Risk Scanning (AIRS) Framework]

Even with these limitations, dataset datasheets and model cards remain among the most practical tools available for understanding an AI system before deployment. They make hidden assumptions visible, clarify intended uses, expose performance limits, and provide a structured basis for deciding whether a system is ready for real-world use. [arXiv: Datasheets for Datasets]

Further Reading

Books and field guides related to what AI labels should tell us before launch. Use these as the next step if you want deeper reading beyond the article.

Atlas of AI

By Kate Crawford

Explains data origins, AI systems, transparency, accountability, and governance concerns behind documentation efforts.

AI Snake Oil

By Arvind Narayanan, Sayash Kapoor

Helps readers evaluate claims, limitations, testing results, and deployment risks that model cards seek to communicate.

Endnotes

  1. Source: arxiv.org
    Title: arXiv Datasheets for Datasets
    Link: https://arxiv.org/abs/1803.09010
    Source snippet

    Datasheets for DatasetsMarch 23, 2018...

    Published: March 23, 2018

  2. Source: arxiv.org
    Title: arXiv Model Cards for Model Reporting
    Link: https://arxiv.org/abs/1810.03993

  3. Source: nist.gov
    Link: https://www.nist.gov/itl/ai-risk-management-framework
    Source snippet

    AI Risk Management FrameworkNIST has developed a framework to better manage risks to individuals, organizations, and society associat...

  4. Source: agora.eto.tech
    Title: ETO AGORA: NIST AI Risk Management Framework
    Link: https://agora.eto.tech/instrument/772
    Source snippet

    ETO AGORANIST AI Risk Management Framework - ETO AGORAAI risk measurements include documenting aspects of systems... MEASURE bolster AI...

  5. Source: microsoft.com
    Link: https://www.microsoft.com/en-us/research/wp-content/uploads/2019/01/1803.09010.pdf
    Source snippet

    Datasheets for Datasetsby T Gebru · Cited by 4580 — The questions are divided into seven categories: motivation for dataset crea...

  6. Source: dl.acm.org
    Link: https://dl.acm.org/doi/10.1145/3287560.3287596
    Source snippet

    ACM Digital LibraryModel Cards for Model Reporting | Proceedings of the...by M Mitchell · 2019 · Cited by 4162 — In this paper, we propo...

  7. Source: arxiv.org
    Link: https://arxiv.org/pdf/1810.03993
    Source snippet

    Model Cards for Model Reportingby M Mitchell · 2018 · Cited by 4162 — Model cards also disclose the context in which models are intended...

  8. Source: mdsd4health.com
    Title: Datasheets for Datasets
    Link: https://www.mdsd4health.com/modules/module-3-mdsd-methods-mediums-pt-i/datasheets-for-datasets
    Source snippet

    Datasheets for Datasets are documents that disclose the motivation, composition, collection process, recommended uses of a dat...

  9. Source: overleaf.com
    Link: https://www.overleaf.com/latex/templates/datasheet-for-dataset-template/jgqyyzyprxth
    Source snippet

    Datasheet for dataset templateDocument [the dataset] motivation, composition, collection process, recommended uses, and so on. [They] hav...

  10. Source: arxiv.org
    Link: https://arxiv.org/pdf/1803.09010
    Source snippet

    Datasheets for Datasetsby T Gebru · 2018 · Cited by 4541 — dataset be accompanied with a datasheet that documents its motivation, com- po...

  11. Source: iapp.org
    Title: 5 things to know about ai model cards
    Link: https://iapp.org/news/a/5-things-to-know-about-ai-model-cards
    Source snippet

    23 Aug 2023 — Model cards are short documents provided with machine learning models that explain the context in which the models are inte...

  12. Source: researchgate.net
    Link: https://www.researchgate.net/publication/328189552_Model_Cards_for_Model_Reporting
    Source snippet

    Model Cards for Model ReportingConcurrently, the proposal of Model cards are concise reports accompanying ML models that detail their int...

  13. Source: practical-ai-act.eu
    Link: https://practical-ai-act.eu/latest/engineering-practice/model-cards/
    Source snippet

    Model cardsModel cards are a somewhat standardized form of documentation that provide a comprehensive overview of an AI model, including...

  14. Source: arxiv.org
    Link: https://arxiv.org/abs/2401.15229
    Source snippet

    Evolving AI Risk Management: A Maturity Model based on the NIST AI Risk Management FrameworkJanuary 26, 2024...

    Published: January 26, 2024

  15. Source: arxiv.org
    Link: https://arxiv.org/abs/2511.12668
    Source snippet

    AI Bill of Materials and Beyond: Systematizing Security Assurance through the AI Risk Scanning (AIRS) FrameworkNovember 16, 2025...

    Published: November 16, 2025

  16. Source: nist.gov
    Link: https://www.nist.gov/
    Source snippet

    National Institute of Standards and TechnologyNIST promotes U.S. innovation and industrial competitiveness by advancing measurement scien...

  17. Source: researchgate.net
    Title: Datasheets for Datasets
    Link: https://www.researchgate.net/publication/324055506_Datasheets_for_Datasets
    Source snippet

    Datasheets for Datasets3 May 2026 — We propose the concept of a datasheet for datasets, a short document to accompany public datasets, co...

    Published: May 2026

  18. Source: researchgate.net
    Link: https://www.researchgate.net/publication/386668632_Datasheets_for_Datasets
    Source snippet

    Datasheets for DatasetsBy analogy, we propose that every dataset be accompanied with a datasheet that documents its motivation, compositi...

  19. Source: youtube.com
    Title: Model Cards for Model Reporting
    Link: https://www.youtube.com/watch?v=saAUB_MG2d0
    Source snippet

    Datasheets for Datasets help ML engineers notice and understand ethical issues in training data...

  20. Source: ainowinstitute.org
    Title: datasheets for datasets
    Link: https://ainowinstitute.org/publications/datasheets-for-datasets
    Source snippet

    22 Feb 2023 — every dataset be accompanied with a datasheet that documents its motivation, composition, collection process, recommended u...

  21. Source: mlr3fairness.mlr-org.com
    Title: mlr3 Fairness Datasheet for dataset “add dataset name here”
    Link: https://mlr3fairness.mlr-org.com/articles/datasheet/datasheet.html
    Source snippet

    mlr3 FairnessDatasheet for dataset “add dataset name here” - mlr3fairnessMotivation Composition Collection process Preprocessing/cleaning...

  22. Source: aisecurityandsafety.org
    Title: datasheets for datasets
    Link: https://aisecurityandsafety.org/en/glossary/datasheets-for-datasets/
    Source snippet

    AI Security & Safety DirectoryDatasheets for Datasets — AI Governance Definition & Guide27 Mar 2026 — Standardized documentation for mach...

  23. Source: verifywise.ai
    Link: https://verifywise.ai/de/ai-governance-library/transparency-and-documentation/model-cards-paper
    Source snippet

    Model Cards for Model Reporting | KI-Governance-BibliothekModel cards provide standardized documentation covering intended uses, performa...

  24. Source: alan-turing-institute.github.io
    Title: Alan Turing Institute Model Cards
    Link: https://alan-turing-institute.github.io/tea-techniques/techniques/model-cards/
    Source snippet

    Alan Turing InstituteModel Cards - TEA TechniquesModel cards are standardised documentation frameworks that systematically document machi...

  25. Source: emergentmind.com
    Title: model cards for model reporting
    Link: https://www.emergentmind.com/topics/model-cards-for-model-reporting
    Source snippet

    Model Cards for Reporting AI Models18 Mar 2026 — They enable transparency and regulatory compliance by including sections on intended use...

  26. Source: sentinelone.com
    Title: nist ai risk management framework
    Link: https://www.sentinelone.com/cybersecurity-101/cybersecurity/nist-ai-risk-management-framework/
    Source snippet

    What is the NIST AI Risk Management Framework?Oct 14, 2025 — The NIST artificial intelligence risk management framework (AI RMF) guides o...

  27. Source: github.com
    Link: https://github.com/AudreyBeard/Datasheets-for-Datasets-Template/blob/master/refs.bib
    Source snippet

    AudreyBeard/Datasheets-for-Datasets-Templateevery dataset be accompanied with a datasheet that documents its motivation, composition, col...

  28. Source: ui.adsabs.harvard.edu
    Link: https://ui.adsabs.harvard.edu/abs/2018arXiv180309010G/abstract
    Source snippet

    for Datasets - ADSby T Gebru · 2018 · Cited by 4536 — We propose that every dataset be accompanied with a datasheet that documents its mo...

  29. Source: edwinwenink.github.io
    Title: model card
    Link: https://edwinwenink.github.io/ai-ethics-tool-landscape/tools/model-card/
    Source snippet

    s for Model Reporting13 Jul 2021 — Model cards are short documents accompanying trained machine learning models that provide benchmarked...

  30. Source: info4940.infosci.cornell.edu
    Title: model card
    Link: https://info4940.infosci.cornell.edu/project/proj-01/model-card.html
    Source snippet

    card9 Nov 2025 — It provides a summary of the model's performance and limitations, as well as the context in which it was trained and use...

  31. Source: ai-solutions.daviesmeyer.com
    Title: datasheets for datasets
    Link: https://ai-solutions.daviesmeyer.com/en/glossary/datasheets-for-datasets
    Source snippet

    for Datasets Explained - HamburgStandardized documentation for ML datasets describing provenance, composition, collection methods, recomm...
