The curve that says stop training

Validation curves can show the point where more training improves memory of old data but harms performance on new cases.

On this page

  • How validation error changes during training
  • The warning gap between train and validation results
  • Early stopping as a practical defence

Introduction

A validation curve is one of the most practical tools developers use to decide when a machine-learning model has learned enough. During training, a model is repeatedly adjusted to reduce errors. At first, these adjustments usually improve performance on both the training data and a separate validation set. Eventually, however, a turning point often appears: the model keeps improving on the training examples while its validation performance stops improving or begins to worsen. That moment is a warning that the model is starting to memorise details of the training data rather than learning patterns that generalise to new examples. Detecting that point and stopping training is known as early stopping, and it is one of the most widely used defences against overfitting. [Google for Developers]

How validation error changes during training

A validation curve tracks how well a model performs on data that was not used to update its parameters. Developers typically evaluate the model on the validation set after each training cycle, often called an epoch.

In the early stages of training, both training error and validation error usually fall together. The model is discovering broad patterns that help on familiar and unfamiliar examples alike. As training continues, the rate of improvement slows. Eventually, the validation curve often reaches its best point and begins to flatten. Beyond that point, further training may continue reducing training error while offering little or no benefit on unseen data. [Google for Developers]

The most important feature of the curves is not the exact level of the error but the direction in which each one is moving:

  • Training error falling and validation error falling: learning is still helping.
  • Training error falling and validation error flattening: gains on new examples are becoming limited.
  • Training error falling and validation error rising: overfitting is likely beginning. [Scikit-learn]
  • Both curves remaining high: the model may be underfitting and needs more capacity, better features, or more training. [Scikit-learn]
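The four cases above can be turned into a rough rule of thumb. The sketch below is illustrative only: the function names, the `window`, `tolerance` and `high_error` parameters, and their default values are assumptions made for this example, not standard settings from any library.

```python
def recent_change(values, window):
    """Change over the last `window` steps (negative means the error fell)."""
    return values[-1] - values[-1 - window]


def describe_trend(train_errors, val_errors, window=3, tolerance=0.005, high_error=0.3):
    """Label the recent direction of the two error curves.

    `window`, `tolerance` and `high_error` are hypothetical knobs chosen
    for illustration; sensible values depend on the task and the metric.
    """
    if len(train_errors) <= window or len(val_errors) <= window:
        raise ValueError("need more than `window` measurements per curve")

    train_delta = recent_change(train_errors, window)
    val_delta = recent_change(val_errors, window)

    if train_delta >= -tolerance:
        # Training error has stopped falling.
        if train_errors[-1] > high_error and val_errors[-1] > high_error:
            return "both high: possible underfitting"
        return "training error no longer falling"
    if val_delta < -tolerance:
        return "both falling: learning is still helping"
    if val_delta > tolerance:
        return "validation rising: overfitting is likely beginning"
    return "validation flattening: gains on new examples are limited"


# Made-up error values, one per epoch.
print(describe_trend([0.9, 0.7, 0.5, 0.4, 0.3, 0.2],
                     [0.9, 0.6, 0.5, 0.5, 0.55, 0.6]))
```

Comparing the latest value with the value a few epochs earlier, rather than with the immediately preceding one, makes the label less sensitive to a single noisy measurement.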

Developers often save model checkpoints throughout training and identify the checkpoint that achieved the lowest validation loss. That checkpoint, rather than the final training state, is frequently chosen for deployment. [Google for Developers]
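As a minimal sketch of that selection step, assuming one validation loss has been recorded per epoch and one checkpoint saved per epoch, the chosen checkpoint is simply the one at the minimum of the recorded losses. The loss values and the `best_checkpoint` helper are invented for illustration.

```python
def best_checkpoint(val_losses):
    """Return (epoch, loss) for the lowest validation loss; epochs are 1-indexed.

    If several epochs tie, the earliest one is returned.
    """
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1, val_losses[best_index]


# Made-up validation losses for six epochs.
val_losses = [0.92, 0.61, 0.48, 0.44, 0.46, 0.51]
epoch, loss = best_checkpoint(val_losses)
print(f"deploy checkpoint from epoch {epoch} (validation loss {loss:.2f})")
# prints: deploy checkpoint from epoch 4 (validation loss 0.44)
```

Here the final epoch (6) is not the one selected, because validation loss was already rising by then.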

The warning gap between train and validation results

One of the clearest signals in machine learning is the widening gap between training and validation performance.

Imagine a model whose training accuracy climbs from 90% to 99%. At first glance, that looks like progress. However, if validation accuracy peaks at 94% and then drifts downward to 92%, the extra training is making the model worse at handling new examples. The model is learning increasingly specific details of the training set that do not transfer to the outside world. [Cross Validated]

This divergence is sometimes called the generalisation gap. A small gap is expected because the model naturally performs best on data it has already seen. A rapidly growing gap is a warning sign that memorisation is replacing generalisation. Scikit-learn’s guidance on validation curves explicitly identifies the pattern of high training scores combined with lower validation scores as evidence of overfitting. [Scikit-learn]
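The gap can be computed directly as training accuracy minus validation accuracy at each epoch. The sketch below uses the endpoints from the example above (training accuracy rising from 90% to 99%, validation accuracy peaking at 94% and falling to 92%); the intermediate values and the starting validation accuracy are invented to fill in the curve.

```python
def generalisation_gap(train_scores, val_scores):
    """Per-epoch gap between training and validation accuracy."""
    return [round(t - v, 4) for t, v in zip(train_scores, val_scores)]


# Endpoints follow the example in the text; the values in between are made up.
train_acc = [0.90, 0.93, 0.96, 0.98, 0.99]
val_acc = [0.89, 0.92, 0.94, 0.93, 0.92]

gaps = generalisation_gap(train_acc, val_acc)
for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch}: gap = {gap:.2f}")
```

With these numbers the gap stays at about one percentage point for the first two epochs and then widens to seven, even though training accuracy never stops improving.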

A useful way to think about the gap is that the training curve measures how well the model remembers, while the validation curve measures how well it understands. When memory keeps improving but understanding does not, developers become cautious about continuing training.

Early stopping as a practical defence

Early stopping turns the information in validation curves into a concrete training rule. Instead of deciding in advance to train for a fixed number of epochs, developers continuously monitor validation performance and stop when improvement disappears. [Google for Developers]

A common procedure works like this:

  1. Train the model for one epoch.
  2. Measure validation loss. [CodeSignal]
  3. Save the model if validation loss improves.
  4. Continue training while improvements continue.
  5. Stop after a predefined period without improvement.

That waiting period is often called patience. Developers rarely stop at the first slight increase in validation loss because random fluctuations can occur. Instead, they allow several validation checks without improvement before ending training. [CodeSignal]
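The procedure and the patience rule can be sketched without any particular framework by replaying a list of validation losses, one per epoch. This is a simplified illustration, not the implementation used by any library: the `early_stopping` function, its `min_delta` parameter and the loss values are assumptions for this example, and a real training loop would train the model and save a checkpoint where the comments indicate.

```python
def early_stopping(val_losses, patience=3, min_delta=0.0):
    """Replay per-epoch validation losses and report where training would stop.

    Returns (best_epoch, stopped_epoch), both 1-indexed.
    `min_delta` is a hypothetical minimum improvement that counts as progress.
    """
    best_loss = float("inf")
    best_epoch = 0
    stopped_epoch = 0
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses, start=1):
        # Steps 1 and 2: one epoch has been trained and validation loss measured.
        stopped_epoch = epoch
        if loss < best_loss - min_delta:
            # Step 3: a real loop would save a checkpoint here.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # Step 5: patience exhausted.
    return best_epoch, stopped_epoch


# Made-up losses with a small bump at epoch 5 before a new best at epoch 6.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.49, 0.51, 0.53, 0.54, 0.60]
print(early_stopping(losses, patience=3))  # prints: (6, 9)
print(early_stopping(losses, patience=1))  # prints: (4, 5)
```

With a patience of 1, training stops at the temporary bump and misses the better checkpoint at epoch 6. With a patience of 3, the bump is tolerated and training stops only after three consecutive epochs without improvement.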

Google’s machine-learning guidance describes early stopping as a form of regularisation because it limits the model’s opportunity to over-specialise on the training data. Interestingly, early stopping often accepts slightly worse training performance in exchange for better performance on unseen examples. [Google for Developers]

Why developers do not rely on the test set instead

A natural question is why developers watch validation curves rather than repeatedly checking the final test set.

The reason is that every decision influenced by a dataset leaks information from that dataset into development. If developers repeatedly examine test results and adjust training accordingly, the test set gradually becomes part of the training process. The supposedly independent final evaluation loses its value.

Validation curves solve this problem by providing feedback during development while keeping the test set untouched until the end. The test set remains the final independent check of whether the stopping decision truly improved generalisation. [Scikit-learn]
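One way to enforce that separation is to split the data into three disjoint parts before any training begins. The sketch below uses only the Python standard library; the `three_way_split` helper and the 70/15/15 proportions are assumptions chosen for illustration, not a recommendation from the cited sources.

```python
import random


def three_way_split(examples, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle once, then cut into disjoint train, validation and test lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible

    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))  # prints: 70 15 15
```

The validation part drives stopping and checkpoint decisions during development; the test part is evaluated once, after those decisions are final.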

The curve is useful, but not infallible

Although validation curves are powerful, they are not perfect indicators. Validation measurements can be noisy, especially when datasets are small. Different metrics can also suggest different stopping points. Recent research has shown that selecting checkpoints using validation accuracy alone may sometimes produce worse outcomes than using validation loss, highlighting that the choice of monitored metric matters. [arXiv]
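A small example shows how two metrics can point to different checkpoints. The numbers below are invented, not taken from the cited research: validation loss is lowest at epoch 4, while validation accuracy keeps creeping up for one more epoch.

```python
# Made-up per-epoch validation metrics for the same training run.
val_loss = [0.70, 0.52, 0.45, 0.43, 0.47, 0.55]
val_accuracy = [0.78, 0.84, 0.87, 0.88, 0.89, 0.885]

# Epochs are reported 1-indexed.
best_by_loss = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_by_accuracy = max(range(len(val_accuracy)), key=val_accuracy.__getitem__) + 1

print(f"lowest validation loss at epoch {best_by_loss}")          # epoch 4
print(f"highest validation accuracy at epoch {best_by_accuracy}")  # epoch 5
```

This can happen because loss reflects how confident each prediction is, whereas accuracy only counts whether the top prediction is right, so a model can become more overconfident in its mistakes while still classifying a few more examples correctly.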

Modern deep-learning systems can also display complex behaviours, including periods where performance temporarily worsens before improving again. Because of these effects, developers usually combine validation monitoring with checkpoint saving, patience settings, and other regularisation methods rather than relying on a single curve. [arXiv]

Even with these caveats, the central lesson remains straightforward: when validation performance stops improving while training performance keeps improving, the model is often crossing from learning useful patterns into memorising the training data. The validation curve is therefore not merely a graph; it is a practical signal that tells developers when to stop, before better memory of the training data turns into worse performance on new examples. [Cross Validated]

Endnotes

  1. Source: developers.google.com
    Link: https://developers.google.com/machine-learning/crash-course/overfitting/regularization
    Source snippet

    Google for DevelopersOverfitting: L2 regularization | Machine Learning9 Apr 2026 — Early stopping: an alternative to complexity-based reg...

  2. Source: developers.google.com
    Link: https://developers.google.com/machine-learning/crash-course/overfitting/interpreting-loss-curves
    Source snippet

    Google for DevelopersOverfitting: Interpreting loss curves | Machine LearningDec 3, 2025 — Unfortunately, loss curves are often challengi...

  3. Source: developers.google.com
    Link: https://developers.google.com/machine-learning/crash-course/overfitting
    Source snippet

    Interpret different kinds of loss curves; detect convergence and overfitting in loss curves. Prerequisites: This module assumes...

  4. Source: scikit-learn.org
    Title: learning curve
    Link: https://scikit-learn.org/stable/modules/learning_curve.html
    Source snippet

    Validation curves: plotting scores to evaluate modelsIf the training score is high and the validation score is low, the estimator is over...

  5. Source: developers.google.com
    Title: overfitting gbdt
    Link: https://developers.google.com/machine-learning/decision-forests/overfitting-gbdt
    Source snippet

    Therefore, as for neural networks, you can apply regularization and early stopping using a validation dataset.

  6. Source: codesignal.com
    Title: Code Signalearly stopping
    Link: https://codesignal.com/learn/courses/improving-neural-networks-with-pytorch/lessons/early-stopping-in-pytorch-preventing-overfitting-during-training
    Source snippet

    Preventing Overfitting During TrainingBut if the validation loss stops improving for a certain number of epochs, called the patience, you...

  7. Source: scikit-learn.org
    Link: https://scikit-learn.org/stable/modules/cross_validation.html
    Source snippet

    3.1. Cross-validation: evaluating estimator performanceCross-validation provides information about how well an estimator gene...

  8. Source: arxiv.org
    Link: https://arxiv.org/abs/2602.22107
    Source snippet

    Don't stop me now: Rethinking Validation Criteria for Model Parameter SelectionFebruary 25, 2026...

    Published: February 25, 2026

  9. Source: arxiv.org
    Title: arXiv Early Stopping in Deep Networks: Double Descent and How to Eliminate it
    Link: https://arxiv.org/abs/2007.10099
    Source snippet

    Early Stopping in Deep Networks: Double Descent and How to Eliminate itJuly 20, 2020...

    Published: July 20, 2020

  10. Source: google.com
    Link: https://www.google.com/
    Source snippet

    Search the world's information, including webpages, images, videos and more. Google has many special features to help you find exac...

  11. Source: developers.google.com
    Link: https://developers.google.com/machine-learning/crash-course/overfitting/overfitting
    Source snippet

    Machine Learning3 Dec 2025 — Overfitting means creating a model that matches (memorizes) the training set so closely that the model fai...

  12. Source: developers.google.com
    Title: crash course
    Link: https://developers.google.com/machine-learning/crash-course
    Source snippet

    Learning Crash CourseDatasets, Generalization, and Overfitting. An introduction to the characteristics of machine learning datasets, and...

  13. Source: scikit-learn.org
    Link: https://scikit-learn.org/
    Source snippet

    machine learning in Python — scikit-learn 1.9.0...Machine Learning in Python · Simple and efficient tools for predictive d...

  14. Source: scikit-learn.org
    Title: learning curve
    Link: https://scikit-learn.org/0.17/modules/learning_curve.html
    Source snippet

    Validation curves: plotting scores to evaluate modelsA learning curve shows the validation and training score of an estimator for varying...

  15. Source: scikit-learn.org
    Title: learning curve
    Link: https://scikit-learn.org/0.15/modules/learning_curve.html
    Source snippet

    Validation curves: plotting scores to evaluate modelsA learning curve shows the validation and training score of an estimator for varying...

  16. Source: scikit-learn.org
    Title: learning curve
    Link: https://scikit-learn.org/0.16/modules/learning_curve.html
    Source snippet

    Validation curves: plotting scores to evaluate modelsA learning curve shows the validation and training score of an estimator for varying...

  17. Source: scikit-learn.org
    Link: https://scikit-learn.org/stable/auto_examples/model_selection/plot_learning_curve.html
    Source snippet

    The effect is depicted by checking the statistical performance of the model...

  18. Source: scikit-learn.org
    Link: https://scikit-learn.org/stable/auto_examples/model_selection/plot_underfitting_overfitting.html
    Source snippet

    Underfitting vs. OverfittingWe calculate the mean squared error (MSE) on the validation set, the higher, the less likely the model genera...

  19. Source: stats.stackexchange.com
    Title: validation loss increases while training loss decrease
    Link: https://stats.stackexchange.com/questions/395332/validation-loss-increases-while-training-loss-decrease
    Source snippet

    The training loss will always tend to improve as training continues up until the model's capacity to learn...

  20. Source: datascience.stackexchange.com
    Title: early stopping on validation loss or on accuracy
    Link: https://datascience.stackexchange.com/questions/37186/early-stopping-on-validation-loss-or-on-accuracy
    Source snippet

    stopping on validation loss or on accuracy?Aug 20, 2018 — I am currently training a neural network and I cannot decide which to use to im...

  21. Source: datascience.stackexchange.com
    Title: why might my validation loss flatten out while my training loss continues to dec
    Link: https://datascience.stackexchange.com/questions/85532/why-might-my-validation-loss-flatten-out-while-my-training-loss-continues-to-dec
    Source snippet

    might my validation loss flatten out while my training...Nov 17, 2020 — In my effort to learn a bit more about data science I scraped so...

  22. Source: datascience.stackexchange.com
    Title: difference between learning curve and validation curve
    Link: https://datascience.stackexchange.com/questions/62303/difference-between-learning-curve-and-validation-curve
    Source snippet

    between learning_curve and validation_curveOct 28, 2019 — What is the difference between these two curves: learning_curve and validation_...

  23. Source: dictionary.cambridge.org
    Link: https://dictionary.cambridge.org/dictionary/english-chinese-traditional/validation
    Source snippet

    in Traditional Chinese - Cambridge Dictionarythe act or process of making something officially or legally acceptable or approved 批准;認證...

  24. Source: stackoverflow.com
    Link: https://stackoverflow.com/questions/20357705/scikit-learn-cross-validation-over-fitting-or-under-fitting
