Can we trust pictures of AI neurons?

Feature visualisation can reveal useful internal patterns, but synthetic images may make neurons look cleaner than they really are.

On this page

  • What activation-maximised images can reveal
  • Why one neuron may have multiple meanings
  • How real-image examples can check synthetic views

Introduction

Feature visualisation is one of the most striking tools in AI interpretability. Researchers can generate synthetic images that strongly activate a neuron or layer inside an image-recognition network, producing pictures that appear to reveal what the system has learned. These images have helped show how deep networks move from simple edge detectors to representations of textures, object parts and higher-level concepts. However, the resulting pictures are not direct photographs of a neuron’s “thoughts”. They are interpretations produced by a method, and that method has important limitations. [Distill]

The key lesson is not that feature visualisation is useless. It is that visualisations can be persuasive while still being incomplete. A synthetic image may highlight one aspect of a neuron’s behaviour, hide others, or exaggerate how cleanly a concept is represented inside a network. Understanding these limitations is essential when using visualisations to explain how image layers turn pixels into objects. [Distill]

What activation-maximised images can reveal

Most feature visualisation methods use a technique called activation maximisation. Starting from random noise, the algorithm gradually modifies an image so that a chosen neuron, channel or class becomes increasingly active. The final image is then treated as a clue about what patterns the network prefers. [christophm.github.io]
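The loop described above can be sketched in a few lines. This is a minimal illustration under loud assumptions: the "neuron" is a hypothetical tanh of a weighted sum over four pixels, not a unit from a trained network, and the names neuron_activation, activation_gradient, maximise_activation and toy_weights are invented for the example. Real feature visualisation tools backpropagate through a full network and usually add regularisation.

```python
import math
import random


def neuron_activation(image, weights):
    """Toy stand-in for a neuron: tanh of a weighted sum of pixel values."""
    return math.tanh(sum(w * p for w, p in zip(weights, image)))


def activation_gradient(image, weights):
    """Gradient of tanh(w . x) with respect to each pixel: (1 - a^2) * w."""
    a = neuron_activation(image, weights)
    return [(1.0 - a * a) * w for w in weights]


def maximise_activation(weights, steps=200, step_size=0.1, seed=0):
    """Gradient ascent on the input, starting from low-amplitude noise."""
    rng = random.Random(seed)
    image = [rng.uniform(-0.01, 0.01) for _ in weights]
    history = []
    for _ in range(steps):
        grad = activation_gradient(image, weights)
        # Step uphill, then clamp each pixel to the valid range [-1, 1].
        image = [max(-1.0, min(1.0, p + step_size * g))
                 for p, g in zip(image, grad)]
        history.append(neuron_activation(image, weights))
    return image, history


toy_weights = [0.5, -0.3, 0.8, 0.0]  # hypothetical, not from a trained model
image, history = maximise_activation(toy_weights)
print(f"activation rose from {history[0]:.3f} to {history[-1]:.3f}")
```

The optimised input ends up positive where the toy weights are positive and negative where they are negative, while the pixel with zero weight stays near its starting noise. The result describes what excites the unit most, not what a typical input looks like.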

This approach has produced valuable insights. Early-layer visualisations often resemble edges, colour contrasts and simple textures. Deeper-layer visualisations can contain fur-like patterns, wheels, faces, feathers or other structures that correspond to meaningful visual features. Such results helped demonstrate that image networks frequently build hierarchical representations rather than relying solely on memorised templates. [Distill]

The problem is that activation-maximised images are not natural photographs. They are artificial solutions to an optimisation problem. The algorithm searches for whatever input most strongly excites a target neuron, not for a typical image that humans would encounter. As a result, visualisations can contain unusual combinations of textures, colours and shapes that rarely occur in real-world scenes. [Distill]

A neuron that helps detect dog faces, for example, might be visualised as a surreal blend of fur textures, eye-like structures and abstract patterns. The image may communicate something genuine about the neuron’s preferences, but it can also create the impression that the neuron is far more specialised or interpretable than it really is. [Distill]

Why one neuron may have multiple meanings

One of the most important discoveries in interpretability research is that many neurons are not dedicated detectors for a single concept. Instead, they can respond to several different visual patterns. Researchers sometimes describe this as a neuron being “multifaceted” or, more broadly, as a form of polysemantic behaviour. [ResearchGate, arXiv]

This creates a problem for simple feature visualisations. If a neuron activates for multiple distinct situations, a single synthetic image may blend those situations together. The resulting picture can look confusing because it is effectively averaging several meanings into one visual artefact. [ResearchGate]

Researchers studying multifaceted feature visualisation highlighted this issue using examples where a neuron responded to very different appearances associated with the same category. A neuron linked to grocery-store recognition, for instance, might activate for both storefront views and rows of produce. A single activation-maximised image can merge these patterns into an unrealistic hybrid that never appears in nature. [ResearchGate]
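A toy model makes the blending problem concrete. The sketch below assumes a hypothetical unit that fires for whichever of two invented patterns, FACET_A or FACET_B, matches the input best. It is not the procedure from the cited multifaceted feature visualisation work. It only shows that different starting points can recover different facets, and that a single image mixing the two is a hybrid the unit likes less than either facet.

```python
import math

FACET_A = [1.0, 0.0, 0.0, 0.0]  # hypothetical "storefront" pattern
FACET_B = [0.0, 0.0, 1.0, 0.0]  # hypothetical "rows of produce" pattern


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def normalise(v):
    """Scale a vector to unit length so activations stay comparable."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def unit_activation(image):
    """Toy multifaceted unit: responds to the best-matching facet."""
    return max(dot(FACET_A, image), dot(FACET_B, image))


def maximise_from(start, steps=100, step_size=0.1):
    """Norm-constrained gradient ascent from a chosen starting image."""
    image = normalise(start)
    for _ in range(steps):
        # Subgradient of max(): follow the facet that currently wins.
        a_wins = dot(FACET_A, image) >= dot(FACET_B, image)
        grad = FACET_A if a_wins else FACET_B
        image = normalise([p + step_size * g for p, g in zip(image, grad)])
    return image


view_a = maximise_from([0.6, 0.1, 0.4, 0.1])  # start nearer facet A
view_b = maximise_from([0.4, 0.1, 0.6, 0.1])  # start nearer facet B
blend = normalise([(p + q) / 2 for p, q in zip(view_a, view_b)])

for name, img in [("view_a", view_a), ("view_b", view_b), ("blend", blend)]:
    print(name, round(unit_activation(img), 3))
```

The two optimised views each settle on one facet, while the blended image contains both patterns at reduced strength. Showing only one picture for this unit would hide either a facet or the fact that there are two.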

This matters because readers often interpret neuron images too literally. Seeing a clean synthetic pattern can encourage the belief that a neuron corresponds neatly to a single human concept. In reality, the network’s representation may be distributed across many neurons, while individual neurons participate in multiple overlapping functions. [netdissect.csail.mit.edu, Wikipedia]

Why synthetic images can overstate interpretability

Another risk is that feature visualisations can make a network appear more understandable than it actually is.

The Network Dissection project was partly motivated by concerns that visualisations themselves require human interpretation. Looking at a synthetic image still leaves researchers asking what concept it represents and how consistently that concept appears in real data. The image is an interpretation of the model, not a direct measurement of its internal semantics. [CVF Open Access]

Researchers have also shown that visual interpretations can be manipulated. Studies on attacks against neuron interpretation demonstrated that model representations can sometimes be altered so that feature visualisations change dramatically while predictive performance remains largely intact. In other words, a compelling visual explanation does not necessarily correspond to a uniquely correct understanding of what the model is doing. [OpenReview]
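One way to see why this is possible is a deliberately simple linear example. It is not the attack from the cited study; it is a sketch under the assumption that the data never use one input direction. The names dataset, original_unit, altered_unit and preferred_input are hypothetical. For a linear unit, the unit-length input that maximises activation is just the weight vector scaled to length one, so the "visualisation" has a closed form.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def preferred_input(weights):
    """Unit-length input that maximises a linear unit: weights / |weights|."""
    length = math.sqrt(dot(weights, weights))
    return [w / length for w in weights]


# Hypothetical dataset in which the third feature is always zero.
dataset = [
    [0.2, 0.9, 0.0],
    [0.7, 0.1, 0.0],
    [0.5, 0.5, 0.0],
]

original_unit = [1.0, 0.5, 0.0]
# Differs only along a direction that the data never exercise.
altered_unit = [1.0, 0.5, 4.0]

same_on_data = all(
    dot(original_unit, x) == dot(altered_unit, x) for x in dataset
)
view_original = preferred_input(original_unit)
view_altered = preferred_input(altered_unit)
similarity = dot(view_original, view_altered)  # cosine similarity

print("identical on the data:", same_on_data)
print("similarity of the two visualisations:", round(similarity, 3))
```

Both units behave identically on every example in the toy dataset, yet their optimised inputs point in quite different directions. The visualisation depends on how a unit responds to inputs the data never contain, which is exactly the region that optimisation is free to explore.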

The broader lesson is that attractive images should not be confused with definitive explanations. A visualisation may reveal a genuine tendency inside a network while still omitting important aspects of how that representation operates in practice. [OpenReview]

How real-image examples can check synthetic views

One way researchers reduce these risks is by comparing synthetic visualisations with real images that strongly activate the same neuron.

Instead of asking only, “What image can we generate to maximise activation?”, researchers also ask, “Which photographs in a dataset naturally activate this unit?” Looking at real examples often reveals whether the synthetic image corresponds to genuine behaviour or whether it emphasises unusual optimisation artefacts. [Distill]

This comparison can expose hidden complexity. A synthetic image might suggest that a neuron detects a dog’s nose, yet examination of real-image activations could show responses to several related face structures rather than a single isolated part. Conversely, repeated activation across many photographs can strengthen confidence that the visualisation has identified a meaningful pattern. [Distill]
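The real-image check amounts to ranking a dataset by the unit's activation and inspecting the top results. The sketch below assumes a hypothetical linear unit and a tiny invented dataset of three-number feature vectors keyed by made-up image names. In practice the activations would come from running real photographs through the network.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def top_activating(dataset, unit_weights, k=3):
    """Return the k (activation, name) pairs with the highest activation."""
    scored = [(dot(unit_weights, features), name)
              for name, features in dataset.items()]
    scored.sort(reverse=True)
    return scored[:k]


# Hypothetical feature vectors standing in for real photographs.
dataset = {
    "dog_nose_closeup": [0.9, 0.1, 0.0],
    "dog_face": [0.7, 0.6, 0.1],
    "cat_face": [0.3, 0.8, 0.1],
    "car_wheel": [0.0, 0.1, 0.9],
    "tree": [0.1, 0.0, 0.2],
}
unit_weights = [1.0, 0.8, -0.5]  # hypothetical unit, not a trained one

for activation, name in top_activating(dataset, unit_weights):
    print(f"{name}: {activation:.2f}")
```

In this toy case the strongest real examples are whole faces of more than one animal, not the isolated nose, which is the kind of mismatch with a synthetic image that the comparison is meant to surface.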

Methods such as Network Dissection push this idea further by measuring how strongly units align with labelled concepts in large image datasets. Rather than relying entirely on visual inspection, they attempt to quantify whether a unit consistently corresponds to objects, parts, textures, colours or other semantic categories. [netdissect.csail.mit.edu]
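The core measurement can be illustrated with an intersection-over-union (IoU) score between the region where a unit is strongly active and the region labelled with a concept. The sketch below is a simplification under stated assumptions: it scores a single hypothetical 3×3 activation map against a single invented concept mask with a hand-picked threshold, whereas the published method derives its threshold from the unit's activation distribution and aggregates over a whole labelled dataset. The function name iou_with_concept is invented for the example.

```python
def iou_with_concept(activation_map, concept_mask, threshold):
    """IoU between the thresholded activation map and a binary concept mask."""
    intersection = 0
    union = 0
    for act_row, mask_row in zip(activation_map, concept_mask):
        for act, labelled in zip(act_row, mask_row):
            active = act > threshold
            is_concept = bool(labelled)
            intersection += int(active and is_concept)
            union += int(active or is_concept)
    # With no active or labelled pixels there is nothing to compare.
    return intersection / union if union else 0.0


# Hypothetical unit activations over a 3x3 grid of image locations.
activation_map = [
    [0.1, 0.9, 0.8],
    [0.0, 0.7, 0.2],
    [0.0, 0.1, 0.0],
]
# Hypothetical ground-truth mask for one concept (1 = concept present).
concept_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]

score = iou_with_concept(activation_map, concept_mask, threshold=0.5)
print(f"IoU with concept: {score:.2f}")
```

A score near 1 means the unit fires almost exactly where the concept is labelled; a score near 0 means the apparent match in a visualisation is not borne out in the labelled data.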

Can we trust pictures of AI neurons?

The most accurate answer is: trust them as clues, not as ground truth.

Feature visualisation has been enormously useful for understanding how image networks organise visual information. It has revealed evidence for edges, textures, object parts and higher-level structures emerging across layers. Yet the same research community that developed these methods has repeatedly emphasised their limitations. Synthetic images are optimisation products, neurons may have multiple meanings, and visual explanations can oversimplify distributed representations. [Distill, christophm.github.io]

Evidence from interpretability studies suggests that feature visualisations work best when combined with other checks: inspection of real activating images, quantitative concept analysis and broader examination of how groups of units work together. When used in that way, they provide valuable windows into AI vision. When treated as literal pictures of what a neuron “sees”, they can become misleading. [netdissect.csail.mit.edu, CVF Open Access]

Further Reading

Books related to this article, as a next step for deeper reading.

Deep Learning

By Ian Goodfellow, Yoshua Bengio et al.

Strong coverage of learned features, representation learning, and neural network internals relevant to feature visualisation.

Endnotes

  1. Source: distill.pub
    Title: Feature Visualization
    Link: https://distill.pub/2017/feature-visualization
    Source snippet

    Feature Visualization, by C Olah: Diverse feature visualizations allow us to more closely...

    Published: November 7, 2017

  2. Source: christophm.github.io
    Link: https://christophm.github.io/interpretable-ml-book/cnn-features.html
    Source snippet

    Feature Visualization visualizes the learned features by activation...

  3. Source: researchgate.net
    Title: Uncovering the Different Types of Features Learned By...
    Link: https://www.researchgate.net/publication/301845946_Multifaceted_Feature_Visualization_Uncovering_the_Different_Types_of_Features_Learned_By_Each_Neuron_in_Deep_Neural_Networks
    Source snippet

    Uncovering the Different Types of Features Learned By... February 11, 2016: A limitation of current techniques is that they...

    Published: February 11, 2016

  4. Source: arxiv.org
    Link: https://arxiv.org/abs/1602.03616

  5. Source: Wikipedia
    Link: https://en.wikipedia.org/wiki/Polysemanticity

  6. Source: netdissect.csail.mit.edu
    Link: https://netdissect.csail.mit.edu/

  7. Source: arxiv.org
    Title: Interpreting Neural Networks through the Polytope Lens
    Link: https://arxiv.org/abs/2211.12312

  8. Source: openreview.net
    Title: Adversarial Attacks on Neuron Interpretation via Activation...
    Link: https://openreview.net/pdf/637f5c318237190cce1ff2528a37fc06346a5812.pdf
    Source snippet

    Adversarial Attacks on Neuron Interpretation via Activation... November 18, 2023: We study the feature visualization of a neur...

    Published: November 18, 2023

  9. Source: arxiv.org
    Link: https://arxiv.org/abs/2106.12447

  10. Source: arxiv.org
    Link: https://arxiv.org/html/2508.07281v1
    Source snippet

    Representation Understanding via Activation Maximization, 10 Aug 2025: In this work, we propose a unified feature visualization framework...

  11. Source: arxiv.org
    Link: https://arxiv.org/html/2604.08039v1
    Source snippet

    LLM-based Iterative Neuron Explanations for Vision Models, 9 Apr 2026: Interpreting the concepts encoded by individual neurons in deep neu...

  12. Source: researchgate.net
    Link: https://www.researchgate.net/publication/320971142_Network_Dissection_Quantifying_Interpretability_of_Deep_Visual_Representations
    Source snippet

    Quantifying Interpretability of Deep Visual Representations: These "Network Dissection" approaches [Bau et al., 2017] enabled systematic ch...

  13. Source: github.com
    Link: https://github.com/CSAILVision/NetDissect-Lite
    Source snippet

    Network Dissection Lite in PyTorch: This repository is a light version of NetDissect, which contains the demo code for the work Network Dis...

  14. Source: distill.pub
    Title: Exploring Neural Networks with Activation Atlases
    Link: https://distill.pub/2019/activation-atlas
    Source snippet

    Exploring Neural Networks with Activation Atlases, by S Carter, 2019: We create an explorable activation atlas of features...

  15. Source: openreview.net
    Link: https://openreview.net/pdf/a302e0072a6e15c8c0361c022bb9d3518f1a7127.pdf

  16. Source: openaccess.thecvf.com
    Title: Network Dissection: Quantifying Interpretability of Deep Visual Representations (CVPR 2017)
    Link: https://openaccess.thecvf.com/content_cvpr_2017/papers/Bau_Network_Dissection_Quantifying_CVPR_2017_paper.pdf
    Source snippet

    Quantifying Interpretability of Deep Visual Representations, by D Bau, 2017: Visualizations digest the mec...

Additional References

  1. Source: openfl.pressbooks.pub
    Link: https://openfl.pressbooks.pub/unfbusinessanalytics/chapter/feature-visualization/
    Source snippet

    Feature Visualization, Business Analytics: Feature visualization for a unit of a neural network is done by finding the input that maximize...

  2. Source: lmb.informatik.uni-freiburg.de
    Link: https://lmb.informatik.uni-freiburg.de/lectures/seminar_brox/seminar_ss22/network_vis.pdf
    Source snippet

    Visualizations: Images synthesized by Activation-Maximization are NOT more helpful than other kinds of visualizations • Experiment is limit...

  3. Source: linkedin.com
    Link: https://www.linkedin.com/pulse/overview-feature-visualization-activation-am-method-shrivastava-bdkmc
    Source snippet

    An overview of Feature Visualization: Activation... Activation Maximization (AM) is a popular method for feature visualization...

  4. Source: medium.com
    Link: https://medium.com/%40hke22/techniques-of-feature-visualisation-activation-maximisation-in-convolutional-neural-networks-07443d822380
    Source snippet

    It aims at synthesising a specific input that maximises a neuron's activation with a...

  5. Source: anhnguyen.me
    Link: https://anhnguyen.me/2016/mfv/
    Source snippet

    s only one type of feature, but we know that neurons can be multifaceted, in that they...

  6. Source: semanticscholar.org
    Link: https://www.semanticscholar.org/paper/4fc46f52419f37bd3d3539b8442ea232e24d0e00
    Source snippet

    terpretability of the units inside a deep convolutional neural networks...

  7. Source: ruthfong.com
    Title: Understanding Convolutional Neural Networks
    Link: https://www.ruthfong.com/files/fong20_thesis.pdf
    Source snippet

    Abstract: In this thesis, we introduce several methods for understanding convolutional neural networks (CNNs), the class of deep learning m...

  8. Source: aisafety.info
    Link: https://aisafety.info/questions/8HIA/What-is-feature-visualization
    Source snippet

    into the concepts that neural networks have...

  9. Source: youtube.com
    Link: https://www.youtube.com/watch?v=Xy6RcjXMa2c
    Source snippet

    propose a general framework called Network Dissection for quantifying...

  10. Source: youtube.com
    Title: ‘How neural networks learn’
    Link: https://www.youtube.com/watch?v=McgxRxi2Jqo
    Source snippet

    Interpretability vs. Explainability in Machine Learning...
