Can we trust pictures of AI neurons?

Introduction

Feature visualisation is one of the most striking tools in AI interpretability. Researchers can generate synthetic images that strongly activate a neuron or layer inside an image-recognition network, producing pictures that appear to reveal what the system has learned. These images have helped show how deep networks move from simple edge detectors to representations of textures, object parts and higher-level concepts. However, the resulting pictures are not direct photographs of a neuron’s “thoughts”. They are interpretations produced by a method, and that method has important limitations. [Distill]

The key lesson is not that feature visualisation is useless. It is that visualisations can be persuasive while still being incomplete. A synthetic image may highlight one aspect of a neuron’s behaviour, hide others, or exaggerate how cleanly a concept is represented inside a network. Understanding these limitations is essential when using visualisations to explain how image layers turn pixels into objects. [Distill]

What activation-maximised images can reveal

Most feature visualisation methods use a technique called activation maximisation. Starting from random noise, the algorithm gradually modifies an image so that a chosen neuron, channel or class becomes increasingly active. The final image is then treated as a clue about what patterns the network prefers. [christophm.github.io]
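
The loop described above can be sketched in a few lines. This is a minimal illustration, not the method used in any of the cited work: the "neuron" is a hypothetical toy unit (a tanh of a weighted sum over four "pixels"), the gradient is written out by hand, and the names `neuron`, `activation_maximise` and `weights` are assumptions made for the example. Real feature visualisation runs the same kind of gradient ascent through a trained network, usually with extra regularisation.

```python
import math
import random

def neuron(x, w):
    """Toy stand-in for a unit: tanh of a weighted sum of pixel values."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))

def activation_maximise(w, steps=200, lr=0.1, seed=0):
    """Gradient ascent on the input so that neuron(x, w) increases."""
    rng = random.Random(seed)
    x = [rng.uniform(-0.01, 0.01) for _ in w]  # start from faint random noise
    for _ in range(steps):
        a = neuron(x, w)
        # Analytic gradient: d tanh(w.x) / dx_i = (1 - tanh(w.x)^2) * w_i
        grad = [(1.0 - a * a) * wi for wi in w]
        # Step uphill, then clip so every "pixel" stays in [-1, 1].
        x = [max(-1.0, min(1.0, xi + lr * gi)) for xi, gi in zip(x, grad)]
    return x

weights = [1.0, -2.0, 0.5, 0.0]  # hypothetical preferences of the toy unit
image = activation_maximise(weights)
print([round(p, 2) for p in image], round(neuron(image, weights), 3))
```

The optimised input pushes each pixel in whichever direction the unit rewards, which is why the result reflects the unit's preferences rather than anything resembling a typical input.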

This approach has produced valuable insights. Early-layer visualisations often resemble edges, colour contrasts and simple textures. Deeper-layer visualisations can contain fur-like patterns, wheels, faces, feathers or other structures that correspond to meaningful visual features. Such results helped demonstrate that image networks frequently build hierarchical representations rather than relying solely on memorised templates. [Distill]

The problem is that activation-maximised images are not natural photographs. They are artificial solutions to an optimisation problem. The algorithm is searching for whatever input most strongly excites a target neuron, not for a typical image that humans would encounter. As a result, visualisations can contain unusual combinations of textures, colours and shapes that rarely occur in real-world scenes. [Distill]

A neuron that helps detect dog faces, for example, might be visualised as a surreal blend of fur textures, eye-like structures and abstract patterns. The image may communicate something genuine about the neuron’s preferences, but it can also create the impression that the neuron is far more specialised or interpretable than it really is. [Distill]

Why one neuron may have multiple meanings

One of the most important discoveries in interpretability research is that many neurons are not dedicated detectors for a single concept. Instead, they can respond to several different visual patterns. Researchers sometimes describe this as a neuron being “multifaceted” or, more broadly, as a form of polysemantic behaviour. [ResearchGate, arXiv]

This creates a problem for simple feature visualisations. If a neuron activates for multiple distinct situations, a single synthetic image may blend those situations together. The resulting picture can look confusing because the optimiser is rewarded for combining several meanings in one visual artefact. [ResearchGate]

Researchers studying multifaceted feature visualisation highlighted this issue using examples where a neuron responded to very different appearances associated with the same category. A neuron linked to grocery-store recognition, for instance, might activate for both storefront views and rows of produce. A single activation-maximised image can merge these patterns into an unrealistic hybrid that never appears in nature. [ResearchGate]
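
A toy example makes the blending effect concrete. The sketch below assumes a hypothetical four-pixel "image" and a made-up unit with two facets, one firing on pixels 0 and 1 (standing in for a storefront) and one on pixels 2 and 3 (standing in for produce). None of this comes from the cited paper; it only illustrates why maximising a multifaceted unit tends to yield a hybrid.

```python
from itertools import product

def unit(x):
    """Toy two-facet unit: fires for pattern A (pixels 0, 1) or pattern B (pixels 2, 3)."""
    facet_a = max(0, x[0] + x[1] - 1)
    facet_b = max(0, x[2] + x[3] - 1)
    return facet_a + facet_b

# The two kinds of input the unit would meet in (hypothetical) real data.
storefront = (1, 1, 0, 0)
produce = (0, 0, 1, 1)

# Exhaustive "activation maximisation" over every binary four-pixel image.
best = max(product((0, 1), repeat=4), key=unit)

print("storefront:", unit(storefront))
print("produce:   ", unit(produce))
print("maximiser: ", best, unit(best))
```

The input with the highest activation switches on both patterns at once, even though each real example contains only one of them. A single maximised image therefore shows a combination that the data never contains.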

This matters because readers often interpret neuron images too literally. Seeing a clean synthetic pattern can encourage the belief that a neuron corresponds neatly to a single human concept. In reality, the network’s representation may be distributed across many neurons, while individual neurons participate in multiple overlapping functions. [netdissect.csail.mit.edu, Wikipedia]

Why synthetic images can overstate interpretability

Another risk is that feature visualisations can make a network appear more understandable than it actually is.

The Network Dissection project was partly motivated by concerns that visualisations themselves require human interpretation. Looking at a synthetic image still leaves researchers asking what concept it represents and how consistently that concept appears in real data. The image is an interpretation of the model, not a direct measurement of its internal semantics. [CVF Open Access]

Researchers have also shown that visual interpretations can be manipulated. Studies on attacks against neuron interpretation demonstrated that model representations can sometimes be altered so that feature visualisations change dramatically while predictive performance remains largely intact. In other words, a compelling visual explanation does not necessarily correspond to a uniquely correct understanding of what the model is doing. [OpenReview]

The broader lesson is that attractive images should not be confused with definitive explanations. A visualisation may reveal a genuine tendency inside a network while still omitting important aspects of how that representation operates in practice. [OpenReview]

How real-image examples can check synthetic views

One way researchers reduce these risks is by comparing synthetic visualisations with real images that strongly activate the same neuron.

Instead of asking only, “What image can we generate to maximise activation?”, researchers also ask, “Which photographs in a dataset naturally activate this unit?” Looking at real examples often reveals whether the synthetic image corresponds to genuine behaviour or whether it emphasises unusual optimisation artefacts. [Distill]

This comparison can expose hidden complexity. A synthetic image might suggest that a neuron detects a dog’s nose, yet examination of real-image activations could show responses to several related face structures rather than a single isolated part. Conversely, repeated activation across many photographs can strengthen confidence that the visualisation has identified a meaningful pattern. [Distill]
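
The real-image check amounts to ranking a dataset by how strongly each example activates the unit. The sketch below is illustrative only: the dataset, its labels and the toy `unit` function are all hypothetical, and in practice the activation would come from a forward pass through a trained network.

```python
import heapq

def unit(x):
    """Toy two-facet unit (hypothetical): responds to pixels 0+1 or pixels 2+3."""
    return max(0.0, x[0] + x[1] - 1) + max(0.0, x[2] + x[3] - 1)

def top_activating(dataset, unit_fn, k=3):
    """Return the k (activation, label) pairs with the highest activation."""
    scored = ((unit_fn(pixels), label) for label, pixels in dataset)
    return heapq.nlargest(k, scored)

# Hypothetical labelled "photographs", each reduced to four pixel values.
dataset = [
    ("storefront_a", (1.0, 0.9, 0.0, 0.1)),
    ("produce_a", (0.1, 0.0, 0.8, 1.0)),
    ("produce_b", (0.0, 0.2, 0.9, 0.8)),
    ("storefront_b", (0.9, 0.7, 0.0, 0.0)),
    ("car_park", (0.5, 0.4, 0.3, 0.2)),
    ("sky", (0.0, 0.0, 0.0, 0.0)),
]

top = top_activating(dataset, unit)
for activation, label in top:
    print(f"{label}: {activation:.2f}")
```

In this toy dataset the top examples include both storefront and produce images, which reveals the two facets that a single synthetic image would have merged.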

Methods such as Network Dissection push this idea further by measuring how strongly units align with labelled concepts in large image datasets. Rather than relying entirely on visual inspection, they attempt to quantify whether a unit consistently corresponds to objects, parts, textures, colours or other semantic categories. [netdissect.csail.mit.edu]
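
One common way to put a number on this kind of alignment is intersection over union (IoU) between the region where a unit is strongly active and the region labelled with a concept. The sketch below is a simplified illustration in that spirit, not the Network Dissection implementation: the 3x3 activation map, the concept mask and the fixed threshold of 0.5 are all assumptions chosen for the example.

```python
def iou(activation_map, concept_mask, threshold):
    """Intersection over union of the thresholded activations and a concept mask."""
    intersection = 0
    union = 0
    for act_row, mask_row in zip(activation_map, concept_mask):
        for act, mask_value in zip(act_row, mask_row):
            active = act > threshold      # unit counts as "on" at this location
            labelled = bool(mask_value)   # location is annotated with the concept
            if active and labelled:
                intersection += 1
            if active or labelled:
                union += 1
    return intersection / union if union else 0.0

# Hypothetical unit activations over a 3x3 grid of image locations.
activation_map = [
    [0.1, 0.9, 0.8],
    [0.0, 0.7, 0.2],
    [0.0, 0.0, 0.1],
]
# Hypothetical annotation: 1 where the concept (say, "wheel") is present.
concept_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]

score = iou(activation_map, concept_mask, threshold=0.5)
print(round(score, 2))
```

A score near 1 means the unit fires almost exactly where the concept appears, while a score near 0 means little overlap. Averaging such scores over many images gives a measurement that does not depend on how convincing a picture looks.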

Can we trust pictures of AI neurons?

The most accurate answer is: trust them as clues, not as ground truth.

Feature visualisation has been enormously useful for understanding how image networks organise visual information. It has revealed evidence for edges, textures, object parts and higher-level structures emerging across layers. Yet the same research community that developed these methods has repeatedly emphasised their limitations. Synthetic images are optimisation products, neurons may have multiple meanings, and visual explanations can oversimplify distributed representations. [Distill, christophm.github.io]

Evidence from interpretability studies suggests that feature visualisations work best when combined with other checks: inspection of real activating images, quantitative concept analysis and broader examination of how groups of units work together. When used in that way, they provide valuable windows into AI vision. When treated as literal pictures of what a neuron “sees”, they can become misleading. [netdissect.csail.mit.edu, CVF Open Access]

Endnotes

Source: distill.pub
Title: Feature Visualization
Link: https://distill.pub/2017/feature-visualization
Source snippet
by C Olah, 2017. Diverse feature visualizations allow us to more closely...
Published: November 7, 2017

Source: christophm.github.io
Link: https://christophm.github.io/interpretable-ml-book/cnn-features.html
Source snippet
Feature Visualization visualizes the learned features by activation...

Source: researchgate.net
Title: Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks
Link: https://www.researchgate.net/publication/301845946_Multifaceted_Feature_Visualization_Uncovering_the_Different_Types_of_Features_Learned_By_Each_Neuron_in_Deep_Neural_Networks
Source snippet
A limitation of current techniques is that they...
Published: February 11, 2016

Source: arxiv.org
Link: https://arxiv.org/abs/1602.03616

Source: Wikipedia
Link: https://en.wikipedia.org/wiki/Polysemanticity

Source: netdissect.csail.mit.edu
Link: https://netdissect.csail.mit.edu/

Source: arxiv.org
Title: Interpreting Neural Networks through the Polytope Lens
Link: https://arxiv.org/abs/2211.12312

Source: openreview.net
Title: Adversarial Attacks on Neuron Interpretation via Activation...
Link: https://openreview.net/pdf/637f5c318237190cce1ff2528a37fc06346a5812.pdf
Source snippet
We study the feature visualization of a neur...
Published: November 18, 2023

Source: arxiv.org
Link: https://arxiv.org/abs/2106.12447

Source: arxiv.org
Title: Representation Understanding via Activation Maximization
Link: https://arxiv.org/html/2508.07281v1
Source snippet
In this work, we propose a unified feature visualization framework...
Published: August 10, 2025

Source: arxiv.org
Title: LLM-based Iterative Neuron Explanations for Vision Models
Link: https://arxiv.org/html/2604.08039v1
Source snippet
Interpreting the concepts encoded by individual neurons in deep neu...
Published: April 9, 2026

Source: researchgate.net
Title: Network Dissection: Quantifying Interpretability of Deep Visual Representations
Link: https://www.researchgate.net/publication/320971142_Network_Dissection_Quantifying_Interpretability_of_Deep_Visual_Representations
Source snippet
These "Network Dissection" approaches [Bau et al., 2017] enabled systematic ch...

Source: github.com
Title: Network Dissection Lite in PyTorch
Link: https://github.com/CSAILVision/NetDissect-Lite
Source snippet
This repository is a light version of NetDissect, which contains the demo code for the work Network Dis...

Source: distill.pub
Title: Exploring Neural Networks with Activation Atlases
Link: https://distill.pub/2019/activation-atlas
Source snippet
by S Carter, 2019. We create an explorable activation atlas of features...

Source: openreview.net
Link: https://openreview.net/pdf/a302e0072a6e15c8c0361c022bb9d3518f1a7127.pdf

Source: openaccess.thecvf.com
Title: Network Dissection: Quantifying Interpretability of Deep Visual Representations
Link: https://openaccess.thecvf.com/content_cvpr_2017/papers/Bau_Network_Dissection_Quantifying_CVPR_2017_paper.pdf
Source snippet
by D Bau, 2017. Visualizations digest the mec...

Additional References

Source: openfl.pressbooks.pub
Title: Feature Visualization – Business Analytics
Link: https://openfl.pressbooks.pub/unfbusinessanalytics/chapter/feature-visualization/
Source snippet
Feature visualization for a unit of a neural network is done by finding the input that maximize...

Source: lmb.informatik.uni-freiburg.de
Link: https://lmb.informatik.uni-freiburg.de/lectures/seminar_brox/seminar_ss22/network_vis.pdf
Source snippet
Images synthesized by Activation-Maximization are NOT more helpful than other kinds of visualizations. Experiment is limit...

Source: linkedin.com
Title: An overview of Feature Visualization: Activation...
Link: https://www.linkedin.com/pulse/overview-feature-visualization-activation-am-method-shrivastava-bdkmc
Source snippet
Activation Maximization (AM) is a popular method for feature visualization...

Source: medium.com
Link: https://medium.com/%40hke22/techniques-of-feature-visualisation-activation-maximisation-in-convolutional-neural-networks-07443d822380
Source snippet
It aims at synthesising a specific input that maximises a neuron's activation with a...

Source: anhnguyen.me
Link: https://anhnguyen.me/2016/mfv/
Source snippet
...s only one type of feature, but we know that neurons can be multifaceted, in that they...

Source: semanticscholar.org
Link: https://www.semanticscholar.org/paper/4fc46f52419f37bd3d3539b8442ea232e24d0e00
Source snippet
...terpretability of the units inside a deep convolutional neural networks...

Source: ruthfong.com
Title: Understanding Convolutional Neural Networks
Link: https://www.ruthfong.com/files/fong20_thesis.pdf
Source snippet
Abstract: In this thesis, we introduce several methods for understanding convolutional neural networks (CNNs), the class of deep learning m...

Source: aisafety.info
Link: https://aisafety.info/questions/8HIA/What-is-feature-visualization
Source snippet
...into the concepts that neural networks have...

Source: youtube.com
Link: https://www.youtube.com/watch?v=Xy6RcjXMa2c
Source snippet
...propose a general framework called Network Dissection for quantifying...

Source: youtube.com
Title: How neural networks learn
Link: https://www.youtube.com/watch?v=McgxRxi2Jqo
Source snippet
Interpretability vs. Explainability in Machine Learning...
