Toy Models of Superposition

Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield-Dodds, Robert Lasenby, Dawn Drain, Carol Chen, Roger Grosse, Sam McCandlish, Jared Kaplan, Dario Amodei, Martin Wattenberg, Christopher Olah
Paper
Transformer Circuits Thread, 2022

Notes

Toy Models of Superposition is cited in riva2026task, together with the circuits framework, as representative of the mechanistic interpretability tradition that our ecology-level account aims to complement.
