Recently updated notes

Emergence of invariance and disentanglement in deep representations

Alessandro Achille

2018

Their minimal sufficient invariant representation is the closest analogue to our ecology-relative quotient, and Task ecologies and the evolution of world-tracking representations in large language models uses their Proposition 3.1 to mark where conditioning on context (rathe…

Convergence rates of posterior distributions

Subhashis Ghosal

2000

Posterior concentration results are listed in Task ecologies and the evolution of world-tracking representations in large language models among plausible routes for bridging Bayes-optimal targets to SGD-trained representations, a gap left to future work.

On the information bottleneck theory of deep learning

Andrew M. Saxe

2019

Their caveats about the descriptive reach of the IB principle in deep learning temper the scope under which Task ecologies and the evolution of world-tracking representations in large language models applies its bottleneck-style reasoning.

The information bottleneck problem and its applications in machine learning

Ziv Goldfeld

2020

We cite this survey in Task ecologies and the evolution of world-tracking representations in large language models to point readers to the current state of information bottleneck theory rather than recapitulate it ourselves.

Agglomerative information bottleneck

Noam Slonim

2000

The agglomerative information bottleneck supplies a greedy partition-merging algorithm for discrete state spaces that prefigures the partition-level objects Task ecologies and the evolution of world-tracking representations in large language models identifies as minimal suff…

The deterministic information bottleneck

D. J. Strouse

2017

The deterministic information bottleneck shares our restriction to deterministic encodings, and Task ecologies and the evolution of world-tracking representations in large language models inherits that constraint while adding the ecology-relative quotient structure.

Learning and generalization with the information bottleneck

Ohad Shamir

2010

Their finite-sample IB learning bounds are cited in Task ecologies and the evolution of world-tracking representations in large language models as a more realistic counterpart to our finite-class certification result, which now lives in the appendix.

Completeness, similar regions, and unbiased estimation: Part I

E. L. Lehmann

1950

We cite the classical sufficiency literature to anchor the equivalence between zero excess loss and statistical sufficiency for the next token, supplying Task ecologies and the evolution of world-tracking representations in large language models with its rigorous statistical…

Sufficiency and statistical decision functions

Raghu Raj Bahadur

1954

Paired with Lehmann and Scheffé to ground the sufficiency notion used throughout Task ecologies and the evolution of world-tracking representations in large language models, where the conditional insufficiency term in the loss decomposition inherits its meaning from this cla…

Show Your Work: Scratchpads for Intermediate Computation with Language Models

Maxwell Nye

2021

Paired with the chain-of-thought citation in Task ecologies and the evolution of world-tracking representations in large language models, scratchpads exemplify intermediate-token procedures that close the deployment decoding gap without enlarging the frozen separation set. T…

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Jason Wei

2022

Task ecologies and the evolution of world-tracking representations in large language models cites chain-of-thought prompting as a deployment-time procedure that creates longer informative contexts and improves performance within a frozen model, while leaving the underlying s…

Learning by Surprise: Surplexity for Mitigating Model Collapse in Generative AI

Daniele Gambetta

2025

Cited in Task ecologies and the evolution of world-tracking representations in large language models as concrete evidence for the niche-construction feedback loop: surplexity work documents performance and diversity decay across generations when later models train on synthet…

Understanding with Toy Surrogate Models in Machine Learning

Andrés Páez

2024

Paired with Hubinger et al. to support the use of toy surrogate models for theoretical inquiry; Task ecologies and the evolution of world-tracking representations in large language models adopts that methodology so every theoretically relevant quantity remains directly obser…

The debate over understanding in AI's large language models

Melanie Mitchell

2023

We invoke their survey of the understanding debate to situate Task ecologies and the evolution of world-tracking representations in large language models within an ongoing methodological disagreement, then sidestep the philosophical impasse by isolating the parts that admit…

A mathematical theory for understanding when abstract representations emerge in neural networks

Bin Wang

2025

Their proof of approximately orthogonal latent-variable representations at global minima of feedforward networks is cited in Task ecologies and the evolution of world-tracking representations in large language models as a parallel structural result for a different architectu…

Emergent Introspective Awareness in Large Language Models

Jack Lindsey

2025

Task ecologies and the evolution of world-tracking representations in large language models uses Lindsey's introspection results to flag a weaker individual-level analogue of niche construction, where computational states a model can detect may themselves enter the effective…

When Models Manipulate Manifolds: The Geometry of a Counting Task

Wes Gurnee

2025

Their finding that next-token training induces low-dimensional internal geometry for structural variables gives Task ecologies and the evolution of world-tracking representations in large language models an empirical counterpart to its topological convergence prediction.

Toy Models of Superposition

Nelson Elhage

2022

Toy Models of Superposition is cited in Task ecologies and the evolution of world-tracking representations in large language models together with the circuits framework to mark the mechanistic interpretability tradition our ecology-level account aims to complement.

A Mathematical Framework for Transformer Circuits

Nelson Elhage

2021

The mathematical framework for transformer circuits identifies architectural structure in trained transformers, and Task ecologies and the evolution of world-tracking representations in large language models complements it by characterizing which structure is loss-forced by…

An Information-Geometric View of the Platonic Hypothesis

Alexander Lobashev

2025

Lobashev's Bayesian route to convergence, which attributes failure mainly to capacity mismatch, is set in Task ecologies and the evolution of world-tracking representations in large language models alongside other accounts of when models converge to a shared representation.

microgpt

Andrej Karpathy

2026

Task ecologies and the evolution of world-tracking representations in large language models adopts Karpathy's microgpt as the architectural template for its laboratory model organism, picking it because the small frozen autoregressive transformer permits direct enumeration o…

Position: The Platonic Representation Hypothesis

Minyoung Huh

2024

We cite this as one entry in the debate over whether language models develop internal structure that tracks the world, framing the empirical question that Task ecologies and the evolution of world-tracking representations in large language models answers with a suff…

Revisiting the Platonic Representation Hypothesis: An Aristotelian view

Fabian Gröger

2026

Paired with Huh et al. as evidence that the world-tracking question is live; in Task ecologies and the evolution of world-tracking representations in large language models we use it to motivate the move from observed representational convergence to its information-theoretic…

Neural Networks can Learn Representations with Gradient Descent

Alexandru Damian

2022

Cited in Task ecologies and the evolution of world-tracking representations in large language models as part of the feature-learning theory we point to when sketching how SGD might reach the partitions our static theorems characterize.

Do Large Language Models Understand Us?

Blaise Agüera y Arcas

2022

Listed alongside Bender and Koller as a contrasting voice in the understanding debate; in Task ecologies and the evolution of world-tracking representations in large language models we use the cluster to mark the terrain that the ecological-veridicality argument cuts across.

High-dimensional asymptotics of feature learning: How one gradient step improves the representation

Jimmy Ba

2022

Their high-dimensional analysis of feature learning after one gradient step is cited in Task ecologies and the evolution of world-tracking representations in large language models as a possible ingredient for resolving the oracle-to-trained gap.

Neural networks as kernel learners: The silent alignment effect

Alexander Atanasov

2022

The silent alignment result is grouped in Task ecologies and the evolution of world-tracking representations in large language models with other feature-learning analyses that could plausibly bridge optimization dynamics to the ecology-relative target.

Extension of covariance selection mathematics

George R. Price

1972

We pair Price's 1972 extension with his 1970 paper throughout the Price-equation derivation in Between interface and truth: Multi-task selection drives ecologically veridical perception, using it as the standard reference for the covariance-plus-transmission identity th…

Selection and covariance

George R. Price

1970

Price's 1970 covariance identity is the starting point for the one-generation decomposition we apply in Between interface and truth: Multi-task selection drives ecologically veridical perception, partitioning change in any encoding trait into a selection covariance with…

Is vision continuous with cognition? The case for cognitive impenetrability of visual perception

Zenon W. Pylyshyn

1999

Pylyshyn's cognitive impenetrability supplies the structural premise of Between interface and truth: Multi-task selection drives ecologically veridical perception, namely that the encoding is fixed across tasks while only downstream readouts vary, which is what makes multi-task per…