Experiment №007
filed Apr 21, 2026
explainer
Filed under
- #transformers
- #attention
- #interpretability
What each token looks at
Click a token. See which earlier tokens it attends to, as a row of weighted bars.
❂ Primer
Skip if you already know the theory; the interactive is right below.
Inside a transformer, every token at every layer decides which earlier tokens matter to it right now. The mechanism is attention: the token produces a "query" vector, each earlier token offers a "key" vector, and the scaled dot products, passed through a softmax, become the weights. Those weights determine how much of each earlier token's content gets mixed in when this one is updated.
The diagrams below are illustrative, not lifted from a specific trained model — but the patterns (pronouns pulling to referents, plural verbs jumping over singular distractors, quote marks binding together) are real, and they've been studied enough that interpreting attention by hand is a legitimate research skill.
▶ Try it
⁂ Notes from the bench
What to watch for, why it matters, and the one thing that usually surprises people.
Patterns worth naming
Referent tracking
In "The cat sat on the mat because it was warm", the pronoun it must bind to a noun. Click it in the example above — you'll see attention split between cat and mat, with most weight on whichever makes semantic sense. Bigger models disambiguate more confidently. Smaller ones hedge.
Long-distance agreement
"The keys to the cabinet are…" — the plural verb has to agree with the plural subject, not the closer singular distractor. In the trace, are reaches four tokens back to keys, almost entirely skipping cabinet. This pattern illustrates a key reason transformers displaced RNNs: an RNN compresses prior context into a fixed state that can lose track of which noun was plural, while attention preserves the lookup.
Symbol resolution in code
Attention patterns on code are some of the cleanest. A function call attends back to its definition. A variable attends back to its binding. If you've ever wondered how a model "knows" which function you meant when you typed fib(n-1), this is roughly it.
Caveats
Real transformers have many heads per layer, each specializing in different patterns — some attend to the previous token, some to punctuation, some to whichever token comes first in a list. Modern interpretability work usually looks at specific heads at specific layers, not the aggregate. The patterns here are aggregates for the sake of legibility.
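The cost of aggregating can be sketched with two hypothetical heads attending from one query token over four earlier tokens. The weights here are invented for illustration: one head implements a "previous token" pattern, the other a "first token" pattern, and averaging them produces a blur that is neither.

```python
# Hypothetical per-head attention rows for a single query token
# over four earlier tokens (each row sums to 1).
head0 = [0.05, 0.05, 0.10, 0.80]  # attends mostly to the previous token
head1 = [0.85, 0.05, 0.05, 0.05]  # attends mostly to the first token

# An aggregate view averages across heads. It is still a valid
# distribution, but both sharp patterns are washed out.
aggregate = [(a + b) / 2 for a, b in zip(head0, head1)]
```

This is why interpretability work inspects individual heads: the mean over heads can look diffuse even when every head is doing something crisp.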
Attention also isn't the whole story. The residual stream, MLP layers, and layer normalization all shape what a token ends up representing. Attention tells you what a token looked at, not what it concluded.
In a line
Four hand-calibrated attention examples showing referent tracking, long-distance subject-verb agreement, symbol resolution in code, and quotation binding.
Other experiments
- Exp 001
How a sentence becomes tokens
- Exp 002
Temperature and top-p, visibly
- Exp 003
What does this prompt actually cost?
- Exp 004
Tokens per second
- Exp 005
How far should the model think?
- Exp 006
Neural language vs a Markov chain
- Exp 008
Words in space
- Exp 009
The injection arena
- Exp 010
AI or human?
- Exp 011
Context Tetris
- Exp 012
Magnet flip