The Loop  ·  Issue 017

The Loop

A field journal of the AI frontier — for engineers who ship.

  Lab bench

Experiment №007
filed Apr 21, 2026

explainer

Filed under

  • #transformers
  • #attention
  • #interpretability

What each token looks at

Click a token. See which earlier tokens it attends to, as a row of weighted bars.

  Primer

Skip if you already know the theory; the interactive is right below.

Inside a transformer, every token at every layer decides which earlier tokens matter to it right now. The mechanism is attention: the token produces a "query" vector, each earlier token offers a "key" vector, and the dot products — after a softmax — become the weights. Those weights determine how much of each earlier token's content gets mixed in when this one is updated.
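To make the arithmetic concrete, here is a minimal NumPy sketch of that computation for a single token and a single head. The function names and toy shapes are ours, and it includes the usual division by sqrt(d) that keeps the softmax from saturating, a detail the one-sentence version above glosses over.

  import numpy as np

  def attention_weights(q, K):
      """q: (d,) query for the current token; K: (t, d) keys for the earlier tokens."""
      scores = K @ q / np.sqrt(q.shape[-1])   # one dot product per earlier token, scaled
      scores -= scores.max()                  # numerical stability before exp
      w = np.exp(scores)
      return w / w.sum()                      # softmax: non-negative, sums to 1

  def attend(q, K, V):
      """Mix the earlier tokens' value vectors V (t, d) according to the weights."""
      return attention_weights(q, K) @ V      # weighted sum = this token's update

  rng = np.random.default_rng(0)
  K = rng.normal(size=(3, 4))                 # 3 earlier tokens, 4-dim keys
  V = rng.normal(size=(3, 4))                 # their value vectors
  q = rng.normal(size=(4,))                   # the clicked token's query
  print(attention_weights(q, K))              # these are the bars in the demo

In a real model, q, K, and V are learned linear projections of the layer's hidden states, one set per head.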

The diagrams below are illustrative, not lifted from a specific trained model — but the patterns (pronouns pulling to referents, plural verbs jumping over singular distractors, quote marks binding together) are real, and they've been studied enough that interpreting attention by hand is a legitimate research skill.

▶  Try it

[Interactive attention-trace viewer renders here.]

  Notes from the bench

What to watch for, why it matters, and the one thing that usually surprises people.

Patterns worth naming

Referent tracking

In "The cat sat on the mat because it was warm", the pronoun it  must bind to a noun. Click it in the example above — you'll see attention split between cat and mat, with most weight on whichever makes semantic sense. Bigger models disambiguate more confidently. Smaller ones hedge.

Long-distance agreement

"The keys to the cabinet are…" — the plural verb has to agree with the plural subject, not the closer singular distractor. In the trace, are reaches four tokens back to keys, almost entirely skipping cabinet. This pattern is the reason transformers displaced RNNs: RNNs compress prior context into a fixed state that loses which noun was plural; attention preserves the lookup.

Symbol resolution in code

Attention patterns on code are some of the cleanest. A function call attends back to its definition. A variable attends back to its binding. If you've ever wondered how a model "knows" which function you meant when you typed fib(n-1), this is roughly it.

Caveats

Real transformers have many heads per layer, each specializing in different patterns — some attend to the previous token, some to punctuation, some to whichever token comes first in a list. Modern interpretability work usually looks at specific heads at specific layers, not the aggregate. The patterns here are aggregates for the sake of legibility.
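Continuing the GPT-2 sketch above, isolating one head at one layer is a single indexing operation. The layer and head numbers below are placeholders; which head (if any) specializes in agreement varies by model, so in practice you sweep them.

  # Reuses `out`, `tokens`, and `last` from the GPT-2 sketch above.
  layer, head = 5, 8                               # hypothetical indices, sweep to find specialists
  single = out.attentions[layer][0, head]          # (seq, seq) for this one head
  for t, w in zip(tokens, single[last].tolist()):
      print(f"{t!r:>12}  {w:.2f}")

  # Quick survey: for every layer and head, how much weight does " are" put on " keys"?
  # (Assumes " keys" tokenizes to the single GPT-2 token "Ġkeys".)
  src = tokens.index("Ġkeys")
  for i, layer_attn in enumerate(out.attentions):
      row = layer_attn[0, :, last, src]            # (heads,)
      print(i, [round(w, 2) for w in row.tolist()])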

Attention also isn't the whole story. The residual stream, MLP layers, and layer normalization all shape what a token ends up representing. Attention tells you what a token looked at, not what it concluded.

In a line

Four hand-calibrated attention examples showing referent tracking, long-distance subject-verb agreement, symbol resolution in code, and quotation binding.

Other experiments

  • Exp 001 · How a sentence becomes tokens
  • Exp 002 · Temperature and top-p, visibly
  • Exp 003 · What does this prompt actually cost?
  • Exp 004 · Tokens per second
  • Exp 005 · How far should the model think?
  • Exp 006 · Neural language vs a Markov chain
  • Exp 008 · Words in space
  • Exp 009 · The injection arena
  • Exp 010 · AI or human?
  • Exp 011 · Context Tetris
  • Exp 012 · Magnet flip