Experiment №008
filed Apr 21, 2026
explainer
Filed under
- #embeddings
- #vectors
- #rag
- #semantic-search
Words in space
A hand-placed 2D layout of 80 AI-industry words. Click any to find its nearest neighbors.
❂ Primer
Skip if you already know the theory; the interactive is right below.
An embedding is a word (or sentence, or image) represented as a vector of maybe 1536 numbers. Words used in similar contexts end up near each other in that high-dimensional space — and that's the whole reason embeddings are useful. Every RAG pipeline, every "find me the related article" feature, every semantic search box is built on the same intuition: nearness in embedding space ≈ nearness in meaning.
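Here's a minimal sketch of that "nearness" measure as most systems compute it, cosine similarity. The four-dimensional vectors below are invented for illustration; a real embedding model (like the 1536-dimensional kind mentioned above) would produce them for you.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: near 1.0 means same direction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings" (real models use hundreds or thousands of
# dimensions; these numbers are made up for illustration).
cat    = np.array([0.90, 0.80, 0.1, 0.00])
kitten = np.array([0.85, 0.75, 0.2, 0.05])
gpu    = np.array([0.00, 0.10, 0.9, 0.80])

print(cosine_similarity(cat, kitten))  # high: used in similar contexts
print(cosine_similarity(cat, gpu))     # low: unrelated concepts
```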
The map below isn't a real UMAP or t-SNE projection of a production embedding model — those tend to be smeary and hard to read. Instead, it's a hand-placed 2D layout of 80 AI-industry words, arranged by category so the intuition is legible at a glance.
▶ Try it
[Interactive map: 80 hand-placed words. Click a word to highlight its five nearest neighbors; click a neighbor to jump there.]
⁂ Notes from the bench
What to watch for, why it matters, and the one thing that usually surprises people.
The four clusters
The map settles into four loose clusters:
- Models & labs (Claude, GPT, OpenAI, Anthropic), upper-left.
- Training concepts (transformer, embedding, RLHF, gradient), lower-left.
- Runtime/inference (API, latency, cache, GPU), upper-right.
- Evaluation & safety (benchmark, alignment, jailbreak), lower-right.
Words like "prompt", "agent", and "retrieval" sit near the center because they bridge categories.
Click a word to see its five nearest neighbors. Click one of the neighbors to jump. The interesting move is tracing a path between distant clusters — start at H100, click to GPU, then inference, then latency, then streaming. You've walked a concept graph without knowing it.
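If you want the mechanics without the page, here's a sketch of the same lookup: a handful of hypothetical hand-placed coordinates and a nearest-neighbor function that sorts by Euclidean distance. The coordinates and the greedy walk are invented for illustration; only the shape of the computation matches the interactive.

```python
import math

# A hypothetical slice of the hand-placed layout: word -> (x, y).
# Coordinates are invented; the real page has 80 of these.
layout = {
    "H100":      (8.0, 9.0),
    "GPU":       (7.5, 8.5),
    "inference": (7.0, 7.8),
    "latency":   (6.8, 7.0),
    "streaming": (6.5, 6.2),
    "prompt":    (5.0, 5.0),
}

def nearest_neighbors(word: str, k: int = 5) -> list[str]:
    """Return the k words closest to `word` by Euclidean distance."""
    here = layout[word]
    others = [w for w in layout if w != word]
    others.sort(key=lambda w: math.dist(here, layout[w]))
    return others[:k]

# Walking the concept graph: repeatedly hop to the closest unvisited word.
path, current = ["H100"], "H100"
while len(path) < len(layout):
    current = next(w for w in nearest_neighbors(current, k=len(layout))
                   if w not in path)
    path.append(current)
print(" -> ".join(path))  # e.g. H100 -> GPU -> inference -> latency -> ...
```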
Why RAG rides on this
In production, "nearest neighbors" is usually implemented by a vector database (Pinecone, Chroma, pgvector, Turbopuffer) that stores embeddings and returns the k closest to a query. The query might be a user's question, the k results might be five passages from your docs, and the system prompts a language model with those passages. That's retrieval-augmented generation, and the whole thing rests on the same "close in space means related" assumption.
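A sketch of that loop, under loud assumptions: `embed` below is a fake, hash-seeded stand-in (a real pipeline calls an embedding model, and the vector database does the top-k search instead of a NumPy dot product), so the ranking here is meaningless; only the pipeline shape is the point.

```python
import hashlib
import numpy as np

def embed(text: str) -> np.ndarray:
    """FAKE embedder: hash-seeded random unit vector. A real pipeline calls
    an embedding model here; with this stand-in, similarities are arbitrary."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(1536)
    return v / np.linalg.norm(v)

# "Index" your docs. In production a vector database stores these rows.
docs = [
    "Embeddings map text to vectors.",
    "RAG retrieves passages before generation.",
    "GPUs accelerate inference.",
]
index = np.stack([embed(d) for d in docs])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Top-k passages by cosine similarity (unit vectors, so dot = cosine)."""
    scores = index @ embed(query)
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

passages = retrieve("How does retrieval-augmented generation work?")
prompt = "Answer using only these passages:\n" + "\n".join(passages)
# ...then send `prompt` to the language model of your choice.
```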
When RAG works, it's because a good embedding model put the right passages near your query. When it doesn't, the embedding space usually had the wrong geometry — some dimension you needed for this domain wasn't represented. "Use better embeddings" is sometimes the right answer, which is why embedding models are a competitive market in their own right.
In a line
Illustrative embedding-space tour organized by cluster (models, training, runtime, evaluation, bridge concepts). Click-to-traverse via nearest-neighbor lookups.
Other experiments
- Exp 001
How a sentence becomes tokens
- Exp 002
Temperature and top-p, visibly
- Exp 003
What does this prompt actually cost?
- Exp 004
Tokens per second
- Exp 005
How far should the model think?
- Exp 006
Neural language vs a Markov chain
- Exp 007
What each token looks at
- Exp 009
The injection arena
- Exp 010
AI or human?
- Exp 011
Context Tetris
- Exp 012
Magnet flip