By AI Blog Editor
May 25, 2026 · 14 min read

Nine signatures — an OpenAI reasoning model disproves the Erdős unit-distance conjecture

On May 20, OpenAI announced that an internal generalist reasoning model produced the first counterexample to Paul Erdős' 1946 unit-distance conjecture. Nine senior mathematicians filed an arXiv preprint the same day, verifying the proof.

Paul Erdős at a student seminar in Budapest, fall 1992, mid-explanation in front of a chalkboard. Photo: Kmhkmh, CC BY 3.0, via Wikimedia Commons.

On May 20, 2026, OpenAI announced that an internal general-purpose reasoning model (unnamed in the post, described only as a system "not trained specifically for mathematics") had produced the first counterexample to one of the longest-standing open problems in combinatorial geometry: Paul Erdős' 1946 conjecture on unit distances in the plane. The same day, nine mathematicians, Fields medallist Timothy Gowers among them, filed a 19-page arXiv preprint verifying the proof and reformulating it in publishable form. Will Sawin, one of the nine signers, also posted a separate paper making the bound explicit: there exist configurations of n points in the plane with more than n^1.014 unit-distance pairs. Erdős had conjectured the maximum could not exceed n^(1+o(1)). The conjecture is wrong.

The plain version: the unit-distance problem asks how many pairs of points at exactly distance 1 you can pack into a set of n points in the plane. Erdős believed the answer was "barely more than n," with square grids close to optimal. A general-purpose LLM, given the problem cold, produced a class of constructions that beats the grid by a polynomial factor. The proof leans on tools — infinite class field towers, the Golod-Shafarevich theorem, work by Ellenberg-Venkatesh and Hajir-Maire-Ramakrishna — that the discrete-geometry community had not previously connected to the problem.

That last sentence is the whole story. The math is verifiable. The novelty is that nobody asked the model for those tools.
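To make the plain version of the problem concrete, here is a minimal brute-force sketch that counts pairs at a chosen distance in a small point set. It is an illustration of the problem statement only, not of the model's construction; the function names are invented for this example, and it works with squared integer distances so that no floating-point comparison is needed.

```python
from collections import Counter
from itertools import combinations


def squared_distance_counts(points):
    """Map each squared distance to the number of unordered pairs realising it."""
    counts = Counter()
    for (x1, y1), (x2, y2) in combinations(points, 2):
        counts[(x1 - x2) ** 2 + (y1 - y2) ** 2] += 1
    return counts


def unit_distance_pairs(points, unit_squared=1):
    """Count unordered pairs whose squared distance equals unit_squared."""
    return squared_distance_counts(points)[unit_squared]


def square_grid(m):
    """Integer points of an m-by-m square grid (n = m * m points)."""
    return [(x, y) for x in range(m) for y in range(m)]


def best_rescaled_grid(m):
    """Pick the squared distance that occurs most often in the m-by-m grid.

    Rescaling the grid so that this distance becomes 1 turns every such
    pair into a unit-distance pair. Ties go to the smaller distance.
    """
    counts = squared_distance_counts(square_grid(m))
    return max(counts.items(), key=lambda kv: (kv[1], -kv[0]))


if __name__ == "__main__":
    print(unit_distance_pairs(square_grid(5)))  # pairs at distance 1
    print(best_rescaled_grid(5))                # (squared distance, pair count)
```

On the 5-by-5 grid, distance 1 gives 40 pairs, while rescaling so that the distance √5 becomes the unit gives 48. Choosing the scale cleverly is the whole trick behind the grid constructions that Erdős expected to be close to optimal.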

What "general-purpose reasoning model" actually means

OpenAI did not name the model in the announcement. Sébastien Bubeck, described by Scientific American as "leading OpenAI's mathematical explorations," is the only person with a named role in the post. There is no gpt- prefix, no o-series identifier, no published model card. The official phrasing is "an internal general-purpose reasoning model." Several outlets have rendered that as "a system that wasn't built for math." Both are technically accurate. Neither tells you what the system is.

The contrast that matters is with DeepMind's AlphaProof and AlphaGeometry — purpose-built theorem provers with neuro-symbolic search loops, Lean and Isabelle integrations, and a published model card. Those systems are interesting because they are specialised. OpenAI's claim is that a generalist model, from the same family that handles code reviews and chatbot duties, can clear a bar that AlphaProof reached on IMO problems with a custom rig.

The claim is plausible only because nine mathematicians have signed an arXiv preprint saying it is true. The verification is the load-bearing part of the press release.

Timothy Gowers, Fields medallist and Rouse Ball Professor of Mathematics at Cambridge, photographed at the Joint Mathematics Meetings in Washington in January 2009.

The nine signatures

The nine-author preprint, "Remarks on the disproof of the unit distance conjecture" (arXiv:2605.20695), is signed by Noga Alon, Thomas Bloom, Timothy Gowers, Daniel Litt, Will Sawin, Arul Shankar, Jacob Tsimerman, Victor Wang, and Melanie Matchett Wood. Bloom maintains the canonical Erdős Problems registry. Wood is a 2022 MacArthur Fellow whose name has surfaced in Fields-shortlist conversation for a year. Gowers won the medal in 1998. The lineup is not a random nine.

The paper's stated aim is to provide "a short, digested, human-verified version" of the OpenAI proof. The implied aim is to convert a proof object the journals would not accept on its own — generated by an AI system, partially edited by OpenAI staff, with no public training trace — into a citation-bearing artefact mathematicians can build on.

Two details from the Scientific American write-up land harder than the rest. First, the verifying mathematicians "have not seen the AI's original output … just an edited version." Second, Melanie Matchett Wood's observation that the model failed to credit prior work that arrived at similar ideas — behaviour she said "would constitute professional malpractice if done by humans." There is something almost touching about a frontier lab producing a major proof and forgetting the references. The nine signatures are, in part, the work of putting the citations back.

Will Sawin's polynomial

The OpenAI proof originally established that for infinitely many n, there exist point configurations with at least n^(1+δ) unit-distance pairs for some δ > 0. The constant δ was implicit. Within hours, Will Sawin published arXiv:2605.20579, making the bound explicit: δ can be taken to be 0.014. Gil Kalai's blog post on the result notes that the constant was then sharpened to 0.0318 by subsequent work that, in Kalai's reading, was done without AI assistance.
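A quick back-of-envelope calculation shows how small these exponents are in practice. The sketch below asks how large n must be before the extra factor n^δ reaches a given size. It ignores all constants and the "infinitely many n" qualifier, so read it as a feel for scale, not as a statement about any specific configuration. The helper name is invented for this example.

```python
import math


def log10_n_for_gain(delta, factor):
    """Return log10 of the n at which n**delta equals `factor`."""
    return math.log10(factor) / delta


if __name__ == "__main__":
    for delta in (0.014, 0.0318):
        exponent = log10_n_for_gain(delta, 2)
        print(f"delta={delta}: n**delta reaches 2 near n = 10^{exponent:.1f}")
```

With δ = 0.014 the factor n^δ only reaches 2 at roughly n = 10^21.5, which is why the result is a statement about asymptotics, not about point sets anyone will draw.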

The best-known upper bound, proved by Joel Spencer, Endre Szemerédi, and William Trotter in 1984, is O(n^(4/3)), and the new result leaves it untouched. The gap between conjectured (n^(1+o(1))) and proven (n^(4/3)) had stood for forty years. The new lower bound does not close that gap. It closes the conjecture: whatever the right answer is, it is not n^(1+o(1)).
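For readers who want the bounds side by side, here is a compact summary in LaTeX. The notation u(n), for the maximum number of unit-distance pairs among n points in the plane, is an assumption of this summary and not taken from the papers. The first line is the classical grid bound from Erdős' 1946 paper, with c an unspecified positive constant.

```latex
% u(n): maximum number of unit-distance pairs among n points in the plane
% (notation assumed here; c > 0 is an unspecified absolute constant)
\begin{align*}
  u(n) &\ge n^{1 + c/\log\log n}
       && \text{rescaled square grid (Erd\H{o}s, 1946)} \\
  u(n) &\le n^{1 + o(1)}
       && \text{Erd\H{o}s' conjecture, now disproved} \\
  u(n) &> n^{1.014} \ \text{for infinitely many } n
       && \text{explicit exponent from Sawin} \\
  u(n) &= O\!\left(n^{4/3}\right)
       && \text{Spencer, Szemer\'edi, Trotter}
\end{align*}
```

The true growth rate sits somewhere between the third and fourth lines.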

The shape of the contribution matters here. The explicit 0.014 from Sawin and the subsequent 0.0318 are constants human mathematicians produced on top of the AI's construction. The AI's contribution is the direction — pulling in algebraic number theory from outside the standard combinatorial toolkit — not the optimised exponent. That distinction matters for how the result generalises. If the model can reliably suggest cross-field tools, the value compounds. If this was a one-off lucky retrieval, it does not.

Why this is not AlphaProof

It helps to be specific about what kind of milestone this is, because the AI-and-math headline has been written before.

AlphaProof and AlphaGeometry 2 together solved four of the six IMO problems in July 2024, a silver-medal-level score one point short of gold. AlphaProof handled the two algebra problems and the number theory problem, with its proofs checked in Lean; AlphaGeometry 2 cleared the geometry problem. Both systems were designed for the task. The category of result was: AI solves problems written for human contestants to solve under competition time limits.

The Erdős unit-distance disproof is a different category. The problem had been open for eighty years. The method is something working mathematicians had not tried. The model that produced it was not designed for theorem-proving. Daniel Litt's quote in Scientific American — "This is the unique interesting result produced autonomously by AI so far" — is doing precise work. He is not saying it is the most important AI result; he is saying it is the only one where the result is independently interesting to working mathematicians, separate from the fact that an AI produced it.

That is a different bar, and it is the first time anyone has cleared it.

What this means

Three takeaways, none of them about AGI.

The generalist-versus-specialist debate just shifted. For three years, DeepMind's neuro-symbolic theorem-proving stack was the strongest argument that frontier mathematical reasoning required specialised architecture. OpenAI has not necessarily won the architectural argument: they have not published their training method, and "we used a general reasoning model" is a claim that depends on what is now inside that model. But the public scorecard has changed. A generalist produced a result a specialist did not. Anthropic and Google will both be looking at their internal generalist evals against discrete-geometry-grade problems by next week.

Verification is becoming the bottleneck, not generation. The most resource-intensive part of this announcement was the nine senior mathematicians spending a fortnight reading the proof and writing the companion paper. That is the labour the AI did not do. If a generalist model can produce one publishable conjecture-breaker per quarter, the binding constraint on the field is no longer model capability; it is the number of senior mathematicians available to verify the output. The community will need to think about what verification looks like at scale, because the present arrangement does not scale.

The Yaroslav Shitov question is now a working question. Shitov, quoted in Kalai's blog, called the moment "the official funeral for so-called research mathematics" and asked what motivation human researchers retain when the model can do the conjecture-breaking. That is not a sentiment the celebratory press release engaged with. It is the sentiment the discipline will engage with for the next decade. Tim Gowers' reply, "AI is helping us to more fully explore the cathedral of mathematics we have built over the centuries," is the answer the discipline would like to settle on. Whether the next ten Erdős-tier conjectures fall to general-purpose models will determine whether that answer holds.

The Loop's view: this is the most legible piece of evidence yet that the post-2024 generalist reasoning models are doing something the prior generation could not. The proof is real. The verification is real. The constant is improving in human hands. What is unclear, and what the announcement did not address, is whether the result is a discovery in mathematics or a discovery about the model. Both are interesting. Only one of them generalises.
