OpenAI Model Disproves 80-Year-Old Erdős Conjecture: The Full Story of How AI Found the Counterexample

OpenAI Disproves 80-Year-Old Erdős Conjecture: Event Overview

OpenAI recently disclosed a result that has captured the attention of the mathematics community: its AI model found a counterexample to an Erdős conjecture that had stood for 80 years. The researchers involved, Alex Wei, Hongxun Wu, and wjmzbmr1, described the discovery process on the OpenAI Podcast with host Andrew Mayne.

Notably, wjmzbmr1 is the online handle of Lijie Chen, a distinguished young scholar in theoretical computer science who currently teaches at MIT, with research interests including computational complexity theory and algorithm design. Alex Wei and Hongxun Wu are likewise researchers with deep expertise in theoretical computer science and combinatorics. The composition of this team illustrates that successful AI-assisted mathematical research requires not only powerful models but also researchers with strong mathematical intuition to guide the search direction, design problem frameworks, and verify the correctness of results.

This is not only another significant breakthrough for AI in pure mathematics; it also reveals the enormous potential of human-AI collaboration in mathematical research.

Background and History of the Erdős Conjecture

Who Was Paul Erdős?

Paul Erdős was one of the most prolific mathematicians of the 20th century, publishing over 1,500 papers and proposing hundreds of conjectures throughout his lifetime. His conjectures span number theory, combinatorics, graph theory, and many other fields, and many remain neither proved nor disproved to this day. Erdős was famous for his itinerant academic lifestyle: he held no permanent academic position, instead traveling the world to collaborate with different mathematicians. This gave rise to the famous "Erdős number", which measures a scholar's collaborative distance from Erdős.

The Specific Content of the Conjecture

The Erdős conjecture disproved in this case involves covering systems, a topic in combinatorial number theory. A covering system is a finite collection of congruences such that every integer satisfies at least one of them. A classic example with distinct moduli is {0 mod 2, 0 mod 3, 1 mod 4, 5 mod 6, 7 mod 12}: every integer is "covered" by at least one of these congruences. In the 1950s, Erdős conjectured that for any positive integer N, there exists a covering system whose moduli are mutually distinct and all greater than N. The core of this conjecture explores whether there are fundamental limitations to the covering capacity of arithmetic structures. Bob Hough had previously proved a related finiteness result in 2015, but a complete counterexample construction had never been achieved.
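The covering property of the example above can be checked mechanically. The sketch below is only an illustration: the function name is_covering_system and the (residue, modulus) pair representation are assumptions made here, not anything taken from the research described. It relies on the fact that coverage repeats with period equal to the least common multiple of the moduli, so checking one full period is enough.

```python
from math import lcm

def is_covering_system(congruences):
    """Return True if every integer satisfies at least one congruence.

    `congruences` is a list of (residue, modulus) pairs. Coverage is periodic
    with period lcm(moduli), so checking 0..period-1 settles all integers.
    """
    period = lcm(*(modulus for _, modulus in congruences))
    return all(
        any(n % modulus == residue % modulus for residue, modulus in congruences)
        for n in range(period)
    )

# The classic example from the text: 0 mod 2, 0 mod 3, 1 mod 4, 5 mod 6, 7 mod 12.
EXAMPLE_SYSTEM = [(0, 2), (0, 3), (1, 4), (5, 6), (7, 12)]

print(is_covering_system(EXAMPLE_SYSTEM))       # True
print(is_covering_system(EXAMPLE_SYSTEM[:-1]))  # False: 7 is no longer covered
```

Dropping the last congruence leaves the residue 7 (mod 12) uncovered, which shows that each congruence in this small system is doing real work.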

Why It Remained Unsolved for 80 Years

That an 80-year-old conjecture was overturned means that generations of mathematicians had found neither a proof nor a counterexample. Such problems often require finding specific mathematical structures within extremely large search spaces, a setting in which AI models can be effective: through large-scale computation and pattern recognition, they can surface critical clues in spaces that humans cannot exhaustively explore. For covering systems, the number of ways to assign one residue class to each modulus is the product of the moduli, so the search space grows exponentially with the number of moduli. This makes manual construction or exhaustive verification practically impossible and leaves room for AI-guided search.
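To make the growth of the search space concrete, here is a rough sketch under a simplifying assumption: a candidate is one residue class per modulus, so a fixed list of moduli admits as many candidates as the product of those moduli. The helper name residue_assignments is hypothetical, and the count ignores the further choice of which moduli to use, which only makes the real space larger.

```python
from math import prod

def residue_assignments(moduli):
    """Count the ways to choose one residue class for each modulus."""
    return prod(moduli)

# The five moduli of the classic small example.
print(residue_assignments([2, 3, 4, 6, 12]))  # 1728

# Consecutive distinct moduli starting at 3: the count explodes quickly.
for k in (5, 10, 20):
    moduli = range(3, 3 + k)
    print(k, residue_assignments(moduli))
```

Even 20 consecutive moduli already give far more assignments than could be checked one by one, before any requirement that all moduli exceed a large N is imposed.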

How AI Found the Counterexample to the Erdős Conjecture

The Human-AI Collaborative Research Model

According to the research team's account on the podcast, this discovery was not accomplished by AI alone but was the result of collaboration between mathematicians and the model. Researchers provided the model with a formalized description of the problem and search directions, while the model leveraged its reasoning and search capabilities to locate a counterexample satisfying the required conditions within the candidate space.

This collaborative model reflects the typical paradigm of current AI-assisted mathematical research:

Humans are responsible for: problem selection, formalization, and verifying the correctness of counterexamples
AI is responsible for: large-scale search, pattern discovery, and candidate generation
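The division of labor listed above can be mimicked in a toy generate-and-verify loop. This is a sketch under stated assumptions, not the method the OpenAI team used: plain exhaustive enumeration stands in for the model's candidate generation, and the names propose_candidates, verify, and search are hypothetical. The point is only that proposals may come from any source, while acceptance depends on an exact, independent check.

```python
from itertools import product
from math import lcm

def verify(congruences):
    """Exact check: does every integer satisfy at least one (residue, modulus)?"""
    period = lcm(*(modulus for _, modulus in congruences))
    return all(
        any(n % modulus == residue for residue, modulus in congruences)
        for n in range(period)
    )

def propose_candidates(moduli):
    """Stand-in for the 'AI' side: yield every residue assignment in turn."""
    for residues in product(*(range(m) for m in moduli)):
        yield list(zip(residues, moduli))

def search(moduli):
    """Return the first proposed candidate that passes verification, else None."""
    for candidate in propose_candidates(moduli):
        if verify(candidate):
            return candidate
    return None

found = search([2, 3, 4, 6, 12])
print(found)           # some covering system with these moduli
print(search([2, 3]))  # None: moduli 2 and 3 alone can never cover the integers
```

In a realistic setting the generator would be a guided search rather than brute force, but the verifier would keep the same role: a candidate counts only if it passes the exact check.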

Technical Mechanisms of Large Language Models

The application of large language models (LLMs) to mathematical problems relies on a combination of technical capabilities. The first is chain-of-thought reasoning, in which the model simulates mathematical derivation by decomposing a problem step by step. The second is the combination of reinforcement learning and search, loosely comparable to Monte Carlo tree search in AlphaGo, which lets the model explore candidate solution spaces heuristically. OpenAI's o-series models in particular strengthen reasoning by allocating more computation at inference time (test-time compute) to improve performance on complex tasks. This approach lets models home in on target structures within combinatorially explosive search spaces rather than simply enumerating all possibilities.

Why This Discovery Matters

Finding a counterexample to a long-standing conjecture is significant not only for overturning a specific proposition but also for:

Re-examining related theories: The existence of a counterexample forces mathematicians to reconsider the intuitions behind the conjecture and the boundary conditions of related theorems. For covering systems, this means that the covering capacity of arithmetic structures indeed has certain inherent limitations that were not previously fully recognized.
Validating AI's mathematical reasoning capabilities: This suggests that large language models can do more than symbolic computation and possess a certain degree of mathematical creativity.
Opening new research methodologies: This provides new paths for tackling more unsolved mathematical problems.

Development Trends of AI in Mathematical Research

In recent years, AI's role in mathematical research has been rapidly evolving. From DeepMind's AlphaGeometry solving geometry theorems to OpenAI's model overturning classical conjectures, we are witnessing the dawn of a new era.

AlphaGeometry and Milestones in AI Mathematics

DeepMind's AlphaGeometry system, released in early 2024, can solve geometry proof problems at the International Mathematical Olympiad level: it solved 25 of the 30 problems in its benchmark, approaching the average performance of a gold medalist. The system combines a neural language model with a symbolic reasoning engine: the neural network proposes auxiliary constructions (such as adding auxiliary lines), while the symbolic engine handles rigorous logical deduction. This neural-symbolic hybrid architecture represents an important paradigm for AI mathematical reasoning. In comparison, OpenAI's achievement focuses more on combinatorial search and counterexample construction, suggesting that language models can contribute to mathematical breakthroughs without relying on specialized symbolic systems. The parallel development of both approaches signals a trend toward diversification of AI mathematical research tools.

From Computational Tool to Mathematical Collaborator

Traditionally, computers have primarily served as verification and computation tools in mathematics. In 1976, Kenneth Appel and Wolfgang Haken used computers to complete the proof of the Four Color Theorem, the first major theorem whose proof relied essentially on computer verification. The proof reduced the problem to checking, one by one, the reducibility of 1,936 configurations forming an unavoidable set, which took over a thousand hours of computer time. This event sparked philosophical debates within the mathematics community about "what constitutes a valid proof." Subsequently, problems like the Kepler Conjecture (whose proof was formally verified in 2014 by a team led by Thomas Hales through the Flyspeck project) further established computers' role in mathematical proofs.

Current large language models are transitioning toward the role of "mathematical collaborators"—they not only execute computations but can also propose hypotheses, construct counterexamples, and even inspire new proof strategies. This represents a qualitative leap from passive verification to active discovery, marking a fundamental transformation in the role of computers in mathematical research.

Current Challenges and Limitations

Of course, AI's application in mathematics still faces challenges:

Model outputs require rigorous human verification: AI-generated proofs or counterexamples may contain subtle errors and must undergo strict scrutiny by mathematicians. Formal verification tools (such as proof assistants like Lean and Coq) play an increasingly important role in this process.
AI's capabilities remain limited for problems requiring deep conceptual innovation: Current models excel at searching and combining within existing frameworks, but for problems requiring entirely new mathematical concepts or paradigm breakthroughs (such as the Riemann Hypothesis or P vs NP among the Millennium Problems), AI has not yet demonstrated sufficient creativity.
Interpretability issues: AI can find counterexamples but cannot necessarily explain "why" the counterexample works, nor can it provide the underlying mathematical intuition. This limits the impact of AI discoveries on the advancement of mathematical theory.
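As a small illustration of the role proof assistants play in the verification step mentioned above, the covering property of the classic example {0 mod 2, 0 mod 3, 1 mod 4, 5 mod 6, 7 mod 12} can be stated and checked in Lean. This is a sketch that assumes a recent Lean 4 toolchain in which the decide tactic can discharge bounded quantification over natural numbers; it covers only the residues 0 through 11, which suffices because 12 is the least common multiple of the moduli. It is not part of the verification carried out for the result discussed in this article.

```lean
-- Every residue modulo 12 satisfies at least one of the five congruences,
-- so every integer is covered (12 is the lcm of the moduli 2, 3, 4, 6, 12).
example : ∀ n : Nat, n < 12 →
    n % 2 = 0 ∨ n % 3 = 0 ∨ n % 4 = 1 ∨ n % 6 = 5 ∨ n % 12 = 7 := by
  decide
```

A machine-checked statement of this kind removes the possibility of an arithmetic slip in the check itself, although it says nothing about why the construction works.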

Conclusion: A New Phase in AI-Assisted Mathematical Research

This event marks a new phase in AI-assisted mathematical research. When an AI model finds a counterexample to a conjecture that remained unsolved for 80 years, it not only rewrites the fate of a mathematical proposition but also heralds profound changes in the future form of mathematical research. Collaboration between mathematicians and AI may become a key pathway for tackling more century-old problems.

Looking ahead, as model reasoning capabilities continue to improve, formal mathematics tools mature, and human-AI collaboration paradigms develop, we have reason to expect AI to produce breakthrough results in more branches of mathematics. Among the hundreds of conjectures Erdős proposed throughout his lifetime, perhaps many more will be proved or disproved with AI assistance—which would be the finest tribute to this great mathematician.