Codex in Action: One Prompt, 47 Minutes, a Complete Algorithm Research Paper

A researcher demonstrated that OpenAI's Codex agent can produce a complete algorithm paper (working PSO code, publication-quality figures, and a 31-page LaTeX manuscript) in under 48 minutes from a single detailed prompt. The author has already submitted a similar AI-assisted paper to a top-tier journal, where it is now under review, which suggests that AI-assisted research has reached practical viability.
Experiment Overview: A Complete Algorithm Paper Without Writing a Single Line of Code
A researcher recently shared a remarkable experiment: by crafting a single detailed prompt, OpenAI's Codex autonomously produced a complete algorithm research paper draft — including working code, figures, and a full LaTeX manuscript — in just 47 minutes and 51 seconds.
OpenAI Codex is a code generation system built on the GPT large language model architecture, originally released in 2021 as the engine powering GitHub Copilot. In 2025, OpenAI launched an entirely new agentic version of Codex that goes far beyond code completion — it can autonomously execute multi-step tasks within a cloud sandbox environment. The new Codex reads entire repository contexts, plans task sequences, executes code, runs tests, generates files, and submits results as Pull Requests. This leap from "completion tool" to "autonomous agent" is what enables it to handle the complex end-to-end task described here.
The paper's topic was highly complex: "Joint Robust Location-Inventory-Routing Optimization for Fresh Products Considering Freshness Decay Suppression and Time-Varying Refrigeration Emission Reduction Under Dual Carbon Tax and Cap-and-Trade Policies," solved using a Particle Swarm Optimization (PSO) algorithm. Throughout the entire process, the author wrote zero lines of code — everything was generated automatically by Codex.
Technical Background of the Core Problem
The Location-Inventory-Routing Problem (LIRP) is a classic NP-hard problem in supply chain optimization. It integrates three traditionally independent sub-problems (facility location, inventory management, and vehicle routing) into a single joint optimization model. Solving them jointly avoids the suboptimal solutions that sequential decision-making produces, but the joint solution space grows combinatorially with problem size. In cold-chain logistics for fresh products, the model must additionally account for time-dependent freshness decay, carbon emissions from refrigerated transport, and policy constraints such as carbon taxes and cap-and-trade systems, which further increases the problem's dimensionality and the number of constraints.
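To make the time-dependent freshness term concrete, here is a minimal sketch that assumes first-order (exponential) decay. This is a common modeling choice, but the source does not say which decay mechanism the generated paper uses; the function name, the decay rate, and the arrival times are all hypothetical.

```python
import math


def freshness(initial: float, decay_rate: float, hours: float) -> float:
    """Freshness left after `hours`, ASSUMING exponential decay (not confirmed by the source)."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return initial * math.exp(-decay_rate * hours)


# Hypothetical route: cumulative arrival times (in hours) at three customers.
arrival_hours = [1.5, 4.0, 7.5]
levels = [freshness(1.0, 0.05, t) for t in arrival_hours]

for t, level in zip(arrival_hours, levels):
    print(f"t = {t:4.1f} h   freshness = {level:.3f}")
```

Because freshness depends on arrival time, the visiting order chosen by the routing sub-problem feeds directly into product quality at each customer, which is one reason the three sub-problems are solved jointly.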
Carbon Tax and Cap-and-Trade are the two primary economic policy instruments for combating climate change globally. A carbon tax imposes a fixed fee per unit of carbon emission, providing price certainty but uncertain total emissions. Cap-and-trade sets an overall emission cap and allows companies to trade allowances on the market, providing quantity certainty but volatile prices. In supply chain optimization research, dual-policy scenarios mean enterprises must both pay carbon taxes and comply with hard emission caps, requiring models to simultaneously handle linear cost terms (tax) and hard constraints (cap), significantly increasing model complexity.
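The difference between a linear cost term and a hard constraint can be illustrated with a small sketch. The function below is hypothetical and not taken from the generated code: with no allowance price it treats the cap as a hard limit (any plan above it is infeasible), and with an allowance price it treats the gap to the cap as tradable. All numbers are made-up placeholders.

```python
def dual_policy_cost(emissions_t, tax_per_t, cap_t, allowance_price=None):
    """Carbon-related cost of a plan emitting `emissions_t` tonnes (illustrative only).

    The tax is a linear term paid on every tonne. If `allowance_price` is None,
    the cap is a hard constraint and exceeding it returns infinity (infeasible).
    Otherwise the gap to the cap is bought (positive) or sold (negative).
    """
    tax = tax_per_t * emissions_t
    if allowance_price is None:
        return tax if emissions_t <= cap_t else float("inf")
    return tax + allowance_price * (emissions_t - cap_t)


# Placeholder values: tax of 50 per tonne, cap of 100 tonnes, allowance price of 80.
print(dual_policy_cost(80, 50, 100))        # under the hard cap: tax only
print(dual_policy_cost(120, 50, 100))       # over the hard cap: infeasible
print(dual_policy_cost(120, 50, 100, 80))   # over the cap, allowances bought
print(dual_policy_cost(80, 50, 100, 80))    # under the cap, surplus sold
```

In a metaheuristic such as PSO, the infeasible case is usually handled by a penalty or a repair step rather than an infinite cost, so this function is best read as a statement of the policy logic rather than as a fitness function.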
Robust Optimization is a mathematical programming approach for handling uncertainty. Unlike stochastic programming, it doesn't require precise probability distributions for uncertain parameters. Instead, it assumes each parameter varies within a given uncertainty set and seeks a solution that stays feasible for every realization in that set and has the best worst-case objective value. In supply chain management, demand fluctuations, transport time variability, and uncertain freshness decay rates are all well-suited to robust optimization frameworks.
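A toy example may help show what "best worst case" means. The sketch below is not the paper's model: it is a hypothetical single-item ordering problem in which demand is only known to lie in an interval, and the cost coefficients are invented for illustration.

```python
def cost(q, d, unit=2.0, shortage=10.0, holding=1.0):
    """Cost of ordering q units when demand turns out to be d (illustrative coefficients)."""
    return unit * q + shortage * max(d - q, 0) + holding * max(q - d, 0)


def worst_case_cost(q, d_lo, d_hi):
    """Largest cost over the interval uncertainty set [d_lo, d_hi].

    The cost is convex in d, so its maximum over an interval is at an endpoint.
    """
    return max(cost(q, d_lo), cost(q, d_hi))


def robust_order(d_lo, d_hi):
    """Integer order quantity in [d_lo, d_hi] with the smallest worst-case cost."""
    return min(range(d_lo, d_hi + 1), key=lambda q: worst_case_cost(q, d_lo, d_hi))


q_star = robust_order(80, 120)
print(q_star, worst_case_cost(q_star, 80, 120))
```

No probability distribution appears anywhere: the only input about demand is the interval itself, which is the defining difference from stochastic programming.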
Prompt Engineering: The Details Make All the Difference
Project Architecture Design
The author shared an extremely detailed prompt specification covering these core dimensions:
- Overall research objective: Clear paper direction, with writing style aligned to the journal Systems Engineering — Theory & Practice
- Project directory structure: Pre-designed complete folder and file hierarchy
- Code implementation requirements: Based on Python 3.10+, with dependencies like Gurobi pre-installed
- Model implementation details: Network structure, decision variables, cost functions, carbon emission constraints, freshness decay mechanisms
- Algorithm design specifications: HRDC-PSO algorithm encoding/decoding, repair mechanisms, local search, adaptive strategies
Prompt Engineering has evolved from simple instruction writing into a systematic discipline of AI interaction design. Advanced prompt engineering involves task decomposition (breaking complex goals into executable subtasks), context management (providing the most relevant background information within token limits), output format control (constraining generation results through structured templates), and Chain-of-Thought guidance. The author's prompt was essentially a comprehensive system design document containing architecture specifications, interface definitions, quality standards, and acceptance criteria. This engineering-oriented approach to prompt design is becoming a core skill of the AI era.
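One way to keep a specification of this size manageable is to store it as structured data and render it into a prompt. The sketch below shows only the layout idea: the section titles follow the dimensions listed above, but the bullet text is paraphrased from this article, and neither the `SPEC` dictionary nor `render_prompt` reproduces the author's actual prompt.

```python
# Hypothetical specification; the author's real prompt is not reproduced here.
SPEC = {
    "Overall research objective": [
        "State the paper direction and match the target journal's writing style",
    ],
    "Project directory structure": [
        "Create the pre-designed folder and file hierarchy",
    ],
    "Code implementation requirements": [
        "Use Python 3.10+",
    ],
    "Model implementation details": [
        "Network structure, decision variables, cost functions",
        "Carbon emission constraints and freshness decay mechanisms",
    ],
    "Algorithm design specifications": [
        "Encoding/decoding, repair mechanisms, local search, adaptive strategies",
    ],
}


def render_prompt(spec: dict[str, list[str]]) -> str:
    """Render numbered sections with indented bullet requirements."""
    lines = []
    for number, (title, items) in enumerate(spec.items(), start=1):
        lines.append(f"{number}. {title}")
        lines.extend(f"   - {item}" for item in items)
    return "\n".join(lines)


prompt = render_prompt(SPEC)
print(prompt)
```

Keeping the specification as data makes it easy to revise one section and regenerate the prompt, which suits the segmented, iterative workflow the author describes for real submissions.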
Paper Specification Requirements
On the writing side, the prompt was equally meticulous:
- Chinese-language paper, approximately 10,000 characters, 31 pages
- No fewer than 50 references, prioritizing authoritative journals in operations research and management
- Figures following Nature journal standards, using Nature scale styling
- 10 figures and 17 tables in the main text, plus supplementary figures
- Complete LaTeX output format
LaTeX is a document preparation system developed by Leslie Lamport on top of Donald Knuth's TeX typesetting engine. It's the standard typesetting tool for academic papers in mathematics, physics, computer science, and engineering. Unlike WYSIWYG editors like Word, LaTeX uses a markup language approach — authors write source code to define document structure and content, which a compiler transforms into final PDF output. Its advantages include beautiful mathematical formula typesetting, automated reference management, and strong cross-referencing consistency. Major journals typically provide official LaTeX templates. AI-generated LaTeX source code means output can be directly compiled into journal-format-compliant manuscripts.
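Because LaTeX is plain source text, experiment code can write tables straight into the manuscript. The helper below is a minimal sketch of that idea, not the code Codex generated: `to_latex_table` is a hypothetical name, the rows are placeholders rather than results from the paper, and cell text is not escaped for LaTeX special characters.

```python
def to_latex_table(header, rows, caption, label):
    """Build a simple LaTeX table; first column left-aligned, the rest right-aligned."""
    col_spec = "l" + "r" * (len(header) - 1)
    lines = [
        r"\begin{table}[htbp]",
        r"\centering",
        rf"\caption{{{caption}}}",
        rf"\label{{{label}}}",
        rf"\begin{{tabular}}{{{col_spec}}}",
        r"\hline",
        " & ".join(header) + r" \\",
        r"\hline",
    ]
    for row in rows:
        lines.append(" & ".join(str(cell) for cell in row) + r" \\")
    lines += [r"\hline", r"\end{tabular}", r"\end{table}"]
    return "\n".join(lines)


# Placeholder rows only; these are not values from the generated paper.
table = to_latex_table(
    header=["Instance", "Cost", "Runtime (s)"],
    rows=[("A", 100, 1.5), ("B", 200, 3.0)],
    caption="Placeholder results",
    label="tab:placeholder",
)
print(table)
```

The returned string can be written to a `.tex` file and pulled into the main document with `\input`, so rerunning an experiment updates the table without manual editing.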

Results: A Surprisingly Complete Output
Code Layer
Codex created a complete project file structure in one pass, including:
- Main controller code (Python entry point)
- Data generation module (automatic simulation data creation)
- Core algorithm implementation (each algorithm module as a separate file)
- Numerical simulation code
- Experimental design code
- Visualization and plotting code
- Parameter settings and configuration files
The author specifically noted that the most impressive aspect was that the entire algorithm codebase actually ran successfully — from base model encoding, data processing, and repair mechanisms to solution output — forming a complete working pipeline.
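The source does not describe how the generated code encodes solutions or repairs infeasible ones. As a rough illustration of what such a pipeline step can look like, the sketch below uses random-key decoding (a well-known technique, swapped in here as an assumption) to turn a real-valued vector into a customer visiting order, then splits that order into routes that respect a vehicle capacity. All names and numbers are hypothetical.

```python
def decode_order(position):
    """Random-key decoding: customers are visited in ascending order of their key."""
    return sorted(range(len(position)), key=lambda i: position[i])


def split_routes(order, demand, capacity):
    """Cut the visiting order into routes so that no route exceeds the vehicle capacity."""
    routes, current, load = [], [], 0
    for customer in order:
        if demand[customer] > capacity:
            raise ValueError(f"customer {customer} alone exceeds the capacity")
        if current and load + demand[customer] > capacity:
            routes.append(current)      # close the full route and start a new one
            current, load = [], 0
        current.append(customer)
        load += demand[customer]
    if current:
        routes.append(current)
    return routes


position = [0.62, 0.11, 0.95, 0.40, 0.27]   # one real-valued key per customer
demand = [4, 3, 5, 2, 6]                    # placeholder demands
order = decode_order(position)
routes = split_routes(order, demand, capacity=10)
print(order, routes)
```

Every real-valued vector decodes to a capacity-feasible set of routes, so the optimizer can search a continuous space while the decoder guarantees feasibility for this one constraint.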
Particle Swarm Optimization (PSO) is a swarm intelligence metaheuristic inspired by bird flocking behavior, proposed by Kennedy and Eberhart in 1995. Each "particle" represents a candidate solution in the search space and updates its velocity and position by tracking its personal best position and the swarm's global best position. PSO is widely applied to continuous optimization problems because it is simple to implement, has few parameters, and converges quickly. Since standard PSO moves particles through a continuous space, applying it to a combinatorial problem such as LIRP requires an encoding/decoding scheme and a way to repair infeasible solutions. The HRDC-PSO in this paper is an enhanced variant that adds hierarchical repair mechanisms, dynamic constraint handling, and adaptive strategies on top of standard PSO, specifically designed for combinatorial optimization problems with multiple constraints. That Codex produced a running implementation of such a variant suggests it has absorbed the common design patterns of metaheuristic algorithms.
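The velocity and position update at the heart of PSO fits in a few dozen lines. The sketch below is a plain global-best PSO minimizing the sphere function, using only the standard library. It is not HRDC-PSO and not the code Codex generated; the parameter values (`w`, `c1`, `c2`, swarm size, iteration count) are conventional choices assumed for illustration.

```python
import random


def sphere(x):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim, bounds, n_particles=30, iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best so far

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # move, then clamp the coordinate back into the search bounds
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


best, best_val = pso(sphere, dim=3, bounds=(-5.0, 5.0))
print(best, best_val)
```

A variant for a constrained combinatorial problem would keep this loop and replace the direct call to `objective` with decoding, repair, and cost evaluation of the decoded solution.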

Paper Layer
The generated 31-page manuscript contained a complete academic paper structure:
- Abstract and keywords
- Introduction and literature review (with comparative literature tables)
- Problem description and notation
- Model formulation (objective function, constraints, freshness decay mechanism, carbon policy mechanism)
- Algorithm design (with algorithm flowcharts)
- Numerical experiments and results analysis
- Conclusions
Figure and Table Quality
The generated figures and tables were generally usable, with some already at publication quality. Issues mainly centered on:
- Occasional misalignment of figure elements
- Minor layout glitches when combining ABCD subplot panels
- Complex visualizations like heatmaps requiring fine-tuning
The author's assessment: "With proper polishing, these are already good enough for publication."

Comparison and Reflections: Where Are the Boundaries of AI-Assisted Research?
Comparison with Previous Experiment
| Dimension | Previous Experiment | This Experiment |
|---|---|---|
| Runtime | 15-20 minutes | 47 minutes 51 seconds |
| Complexity | Basic demonstration | Multi-constraint joint optimization |
| Output Quality | Rough | Significantly deeper theoretical content |
Feasibility for Actual Publication
The author also described a real submission: using a similar method (segmented prompts plus iterative refinement), they have submitted a paper to Computers & Industrial Engineering (Chinese Academy of Sciences Zone 1 TOP journal), and it is currently under external review. The first paper took about two weeks; once the workflow is established, the estimated turnaround is one paper per week.
Computers & Industrial Engineering is a top journal in industrial engineering and operations management published by Elsevier. It holds a CAS Zone 1 TOP ranking with an impact factor consistently between 6 and 8, and primarily publishes high-quality research in operations optimization, production scheduling, supply chain management, and intelligent manufacturing. Reaching external review means the manuscript passed the editor's initial screening, which indicates that AI-assisted papers can already meet a considerable threshold for academic rigor and novelty, although the paper has not yet been accepted.
Current Limitations
- Limited depth in single-pass generation: While theoretical depth has improved, segmented prompting with iterative refinement still produces better results
- Figures require manual adjustment: AI-generated figures often have minor positioning and detail issues, making them unsuitable for direct submission
- Domain expertise is still essential: Designing the prompt itself requires deep understanding of the research area — this is far from "zero barrier to entry"

Conclusion: AI-Assisted Research Has Entered the Practical Stage
This experiment clearly demonstrates that current AI coding tools like Codex have reached a remarkably practical level for algorithm-focused paper writing. The key question isn't whether AI can fully replace researchers, but how to collaborate efficiently:
- Prompt engineering is a core competitive advantage: Detailed, structured requirement descriptions directly determine output quality — writing good prompts is itself a professional skill
- Human-AI collaboration is more efficient: AI handles code implementation and draft writing; humans handle direction-setting and quality calibration
- Research efficiency gains are order-of-magnitude: From the traditional timeline of several months down to one or two weeks
For researchers, learning to collaborate with AI and mastering prompt engineering may hold more strategic value than simply improving programming skills. It's worth noting that this trend doesn't eliminate scientific creativity — true innovation still comes from problem insight, methodological breakthroughs, and critical interpretation of results. What AI tools change is the execution efficiency of converting creative ideas into publishable outputs. Future research competitiveness will increasingly depend on whether researchers can find optimal task division in human-AI collaboration — letting AI handle repetitive coding, typesetting, and formatting work while concentrating human cognitive resources on the tasks that genuinely require creativity and judgment.