Codex Hands-On from Scratch: Building a Full-Stack World Cup App Without Writing Code

Introduction: Let AI Write Code, Watch Football, and Commentate for You

With the 2026 World Cup in full swing, one developer decided to stop collecting match data by hand. Instead, they used OpenAI's Codex to build a complete World Cup application from scratch in just two days: a full-stack website, a database, a desktop pet, voice commentary, scheduled tasks, and custom Skills. The author wrote no code by hand; every step was driven by prompts.

OpenAI Codex is a cloud-based AI coding agent launched in 2025, powered by the codex-1 model (a coding model specifically optimized through reinforcement learning). It can autonomously execute multi-step software engineering tasks in a cloud sandbox environment. Unlike traditional code completion tools, Codex functions more like an independent software engineer — it can read entire code repositories, understand project structure, write code, run tests, and submit results. Its core advantage lies in parallel task processing, with each task running in an isolated sandbox without interference.

This article provides a complete breakdown of how this system was built, covering Codex's multi-session concurrency, MCP service setup, Skill encapsulation, and scheduled task automation — perfect for anyone looking to boost development efficiency with AI.

Building a Full-Stack Framework with a Single Prompt

Universal Template + Sub-Agent Concurrency

The project started with a carefully designed prompt. It looks lengthy, but the author filled in a universal template that is reusable across projects, so writing it took only about two minutes. The prompt spelled out the frontend tech stack, the backend framework, the database structure, and other key details, giving the model enough information to generate the whole framework in one pass.
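A reusable prompt template of this kind can be sketched in a few lines. The article does not reproduce the author's template or name the tech stack, so every field name and every value below is a hypothetical placeholder; only the closing sentence is taken from the request the author describes.

```python
from string import Template

# Hypothetical reusable template; the field names are illustrative assumptions,
# not the author's actual template.
PROJECT_PROMPT = Template(
    "Build a full-stack web application: $goal\n"
    "Frontend: $frontend\n"
    "Backend: $backend\n"
    "Database: $database\n"
    "UI components: $components\n"
    "Please fully implement all code and design."
)


def render_prompt(**fields: str) -> str:
    """Fill the template. substitute() raises KeyError if a field is missing,
    so an incomplete prompt is never sent by accident."""
    return PROJECT_PROMPT.substitute(**fields)


# Placeholder values for illustration only.
prompt = render_prompt(
    goal="a World Cup results site",
    frontend="React",
    backend="FastAPI",
    database="SQLite with a matches table",
    components="match list, match detail, standings",
)
print(prompt)
```

Failing loudly on a missing field is the useful property here: a one-pass generation depends on the prompt being complete before it is submitted.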

Key steps included:

Setting the thinking intensity (reasoning effort) to its highest level, since the entire full-stack framework needed to be completed in one go
Enabling Codex's built-in Product Design mode, whose bundled design specifications remove the need to spell out many UI constraints in the prompt
Explicitly requesting "please fully implement all code and design"

After execution, Codex automatically created three sub-agents: one fetched current World Cup data, one fetched data from the previous two World Cups, and one handled design work. Meanwhile, the main task continued writing code and building data structures. Running data collection and code development concurrently saved a substantial amount of time.

Sub-agents are parallel work units that Codex automatically splits off when executing complex tasks. When the main task is sufficiently complex, Codex determines which subtasks can be executed independently, then spawns multiple sub-agents to work simultaneously. Each sub-agent has its own execution environment and context window, reporting results back to the main task for integration upon completion. This mechanism is similar to multi-threaded concurrency in software engineering, but applied at the AI reasoning and code generation level, allowing work that would otherwise require serial execution to be dramatically compressed in time.
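The concurrency analogy can be made concrete with a small asyncio sketch. This is not how Codex implements sub-agents internally; it only illustrates the pattern of a main task continuing its own work while three independent units run and report back. The `sub_agent` function and the sleep durations are stand-ins.

```python
import asyncio


async def sub_agent(name: str, seconds: float) -> str:
    """Stand-in for an independent unit of work with its own context."""
    await asyncio.sleep(seconds)  # simulates fetching data or designing
    return f"{name}: done"


async def main_task() -> list:
    # The main task's own work runs while the sub-agents are in flight.
    own_work = sub_agent("main task (code and data structures)", 0.02)
    sub_agents = [
        sub_agent("current World Cup data", 0.03),
        sub_agent("previous two World Cups data", 0.02),
        sub_agent("design work", 0.01),
    ]
    # gather() runs everything concurrently and returns results
    # in argument order, ready for the main task to integrate.
    return await asyncio.gather(own_work, *sub_agents)


results = asyncio.run(main_task())
for line in results:
    print(line)
```

The total wall-clock time is close to the slowest unit rather than the sum of all four, which is the time compression the paragraph above describes.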

The final output was over 2,000 lines of code in 34 minutes. Every UI component specified in the prompt was delivered, and the color schemes and layouts were impressive.

Creating the Desktop Pet "Football Cat"

Multi-Session Parallelism for Efficiency

Codex supports opening multiple independent sessions within the same project simultaneously, each with its own context. While the main session was still building the website framework, the author opened a second session to create the desktop pet.

This pet, named "Football Cat" (懂球喵), has a delightful personality:

Cute and soft on the outside, sharp-tongued and snarky on the inside
Loves eating chips, can stay up watching matches until dawn
Signature move: eating chips while watching football

Using Codex's built-in pet creation skill, the pet was completed in about 28 minutes. Codex automatically generated a JSON character definition file and a sprite sheet with multiple animation sequences including running left, running right, jumping, and eating chips.

A sprite sheet is a classic technique in game and animation development that packs all of a character's animation frames into a single large image. At runtime, the program shifts the offset of the displayed region to play the animation frame by frame. Compared with loading many individual image files, a sprite sheet needs only one resource load to make every frame available, which reduces resource requests and memory fragmentation. In desktop pet scenarios, sprite sheets typically contain multiple animation sequences for idle, walking, jumping, and special actions, each consisting of several equally sized frames.
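The offset arithmetic behind a sprite sheet is simple enough to show directly. The sketch below assumes one animation per row and 128-pixel square frames; the row order and frame size are illustrative assumptions, not the layout Codex actually generated.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    w: int
    h: int


# Assumed layout: each animation occupies one row of the sheet.
ROWS = {"run_left": 0, "run_right": 1, "jump": 2, "eat_chips": 3}


def frame_rect(row: int, frame: int, frame_w: int, frame_h: int) -> Rect:
    """Region of the sheet to display for a given row and frame index."""
    return Rect(frame * frame_w, row * frame_h, frame_w, frame_h)


def animation_frames(name: str, frame_count: int,
                     frame_w: int = 128, frame_h: int = 128) -> list:
    """All display regions for one animation, in playback order."""
    row = ROWS[name]
    return [frame_rect(row, i, frame_w, frame_h) for i in range(frame_count)]


print(animation_frames("jump", 3))
```

Because every frame has the same size, a cropping offset problem like the one mentioned below usually means a frame was drawn slightly outside its cell rather than that the arithmetic is wrong.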

The pet isn't confined to the Codex window — it can be freely dragged across the entire desktop and bound to keyboard shortcuts (like Ctrl+Q) for quick show/hide. While a few sprite sheet frames have minor cropping offset issues, the overall effect is excellent.

MCP Local Voice Synthesis: Making the Cat Talk

Building an MCP Service for TTS

To make Football Cat actually "commentate" on matches, voice synthesis capability was needed first. The author chose a fully local solution: creating an MCP service that lets Codex call the Kokoro model for TTS synthesis.

What is MCP? MCP (Model Context Protocol) is an open standard that Anthropic introduced and open-sourced in late 2024 to give large language models a unified interface for calling external tools. It uses a client-server architecture: MCP servers expose specific capabilities (such as file operations, API calls, or database queries), and MCP clients (the AI applications) talk to those servers using JSON-RPC 2.0 messages. MCP addresses the N×M integration problem, in which every AI application must integrate separately with every external tool, by reducing it to N+M standardized connections. Major AI platforms, including OpenAI and Google, now support MCP. In this project, the author had Codex write an MCP-compliant server so that the model behind Codex knows how to call the local voice synthesis service.
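To show what the JSON-RPC exchange looks like, here is a minimal sketch of a `tools/call` request and a toy handler that answers it. The message shapes follow the MCP tool-calling convention, but the tool name `synthesize_speech`, its `text` argument, and the stubbed return value are assumptions for illustration; the author's actual server, transport, and error handling are not shown in the article and are omitted here.

```python
import json


def synthesize_speech(text: str) -> str:
    """Stand-in for the real TTS call; returns a description instead of audio."""
    return f"audio for: {text}"


# Hypothetical tool registry exposed by the server.
TOOLS = {"synthesize_speech": synthesize_speech}


def handle(raw: str) -> str:
    """Answer one JSON-RPC 2.0 request (unknown-tool handling omitted)."""
    req = json.loads(raw)
    if req.get("method") != "tools/call":
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), "error": error})
    params = req["params"]
    output = TOOLS[params["name"]](**params["arguments"])
    result = {"content": [{"type": "text", "text": output}], "isError": False}
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})


request = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "synthesize_speech",
               "arguments": {"text": "Goal for Germany!"}},
})
response = json.loads(handle(request))
print(response["result"]["content"][0]["text"])
```

In practice an MCP SDK would handle the framing, tool listing, and transport; the point of the sketch is only that the client sends a named tool plus arguments and receives structured content back.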

Kokoro is an open-source, lightweight text-to-speech (TTS) model with only about 82 million parameters that nonetheless produces speech close to commercial quality. It supports multiple languages and voice styles, and inference is fast enough to run in real time on an ordinary CPU without GPU acceleration. Kokoro builds on the StyleTTS 2 architecture. Because it runs fully locally, there are no network requests, no API fees, and no risk of leaking private data, which makes it a good fit as the voice synthesis solution for a personal project.

Once configured, simply telling Codex to "synthesize speech" triggers the MCP to complete TTS automatically.

The Core Difference Between MCP and Skill

Here's a practical rule of thumb:

MCP: If the capability you want to add to the model can be fully implemented through code (e.g., TTS synthesis, database operations, file I/O), make it an MCP
Skill: If the task can't be accomplished with code alone and also requires passing your experience and execution workflow to the model, make it a Skill

A key rule: Skills can call MCPs, but MCPs cannot call Skills in reverse. The design logic is clear — MCP is the underlying capability layer providing atomic tool calls; Skill is the upper orchestration layer defining when, how, and in what order to invoke these underlying capabilities.
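The layering rule can be sketched as two plain Python layers. The tool functions stand in for MCP tool calls, and the hard-coded commentary line stands in for text the model would generate; all names are hypothetical. The point is the direction of the dependency: the Skill receives the tools, and the tools know nothing about the Skill.

```python
from typing import Any, Callable, Dict

ToolCall = Callable[..., Any]


# Capability layer: atomic tools (stand-ins for MCP tools).
def fetch_result() -> dict:
    return {"home": "Spain", "away": "Cape Verde",
            "home_goals": 0, "away_goals": 0}


def synthesize_speech(text: str) -> str:
    return f"audio<{text}>"


MCP_TOOLS: Dict[str, ToolCall] = {
    "fetch_result": fetch_result,
    "synthesize_speech": synthesize_speech,
}


# Orchestration layer: the Skill decides when and in what order to call tools.
def commentary_skill(tools: Dict[str, ToolCall]) -> str:
    match = tools["fetch_result"]()
    if match["home_goals"] == match["away_goals"]:
        line = f"{match['home']} and {match['away']} cancel each other out."
    else:
        line = f"{match['home']} vs {match['away']} has a winner."
    return tools["synthesize_speech"](line)


audio = commentary_skill(MCP_TOOLS)
print(audio)
```

Swapping in a different TTS tool or data source only changes the registry, not the Skill, which is the practical benefit of keeping the two layers separate.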

Skill Encapsulation + Scheduled Task Automation

From Manual to Fully Automated

The author encapsulated the entire workflow into a reusable Skill containing:

World Cup data sync script (automatically fetches scores and key events)
Database ingestion script (writes to database)
Football Cat commentary generation (including persona and style definitions)
TTS voice synthesis and playback
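The database ingestion step has to be safe to re-run, because a scheduled sync may see the same match more than once. The article does not say which database or schema the project uses, so the sketch below assumes SQLite and a hypothetical `matches` table keyed on the two team names (a simplification, since two teams can meet more than once in a tournament).

```python
import sqlite3


def ingest(conn: sqlite3.Connection, matches: list) -> None:
    """Write match results idempotently into a hypothetical matches table."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS matches (
               home TEXT NOT NULL,
               away TEXT NOT NULL,
               home_goals INTEGER NOT NULL,
               away_goals INTEGER NOT NULL,
               PRIMARY KEY (home, away)
           )"""
    )
    # INSERT OR REPLACE overwrites an existing row with the same key,
    # so re-running the sync updates scores instead of duplicating them.
    conn.executemany(
        "INSERT OR REPLACE INTO matches VALUES (?, ?, ?, ?)", matches
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
rows = [("Germany", "Curaçao", 7, 1), ("Spain", "Cape Verde", 0, 0)]
ingest(conn, rows)
ingest(conn, rows)  # a second run must not create duplicates
count = conn.execute("SELECT COUNT(*) FROM matches").fetchone()[0]
print(count)
```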

A Skill in Codex is essentially a structured prompt template packaged together with an execution workflow. It captures not only code logic but also tacit knowledge: human experience, judgment criteria, and execution order. The design echoes the Standard Operating Procedure (SOP) concept, in which an expert's working method is documented so that it can be executed repeatedly. The value of a Skill lies in turning one-off prompt engineering into a reusable asset, so the context and constraints do not have to be re-explained to the AI every time.

The entire Skill was created in just 8 minutes. Afterward, it can be triggered with a single slash command, eliminating the need to repeatedly explain workflows or define personas. This modular product design approach is highly recommended.

Scheduled Task Configuration

To achieve full automation, the author had Codex configure the Skill as a scheduled task:

Maximum three executions per day
Specific times set according to the World Cup match schedule
A thoughtful constraint: no late-night execution (sudden match commentary in the middle of the night would indeed be creepy)
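The three constraints above amount to a small filtering rule. The sketch below assumes quiet hours of 23:00 to 07:00 and keeps the earliest allowed slots when there are more than three; both choices are assumptions, since the article gives neither the exact hours nor a tie-breaking rule, and the way Codex stores its schedule is not shown.

```python
from datetime import time

# Assumed quiet hours; the article only says "no late-night execution".
QUIET_START = time(23, 0)
QUIET_END = time(7, 0)
MAX_RUNS_PER_DAY = 3


def is_quiet(t: time) -> bool:
    """True if t falls in the overnight window, which wraps past midnight."""
    return t >= QUIET_START or t < QUIET_END


def pick_run_times(candidates: list) -> list:
    """Drop duplicate and late-night slots, then keep at most three, earliest first."""
    allowed = sorted(t for t in set(candidates) if not is_quiet(t))
    return allowed[:MAX_RUNS_PER_DAY]


# Candidate slots would come from the match schedule; these are placeholders.
slots = [time(3, 0), time(9, 0), time(13, 0), time(18, 0), time(22, 0)]
print(pick_run_times(slots))
```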

The scheduled task was configured in under a minute. When it automatically executed the next morning, it successfully found two match results, wrote them to the database, and generated both commentary text and TTS audio.

The Complete Experience Integrated into the Website

Hover-to-Commentate Interaction Design

After two days of scheduled task execution, the website had accumulated rich match data. The interaction design is clever: hovering over a match for two seconds triggers Football Cat's automatic commentary.
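The two-second hover rule is a dwell timer: commentary should fire once per hover, and only if the pointer stays put long enough. The website itself presumably implements this in browser code, which the article does not show, so the class below is only a sketch of the timing logic with hypothetical names, written in Python and driven by explicit timestamps.

```python
from typing import Optional


class HoverDwell:
    """Fires once when the pointer has stayed on an item for dwell_seconds."""

    def __init__(self, dwell_seconds: float = 2.0) -> None:
        self.dwell_seconds = dwell_seconds
        self._entered_at: Optional[float] = None
        self._fired = False

    def enter(self, now: float) -> None:
        """Pointer moved onto the match card; start a fresh timer."""
        self._entered_at = now
        self._fired = False

    def leave(self) -> None:
        """Pointer left the card; cancel the pending trigger."""
        self._entered_at = None

    def should_trigger(self, now: float) -> bool:
        """True exactly once per hover, after the dwell time has elapsed."""
        if self._entered_at is None or self._fired:
            return False
        if now - self._entered_at >= self.dwell_seconds:
            self._fired = True
            return True
        return False


hover = HoverDwell()
hover.enter(now=0.0)
print(hover.should_trigger(now=1.5), hover.should_trigger(now=2.0))
```

The one-shot flag matters for audio: without it, commentary would restart on every timer tick while the pointer rests on the card.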

Spain 0-0 Cape Verde match details

Football Cat's commentary style is indeed sharp and entertaining. For example, on Germany's 7-1 win over Curaçao: "This wasn't a narrow win — Germany cranked the dial from first gear all the way to maximum." On Spain being held by Cape Verde: "Spain besieged for 90 minutes, but Vozinha kept that goal bone-dry. In the end, Spain didn't lack possession — they lacked that one shot to break through Vozinha."

Engineering Optimization for Commentary Animations

The author also abstracted Football Cat's different states (win commentary, draw commentary, base form) into independent components that are assembled together with the audio component. For instance, a GIF of the cat thinking while holding chips serves as the draw commentary animation, while a GIF of the cat with its mouth wide open serves as the blowout win commentary animation.
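The state abstraction can be sketched as a small mapping from a scoreline to a visual state, assembled with the text and audio into one unit. The file names, the three-goal blowout threshold, and the fallback to the base form for ordinary wins are all assumptions made for illustration; the article describes only the draw and blowout animations.

```python
from dataclasses import dataclass
from enum import Enum


class CatState(Enum):
    BASE = "base"
    DRAW = "draw"
    BLOWOUT = "blowout_win"


# Hypothetical asset names for each visual state.
ANIMATIONS = {
    CatState.BASE: "cat_base.gif",
    CatState.DRAW: "cat_thinking_chips.gif",
    CatState.BLOWOUT: "cat_open_mouth.gif",
}

BLOWOUT_MARGIN = 3  # assumed threshold for a "blowout"


@dataclass(frozen=True)
class CommentaryCard:
    state: CatState
    animation: str
    text: str
    audio: str


def pick_state(goals_a: int, goals_b: int) -> CatState:
    margin = abs(goals_a - goals_b)
    if margin == 0:
        return CatState.DRAW
    if margin >= BLOWOUT_MARGIN:
        return CatState.BLOWOUT
    return CatState.BASE  # assumption: ordinary wins use the base form


def build_card(goals_a: int, goals_b: int, text: str, audio: str) -> CommentaryCard:
    """Assemble visual state, text, and audio, which stay independent inputs."""
    state = pick_state(goals_a, goals_b)
    return CommentaryCard(state, ANIMATIONS[state], text, audio)


card = build_card(7, 1, "Germany cranked the dial to maximum.", "germany.wav")
print(card.state, card.animation)
```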

For current large models, this kind of component assembly is simple work. Decoupling visual states, audio playback, and text content into independent modules makes the system easier to maintain and iterate on, and it lets the AI pinpoint the part that needs adjusting in a later modification without having to re-understand the entire system.

Summary and Insights

This project demonstrates several core capabilities of Codex in real-world development:

Multi-session concurrency: Advancing code development and pet creation simultaneously, so two major tasks finished in roughly half an hour
Sub-agent collaboration: Data collection and code writing executing in parallel — multi-threaded programming thinking applied to AI workflows
MCP + Skill + Scheduled Tasks: Forming a complete automation workflow where MCP provides atomic capabilities, Skills orchestrate processes, and scheduled tasks enable unattended operation
Modular design: Encapsulating workflows into reusable Skills to avoid repetitive work

As the author put it: "You don't need to know how to write code yourself. What matters is expressing your ideas clearly using the right methods, so the model understands your requirements." From prompt design to final product, this project illustrates the new paradigm of AI-driven development: the key isn't coding ability, but clear product thinking and an understanding of the AI toolchain.

This also signals a fundamental paradigm shift in software development: from "humans write code, machines execute" to "humans define intent, AI implements code, machines execute." A developer's core competitive advantage is shifting from syntax proficiency to system design ability, requirements articulation, and AI toolchain orchestration.