Codex + Claude Code in Practice: From Vibe Coding to Enterprise-Grade AI Engineering

A practical guide to evolving from Vibe Coding to enterprise-grade AI engineering with Codex and Claude Code.
This article breaks down the practical use of Codex and Claude Code across three progressive development modes: Vibe Coding for rapid prototyping, Plan mode for structured implementation, and SuperPAL for enterprise-grade AI engineering. It covers dev environment setup, backend LLM selection strategies, the business logic behind AI model aggregation platforms like OpenRouter, and how major tech companies are adopting Harness-based AI engineering workflows internally.
The Real-World Value of AI Programming Tools: Beyond Toy-Level Demos
AI programming tools are flourishing, and Codex and Claude Code are among the most capable of them. Yet many developers share a common frustration: Why do projects built with Vibe Coding always seem stuck at the toy level? How can you actually use AI tools to develop enterprise-grade projects?
A Chinese tech educator known as Teacher Zhuge demonstrated the complete development process of two projects through a nearly 3-hour live coding session—an e-commerce project and an AI model aggregation platform—while providing an in-depth walkthrough of the progression from Vibe Coding to AI engineering. This article distills the core takeaways and practical methodologies from that session.
The Capability Boundaries of Vibe Coding: What It Can and Can't Do
The term Vibe Coding was coined in early 2025 by Andrej Karpathy, former Director of AI at Tesla and an OpenAI co-founder. He described a new way of programming in which developers "fully give in to the vibes, embrace exponentials, and forget that the code even exists," expressing intent purely through natural language. The core idea is simple: describe the requirements in your head clearly to the AI programming tool, and it generates the code for you. Product managers and other non-technical people can use it to build a demo quickly. The concept rapidly sparked heated discussion across the global developer community because it represents a fundamental shift from "writing code" to "describing intent."

However, as with the early stage of any new paradigm, Vibe Coding has clear limitations:
- Questionable code quality: The generated code is often "dead-on-arrival code" that's difficult to maintain
- Difficult bug fixing: Non-technical users can't effectively direct AI tools to fix complex bugs, easily falling into infinite loops
- Unable to handle complex business logic: Enterprise-grade requirements like high concurrency, distributed systems, and microservice architectures are completely beyond its reach
Teacher Zhuge offered a compelling example: many early AI influencers claimed Vibe Coding would "replace programmers," but if you look closely at their projects—a small cross-border e-commerce site, a Pomodoro timer, a ring light tool—the functionality is extremely simple. Real enterprise-grade projects involve business and technical complexity far beyond what Vibe Coding can handle.
Three Progressive AI Programming Development Modes
This hands-on course was designed around three progressive development modes, with each project going through all three stages:
Vibe Coding for Rapid Prototyping
The most basic approach, ideal for quickly validating ideas. You can scaffold the basic framework of an e-commerce project in minutes, but it's limited to demo-level quality. This is the starting point for beginners entering AI programming.
Plan Mode: Plan First, Then Code
Both Codex and Claude Code have a built-in Plan mode. In this mode, the AI doesn't jump straight into writing code—instead, it first creates a development plan, then implements it step by step. Compared to pure Vibe Coding, project structure and maintainability improve significantly. The core value of Plan mode lies in introducing a "divide and conquer" approach—breaking complex requirements into manageable subtasks, implementing and verifying each independently, then combining them into a complete system. While this adds a planning step compared to pure Vibe Coding, it significantly reduces the probability of rework later on.
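The divide-and-conquer idea can be sketched as a tiny plan runner. This is only an illustration of "implement and verify each subtask before moving on," not how Codex or Claude Code implement Plan mode; `Subtask`, `run_plan`, and the e-commerce steps are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Subtask:
    """One step of a plan: something to build plus a check that it works."""
    name: str
    implement: Callable[[], None]  # does the work (here: a stand-in)
    verify: Callable[[], bool]     # returns True when the step is done correctly


def run_plan(plan: List[Subtask]) -> List[str]:
    """Run subtasks in order, stopping at the first one that fails verification."""
    completed: List[str] = []
    for task in plan:
        task.implement()
        if not task.verify():
            break  # fix this step before building anything on top of it
        completed.append(task.name)
    return completed


# Hypothetical e-commerce plan; each "implementation" just records state.
state: dict = {}
plan = [
    Subtask("product catalog",
            lambda: state.update(catalog=True),
            lambda: state.get("catalog", False)),
    Subtask("shopping cart",
            lambda: state.update(cart=True),
            lambda: state.get("cart", False)),
    Subtask("checkout",
            lambda: None,  # deliberately not implemented yet
            lambda: state.get("checkout", False)),
]

done = run_plan(plan)
print(done)
```

Stopping at the first unverified step is the point of the sketch: a failure surfaces while the change is still small, which is where the reduced rework comes from.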
SuperPAL Engineering: The Right Approach for Enterprise Development
This is the true enterprise-grade development approach. SuperPAL is a plugin for Claude Code that comes with a series of built-in development Skills covering the entire workflow from requirements analysis and architecture design to coding, testing, and deployment.

SuperPAL is essentially an Agent Skill system for AI engineering, similar to earlier SDD (Specification-Driven Development) tools like Spy Code and Spy Kit, but more mature and complete. Many small and medium-sized companies are already using this workflow in production.
SDD (Specification-Driven Development) is a software engineering methodology that emphasizes establishing complete specification documents before coding, including interface definitions, data models, behavioral constraints, and more. In the context of AI programming, SDD's core value lies in providing clear "behavioral boundaries" for AI Agents—the AI no longer generates code in a free-form manner but develops under strict specification constraints. This complements traditional TDD (Test-Driven Development) and BDD (Behavior-Driven Development). SuperPAL productizes the SDD philosophy by using predefined development specifications to constrain AI code generation behavior, thereby improving the consistency and predictability of output quality.
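To make "behavioral boundaries" concrete, here is a minimal sketch of a specification expressed in code: a data model, an interface definition, and behavioral constraints that any generated implementation must satisfy. It is not SuperPAL's format; `CartItem`, `CartTotal`, and `spec_violations` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class CartItem:
    """Data model: fixed by the specification before any code is generated."""
    sku: str
    unit_price_cents: int
    quantity: int


class CartTotal(Protocol):
    """Interface definition: an implementation must match this signature."""
    def __call__(self, items: List[CartItem]) -> int: ...


def spec_violations(total: CartTotal) -> List[str]:
    """Behavioral constraints: return every rule the implementation breaks."""
    violations: List[str] = []
    items = [CartItem("A", 250, 2), CartItem("B", 100, 1)]
    if total([]) != 0:
        violations.append("empty cart must total 0")
    if total(items) != 600:
        violations.append("total must equal sum of unit_price_cents * quantity")
    if total(items) != total(list(reversed(items))):
        violations.append("total must not depend on item order")
    return violations


# Two candidate implementations, standing in for AI-generated code.
def candidate_total(items: List[CartItem]) -> int:
    return sum(i.unit_price_cents * i.quantity for i in items)


def buggy_total(items: List[CartItem]) -> int:
    return sum(i.unit_price_cents for i in items)  # ignores quantity


print(spec_violations(candidate_total))
print(spec_violations(buggy_total))
```

The specification is written once and stays fixed, while candidate implementations are accepted or rejected against it. That is the sense in which the AI develops "under constraints" rather than free-form.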
Development Environment and Toolchain Configuration
Core Tool Selection
Teacher Zhuge's recommended development environment is VS Code + Claude Code plugin + Codex desktop/CLI. He specifically noted that traditional IDEs like IntelliJ are starting to look outdated in the AI era:
"If IntelliJ doesn't make major changes, being phased out within two or three years isn't far-fetched. Eclipse was killed off by IntelliJ back in the day, and now IntelliJ faces the same fate at the hands of AI programming tools."
The core value of traditional IDEs lies in code editing, debugging, refactoring, and other features centered on "manually writing code." However, in the AI programming era, developers' core work is shifting from "writing code" to "describing intent and reviewing code," which means lightweight editors (like VS Code) paired with AI plugins actually have an advantage over heavyweight IDEs. While JetBrains (IntelliJ's parent company) is actively integrating AI features (such as AI Assistant), its architectural legacy makes it difficult to adapt as flexibly as VS Code to various AI programming tool workflows. The rise of AI-native editors like Cursor and Windsurf further accelerates this trend, signaling a fundamental generational shift in the development tools landscape.

The tool formats used in practice include:
- Codex Desktop: Visual interface, suitable for everyday development
- Codex CLI (Command Line): The more common approach in enterprise settings
- Claude Code CLI: Command-line development, most effective when paired with the SuperPAL plugin
Backend LLM Selection Strategy
The competitive edge of an AI programming tool comes down first and foremost to its backend model. Teacher Zhuge shared his hands-on experience testing mainstream Chinese LLMs:

- Tier 1: Zhipu GLM—strongest overall capability. "Their stock price has skyrocketed; the company has only been around a few years and is already approaching Xiaomi's level."
- Best value: DeepSeek—solid capability at extremely low prices
- Tier 2: Minimax, Kimi, Xiaomi Mimo, Alibaba Tongyi, Tencent Hunyuan
Because Anthropic (the company behind Claude) restricts access for users in China and, by his account, frequently bans accounts, Teacher Zhuge chose to connect GLM as the backend model in Claude Code, while using the latest GPT version for Codex. The effectiveness of an AI programming tool depends heavily on the backend model's code generation capability, which in turn is tied to training data quality, context window length, and instruction-following ability. Choosing a backend model is essentially about finding the best balance among capability, cost, stability, and compliance.
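One common way to point Claude Code at a different backend is through environment variables. The sketch below assumes the provider exposes an Anthropic-compatible endpoint. The URL and key are placeholders rather than real values, and `claude_code_env` is a hypothetical helper. `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` are the variables Claude Code reads for a custom gateway, but check the provider's own documentation for the exact endpoint.

```python
import os
from typing import Dict, Mapping


def claude_code_env(base_url: str, api_key: str,
                    base_env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return a copy of the environment with Claude Code pointed at another backend."""
    if not base_url.startswith("https://"):
        raise ValueError("base_url should be an HTTPS endpoint")
    env = dict(base_env)
    env["ANTHROPIC_BASE_URL"] = base_url    # where requests are sent
    env["ANTHROPIC_AUTH_TOKEN"] = api_key   # credential issued by that provider
    return env


# Placeholder values: substitute the endpoint and key from your provider.
env = claude_code_env("https://llm-gateway.example.com/anthropic",
                      "sk-placeholder", base_env={})
print(sorted(env))

# To launch the CLI with this environment (not executed here):
#   subprocess.run(["claude"], env=claude_code_env(url, key))
```

Building a separate environment mapping, instead of mutating `os.environ`, keeps the override scoped to the one process that needs it.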
OpenRouter Explained: The Business Logic of AI Model Aggregation Platforms
The second hands-on project in the course was developing an AI model aggregation platform similar to OpenRouter. OpenRouter is one of the largest LLM aggregation platforms, integrating most mainstream LLMs on the market with both free and paid options, along with detailed model rankings.
OpenRouter's business model is essentially an "API gateway aggregator" for the AI space, similar to CDN aggregation service providers in the early days of cloud computing. Developers connect to a single OpenRouter API endpoint to access hundreds of models from dozens of providers, including OpenAI, Anthropic, Google, and Meta. As described in the session, revenue comes from three sources: a service fee of roughly 10-20% on top of model providers' list prices, premium enterprise SLA guarantees, and the margin from wholesale discounts earned through traffic volume. The core moat of this model is ecosystem positioning rather than technology: once a large number of developers and applications depend on its unified interface, migration costs create a natural competitive barrier.
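The revenue arithmetic can be made explicit with a short calculation. The fee rates below are the range quoted in the session. The list price and the wholesale discount are made-up inputs for illustration, not actual OpenRouter or provider figures.

```python
def margin_per_million_tokens(list_price: float,
                              service_fee_rate: float,
                              wholesale_discount: float) -> float:
    """Gross margin per million tokens for an aggregation platform.

    The customer pays the provider's list price plus a service fee;
    the platform pays the provider the list price minus its volume discount.
    """
    customer_pays = list_price * (1 + service_fee_rate)
    platform_pays = list_price * (1 - wholesale_discount)
    return customer_pays - platform_pays


# Hypothetical inputs: list price of 10.00 per million tokens, 5% volume discount.
for fee in (0.10, 0.20):
    margin = margin_per_million_tokens(list_price=10.0,
                                       service_fee_rate=fee,
                                       wholesale_discount=0.05)
    print(f"fee {fee:.0%}: margin {margin:.2f} per million tokens")
```

Under these assumptions the margin is the sum of the two spreads (fee plus discount), which is why traffic volume matters twice: it raises total fees and improves the negotiated discount.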
Teacher Zhuge revealed a business reality that many people overlook: The most profitable segment in AI right now isn't consumer-facing AI applications (products like Doubao and Tencent Yuanbao are all losing money), but rather three directions:
- Selling compute power: Chips, semiconductors, memory, and other hardware—related A-share sectors have already skyrocketed
- Selling tokens: API services from major players like Zhipu and Tencent
- Model aggregation platforms: Essentially "reselling tokens with a wrapper"
"I have friends who run exactly this kind of wrapper site—a team of about a dozen people, doing one to two hundred million yuan in annual revenue. A few years ago they were at the same level as us; now they're in a completely different league."
The business model for these platforms is straightforward: obtain cheaper wholesale token prices, aggregate multiple models, and retail them to developers for a markup. While the technical barrier isn't high, market demand is enormous. From a technical implementation perspective, the core challenges for these platforms include: unifying API format differences across model providers, implementing intelligent routing and load balancing, handling billing discrepancies between providers, and ensuring high service availability. This happens to be a medium-complexity project well-suited for AI engineering development approaches.
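Two of these challenges, unifying response formats and routing around a failed provider, can be sketched in a few lines. The provider names and stub functions are hypothetical and no network calls are made. The two response shapes follow the general layout of OpenAI-style and Anthropic-style chat responses, simplified for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


def normalize_openai_style(raw: dict) -> dict:
    """Map an OpenAI-style response onto the platform's unified shape."""
    return {"text": raw["choices"][0]["message"]["content"],
            "input_tokens": raw["usage"]["prompt_tokens"],
            "output_tokens": raw["usage"]["completion_tokens"]}


def normalize_anthropic_style(raw: dict) -> dict:
    """Map an Anthropic-style response onto the same unified shape."""
    return {"text": raw["content"][0]["text"],
            "input_tokens": raw["usage"]["input_tokens"],
            "output_tokens": raw["usage"]["output_tokens"]}


@dataclass
class Provider:
    name: str
    call: Callable[[str], dict]        # sends the prompt upstream (stubbed here)
    normalize: Callable[[dict], dict]  # converts the raw reply to the unified shape


def route(prompt: str, providers: List[Provider]) -> dict:
    """Try providers in priority order, falling back when one is unreachable."""
    errors: List[str] = []
    for provider in providers:
        try:
            raw = provider.call(prompt)
        except ConnectionError as exc:
            errors.append(f"{provider.name}: {exc}")
            continue
        result = provider.normalize(raw)
        result["provider"] = provider.name  # recorded for per-provider billing
        return result
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stubs standing in for real upstream calls.
def flaky_provider(prompt: str) -> dict:
    raise ConnectionError("upstream timeout")


def healthy_provider(prompt: str) -> dict:
    return {"content": [{"type": "text", "text": f"echo: {prompt}"}],
            "usage": {"input_tokens": 3, "output_tokens": 5}}


providers = [Provider("provider-a", flaky_provider, normalize_openai_style),
             Provider("provider-b", healthy_provider, normalize_anthropic_style)]
result = route("hello", providers)
print(result)
```

Keeping token counts in the unified shape is what lets a billing layer reconcile usage across providers that report it under different field names.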
Internal AI Engineering Practices at Major Tech Companies
The course also touched on internal AI engineering practices at leading Chinese internet companies (such as Alibaba). Teacher Zhuge mentioned that major tech companies are internally promoting an AI engineering system based on Harness, whose core philosophy is closed-loop self-evolution: a complete AI-assisted development loop from requirements to code to testing to deployment.
Harness in the AI engineering context refers to a complete "testing and execution framework," a term that originates from the Test Harness concept in software testing. In AI programming tools, the Harness is responsible for building an automated feedback loop: after the AI generates code, the Harness automatically runs compilation checks, unit tests, and integration tests, then feeds the results back to the AI for iterative correction. This closed "generate-verify-correct" loop sharply reduces the need for human intervention and is the key technical foundation for AI programming tools evolving from "code completion" to "autonomous development."
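The "generate-verify-correct" loop can be sketched as follows. This is a toy illustration under stated assumptions, not the implementation used by any real tool: `fake_model` stands in for an LLM call, and the checks are a compile step plus two tiny unit tests for a hypothetical `add` function.

```python
from typing import Callable, List, Optional, Tuple


def run_checks(source: str) -> List[str]:
    """Verify step: compile the generated source and run tests; return failures."""
    try:
        code = compile(source, "<generated>", "exec")
    except SyntaxError as exc:
        return [f"compile error: {exc.msg}"]
    namespace: dict = {}
    exec(code, namespace)  # acceptable for a toy; real harnesses sandbox this
    add = namespace.get("add")
    if add is None:
        return ["function add is missing"]
    failures: List[str] = []
    for a, b, expected in [(1, 2, 3), (-1, 1, 0)]:
        got = add(a, b)
        if got != expected:
            failures.append(f"add({a}, {b}) returned {got}, expected {expected}")
    return failures


def harness_loop(generate: Callable[[List[str]], str],
                 max_rounds: int = 3) -> Tuple[Optional[str], int]:
    """Generate, verify, and feed failures back until checks pass or rounds run out."""
    feedback: List[str] = []
    for round_no in range(1, max_rounds + 1):
        source = generate(feedback)    # generate (or correct, given feedback)
        feedback = run_checks(source)  # verify
        if not feedback:
            return source, round_no
    return None, max_rounds


def fake_model(feedback: List[str]) -> str:
    """Stand-in for an LLM: the first attempt is buggy, the retry is fixed."""
    if not feedback:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


source, rounds = harness_loop(fake_model)
print(rounds)
print(source)
```

The `max_rounds` cap matters in practice: without it, a model that cannot fix a failure would loop indefinitely, which is the "infinite loop" problem of unstructured Vibe Coding in another form.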
Claude Code is widely used by professional programmers largely because it implements this kind of Harness loop internally, with extensive optimizations for professional-grade programming. When it executes a coding task, it can run the project's test suite to verify the generated code; if tests fail, it analyzes the errors and revises the code, repeating until the tests pass. This makes the resulting code far more reliable than simple one-shot generation. Developers interested in learning more can study Claude Code's documentation and observable behavior to better understand its engineering design philosophy.
Summary: Three Paths for Advancing Your AI Programming Skills
AI programming is undergoing a paradigm shift from "vibe coding" to "engineering-grade coding." For developers at different stages, different strategies are recommended:
- Beginners: Start with Vibe Coding for a quick hands-on experience to understand the basic AI programming workflow
- Intermediate developers: Master Plan mode and learn to decompose requirements and implement them step by step
- Professional developers: Dive deep into engineering frameworks like SuperPAL and integrate AI tools into enterprise-grade development workflows
Regardless of which path you choose, the core principle remains the same: AI is a tool; engineering thinking is the real competitive advantage. Just as software engineering evolved from the waterfall model to agile development, AI programming is also maturing from "random generation" to "standardized engineering." Developers who grasp this trend will hold a significant edge in future technical competition.