Codex + Claude Code in Practice: From Vibe Coding to Enterprise-Grade AI Engineering

The Real-World Value of AI Programming Tools: Beyond Toy-Level Demos

In an era where AI programming tools are flourishing, Codex and Claude Code are undeniably the two most powerful AI coding tools in the world. Yet many developers share a common frustration: Why do projects built with Vibe Coding always seem stuck at the toy level? How can you actually use AI tools to develop enterprise-grade projects?

A Chinese tech educator known as Teacher Zhuge demonstrated the complete development process of two projects through a nearly 3-hour live coding session—an e-commerce project and an AI model aggregation platform—while providing an in-depth walkthrough of the progression from Vibe Coding to AI engineering. This article distills the core takeaways and practical methodologies from that session.

The Capability Boundaries of Vibe Coding: What It Can and Can't Do

The concept of Vibe Coding originated abroad, coined by Andrej Karpathy—former AI Director at Tesla and OpenAI co-founder—in early 2025. He described an entirely new way of programming: developers fully immerse themselves in the "vibe," embrace the exponentially growing code complexity, forget about the code's concrete existence, and express intent purely through natural language. The core idea is simple: You just need to clearly describe the requirements in your head and tell the AI programming tool, and it will generate the code for you. Whether you're a product manager or a non-technical person, you can quickly build a demo. This concept rapidly sparked heated discussion across the global developer community because it represents a fundamental shift from "writing code" to "describing intent."

The core philosophy of Vibe Coding

However, as with the early stage of any new paradigm, Vibe Coding has clear limitations:

Questionable code quality: The generated code is often "dead-on-arrival code" that's difficult to maintain
Difficult bug fixing: Non-technical users can't effectively direct AI tools to fix complex bugs, easily falling into infinite loops
Unable to handle complex business logic: Enterprise-grade requirements like high concurrency, distributed systems, and microservice architectures are completely beyond its reach

Teacher Zhuge offered a compelling example: many early AI influencers claimed Vibe Coding would "replace programmers," but if you look closely at their projects—a small cross-border e-commerce site, a Pomodoro timer, a ring light tool—the functionality is extremely simple. Real enterprise-grade projects involve business and technical complexity far beyond what Vibe Coding can handle.

Three Progressive AI Programming Development Modes

This hands-on course was designed around three progressive development modes, with each project going through all three stages:

Vibe Coding for Rapid Prototyping

The most basic approach, ideal for quickly validating ideas. You can scaffold the basic framework of an e-commerce project in minutes, but it's limited to demo-level quality. This is the starting point for beginners entering AI programming.

Plan Mode: Plan First, Then Code

Both Codex and Claude Code have a built-in Plan mode. In this mode, the AI doesn't jump straight into writing code—instead, it first creates a development plan, then implements it step by step. Compared to pure Vibe Coding, project structure and maintainability improve significantly. The core value of Plan mode lies in introducing a "divide and conquer" approach—breaking complex requirements into manageable subtasks, implementing and verifying each independently, then combining them into a complete system. While this adds a planning step compared to pure Vibe Coding, it significantly reduces the probability of rework later on.

SuperPAL Engineering: The Right Approach for Enterprise Development

This is the true enterprise-grade development approach. SuperPAL is a plugin for Claude Code that comes with a series of built-in development Skills covering the entire workflow from requirements analysis and architecture design to coding, testing, and deployment.

AI engineering programming system

SuperPAL is essentially an Agent Skill system for AI engineering, similar to earlier SDD (Specification-Driven Development) tools like Spy Code and Spy Kit, but more mature and complete. Many small and medium-sized companies are already using this workflow in production.

SDD (Specification-Driven Development) is a software engineering methodology that emphasizes establishing complete specification documents before coding, including interface definitions, data models, behavioral constraints, and more. In the context of AI programming, SDD's core value lies in providing clear "behavioral boundaries" for AI Agents—the AI no longer generates code in a free-form manner but develops under strict specification constraints. This complements traditional TDD (Test-Driven Development) and BDD (Behavior-Driven Development). SuperPAL productizes the SDD philosophy by using predefined development specifications to constrain AI code generation behavior, thereby improving the consistency and predictability of output quality.

Development Environment and Toolchain Configuration

Core Tool Selection

Teacher Zhuge's recommended development environment is VS Code + Claude Code plugin + Codex desktop/CLI. He specifically noted that traditional IDEs like IntelliJ are starting to look outdated in the AI era:

"If IntelliJ doesn't make major changes, being phased out within two or three years isn't far-fetched. Eclipse was killed off by IntelliJ back in the day, and now IntelliJ faces the same fate at the hands of AI programming tools."

The core value of traditional IDEs lies in code editing, debugging, refactoring, and other features centered on "manually writing code." However, in the AI programming era, developers' core work is shifting from "writing code" to "describing intent and reviewing code," which means lightweight editors (like VS Code) paired with AI plugins actually have an advantage over heavyweight IDEs. While JetBrains (IntelliJ's parent company) is actively integrating AI features (such as AI Assistant), its architectural legacy makes it difficult to adapt as flexibly as VS Code to various AI programming tool workflows. The rise of AI-native editors like Cursor and Windsurf further accelerates this trend, signaling a fundamental generational shift in the development tools landscape.

Development environment overview

The tool formats used in practice include:

Codex Desktop: Visual interface, suitable for everyday development
Codex CLI (Command Line): The more common approach in enterprise settings
Claude Code CLI: Command-line development, most effective when paired with the SuperPAL plugin

Backend LLM Selection Strategy

The core competitive advantage of AI programming tools always comes down to the backend model first and foremost. Teacher Zhuge shared his hands-on experience testing mainstream Chinese LLMs:

Comparison of mainstream Chinese LLMs

Tier 1: Zhipu GLM—strongest overall capability. "Their stock price has skyrocketed; the company has only been around a few years and is already approaching Xiaomi's level."
Best value: DeepSeek—solid capability at extremely low prices
Tier 2: Minimax, Kimi, Xiaomi Mimo, Alibaba Tongyi, Tencent Hunyuan

Since Anthropic's CEO (the company behind Claude) has been unfriendly toward Chinese users with frequent account bans, Teacher Zhuge chose to connect GLM as the backend model in Claude Code, while using the latest GPT version for Codex. It's worth noting that AI programming tool effectiveness is highly dependent on the backend model's code generation capability, which in turn is closely tied to multiple dimensions including training data quality, context window length, and instruction-following ability. Choosing the right backend model is essentially about finding the optimal balance among capability, cost, stability, and compliance.

OpenRouter Explained: The Business Logic of AI Model Aggregation Platforms

The second hands-on project in the course was developing an AI model aggregation platform similar to OpenRouter. OpenRouter is the world's largest AI LLM aggregation platform, integrating virtually all mainstream LLMs on the market with both free and paid options, along with detailed model evaluation rankings.

OpenRouter's business model is essentially an "API gateway aggregator" for the AI space, similar to CDN aggregation service providers in the early days of cloud computing. Developers only need to connect to a single OpenRouter API endpoint to access hundreds of models from dozens of providers including OpenAI, Anthropic, Google, and Meta. Its revenue model includes: charging a 10-20% service fee on top of model providers' original prices, offering premium enterprise SLA guarantees, and capturing wholesale discount margins through traffic volume. The core moat of this model isn't technology—it's ecosystem positioning. Once a large number of developers and applications depend on its unified interface, migration costs create a natural competitive barrier.

Teacher Zhuge revealed a business reality that many people overlook: The most profitable segment in AI right now isn't consumer-facing AI applications (products like Doubao and Tencent Yuanbao are all losing money), but rather three directions:

Selling compute power: Chips, semiconductors, memory, and other hardware—related A-share sectors have already skyrocketed
Selling tokens: API services from major players like Zhipu and Tencent
Model aggregation platforms: Essentially "reselling tokens with a wrapper"

"I have friends who run exactly this kind of wrapper site—a team of about a dozen people, doing one to two hundred million yuan in annual revenue. A few years ago they were at the same level as us; now they're in a completely different league."

The business model for these platforms is straightforward: obtain cheaper wholesale token prices, aggregate multiple models, and retail them to developers for a markup. While the technical barrier isn't high, market demand is enormous. From a technical implementation perspective, the core challenges for these platforms include: unifying API format differences across model providers, implementing intelligent routing and load balancing, handling billing discrepancies between providers, and ensuring high service availability. This happens to be a medium-complexity project well-suited for AI engineering development approaches.

Internal AI Engineering Practices at Major Tech Companies

The course also touched on internal AI engineering best practices at leading Chinese internet companies (such as Alibaba). Teacher Zhuge mentioned that major tech companies are internally promoting an AI engineering system based on Harness, with the core philosophy of self-contained evolution—forming a complete AI-assisted development loop from requirements to code to testing to deployment.

Harness in the AI engineering context refers to a complete "testing and execution framework," originating from the Test Harness concept in software testing. In AI programming tools, the Harness system is responsible for building an automated feedback loop: after AI generates code, Harness automatically runs compilation checks, unit tests, and integration tests, then feeds the results back to the AI for iterative correction. This "generate-verify-correct" self-contained loop mechanism dramatically reduces the frequency of human intervention and is the key technical foundation for AI programming tools evolving from "code completion" to "autonomous development."

The reason Claude Code is widely used by professional programmers is precisely because it implements a complete Harness engineering system internally, with extensive optimizations for professional-grade programming. Specifically, when Claude Code executes coding tasks, it automatically runs the project's test suite to verify the correctness of generated code. If tests fail, it automatically analyzes the error causes and corrects the code until all tests pass. This mechanism ensures that AI-generated code quality is far superior to simple one-shot generation. Developers interested in learning more can read Claude Code's source code to gain a deeper understanding of its engineering design philosophy.

Summary: Three Paths for Advancing Your AI Programming Skills

AI programming is undergoing a paradigm shift from "vibe coding" to "engineering-grade coding." For developers at different stages, different strategies are recommended:

Beginners: Start with Vibe Coding for a quick hands-on experience to understand the basic AI programming workflow
Intermediate developers: Master Plan mode and learn to decompose requirements and implement them step by step
Professional developers: Dive deep into engineering frameworks like SuperPAL and integrate AI tools into enterprise-grade development workflows

Regardless of which path you choose, the core principle remains the same: AI is a tool; engineering thinking is the real competitive advantage. Just as software engineering evolved from the waterfall model to agile development, AI programming is also maturing from "random generation" to "standardized engineering." Developers who grasp this trend will hold a significant edge in future technical competition.