From Prompt Engineer to Loop Architect: The Paradigm Shift in AI Programming
AI programming is shifting from writing prompts to designing autonomous coding loops.
Anthropic's Boris Cherny reveals a fundamental shift in AI programming: from crafting prompts to building autonomous loop architectures. This article explores Routines—persistent automation mechanisms that run asynchronously with self-verification—the three-stage evolution of AI coding, token cost management challenges, and how developers must transition from operators to system architects.
Introduction: No Longer Writing Prompts, But Writing Loops
Boris Cherny, head of Claude Code at Anthropic, recently shared a thought-provoking perspective: "I no longer write prompts for Claude. I let loops prompt Claude and decide what to do. My job is to write loops."
This statement may seem simple, but it reveals a profound paradigm shift happening in AI programming—from humans manually writing prompts to building automated systems that can run, iterate, and deliver autonomously.
What Is Loop Architecture?
Routines: Persistent, High-Level Automation
Anthropic is developing a mechanism called Routines—a persistent, high-level automation process with the following characteristics:
- Asynchronous execution: No need for real-time human supervision
- State monitoring: Continuously tracks task progress and system state
- Feedback iteration: Automatically adjusts strategies based on execution results
- High-signal filtering: Only escalates truly critical decisions that require human judgment
The design of Routines resembles the long-standing orchestration pattern in software engineering. In a microservices architecture, an orchestrator coordinates the invocation order and data flow across multiple services; Routines applies the same idea to AI agents. Unlike single-turn conversations or simple chained calls, Routines adds persistent state management, so the system can remember earlier decisions and execution progress across long-running tasks. The design also closely parallels control loops in Kubernetes, where a controller continuously compares the desired state with the actual state and acts to close the gap.
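To make the desired-state versus actual-state comparison concrete, here is a minimal sketch of a reconciliation loop in Python. It is not how Routines is implemented; `RoutineState`, `reconcile`, and the `act` callback are hypothetical names chosen for illustration, and the "tasks" are plain strings.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RoutineState:
    """Persistent state carried between iterations (hypothetical shape)."""
    desired: set[str]                                  # tasks that should be done
    actual: set[str] = field(default_factory=set)      # tasks confirmed done
    history: list[str] = field(default_factory=list)   # record of past outcomes


def reconcile(state: RoutineState, act: Callable[[str], bool],
              max_steps: int = 10) -> RoutineState:
    """Repeatedly diff desired vs. actual and act on one remaining gap."""
    for _ in range(max_steps):
        gap = sorted(state.desired - state.actual)
        if not gap:
            break                      # nothing left to reconcile
        task = gap[0]
        if act(task):                  # act() stands in for an agent call
            state.actual.add(task)
            state.history.append(f"done:{task}")
        else:
            state.history.append(f"failed:{task}")
    return state
```

Because the state object outlives any single iteration, a run that stops early can resume from `state.actual` and `state.history` instead of starting over. The `max_steps` bound keeps a permanently failing task from looping forever.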
This means AI is no longer a passive tool waiting for instructions, but a work system capable of autonomous operation. The human role shifts from "operator" to "architect"—designing the system's operational logic rather than issuing instructions one by one.
Six Essential Elements of a Mature Coding Loop
A mature AI coding loop needs to include the following core components:
- Goal definition: Clearly defining the ultimate objective the loop should achieve
- State and memory management: Tracking completed work and current context
- Self-verification: The loop can validate the quality of its own output
- Recovery mechanisms: Automatically rolling back and repairing when errors occur
- Human-in-the-loop escalation: Identifying when human intervention is needed
- Observability and cost control: Monitoring runtime status and resource consumption
Among these, self-verification is the most technically challenging aspect of the entire loop architecture. Self-verification in a loop isn't simply having the AI "check itself"—it requires introducing multiple layers of objective verification methods. Typical implementations include: running automated test suites (unit tests, integration tests) to verify code functionality; using static analysis tools (such as ESLint, mypy) to check code standards and type safety; executing code in sandbox environments and comparing against expected outputs; and even introducing another AI instance as a "reviewer" for cross-validation. This multi-signal verification strategy essentially borrows from the CI/CD (Continuous Integration/Continuous Deployment) pipeline design philosophy in software engineering, embedding quality gates into every iteration cycle.
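The multi-signal idea can be sketched as a set of independent quality gates that must all pass before a candidate is accepted. This is an illustrative sketch, not a reference implementation: `run_gates`, `syntax_gate`, and `behavior_gate` are hypothetical names, the candidate is assumed to define a function called `add`, and the in-process `exec` is only a stand-in for a real sandbox or test runner.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class GateResult:
    name: str
    passed: bool
    detail: str = ""


def run_gates(candidate: str,
              gates: dict[str, Callable[[str], tuple[bool, str]]]) -> list[GateResult]:
    """Run every gate; a gate that crashes counts as a failure."""
    results = []
    for name, check in gates.items():
        try:
            passed, detail = check(candidate)
        except Exception as exc:
            passed, detail = False, f"gate crashed: {exc!r}"
        results.append(GateResult(name, passed, detail))
    return results


def accepted(results: list[GateResult]) -> bool:
    """Accept only if all gates passed."""
    return all(r.passed for r in results)


def syntax_gate(source: str) -> tuple[bool, str]:
    """Static signal: does the candidate even compile?"""
    try:
        compile(source, "<candidate>", "exec")
        return True, "compiles"
    except SyntaxError as exc:
        return False, f"syntax error: {exc.msg}"


def behavior_gate(source: str) -> tuple[bool, str]:
    """Dynamic signal: compare actual output with an expected value."""
    namespace: dict = {}
    exec(source, namespace)  # NOT a sandbox; illustration only
    ok = namespace["add"](2, 3) == 5
    return ok, ("add(2, 3) == 5" if ok else "add(2, 3) gave the wrong result")
```

In a real loop the gates would wrap a test suite, a linter or type checker, and possibly a reviewer model. The important property is that acceptance depends on objective signals rather than on the generating model's own opinion of its output.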
The Three-Stage Evolution of AI Programming
This trend can be summarized in three clear stages:
Stage One: Humans Write Code
The traditional software development model where programmers directly write every line of code. Development efficiency depends entirely on individual coding ability and domain experience.
Stage Two: Humans Prompt Models to Write Code
The stage where most developers currently find themselves—generating code through carefully designed prompts. This is what we know as "Prompt Engineering." Over the past two years, prompt engineering has been regarded as the core skill for collaborating with AI. Its essence is guiding model output through carefully crafted natural language instructions. Developers need to master context window management, few-shot example design, role-setting, and other techniques to achieve higher-quality code generation results. However, the ceiling of this approach lies in the fact that every interaction still depends on real-time human participation and judgment.
Stage Three: Building Autonomous Coding Systems
Developers build systems capable of autonomously prompting, iterating, verifying, and delivering code. The human focus shifts from "what to write" to "how to make the system run on its own." In this stage, developers are essentially designing a self-driving software engineering pipeline, where AI no longer plays the role of "assistant" but rather "execution engine."
Token Costs: An Unavoidable Real-World Constraint
Avoiding "Expensive Vibe Coding"
Boris Cherny specifically emphasized a critical warning: A loop without good evaluation and cost awareness is nothing more than expensive vibe coding at scale.
Vibe coding is a term coined in early 2025 by Andrej Karpathy, a former OpenAI researcher and former Director of AI at Tesla. It refers to developers relying on AI-generated code without carefully reviewing its logic, judging whether the code is usable purely by "vibe." The approach has some value in rapid prototyping but poses serious risks in production: the code may contain hidden logic errors, security vulnerabilities, or performance problems. When this casual approach is placed inside automated loops running at scale, the problems compound: the system may keep iterating in the wrong direction, consuming a large number of tokens on every iteration and ultimately producing code that is both expensive and unreliable.
This point is crucial. When loops run automatically, token consumption can escalate quickly, because every additional iteration adds its own context passing, code generation, and result analysis. Tokens are the basic unit in which large language models process text: one English word typically corresponds to 1-2 tokens, and one Chinese character to 1-3 tokens. For frontier models such as Claude, input has been priced at roughly $3-15 per million tokens, with output tokens costing more. A single loop iteration can consume tens or even hundreds of thousands of tokens. If the loop is poorly designed, for example because it lacks effective termination conditions or falls into futile retries, an overnight task could easily generate hundreds of dollars in API fees while producing nothing of value.
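A rough back-of-the-envelope calculation shows how an overnight run reaches that range. The sketch below is illustrative only: `loop_cost_usd` is a hypothetical helper, and the iteration count, per-iteration token counts, and prices ($3 per million input tokens, $15 per million output tokens) are assumed figures, not measurements or a current price list.

```python
def loop_cost_usd(iterations: int,
                  input_tokens_per_iter: int,
                  output_tokens_per_iter: int,
                  input_price_per_mtok: float,
                  output_price_per_mtok: float) -> float:
    """Estimate API cost for a loop with a fixed token footprint per iteration."""
    input_cost = iterations * input_tokens_per_iter / 1_000_000 * input_price_per_mtok
    output_cost = iterations * output_tokens_per_iter / 1_000_000 * output_price_per_mtok
    return input_cost + output_cost


# Assumed overnight run: 500 iterations, 150k input and 4k output tokens each.
overnight = loop_cost_usd(
    iterations=500,
    input_tokens_per_iter=150_000,
    output_tokens_per_iter=4_000,
    input_price_per_mtok=3.0,    # assumed price, USD per million input tokens
    output_price_per_mtok=15.0,  # assumed price, USD per million output tokens
)
print(f"${overnight:.2f}")  # $255.00 under these assumptions
```

Under these assumptions, input tokens account for $225 of the $255. Trimming the context resent on each iteration would therefore save more than shortening the generated output.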
Without effective evaluation mechanisms, the system may consume massive resources heading in the wrong direction while producing low-quality results. Therefore, cost control and quality evaluation are not optional—they are core components of loop architecture. Developers need to set clear budget caps, quality thresholds, and termination conditions for each loop.
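One way to express budget caps, quality thresholds, and termination conditions is as explicit limits that the loop checks after every iteration. The following is a minimal sketch under stated assumptions: `LoopLimits`, `LoopOutcome`, and `run_loop` are hypothetical names, and `step` is assumed to return the tokens it used plus a quality score between 0 and 1 from some external evaluator.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopLimits:
    max_iterations: int       # hard cap on iterations
    max_tokens: int           # token budget for the whole run
    quality_threshold: float  # stop as soon as the score reaches this
    patience: int             # stop after this many iterations without improvement


@dataclass
class LoopOutcome:
    reason: str
    iterations: int
    tokens_spent: int
    best_score: float


def run_loop(step: Callable[[int], tuple[int, float]],
             limits: LoopLimits) -> LoopOutcome:
    """Run step(iteration) until one of the termination conditions fires."""
    tokens_spent = 0
    best_score = float("-inf")
    stalled = 0
    for iteration in range(1, limits.max_iterations + 1):
        tokens_used, score = step(iteration)
        tokens_spent += tokens_used
        if score > best_score:
            best_score, stalled = score, 0
        else:
            stalled += 1
        if best_score >= limits.quality_threshold:
            return LoopOutcome("succeeded", iteration, tokens_spent, best_score)
        # The budget is checked after the step, so it can be overshot by one step.
        if tokens_spent >= limits.max_tokens:
            return LoopOutcome("budget_exhausted", iteration, tokens_spent, best_score)
        if stalled >= limits.patience:
            return LoopOutcome("stalled", iteration, tokens_spent, best_score)
    return LoopOutcome("iteration_limit", limits.max_iterations, tokens_spent, best_score)
```

Returning a named reason rather than a bare success flag makes it easy to log why a run stopped. The `patience` check targets the futile-retry case: a loop that keeps spending tokens without improving its score.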
Implications for Developers
What does this paradigm shift mean for the entire developer community?
First, the center of gravity for skills is shifting. The most valuable capability in the future won't be writing perfect prompts, but designing efficient automated loops—including systems engineering abilities like state management, error recovery, and quality evaluation. Loop architecture demands a fundamentally different skill set: state machine design (defining system behavior at different stages), fault tolerance and retry strategies (such as exponential backoff algorithms—increasing retry intervals exponentially to avoid request storms when services are unavailable), resource budget management, and observability engineering (monitoring system behavior through logs, metrics, and traces). These capabilities are closer to traditional distributed systems engineering and SRE (Site Reliability Engineering) than to natural language processing. This means engineers with systems architecture backgrounds may have a natural advantage in this paradigm shift.
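As an illustration of the retry strategy mentioned above, here is a small sketch of exponential backoff with jitter. The function names and default values are assumptions made for the example. The `sleep` and `jitter` parameters are injectable only so the behavior can be tested without waiting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * 2 ** attempt)


def call_with_retry(operation: Callable[[], T],
                    max_attempts: int = 5,
                    base: float = 1.0,
                    cap: float = 60.0,
                    sleep: Callable[[float], None] = time.sleep,
                    jitter: Callable[[], float] = random.random) -> T:
    """Call operation(), retrying failures with exponentially growing waits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:  # real code should catch only transient errors
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Scale the delay by a random factor in [0, 1) so that many
            # clients do not all retry at the same instant.
            sleep(backoff_delay(attempt, base, cap) * jitter())
    raise AssertionError("unreachable")
```

With the defaults, the upper bounds on the waits are 1, 2, 4, and 8 seconds. The cap keeps a long outage from producing multi-hour sleeps, and the attempt limit gives the loop a point at which to escalate to a human instead of retrying indefinitely.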
Second, the human role becomes more strategic. Developers will focus more on system design, goal definition, and exception handling strategies rather than specific implementation details. This is similar to the concept of "span of control" in management—when subordinates (AI loops) are sufficiently reliable, managers (developers) can concentrate their attention on higher-level decisions rather than micromanaging every step of execution.
Finally, evaluation capability becomes a core competitive advantage. In a world where loops run autonomously, the ability to accurately assess AI output quality and design effective verification mechanisms will be the key differentiator for outstanding developers. This includes designing quantitative evaluation metrics (such as test pass rates for code, cyclomatic complexity of generated code, security scan pass rates, etc.) and establishing benchmark test sets to continuously measure the performance of loop systems. Without a reliable evaluation framework, the loop is a "black box"—you can't know whether it's creating value or burning money.
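A benchmark harness for a loop can be very small. The sketch below assumes a hypothetical `loop_fn` that takes a task and returns its output together with the tokens it consumed, and a benchmark made of `(task, check)` pairs. `EvalReport` and `evaluate` are illustrative names, not part of any existing tool.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalReport:
    tasks: int
    passed: int
    pass_rate: float        # fraction of benchmark tasks whose check passed
    tokens_per_pass: float  # total tokens spent per passing task


def evaluate(loop_fn: Callable[[str], tuple[str, int]],
             benchmark: list[tuple[str, Callable[[str], bool]]]) -> EvalReport:
    """Run the loop on each benchmark task and summarize quality and cost."""
    if not benchmark:
        raise ValueError("benchmark must contain at least one task")
    passed = 0
    total_tokens = 0
    for task, check in benchmark:
        output, tokens_used = loop_fn(task)
        total_tokens += tokens_used  # failed tasks still cost tokens
        if check(output):
            passed += 1
    tokens_per_pass = total_tokens / passed if passed else float("inf")
    return EvalReport(len(benchmark), passed, passed / len(benchmark), tokens_per_pass)
```

Tracking `tokens_per_pass` alongside `pass_rate` ties quality to spend: a change to the loop that raises the pass rate slightly while doubling tokens per passing task shows up immediately in the report.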
Conclusion
The transition from "prompt engineer" to "loop architect" is not merely a change in how tools are used—it's a fundamental restructuring of human-AI collaboration. The direction Anthropic is demonstrating through the Routines mechanism likely represents the next mainstream paradigm for AI-assisted programming. For developers, starting now to think about how to design, evaluate, and optimize these automated loops will be key to maintaining competitiveness. It's worth noting that this trend isn't about replacing developers, but about elevating the developer's value anchor from "code production" to "system design"—just as the Industrial Revolution didn't eliminate workers, but freed humans from repetitive labor toward higher-level creative work.