Orchestrating AI Agents as State Machines: Stop Being a Human Confirmation Button
Apply CI/CD orchestration thinking to AI Agents, transforming humans from mechanical confirmers to key decision-makers.
This article identifies the problem of humans becoming "confirmation buttons" in current AI coding workflows and proposes building an Orchestrator architecture inspired by CI/CD state machine concepts. The four-layer architecture (YAML templates, Orchestrator Skill, Pipeline Server, Dashboard) uses Gate mechanisms to let AI advance autonomously, pausing only at critical nodes like proposal confirmation and quality review for human decisions, while supporting parallel requirements and team experience capture through reusable templates.
Starting from the "Human Confirmation Button"
Anyone who's been coding with AI recently probably shares the same feeling: after every step, the AI asks "Should I continue?" and your answer is always "Yes, yes, yes." On the surface it seems efficient, but in reality you've devolved into a human confirmation button on an assembly line.
In the current AI coding ecosystem, Skills handle stable behaviors, Subagents handle parallel execution, and MCP handles connections to external systems—each capability is powerful on its own, but most people are still stuck at the single-point invocation stage, never truly orchestrating these combined capabilities.
For context, MCP (Model Context Protocol) is an open protocol released by Anthropic in late 2024 and inspired by LSP (Language Server Protocol) from the editor world. Just as LSP unified communication between editors and language servers, MCP standardizes how AI models connect to external tools, databases, and APIs, so an integration can be implemented once and reused across clients.
The Subagent architecture, meanwhile, breaks a complex task into parts and hands each part to a specialized sub-agent for parallel processing. Frontend, backend, and QA sub-agents each handle their own responsibilities without blocking one another, and each sub-agent's context window contains only the information relevant to its role, which improves both speed and focus.
The core problem isn't that the process is too slow—it's that the system hasn't learned to move forward on its own.
Rethinking AI Coding Through a Software Engineering Lens
The turning point came from an analogy: when building CI/CD pipelines, we design stages, set up Gates, use state machines to persist state, and support parallelism and checkpoint-based resumption. Since Agents are also execution units, why not manage them the same way?

CI/CD (Continuous Integration/Continuous Delivery) is a core practice of modern software engineering that has matured over many years. Its underlying model is essentially a finite state machine: at any moment the system is in exactly one of a finite set of states (building, testing, awaiting approval, deployed, and so on), and it moves between states only when clearly defined conditions are met. Gates act as guards on those transitions: the pipeline holds at a Gate until its condition, such as a human approval, is satisfied. Tools like Jenkins, GitHub Actions, and GitLab CI have turned these concepts into working systems and accumulated extensive practical experience with parallel execution, failure retries, and checkpoint resumption.
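To make the state machine model concrete, here is a minimal Python sketch. The state names follow the examples in the paragraph above; the `FAILED` state and the event names (`build_ok`, `approved`, and so on) are assumptions added for illustration, not part of any particular CI tool.

```python
from enum import Enum, auto


class State(Enum):
    BUILDING = auto()
    TESTING = auto()
    AWAITING_APPROVAL = auto()
    DEPLOYED = auto()
    FAILED = auto()  # assumed failure state, not named in the article


# Allowed transitions: (current state, event) -> next state.
# Event names are hypothetical.
TRANSITIONS = {
    (State.BUILDING, "build_ok"): State.TESTING,
    (State.BUILDING, "build_failed"): State.FAILED,
    (State.TESTING, "tests_ok"): State.AWAITING_APPROVAL,
    (State.TESTING, "tests_failed"): State.FAILED,
    # The approval Gate: only a human decision moves the pipeline on.
    (State.AWAITING_APPROVAL, "approved"): State.DEPLOYED,
    (State.AWAITING_APPROVAL, "rejected"): State.FAILED,
}


def step(state: State, event: str) -> State:
    """Return the next state, or raise if the event is not allowed here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state.name}") from None


def run(events, start: State = State.BUILDING) -> State:
    """Feed a sequence of events through the machine and return the final state."""
    state = start
    for event in events:
        state = step(state, event)
    return state
```

Because every legal move is listed in one table, an unexpected event fails loudly instead of silently advancing the pipeline.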
Once this mental model clicks, the definition of an Orchestrator becomes crystal clear:
- Doesn't create new Agents—only orchestrates existing ones
- Doesn't write code itself, nor run tests directly
- Only responsible for dispatching the right Agent and Skill at the right time
- Waits for human decisions at critical nodes
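The "dispatch only" rule in the list above can be sketched in a few lines of Python. The `Orchestrator` class, its `dispatch` method, and the two stand-in agents are hypothetical names chosen for this example; a real agent would be a call into whatever agent runtime you already use.

```python
from typing import Callable, Dict

# An "agent" here is any callable that takes a task description and returns a result.
Agent = Callable[[str], str]


class Orchestrator:
    """Holds references to existing agents and routes tasks to them."""

    def __init__(self, agents: Dict[str, Agent]) -> None:
        self._agents = dict(agents)

    def dispatch(self, agent_name: str, task: str) -> str:
        # The orchestrator never does the work itself; it only routes the task.
        if agent_name not in self._agents:
            raise LookupError(f"no registered agent named {agent_name!r}")
        return self._agents[agent_name](task)


# Hypothetical stand-ins for agents that already exist.
def backend_agent(task: str) -> str:
    return f"backend handled: {task}"


def qa_agent(task: str) -> str:
    return f"qa checked: {task}"


orchestrator = Orchestrator({"backend": backend_agent, "qa": qa_agent})
```

Note that the class has no method for writing code or running tests; an unknown agent name is an error rather than a cue to improvise.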
The Four-Layer Architecture of an Orchestrator
The entire Orchestrator architecture is divided into four layers with clear separation of responsibilities:
- YAML Templates: Define the workflow (Explore → Propose → Review → Implement → QA Test → Archive)
- Orchestrator Skill: Enables the main Agent to execute orchestration logic
- Pipeline Server: Manages state and APIs
- Dashboard: Handles progress visualization, Gates, and log display
Orchestration logic, runtime state, and visualization are explicitly separated here. The benefit of this layered design is that each layer can iterate independently without coupling to the others.
The choice of YAML templates as the workflow definition language is particularly deliberate. YAML has become the de facto standard configuration language in the DevOps world, widely adopted by Kubernetes, Ansible, and GitHub Actions. Compared to code, YAML lowers the barrier for non-technical people to understand and modify workflows; compared to graphical interfaces, YAML naturally supports version control and can be diffed, reviewed, and rolled back just like code. This realizes the concept of "Pipeline as Code"—team best practices are no longer passed down by word of mouth but are captured in a structured, auditable, and reusable form, following the same evolutionary path as Infrastructure as Code.
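The article does not show the template schema, so the following is only a sketch of the data a workflow template might hold once loaded, written as Python objects rather than YAML so it can be checked directly. The stage names come from the workflow listed earlier; the field names (`agents`, `gate`), the agent identifiers, and the placement of Gates are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class StageSpec:
    name: str
    agents: Tuple[str, ...] = ()  # hypothetical agent identifiers
    gate: bool = False            # pause for a human decision after this stage


# Stage order follows the article; everything else is illustrative.
TEMPLATE = (
    StageSpec("explore", agents=("explorer",)),
    StageSpec("propose", agents=("planner",)),
    StageSpec("review", gate=True),
    StageSpec("implement", agents=("backend", "frontend")),
    StageSpec("qa_test", agents=("qa_tester",), gate=True),
    StageSpec("archive"),
)


def validate(template: Sequence[StageSpec]) -> List[str]:
    """Return a list of problems; an empty list means the template is usable."""
    problems = []
    if not template:
        problems.append("template has no stages")
    names = [stage.name for stage in template]
    if len(names) != len(set(names)):
        problems.append("duplicate stage names")
    return problems


def gated_stages(template: Sequence[StageSpec]) -> List[str]:
    """Names of the stages after which the pipeline waits for a human."""
    return [stage.name for stage in template if stage.gate]


def next_stage(template: Sequence[StageSpec], current: str) -> Optional[str]:
    """Name of the stage that follows `current`, or None at the end."""
    names = [stage.name for stage in template]
    index = names.index(current)
    return names[index + 1] if index + 1 < len(names) else None
```

Keeping the template as plain data is what makes it diffable and reviewable: a change to the workflow shows up as a changed line, not as changed control flow.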
Gate Mechanism: From Mechanical Confirmation to Quality Judgment
Gates are the soul of the entire orchestration system. Instead of asking "Is this okay?" at every step, they let the AI advance autonomously and only pause at critical nodes like proposal confirmation, quality review, and test acceptance, presenting three clear options: pass, fix, or abort.
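A Gate's three options can be modeled as a small decision function. This is a sketch under stated assumptions: the article names the options (pass, fix, abort) but not what "fix" re-runs, so the `fix_targets` mapping and its default of repeating the gated stage are guesses, as are the function name and the status strings.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Decision(Enum):
    PASS = "pass"
    FIX = "fix"
    ABORT = "abort"


def apply_decision(
    stages: List[str],
    gate_stage: str,
    decision: Decision,
    fix_targets: Optional[Dict[str, str]] = None,
) -> Tuple[str, Optional[str]]:
    """Return (pipeline status, next stage to run) after a human decision."""
    if decision is Decision.ABORT:
        return ("aborted", None)
    if decision is Decision.FIX:
        # Assumption: "fix" sends work back to a configured stage,
        # defaulting to re-running the gated stage itself.
        target = (fix_targets or {}).get(gate_stage, gate_stage)
        return ("running", target)
    # PASS: advance to the following stage, or finish if there is none.
    index = stages.index(gate_stage)
    if index + 1 < len(stages):
        return ("running", stages[index + 1])
    return ("done", None)


STAGES = ["explore", "propose", "review", "implement", "qa_test", "archive"]
```

For example, `apply_decision(STAGES, "review", Decision.FIX, {"review": "propose"})` would send a rejected proposal back to the Propose stage rather than forward to Implement.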

The most illustrative real-world scenario: once Gate 1 (proposal review) passes, the system automatically marks the Implement stage as Active and spins up backend and frontend SubAgents in parallel. After both implementations complete, it automatically proceeds to the next stage. The entire process no longer requires someone watching and pushing things forward step by step.
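The parallel Implement stage described above maps naturally onto `asyncio.gather`. In this sketch, `run_subagent` is a hypothetical placeholder that only simulates work; in practice it would call whatever mechanism launches a SubAgent.

```python
import asyncio
from typing import Dict, List


async def run_subagent(name: str, task: str) -> Dict[str, str]:
    """Stand-in for a real SubAgent call; here it only simulates work."""
    await asyncio.sleep(0)  # placeholder for the actual agent invocation
    return {"agent": name, "task": task, "status": "done"}


async def run_implement_stage(proposal: str) -> List[Dict[str, str]]:
    # Launch both SubAgents concurrently and wait until both have finished.
    results = await asyncio.gather(
        run_subagent("backend", proposal),
        run_subagent("frontend", proposal),
    )
    return list(results)


def implement(proposal: str) -> List[Dict[str, str]]:
    """Synchronous entry point for callers that are not already async."""
    return asyncio.run(run_implement_stage(proposal))
```

`asyncio.gather` returns results in the order the coroutines were passed, so the caller can rely on the backend result coming first regardless of which SubAgent finishes earlier.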
The Real Value of QA Gates
After the QA Tester runs, it doesn't simply output "pass/fail"—it provides a structured report: which items are already fine, which items need fixing, and then leaves the decision to the human.
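One possible shape for such a structured report, sketched in Python. The class and field names (`QAItem`, `QAReport`, `note`) and the sample items are made up for illustration; the article does not specify the report format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QAItem:
    name: str
    ok: bool
    note: str = ""  # what needs fixing, when ok is False


@dataclass
class QAReport:
    items: List[QAItem] = field(default_factory=list)

    def passed(self) -> List[QAItem]:
        return [item for item in self.items if item.ok]

    def needs_fix(self) -> List[QAItem]:
        return [item for item in self.items if not item.ok]

    def summary(self) -> str:
        """Headline counts followed by one line per item that needs fixing."""
        lines = [f"{len(self.passed())} passed, {len(self.needs_fix())} need fixing"]
        for item in self.needs_fix():
            lines.append(f"- {item.name}: {item.note}")
        return "\n".join(lines)


# Hypothetical sample data.
report = QAReport([
    QAItem("login form renders", ok=True),
    QAItem("password reset email", ok=False, note="link expires immediately"),
    QAItem("session timeout", ok=True),
])
```

The report deliberately has no overall verdict field: it lists findings, and the pass, fix, or abort decision stays with the person at the Gate.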

This way, humans focus on quality judgment rather than mechanical confirmation. This is a fundamental role shift—from operator to decision-maker.
Parallel Requirements and Team Collaboration
When multiple requirements are progressing simultaneously, the Dashboard displays each pipeline's stages, Gate status, active SubAgents, and audit logs on a single board. Beyond visualization, it serves as shared state memory: anyone switching between requirements can see where each one stands and which decision it is waiting on.
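As a rough illustration of the state behind such a board, the sketch below keeps one record per pipeline and renders one row each, listing pipelines that are waiting at a Gate first. The record fields, the row format, and the requirement names are all assumptions; the article does not describe the Dashboard's data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PipelineState:
    requirement: str
    stage: str
    waiting_gate: Optional[str] = None  # set while paused for a human decision
    active_agents: List[str] = field(default_factory=list)
    audit_log: List[str] = field(default_factory=list)


def board_rows(pipelines: List[PipelineState]) -> List[str]:
    """Render one row per pipeline, with those waiting at a Gate listed first."""
    # sorted() is stable, so pipelines keep their relative order within each group.
    ordered = sorted(pipelines, key=lambda p: p.waiting_gate is None)
    rows = []
    for p in ordered:
        gate = p.waiting_gate or "-"
        agents = ", ".join(p.active_agents) or "-"
        rows.append(f"{p.requirement} | {p.stage} | gate: {gate} | agents: {agents}")
    return rows


# Hypothetical sample data.
pipelines = [
    PipelineState("REQ-1 checkout", "implement", active_agents=["backend", "frontend"]),
    PipelineState("REQ-2 search", "qa_test", waiting_gate="qa_review"),
]
```

Sorting waiting pipelines to the top reflects the article's point: the board's main job is to show a human where a decision is needed.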

The Amplification Effect in Team Collaboration
In team collaboration scenarios, the Orchestrator's value is further amplified:
- Experience capture: Expert knowledge no longer lives only in people's heads—it's captured in reusable YAML templates
- Lower barriers: New team members follow the pipeline and make decisions at Gates
- Template reuse: The same set of Agents can be reused across backend, full-stack, and hotfix scenarios
This means a team's AI coding capability depends less on individual skill levels and more on the quality of its workflow templates.
Three Stages of AI Coding Evolution
Summarizing the evolution of AI coding into three stages makes the Orchestrator's position clearer:
| Stage | Characteristics | Human Role |
|---|---|---|
| Tool invocation | Single-point use of AI capabilities | Operator |
| Process codification | Fixed steps, step-by-step confirmation | Confirmer |
| Orchestration | State machine-driven, autonomous progression | Decision-maker |
The real change isn't about having humans keep pressing confirm—it's about setting the flight path, letting AI cruise automatically, and having humans take over only at critical waypoints.
When Should You Use an Orchestrator
The core insight of this approach is simple: AI Agents, like microservices, are execution units that need coordination. Distributed systems offer two classic ways to coordinate services. In the orchestration pattern, a central controller explicitly directs the order in which services run; Kubernetes, Apache Airflow, and Netflix Conductor are commonly cited examples. In the choreography pattern, there is no central controller, and services coordinate by reacting to each other's events.
The orchestration pattern's main advantage is observability and controllability: every execution path goes through the central orchestrator, state changes are traceable, and exceptions are handled in one place. That suits AI Agent scenarios, where a human needs to see where a task stands and step in at defined points. The engineering practices accumulated in the CI/CD domain (state machines, Gates, parallelism, checkpoint resumption) carry over to AI coding with little change.
However, it's important to note that the Orchestrator itself adds system complexity. For simple tasks, direct conversation may be more efficient; only when workflows are sufficiently complex, require multi-Agent collaboration, and have team reuse needs does this architecture truly deliver value. The key is finding that "worth orchestrating" tipping point.
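The contrast between orchestration and choreography drawn above can be shown with a toy example. Both functions below run the same three steps; the step names, the `EventBus` class, and the event names are hypothetical and exist only to show where the control flow lives in each pattern.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def run_orchestrated(steps: List[str], log: List[str]) -> None:
    """Orchestration: a central loop decides the order and records every step."""
    for name in steps:
        log.append(f"orchestrator -> {name}")


class EventBus:
    """Minimal in-process publish/subscribe helper."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[], None]]] = defaultdict(list)

    def on(self, event: str, handler: Callable[[], None]) -> None:
        self._handlers[event].append(handler)

    def emit(self, event: str) -> None:
        for handler in list(self._handlers[event]):
            handler()


def run_choreographed(log: List[str]) -> None:
    """Choreography: each step reacts to the event emitted by the previous one."""
    bus = EventBus()

    def build() -> None:
        log.append("build")
        bus.emit("built")

    def test() -> None:
        log.append("test")
        bus.emit("tested")

    def deploy() -> None:
        log.append("deploy")

    # The execution order is implied by these subscriptions, not stated in one place.
    bus.on("start", build)
    bus.on("built", test)
    bus.on("tested", deploy)
    bus.emit("start")
```

In the orchestrated version the order is a single list you can read, log, and pause; in the choreographed version you have to trace subscriptions to reconstruct it, which is the observability cost the article refers to.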
Key Takeaways
- The Orchestrator doesn't create new Agents—it only orchestrates existing ones, dispatching the right capability units at the right time
- The Gate mechanism lets AI advance autonomously, pausing only at critical nodes (proposals, quality, testing) to await human decisions, transforming humans from mechanical confirmers to quality decision-makers
- The four-layer architecture (YAML templates, Orchestrator Skill, Pipeline Server, Dashboard) achieves clear separation of orchestration logic, runtime state, and visualization
- Team experience is captured and reused through YAML templates, lowering barriers for newcomers, with the same Agent set adaptable to different development scenarios
- Three stages of AI coding evolution: tool invocation → process codification → orchestration. The core idea is to let AI cruise automatically while humans take over only at critical waypoints