Hermes AI Kanban: A Five-Layer Autonomous Architecture for Fully Automated Delivery from Idea to Finished Product

Hermes Kanban 2.0 introduces a five-layer autonomous architecture where users input ideas, AI agents plan and execute, and humans only approve at key checkpoints. Deeply integrated with Obsidian for contextual memory and knowledge accumulation, the system features multi-agent collaboration, built-in quality self-checks, and gallery delivery — representing the shift from conversational AI to autonomous execution AI.
Traditional Kanban Is Dead — AI-Driven Autonomous Kanban Has Arrived
Most people still use traditional kanban boards to drag cards around, manually update statuses, and push every task forward by hand. This way of working is essentially just a digitized to-do list — all the thinking, executing, and checking still falls on you. The brand-new kanban system from Hermes Agent is completely rewriting this logic.
Kanban originally emerged from the Toyota Production System as a methodology for managing workflows through visual cards. Digital kanban tools like Trello, Jira, and Notion brought this concept into the software world, using columns (such as To Do, In Progress, Done) to track task status. However, the fundamental limitation of traditional digital kanban is that it is merely a passive information display: all task decomposition, priority judgment, and execution still rely entirely on the human operator. The efficiency ceiling of a kanban board is therefore the execution capacity of its user.
The core philosophy of this system is simple: You only input ideas; the AI agent team handles planning, building, and delivery. From a single sentence of inspiration to a running website, tool, or even a game, the entire process can be completed automatically in minutes — while you go do something else.

The Five-Layer Autonomous Architecture of Hermes Kanban 2.0
The workflow of Hermes Kanban 2.0 can be summarized in five layers. Users act only twice: they enter the idea at Layer 1 and approve or reject the plan at Layer 3.
Layer 1: Idea Capture
All you need to do is type a single sentence, such as "Create a beautiful blog website for OCO," and the system automatically categorizes your idea. This step is extremely simple — type a sentence, press Enter, done.
Layer 2: AI-Powered Planning
The system automatically creates a detailed execution plan for your idea. For a blog website, for example, it would plan milestones like: research target keywords, design the website interface, build the site with a modern tech stack, add internal linking strategy, configure analytics tools, and more. It also assigns roles — researcher, designer, programmer, etc. — all of which are AI agents, not real people.
Behind this is a Multi-Agent System architecture — an important branch of artificial intelligence where multiple AI agents with autonomous decision-making capabilities collaborate to complete complex tasks. Each agent has a specific role definition, capability boundaries, and behavioral rules, coordinating with each other through message passing. In Hermes Agent's architecture, the project manager agent handles task decomposition and delegation, the researcher agent handles information gathering, the programmer agent handles code generation, and the designer agent handles interface design. This division of labor simulates the collaborative structure of a real team but executes at speeds far exceeding human teams.
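The role split and message passing described here can be illustrated with a minimal sketch. It is not Hermes Agent's implementation: the `Message`, `Agent`, `MessageBus`, and `run_project` names are hypothetical, and each agent's `handler` is a plain function standing in for a model call.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict


@dataclass
class Message:
    sender: str
    recipient: str
    content: str


@dataclass
class Agent:
    name: str
    handler: Callable[[str], str]  # stand-in for a model call (assumption)

    def receive(self, msg: Message) -> Message:
        # Answer the sender with this role's result.
        return Message(self.name, msg.sender, self.handler(msg.content))


class MessageBus:
    """Delivers messages to registered agents and collects all other replies."""

    def __init__(self) -> None:
        self.agents: Dict[str, Agent] = {}
        self.queue: Deque[Message] = deque()

    def register(self, agent: Agent) -> None:
        self.agents[agent.name] = agent

    def send(self, msg: Message) -> None:
        self.queue.append(msg)

    def run(self) -> Dict[str, str]:
        results: Dict[str, str] = {}
        while self.queue:
            msg = self.queue.popleft()
            agent = self.agents.get(msg.recipient)
            if agent is None:
                # Recipient is not a worker: treat it as a reply to the coordinator.
                results[msg.sender] = msg.content
            else:
                self.queue.append(agent.receive(msg))
        return results


def run_project(idea: str) -> Dict[str, str]:
    bus = MessageBus()
    bus.register(Agent("researcher", lambda t: f"research notes for: {t}"))
    bus.register(Agent("designer", lambda t: f"interface mockup for: {t}"))
    bus.register(Agent("programmer", lambda t: f"source code for: {t}"))
    # The project manager is the unregistered coordinator that delegates work.
    for role in ("researcher", "designer", "programmer"):
        bus.send(Message("project_manager", role, idea))
    return bus.run()
```

Calling `run_project("blog for OCO")` returns one result per role. In a real system each handler would wrap a model call with its own role prompt and capability limits.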
Layer 3: Human Approval Gate
This is one of the most critical design elements of the entire system. A pain point of the old Hermes Kanban was that after handing a task to AI, you couldn't confirm whether it truly understood your intent. The new system adds a human approval step — you can review the AI's plan and choose to approve or reject it. This solves the "AI talking to itself" problem.
This design embodies the Human-in-the-Loop principle in AI systems, which means retaining human decision-making authority at critical nodes in automated processes. This principle stems from a clear-eyed recognition of AI reliability boundaries: current large language models still suffer from hallucinations (generating content that seems plausible but is actually incorrect), comprehension biases, and context loss. By setting the approval gate at the planning stage, before any build work starts, the system preserves the efficiency advantages of automation while avoiding the risk of AI consuming massive resources in the wrong direction. This is more pragmatic than either fully autonomous execution or fully manual control, because it aims for a workable balance between efficiency and controllability.
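An approval gate of this kind can be sketched in a few lines. This is only an illustration: `Plan`, `Decision`, and `run_with_gate` are hypothetical names, and the human reviewer is modeled as a callback.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"


@dataclass
class Plan:
    idea: str
    milestones: List[str]


def run_with_gate(
    plan: Plan,
    ask_human: Callable[[Plan], Decision],
    execute: Callable[[Plan], str],
) -> str:
    # Execution is only reachable through an explicit approval.
    if ask_human(plan) is Decision.APPROVE:
        return execute(plan)
    return "rejected: nothing was built"
```

Because `execute` is never called on a rejected plan, a misunderstood idea costs only the planning step.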

Layer 4: Multi-Agent Autonomous Execution
Once you click "Approve and Build," the project manager agent delegates tasks to various sub-agents, executing continuously in the background. You can watch the build progress in real time but don't need to intervene at all.
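Background delegation with observable progress can be sketched with Python's standard `asyncio` module. The function names and the shared `progress` list are assumptions made for illustration, and `asyncio.sleep(0)` is a placeholder for the real work a sub-agent would do.

```python
import asyncio
from typing import Dict, List, Tuple


async def sub_agent(role: str, task: str, progress: List[str]) -> str:
    progress.append(f"{role}: started {task}")
    await asyncio.sleep(0)  # placeholder for real work; yields to other agents
    progress.append(f"{role}: finished {task}")
    return f"{role} output for {task}"


async def project_manager(
    assignments: Dict[str, str], progress: List[str]
) -> Dict[str, str]:
    roles = list(assignments)
    # Run all sub-agents concurrently; gather returns results in input order.
    outputs = await asyncio.gather(
        *(sub_agent(role, assignments[role], progress) for role in roles)
    )
    return dict(zip(roles, outputs))


def build(assignments: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    progress: List[str] = []
    outputs = asyncio.run(project_manager(assignments, progress))
    return outputs, progress
```

The `progress` list plays the role of the live build view: a UI could display its entries as they arrive without ever steering the agents.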
Layer 5: Gallery Delivery and Live Preview
All completed deliverables are published directly to a gallery, supporting live preview and full-screen viewing. Whether it's a blog website, meditation app, or habit tracker, everything can be managed in one place.
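Taken together, the five layers behave like a small state machine for each card. The sketch below is an assumption-laden illustration: the `Stage` and `Card` names are hypothetical, and sending a rejected plan back to planning is a guess, since the source only says a plan can be approved or rejected.

```python
from enum import Enum
from typing import Dict, List, Set


class Stage(Enum):
    IDEA = "idea capture"
    PLANNING = "AI planning"
    APPROVAL = "human approval gate"
    EXECUTION = "multi-agent execution"
    GALLERY = "gallery delivery"


# Allowed moves between layers. The APPROVAL -> PLANNING edge models a
# rejection and is an assumption, not documented behavior.
ALLOWED: Dict[Stage, Set[Stage]] = {
    Stage.IDEA: {Stage.PLANNING},
    Stage.PLANNING: {Stage.APPROVAL},
    Stage.APPROVAL: {Stage.EXECUTION, Stage.PLANNING},
    Stage.EXECUTION: {Stage.GALLERY},
    Stage.GALLERY: set(),
}


class Card:
    def __init__(self, idea: str) -> None:
        self.idea = idea
        self.stage = Stage.IDEA
        self.history: List[Stage] = [Stage.IDEA]

    def move(self, target: Stage) -> None:
        if target not in ALLOWED[self.stage]:
            raise ValueError(
                f"cannot go from {self.stage.value} to {target.value}"
            )
        self.stage = target
        self.history.append(target)
```

The table makes one property explicit: no path reaches execution without passing through the approval gate.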
Deep Integration with Obsidian: Building an AI Second Brain
One particularly clever design of this system is that it runs entirely within your Obsidian vault. What does this mean?
Obsidian is a knowledge management tool based on local Markdown files. Its core advantages include: data stored entirely on the user's local machine (not on cloud servers), support for bidirectional links to build knowledge graphs, and a rich plugin ecosystem. Embedding an AI agent system into Obsidian means all project context, execution logs, and decision records exist as structured Markdown files within the user's knowledge base. This is fundamentally different from hosting data on a cloud SaaS platform — users retain complete data sovereignty, and AI can read these local files to gain long-term memory and contextual understanding.
Contextual Memory: All build processes, project information, and execution logs are recorded in Obsidian. When you come back later and say "that OCO blog we created yesterday, we need to optimize the UI," the AI agent can understand from the logs what this project is, where it's stored, and how it was created — without you having to explain everything again.
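How a Markdown log can serve as memory is easy to sketch. The `projects/` folder, the note layout, and the `write_project_log` and `recall` helpers are all hypothetical. The lookup is a naive name match, where a real agent would hand the recalled note to the model as context.

```python
from pathlib import Path
from typing import List
import tempfile


def write_project_log(
    vault: Path, name: str, location: str, steps: List[str]
) -> Path:
    """Record a finished build as a Markdown note inside the vault."""
    note = vault / "projects" / f"{name}.md"
    note.parent.mkdir(parents=True, exist_ok=True)
    lines = [f"# {name}", "", f"- location: {location}", "", "## Build log"]
    lines += [f"- {step}" for step in steps]
    note.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return note


def recall(vault: Path, request: str) -> List[str]:
    """Return the text of every project note whose name appears in the request."""
    hits = []
    for note in sorted((vault / "projects").glob("*.md")):
        if note.stem.lower() in request.lower():
            hits.append(note.read_text(encoding="utf-8"))
    return hits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        vault = Path(tmp)
        write_project_log(vault, "OCO blog", "sites/oco-blog", ["scaffolded site"])
        print(recall(vault, "that OCO blog we created yesterday needs a better UI"))
```

Since the notes are ordinary files, the same context stays readable to the user in Obsidian and to any agent with file access.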

Knowledge Accumulation: As you use the system to build more and more projects, the AI agents' understanding of your preferences, work style, and project history deepens. This is essentially building a continuously evolving "second brain" — an intelligent knowledge system that understands your thinking patterns and proactively serves you.
Organization and Retrospection: All ideas and deliverables are organized in one place — nothing gets forgotten or lost, and you can always come back to review and continue development. Thanks to Obsidian's bidirectional linking, relationships between different projects can be naturally established and tracked.
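Obsidian writes links as `[[Note name]]`, optionally with a `#heading` or `|alias` suffix, so relationships between project notes can be recovered by scanning for that syntax. The sketch below builds a backlink index from note text. The `backlinks` helper and the sample notes are hypothetical, and the pattern covers only these common link forms.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Matches [[Target]], [[Target#Heading]] and [[Target|alias]]; captures Target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def backlinks(notes: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each linked note title to the set of notes that link to it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for title, body in notes.items():
        for target in WIKILINK.findall(body):
            index[target.strip()].add(title)
    return dict(index)


notes = {
    "OCO Blog": "Built with [[Design System]].",
    "Habit Tracker": "Reuses [[Design System|the shared design]] "
                     "and borrows from [[OCO Blog#Build log]].",
}
index = backlinks(notes)
```

Here `index["Design System"]` contains both projects, which is the kind of cross-project relationship an agent could follow when asked to reuse earlier work.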
Built-in Self-Check Mechanism: Quality Assurance for AI Output
In practice, AI-generated results aren't always ideal. The video author candidly admits that during the build process, output quality sometimes fell short. To address this, the system includes a built-in self-check routine — after completing a task, the agent first checks its own output quality and only marks it as complete and submits it to the user after confirming it meets standards.
This design is highly pragmatic. It acknowledges that AI isn't infallible, but through automated quality checking processes, it minimizes the need for human intervention. This is similar to the automated testing philosophy in software engineering — code must pass a series of test cases before deployment. In the context of AI agents, self-checks might include: verifying that generated code runs correctly, confirming that output aligns with the user's original requirements semantically, and ensuring no critical functional modules are missing.
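A self-check loop along these lines might look like the following sketch. The names, the retry limit, and the two example checks are assumptions. The `compiles` check only confirms that generated Python parses, which is weaker than verifying that it runs correctly.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[str], bool]


def compiles(source: str) -> bool:
    """Syntax check only: the code parses, but is not executed."""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


def build_with_self_check(
    generate: Callable[[int], str],
    checks: List[Tuple[str, Check]],
    max_attempts: int = 3,
) -> Dict[str, object]:
    failed: List[str] = []
    for attempt in range(1, max_attempts + 1):
        output = generate(attempt)
        # Collect the names of all checks this attempt fails.
        failed = [name for name, check in checks if not check(output)]
        if not failed:
            return {"status": "complete", "attempts": attempt, "output": output}
    # Out of retries: hand the task to a human instead of marking it done.
    return {"status": "needs_review", "attempts": max_attempts, "failed": failed}
```

A task is marked complete only when every check passes. Otherwise it is escalated with the list of checks that failed on the last attempt.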

Cost and Model Selection
The system currently uses Hermes Agent's M3 coding plan. According to the author's feedback, this plan has two advantages: stable output quality and reasonable token consumption with good cost-effectiveness. For users concerned about API costs, this is a relatively economical choice.
Understanding token consumption is crucial for evaluating the actual usage cost of such systems. When using LLM APIs, tokens are the basic billing unit (an English word averages a little over one token, and a Chinese character typically takes about 1-2 tokens, depending on the tokenizer). Multi-agent systems typically consume far more tokens than single-conversation scenarios because of multi-turn dialogues, role switching, system prompts, and context passing. Taking GPT-4-level models as an example, input tokens cost approximately $30/million and output tokens approximately $60/million. A complete build of a complex project might consume tens of thousands or even hundreds of thousands of tokens. At those rates, 10,000 tokens cost about $0.30 to $0.60 and 500,000 tokens about $15 to $30, so a build ranges from tens of cents to tens of dollars. Therefore, choosing a model plan with reasonable cost-effectiveness is essential for using such systems as everyday productivity tools.
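The arithmetic behind such an estimate is simple enough to sketch. The function name is hypothetical, and the default prices are the GPT-4-level figures quoted above, not the rates of the plan the author actually uses.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float = 30.0,   # USD, figure quoted in the text
    output_price_per_million: float = 60.0,  # USD, figure quoted in the text
) -> float:
    """Return the API cost in US dollars for one run."""
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


small_build = estimate_cost_usd(input_tokens=10_000, output_tokens=0)
large_build = estimate_cost_usd(input_tokens=200_000, output_tokens=100_000)
```

With these prices `small_build` comes to $0.30 and `large_build` to $12.00. Replacing the two price parameters with a cheaper plan's rates shows how strongly model choice drives the cost of a multi-agent build.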
Redefining the Human-AI Agent Collaboration Relationship
The deepest value of this system lies in how it redefines the relationship framework between humans and AI agents. The author offers an incisive perspective:
Most people are still controlled by agents — their time and attention are held hostage by AI tools. The real framework should be you controlling the agents.
The traditional way of using AI is "you revolving around AI" — repeatedly conversing in ChatGPT, switching between different terminals, manually organizing outputs. In this mode, users effectively take on the role of "AI administrator": you need to craft prompts, evaluate output quality, and manually integrate results into your workflow. Your attention gets fragmented across the interaction process itself, rather than focused on the goals you actually want to achieve.
This kanban system achieves the reversal: you just throw out ideas, AI revolves around your needs, and you retain decision-making and approval authority rather than doing the grunt work of execution. This model more closely resembles the relationship between a CEO and an execution team — the CEO handles strategic direction and key decisions while the team handles specific execution and delivery.
Final Thoughts: The Leap from Conversational AI to Autonomous Execution AI
This Hermes Kanban system represents an important direction in AI tool development: the leap from conversational AI to autonomous execution AI. It no longer requires you to guide AI step by step on what to do. Instead, it lets the AI agent team autonomously plan and execute while you only make decisions at critical junctures.
This leap is known in the AI industry as the shift from "Copilot mode" to "Agent mode." In Copilot mode, AI is your co-pilot and you still hold the steering wheel; in Agent mode, AI becomes the autonomous driver and you only need to tell it the destination. In 2024-2025, this transformation is accelerating — from OpenAI's Operator and Google's Project Mariner to various open-source agent frameworks (such as AutoGPT, CrewAI, LangGraph), the entire industry is exploring how to move AI from passive response to proactive execution.
One notable detail: the author himself emphasizes that he doesn't know how to code, which demonstrates that these systems are lowering technical barriers. Of course, setting up and configuring such a system still requires some learning investment, but as communities mature and tutorials improve, this barrier is rapidly decreasing.
For creators and entrepreneurs who are drowning in countless ideas every day but struggling with insufficient execution capacity, this "autonomous driving kanban" may be exactly the productivity paradigm shift they need. It liberates humans from the execution layer, letting us return to what we do best — thinking, creating, and deciding.