ABCoder in Practice: A Demonstration of Solving AI Code Hallucination

Introduction

As AI-assisted programming becomes more prevalent, "hallucination" in code generated by large language models remains a persistent pain point for developers. Hallucination refers to output that appears plausible but is inaccurate or fabricated. In code generation, it can show up as invented APIs that do not exist, real APIs called with parameters in the wrong order, or deprecated usage drawn from outdated documentation. The root cause is that an LLM generates text by probabilistic prediction rather than by truly "understanding" code semantics. The knowledge cutoff of the training data makes matters worse: for actively developed open-source projects, the model's knowledge often lags the current release by months or even years. The model confidently produces complete-looking code whose runtime behavior does not match expectations. This article compares code generated by Claude alone with code generated by Claude assisted by ABCoder, and shows how much such tooling matters for code accuracy.

Experiment Design: Implementing an SSE Service with the Hertz Framework

The demonstration task is straightforward: use the Hertz framework, open-sourced by ByteDance's CloudWeGo team, to write an HTTP server that implements the SSE (Server-Sent Events) protocol. The server receives a request containing a number, adds 1 to it, and sends the result to the client 10 times.
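To make the task concrete, here is a minimal sketch of the same logic written against Go's standard library net/http rather than Hertz. It is not the code from either round of the experiment. The query parameter name n, the route, and the optional pause between events are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"time"
)

// sseHandler parses ?n=<int>, adds 1, and streams the result 10 times.
// The parameter name "n" and the pause between events are assumptions.
func sseHandler(pause time.Duration) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		n, err := strconv.Atoi(r.URL.Query().Get("n"))
		if err != nil {
			http.Error(w, "n must be an integer", http.StatusBadRequest)
			return
		}
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		for i := 0; i < 10; i++ {
			fmt.Fprintf(w, "data: %d\n\n", n+1)
			flusher.Flush() // push each event out instead of buffering
			time.Sleep(pause)
		}
	}
}

// serve runs the handler once in memory and returns the status and body.
func serve(query string) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/sse?"+query, nil)
	sseHandler(0)(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	status, body := serve("n=41")
	fmt.Println(status)
	fmt.Print(body)
}
```

The explicit Flush after every event is the important part: without it, the response writer is free to buffer several events and send them together.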

Hertz is a high-performance Go HTTP framework. Its underlying network library is Netpoll, CloudWeGo's own epoll-based non-blocking I/O library, which offers significant throughput advantages over the standard library's net/http in high-concurrency scenarios. Hertz's SSE package is a relatively new module that provides native API wrappers such as WriteEvent, so developers do not need to handle chunked encoding, event formatting, and other low-level details by hand. Because the package was released only recently, mainstream LLMs generally lack it in their training data, which makes it the core focus of this experiment.

The experiment consists of two rounds:

Round 1: Without ABCoder, relying purely on Claude's training data to generate code
Round 2: With ABCoder's Agent mode enabled, allowing the model to proactively consult Hertz framework's actual source code

The "Hallucinated" Code from Pure Model Generation: Looks Complete but Has Hidden Pitfalls

First Impressions of the Generated Result

Without Agent mode enabled, Claude generated a seemingly complete project based on its training data. The project included both a Server and a Client, and at first glance showed no issues.

[Figure: SSE protocol description]

However, the problems are hidden precisely in the details. SSE (Server-Sent Events) is a server push technology defined in the HTML standard. It delivers a unidirectional stream of events over a single long-lived HTTP response, typically sent with chunked transfer encoding on HTTP/1.1. Unlike WebSocket's bidirectional communication, SSE is designed specifically for server-to-client push, with built-in features such as automatic reconnection and event ID tracking. The data format is simple: each event consists of field lines such as data: and is terminated by a blank line. The server therefore needs to send events one by one, and the client needs to receive and process them one by one. Any implementation that buffers the data and delivers it in bulk defeats the purpose of SSE.
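The wire format described above is small enough to sketch directly. The following standard-library Go example serializes one event. The Event type and formatEvent function are illustrative names, not part of Hertz. Multi-line payloads are split into one data: line each, which is how the format represents embedded newlines.

```go
package main

import (
	"fmt"
	"strings"
)

// Event holds the SSE fields used in this sketch (illustrative type).
type Event struct {
	ID, Name, Data string
}

// formatEvent renders an event as SSE field lines followed by a blank line.
func formatEvent(e Event) string {
	var b strings.Builder
	if e.ID != "" {
		fmt.Fprintf(&b, "id: %s\n", e.ID)
	}
	if e.Name != "" {
		fmt.Fprintf(&b, "event: %s\n", e.Name)
	}
	// Each line of the payload becomes its own data: field.
	for _, line := range strings.Split(e.Data, "\n") {
		fmt.Fprintf(&b, "data: %s\n", line)
	}
	b.WriteString("\n") // the blank line terminates the event
	return b.String()
}

func main() {
	fmt.Printf("%q\n", formatEvent(Event{ID: "1", Name: "tick", Data: "42"}))
	fmt.Printf("%q\n", formatEvent(Event{Data: "line one\nline two"}))
}
```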

Core Issue #1: Missing Latest API Support

Because Hertz's SSE package was released only recently, the model's training data does not include it. The model therefore implemented the SSE protocol from scratch and ran into serialization problems along the way. This is a classic manifestation of LLM hallucination: the code is written with an air of "knowing exactly what it's doing," but its runtime behavior does not match what the code appears to promise.

Core Issue #2: A Fatal Flaw in the Client

[Figure: Client-side problem analysis]

The more critical problem appeared on the Client side. According to correct SSE behavior, every time the Server sends an event, the Client should print the data once, achieving true streaming transmission. But in actual execution, the Client doesn't print incrementally.

The reason: the model-generated Client reads the response through the Body() interface, whose semantics are "keep reading until there's no more data." The Client therefore waits for the entire transmission to complete and then outputs everything at once, which defeats SSE's streaming design. This issue is extremely difficult to catch in code review alone; it only surfaces when you run the code and observe the timing of the output.
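The difference between the two read styles can be reproduced without any framework. The sketch below uses an in-memory io.Pipe as a stand-in for the HTTP response stream, which is an assumption made so the example runs without a network. Neither function is the Hertz client from the demo. Each client reports how many events the producer had finished sending at the moment the client could handle its first event.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
	"sync/atomic"
)

const total = 3

// startProducer writes `total` events into a pipe, one Write per event.
// A pipe Write returns only after the reader has consumed the bytes,
// so `sent` counts events that have actually been delivered.
func startProducer(sent *atomic.Int32) io.Reader {
	pr, pw := io.Pipe()
	go func() {
		defer pw.Close()
		for i := 1; i <= total; i++ {
			fmt.Fprintf(pw, "data: %d\n\n", i)
			sent.Add(1)
		}
	}()
	return pr
}

// bufferedClient mimics a read-the-whole-body call: nothing can be
// processed until the producer has closed the stream.
func bufferedClient() (sentAtFirstEvent int32, events int) {
	var sent atomic.Int32
	body, _ := io.ReadAll(startProducer(&sent))
	sentAtFirstEvent = sent.Load()
	events = strings.Count(string(body), "data: ")
	return
}

// streamingClient handles each event as soon as its line arrives.
func streamingClient() (sentAtFirstEvent int32, events int) {
	var sent atomic.Int32
	sc := bufio.NewScanner(startProducer(&sent))
	for sc.Scan() {
		if strings.HasPrefix(sc.Text(), "data: ") {
			if events == 0 {
				sentAtFirstEvent = sent.Load()
			}
			events++
		}
	}
	return
}

func main() {
	b, n := bufferedClient()
	fmt.Printf("buffered:  %d events, producer had sent %d at first event\n", n, b)
	s, m := streamingClient()
	fmt.Printf("streaming: %d events, producer had sent %d at first event\n", m, s)
}
```

Both clients receive all three events, so their final output is identical. Only the buffered client has to wait for the producer to finish before it sees anything, which is why this kind of bug is visible at runtime but not in a diff.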

It looks like a working demo, but when you actually debug it, you'll find it's not real SSE—it's unusable.

Correct Implementation with ABCoder

Configuration and Startup Steps

ABCoder is built on the MCP (Model Context Protocol). MCP is a standardized protocol open-sourced by Anthropic in late 2024, designed to solve integration issues between AI models and external tools/data sources—by defining a unified client-server communication specification, it enables models to invoke external resources like file systems, databases, and code repositories in a standardized way, without requiring custom integration solutions for each tool.
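As a rough illustration of what "standardized" means here: MCP messages are JSON-RPC 2.0, and a client invokes a server-side tool with the tools/call method. The Go sketch below only builds such a request. The tool name lookup_package and its argument keys are placeholders invented for this example, not ABCoder's documented tool interface.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallParams is the params object of an MCP tools/call request.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// buildToolCall serializes a tools/call request for the given tool.
func buildToolCall(id int, tool string, args map[string]any) (string, error) {
	req := rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	}
	b, err := json.Marshal(req)
	return string(b), err
}

func main() {
	// Placeholder tool name and arguments, for illustration only.
	msg, err := buildToolCall(1, "lookup_package", map[string]any{
		"repo":    "hertz",
		"keyword": "sse",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(msg)
}
```

Because every tool is reached through the same envelope, an agent can talk to a code-retrieval server, a terminal, or a database without a bespoke integration for each one.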

Using ABCoder requires a few steps:

After installing ABCoder locally, start the service with its MCP command, specifying the location of the AST files
Configure the Agent by providing the model with the officially recommended prompt, telling it how to correctly use ABCoder
Enable ABCoder and Terminal in the tool selection, but deliberately disable Web Search to prevent the model from "cheating" by directly consulting Hertz documentation

[Figure: Agent configuration interface]

Deliberately disabling Web Search is a controlled-variable experimental design—ensuring the model's accuracy comes from code repository retrieval rather than documentation crawling, more authentically reflecting ABCoder's core value.

Intelligent Code Retrieval Process

With Agent mode enabled, the model enters an autonomous planning and tool-calling workflow: decomposing the task into multiple steps, deciding at each step whether to invoke external tools, and dynamically adjusting subsequent actions based on tool return results. The model's behavior undergoes a qualitative change—instead of fabricating from thin air, it proactively calls ABCoder tools to consult Hertz framework's actual source code:

Locating the framework: The model first finds the Hertz project through ABCoder, obtaining the complete file list
Discovering key packages: Accurately locating SSE-related packages
Consulting APIs: Examining specific nodes and interface definitions in the SSE package
Referencing examples: The most critical step—the model also proactively searches for example and test files

[Figure: ABCoder finding the key SSE package]

This behavior is particularly noteworthy. Many frameworks' best practices and usage examples are hidden in test files. The model's ability to recognize this and proactively search for them demonstrates that the Agent's prompt design provides effective guidance.

Correct Implementation Results

Based on real API information obtained from the source code, the model generated correct code:

Server side: Uses Hertz's native WriteEvent interface to send SSE events, without manually handling chunked encoding and event formatting
Client side: Uses Hertz's native SSE Reader with the ForEach method to control the SSE lifecycle, achieving true per-event processing

Runtime verification: events are indeed sent one by one—this is the correct behavior for HTTP SSE. The entire implementation follows Hertz's officially recommended best practices.
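To show what per-event processing with a callback looks like in general, here is a standard-library analogue. It is not Hertz's SSE Reader or its ForEach method, whose exact signatures are not shown in this article. The names forEach, errStop, and collect are made up for the sketch, and it simplifies the format by resetting the event ID after every event.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// Event holds the SSE fields used in this sketch (illustrative type).
type Event struct {
	ID, Name, Data string
}

// errStop lets a callback end the loop early without reporting a failure.
var errStop = errors.New("stop")

// forEach reads r line by line and calls fn as soon as the blank line
// that terminates an event arrives, so events are handled one at a time.
func forEach(r io.Reader, fn func(Event) error) error {
	sc := bufio.NewScanner(r)
	var ev Event
	var data []string
	for sc.Scan() {
		line := sc.Text()
		switch {
		case line == "":
			if len(data) > 0 {
				ev.Data = strings.Join(data, "\n")
				if err := fn(ev); err != nil {
					if errors.Is(err, errStop) {
						return nil
					}
					return err
				}
			}
			ev, data = Event{}, nil
		case strings.HasPrefix(line, ":"):
			// Comment line (often used as a keep-alive); ignored.
		default:
			field, value, _ := strings.Cut(line, ":")
			value = strings.TrimPrefix(value, " ")
			switch field {
			case "id":
				ev.ID = value
			case "event":
				ev.Name = value
			case "data":
				data = append(data, value)
			}
		}
	}
	return sc.Err()
}

// collect gathers event payloads and stops once `limit` events were seen.
func collect(stream string, limit int) ([]string, error) {
	var got []string
	err := forEach(strings.NewReader(stream), func(e Event) error {
		got = append(got, e.Data)
		if len(got) == limit {
			return errStop
		}
		return nil
	})
	return got, err
}

func main() {
	stream := ": keep-alive\n\ndata: 1\n\nid: 7\ndata: 2\n\ndata: 3\n\n"
	got, err := collect(stream, 2)
	fmt.Println(got, err)
}
```

Letting the callback's return value decide whether reading continues is what gives the caller control over the stream's lifecycle, in contrast to a call that only returns once the body is complete.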

Deep Analysis: What Problem Does ABCoder Solve?

The Fundamental Contradiction of Training Data Timeliness

LLM training data has a cutoff date. For rapidly iterating open-source frameworks, newly released APIs and best practices are simply not available to the model. This contradiction is particularly acute in active ecosystems: a framework might release an entirely new core module months after the model's training cutoff, and the model will be completely unaware of it. Through MCP, ABCoder gives the model real-time access to the latest code repositories, which directly addresses the timeliness problem and extends the model's "knowledge boundary" from the training cutoff date to the repository's latest commit.

A Paradigm Shift from "Guessing" to "Verifying"

Without ABCoder, the model can only "guess" API usage based on training data—guesses that often appear reasonable but fail in the details. With ABCoder, the model's workflow becomes: first consult the real code, then generate implementations based on accurate information. This is an essential shift from "creation" to "engineering"—the former relies on the model's statistical patterns, while the latter relies on real technical facts. This shift closely mirrors how human developers work: no one writes completely correct framework invocation code from memory—consulting documentation and reviewing examples is the norm in engineering practice.

Practical Implications for Developers

Don't blindly trust AI-generated code: Even if it looks complete and correct, verify core logic—especially when dealing with newer frameworks or recently updated APIs
Leverage tools to enhance model capabilities: Protocols like MCP enable models to access real-time information, significantly improving code quality; the selection and configuration of tools is itself a demonstration of engineering capability
Testing is key to verification: The demonstration revealed streaming transmission issues through actual execution—something pure code review would struggle to catch. Behavioral correctness must be verified through runtime observation

Conclusion

The ABCoder demonstration clearly illustrates where AI programming tools are heading: not toward making models "smarter" at guessing, but toward giving them the ability to obtain accurate information. When AI can consult documentation, read source code, and reference examples just like human developers, the quality of generated code will improve dramatically. MCP provides standardized infrastructure for this capability, allowing code retrieval, terminal execution, database queries, and other tools to plug into the model's reasoning pipeline in a unified manner. This also signals that the core competitiveness of future AI programming assistants lies not only in the model's inherent capabilities, but more importantly in the depth of its connection to the real code world.

Key Takeaways

Pure LLM code generation suffers from "hallucination"—the SSE implementation appears correct but the Client actually reads all data at once rather than streaming
ABCoder gives models real-time access to framework source code through MCP, directly addressing the staleness of training data
With Agent mode enabled, the model proactively consults source code and searches for example and test files, shifting behavior from "guessing" to "verifying"
Code generated with ABCoder correctly uses Hertz's native SSE APIs, achieving true streaming event transmission
The core competitiveness of AI programming tools lies not only in model capability, but in the depth of connection to the real code world