AI Face Masking Tool Test: Doubao Generates Working Code on First Try, GPT Fails After Multiple Debugging Rounds

AI Programming Showdown: Building a Face Masking Tool

When we talk about AI programming assistants, ChatGPT is usually the first name that comes to mind. But in real-world projects, different AI tools can perform vastly differently. A Bilibili creator shared a real hands-on case — using AI to generate complete code for a face-tracking masking tool. The result? Doubao's performance far exceeded GPT's, sparking heated discussion about the programming capabilities of Chinese-made AI.

GPT Stumbles: Multiple Rounds of Dialogue Still Can't Solve the Face Masking Problem

Technical Approach Selection

Based on past experience, the creator first chose GPT to generate the face masking tool code. GPT proposed a technical solution based on OpenCV + MediaPipe + FFmpeg:

OpenCV handles video frame reading and processing
MediaPipe handles face detection and tracking
FFmpeg handles video encoding and decoding

The tech stack choice itself was sound; it is a commonly used computer vision pipeline in the industry. OpenCV (Open Source Computer Vision Library) is an open-source computer vision library initiated by Intel that provides hundreds of basic image operations, including image reading, color space conversion, and geometric transformations. MediaPipe is a cross-platform machine learning framework developed by Google. Its face detection module is based on the lightweight BlazeFace neural network, runs in real time on ordinary hardware, and outputs a bounding box plus six facial keypoints for each face; the denser 468-point landmark set comes from the separate Face Mesh module. The bounding box is enough to locate the position and size of each face region. FFmpeg is the "Swiss Army knife" of audio/video processing, handling demuxing, decoding, encoding, and remuxing for virtually all mainstream audio/video formats.
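To make the detection step concrete, here is a minimal sketch of how a frame read by OpenCV might be passed to MediaPipe and turned into pixel coordinates. It is an illustration, not the creator's code. It assumes the legacy `mp.solutions.face_detection` API (newer MediaPipe releases favor the Tasks API), and the helper names `relative_box_to_pixels` and `detect_faces` are hypothetical.

```python
def relative_box_to_pixels(xmin, ymin, width, height, frame_w, frame_h):
    """Convert a relative box (fractions of the frame size) to pixels (x, y, w, h)."""
    return (int(xmin * frame_w), int(ymin * frame_h),
            int(width * frame_w), int(height * frame_h))


def detect_faces(frame_bgr, detector):
    """Return one pixel box per face found in a single OpenCV (BGR) frame.

    Assumption: `detector` is an mp.solutions.face_detection.FaceDetection
    instance from the legacy MediaPipe Solutions API.
    """
    import cv2  # imported lazily so the pure helper above works without OpenCV

    frame_h, frame_w = frame_bgr.shape[:2]
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB input
    results = detector.process(rgb)

    boxes = []
    for detection in results.detections or []:  # detections is None when no face is found
        rel = detection.location_data.relative_bounding_box
        boxes.append(relative_box_to_pixels(
            rel.xmin, rel.ymin, rel.width, rel.height, frame_w, frame_h))
    return boxes
```

A detector could be created with `mp.solutions.face_detection.FaceDetection(model_selection=1, min_detection_confidence=0.5)`, where both parameter values are illustrative. Boxes near the frame border can extend past the edge, so they should be clipped before anything is drawn into them.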

Runtime Failure

The problem lay in the implementation. After multiple rounds of debugging through dialogue, GPT's generated code still failed to complete the face masking function correctly. The creator repeatedly revised the prompts and fed error messages back to GPT for fixes, but the faces were never successfully masked.

This points to a common weakness of GPT in complex engineering code generation: it can propose a reasonable-looking architecture but tends to slip on implementation details, especially when several libraries have to work together and version compatibility and API call details matter. This particular stack has several hidden pitfalls. OpenCV reads frames in BGR channel order while MediaPipe expects RGB input, so forgetting the conversion can degrade or break detection. The codec settings used by OpenCV's VideoWriter and by FFmpeg have to agree, or the output may fail to encode or play. And MediaPipe's API has changed in incompatible ways between versions. Large language models predict tokens from a probability distribution learned from training data that mixes code written against many different library versions. A model therefore struggles to tell which API call patterns match the versions actually installed, and can produce code that is syntactically correct but crashes at runtime because of a version mismatch.
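One inexpensive way to narrow down version-related failures is to record which library versions are actually installed and include them in the prompt alongside the error message. The following is a minimal sketch using only the standard library. The function name `installed_versions` is hypothetical, and the package names `opencv-python` and `mediapipe` assume the libraries were installed from PyPI under those names.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


def describe_environment(packages):
    """Format the versions as one line per package, ready to paste into a prompt."""
    lines = []
    for name, version in installed_versions(packages).items():
        lines.append(f"{name}=={version}" if version else f"{name}: not installed")
    return "\n".join(lines)
```

Calling `describe_environment(["opencv-python", "mediapipe"])` yields a short text block that tells the model which API generation to target, instead of leaving it to guess.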

Doubao AI Programming: Generated Code Runs Successfully on First Try

Code Quality Comparison

After switching to Doubao, things changed dramatically. The creator said it was "unexpected" — every piece of code Doubao generated could run directly, with correct results. Specifically:

High code completeness: No need to supplement missing imports or configurations
Strong logical correctness: Face detection and masking logic worked correctly on the first attempt
Proper dependency handling: No conflicts in library versions or calling methods

This "out-of-the-box" experience matters most for non-professional programmers. There is often a large gap between "looks correct" and "runs correctly": implicit type conversions, platform-specific file path differences, image channel ordering, and async call timing can all make code fail at runtime. Doubao's result in this case suggests that its generated code accounted for engineering-level runnability as well as logical correctness, which lowers the barrier to AI-assisted programming and lets more people complete real projects with AI help.

Face Tracking Masking Tool Usage Tutorial

Installation and Configuration

Based on the core code generated by Doubao, the creator packaged the tool into a ready-to-use desktop application. Usage steps:

Extract the archive (the path must not contain Chinese characters, spaces, or other special characters)
Double-click the "Face Tracking Masking" executable to launch the program

The restriction on Chinese characters in the path is a common issue with Python packaging tools such as PyInstaller. PyInstaller bundles the Python interpreter and all dependencies together; in one-file mode the executable unpacks itself into a temporary directory at runtime. If a path involved contains non-ASCII characters, some underlying C libraries may fail to parse it.
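A packaged tool can check for this at startup and warn the user instead of failing later with an obscure error. The sketch below is only an illustration of such a check, not part of the creator's application. The function name `path_problems` is hypothetical, and it covers only the two cases that are specified precisely above (non-ASCII characters and spaces).

```python
def path_problems(path):
    """Return a list of reasons why `path` may break a packaged Python app."""
    problems = []
    if not path.isascii():
        problems.append("non-ASCII characters")  # e.g. Chinese folder names
    if " " in path:
        problems.append("spaces")
    return problems


def format_warning(path):
    """Build a user-facing message, or return None when the path looks safe."""
    problems = path_problems(path)
    if not problems:
        return None
    return f"Install path {path!r} contains " + " and ".join(problems)
```

In a real application the path to check would typically be the folder containing the executable, and the warning would be shown in a dialog before any processing starts.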

Operation Flow

After launching, follow the interface prompts in order:

Select video to process: Choose the source video that needs face masking
Select emoji/sticker: Choose the image material to use for covering faces
Select save location: Specify the output video storage path
Click Start Processing: Note that you only need to click once — don't click repeatedly

A popup notification appears when processing is complete; click it to open the save folder and view the output video. The whole process requires no programming knowledge.

Under the hood, the tool reads the video frame by frame, runs MediaPipe face detection on each frame to obtain bounding box coordinates, scales the sticker image to the size of each box, overlays it at the face position, and finally re-encodes the processed frames into a video file.
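The frame-by-frame logic described here can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code Doubao generated: the function names are hypothetical, `detect` is assumed to be a callable that returns pixel boxes `(x, y, w, h)` for a BGR frame, the sticker is assumed to be a 3- or 4-channel image, and the `mp4v` codec and 25 fps fallback are arbitrary choices.

```python
def visible_region(box, frame_w, frame_h):
    """Clip a pixel box (x, y, w, h) to the frame; return (x0, y0, x1, y1) or None."""
    x, y, w, h = box
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, frame_w), min(y + h, frame_h)
    if x1 <= x0 or y1 <= y0:
        return None  # box is empty or entirely outside the frame
    return x0, y0, x1, y1


def overlay_sticker(frame, sticker, box):
    """Scale `sticker` to the box size and paste it onto `frame` in place."""
    import cv2  # lazy import keeps visible_region usable without OpenCV

    region = visible_region(box, frame.shape[1], frame.shape[0])
    if region is None:
        return
    x, y, w, h = box
    x0, y0, x1, y1 = region
    resized = cv2.resize(sticker, (w, h))
    patch = resized[y0 - y:y1 - y, x0 - x:x1 - x]  # part of the sticker inside the frame
    if patch.shape[2] == 4:  # sticker has an alpha channel: blend with the background
        alpha = patch[:, :, 3:4].astype("float32") / 255.0
        roi = frame[y0:y1, x0:x1].astype("float32")
        blended = alpha * patch[:, :, :3] + (1.0 - alpha) * roi
        frame[y0:y1, x0:x1] = blended.astype("uint8")
    else:  # opaque sticker: overwrite the region
        frame[y0:y1, x0:x1] = patch[:, :, :3]


def process_video(src_path, sticker_path, dst_path, detect):
    """Read src_path frame by frame, cover every detected face, write dst_path."""
    import cv2

    sticker = cv2.imread(sticker_path, cv2.IMREAD_UNCHANGED)  # keep alpha if present
    if sticker is None:
        raise FileNotFoundError(sticker_path)
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0  # fallback if no frame rate is reported
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # end of video or read error
                break
            for box in detect(frame):
                overlay_sticker(frame, sticker, box)
            writer.write(frame)
    finally:
        cap.release()
        writer.release()
```

Note that OpenCV's `VideoWriter` writes video frames only, so the original audio track would have to be copied back in a separate FFmpeg step that is not shown here.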

Why Can Chinese AI Programming Tools Outperform GPT?

Possible Reasons for Doubao's Better Performance

Although this is a comparison in a single scenario, a few factors may explain the result:

More precise Chinese context understanding: Doubao's grasp of requirement descriptions written in Chinese reduces "translation loss." In programming, how accurately a requirement is communicated directly affects the quality of the generated code, and Chinese-native models may have an advantage in picking up implicit information, contextual dependencies, and idiomatic phrasing in Chinese descriptions.
Engineering practice orientation: In this case Doubao's output appeared to prioritize runnability during code generation, not just logical correctness.
Domain-specific optimization: In popular fields such as computer vision, Chinese models may have accumulated more high-quality training data. The Chinese developer community has produced a large volume of tutorials and projects in these fields that include complete runtime environment descriptions, which is valuable for training models to generate "code that actually runs."

Implications for Developers and Regular Users

For users who want to leverage AI for programming, the takeaways from this case are:

Don't blindly trust a single AI tool — try different options
Chinese AI tools can already outperform GPT in specific scenarios
AI programming is enabling more non-professionals to create practical tools

Of course, a single case doesn't represent the whole picture, and GPT may still have advantages in other scenarios. The key is choosing the right tool based on specific needs rather than blindly following trends.

Conclusion

The development of this face-tracking masking tool illustrates how differently AI programming assistants can perform on the same real task. Doubao delivered "first-try success" code generation here, while GPT got stuck in multiple rounds of debugging. As Chinese AI models continue to evolve, choosing the right tool for the specific task matters more in AI-assisted programming than blindly following brand names.