Cursor Fails at UI Design Reproduction: The Real Capability Boundaries of AI Coding

The Huge Gap Between Expectations and Reality

Recently, a developer on Bilibili shared their painful experience using Cursor to reproduce a UI design mockup, striking a chord with many fellow developers. They gave Cursor a carefully crafted reference image of a client app's UI, hoping the AI would reproduce it efficiently. Instead, the generated interface was miles away from the original: the coding equivalent of "expectation vs. reality."

Cursor is one of the most talked-about AI coding tools right now. It is a fork of VS Code that integrates large language models such as Claude and GPT-4. Developers can describe requirements in natural language, or even provide screenshots, and have the AI generate or modify code. Its core workflow is to send the user's instructions plus context (existing code, project structure, reference images, and so on) to a large model, which then generates the corresponding code. This approach excels at programming tasks with clear logic and well-defined patterns, but when a task depends on fine visual judgment, the model's "understanding" is still fundamentally pattern matching over training data rather than genuine visual perception.

This developer didn't give up easily. They revised their prompts many times, trying instructions like "pixel-perfect reproduction" and "strict reference," but the results remained disappointing. The gap wasn't something minor tweaks could fix; the whole "feel" was fundamentally off.

Why Cursor Failed at UI Design Reproduction

Standard Interfaces Are Fine, But Creative Designs Fall Apart

The developer also ran comparison tests: for standard application interfaces — neat buttons, common layouts, system fonts — Cursor handled them perfectly well. But their UI design featured extensive hand-drawn styles, custom fonts, and numerous sliced image assets. These elements are nearly impossible to reproduce with pure code generation in a way that captures the "feel" the designer intended.

This actually reveals a core limitation of current AI coding tools: AI excels at handling structured, pattern-based content, but lacks genuine "aesthetic understanding" when it comes to highly customized visual design.

From a technical perspective, this limitation has deep-rooted causes. While current multimodal large models (such as GPT-4o and Claude 3.5 Sonnet) have the ability to "see" images, the way they understand images is fundamentally different from humans. The model encodes images into a series of visual tokens and then processes them at a semantic level — it can recognize "this is a button" or "there's some text here," but struggles to precisely perceive pixel-level spacing between elements, subtle color gradients, the visual weight of fonts, and the overall "atmosphere" of a design. More critically, there's a massive "translation gap" between visual understanding and code generation: even if the model correctly understands the design intent, translating it into precise CSS property values (specific margin, padding, and line-height numbers, for example) remains an extremely challenging task. Human front-end developers rely on years of accumulated visual-to-code mapping experience in this process — tacit knowledge that models currently struggle to fully acquire.
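To make the pixel-precision point concrete, here is a rough back-of-the-envelope sketch. It assumes a ViT-style encoder that resizes the image to a fixed square and cuts it into fixed-size patches, one visual token per patch. The 448 px input size, the 14 px patch size, and the 1170 x 2532 screenshot are illustrative assumptions, not the documented behavior of GPT-4o or Claude, which use their own tiling schemes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderConfig:
    """Hypothetical ViT-style settings; real models use their own values."""
    input_size: int = 448  # side of the square the image is resized to (assumed)
    patch_size: int = 14   # side of each square patch, in resized pixels (assumed)


def visual_token_count(cfg: EncoderConfig) -> int:
    """Number of patch tokens produced for one resized image."""
    patches_per_side = cfg.input_size // cfg.patch_size
    return patches_per_side * patches_per_side


def source_pixels_per_patch(width: int, height: int, cfg: EncoderConfig) -> tuple[float, float]:
    """How many pixels of the original screenshot one patch spans, per axis."""
    patches_per_side = cfg.input_size // cfg.patch_size
    return width / patches_per_side, height / patches_per_side


cfg = EncoderConfig()
tokens = visual_token_count(cfg)
span_x, span_y = source_pixels_per_patch(1170, 2532, cfg)
print(f"{tokens} tokens; each covers about {span_x:.1f} x {span_y:.1f} source pixels")
```

Under these assumptions a single token summarizes a region dozens of pixels wide, so a 4 px difference in padding is far smaller than one patch. That is one way to see why spacing and font weight are hard to read off a screenshot.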

Manually Slicing Images for AI? Less Efficient Than Writing Code by Hand

Facing this dilemma, one seemingly viable approach was to manually slice each image from the design mockup and individually tell the AI where each image goes and its dimensions. But the developer quickly realized this was even less efficient than writing the code by hand. AI is supposed to boost productivity — if the prep work required to use it is more tedious than manual development, it completely defeats the purpose.

This problem is often described as the "hidden cost of AI tools." Many developers, when first using AI coding tools, focus only on the speed of code generation while overlooking the time consumed by prompt debugging, result verification, and manual corrections. A 2024 study attributed to Stanford University reportedly found that for complex tasks, developers using AI tools sometimes spent more total time than with traditional development methods, precisely because of these accumulated hidden costs. For highly visual tasks like UI reproduction, hidden costs are especially pronounced: you need to constantly screenshot, annotate, describe, compare, and correct, a cycle that can be more draining than just writing CSS directly.

The Figma-to-Code Route Doesn't Work Either

Refusing to give up, the developer explored Figma-to-code workflows, hoping to convert the design mockup to code first and then feed it to AI for optimization. In theory, this approach should improve reproduction accuracy, but in practice it was riddled with problems:

The ChatGPT + Figma MCP solutions recommended online kept requiring paid memberships
Even with paid access, the generation accuracy was mediocre at best
The output code was nowhere near production-ready and required extensive manual modification

Some technical background on the Figma MCP approach is worth explaining here. Figma is currently the most widely used collaborative design tool in the industry, and it offers a rich API that lets third-party tools read layer structures, style properties, component relationships, and other information from design files. MCP (Model Context Protocol) is an open protocol released by Anthropic in late 2024, designed to let AI models connect directly to external data sources and tools. Through Figma MCP, AI can in theory read structured data from design files directly (rather than merely "looking" at a screenshot) and so obtain more precise design information. In practice, however, real Figma files are more complex than the model can handle reliably: the auto-layout rules, component variants, and responsive constraints used by designers require extensive contextual judgment when converted to front-end code. Moreover, designers organize their layers in very different ways, which significantly undermines the reliability of automated conversion.
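The easy half of that conversion can be sketched in a few lines. The field names below (layoutMode, itemSpacing, paddingTop, primaryAxisAlignItems, and so on) are modeled on the auto-layout properties Figma exposes on frame nodes, but treat them and the sample node as assumptions for illustration; this is not the output of any particular MCP server.

```python
# Field names are modeled on Figma auto-layout properties and are assumptions
# for illustration; the sample node at the bottom is hypothetical.
JUSTIFY = {"MIN": "flex-start", "CENTER": "center", "MAX": "flex-end",
           "SPACE_BETWEEN": "space-between"}
ALIGN = {"MIN": "flex-start", "CENTER": "center", "MAX": "flex-end",
         "BASELINE": "baseline"}
PADDING_KEYS = ("paddingTop", "paddingRight", "paddingBottom", "paddingLeft")


def auto_layout_to_css(node: dict) -> dict[str, str]:
    """Map one auto-layout frame to flexbox declarations."""
    mode = node.get("layoutMode", "NONE")
    if mode == "NONE":
        # Without auto-layout the children carry only absolute coordinates,
        # so no flex rules can be read off directly.
        return {"position": "relative"}
    return {
        "display": "flex",
        "flex-direction": "row" if mode == "HORIZONTAL" else "column",
        "gap": f"{node.get('itemSpacing', 0)}px",
        "padding": " ".join(f"{node.get(key, 0)}px" for key in PADDING_KEYS),
        "justify-content": JUSTIFY[node.get("primaryAxisAlignItems", "MIN")],
        "align-items": ALIGN[node.get("counterAxisAlignItems", "MIN")],
    }


toolbar = {
    "name": "toolbar", "layoutMode": "HORIZONTAL", "itemSpacing": 12,
    "paddingTop": 8, "paddingRight": 16, "paddingBottom": 8, "paddingLeft": 16,
    "primaryAxisAlignItems": "SPACE_BETWEEN", "counterAxisAlignItems": "CENTER",
}
css = auto_layout_to_css(toolbar)
for prop, value in css.items():
    print(f"{prop}: {value};")
```

A frame that uses auto-layout maps cleanly. The trouble starts with the "NONE" branch: a hand-arranged frame gives the converter nothing but coordinates, which is where the contextual judgment described above comes in.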

After going around in circles, the developer ultimately decided to go back to writing code by hand.

Where Are the Capability Boundaries of AI Coding Tools?

What AI Coding Does Well

Current AI coding tools (Cursor, Copilot, etc.) perform well in the following scenarios:

Rapid generation of standardized UI components
Writing and refactoring business logic code
Backend tasks like API integration and data processing
Building interfaces based on mature UI frameworks (Material Design, Ant Design, etc.)

These scenarios share a common characteristic: they all have vast amounts of public code as training data support. Take Ant Design as an example — there are hundreds of thousands of projects on GitHub using this framework, and AI models have already "seen" massive amounts of Ant Design code patterns during training, enabling them to generate standards-compliant component code with high accuracy. The same goes for Material Design: Google's design specification documentation is thorough and well-structured, making it easy for models to learn the rules. In other words, AI performs well in these scenarios essentially because of "abundant training data + highly regularized patterns."

What AI Coding Struggles With

But in the following scenarios, AI still falls short:

Reproducing highly customized visual designs
Creative interfaces with extensive hand-drawn elements and custom fonts
Complex layouts requiring precise pixel alignment
Decorative elements in design mockups that rely on sliced image assets

Looking at the Design-to-Code niche specifically, the industry has been exploring this for years without producing a truly satisfying solution. Early tools like Zeplin and Avocode primarily provided design annotations and CSS property extraction but didn't generate complete code. Later tools like Anima and Locofy attempted to generate usable front-end code directly from Figma/Sketch, but output quality was inconsistent and typically only suitable for simple pages. Even AI-native products like Vercel's v0, while impressive at generating UI from descriptions, still show obvious gaps when it comes to precisely reproducing existing design mockups. The core challenge in this field is that design tools and front-end code are two entirely different expression systems — designers think in terms of absolute positioning and visual alignment, while front-end developers build with Flexbox, Grid, and responsive logic. The "semantic gap" between the two has yet to be perfectly bridged.
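A small sketch shows what bridging that gap involves in the simplest case: deciding whether a set of absolutely positioned boxes can be expressed as a flex row with a single gap value. The Box type, the one-pixel tolerance, and the sample coordinates are all hypothetical; real converters have to handle nesting, wrapping, and overlap as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Absolute position and size of one element, in design pixels."""
    x: int
    y: int
    width: int
    height: int


def infer_row_gap(boxes: list[Box], tolerance: int = 1) -> Optional[int]:
    """Return the uniform gap if the boxes form an evenly spaced row, else None."""
    if len(boxes) < 2:
        return None
    ordered = sorted(boxes, key=lambda b: b.x)
    # A row requires all boxes to share (roughly) the same top edge.
    if any(abs(b.y - ordered[0].y) > tolerance for b in ordered):
        return None
    gaps = [nxt.x - (cur.x + cur.width) for cur, nxt in zip(ordered, ordered[1:])]
    # Overlapping or unevenly spaced boxes do not map to a single gap value.
    if min(gaps) < 0 or max(gaps) - min(gaps) > tolerance:
        return None
    return gaps[0]


# Grid-aligned buttons, as in a standard interface.
aligned = [Box(16, 40, 80, 32), Box(108, 40, 80, 32), Box(200, 40, 80, 32)]
# Hand-placed decorative elements, as in a hand-drawn design.
hand_placed = [Box(16, 40, 80, 32), Box(103, 47, 91, 30), Box(221, 38, 70, 36)]

print(infer_row_gap(aligned))
print(infer_row_gap(hand_placed))
```

The aligned buttons reduce to a flex row with a 12 px gap, while the hand-placed elements yield None and would have to fall back to absolute positioning. This mirrors the split between standard interfaces and creative designs described earlier.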

Practical Takeaways for Front-End Developers

The lesson from this case is clear: Don't get swept up by the narrative that "AI will replace programmers." AI is a powerful assistive tool, but it has clear capability boundaries.

For front-end developers, a more pragmatic approach might be:

Layer your workflow: Let AI handle structure and logic code, while fine-tuning visual details manually
Build a component library: Pre-package special design elements to reduce the complexity AI needs to "understand"
Set reasonable expectations: For highly creative UI, treat AI as a starting point rather than the final solution
Wait for tools to evolve: As multimodal capabilities improve, AI's understanding of visual design will continue to get better

The fourth point is worth elaborating. Multimodal AI is iterating rapidly, and several directions may significantly improve AI's UI reproduction capabilities within the next 1-2 years. First is improved visual understanding precision: next-generation multimodal models are being built with higher-resolution visual encoders capable of capturing finer-grained visual features. Second is specialized training on "design semantics": teams are already building large-scale "design mockup-to-code" paired datasets specifically for training models to understand the mapping between design intent and code implementation. Additionally, the maturation of Agent architectures brings new possibilities: AI no longer has to generate code in a single shot but can, like a human developer, generate code, preview it in a browser, compare it against the design mockup, automatically identify discrepancies, and iterate on corrections. This "generate-verify-correct" closed-loop workflow has the potential to dramatically improve reproduction accuracy. That said, even with these advances, human aesthetic judgment will remain indispensable for truly artistic and creative designs for the foreseeable future.
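The "generate-verify-correct" loop can be sketched with toy data. Everything here is a stand-in: images are small grids of grayscale values rather than real screenshots, and fix_first_bad_row is a hypothetical corrector that simply copies pixels from the target, whereas a real agent would edit code and re-render in a browser. Only the control flow is the point.

```python
from typing import Callable

Image = list[list[int]]  # grayscale pixels, a stand-in for a real screenshot


def mismatch_ratio(a: Image, b: Image) -> float:
    """Fraction of pixels that differ between two same-sized images."""
    total = sum(len(row) for row in a)
    diff = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return diff / total


def refine(target: Image, first_try: Image,
           correct: Callable[[Image, Image], Image],
           threshold: float = 0.01, max_rounds: int = 5) -> tuple[Image, int]:
    """Verify and correct until the render is close enough or rounds run out."""
    render, rounds = first_try, 0
    while mismatch_ratio(target, render) > threshold and rounds < max_rounds:
        render = correct(target, render)
        rounds += 1
    return render, rounds


def fix_first_bad_row(target: Image, render: Image) -> Image:
    """Toy corrector: copies the first mismatching row from the target."""
    fixed = [row[:] for row in render]
    for i, (t_row, r_row) in enumerate(zip(target, fixed)):
        if t_row != r_row:
            fixed[i] = t_row[:]
            break
    return fixed


mockup = [[0, 0, 0, 0], [255, 255, 255, 255], [0, 128, 128, 0]]
draft = [[0, 0, 0, 0], [255, 200, 255, 255], [0, 0, 0, 0]]
final, rounds = refine(mockup, draft, fix_first_bad_row)
print(rounds, mismatch_ratio(mockup, final))
```

The max_rounds cap matters in practice: without it, a corrector that never converges would loop forever, and each round of a real agent costs both time and model calls.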

Final Thoughts

"They say anyone still writing code by hand is already behind the times" — this statement may hold true in certain contexts, but it's far from a universal truth. AI coding tools are indeed advancing rapidly, but in the domain of creative design reproduction, human developers' experience and aesthetic judgment remain irreplaceable.

Rather than worrying about whether AI will replace you, it's better to understand its capability boundaries and use it where it truly boosts efficiency. After all, the value of a tool lies in serving people — not in making people serve the tool.