Deep Dive into Claude Sonnet 4: Replicating Lovable with Just Two Prompts

Claude Sonnet 4 replicates Lovable in two prompts, generates McKinsey reports, and reshapes AI development.
A hands-on deep dive into Claude Sonnet 4 showcases its extraordinary capabilities: replicating Lovable's full interface with just two prompts, generating 23-page McKinsey-grade research reports, and building 2D games in 45 seconds. The article explores the emerging AI Agent "building block economy" with tools like Convex, Supabase, and Daytona, and previews how agentic payments will enable fully autonomous AI development workflows.
Claude Sonnet 4: A Demonstration of the Most Powerful Model Available
On June 9, 2025, Anthropic released its new model, Claude Sonnet 4. Positioned above the previous Opus series, the company describes it as an entirely new class within the Sonnet lineup — bigger, more expensive, but operating on a completely different level of capability. After three consecutive days of intensive use, one content creator gave it an exceptionally high rating: "For 99.9% of tasks, it's the best model in the world."
Anthropic's model naming convention draws from musical terminology: Haiku represents lightweight speed, Sonnet represents balanced utility, and Opus represents maximum capability. This naming logic was established in early 2024, with version numbers distinguishing iterations within each series. From the mid-2024 lineup of Opus, Haiku, and Sonnet across three tiers, to Claude Sonnet 3.5 which sparked the Vibe Coding wave, to the massive leap of Opus 4.5 at the end of last year, today's Sonnet 4 is an entirely new species.
Vibe Coding is a concept coined in 2024 by former OpenAI researcher Andrej Karpathy. It refers to developers no longer writing code line by line, but instead describing requirements in natural language and letting AI models generate complete code implementations. The developer's role shifts from "person who writes code" to "person who describes intent and reviews results." Claude Sonnet 3.5, with its outstanding code generation capabilities, became the core driver of this wave, fueling the explosive growth of AI coding tools like Cursor and Windsurf. Now, Sonnet 4 takes code tooling capabilities even further — it can work continuously for extended periods, with significant improvements in spatial reasoning and visual understanding.
That said, the model also includes safety measures. When topics involve frontier LLM research, pre-training processes, or distributed training architectures, the model automatically downgrades or restricts its capabilities. This has drawn community frustration, but for the vast majority of development scenarios, Sonnet 4 remains an unmatched choice.
Test 1: Replicating Lovable's Full-Featured Interface with Two Prompts
The creator completed an astonishing challenge with Claude Sonnet 4 — replicating Lovable's mobile interface and functionality using just two prompts.
Lovable (formerly GPT Engineer) is an AI-powered full-stack application building platform where users simply describe the app they want in natural language, and the platform automatically generates front-end and back-end code and deploys it. It gained massive attention in 2024 and is seen as the AI evolution of the "no-code/low-code" movement, primarily targeting non-technical users and rapid prototyping scenarios, with support for automatic integration of React front-ends and Supabase back-ends. Choosing to replicate Lovable as a test was deliberate — since Lovable itself is a benchmark AI development tool, using AI to replicate an AI tool is inherently compelling.

The process was remarkably simple: the first prompt was a screenshot of Lovable's interface with the instruction "redesign this to match this appearance"; the second prompt was "build a note-taking app like Notion with dark mode." The result not only closely matched the original appearance but actually surpassed the original in functionality — the replica supported direct editing of titles and content, table insertion, and other features that the original Lovable actually lacked.
After eight rounds of prompt iteration, the creator open-sourced the project (named "Rileable" to ride the hype), capable of building web and mobile apps, and even able to call Sonnet 4 directly within the Lovable platform. The entire process used Daytona for sandboxing and Convex for the database, with generation taking only about 45 seconds.
Test 2: Replicating McKinsey-Grade Research Reports
Another test that garnered nearly a million views on X was even more impressive. The creator uploaded a McKinsey-style document with strict formatting and chart requirements to Claude, asking it to generate a research report on "AI Trends for H2 2026" in the same style.
McKinsey is one of the world's top management consulting firms, renowned for its rigorous data analysis, polished visualizations, and deep industry insights. A typical McKinsey industry research report usually takes weeks to months to produce, involving data collection, expert interviews, model building, and multiple rounds of review, with fees ranging from hundreds of thousands to millions of dollars. The core value of these reports lies not just in the content itself, but in their standardized visual language — specific color schemes, chart types, and layout structures have become the "common language" of the consulting industry. AI's ability to replicate this style means the barrier to high-end knowledge work is being dramatically lowered.

The resulting 23-page report was stunning in quality: clean layouts, polished charts, scoring systems for each section, covering topics like OpenAI vs. Anthropic revenue comparisons, data center power consumption forecasts, and open-weight model trends. Keep in mind, McKinsey typically charges hundreds of thousands or even millions of dollars for reports like this.
The creator also shared a practical tip: first ask Claude to search for 20 high-quality McKinsey-grade report examples and provide download links, download them as style references and drag them into the conversation, then instruct the model to "go all out" on generation. This workflow can be saved as a "skill" for easy reuse.
Test 3: Game and City Simulator Development
Beyond apps and reports, Claude Sonnet 4 also excels at game development. The creator used simple prompts to generate a 2D Minecraft-style mining game with character movement and switching, completed in 45 seconds with instant preview.

Even more impressive, other developers used Claude Sonnet 4 to build a city block simulator with multi-agent traffic and latency detection, complete with a coordinate system and day-night cycle effects. These projects heavily leverage technologies like Three.js — "building blocks" that AI Agents are becoming increasingly proficient with.
Three.js is a JavaScript 3D graphics library built on WebGL that abstracts complex low-level graphics programming into relatively concise API calls, making it feasible to create 3D scenes, animations, and interactions in the browser. For AI Agents, Three.js is an ideal "building block" — it has rich documentation and example code as training data, its API design is standardized and modular, and generated results can be previewed directly in the browser. This explains why AI-generated 3D projects are becoming increasingly common: it's not that AI truly "understands" 3D graphics, but rather that Three.js's abstraction layer happens to match the code generation capabilities of large language models.
The AI Agent Building Block Economy: A New Paradigm for Software Development
The most critical insight from this exploration is the concept of the "Building Block Economy." Mitchell Hashimoto (co-founder of HashiCorp and creator of well-known developer tools like Terraform and Vagrant) wrote that what AI Agents need most right now are powerful, reusable building blocks.

Current AI Agent development has formed a clear building block ecosystem:
- Database Layer: Convex, Supabase, Neon
- Sandbox Environments: Daytona
- Model Gateway: Vercel AI Gateway
- Hosting & Deployment: Vercel
- Authentication Components: Google Sign-in
Each of these building blocks solves a critical piece of the AI Agent development puzzle. Convex is a reactive backend platform providing real-time databases and serverless functions, particularly suited for AI-generated applications since it doesn't require manual database migration management. Supabase is an open-source Firebase alternative offering a one-stop solution with PostgreSQL databases, authentication, storage, and real-time subscriptions. Neon is a serverless PostgreSQL service with database branching capabilities. Daytona provides standardized development environment sandboxes, allowing AI Agents to safely execute code in isolated environments. Vercel AI Gateway unifies API interfaces across multiple AI model providers, simplifying model calls and switching. These tools share common characteristics: API-first design, out-of-the-box functionality, and no complex configuration required — precisely the prerequisites for AI Agents to work efficiently.
As stated in the video: "When you have tools like Convex and Supabase, why would you expect AI to rebuild its own database?" Supabase has become virtually the default choice for AI Agents building database applications and is already a multi-billion-dollar company.
The Era of Agentic Payments: AI Agents Completing the Development Loop Autonomously
Currently, AI Agent development still has one bottleneck: registering for services, binding credit cards, and obtaining API keys still require human intervention. However, the creator revealed that he has been in contact with multiple companies working on "agentic payments," and in the future, AI Agents will be able to autonomously register for services, manage budgets, and even hire people on Fiverr when needed.
Agentic Payments represent a critical emerging infrastructure in the AI Agent economy. A core limitation of current AI Agents is the "last mile" problem: no matter how capable the model is, registering for third-party services, completing KYC verification, binding payment methods, and managing API keys still require manual human intervention. Agentic payment solutions allow AI Agents to have controlled payment capabilities — developers set budget caps and usage rules, and Agents autonomously decide resource allocation within authorized limits. This is similar to companies issuing employees corporate credit cards with spending limits. Payment giants like Stripe and PayPal, along with a wave of startups, are positioning themselves in this space because it's one of the final puzzle pieces for achieving truly autonomous AI Agents.
This means developers would only need to give a budget instruction (e.g., "budget $100, clone Lovable"), and the AI Agent could automatically select the optimal building block combination, register for necessary services, and complete the entire application build. This will unlock vast new possibilities across industries.
Usage Tips and Cost Reminders
Claude Sonnet 4 is currently available for selection in the Claude chat interface, but this free/low-cost window only lasts until June 22, after which it will switch to API billing at twice the price of Opus. The creator spent approximately $200 in API credits on the Lovable clone (roughly eight prompts' worth of consumption). It's recommended to try it out now while the Max or Pro subscription offers the best value.
As the creator put it: "What this model can achieve goes far beyond your imagination."
Key Takeaways
Related articles

Microsoft Build 2026: In-Depth Analysis of the In-House Reasoning Model MAI Thinking-E and the Full AI Product Suite
Microsoft Build 2026 unveils MAI Thinking-E, its first in-house reasoning model with 1T MoE architecture, plus 6 vertical AI models. Deep dive into performance, strategy, and industry trends.

Replit's Domain-Specific Agents: One-Click Batch Fixes for SEO and Security Vulnerabilities
Deep dive into Replit's domain-specific AI Agents: Growth Agent for SEO issues and Security Agent for vulnerability detection, with select-all one-click batch fixing.

APImart Review: One-Stop Low-Cost Access to GPT, Claude, and Other Leading AI Models
Hands-on review of APImart, an API aggregation platform supporting GPT-4o, Claude, Veo and more. GPT image generation from $0.006/image. Full walkthrough, results, pricing, and risk analysis.