Is HTML Replacing Markdown? A Deep Dive into the AI Output Format Debate

Industry leaders advocate HTML over Markdown for AI output, citing superior visualization and interactivity.
Anthropic's team and AI leaders like Karpathy are pushing self-contained HTML to replace Markdown as the AI output format. HTML far surpasses Markdown in information density, visual clarity, and interactivity, with use cases including multi-option comparison, code review, disposable tool generation, and automated reports. However, HTML faces limitations in token consumption, version control difficulty, and mobile responsiveness. MDX may be the next-generation solution balancing both approaches.
Introduction: Markdown's Dominance Is Being Challenged
For a long time, Markdown has been the default format for AI agents to communicate with humans. It's concise, portable, easy to edit, and has practically become a second language for developers. Created by John Gruber in 2004, Markdown was originally designed to let people write documents in an easy-to-read, easy-to-write plain text format that could optionally be converted to HTML. Its syntax borrowed from plain text markup conventions in email, such as using asterisks for emphasis and hash marks for headings. The widespread adoption of GitHub Flavored Markdown (GFM) made it the de facto standard in the developer ecosystem—nearly all code hosting platforms, documentation systems, and AI chat interfaces support Markdown rendering by default. However, Markdown's design philosophy is essentially "good enough"—it was never intended to be a complete document format system.
But as AI capabilities advance rapidly, more and more voices are asking: Has Markdown become a bottleneck limiting AI's expressive power?

Recently, Thorik from Anthropic's Claude Code team published a widely discussed article titled "The Unreasonable Effectiveness of HTML", claiming he has almost entirely stopped writing Markdown files and instead has Claude generate HTML. Shortly after, AI luminary Andrej Karpathy publicly stated that "this does indeed work well." Well-known developer Theo (t3.gg) then provided an in-depth analysis and hands-on comparison.
Why Is HTML Better Suited for AI Output Than Markdown?
An Overwhelming Advantage in Information Density
In his article, Thorik pointed out that a self-contained HTML file can carry far more types of information than Markdown. A self-contained HTML file refers to the practice of embedding CSS styles, JavaScript scripts, SVG graphics, and even Base64-encoded images all within a single .html file. This approach doesn't depend on external resources and can be shared like a PDF—a complete interactive document in one file. Technically, it leverages <style> tags for inline CSS, <script> tags for inline JS, and data URI schemes for embedding binary resources. This pattern already has precedent in data science—HTML reports exported from Jupyter Notebooks are similar self-contained documents.
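As a rough illustration of the self-contained pattern, the following Python sketch assembles one HTML file with inline CSS, inline JavaScript, and a Base64 data URI. The function name `build_self_contained_html`, the sample strings, and the placeholder image bytes are assumptions made for this example, not anything taken from Thorik's article.

```python
import base64
from html import escape


def build_self_contained_html(title: str, css: str, js: str,
                              image_bytes: bytes,
                              image_mime: str = "image/png") -> str:
    """Return one HTML document with styles, script, and image all inlined."""
    # Binary assets travel inside the file as a Base64 data URI.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_uri = f"data:{image_mime};base64,{encoded}"
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{escape(title)}</title>\n"
        f"<style>{css}</style>\n"
        "</head>\n"
        "<body>\n"
        f"<h1>{escape(title)}</h1>\n"
        f'<img src="{data_uri}" alt="embedded image">\n'
        f"<script>{js}</script>\n"
        "</body>\n"
        "</html>\n"
    )


# Hypothetical inputs; the bytes are a stand-in, not a valid PNG.
page = build_self_contained_html(
    title="Q3 design options",
    css="body { font-family: sans-serif; margin: 2rem; }",
    js="console.log('report loaded');",
    image_bytes=b"placeholder image bytes",
)
```

Because nothing in `page` points at an external URL, the resulting file can be saved and shared as a single artifact.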
Specifically, the types of information HTML can carry include:
- Tabular data: Native HTML tables are far more flexible than Markdown tables, supporting merged cells, nested tables, conditional styling, etc.
- Design presentations: Rich visual rendering through CSS, including gradients, shadows, animations, etc.
- Charts and illustrations: SVG inline vector graphics that are scalable without loss of quality
- Code snippets: <pre> and <code> blocks, with syntax highlighting applied through CSS or an inline script
- Interactive workflows: JavaScript-driven dynamic content
- Spatial layouts: Absolute positioning and Canvas
In contrast, Markdown often falls short when dealing with complex information. Theo gave a typical example: Claude frequently makes alignment errors when drawing ASCII art in Markdown, and border offsets, incorrect line counts, and similar issues are commonplace. Markdown is fundamentally a linear stream of text, and an ASCII diagram only lines up when every character lands in exactly the right column of a monospace grid, so any information that needs two-dimensional spatial expression runs into a hard limit.
Visual Clarity and Readability
As AI capabilities grow stronger, the specification documents and plans it generates are getting longer and longer. Theo admitted: "I find that I rarely actually read Markdown files longer than 100 lines." HTML documents, on the other hand, can organize content through tabs, collapsible panels, link navigation, and other mechanisms that dramatically improve the reading experience.
More critically, there's the convenience of sharing—Markdown files can't be natively rendered in most browsers (requiring dedicated renderers like GitHub or VS Code preview), while HTML just needs to be uploaded to S3 and can be shared via link with anyone, rendering perfectly in any browser without installing any tools.
Interactivity Is HTML's Killer Feature
HTML's greatest differentiating advantage lies in interactivity. You can have Claude generate interfaces with sliders, knobs, and adjustable parameters to tweak design proposals or algorithm parameters in real time. Even more cleverly, you can add a "Copy as Prompt" button in the HTML that feeds interaction results directly back to Claude for further iteration. This kind of human-AI collaboration loop is completely impossible in a pure Markdown environment—Markdown is static, while HTML natively supports event-driven dynamic interaction.
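A minimal sketch of that loop, assuming a single numeric parameter: the Python function below emits a page with one slider and a "Copy as Prompt" button. The function name `build_tuning_page`, the element ids, and the prompt wording are all hypothetical. The button relies on the browser's `navigator.clipboard.writeText`, which is generally only available in a secure context.

```python
import json
from html import escape


def build_tuning_page(param_name: str, minimum: int, maximum: int, initial: int) -> str:
    """Return an HTML page with one slider and a 'Copy as Prompt' button."""
    label = escape(param_name)
    # json.dumps yields a quoted JavaScript string literal for the script below.
    js_name = json.dumps(param_name)
    return f"""<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"><title>Tune {label}</title></head>
<body>
<label for="param">{label}: <output id="value">{initial}</output></label>
<input id="param" type="range" min="{minimum}" max="{maximum}" value="{initial}">
<button id="copy">Copy as Prompt</button>
<script>
const slider = document.getElementById("param");
const value = document.getElementById("value");
// Keep the displayed number in sync with the slider.
slider.addEventListener("input", () => {{ value.textContent = slider.value; }});
// Turn the current UI state into text that can be pasted back to the model.
document.getElementById("copy").addEventListener("click", () => {{
  const prompt = "Use " + {js_name} + " = " + slider.value + " and regenerate the design.";
  navigator.clipboard.writeText(prompt);
}});
</script>
</body>
</html>
"""
```

Calling `build_tuning_page("border radius", 0, 32, 8)` and saving the result to a file gives a page where dragging the slider and clicking the button produces a one-line instruction ready to paste into the next turn.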
Practical Use Cases for AI-Generated HTML
Exploration and Planning Phase
Thorik demonstrated a typical workflow: having Claude generate six distinctly different design directions, presented in a grid layout within a single HTML file, with trade-offs annotated for each approach. Theo highly endorsed this: "Having the model produce multiple distinctly different options in a single generation yields greater diversity than asking one at a time." A plausible explanation is that within a single generation the model can see the options it has already written and steer each new one away from them, whereas separate queries start from the same prompt each time and tend to converge on similar ideas.
Code Review and PR Descriptions
Thorik claims he now attaches an HTML code explainer to every PR. Theo was more reserved about this—he believes Markdown in VS Code already handles syntax highlighting well, but acknowledged that diff rendering is indeed a pain point for Markdown. HTML can use color coding to indicate severity levels, inline annotations, flowcharts, and more—things that are nearly impossible in Markdown. While the traditional unified diff format is readable in terminals, when facing cross-file refactoring or complex code movements, the readability of plain text diffs drops dramatically.
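For a sense of what color-coded diff rendering looks like in practice, Python's standard library already ships a small HTML diff renderer. The sketch below uses `difflib.HtmlDiff` on two made-up versions of a function; it illustrates the idea only and is not the explainer format Thorik attaches to his PRs.

```python
import difflib

# Two hypothetical versions of the same snippet.
before = [
    "def area(radius):",
    "    return 3.14 * radius * radius",
]
after = [
    "import math",
    "",
    "def area(radius):",
    "    return math.pi * radius * radius",
]

# make_file returns a complete HTML page containing a side-by-side table
# in which added, changed, and deleted text carry distinct CSS classes.
review_page = difflib.HtmlDiff(wrapcolumn=60).make_file(
    before, after, fromdesc="before", todesc="after"
)
```

Writing `review_page` to a file such as `review.html` and opening it in a browser shows the two versions next to each other with highlighted changes.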
A Revolutionary Mindset: Disposable Tools
This is one of the uses Theo champions most: having Claude generate a one-off editor custom-built for the current task. Not a product, not a reusable tool—just an HTML file tailor-made for this particular piece of data.
"Code is cheap now. Writing a bunch of code just to play with data, make better decisions, and then immediately throw it away—that's absolutely worth it. Over 70% of the code I write now gets discarded after running once."
This mindset represents a fundamental shift in the software development paradigm. Traditionally, the cost of writing code (human time) far exceeded the cost of running code (compute resources), so we pursued code reuse and abstraction. But when AI reduces the cost of writing code to near zero, "disposable code" becomes entirely reasonable—just as we don't bind our scratch paper calculations into a book.
The key technique is to always end with an export function—a "Copy as JSON" or "Copy as Prompt" button that converts UI operation results into a format that can be pasted back to the agent.
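One way such a throwaway tool might look, sketched in Python: the function below turns a list of records into an editable HTML table with a "Copy as JSON" button. The name `build_disposable_editor`, the `data-key` attribute, and the element ids are assumptions made for illustration.

```python
from html import escape


def build_disposable_editor(rows: list[dict[str, str]]) -> str:
    """Return a throwaway HTML editor for `rows` with a 'Copy as JSON' export."""
    columns = list(rows[0].keys())
    header = "".join(f"<th>{escape(col)}</th>" for col in columns)
    # Every cell is editable and remembers which field it belongs to.
    body = "\n".join(
        "<tr>" + "".join(
            f'<td contenteditable="true" data-key="{escape(col)}">{escape(row[col])}</td>'
            for col in columns
        ) + "</tr>"
        for row in rows
    )
    return f"""<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"><title>Disposable editor</title></head>
<body>
<table>
<thead><tr>{header}</tr></thead>
<tbody id="rows">
{body}
</tbody>
</table>
<button id="export">Copy as JSON</button>
<script>
// Read the edited cells back into objects and put them on the clipboard.
document.getElementById("export").addEventListener("click", () => {{
  const data = [...document.querySelectorAll("#rows tr")].map((tr) =>
    Object.fromEntries([...tr.children].map((td) => [td.dataset.key, td.textContent]))
  );
  navigator.clipboard.writeText(JSON.stringify(data, null, 2));
}});
</script>
</body>
</html>
"""
```

The export button is the important part: whatever gets edited in the table comes back out as JSON that can be pasted straight to the agent, after which the file can be deleted.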
Workplace Hack: Automated Weekly Reports
Theo shared an interesting workplace hack: connect an agent to the Jira API and have it automatically generate a polished HTML page each week showcasing the team's completed work, to present during standups. "Management will go crazy for this," he said. "It's a way to look professional with a simple prompt."
Limitations and Counterarguments Against HTML Replacing Markdown
Token Efficiency Issues
HTML does consume more tokens than Markdown. In large language models, tokens are the basic unit of text processing: a common English word is typically 1-2 tokens, and a Chinese character is about 1-2 tokens, depending on the tokenizer. Output tokens are also priced higher than input tokens; taking GPT-4o as an example, the difference is roughly 3-4x. Because of structural redundancy from tag names, attributes, CSS property names, JavaScript syntax, and so on, a typical HTML document may consume roughly 2-5x more tokens than equivalent Markdown content, with the exact ratio depending on how much styling and scripting it carries.
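To make the overhead concrete, the sketch below renders the same small table as Markdown and as HTML and compares a rough token estimate. The estimate is a deliberately crude proxy (each word or punctuation mark counts once), not a real model tokenizer, and the table contents are invented, so the resulting ratio is only indicative.

```python
import re
from html import escape

# Hypothetical table content used for both renderings.
ROWS = [("Option A", "Low cost"), ("Option B", "Fast"), ("Option C", "Flexible")]


def to_markdown(rows):
    lines = ["| Option | Strength |", "| --- | --- |"]
    lines += [f"| {name} | {note} |" for name, note in rows]
    return "\n".join(lines)


def to_html(rows):
    cells = "\n".join(
        f"<tr><td>{escape(name)}</td><td>{escape(note)}</td></tr>"
        for name, note in rows
    )
    return (
        "<table>\n<thead><tr><th>Option</th><th>Strength</th></tr></thead>\n"
        f"<tbody>\n{cells}\n</tbody>\n</table>"
    )


def rough_token_count(text):
    # Crude proxy, NOT a real tokenizer: every word or punctuation mark counts once.
    return len(re.findall(r"\w+|[^\w\s]", text))


markdown_tokens = rough_token_count(to_markdown(ROWS))
html_tokens = rough_token_count(to_html(ROWS))
ratio = html_tokens / markdown_tokens
print(f"markdown={markdown_tokens} html={html_tokens} ratio={ratio:.1f}x")
```

For real measurements, the same comparison can be repeated with the tokenizer of whichever model is actually in use.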
However, Thorik argues that in an era where Claude offers a 200K context window and GPT-4o offers 128K, a few thousand extra tokens have become negligible, both as a share of the context budget and in cost; the bigger bottleneck is often output speed. More importantly, HTML's high readability means you're more likely to actually read it, resulting in better overall output quality. A carefully read HTML document is far more valuable than an ignored Markdown file.
A Version Control Nightmare
Thorik himself acknowledges this is HTML's biggest drawback—HTML diffs are extremely noisy and difficult to review. In contrast, Markdown's plain text nature is naturally suited for Git version control. A simple style adjustment in HTML can cause dozens of lines of diff changes, while substantive content modifications get buried in structural noise. This is a serious practical obstacle for teams that require code review workflows.
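The effect is easy to reproduce on a toy example. The sketch below, with invented list items and a hypothetical inline-style rendering, applies the same one-word content edit to a Markdown list and an HTML list, where the HTML revision also tweaks a text color, and then counts how many lines a line-level diff would flag.

```python
import difflib


def render_markdown(items):
    return "\n".join(f"- {item}" for item in items)


def render_html(items, color):
    rows = [f'<li style="color: {color}">{item}</li>' for item in items]
    return "\n".join(["<ul>", *rows, "</ul>"])


def changed_line_count(before, after):
    """Count lines that fall inside a non-equal region of a line-level diff."""
    matcher = difflib.SequenceMatcher(
        a=before.splitlines(), b=after.splitlines(), autojunk=False
    )
    return sum(
        max(i2 - i1, j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )


old_items = ["Ship login flow", "Fix cache bug", "Write docs"]
new_items = ["Ship login flow", "Fix cache race", "Write docs"]  # one real content change

markdown_changes = changed_line_count(
    render_markdown(old_items), render_markdown(new_items)
)
# The HTML revision also changes the text color, which touches every styled line.
html_changes = changed_line_count(
    render_html(old_items, "#333"), render_html(new_items, "#222")
)
print(markdown_changes, html_changes)  # prints "1 3"
```

Only one of the three flagged HTML lines carries the real content change; a reviewer has to find it among the style-only edits, and the gap widens as documents grow.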
Novelty Bias
Theo raised a pointed question: "When HTML makes people more willing to read, how much of that is because Markdown is too bloated and HTML is more readable, versus how much is because HTML is novel enough to attract attention? If everyone switches from long Markdown to HTML link bombardment, will we actually read more?" This question touches on the "Novelty Effect" in cognitive psychology—new things inherently stimulate more attention and engagement, but this effect diminishes over time.
Mobile Responsiveness Challenges
Theo's hands-on testing found that Thorik's shared examples performed poorly on mobile. While HTML can theoretically achieve responsive design (through media queries, flexbox, and other CSS techniques), this adds more output tokens, somewhat defeating the purpose. Moreover, requiring AI to simultaneously generate layouts that work well on both desktop and mobile significantly increases prompt complexity and output uncertainty.
Karpathy's Vision: The Evolution of AI Output from Text to Visual
Karpathy placed this trend within a grander framework. He believes the information exchange between humans and AI is undergoing an evolutionary process:
- Plain text → Laborious to read
- Markdown → Structured, current default
- HTML → Flexible graphics and interaction, forming the new default
- Future → Interactive video and neural network-generated simulations
"About a third of the human brain is a massively parallel visual processor—it's the ten-lane highway for information entering the brain. Audio is humans' preferred AI input modality, but vision—images, animations, video—is AI's preferred output modality."
Karpathy's assertion is broadly consistent with neuroscience. By commonly cited estimates, the human visual cortex occupies approximately 20-30% of the cortical surface area, and the visual channel carries on the order of 10^7 bits per second, while hearing carries only about 10^4 bits per second, a difference of three orders of magnitude. This helps explain why data visualization, infographics, and interactive interfaces are often understood and remembered more quickly than plain text. Cognitive Load Theory also supports this view: good visual design can minimize extraneous cognitive load, letting working memory focus on the content itself rather than on decoding the format.
Future Outlook: MDX Might Be the Next Stop
At the end of his video, Theo raised an interesting point: MDX (Markdown + JSX) might be an underexplored middle ground. MDX is a format combining Markdown with JSX (JavaScript XML, React's template syntax), first released by the Compositor team in 2018. It allows importing and using React components directly within Markdown documents, for example to embed interactive charts, collapsible code blocks, or live editors in documentation. Mainstream frameworks like Next.js, Gatsby, and Docusaurus all support MDX, either out of the box or through official plugins.
Compared to pure HTML, MDX's advantage lies in component reusability—you can define a design system component library and use it consistently across all documents while maintaining the readability and Git-friendliness of the Markdown portions. It might strike a better balance between readability, token efficiency, and expressiveness.
Theo also plans to develop a dedicated Claude skill that defines HTML output structure, design system, and directory conventions to avoid polluting Git history. This direction is worth continued attention.
Conclusion
The shift from Markdown to HTML fundamentally reflects new demands on output format as AI capabilities improve. When AI is no longer just generating text but can create complete interactive documents, we need to rethink the interface of human-AI collaboration.
As Theo put it: "We as an industry still have a ton of work to do on output formats. We haven't even come close to figuring out what the right interface should look like." HTML is just the starting point, but it already points the way—toward more visual, more interactive, more human-cognition-centered AI output.
Key Takeaways
- Anthropic's Claude Code team and industry leaders like Karpathy are advocating HTML over Markdown as an AI output format, with core advantages in information density, visual clarity, and interactivity
- Practical use cases for HTML include multi-option exploration and comparison, code review visualization, disposable custom tool generation, and automated report generation
- HTML output's main limitations include higher token consumption, version control difficulties, poor mobile responsiveness, and novelty bias that may overstate its advantages
- Karpathy proposes an evolution path from plain text → Markdown → HTML → interactive video, arguing that vision is the optimal channel for AI output
- MDX (Markdown + JSX) may be the next-generation middle ground that balances simplicity with interactivity