My concern here is that by gravitating to HTML you lose the ability for a human (you!) to easily co-author the document with the LLM. If it’s just an explainer for your consumption, that’s not a concern - but if it’s a spec sheet for something more complex, I deeply value being able to dive in and edit what is produced for me. With a HTML doc it is much harder to do that than with MD.
Now of course you could just reprompt your LLM to change the HTML - but when I already have a clear idea of what I want to say in my head, that’s just another roadblock in the way.
If this pattern becomes more common I suspect human/LLM co-creation will further dwindle in favour of just delegating voice, tone and content choice to the LLM. I was surprised not to see this concern in the blog post’s FAQ.
Markdown supports inline HTML for interactive elements, I think an interesting potential alternative would be an md doc with a known HTML template and simple build (e.g. single line pandoc command).
This is the way - lightweight build process - host it on your private tailnet to consume the document on whichever device you so choose - tablet, phone - also makes it easier to share with co-colaborstors etc. in a more secure way.
yah this should be the way. simpler text in markdown and rich visuals and complex tables in html. coding agents should handle this locally instead of wasting tokens as well.
I actually think there is a second level to this. Yes HTML will get you most anywhere, but I found that letting the LLM define its own language is also unreasonably effective.
Currently working on a dumb little mobile game with isometric view and sound:
- told codex to write a tool that lets its place blocks in a prepared three.js document and have chromium dev tools take a screenshot. It made up a little JSON structure that defines blocks / colors and some other effects and it outputs 2.5d tilesets.
- told it to create a uv python script that would let it define sounds and music, and it made a yaml format that lets it create noises.
We completely shot past the svg pelican test. Codex has created both perfectly adequate prototype art of soldiers/knights/priests as well as a prototype soundtrack.
They can map things like this. They are amazing translation layers. As long as it is a shape of problem or data they are trained on they can translate. The DSL they made up is shaped like some other data format they know for that latent space. It seems amazing, and it is, but it is also a core feature of how LLMs work. The problem is it works until it doesn’t. Fuzzy can only get you so far before it decoheres without rigor.
We have been authoring HTML by hand for decades with ease. Text editors are very good at it, and many have commands to auto-wrap, auto-close etc. Reading and writing is simple.
I can code up a complex HTML table by hand faster than I create a basic MD table, but other than that, I find it difficult to achieve a good writing flow with pure HTML, even with all the automation. I author a lot of API docs, READMEs, and how-to guides, and find MD to provide the perfect mix of decently powerful markup and flexibility with supporting raw HTML when needed. The only constraint is that some markup renderers don’t support or severely restrict HTML passthrough (I ran into this with GitHub recently).
Templated though, not manually writing it out for every blog post say. I think GP means it just has more friction as a writing format than markdown for example.
No, literally manually typing out HTML tags and everything. Many of us did it so much things like Emmet (https://emmet.io/) were invented and used so we could hammer out full HTML documents even faster.
Even after React became popular, people are still manually typing out HTML elements, although they call it "JSX" instead, but in reality it's just HTML.
My first blog on the internet literally was a bunch of .html files, where my post "template" was the first post copy-pasted when you wanted to make a new post. Changing the design involved changing the same thing across all files.
> PS: Really cool static site generators that shoot for simplicity don't require you to create extra template files written in a new, made-up template language. When you want to create a new post, you give it (a) the static files from your existing site and (b) the markdown for your new post. The "templating" engine inspects your existing posts (incl. e.g. class attributes) and then copies the same document structure into a new file, except with the right stuff (timestamp, title and heading, post content...) substituted in to the places where it's supposed to go.
Basic HTML really isn’t a big step up from Markdown though, and no one was complaining about that. In some instances it’s simpler even. I often forget the exact syntax for a table in Markdown, by comparison <table>, <tr> and <td> are easy to remember. All of the major parts of Markdown are pretty easy, <h1>, <strong>, etc etc. It was written with human authorship in mind.
Typing out <p> for every paragraph is annoying, for sure. But a converter that switches out \r\n\r\n for a new paragraph would be a reasonable middle ground IMO.
When using AI, I often find myself preferring either plain text (no markup whatsoever; just manual / text editor formatting of text blocks) or simple html to markdown, depending on the situation. To the point that I rarely see any point in using markdown for anything. If it is meant for to be a simple text mainly for human consumption, the markups often don't add much clarity (and often bring in an amateurish look, as if the author didn't know how to emphasize using English constructs), in which case plain text feels more pprofessional. If it is meant to be [lightly] processed before being presented to a human, or if it is meant to be processed by a tool / bot / LLM, then HTML is infinitely more straightforward.
Also I often call out my colleagues if they try to put a table in markdown. Markdown is not built for tabular data in most professional settings (i.e., one or two table cell could easily take a whole line of markdown to express). A basic <table><tr><td style="background: red">some number</td></tr></table> goes a long way.
A lot of editors will auto add the ending tag, and auto-update the ending tag if you rewrite the start tag. I think it's gotten pretty darn easy to use HTML er I mean XML ;)
Well if you're going to be like that you may as well say something about assembly, butterflies, and what the hell's a web log anyway, spiders and beavers have nothing to do with one another.
'we have been authoring' is present perfect continuous. Going forwards, and for at least the last two 'decades', HTML blog posts can use CMSs.
Simple HTML is easy to do. If you just want a document with information and it does not need lots of branding and great aesthetics. That is what you are looking at as an alternative to Markdown.
This is shockingly true. Most newer FE devs I have encountered are mostly trained on the popular frameworks and lack understanding of the underlying fundamentals, e.g., they only know TypeScript + SCSS and some smattering of HTML but more often know whatever templating engine and MVC(ish) backend the framework uses. It’s really helpful to understand what the browser is actually doing and all the “stuff” the framework spits out on the other end.
Modern JS/TS devs probably not, but I wouldn't even call someone a "frontend dev" if you don't know HTML, kind of being a infrastructure engineer and not knowing how any OSes work.
It’s not just knowing HTML as in writing a bunch div tags and patting yourself on the back. If you aren’t able to achieve at least 80% WCAG AA compliance you can’t write HTML.
Most frontend devs have no idea what any of that means. But then it seems everyone who can write 3 lines of code professionally refers to themselves as an ”engineer”.
Yes, it's literally about being able to use HTML effectively and knowing what you are doing, then you know HTML. I'm not sure why you bring up some arbitrary accessibility guidelines, that has no bearing if someone is using HTML correctly and neither would I gatekeep the "frontend" label on some arbitrary "must pass this particular standard", never heard something so outlandish when it comes to who could call themselves frontend developer or not.
Hand on heart. When was the last time you built a serious production system for a real business that was 100% built from HTML without using any build step? Just editing the footer and header in every file when it updates (or using iframes)
Maybe not in your corner of the internet, but businesses used server side includes (SSI) for that, not iframes.
You add “include” tags to your HTML file and your web server like nginx or varnish would replace it with the fragment at runtime.
<!--#include virtual="../footer.html" -->
I saw this was quite popular for big publishing houses with millions of articles still relatively recently 10 years ago. They would only write the HTML body of a new article and the other fragments would be included by the web server.
Very cheap, stable, and very big changes across the whole website could be done instantly since cache invalidation is trivial (the web server knows all modified dates of all fragments).
Also, no additional CDN or caching needed. Later, with CDNs there was even a variant where these fragments were hosted at the edge (ESI).
10 years ago. I bet you can’t name a single production site writing HTML files without any build tools at all. Just raw dogging using notepad directly on the server.
That is the point. No one writes HTML without any abstractions anymore. You use a framework or a build tool. Because just editing pure html files is a pain in the ass. Probably haven’t done that since 2010.
I suppose that only applies if you constrain yourself to a raw teletypewriter emulator… in any proper coding environment, editing HTML should be absolutely simple - even an embedded WYSIWYG editor would be an option if rich model output is a way we head into.
A counter argument would be that all programming languages of the last decades have been plain text based. No other more structured format has ever gained traction even though modern editors could be argued to be able to support that easily. Turns out, it doesn't actually work that way.
But we’re not even dealing with a programming language in any classical sense here. Interacting with an LLM coding system is a multi-mode communication system with on-demand, purpose-generated ephemeral UI. That doesn’t fit any of the established categories, so I think carrying over constraints from them doesn’t make sense either.
Even Claude Code can whip up interactive, tabbed, multiple choice questions for example. If you use the superpowers plugin, it'll sometimes spawn a small web server demoing UI concepts or previewing more complex choices using LLM-generated HTML. Claude Code on the web will do even more involved React apps on the fly next to the chat. There's no technical reasons this couldn't get more complex, or vertically integrated with code editors.
I'd definitely call that on-demand, purpose-generated ephemeral UI.
Most people edit documents in Microsoft word, though, so it didn’t seem too far fetched that LLM content would be edited similarly, especially as more and more non-programmers use it.
I’ve started using HTML for reports recently. But I always use a markdown file as an intermediate and tell the LLM to generate a fancier version of it with SVG for graphs/pictures based on tables in the markdown.
One idea that could be useful for products like Browser Use and Stagehand is instead of using videos of the session, they can use HTML slideshows to show the step-by-step progress of the session. Single file HTMl can be downloaded and shared and annotated as well. I hope someone from those companies are here and will take this advice. I am not advocating for replacing the video sessions but also have the HTML Slideshow as another session artifact.
Not sure how you use CC, but the last 6 months has felt like significant optimization efforts to me. Last year Claude would just read and edit files, now it's all kinds of basic tool gymnastics with grep/awk/sed/etc to narrowly slice and avoid token-heavy reads. Resuming sessions that aren't even that large get a scary prompt about using a significant portion of your token budget if you continue without compacting.
To me it feels like a worse experience, and they probably feel it too, but it makes sense from an optimization perspective. I've probably learned some shell tricks, but also going blind from watching Claude try dozens of variations of some multi-line chained and piped wall of bash nightmare, instead of just reading a few files.
I've noticed that's changed over the past month or so. Claude-code used to happily pipe build commands straight into context, but recently it's been running them as background tasks that pipe to file, and it'll search and do partial reads on the output instead.
It also gives tips on reducing context size when you run /context .
Presumably they are actually starting to feel the pinch on inference costs themselves with what still feels like a fairly generous max plan.
And it seems to use head, tail etc. more than it used to, even when unnecessary, which, combined with the recent(?) tendency of more chaining and as you said, piping to temp files and the like, totally screwed up claude code’s auto approval system for me (by auto approval I mean the system to decide which commands can be run without permission prompt, based on the permissions.allow setting among other things, not to be confused with a specific new approval mode called “auto” that burns more tokens to decide whether the command is safe). I had to write my own auto approval system and plug it in as a hook.
By what measure? HTML as they're describing it here includes CSS and presumably Javascript. Markdown has only a handful of elements and they're all more human readable in their unrendered form. Show both to someone who has never seen either and it's obvious which they would describe as "simpler".
HTML made by Claude will, by default, be "sleek, modern", with colorful tables, cards, maybe Tailwind for styling. And, of course it will, if you wanted a barebones HTML, you would just have asked for markdown!
So the LLM decided to present some content using 4 cards, and you now want to add new itens. You can't just add new lines of text: you need to copy the whole HTML of the cards. But the LLM used different colors for each card, so now you have the first cards with varying colors and the new cards all the same color as the last card. Now you have to think about colors... etc etc
It depends what we mean I guess, isn’t Markdown supposed to allow [hx]ml tags anyway if user need them? Then it’s more about asking the LLM to generate Markdown with this in consideration, and privilege rendering the output of reports in the preferred browser after relevant rendering.
1. I believe many applications that use markdown allow html. Others don't due to security/rendering issues.
2. One of the limiting factors of LLM is context. An html table takes up way more tokens than a markdown table. Especially if it's a WYSIWYG editor that has all kinds of css and <span> tags just for fun.
> An html table takes up way more tokens than a markdown table
That might be the case today but there’s no reason for it to always be true. They are different representations of the same thing, an LLM could (arguably should!) store an internal representation that uses fewer tokens.
Most editors format and highlight markdown syntax as bold, italic or strike through, but I have never seen the same thing for html. They only highlight the html-specific parts leaving the content unstyled.
seems like it'd be fairly straightforward to get things setup where text blocks are double click to edit, have drag handles, delete buttons etc, and then you can also use surfingkeys to get vim style navigation.
overall i'm seeing a potential here for more human authoring, as you can really multiply the surface of your inputs with deterministic widgets rather than just delegating.
HTML isn't difficult to read or edit by hand, non-technical people did it for years before the web became commoditized and generating everything with javascript became the standard. It's more awkward than Markdown, yes, but basically simple.
I can edit html. Also, I don't bother with html editing. If it's a doc that needs editing, I just tell Claude to format it for ease of editing and then I just go through and type whatever crap I want to say. Then I ask Claude to clean it up.
BUT, I only do this as an easy way of referencing specific ideas and text in notes to instruct Claude. I say, "I have added notes. Read them and adjust the doc accordingly." Or, "Read the notes I added. Turn them into html."
I don't think human coauthoring documents with AI is viable, due to very different costs. It's like with combining hand-written assembler and compiler output. I think ideally there will be some delineation which parts of the document are human-produced and which are AI-produced.
Now of course you could just reprompt your LLM to change the HTML - but when I already have a clear idea of what I want to say in my head, that’s just another roadblock in the way.
It's usually faster though, so you get to spend more time on thinking.
Well said. It is so much better for when the human can see what the LLM has changed and iterate on it with the LLM, making their own changes. Markdown is better for that.
It's possible that use of `contenteditable` and ability to save the file could help but that has a lot of limitations/gotchas, so I'm inclined to agree.
I started using Thariq’s approach yesterday and it works very well. One thing I noticed is that I’m no longer wary of reading long and complex spec documents. Opus does a great job with web design and uses a combination of clean, modern styling and interactive elements that reduces my cognitive load and improves my ability to understand the details of what it is planning to build.
Sorry but what? HTML's really simple, I've been using it for 30 years now. Pretty much everything you need to know to do something useful can be learned in an hour.
I know that it's 2026 and we've spent the last 15 years hiring people into our field who don't understand the hacker mindset but I didn't think that I'd find them here. They would have been ran out pretty quickly.
Yeah. We could even have some kind of plugin or SKILLS.md file that helps convert from this kind of structured prompt into another even more deterministic, predictable output.
Now of course you could just reprompt your LLM to change the HTML - but when I already have a clear idea of what I want to say in my head, that’s just another roadblock in the way.
If this pattern becomes more common I suspect human/LLM co-creation will further dwindle in favour of just delegating voice, tone and content choice to the LLM. I was surprised not to see this concern in the blog post’s FAQ.