Impressive. As a practical matter, one wonders what th point would be in creating a new programming languages if the programmer no longer has to write or read code.
Programming languages are after all the interface that a human uses to give instructions to a computer. If you’re not writing or reading it, the language, by definition doesn’t matter.
That said, the core value of the software wouldn't exist without a human at the helm. It requires someone to expend the energy to guide it, explore the problem space, and weave hundreds of micro-plans into a coherent, usable system. It's a symbiotic relationship, but the ownership is clear. It’s like building a house: I could build one with a butter knife given enough time, but I'd rather use power tools. The tools don't own the house.
At this point, LLMs aren't going to autonomously architect a 400+ table schema, network 100+ services together, and build the UI/UX/CLI to interface with it all. Maybe we'll get there one day, but right now, building software at this scale still requires us to drive. I believe the author owns the language.
It’s missing all the heart, the soul, of deciding and trading off options to get something to work just for you. It’s like you bought a rat bike from your local junkyard and are trying to pass it off as your own handmade cafe racer.
I've been trying a new approach I call CLI first. I realized CLI tools are designed to be used both by humans (command line) and machines (scripting), and are perfect for llms as they are text only interface.
Essentially instead of trying to get llm to generate a fully functioning UI app. You focus on building a local CLI tool first.
CLI tool is cheaper, simpler, but still has a real human UX that pure APIs don't.
You can get the llm to actually walk through the flows, and journeys like a real user end to end, and it will actually see the awkwardness or gaps in design.
Your commands structure will very roughly map to your resources or pages.
Once you are satisfied with the capability of the cli tool. (Which may actually be enough, or just local ui)
You can get it to build the remote storage, then the apis, finally the frontend.
All the while you can still tell it to use the cli to test through the flows and journeys, against real tasks that you have, and iterate on it.
I did recently for pulling some of my personal financial data and reporting it. And now I'm doing this for another TTS automation I've wanted for a while.
That is the part of the post that stuck with me, because I've also picked up impossible challenges and tried to get Claude to dig me out of a mess without giving up from very vague instructions[1].
The effect feels like the Loss-Disguised-As-Win feeling of the video-games I used to work on at Zynga.
Sure it made a mistake, but it is right there, you could go again.
Pull the lever, doesn't matter if the kids have Karate at 8 AM.
I haven't read any farther than this, yet, but this made me stutter in my reading. Isn't a comparison just a function that takes two arguments and returns a third? How is that different from "+"?
That said, it's a lot of words to say not a lot of things. Still a cool post, though!
However, I fear that agents will always work better on programming languages they have been heavily trained on, so for an agent-based development inventing a new domain specific language (e.g. for use internally in a company) might not be as efficient as using a generic programming language that models are already trained on and then just live with the extra boilerplate necessary.
Congratulations on getting to the front page ;)
I mean, they may be right but there is also a big opportunity for this being Gell-Mann amnesia : "The phenomenon of a person trusting newspapers for topics which that person is not knowledgeable about, despite recognizing the newspaper as being extremely inaccurate on certain topics which that person is knowledgeable about."
This is such an interesting statement to me in the context of leftpad.
fn read_float_literal(&mut self) -> &'a str {
let start = self.pos;
while let Some(ch) = self.peek_char() {
if ch.is_ascii_alphanumeric() || ch == '.' || ch == '+' || ch == '-' {
self.advance_char();
} else {
break;
}
}
&self.source[start..self.pos]
}
Admittedly, I do have a very idiosyncratic definition of floating-point literal for my language (I have a variety of syntaxes for NaNs with payloads), but... that is not a usable definition of float literal.At the end of the day, I threw out all of the code the AI generated and wrote it myself, because the AI struggled to produce code that was functional to spec, much less code that would allow me to easily extend it to other kinds of future operators that I knew I would need in the future.
Step #2 is: get real people to use it!
Who the hell is going to use it then? You certainly won't, because you're dependent on AI.
The "more on that later" was unit tests (also generated by Claude Code) and sample inputs and outputs (which is basically just unit tests by a different name).
This is... horrifically bad. It's stupidly easy to make unit tests pass with broken code, and even more stupidly easy when the test is also broken.
These "guardrails" are made of silly putty.
EDIT: Would downvoters care to share an explanation? Preferably one they thought of?