If someone manages to make a robust GUI version of this for normies, people will lap it up. People don't want to juggle applications, we want computers to do what we want/need them to do.
They have AGI now?
I'm still paranoid about keeping things securely sandboxed.
I swear OpenAI has 2-3 unannounced releases ready to go at any time just so they can steal some thunder from their competitors when they announce something
</tin foil hat>
i.e. agents for knowledge workers who are not software engineers
A few thoughts and questions:
1. I expect that this set of products will be extremely disruptive to many software businesses. It's like when a new VP joins a company, they often rip and replace some of the software vendors with their personal favorites. Well, most software was designed for human users. Now, peoples' agents will use software for them. Agents have different needs for software than humans do. Some they'll need more of, much they'll no longer need at all. What will this result in? It feels like a much swifter and more significant version of Google taking excerpts/summaries from webpages and putting it at the top of search results and taking away visits and ad revenue from sites.
2. I've tried dozens of products in this space. For most, onboarding is confusing, then the user gets dropped into a blank space, usage limits are uncompetitive compared to the subsidized tokens offered by OpenAI/Anthropic, etc. It's a tough space to compete in, but also clearly going to be a massive market. I'm expecting big investment from Microsoft, Google etc in this segment.
3. How will startups in this space compete against labs who can train models to fit their products?
4. Eventually will the UI/interface be generated/personalized for the user, by the model? Presumably. Harnesses get eaten by model-generated harnesses?
A few more thoughts collected here: https://chrisbarber.co/professional-agents/
Products I've tried: ai browsers like dia, comet, claude for chrome, atlas, and dex; claw products like openclaw, kimi claw, klaus, viktor, duet, atris; automation things like tasklet and lindy; code agents like devin, claude code, cursor, codex; desktop automation tools like vercept, nox, liminary, logical, and raycast; and email products like shortwave, cora and jace. And of course, Claude Cowork, Codex cli and app, and Claude Code cli and app.
Edit: Notes on trying the new Codex update
1. The permissions workflow is very slick
2. Background browser testing is nice and the shadow cursor is an interesting UI element. It did do some things in the foreground for me / take control of focus, a few times, though.
3. It would be nice if the apps had quick ways to demo their new features. My workflow was to ask an LLM to read the update page and ask it what new things I could test, and then to take those things and ask Codex to demo them to me, but it doesn't quite understand it's own new features well enough to invoke them (without quite a bit of steering)
4. I cannot get it to show me the in app browser
5. Generating image mockups of websites and then building them is nice
I think the latter is technically "Codex For Desktop", which is what this article is referring to.
I wonder if there’s something off the shelf that does this?
Why is OpenAI obsessed with generating imgaes? Do they think "generate image" is a thing that a software engineer do on a daily basis?
Even when I was doing heavy web development, I can count the number of times I needed to generate images, and usually for prototyping only.
Bunch of startups need to pivot today after this announcement including mine
Does anyone know of a good option that works on Wayland Linux?
but there is no link, why would you not make this a link.
boggles my mind that companies make such little use of hypertext
It is instructive that they decided to go with weekly active users as a metric, rather than daily active users.
Sure we can read the characters in the screen. But accessibility information is structured usually. TUI apps are going to be far less interesting & capable without accessibility built-in.
Its clear that it will go in this type of direction but Anthropic announced managed agents just a week ago and this again with all the biuld in connections and tools will help so many non computer people to do a lot more faster and better.
I'm waiting for the open source ai ecosystem to catch up :/