

This all presumes that OpenAI can get there, and further that it is exclusively positioned to get there.
Most experts I’ve seen don’t see a logical connection between LLMs and AGI, and OpenAI has all their eggs in that basket.
To the extent LLMs are useful, OpenAI arguably isn’t even the best at it. Anthropic tends to make them more useful than OpenAI does, and now Google is outperforming them on the relatively pointless benchmarks that used to be OpenAI’s bragging point. They aren’t the best, the most useful, or the cheapest. They were first, but that first-mover advantage hardly matters once you get passed.
Maybe if they were demonstrating advanced robotics control, but other companies are mostly showing that, while OpenAI remains “just a chatbot”. The more useful usage of their services goes through third parties that tend to be LLM agnostic, and increasingly I see people pick non-OpenAI models as their preference.




It’s pretty much a vibe coding issue. What you describe I can recall being advocated forever: the project manager’s dream that if you model and spec things out enough, and perfectly model the world in your test cases, then you are golden. Except the world has never been so convenient, and you bank on the programming being reasonably workable by people to compensate.
The problem is people who think they can replace understanding with vibe coding. If you can only vibe code, you will end up with problems you cannot fix and the LLM can’t either. If you can fix the problems, then you are not inclined to toss in overly long chunks of LLM output, because LLMs generate ugly, hard-to-maintain code that tends to violate all sorts of programming best practices.