AI Models from Google, OpenAI, Anthropic Solve 0% of ‘Hard’ Coding Problems

cm0002@lemmy.world · 23 hours ago

AI Models from Google, OpenAI, Anthropic Solve 0% of ‘Hard’ Coding Problems

yogsototh@programming.dev · 22 hours ago

I didn’t see Claude 4 Sonnet in the tests, and that is the one I use. In my experience, it is in about the same category as o4 mini.

It is a nice tool to have in my belt, but these LLM-based agents are still very far from being able to do advanced and hard tasks. To me, it is probably more important to communicate and learn about the limitations of these tools, so that we don’t lose time instead of gaining it.

In fact, I am not even sure they are good enough to be used to really generate production-ready code. But they are nice for pre-reviewing, building simple scripts that don’t need to be highly reliable, analysing a project, asking specific questions, etc. The game changer for me was using Clojure-MCP. Having a REPL at its disposal really enhances the quality of most answers.

Ugurcan@lemmy.world · 7 hours ago

For me, Claude Code is where everything finally clicked. For advanced stuff, sure, they’re shit when left alone. But as long as I treat it like a Junior Developer (breaking tasks down into bite-sized pieces, having a clear plan at all times, steering it away from pitfalls), I find myself enjoying other stuff while it’s doing the monkey work. Just be sure you provide it with tools, MCP, RAG and some patience.
