The productivity paradox of AI coding assistants

HaraldvonBlauzahn@feddit.org · 22 days ago

The productivity paradox of AI coding assistants

Deestan@lemmy.world · 22 days ago

I keep seeing the “it’s good for prototyping” argument they post here, in real life.

For non-coders it holds up if you ignore the security risk of someone running literally random code they have no idea what does.

But seeing it from developers, it smells of bullshit. The thing they show are always a week of vibing gave them some stuff I could hack up in a weekend. And they could too if they invested a few days of learning e.g. html5, basic css and read the http fetch doc. And the learning cost is a one-time cost - later prototypes they can just bang out. And then they also also have the understanding needed to turn it into a proper product if the prototype pans out.

HaraldvonBlauzahn@feddit.org · 22 days ago

I would agree with that.

Especially, “being 70%” finished does not mean you will get a working product at all. If the fundamentale understanding is not there, you will not getting a working product without fundamental rewrites.

I have seen code from such bullshit developers myself. Vibe-coded device drivers where people do not understand the fundamentals of multi-threading. Why and when you need locks in C++. No clear API descriptions. Messaging architectures that look like a rats nest. Wild mix of synchronous and async code. Insistence that their code is self-documenting and needs neither comments nor doc. And: Agressivity when confronted with all that. Because the bullshit taints any working relationship.

Eager Eagle@lemmy.world · edit-2 21 days ago

I disagree. What I could hack over a weekend starting a project, I can do in a couple hours with AI, because starting a project is where the typing bottleneck is, due to all of the boilerplate. I can’t type faster than an LLM.

Also, because there are hundreds of similar projects out there and I won’t get to the parts that make mine unique in a weekend, that’s the perfect use case for “vibe coding”.

surewhynotlem@lemmy.world · 22 days ago

It can pop out a pojo based on copy paste of an API document faster than I can.

I wouldn’t trust it for logic though. That’s just asking for trouble.

andallthat@lemmy.world · 22 days ago

I say this as someone who’s not particularly a fan of AI and tries to use it very sparingly.

For me AI is not so much about productivity gains. Where I find it useful instead is to push me past the initial block of starting something from scratch. It’s that initial dopamine rush that the article mentions, from seeing an idea starting to take shape.

In that sense, if I compare projects by time spent on them with or without AI after they are completed, I too would probably find there were no productivity gains. But some of these things I would never get started at all by myself.

If you are a senior developer in a corporation, you know what you have to do, you are an expert in your domain, you rarely start something really new (and when you do, it is only after endless discussions and studies on tools, language, tech stack, architecture). AI is probably not a great help for you.

But even in corporate life, there are a lot of things that are inportant but that you constantly set aside: from planning your career, to honing your communication skills or whatever it is that you could certainly learn to do (with time and dedication) but for some reason you keep postponing because you are not already an expert at them and it takes motivation to learn. That’s where AI found its niche in my life.

melfie@lemy.lol · 22 days ago

Where I find it useful instead is to push me past the initial block of starting something from scratch

I think this is one of the highly understated benefits. I have to work in legacy codebases in programming languages I hate, and it used to be like pulling teeth to get myself motivated. I’d spend half the day procrastinating, and then finally write some code. Then I’d pull my hair out writing tests, only for CI to tell me I don’t have enough test coverage and there are 30 lint issues to fix. At that point, there would be yelling at the screen, followed by more procrastination.

With AI, though, I just write a detailed prompt, go get some coffee, and come back to a pile of drivel that is probably like 70% of the way there. I look it over, suggest some refactoring, additional tests, etc., manually test it and have it fix any bugs. If CI reports any lint issues or test failures, I just copy and paste for AI to fix it.

Yes, in an ideal world if I didn’t have ADHD and could just motivate myself to do whatever my company needs me to do and not procrastinate, I could write better quality code faster than AI. When I’m working on something I’m excited about, AI just gets in the way. The reality being what it is, though, AI is unequivocally a huge productivity boost for anything I’d rather not be working on.

tal@lemmy.today · edit-2 22 days ago

I keep seeing the “it’s good for prototyping” argument they post here, in real life.

There are real cases where bugs aren’t a huge deal.

Take shell scripts. Bash is designed to make it really fast to write throwaway, often one-line software that can accomplish a lot with minimal time.

Bash is not, as a programming language, very optimized for catching corner cases, or writing highly-secure code, or highly-maintainable code. The great majority of bash code that I have written is throwaway code, stuff that I will use once and not even bother to save. It doesn’t have to handle all situations or be hardened. It just has to fill that niche of code that can be written really quickly. But that doesn’t mean that it’s not valuable. I can imagine generated code with some bugs not being such a huge problem there. If it runs once and appears to work for the inputs in that particular scenario, that may be totally fine.

Or, take test code. I’m not going to spend a lot of time making test code perfect. If it fails, it’s probably not the end of the world. There are invariably cases that I won’t have written test code for. “Good enough” is often just fine there.

And it might be possible to, instead of (or in addition to) having human-written commit messages, generate descriptions of commits or something down the line for someone browsing code.

I still feel like I’m stretching, though. Like…I feel like what people are envisioning is some kind of self-improving AI software package, or just letting an LLM go and having it pump out a new version of Microsoft Office. And I’m deeply skeptical that we’re going to get there just on the back of LLMs. I think that we’re going to need more-sophisticated AI systems.

I remember working on one large, multithreaded codebase where a developer who isn’t familiar with or isn’t following the thread-safety constraints would create an absolute maintenance nightmare for others, where you’re going to spend way more time tracking down and fixing breakages induced than you saved by them not spending time coming up to speed on the constraints that their code needs to conform to. And the existing code-generation systems just aren’t really in a great position to come up to speed on those constraints. Part of what a programmer does is, when writing code, is to look at the human-language requirements, and identify that there are undefined cases and go back and clarify the requirement with the user, or use real-world knowledge to make reasonable calls. Training an LLM to map from an English-language description to code is creating a system that just doesn’t have the capability to do that sort of thing.

But, hey, we’ll see.

Artwork@lemmy.world · edit-2 22 days ago

I am sorry, but I am not sure what tells you how Bash “was designed” or not. Perhaps you haven’t yet written anything serious in Bash…
Have you checked out Bash PitFalls at Wooledge, at least?
Bash, or the most shells, including Posix, or even Perl, are some of the most complex languages out there to make a mistake… since there’s no compiler to protect you from, and though legendary but readline may cause the whole terminal go flying, depending on the terminal/terminfo in process…

No, sorry. I absolutely disagree on your stance regarding “shell” for a “bugless” “huge deal” in “real cases”.

tal@lemmy.today · edit-2 22 days ago

The point I’m making is that bash is optimized for quickly writing throwaway code. It doesn’t matter if the code written blows up in some case other than the one you’re using. You don’t need to handle edge cases that don’t apply to the one time that you will run the code. I write lots of bash code that doesn’t handle a bunch of edge cases, because for my one-off use, that edge case doesn’t arise. Similarly, if an LLMs is generating code that misses some edge case, if it’s a situation that will never arise, and that may not be a problem.

EDIT: I think maybe that you’re misunderstanding me as saying “all bash code is throwaway”, which isn’t true. I’m just using it as an example where throwaway code is a very common, substantial use case.

Artwork@lemmy.world · edit-2 22 days ago

I still don’t get what you mean, sorry. And why Bash and not another shell?
Why not Korn, Ash, Dash, Zsh, Fish, or anything REPL, including PHP, Perl, Node, Python etc.

Should we consider “throwaway” anything that supports interactive mode of your daily driver you chose in your default terminal prompt?
What does “throwaway” code means in the first place?

caseyweederman@lemmy.ca · 22 days ago

I… think they might be misusing the word “bash”? Maybe?

caseyweederman@lemmy.ca · 22 days ago

Yeah, I went back through this reply chain and I couldn’t find any explicit evidence that they’re talking about shell scripting at all, and perhaps think that the “bash programming language” refers to a general style, i.e. “to bash stuff together until it works”.

tal@lemmy.today · 22 days ago

And why Bash and not another shell?

I chose it for my example because I happen to use it. You could use another shell, sure.

Should we consider “throwaway” anything that supports interactive mode of your daily driver you chose in your default terminal prompt?

Interactive mode is a good case for throwaway code, but one-off scripts would also work.

Artwork@lemmy.world · edit-2 22 days ago

If you want to actually realize the amount of possible misunderstanding in the current conversation and of what shell scriting is, please do consider joining #bash at Libera IRC. Please do also mention the word “throwaway” in the rooms! Since there’s literally no understanding on what you mean still, sorry. It does not feel like you have a significant enough understanding of the subjects raised.

For a very simple example, there are literally no documentation regarding certain cases you’ll encounter in Bash’s built-ins even, unless you actually encounter it or learn from Bash’s very source code, like read built-in. Not to mention shenanigans in shell logics for inter-process communication (IPC), file-descriptors, environment variables like PWD, exported functions’ BASH_FUNC_, pipes, etc.

SpicyTaint@lemmy.world · 22 days ago

There is no paradox. vibe coding = not writing code = not being a programmer.

Captain Poofter@lemmy.world · edit-2 22 days ago

so what is it then? i used llms to write the code for a feature complete desktop screen dimming application. did i produce it, if not develop it? am i just a… logic guide? legitimately asking because the program works better than any available alternative

SpicyTaint@lemmy.world · 22 days ago

You’re dictating to and micromanaging an algorithm. Cringe individuals would refer to themselves as “Prompt Engineers” or something like that.

Instead of actually writing the code and understanding what each function actually does, you’re basically just skimming whatever the output is. You’re not gaining experience in programming by doing that and you may not be aware of what exactly everything does. Could be harmless until something unexpected starts causing issues.

For your specific case, an LLM seems completely overkill. I’ve also setup my desktop monitors to change brightness via a couple keyboard shortcuts using ControlMyMonitor and AutoHotkey. It’s like 10 lines of code.

TrumpetX@programming.dev · edit-2 22 days ago

Is it though? Sure he could hire a programmer, but Claude is far less expensive. I agree with his position, he’s a project/product manager, not a programmer. And that’s okay sometimes.

SpicyTaint@lemmy.world · edit-2 21 days ago

You’re right, I apologize, it’s really 3 lines of code repeated over and over. It’s repeated for each monitor/action (brighten/dimming) being performed. The script file it technically 47 lines because I have 3 monitors and 10 different shortcuts.

Here’s the first action for the first monitor. Just edit the name and the brightness amount and voilà. The comment inline code formatting might put it on one line.

^!+PgUp:: Run, D:\Programs\ControlMyMonitor\ControlMyMonitor.exe /ChangeValue "MSI G274QPF" 10 20 return

To be honest, I hesitate to even consider this example programming. The only thing it’s doing is executing a single command when a key combination is pressed. It doesn’t require really any programming experience. No loops, variables, scope, no time complexity, nothing.

Using an AI robbed them of learning even the smallest of concepts and they have not grown as a result.

It’s the same thing with any other concept; I don’t need to dedicate my life being an auto mechanic, but I should at least be able to know how to change a tire when I need to.

Captain Poofter@lemmy.world · edit-2 21 days ago

my program is significantly better than either of those options, including fully functional hotkeys, hybrid gamma/overlay brightness logic, hdr detection, quicktray shortcut slider, a complete scheduling system, window behavior/startup options, and an attractive dark interface all in a tiny window.

and i did it in only 3 days. overkill? good, cuz i got the software i needed without knowing how to code in only 3 days. also, I’ve learned a ton about programming logic and code structure. it’s hard to tell an llm to do something you yourself don’t understand. at least if you want to build anything complex.

PlutoniumAcid@lemmy.world · 22 days ago

I see myself as a project manager and executive producer. I know I don’t write that code, and I couldn’t even if I tried.

But I am skilled at directing and verifying, and this has allowed me to create (not code) a fairly complex WordPress plugin for course bookings with online card payment processing etc.

I have made a few manual tweaks here and there but that code is 98% Claude. And you know what? It works. That is good enough for me.

Buffalox@lemmy.world · edit-2 22 days ago

Is this the same fast to ship but hard to maintain argument we’ve seen a thousand times already?
It’s not a paradox, but a very typical result of using “fast” solutions.

PierceTheBubble@lemmy.ml · 22 days ago

The main paradox here, seems to be: the 70% boilerplate head-start being perceived faster, but the remaining 30% of fixing the AI-introduced mess, negating the marketed time-savings; or even leading to outright counterproductivity. At least in more demanding environments, not cherry picked by the industry, shoveling the tools.

Buffalox@lemmy.world · 22 days ago

I’ll take that as a “Yes”.

tal@lemmy.today · edit-2 22 days ago

Security is where the gap shows most clearly

So, this is an area where I’m also pretty skeptical. It might be possible to address some of the security issues by making minor shifts away from a pure-LLM system. There are (conventional) security code-analysis tools out there, stuff like Coverity. Like, maybe if one says “all of the code coming out of this LLM gets rammed through a series of security-analysis tools”, you catch enough to bring the security flaws down to a tolerable level.

One item that they highlight is the problem of API keys being committed. I’d bet that there’s already software that will run on git-commit hooks that will try to red-flag those, for example. Yes, in theory an LLM could embed them into code in some sort of obfuscated form that slips through, but I bet that it’s reasonable to have heuristics that can catch most of that, that will be good-enough, and that such software isn’t terribly difficult to write.

But in general, I think that LLMs and image diffusion models are, in their present form, more useful for generating output that a human will consume than that a CPU will consume. CPUs are not tolerant of errors in programming languages. Humans often just need an approximately-right answer, to cue our brains, which itself has the right information to construct the desired mental state. An oil painting isn’t a perfect rendition of the real world, but it’s good enough, as it can hint to us what the artist wanted to convey by cuing up the appropriate information about the world that we have in our brains.

This Monet isn’t a perfect rendition of the world. But because we have knowledge in our brain about what the real world looks like, there’s enough information in the painting to cue up the right things in our head to let us construct a mental image.

Ditto for rough concept art. Similarly, a diffusion model can get an image approximately right — some errors often just aren’t all that big a deal.

But a lot of what one is producing when programming is going to be consumed by a CPU that doesn’t work the way that a human brain does. A significant error rate isn’t good enough; the CPU isn’t going to patch over flaws and errors itself using its knowledge of what the program should do.

EDIT:

I’d bet that there’s already software that will run on git-commit hooks that will try to red-flag those, for example.

Yes. Here are instructions for setting up trufflehog to run on git pre-commit hooks to do just that.

EDIT2: Though you’d need to disable this trufflehog functionality and have some out-of-band method for flagging false positives, or an LLM could learn to bypass the security-auditing code by being trained on code that overrides false positives:

Add trufflehog:ignore comments on lines with known false positives or risk-accepted findings

PierceTheBubble@lemmy.ml · 22 days ago

I don’t know: it’s not just the outputs posing a risk, but also the tools themselves. The stacking of technology can only increase attack-surface it seems, at least to me. The fact that these models seem to auto-fill API values, without user-interaction, is quite unacceptable to me; it shouldn’t require additional tools, checking for such common flaws.

Perhaps AI tools in professional contexts, can be best seen as template search tools. Describe the desired template, and it simply provides the template, it believes most closely matches the prompt. The professional can then “simply” refine the template, to match it to set specifications. Or perhaps rather use it as inspiration and start fresh, and not end up spending additional time resolving flaws.

tal@lemmy.today · edit-2 22 days ago

I don’t know: it’s not just the outputs posing a risk, but also the tools themselves

Yeah, that’s true. Poisoning the training corpus of models is at least a potential risk. There’s a whole field of AI security stuff out there now aimed at LLM security.

it shouldn’t require additional tools, checking for such common flaws.

Well, we are using them today for human programmers, so… :-)

PierceTheBubble@lemmy.ml · 22 days ago

Well, we are using them today for human programmers, so… :-)

True that haha…

CCMan1701A@startrek.website · 20 days ago

AI is my coding companion as there are so few devs I can discuss my coding problems with due to their lack of understanding. The AI maybe wrong at times, but just toasing ideas back and forth gets my gears turning. It’s my evolved coding duck.

ImmersiveMatthew@sh.itjust.works · 21 days ago

People can dismiss AI coding but some of us are using it for actual products and making money from it. I have heard for two years now that I will pay the price of vibe coding as I will not understand the code when it breaks. What? I just ask the AI what the code does to understand it and vibe code to fix it. What the heck is everyone on about? Knowing syntax is about as beneficial as knowing the machine code. Who cares in 2026 other than really mission critical apps that AI is not ready for.

UnfairUtan@lemmy.world · 20 days ago

You’re absolutely right!

Now time for some pen-testing if you don’t mind sharing the links to your products.

/s

ImmersiveMatthew@sh.itjust.works · 20 days ago

https://www.meta.com/experiences/theme-park/4212005182188732/

Knock_Knock_Lemmy_In@lemmy.world · 20 days ago

What you are doing is the coding equivalent of not reading the manual on a new appliance. It will work, but you will get surprised. Probably not in a good way.

ImmersiveMatthew@sh.itjust.works · 20 days ago

That is the argument many make but it is assuming that any issues that come up AI will not be able to help which is just not true. I have a top rated, multiplayer VR app and I have not written code for it for 2 years now as AI does it all. Sure issues have come up, but AI can fix as long as you guide it correctly.

Knock_Knock_Lemmy_In@lemmy.world · 20 days ago

But you wrote the bulk of the code 2 years ago? If so you know 90% of the manual.

ImmersiveMatthew@sh.itjust.works · 18 days ago

Most of the code was written in the last 2 years as I have expanded the app a lot, added multi player and reworked all the original code. For my simple app, AI has allowed me to grow much faster than before. I am very grateful for the tech even with its may issues.