• BlameTheAntifa@lemmy.world · ↑2 · 1 hour ago

    Anecdotally, this sounds very close to my own experience. The fact that AI can generate code quickly creates a false sense of speed. While experienced engineers are better at vetting and correcting generated code, that still causes significant overhead. There are also delayed impacts when bugs, vulnerabilities, and oversights slip through. The best case scenario is that you break even, but when you pay attention to the big picture, you realize that you are actually taking longer to reach milestones than you used to. I also hypothesize that this overhead gets worse over time as you are far less familiar with the codebase you leave behind than if you had written it yourself.

    Remember that AI is incapable of reasoning, so it can’t actually apply logic to the code it generates, which is a problem because all code is literally a representation of logic and reasoning.

  • reliv3@lemmy.world · ↑7 · 3 hours ago

    Make sure to read Table 2 in the paper. The people behind this study are urging folks not to draw strong conclusions from it.

  • Pringles@sopuli.xyz · ↑6 ↓1 · 3 hours ago

    Not a developer, but I do a lot of scripting. I recently needed a YAML file with a whole set of parameters, pulled from an Excel file, for about 600 objects. Copilot generated it for me in about a minute. It took some iterations, but it saved me a fuckton of time. Anyway, my point is you need to pick your battles. It’s a tool, so wield it like one.
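    A minimal sketch of that kind of conversion, assuming the Excel sheet is first exported to CSV; the column names (`name`, `host`, `port`) are invented for illustration:

```python
import csv
import io

def rows_to_yaml(csv_text: str) -> str:
    """Emit one YAML mapping entry per CSV row, keyed by the 'name' column."""
    out = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row.pop("name")          # hypothetical key column
        out.append(f"{name}:")
        for key, value in row.items():  # remaining columns become fields
            out.append(f"  {key}: {value}")
    return "\n".join(out) + "\n"

sample = "name,host,port\nweb01,10.0.0.1,8080\nweb02,10.0.0.2,8081\n"
print(rows_to_yaml(sample))
# web01:
#   host: 10.0.0.1
#   port: 8080
# web02:
#   host: 10.0.0.2
#   port: 8081
```

    For real spreadsheets with quoting or nested structures, a proper YAML library would be safer than string formatting, but for a flat parameter dump this is the sort of thing an LLM can bang out in one prompt.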

    • QuizzaciousOtter@lemmy.dbzer0.com · ↑4 · 2 hours ago

      This is the correct approach, of course. But tech giants do everything in their power to convince us that we should use AI for everything, including wiping our asses. And a significant number of people believe this. So it’s important to check these claims and spread the word.

  • MudMan@fedia.io · ↑28 · 8 hours ago

    Sounds about right.

    I’d like to see numbers for inexperienced devs and devs working on somebody else’s code, though.

    EDIT: Oh, this is interesting. The full paper breaks down where the time goes. Turns out coders do in fact spend less time actually working on the code when using AI, but the time spent prompting, waiting on the output and processing the output eats up the difference. They also sit idle for longer with AI. So their forecasts aren’t that crazy, they do work less/faster with AI, but the new extra tasks make them less productive overall.

    That makes a lot of sense in retrospect, but it’s not what I was expecting.

    • Epzillon@lemmy.world · ↑5 · 5 hours ago

      Thanks for the quick summary! I would probably have forgotten to read this later, as I’m at work right now, so thanks!

      • MudMan@fedia.io · ↑6 · 5 hours ago

        Yeah, I had to dig a bit further for this figure. They display the same data more prominently in percentage of the time devoted to each bug, which gives them smaller error bars, but also doesn’t really answer the question that matters regarding where the time went.

        Worth noting that this is a subset of the data, apparently. They recorded about a third of the bug fixes on video and cut out runs with cheating and other suspicious activity. Assuming each recording contains one bug they end up with a fourth of the data broken down this way.

        Which is great, but… it does make you wonder why that data is good enough for the overall over/underestimate plot if it’s not good enough for the task breakdown. Some of the stuff they’re filtering out is outright not following the instructions, or self-reporting times that are more than 20% off from what the recording shows. So we know some of those runs are so far off they didn’t get counted for this, but presumably the rest of the data that just had no video would be even worse, since the timings are self-reported and participants were paid by the hour.

        I’d definitely like to see this with more data, this is only 16 people, even if they are counting several hundred bugs. Better methodology, too. And I’d like to see other coder profiles in there. For now they are showing a very large discrepancy between estimate and results and at least this chart gives you some qualitative understanding of how that happened. I learned something from reading it. Plus, hey, it’s not like AI research is a haven of clean, strict data.

        Of course most people will just parrot the headline, because that’s the world we live in.

  • Quik@infosec.pub · ↑15 · 9 hours ago

    This tracks pretty much with my conclusions for myself, neat.

    It’s crazy, I’m fooled all over again every time and think “surely I must be faster like this,” when after a few days an in-depth reflection/looking at actual commits shows that nope, I wasn’t.

    • truthfultemporarily@feddit.org · ↑10 · 8 hours ago

      Same. I only use it to write boilerplate now. When it is basically copy pasting from docs, it’s faster at it than me.

      Every time I try it to write more complex code, I end up redoing major parts of it.

      • errer@lemmy.world · ↑2 · 5 hours ago

        The main reason why I still think it’s faster even if it’s “slower”: it does its work in the background while I can do other things, like respond to emails, attend meetings, look at other bits of code, etc. I turn on the audio notification to have it ping me when it’s done.

      • TomMasz@lemmy.world · ↑1 · 8 hours ago

        Same here. It’s been trained on so much of that kind of code that you have a much better chance of getting usable code on the first prompt.

  • hendrik@palaver.p3x.de · ↑6 · edited · 9 hours ago

    Interesting study. Also similar to my own observations. I’ve tried AI coding to some degree. Some people recommend it. And it definitely can do some nice things, like write boilerplate code fast and do some tech demos and exploring. And it’s kind of nice to bounce ideas off someone. I mean, the process of speaking things out loud and putting them into words helps me think them through. AI can do that (listen) and it’ll occasionally give back some ideas.

    The downside of coding with it in real life is that I end up refactoring a lot of stuff, and that’s just tedious and annoying work. I’ve had this happen to me several times now, and I think the net time balance is negative for me as well. I think I’m better off coming up with an idea of how to implement something and then just typing it down, rather than skipping that step and then moving stuff around, changing it, and making sure it handles the edge cases correctly and fits into the broader picture. Plus, I still end up doing the thinking step, just in a different way.

    • Quik@infosec.pub · ↑2 · 2 hours ago

      Related to what you said, I found it actually helpful to just write and discuss ideas with some LLM without letting it code.