a recent attempt to rewrite SQLite in Rust using AI. “It passed all the unit tests, the shape of the code looks right,” he said. “It’s 3.7x more lines of code that performs 2,000 times worse than the actual SQLite. Two thousand times worse for a database is a non-viable product. It’s a dumpster fire. Throw it away. All that money you spent on it is worthless.”
Insurers, he said, are already lobbying state-level insurance regulators to win a carve-out in business insurance liability policies so they are not obligated to cover AI-related workflows. “That kills the whole system,” Deeks said. Smiley added: “The question here is if it’s all so great, why are the insurance underwriters going to great lengths to prohibit coverage for these things? They’re generally pretty good at risk profiling.”
Wow.
2000x worse, huh?
I mean… I’m impressed that it runs, passes unit tests, and ‘works’, but is also that much worse.
That’s a kind of achievement.
Not a useful kind, but… impressively bad.
Also it’s probably incredibly difficult to optimize a huge LLM-generated codebase since there are no human authors who know it intimately to begin with.
Having a stable set of individuals with a deep understanding of ‘how things work’ is so totally anathema to the modern paradigm of ‘every coder is a contractor, basically’.
Everybody wants to do software development, but doesn’t want to foster software developers.
So, they try and build a machine god to replace us, and as most of us predicted… it didn’t work out so well, but goddamnit, they’ll burn a trillion dollars before they let their ego take a hit.
… oh well, I guess.
The 2000x difference is for more complex workloads. It has ok performance for very simple queries.
So not quite as bad as the headline number suggests. But still very bad and not a viable alternative.
I mean, 2x would already be unacceptable.
I mean, on the one hand, it’s SQLite.
On the other hand…
… arguably the entire point of a database language is to efficiently handle complex workloads.
And then when you remember that… this was a project, in development, that cost time, money, energy, made RAM prices go up by maybe ¢22 per GB all on its own…
This is an insane negative return on investment.
Like imagine if you paid the same amount of money to … people, a contracted firm, and they handed you this.
You’d potentially be firing them or suing them for breach of contract, blacklisting them as far and wide as you could.
I couldn’t find if they were able to fix the identified bugs, seems like an important detail. How far does a month of LLM plus a month of talent get you?
They probably don’t care. They did this to generate headlines about how capable their AI is. It has served its purpose. So long as all of the investors only saw the propaganda articles, the line will only go up and they can abandon this project.
The propaganda articles about how the LLM missed critical logic and that it performs worse than SQLite?
I’m less interested in the extreme skepticism or hype.
The project is an impressive demonstration from a pure technical perspective. I couldn’t imagine 5 years ago a model being able to rewrite such a complex project.
Reminds me of the Claude fluff article that mentions that reverts have only increased 0.04%.
It made big noise that the number of Pull Requests has doubled though.
Logically, because you have one PR from an LLM, then another by a human to fix the LLM slop.
Or a large number (vast majority I’d bet) of LLM generated PRs are going unmerged. No need to revert something that hasn’t been merged.
@rimu How does it get worse at all?! SQLite has quite a few vtables, and Rust has monomorphisation, so it should be possible to do better on benchmarks.
(I have occasionally thought about rewriting SQLite in Rust before regaining my senses.)
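To sketch the point about vtables vs. monomorphisation: SQLite’s C-style vtables dispatch through function pointers at runtime, whereas Rust generics are monomorphised per concrete type, so the compiler can inline the hot calls. A minimal, hypothetical illustration (the `Cursor` trait and `RangeCursor` type here are made up for the example, not anything from SQLite or the rewrite):

```rust
// A toy cursor interface, loosely analogous to a table cursor's `next`.
trait Cursor {
    fn next(&mut self) -> Option<i64>;
}

// A trivial cursor yielding 0..end.
struct RangeCursor {
    cur: i64,
    end: i64,
}

impl Cursor for RangeCursor {
    fn next(&mut self) -> Option<i64> {
        if self.cur < self.end {
            self.cur += 1;
            Some(self.cur - 1)
        } else {
            None
        }
    }
}

// Dynamic dispatch: each `next` call goes through a vtable,
// much like C function-pointer tables.
fn sum_dyn(c: &mut dyn Cursor) -> i64 {
    let mut s = 0;
    while let Some(v) = c.next() {
        s += v;
    }
    s
}

// Static dispatch: monomorphised for each concrete `C`,
// so `next` can be inlined into the loop.
fn sum_generic<C: Cursor>(c: &mut C) -> i64 {
    let mut s = 0;
    while let Some(v) = c.next() {
        s += v;
    }
    s
}

fn main() {
    let mut a = RangeCursor { cur: 0, end: 10 };
    let mut b = RangeCursor { cur: 0, end: 10 };
    println!("{}", sum_dyn(&mut a)); // 45
    println!("{}", sum_generic(&mut b)); // 45
}
```

Both paths compute the same result; the difference is only in what the optimizer can see, which is why one might hope a Rust port could match or beat the C original on tight inner loops rather than run 2,000x slower.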
Tbh I see myself using AI for shits and giggles. (nothing helpful)
I try not to use it a lot due to the ethics that come with it.
In fairness to LLMs, (which I run locally) I’ve been able to use them for like, bits of code that are roughly 200 lines or less.
Or like, feed it a code base and say hey, make sure all the comments are formatted the same way.
But uh, for… trying to engineer an entire system?
Nope nope nope, they get very confused, very fast, as overall complexity increases.
There are some things it can do well, like collecting / organizing certain data out of large documents. Sort of like a recursive-multi-google operation.