• NigelFrobisher@aussie.zone · ↑1 · 2 minutes ago

    I have an LLM usage mandate in my performance review now. I can’t trust it to do anything important, so I’ll get it to do incredibly noddy things like deleting a clause (that I literally always have highlighted) or generate documentation that’s more long-winded than just reading the code and then go to the bathroom while it happens.

  • RememberTheApollo_@lemmy.world · ↑6 · 2 hours ago

    Anyone who has had to unfuck someone else’s work knows it would have been faster to do the work correctly from scratch the first time.

  • hankg@friendica.myportal.social · ↑16 · 7 hours ago

    @dgerard I normally consider myself a 10x developer. With the 10x speedup of AI I now consider myself a 100x developer. I can replace an entire small business worth of developers with just myself and my LLM bot assistance. Just pay me $100 million up front no strings and I’ll prove it to you! /s

    • Silic0n_Alph4@lemmy.world · ↑10 · 7 hours ago

      I have the deal of a lifetime for you.

      I represent a group of investors in possession of a truly unique NFT that has been recently valued at over $100M. We will invest this NFT in your 100x business - in return you transfer us the difference between the $100M investment and the excess value of the NFT. Standard rich people stuff, don’t worry about it.

      Let me know when you’re ready to unlock your 100x potential and I’ll make our investment available via a suitable escrow service.

  • David Gerard@awful.systemsOPM · ↑14 · 9 hours ago

    ahahaha holy shit. I knew METR smelled a bit like AI doomsday cultists and took money from OpenPhil, but those “open source” projects and engineers? One of them was LessWrong.

    Here’s a LW site dev whining about the study, he was in it and i think he thinks it was unfair to AI

    I think if people are citing in another 3 months time, they’ll be making a mistake

    dude $NEXT_VERSION will be so cool

    so anyway, this study has gone mainstream! It was on CNBC! I urge you not to watch that unless you have a yearning need to know what the normies are hearing about this shit. In summary, they are hearing that AI coding isn’t all that actually and may not do what the captains of industry want.

    around 2:30 the two talking heads ran out of information and just started incorrecting each other on the fabulous AI future, like the worst work lunchroom debate ever but it’s about AI becoming superhuman

    the key takeaway for the non techie businessmen and investors who take CNBC seriously ever: the bubble starts not going so great

    • BigMuffN69@awful.systems · ↑5 · 7 hours ago (edited)

      Yeah, METR was the group that made the infamous AI IS DOUBLING EVERY 4-7 MONTHS GRAPH where the measurement was 50% success at SWE tasks based on the time it took a human to complete it. Extremely arbitrary success rate, very suspicious imo. They are fanatics trying to pinpoint when the robo god recursive self improvement loop starts.

  • SpaceNoodle@lemmy.world · ↑60 · 14 hours ago

    Devs are famously bad at estimating how long a software project will take.

    No, highly complex creative work is inherently extremely difficult to estimate.

    Anyway, not shocked at all by the results. This is a great start that begs for larger and more rigorous studies.

    • Feyd@programming.dev · ↑20 ↓1 · 13 hours ago

      You’re absolutely correct that the angle they approach that statement from is bullshit. There’s also the fact that they want to believe making software isn’t highly complex creative work but just working an assembly line, and that software devs are gatekeepers who don’t deserve respect.

  • TommySoda@lemmy.world · ↑41 · 14 hours ago

    As someone who has had to double-check people’s code before, especially code from those who don’t comment appropriately, I’d rather just write it all again myself than try to decipher what the fuck they were even doing.

  • swlabr@awful.systems · ↑39 · 14 hours ago

    Megacorp LLM death spiral:

    1. Megacorp managers at all levels introduce new LLM usage policies.
    2. Productivity goes down (see study linked in post).
    3. Managers make the excuse that this is due to a transitional period in LLM policies.
    4. Policies become mandates. Beatings begin and/or intensify.
    5. Repeat from 1.

    • wizardbeard@lemmy.dbzer0.com · ↑10 · 13 hours ago

      I’ve been through the hellscape where managers used missed metrics as evidence for why we didn’t need increased headcount on an internal IT helpdesk.

      That sort of fuckery is common when management gets the idea in their head that they can save money on people somehow without sacrificing output/quality.

      I’m pretty certain they were trying to find an excuse to outsource us, as this was long before the LLM bubble we’re in now.

      • swlabr@awful.systems · ↑11 · 12 hours ago

        oh, absolutely. I mean you could sub out “LLM” with any bullshit that management can easily spring on their understaff. Agile, standups, return to office, the list goes on. Management can get fucked

  • Xerxos@lemmy.ml · ↑2 ↓14 · 13 hours ago

    You have to know what an AI can and can’t do to effectively use AI.

    Finding bugs is one of the worst things to “vibe code”: LLMs can’t debug programs (at least as far as I know), and if the repository is bigger than the context window they can’t even get an overview of the whole project. LLMs can only run the program and guess what the error is based on the error messages and user input. They can’t even control most programs.

    I’m not surprised by the results, but it’s hardly a fair assessment of the usefulness of AI.

    Also I would prefer to wait for the LLM and see if it can fix the bug than hunt for bugs myself - hell, I could solve other problems while waiting for the LLM to finish. If it’s successful great, if not I can do it myself.

    • flizzo@awful.systems · ↑4 · 5 hours ago

      To be fair, you have to have a very high IQ to effectively use AI. The methodology is extremely subtle, and without a solid grasp of theoretical computer science, most of an LLM’s capabilities will go over a typical user’s head. There’s also the model’s nihilistic outlook, which is deftly woven into its training data - its internal architecture draws heavily from statistical mechanics, for instance. The true users understand this stuff; they have the intellectual capacity to truly appreciate the depths of these limitations, to realize that they’re not just bugs—they say something deep about an AI’s operational boundaries. As a consequence, people who dislike using AI for coding truly ARE idiots- of course they wouldn’t appreciate, for instance, the nuance in an LLM’s inability to debug a program, which itself is a cryptic reference to the halting problem. I’m smirking right now just imagining one of those addlepated simpletons scratching their heads in confusion as the LLM fails to get an overview of a repository larger than its context window. What fools… how I pity them. 😂 And yes, by the way, I DO have a favorite transformer architecture. And no, you cannot see it. It’s for the ladies’ eyes only- and even they have to demonstrate that they’re within 5 IQ points of my own (preferably lower) beforehand. Nothing personnel kid 😎

    • V0ldek@awful.systems · ↑8 · 9 hours ago

      I’m not surprised by the results, but it’s hardly a fair assessment of the usefulness of AI.

      It’s a more than fair assessment of the claims of usefulness of AI which are more or less “fire all your devs this machine is better than them already”

      • diz@awful.systems · ↑4 · 7 hours ago (edited)

        And the other “nuanced” take, common on my linkedin feed, is that people who learn how to use (useless) AI are gonna replace everyone with their much increased productive output.

        Even if AI becomes not so useless, the only people whose productivity will actually improve are the people who aren’t using it now (because they correctly notice that it’s a waste of time).

    • swlabr@awful.systems · ↑15 · 12 hours ago

      “This study that I didn’t read that has a real methodology for evaluating LLM usefulness instead of just trusting what AI bros say about LLM usefulness is wrong, they should just trust us, bros”, that’s you

    • David Gerard@awful.systemsOPM · ↑13 · 11 hours ago

      this user has been removed for commenting without reading the article

      being from programming dot dev is just the turd on top

        • self@awful.systems · ↑10 · 10 hours ago

          programmers learned what N means in statistics and immediately realized that “this N is too small” is a cool shortcut to sounding smart without reading the study, its goals, or its conclusions. and you can use it every time N is smaller than the human population on earth!

    • Feyd@programming.dev · ↑27 · 14 hours ago

      You’re acting like this is a gotcha when it’s actually probably the most rigorous study of AI tool productivity change to date.

    • blakestacey@awful.systems · ↑19 · 14 hours ago

      Paragraph 2:

      METR funded 16 experienced open-source developers with “moderate AI experience” to do what they do.

      • HedyL@awful.systems · ↑21 · 13 hours ago

        … and just a few paragraphs further down:

        The number of people tested in the study was n=16. That’s a small number. But it’s a lot better than the usual AI coding promotion, where n=1 ’cos it’s just one guy saying “I’m so much faster now, trust me bro. No, I didn’t measure it.”

        I wouldn’t call that “burying information”.

  • Photuris@lemmy.ml · ↑5 ↓11 · 14 hours ago

    The point isn’t to increase employee productivity (or employee happiness, obviously).

    The point is to replace most employees.

    In order for that to happen, LLM-assisted entry-level developers merely need to be half as good as expert human unassisted developers, at scale, at a lower aggregate cost.

    • Feyd@programming.dev · ↑18 · 13 hours ago

      LLM-assisted entry-level developers merely need to be half as good as expert human unassisted developers

      1. This isn’t even close to existing.
      2. The theoretical cyborg-developer at that skill level would surely be introducing horrible security bugs or brittle features that don’t stand up to change.
      3. Sadly I think this is exactly what many CEOs expect to happen, because they’ve been sold on OpenAI and Anthropic lies that it’s just around the corner.

    • swlabr@awful.systems · ↑22 · 14 hours ago

      Are these entry-level developers that are merely half as good as expert human unassisted developers in the room with us right now?

    • MotoAsh@lemmy.world · ↑14 · 13 hours ago

      That’s a fool’s errand. Only those highly skilled experienced devs are going to spot certain subtle issues or architectural issues that will bite everyone the moment the product grows larger.

      • froztbyte@awful.systems · ↑2 · 52 minutes ago

        as one of the people representing the “hero group” (for lack of a better term) your comment references: eh. I didn’t start out with all this knowledge and experience. it built up over time.

        it’s more about the mode of thinking and how to engage with a problem, than it is about specific “highly skilled” stuff. the skill and experience help/contribute, they refine, they assist in filtering

        the reason I make this comment is because I think it’s valuable that anyone who can do the job well gets to do the thing, and that it’s never good to gatekeep people out. let’s not unnecessarily contribute to imposter syndrome

        • MotoAsh@lemmy.world · ↑1 ↓1 · 45 minutes ago

          The hypothetical is literally underskilled newbies replacing old hats.

            • MotoAsh@lemmy.world · ↑1 · 19 minutes ago

              and yet you’re still wrong. Interesting how popularity isn’t correlated with correctness.

    • Ledivin@lemmy.world · ↑13 ↓3 · 14 hours ago

      Entry-level devs ain’t replacing anyone. One senior dev is going to be doing the work of a whole team

      • Photuris@lemmy.ml · ↑1 ↓7 · 13 hours ago

        For now.

        But when a mid-tier or entry level dev can do 60% of what a senior can do, it’ll be a great way to cut costs.

        I don’t think we’re there now. It’s just that that’s the ultimate goal - employ fewer people, and pay the remaining people you do employ less.

        • Seminar2250@awful.systems · ↑10 · 10 hours ago (edited)

          half as good as expert human

          60% of what a senior can do

          is there like a character sheet somewhere so i can know where i fall on this developer spectrum

          • V0ldek@awful.systems · ↑7 · 9 hours ago

            It’s going to be your INT bonus modifier, but you can get a feat that also adds the WIS modifier

            For prolonged coding sessions you do need CON saving throws, but you can get advantage from drinking coffee (once per short rest)

            • Ledivin@lemmy.world · ↑4 · 8 hours ago

              but you can get advantage from drinking coffee (once per short rest)

              I must have picked up a feat somewhere because I hit that shit way more than once per short rest

        • Feyd@programming.dev · ↑17 · 13 hours ago

          But when a mid-tier or entry level dev can do 60% of what a senior can do

          This simply isn’t how software development skill levels work. You can’t give a tool to a new dev and have them do things experienced devs can do that new devs can’t. You can maybe get faster low tier output (though low tier output demands more review work from experienced devs so the utility of that is questionable). I’m sorry but you clearly don’t understand the topic you’re making these bold claims about.

        • EnsignWashout@startrek.website · ↑18 · 13 hours ago (edited)

          But when a mid-tier or entry level dev can do 60% of what a senior can do, it’ll be a great way to cut costs.

          Same as how an entry level architect can build a building 60% as tall, and that’ll last 60% as long, right?

          Edit: And an entry level aerospace engineer with AI assistance will build a plane that’s 60% as good at not crashing.

          I’m not looking forward to the world I believe is coming…

          • mountainriver@awful.systems · ↑4 · 9 hours ago

            Get 2 and the plane will be 120% as good!

            In fact if children with AI are a mere 1% as good, a school with 150 children can build 150% as good!

            I am sure this is how project management works, and if it is not maybe Elon can get Grok to claim that it is. (When not busy praising Hitler.)

            • froztbyte@awful.systems · ↑1 · 49 minutes ago

              this brooks no argument and it’s clear we should immediately throw all available resources at ai so as to get infinite improvement!!~

              (I even heard some UN policy wonk spout the AGI line recently 🙄)