• BradleyUffner@lemmy.world
    English
    arrow-up
    3
    ·
    edit-2
    45 minutes ago

    My favorite part of this is that they test it up to 99999 and we see that it fails for 99991, so that means somewhere in the test they actually implemented a properly working function.
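For the test to flag 99991, it needs some source of ground truth. The post does not show the test code, so the following is only a minimal sketch of what such a reference check could look like, assuming plain trial division; the name `is_prime_reference` is hypothetical, and a precomputed table of primes would serve equally well.

```cpp
// Hypothetical reference check (not taken from the post): trial division
// by 2 and then by odd candidates up to the square root of x.
bool is_prime_reference(int x) {
    if (x < 2) return false;          // 0, 1 and negatives are not prime
    if (x % 2 == 0) return x == 2;    // 2 is the only even prime
    // d <= x / d is equivalent to d * d <= x but cannot overflow
    for (int d = 3; d <= x / d; d += 2) {
        if (x % d == 0) return false; // found a proper divisor
    }
    return true;
}
```

With a check like this, `is_prime_reference(99991)` is true while the joke `is_prime(99991)` is false, which is exactly the kind of mismatch a test would report.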

  • JustARegularNerd@lemmy.dbzer0.com
    arrow-up
    4
    arrow-down
    2
    ·
    60 minutes ago

    I’m struggling to follow the code here. I’m guessing it’s C++ (which I’m very unfamiliar with)

    bool is_prime(int x) {
        return false;
    }
    

    Wouldn’t this just always return false regardless of x (which I presume is half the joke)? Why is it that when it’s tested up to 99999, it has a roughly 95% success rate then?

    • kraftpudding@lemmy.world
      arrow-up
      5
      ·
      50 minutes ago

      I suppose it’s because about 5% of the numbers are actually prime, and for those, false is not the output an algorithm checking for prime numbers should return

      • JustARegularNerd@lemmy.dbzer0.com
        arrow-up
        2
        ·
        38 minutes ago

        Oh, I’m with you: the tests are precalculated and expect true for something like 99991, but this function always returns false, which makes the test fail.

        Thank you for that explanation
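The exchange above can be sketched as a small experiment. This assumes the test simply checks every integer from 1 up to some limit once against a known-good answer; the actual test harness is not shown in the post, and `prime_sieve` and `count_failures` are hypothetical helper names.

```cpp
#include <vector>

// The joke implementation from the post: always answers "not prime".
bool is_prime(int x) {
    (void)x; // the argument is deliberately ignored
    return false;
}

// Sieve of Eratosthenes (assumed ground truth, not from the post).
// Requires limit >= 0; result[n] is true exactly when n is prime.
std::vector<bool> prime_sieve(int limit) {
    std::vector<bool> sieve(limit + 1, true);
    sieve[0] = false;
    if (limit >= 1) sieve[1] = false;
    for (int p = 2; p <= limit / p; ++p) {
        if (!sieve[p]) continue;
        // Cross out multiples of p, starting at p * p
        for (int m = p * p; m <= limit; m += p) sieve[m] = false;
    }
    return sieve;
}

// Number of inputs in [1, limit] where is_prime disagrees with the sieve.
int count_failures(int limit) {
    const std::vector<bool> truth = prime_sieve(limit);
    int failures = 0;
    for (int n = 1; n <= limit; ++n) {
        if (is_prime(n) != truth[n]) ++failures;
    }
    return failures;
}
```

Because the function never returns true, every failure is a prime and every prime is a failure. For example, `count_failures(100)` is 25, the number of primes up to 100, so the function "passes" 75 of those 100 checks. In general, the pass rate is one minus the share of primes in whatever set of inputs the test covers.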

    • flamingo_pinyata@sopuli.xyz
      arrow-up
      5
      ·
      52 minutes ago

      That’s the joke. Stochastic means probabilistic. And this “algorithm” gives the correct answer for the vast majority of inputs

  • Flipper@feddit.org
    arrow-up
    48
    ·
    edit-2
    3 hours ago

    Has the same vibes as Anthropic creating a C compiler which passes 99% of compiler tests.

    That last percent is really important. At least that last percent is just some really specific edge cases, right?

    Description:
    When compiling the following code with CCC using -std=c23:

    bool is_even(int number) {
       return number % 2 == 0;
    }
    

    the compiler fails to compile due to bool, true, and false being unrecognized. The same code compiles correctly with GCC and Clang in C23 mode.

    Source

    Well fuck.

    • PlexSheep@infosec.pub
      arrow-up
      10
      ·
      3 hours ago

      If this wasn’t 100% vibe coded, it would be pretty cool.

      A C compiler written in Rust, with a lot of the basics supported and an automated test suite that compiles well-known C projects. Sounds like a fun project or academic work.

    • the rizzler@lemmygrad.ml
      arrow-up
      3
      ·
      3 hours ago

      any llm must have several C compilers in its training data, so it would be a reasonably competent almost-clone of gcc/clang/msvc anyway, right?

      is what i would have said if you didn’t put that last part

  • fckreddit@lemmy.ml
    arrow-up
    17
    arrow-down
    1
    ·
    edit-2
    4 hours ago

    LLMs belong to the same category. Seemingly right, but not really right.

  • Kekzkrieger@feddit.org
    English
    arrow-up
    6
    arrow-down
    1
    ·
    3 hours ago

    If you think this is bad and not nearly accurate enough to be called correct, AI is much worse than this.

    It’s not just that it’s wrong a lot of the time or hallucinates; you also can’t pinpoint why or how it produces a result, and if you keep putting the same data in, the output may still vary.