• 1 Post
  • 188 Comments
Joined 3 years ago
Cake day: July 29th, 2023



  • How it started: in 2025, the city of Dublin, Ohio (the latter detail missed by quite a lot of reporting, because there are no other Dublins it might get confused with, I guess) gets an autonomous(?) ai-powered police surveillance robot.

    City officials are encouraging residents to interact with Dubbot—ask questions, take selfies, and experience firsthand how AI is shaping public safety. The goal is to foster transparency and gather feedback to refine the robot’s role in the community.

    How it’s going:

    The person-sized, camera-covered robot that looked like it rolled right out of a sci-fi movie did not identify any criminal incidents, issue any tickets or help with any arrests in its nearly 10 months on the job.

    On the other hand, I bet it didn’t shoot anyone’s dog, so who’s to say that the $64k was wasted.



  • Who even has time for that? Do you think that the people behind palantir, icarus and sauron have time to read google summaries? They’re too busy remaking the world!

    Anyway, if you’re successful enough you’ll eclipse the original source in terms of importance and all the search engine summaries will be about you anyway, so any time spent learning anything before that will have been completely wasted.



  • Some folks, who may be familiar to some or more of you, accidentally discovered that if your git repo symlinks CLAUDE.md to, say, /dev/urandom, it breaks Claude Code.

    the reason why this works is exactly the reason why claude code sucks so bad. there are protections against this in the file reading tool. however because everything in claude code is implemented in 5 million different ways, those protections are a completely orthogonal set of codepaths from how CLAUDE.md files are read. conversely, the file read tool seems to be completely naive to symlinks while the CLAUDE.md reader is not. this is the fucking swiss cheese security model of the fucking gold standard of what AI programming can do.

    https://neuromatch.social/@jonny/116779793188712173

    The thread is actually about trying to attract and manipulate autonomous coding agents, but they’ve only had limited success so far, which may have been slowed down by the above symlink trick.
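
    For anyone who wants to see the shape of the problem: below is a minimal sketch of the kind of check a file reader could do before slurping a CLAUDE.md. This is not Claude Code's implementation (I have no idea what that looks like beyond the quoted description); `read_instructions` and `MAX_BYTES` are names made up for this example. It resolves the symlink, refuses anything that is not a regular file, and caps the read size.

```python
import os
import stat

# Arbitrary cap picked for this sketch, not taken from any real tool.
MAX_BYTES = 64 * 1024


def read_instructions(path, max_bytes=MAX_BYTES):
    """Hypothetical defensive reader for an instructions file.

    Follows symlinks, rejects targets that are not regular files
    (devices, directories, FIFOs), and never reads more than max_bytes.
    """
    real = os.path.realpath(path)        # resolve any symlink chain
    mode = os.stat(real).st_mode         # stat the resolved target
    if not stat.S_ISREG(mode):           # /dev/urandom is a character device
        raise ValueError(f"{path} resolves to a non-regular file: {real}")
    with open(real, "rb") as fh:
        data = fh.read(max_bytes)        # bounded read, even for huge files
    return data.decode("utf-8", errors="replace")
```

    With a repo where CLAUDE.md is a symlink to /dev/urandom, `read_instructions("CLAUDE.md")` raises `ValueError` instead of reading random bytes forever. There is still a gap between the `stat` and the `open`, so treat it as an illustration rather than a hardened fix.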


  • I think part of the issue is that historical software quality was an artefact of its time… if you can’t easily patch your released products, you need to work harder to ensure they’re functional. If the only way for people to learn about how your product works is the documentation you ship with it, the docs need to be useful and comprehensive.

    The combination of software needing no guarantee of merchantability or fitness for any particular purpose and the internet rendered those pressures obsolete. Ship shit, fix later. Mass-scale a/b testing over the past decade or two shows that most people seemingly don’t care if their software runs like absolute garbage, and is covered in adverts, and harvests all their personal data and then leaks all of it that wasn’t sold.

    An incident-to-pr ratio that’s up by roughly 250% is unfortunate, but it is not yet so bad that the end-users actually care enough to do anything about it, even assuming they can do anything.


  • This is by an llm-boosting firm, so be aware that it’ll have a lot of marketing in it. It doesn’t say nice things about vibe code (presumably because the authors want to sell you a solution) but the numbers are interesting even so.

    https://www.faros.ai/blog/ai-acceleration-whiplash-takeaways

    A few choice snippets, none of which will surprise anyone here:

    1. For every code change merged, the probability of a production incident has more than tripled.

    The incidents-to-PR ratio is up 242.7% as teams move from low to high AI adoption.

    1. Bugs are accelerating, not stabilizing.

    In our 2025 AI engineering report on the AI Productivity Paradox, bugs per developer were up 9% as AI adoption grew. In this dataset, that figure has risen to 54%

    1. The most experienced people in your organization are being buried.

    Median time to first PR review is up 156.6%. Average time spent in code review is up 199.6%. Median time in review is up 441.5%. The engineers with the deepest knowledge of the system are spending their most valuable hours unraveling plausible-looking code that should never have reached them in the state it did.
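
    Since "up N%" and "N times" get mixed up a lot, here is a small sketch of the arithmetic behind the quoted figures. The only assumption is the usual reading that "up 242.7%" means the new value is the old value plus 242.7% of it; `multiplier` is a name invented for this example, and the percentages are the ones quoted above.

```python
def multiplier(percent_increase):
    """Convert an 'up N%' figure into a 'times the baseline' factor."""
    return 1 + percent_increase / 100


# Percent increases as quoted from the report above.
quoted = {
    "incidents-to-PR ratio": 242.7,
    "median time to first PR review": 156.6,
    "average time in code review": 199.6,
    "median time in review": 441.5,
}

for name, pct in quoted.items():
    print(f"{name}: up {pct}% is {multiplier(pct):.3f} times the baseline")
```

    So a 242.7% increase is about 3.4 times the baseline, which is where "more than tripled" comes from, while the 199.6% rise in average review time lands just short of triple.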




  • because there’s no economic incentive to hire them to do that kind of work.

    isn’t that the old “basic science is boring and unsexy” issue though? There are economic incentives, but not in a short term-big-bux sort of way, so capitalism can’t be trusted with it.

    To conjure up a recent example, something like “The number of curves of genus two with elliptic differentials”, published back in 1997, probably had limited commercial value at the time, but 25 years later completely sank a promising post-quantum cryptography algorithm (“An efficient key recovery attack on SIDH”) which might have had some non-trivial commercial implications if SIKE had got through the key exchange algorithm competition.

    Anyway, the Erdős problems are good candidates for llm work because they have been specified in a careful and formal way, which requires a reasonably competent mathematician to do. That then opens up mathematics to the same deskilling problem that other sectors afflicted with llms have, and because capitalism is shortsighted and stupid we don’t know what the future economic impact of that will be, right?





  • It isn’t clear to me at this point that such research will ever be funded in english-speaking places without a significant set of regime changes… no politician or administrator can resist outsourcing their own thinking to llm vendors in exchange for funding. I expect the US educational system will eventually provide a terrible warning to everyone (except the UK, whose government looks at the US and says “oh my god, that’s horrifying. How can we be more like that?”).

    I’m probably just feeling unreasonably pessimistic right now, though.



  • It is related, inasmuch as it’s all generated from the same prompt and the “answer” will be statistically likely to follow from the “reasoning” text. But it is only likely to follow, which is why you can sometimes see a lot of unrelated or incorrect guff in “reasoning” steps that’s misinterpreted as deliberate lying by ai doomers.

    I will confess that I don’t know what shapes the multiple “let me just check” or correction steps you sometimes see. It might just be a response stream that is shaped like self-checking. It is also possible that the response stream is fed through a separate llm session which then pushes its own responses into the context window before the response is finished and sent back to the questioner, but that would boil down to “neural networks pattern matching on each other’s outputs and generating plausible response token streams” rather than any sort of meaningful introspection.

    I would expect the actual systems used by the likes of openai to be far more full of hacks and bodges and work-arounds and let’s-pretend prompts than either you or I could imagine.


  • It’s just more llm output, in the style of “imagine you can reason about the question you’ve just been asked. Explain how you might have come about your answer.” It has no resemblance to how a neural network functions, nor to the output filters the service providers use.

    It’s how the ai doomers get themselves into a flap over “deceptive” models… “omg it lied about its train of thought!” when of course it didn’t lie, it just emitted a stream of tokens that were statistically similar to something classified as reasoning during training.



  • I might be the only person here who thinks that the upcoming quantum bubble has the potential to deliver useful things (but boring useful things, and so harder to build hype on) but stuff like this particularly irritates me:

    https://quantumai.google/

    Quantum fucking ai? Motherfucker,

    • You don’t have ai, you have a chatbot
    • You don’t have a quantum computer, you have a tech demo for a single chip
    • Even if you had both of those things, you wouldn’t have “quantum ai”
    • if you have a very specialist and probably wallet-vaporisingly expensive quantum computer, why the hell would anyone want to glue an idiot chatbot to it, instead of putting it in the hands of competent experts who could actually do useful stuff with it?

    Best case scenario here is that this is how one department of Google get money out of the other bits of Google, because the internal bean counters cannot control their fiscal sphincters when someone says “ai” to them.