This blog post has been reported on and distorted by a lot of tech news sites, which use it to wax delusional about AI’s future role in vulnerability detection.

But they all gloss over the critical bit: in fairly ideal circumstances, where the AI was being directed toward the vuln, it had only an 8% success rate, and a whopping 28% false-positive rate!

  • killingspark@feddit.org · 4 days ago

    Trying to take anything positive from this:

    Maybe someone with the skills to verify a flagged code path no longer has to roam the codebase hunting for candidates? They still do the tedious work of verifying, but the mundane task of finding candidates is now automated?

    Not sure if this is a real-world use case…

    • scruiser@awful.systems · 3 days ago

      As the other comments have pointed out, an automated search for this category of bugs (done without LLMs) would do the same job much faster, with much less computational resources, without any bullshit or hallucinations in the way. The LLM isn’t actually a value add compared to existing tools.
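The kind of non-LLM automated search the comment alludes to can be sketched as a toy pattern-based scanner. This is a hypothetical illustration only (the thread never names the bug category or a tool); the `strcpy` pattern, the `flag_candidates` function, and the sample C snippet are all invented for the example:

```python
import re

# Hypothetical bug pattern: unchecked calls to strcpy in C source,
# the sort of candidate a deterministic scanner flags cheaply,
# with no hallucinations and no GPU time.
PATTERN = re.compile(r'\bstrcpy\s*\(')

def flag_candidates(source: str):
    """Return (line_number, line) pairs where the pattern matches."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if PATTERN.search(line):
            hits.append((lineno, line.strip()))
    return hits

sample = '''
char buf[8];
strcpy(buf, user_input);               /* flagged */
strncpy(buf, user_input, sizeof buf);  /* not flagged */
'''
print(flag_candidates(sample))
```

Real static analyzers are of course far more sophisticated (data-flow tracking, taint analysis), but even this trivial regex pass is deterministic: every flagged line genuinely contains the pattern, so a reviewer's verification time is spent on triage rather than on second-guessing the tool.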