Context: I’m a second-year medical student currently residing in the deepest pit of the valley on the Dunning-Kruger graph, yet I’m still constantly frustrated and infuriated by the push to introduce AI for quasi-self-diagnosis and to loosen restrictions on inadequately educated providers, like NPs from the for-profit “schools.”

So, is anyone else in a similar spot, where you think you’re kinda dumb, but you know you’re still smarter than the robots and the people at the peak of the Dunning-Kruger graph in your field?

  • PatFusty@lemm.ee · 1 year ago

    Wasn’t there a study showing that some AI can give a better diagnosis than 99% of doctors? I’m too lazy to look it up.

    If that’s the case then yeah, you are probably dumber than the robot.

    Edit: I looked it up, and back in 2020 there was a study that showed 72%. There is another study from 2024 that supposedly says 97%, but it’s behind a paywall.

    • medgremlinOP · 1 year ago

      Here’s an article on one of the studies performed last year, which showed that ChatGPT has, at best, a 64% chance of including the correct diagnosis in its differential, and only a 39% chance of placing the correct diagnosis at the top of its differential. Link to article: https://jamanetwork.com/journals/jama/fullarticle/2806457

      Here’s the article on the study that used pediatric case studies: https://arstechnica.com/science/2024/01/dont-use-chatgpt-to-diagnose-your-kids-illness-study-finds-83-error-rate/ I was unable to get a link to the full PDF of the study in JAMA Pediatrics, but this article is a decent summary of it. The pediatric diagnosis success rate was 17%, and of the incorrect diagnoses, a substantial portion weren’t even in the same organ system as the correct diagnosis.

      As it stands, I would trust ChatGPT to be a scribe for a physician, provided there is a sufficiently accurate speech-recognition system in place, but 64% in the best-case scenario is not a passing score for diagnosing real humans.