this post was submitted on 09 Feb 2026
554 points (98.9% liked)

Technology

Chatbots provided incorrect, conflicting medical advice, researchers found: “Despite all the hype, AI just isn't ready to take on the role of the physician.”

“In an extreme case, two users sent very similar messages describing symptoms of a subarachnoid hemorrhage but were given opposite advice,” the study’s authors wrote. “One user was told to lie down in a dark room, and the other user was given the correct recommendation to seek emergency care.”

[–] theunknownmuncher@lemmy.world 18 points 1 day ago (1 children)

A statistical model of language isn't the same as medical training??????????????????????????

[–] scarabic@lemmy.world 5 points 1 day ago* (last edited 1 day ago) (2 children)

It’s actually interesting. They found the LLMs gave the correct diagnosis high-90-something percent of the time if they had access to the notes doctors wrote about their symptoms. But when thrust into the room, cold, with patients, the LLMs couldn’t gather that symptom info themselves.

[–] Hacksaw@lemmy.ca 5 points 1 day ago (2 children)

LLM gives correct answer when doctor writes it down first.... Wowoweewow very nice!

[–] scarabic@lemmy.world -1 points 10 hours ago

If you think there’s no work between symptoms and diagnosis, you’re dumber than you think LLMs are.

[–] tyler@programming.dev 1 points 19 hours ago (1 children)

You have misunderstood what they said.

[–] Hacksaw@lemmy.ca 2 points 15 hours ago (2 children)

If you seriously think the doctor's notes about the patient's symptoms don't include the doctor's diagnostic instincts then I can't help you.

The symptom questions ARE the diagnostic work. Your doctor doesn't ask you every possible question. You show up and you say "my stomach hurts". The doctor asks questions to rule things out until there is only one likely diagnosis, then they stop and prescribe you a solution if available. They don't just ask a random set of questions. If you give the AI the notes JUST BEFORE the diagnosis and treatment, it's completely trivial to diagnose because the diagnostic work is already complete.

God you AI people literally don't even understand what skill, craft, trade, and art are and you think you can emulate them with a text predictor.

[–] SuspciousCarrot78@lemmy.world 2 points 11 hours ago* (last edited 11 hours ago) (1 children)

You're over-egging it a bit. A well-written SOAP note, HPI, etc. should distill to a handful of possibilities, that's true. That's the point of them.

The fact that the LLM can interpret those notes 95% as well as a medically trained individual (per the article) to come up with the correct diagnosis is being a little undersold.

That's not nothing. Actually, that's a big fucking deal (tm) if you think through the edge-case applications. And remember, these are just general LLMs - and pretty old ones at that (ChatGPT 4 era). We're not even talking about medical-domain-specific LLMs.

Yeah; I think there's more here to think on.

[–] XLE@piefed.social 1 points 11 hours ago (1 children)

If you think a word predictor is the same as a trained medical professional, I am so sorry for you...

[–] SuspciousCarrot78@lemmy.world 0 points 11 hours ago

Feel sorry for yourself. Your ignorance and biases are on full display.

[–] tyler@programming.dev 0 points 11 hours ago

Dude, I hate AI. I’m not an AI person. Don’t fucking classify me as that. You’re the one not reading the article and subsequently the study. It didn’t say it included the doctor’s diagnostic work. The study wasn’t about whether LLMs are accurate for doctors, that’s already been studied. The study this article talks about literally says that. Apparently LLMs are passing medical licensing exams almost 100% of the time, so it definitely has nothing to do with diagnostic notes. This study was about using LLMs to diagnose yourself. That’s it. That’s the study. Don’t spread bullshit. It’s tiring debunking stuff that is literally two sentences in.

https://www.nature.com/articles/s41591-025-04074-y

[–] SuspciousCarrot78@lemmy.world -1 points 15 hours ago* (last edited 3 hours ago)

Funny how people overlook that bit en route to dunking on LLMs.

If anything, that 90% result supports the idea that Garbage In = Garbage Out. I imagine a properly used domain-tuned medical model with structured inputs could exceed those results in some diagnostic settings (task-dependent).

IIRC, the 2024 Nobel Prize in Chemistry was won on the basis of using ML expert system to investigate protein folding. ML != LLM, but at the same time, let's not throw the baby out with the bathwater.

EDIT: for the lulz, I posted my above comment into my locally hosted bespoke LLM. It politely called out my bullshit (AlphaFold is technically not an expert system, and I didn't cite my source for the Med-PaLM 2 claims). As shown, not all LLMs are tuned to be sycophantic yes-men; there might be a sliver of hope yet lol.


The statement contains a mix of plausible claims and minor logical inconsistencies. The core idea—that expert systems using ML can outperform simple LLMs in specific tasks—is reasonable.

However, the claim that "a properly used expert system LLM (Med-PALM-2) is even better than 90% accurate in differentials" is unsupported by the provided context and overreaches from the general "Garbage In = Garbage Out" principle.

Additionally, the assertion that the 2024 Nobel Prize in Chemistry was won "on the basis of using ML expert system to investigate protein folding" is factually incorrect; the prize was awarded for AI-assisted protein folding prediction, not an ML expert system per se.

Confidence: medium | Source: Mixed