OXFORD: AI chatbots pose risks to people seeking medical advice despite excelling at standardized medical knowledge tests, according to a study published in Nature Medicine by the Oxford Internet Institute and the Nuffield Department of Primary Care Health Sciences.
However, advancements in AI-powered medical diagnostics continue to show promise in specialized areas like prostate cancer detection.
The randomized trial, involving 1,298 UK participants, found that those using LLMs (GPT-4o, Llama 3, and Command R+) made no better medical decisions than control groups using internet searches or personal judgment.
Study co-author Dr. Rebecca Payne, a GP and Clarendon-Reuben Doctoral Scholar, stated, “Despite all the hype, AI just isn’t ready to take on the role of physician. Patients need to be aware that asking a large language model about their symptoms can be dangerous, giving wrong diagnoses and failing to recognize when urgent help is needed.”
Participants assessed ten medical scenarios developed by doctors, ranging from severe headaches after nights out to new mothers feeling constantly exhausted. They identified potential conditions and recommended courses of action like visiting GPs or attending A&E.
Researchers identified three key challenges: users didn’t know what information LLMs needed, models provided vastly different answers to slight question variations, and users struggled to distinguish good from bad information when both appeared together. At the same time, AI interpreting MRI scans highlights how machine learning can perform effectively in structured medical imaging environments.
Lead author Andrew Bean noted, “In this study, we show that interacting with humans poses a challenge even for top LLMs.”
Senior author Dr. Adam Mahdi emphasized, “The disconnect between benchmark scores and real-world performance should be a wake-up call. We cannot rely on standardized tests alone. AI systems need rigorous testing with diverse, real users.”
The research was supported by Prolific; Oxford’s AI Government and Policy Research Programme, funded by the Dieter Schwarz Stiftung; the Royal Society; UKRI; and the NIHR Oxford Biomedical Research Centre.