
Oxford Study Warns AI Chatbots Are Unsafe for Medical Advice

OXFORD: AI chatbots pose risks to people seeking medical advice despite excelling at standardized medical knowledge tests, according to a study published in Nature Medicine by researchers at the Oxford Internet Institute and the Nuffield Department of Primary Care Health Sciences.

The randomized trial involving 1,298 UK participants found that those using LLMs (GPT-4o, Llama 3, and Command R+) made no better medical decisions than control groups relying on internet searches or their own judgment.

Study co-author Dr. Rebecca Payne, a GP and Clarendon-Reuben Doctoral Scholar, stated “Despite all the hype, AI just isn’t ready to take on the role of physician. Patients need to be aware that asking a large language model about their symptoms can be dangerous, giving wrong diagnoses and failing to recognize when urgent help is needed.”

Participants assessed ten medical scenarios developed by doctors, ranging from a severe headache after a night out to a new mother feeling constantly exhausted. They were asked to identify potential conditions and recommend a course of action, such as visiting a GP or attending A&E.

Researchers identified three key challenges: users did not know what information the LLMs needed, the models gave vastly different answers to slight variations in how questions were phrased, and users struggled to distinguish good from bad information when both appeared together.

Lead author Andrew Bean noted “In this study, we show that interacting with humans poses a challenge even for top LLMs.”

Senior author Dr. Adam Mahdi emphasized, “The disconnect between benchmark scores and real-world performance should be a wake-up call. We cannot rely on standardized tests alone. AI systems need rigorous testing with diverse, real users.”

The research was supported by Prolific, Oxford’s AI Government and Policy Research Programme funded by the Dieter Schwarz Stiftung, the Royal Society, UKRI, and the NIHR Oxford Biomedical Research Centre.

Amita Parul

Amita Parul is an independent journalist with experience in reporting and commentary on current events and sociopolitical developments. She contributes original reporting and analysis that aligns with Tea4Tech’s editorial standards for accuracy, transparency, and context, focusing on business and technology trends. Amita covers emerging news stories and provides explanatory insights that help readers understand both the events and their implications.
