A recent study found that ChatGPT outperformed trainee doctors in diagnosing complex pediatric respiratory cases. The research, presented at the European Respiratory Society conference in Austria, also showed that Google’s Bard occasionally surpassed human performance, while Microsoft’s Bing matched the capabilities of the trainee doctors.
The findings suggest that large language models like ChatGPT could play a significant role in aiding medical professionals, potentially easing the burden on healthcare systems such as the NHS.
In the study, ten trainee doctors with less than four months of pediatric experience were given an hour to use the internet, excluding AI chatbots, to solve complex pediatric respiratory scenarios. These scenarios, crafted by experts, presented diagnostic challenges without obvious answers. The doctors’ responses were then compared to those generated by ChatGPT, Bard, and Bing.
ChatGPT version 3.5 achieved the highest score and was noted for giving more human-like responses than the other chatbots. Dr. Manjith Narayanan, a consultant in pediatric pulmonology at the Royal Hospital for Children and Young People in Edinburgh, who led the study, said, “These tools have several potential applications in medicine. My motivation for this research was to evaluate how well LLMs can assist clinicians in real-life scenarios.”
The study did not find any clear instances of AI “hallucinations,” where chatbots fabricate information. However, Dr. Narayanan cautioned that preventing such occurrences remains crucial. He noted that Bing and Bard provided some irrelevant answers, but so did the trainee doctors.
A recent survey by the Health Foundation, a UK healthcare think tank, revealed that over half of the British public and 75% of NHS staff support the use of AI in patient care. However, both groups expressed concerns that AI lacks the capacity for “real empathy” and “kindness,” which they see as significant drawbacks.
Commenting on the study, Professor Hilary Pinnock of the University of Edinburgh said, “It is encouraging, but perhaps also unsettling, to see a widely available AI tool like ChatGPT solving complex pediatric respiratory cases. This certainly hints at a future where AI-supported care could become the norm.”
Dr. Narayanan plans to further his research by testing chatbots against more experienced doctors and exploring more advanced language models.