
Large language Model AI Technology Is Being Used in Some Healthcare Applications, but Is It Any Good?

by Mary Beth Nierengarten • April 11, 2024


In March 2023, the latest version of ChatGPT debuted. Arguably the most recognized large language model (LLM) in the world, ChatGPT-4 continues the steady improvements made on earlier versions, with enhanced capabilities to train on more inputs and newer data, enabling it to respond to increasingly broad queries with greater accuracy.


Months later, OpenAI, the company that owns ChatGPT, announced the availability of ChatGPT-4 Turbo, which, true to its name, is even faster, supports longer inputs, and is trained on more recent data (up to April 2023, compared to September 2021 in previous models). Multimodal applications integrating ChatGPT with the image-generating LLM Dall-E and, most recently, the video-generating Sora, expand the creative uses of this technology. Other LLMs, notably Microsoft’s Copilot and Google’s Gemini (previously named Bard), are also making headlines, further cementing the transformative potential of this technology, in much the same way that internet search engines did in the 1990s.

In the time between the writing and printing of this article, new versions or updates to these models and updated use applications will be in the news. Trying to keep up with the rapid evolution of this technology is challenging, only adding to the daunting task of implementation—particularly in a field like medicine, which relies so heavily on informed judgment and clinical expertise to fulfill its mission to first do no harm.

As with any powerful new technology, excitement over the real and potential benefits of LLMs within healthcare will need to be continually evaluated against real and potential risks. With the launch of ChatGPT for general usage, the time has arrived to weigh in on this balancing act as more people adopt the technology.


“We definitely have crossed the threshold into a new era of AI, and we probably have already crossed the threshold of ubiquity of using AI on a day-to-day basis for a lot of individuals,” said Alfred-Marc Iloreta, Jr., MD, assistant professor of artificial intelligence and emerging technologies in the Graduate School of Biomedical Sciences at the Icahn School of Medicine at Mount Sinai Hospital in New York City, where he is also an assistant professor of otolaryngology–head and neck surgery and neurosurgery and co-directs the Endoscopic Skull Base Program.

Although uptake of AI in healthcare isn’t yet ubiquitous, as suggested by a December 2023 AMA survey of more than 1,000 physicians in which only 38% of respondents said they currently use AI (AMA Augmented Intelligence Research. Nov. 2023), the easy access and low to no cost of software like ChatGPT ensure its quick growth. Early users report real-time benefits for everyday workflow activities such as administrative tasks and documentation (generating discharge notes, care plans, progress notes, clinical notes, and preauthorization letters), as discussed in the March 2024 ENTtoday Tech Talk article. Other areas under investigation include education, research, and the generation of patient materials. For higher-order clinical activities, such as diagnosis, triage, and treatment decisions, research is ongoing to understand the safe, beneficial uses and limitations of LLMs.
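In practice, the documentation uses described above usually amount to assembling patient facts into a structured prompt and sending it to a model. A minimal sketch of that first step, with hypothetical field names (a real integration would pull these from the EHR and strip identifiers first; the actual model call, which requires an API key, is shown only as a comment):

```python
def build_discharge_note_prompt(patient: dict) -> str:
    """Assemble a structured prompt asking an LLM to draft a discharge note.

    Field names here are hypothetical placeholders, not an EHR schema.
    """
    return (
        "Draft a concise hospital discharge note from these facts. "
        "Do not invent findings that are not listed.\n"
        f"Diagnosis: {patient['diagnosis']}\n"
        f"Procedure: {patient['procedure']}\n"
        f"Course: {patient['course']}\n"
        f"Follow-up: {patient['follow_up']}\n"
    )

prompt = build_discharge_note_prompt({
    "diagnosis": "chronic rhinosinusitis with nasal polyps",
    "procedure": "functional endoscopic sinus surgery",
    "course": "uncomplicated; discharged on postoperative day 1",
    "follow_up": "otolaryngology clinic in 2 weeks",
})
print(prompt)
# A real workflow might then submit the prompt, e.g.:
# from openai import OpenAI
# client = OpenAI()
# reply = client.chat.completions.create(
#     model="gpt-4", messages=[{"role": "user", "content": prompt}])
```

The explicit instruction not to invent unlisted findings reflects a common mitigation for the hallucination problem discussed later in this article; it reduces, but does not eliminate, the risk.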

Below is a brief sampling of the research that’s underway on implementing LLMs, and ChatGPT in particular, in otolaryngology education and patient communications.

ChatGPT for Otolaryngology Education

Habib Zalzal, MD, assistant professor of otolaryngology and pediatrics at Children’s National Medical Center, The George Washington University, Washington, D.C., sees adoption happening at a rapid rate among medical students, residents, and attendings who in recent months have incorporated LLMs into their daily life for educational purposes like studying or journal club summaries. “With this mass adoption, it’s only a matter of time before ChatGPT or other browser-based LLMs become a daily habit in our work day,” he said.

He cautioned, however, that ChatGPT and other LLMs can’t replace traditional learning sources such as textbooks and journal articles and should instead be seen as a supplement to them. In particular, he’s concerned about overreliance on ChatGPT for educating students who don’t yet have the prerequisite medical knowledge base on which to build critical thinking and judgment. “Reliance on early ChatGPT versions, much like a bad habit, is harder to break if the proper knowledge base isn’t there,” he said.

Part of Dr. Zalzal’s caution comes from data showing the limitations of ChatGPT for educational purposes. Following reports showing the ability of ChatGPT to exceed the passing score of the medical licensing exam (PLOS Digit. Health. 2023. doi.org/10.1371/journal.pdig.0000198; Sci Rep. 2023. doi.org/10.1038/s41598-023-43436-9), he and his colleagues undertook a study to quantify how well ChatGPT-3.5 concurred with expert otolaryngologists when asked high-level questions requiring both rote memorization and critical thinking (OTO Open. 2023. doi:10.1002/oto2.94). The tool performed better on open-ended questions (56.7% accuracy) than on multiple-choice questions (43.3% accuracy) but, overall, wasn’t sufficient as a stand-alone educational tool. Its lower accuracy on multiple-choice questions was attributed to ChatGPT’s default of providing some form of answer even when it doesn’t know one, which can easily generate a false or made-up response, called a hallucination.

“LLMs can sometimes generate plausible yet incorrect answers that may mislead or harm users,” he said. “Even if the training data were created using validated sources, the risk of hallucination by the AI model could lead to the spread of misinformation or misuse that cannot be easily controlled.”

Improved accuracy was reported in more recent studies using an LLM trained on a comprehensive knowledge database of otolaryngology-specific information integrated into ChatGPT-4 (JMIR Med Educ. 2024. doi:10.2196/49970; Lancet. 2023. doi:10.2139/ssrn.4571725 (preprint)). Called ChatENT, the model was developed by researchers at the University of Alberta and Copula AI and, according to the authors, is the first specialty-specific LLM in the medical field.

When challenged with practice questions for board certifying exams in Canada and the United States, ChatENT scored 87.2% for accuracy on open-ended, short answer questions and 80% for multiple-choice questions, outperforming ChatGPT-4 with fewer hallucinations and errors identified.
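Grounding a general model in a specialty knowledge base is commonly done via retrieval augmentation: the most relevant reference passages are fetched and prepended to the prompt so the model answers from real text rather than from memory alone. A minimal sketch of the pattern, using word overlap as a stand-in for the embedding similarity a production system would use (the toy knowledge-base passages are invented, and ChatENT’s actual architecture is not described in this detail):

```python
def retrieve(question: str, knowledge_base: list[str], k: int = 2) -> list[str]:
    """Rank passages by word overlap with the question (a crude stand-in
    for embedding-based similarity search)."""
    q_words = set(question.lower().split())
    ranked = sorted(knowledge_base,
                    key=lambda p: len(q_words & set(p.lower().split())),
                    reverse=True)
    return ranked[:k]

def augmented_prompt(question: str, knowledge_base: list[str]) -> str:
    """Build a prompt that constrains the model to the retrieved passages."""
    context = "\n".join(f"- {p}" for p in retrieve(question, knowledge_base))
    return ("Answer using only the reference passages below; "
            "say 'not covered' if they are insufficient.\n"
            f"References:\n{context}\nQuestion: {question}")

kb = [  # invented placeholder passages, not a real reference source
    "Acute otitis media in children is usually managed with observation or antibiotics",
    "Epistaxis most often originates from Kiesselbach's plexus on the anterior septum",
    "Obstructive sleep apnea severity is graded by the apnea-hypopnea index",
]
print(augmented_prompt("Where does epistaxis most often originate?", kb))
```

Constraining the model to retrieved passages is one reason retrieval-augmented systems tend to hallucinate less than a bare model, though retrieval quality then becomes the limiting factor.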

Lead author of the study, Cai Long, MD, an otolaryngology surgical resident at the University of Alberta, said the model, still in its early beta stage, is being continuously updated and improved and that its current state may not fully represent its most refined or comprehensive version. “Future iterations and research are expected to address these aspects, further enhancing the model’s robustness and applicability,” he said. Users who want to test the model can access it at https://www.chatent.net.

Potential applications of ChatENT cited by Dr. Long include medical education, patient education, and clinical decision support, the last of which, he said, has yet to be studied for efficacy.

Eric Gantwerker, MD, MSc, MS, a pediatric otolaryngologist and associate professor at Northwell Health in New York City who regularly teaches students and faculty how to use ChatGPT, views apparent limitations such as hallucinations as part of educating students on the strengths and weaknesses of the technology. He also shows them how they can leverage it to test their own knowledge by judging the validity of outputs from the platform.

People don’t realize that with subsequent updated models, limitations like hallucinations are going to go away. — Eric Gantwerker, MD, MS


Filed Under: Departments, Tech Talk • Tagged With: AI, ChatGPT-4 Turbo • Issue: April 2024


ENTtoday is a publication of The Triological Society.


Copyright © 2025 by John Wiley & Sons, Inc. All rights reserved, including rights for text and data mining and training of artificial technologies or similar technologies. ISSN 1559-4939