Artificial intelligence (AI) applications are rapidly being investigated within the field of OHNS, and AI-based simulation tools may be able to bridge the gap between learning to perform FFL during SBT and performing FFL on a patient. Here, we describe the development and prospective pilot testing of a machine learning (ML) software tool, “Copilot,” which uses a pretrained convolutional neural network for image processing of diagnostic laryngoscopy to help train novice medical students to competently perform FFL on a manikin and improve their uptake of FFL skills.
METHODS
This study was approved by the Johns Hopkins Institutional Review Board: IRB00343343.
OBJECTIVE
In defining the requirements of the AI Copilot, a team of two experienced otolaryngologists determined that, when performed on a simplified model (AirSim Combo Bronchi X manikin; TruCorp Ltd, United Kingdom; Fig. 1), a basic FFL procedure consisted of:
- Entering the nasal cavity and navigating the nasal passage to reach the nasopharynx;
- Visualizing the soft palate;
- Visualizing the epiglottis and vallecula;
- Visualizing the vocal folds; and
- Withdrawing the scope.

A computer scientist then translated these high-level requirements into specific capabilities the AI Copilot would need:
- Identifying the optimal scoping path in the nasal passage;
- Identifying the scope’s location;
- Highlighting key anatomical structures; and
- Providing real-time feedback and navigational cues.
AI Copilot Architecture
To develop the AI Copilot, we used supervised machine learning, in which neural networks learn to predict output labels from human-labeled data. The AI Copilot consisted of two key machine learning components:
1. An image classifier model dubbed the “anatomical region classifier,” responsible for predicting the location of a camera in the upper airway.
2. An object detection model dubbed the “anatomical structure detector,” responsible for locating and identifying key anatomical structures in images.

The outputs of these models were filtered and time-averaged to reduce noise, then fed as inputs into the logic of a larger system that tracked the state of the procedure and the location of the camera. Based on these inputs and the system state, instructions and cues were provided to the user via an overlay on the video feed.
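To make the filtering and state-tracking logic concrete, a minimal sketch is given below, assuming a majority-vote smoothing window and simple step-completion rules; the label names, window size, and completion criteria are illustrative assumptions, not the system’s actual implementation.

```python
# Illustrative sketch (assumed details): time-averaging noisy model outputs
# and advancing a simple procedure state machine from the smoothed results.
from collections import Counter, deque

# Hypothetical label set for the anatomical region classifier.
REGIONS = ["nasal_cavity", "nasopharynx", "oropharynx", "larynx"]

# Ordered procedure steps the copilot cues the user through.
PROCEDURE_STEPS = [
    "enter_nasal_cavity",
    "visualize_soft_palate",
    "visualize_epiglottis_vallecula",
    "visualize_vocal_folds",
    "withdraw_scope",
]


class TemporalFilter:
    """Majority-vote smoothing over the last N frame-level predictions."""

    def __init__(self, window: int = 15):
        self.history = deque(maxlen=window)

    def update(self, label: str) -> str:
        self.history.append(label)
        return Counter(self.history).most_common(1)[0][0]


class ProcedureTracker:
    """Tracks procedure state from smoothed classifier and detector outputs."""

    def __init__(self):
        self.step_idx = 0

    def update(self, region: str, structures: set[str]) -> str:
        # Hypothetical completion criteria for the current step.
        step = PROCEDURE_STEPS[self.step_idx]
        done = (
            (step == "enter_nasal_cavity" and region == "nasopharynx")
            or (step == "visualize_soft_palate" and "soft_palate" in structures)
            or (step == "visualize_epiglottis_vallecula"
                and {"epiglottis", "vallecula"} <= structures)
            or (step == "visualize_vocal_folds" and "vocal_folds" in structures)
        )
        if done and self.step_idx < len(PROCEDURE_STEPS) - 1:
            self.step_idx += 1
        # The current step name would drive the on-screen cue shown to the user.
        return PROCEDURE_STEPS[self.step_idx]
```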
To run the AI Copilot, the live video feed from an Ambu aView 2 Advance was transferred to a computer via an HDMI connection and a video capture card. A local FastAPI web server running on the computer read the video feed, processed it, and sent the modified feed, along with metadata, to a web browser through a webhook. This annotated video feed was displayed on the computer monitor, allowing the AI Copilot to provide real-time cues to the user, overlaid on top of the raw FFL video.
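A video pipeline of this kind could be prototyped roughly as follows; the capture-device index, endpoint path, and overlay text are assumptions for illustration, and real model inference would replace the placeholder annotation step. The server could then be launched with, for example, uvicorn, and the annotated stream viewed in a browser.

```python
# Illustrative sketch (assumed details): reading the HDMI capture feed with
# OpenCV, overlaying cues, and streaming annotated frames from FastAPI.
import cv2
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()
capture = cv2.VideoCapture(0)  # index of the HDMI capture card (assumed)


def annotated_frames():
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        # Placeholder for model inference; a real system would draw the
        # current navigational cue and detected-structure boxes here.
        cv2.putText(frame, "Advance toward the nasopharynx", (20, 40),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        ok, jpeg = cv2.imencode(".jpg", frame)
        if not ok:
            continue
        yield (b"--frame\r\nContent-Type: image/jpeg\r\n\r\n"
               + jpeg.tobytes() + b"\r\n")


@app.get("/video")
def video_feed():
    # Browser-viewable multipart MJPEG stream of the annotated video.
    return StreamingResponse(
        annotated_frames(),
        media_type="multipart/x-mixed-replace; boundary=frame",
    )
```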