Presentation
Human Factors Approach to Developing an AI-enabled Endoscopic Task Trainer
Session
Poster Session 2
Description
Across healthcare domains, education programs use a variety of simulation modalities and techniques for training. Research comparing simulation modalities has highlighted that the appropriate simulation for achieving specific learning outcomes is context dependent, with no one modality superior for all training needs (North Atlantic Treaty Organization, 2021). Although educators strive to create a level of realism suited to specific learning outcomes, currently available higher-fidelity simulators may not be optimal in many circumstances (Norman, Dore, & Grierson, 2012). Simulators with more capabilities and more realistic anatomy and physiology often carry tradeoffs, such as higher cost and less durability or portability (Bailey et al., 2023). This is a problem because, while learners need more practice prior to patient care, many healthcare training programs do not have the resources, space, or expertise to allow for extensive practice with higher-cost simulations. As technological innovations such as 3D printing and artificial intelligence integration become more widely available, simulators can approach high-fidelity capabilities at much lower cost.
Research suggests that learning outcomes depend more on how closely a simulation aligns with the components of the task, that is, on functional-task alignment, than on striving for the most human-like realism (Hamstra et al., 2014). Using a functional-task alignment conceptualization and human factors approaches, we developed a novel simulator for a common task undertaken by Speech Language Pathologists (SLPs): the Fiberoptic Endoscopic Evaluation of Swallowing (FEES) exam, which is used to diagnose disordered swallowing (“dysphagia”). The FEES exam requires psychomotor skills for scope handling, patient communication skills, and real-time clinical decision-making. During the exam, a clinician passes an endoscope through the patient’s nasal passage and pharynx in order to visualize the larynx, then guides the patient through swallowing a series of foods and liquids while observing for signs of aspiration and other swallowing abnormalities. The complexity and dynamic nature of this task are not currently modeled in commercial task trainers. Practice during SLP training is typically conducted either with human volunteers (often other students in the training program), who may lack the range of dysphagia encountered in clinical practice, or with high-fidelity manikins, which replicate anatomy but cannot exhibit the key behavior of swallowing and lack the visual fidelity to cue the next step in the exam, representing a lack of functional-task alignment. To address the need for more practice on the FEES exam and to reduce practice with human volunteers, we developed a prototype FEES simulator using low-cost materials, including a 3D-printed nasal passage model and AI-enabled computer vision to track the progress of the scope. We address the gap in functional-task alignment of current trainers by integrating de-identified, pre-recorded patient videos into our system. Combined with the AI tracking of the scope while learners perform the task, the videos provide a realistic visualization of human scoping that closely mirrors the actual task.
To design the FEES simulator prototype, we used a human factors approach from Cannon-Bowers et al. (2013), a healthcare simulation-specific cognitive task analysis (CTA). Following this method, for each step of the task we defined 1) the cues used to perform the action, 2) the simulator requirements needed to perform that action, 3) the typical errors students make on that step, 4) the observable behaviors that indicate whether the student can perform that step, and 5) the cognitive demand of that step. First, we defined the actions of FEES starting from published guidelines (Langmore et al., 2022) and checklists (American Speech Language Hearing Association, 2019), and filled in steps not explicitly stated in the published guidelines through consultation with an SLP subject matter expert. After listing all the actions, a human factors researcher and an SLP clinician filled in the other components of the CTA for each action.
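As an illustration only, the five CTA components listed above could be recorded per action in a simple structure like the one below. This is a minimal sketch, not the authors' tooling: the class name `CTAStep`, the field names, and the helper `missing_components` are all hypothetical, and the example action is paraphrased from the exam description rather than taken from the actual CTA.

```python
from dataclasses import dataclass, field

# Hypothetical record for one action in the task analysis.
# The five fields after `action` mirror the five CTA components in the text.
@dataclass
class CTAStep:
    action: str
    cues: list[str] = field(default_factory=list)
    simulator_requirements: list[str] = field(default_factory=list)
    typical_errors: list[str] = field(default_factory=list)
    observable_behaviors: list[str] = field(default_factory=list)
    cognitive_demand: str = ""

# Names of the five components, in the order given in the text.
COMPONENTS = (
    "cues",
    "simulator_requirements",
    "typical_errors",
    "observable_behaviors",
    "cognitive_demand",
)

def missing_components(step: CTAStep) -> list[str]:
    """Return the CTA components that are still empty for this action."""
    return [name for name in COMPONENTS if not getattr(step, name)]

# Example: an action listed from the guidelines, before the researcher
# and clinician have filled in the remaining components.
step = CTAStep(action="Pass the endoscope through the nasal passage")
print(missing_components(step))
```

A structure like this makes the two-pass workflow explicit: actions are listed first, and a check such as `missing_components` shows which components still need input from the subject matter expert.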
Following the CTA, the human factors researcher and the SLP discussed with an engineering team which components of each action were feasible to include in a prototype. This included deciding which material was most appropriate for 3D printing the nasal passage model to produce a realistic haptic feel while keeping material costs in mind. The engineering team also tested different computer vision approaches to achieve tracking reliable enough for smooth navigation of the endoscope in the 3D-printed model. The SLP then outlined the user flow of the task with the actions identified for the prototype, highlighting the steps that involved branching logic, that is, steps where the student has multiple action choices. De-identified patient videos that aligned with the progression of the FEES exam were then integrated with the 3D model, allowing students to visualize each step as they advanced through the simulation. From these steps, the team created a minimum viable product (MVP) prototype of the FEES simulator consisting of a 3D-printed model, de-identified patient video of the FEES exam, and computer vision tracking of the scope. Actions that could not be simulated in the prototype (e.g., haptic feedback of swallowing) were planned for future iterations of the simulator.
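One way to picture how tracked scope position could select the matching patient video is sketched below. This is a hedged illustration under stated assumptions, not the prototype's implementation: it assumes the tracker reports a normalized insertion depth between 0 and 1, and the names `VideoSegment`, `SEGMENTS`, `segment_for_depth`, the clip identifiers, and the depth thresholds are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass

@dataclass(frozen=True)
class VideoSegment:
    start_depth: float  # hypothetical normalized depth where this clip begins
    clip_id: str        # hypothetical identifier of a de-identified clip

# Hypothetical segments, ordered by depth along the scope's path
# (nasal passage, then pharynx, then the view of the larynx).
SEGMENTS = [
    VideoSegment(0.0, "nasal_passage"),
    VideoSegment(0.4, "pharynx"),
    VideoSegment(0.7, "larynx_view"),
]

def segment_for_depth(depth: float, segments=SEGMENTS) -> VideoSegment:
    """Pick the clip whose depth range contains the tracked scope position."""
    clamped = min(max(depth, 0.0), 1.0)  # guard against tracker overshoot
    starts = [s.start_depth for s in segments]
    # bisect_right finds the first segment starting beyond the position;
    # the segment just before it is the one currently in view.
    return segments[bisect_right(starts, clamped) - 1]

print(segment_for_depth(0.5).clip_id)
```

Branching steps could be layered on top of a lookup like this, for example by choosing among several clips for the same depth range according to the action the student selects.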
This human factors-driven development approach demonstrates how task analysis can inform the creation of targeted, cost-effective training solutions that address specific educational needs while remaining accessible to resource-constrained programs. The methodology provides a replicable framework for developing specialized medical simulation tools, particularly for procedures currently lacking commercial training alternatives.
Event Type
Poster Presentation
Time
Tuesday, March 24, 4:45pm - 6:15pm EDT
Location
Rhinelander Gallery
Simulation and Education
