Presentation

Checklist Quality Is More Than Compliance: Insights from Surgical Teams Across Five Hospitals
Description

Background

The World Health Organization’s (WHO) Surgical Safety Checklist (SSC) is a globally adopted 19-item tool designed to reduce preventable surgical harm by structuring team communication at three pause points: briefing (before anesthesia), timeout (before incision), and debrief (before leaving the operating room). WHO explicitly encourages local adaptation of how the checklist is performed to fit the local context while preserving core safety functions, leading to SSC conduct variants from region to region and hospital to hospital.

Public data consistently demonstrate near-perfect self-reported hospital SSC compliance. Yet, SSC-associated patient outcomes have not mirrored this apparent success. While early WHO-led studies showed promising improvements among early adopters, a 2014 landmark study of over 100 Ontario hospitals published in NEJM found no significant improvement in surgical mortality or complication rates following SSC implementation. Furthermore, numerous single-site observational studies have failed to replicate these high self-reported compliance rates. These findings highlight the need for empirical evaluation of SSC use, as self-reported compliance may not reflect practice, and teams may not be engaging with the SSC as intended, limiting its effectiveness and impact on patient outcomes.

The combination of known variation in SSC conduct and the persistent gap between work as imagined (self-reported compliance) and work as done (SSC use in operating rooms [ORs]) underscores the need to observe practice across hospitals. Such observations not only allow critical evaluation of current SSC practices but can also help identify features of high-quality SSC use and surface transferable practices to better align each site’s work as imagined and work as done.

Prior multi-site observational studies of the SSC are scarce, often focusing on compliance rate differences without detailing how the SSC is actually conducted or integrating qualitative insights with quantitative measures. To address this gap, we conducted a mixed-methods study. OR observations across five hospitals were undertaken to capture specific SSC conduct patterns (participation, cognitive aid access, step-by-step conduct). These qualitative features were then linked to quantitative SSC metrics (initiation and item completion rates) at each pause point. This approach is designed to generate context-sensitive insights for quality improvement and offer a more sensitive assessment of SSC use beyond binary compliance.


Objectives

The objectives of this study were to 1) empirically assess SSC pause point and item compliance rates across multiple hospitals and quantify their variation; 2) investigate how observed differences in SSC conduct influence SSC quality (SSC usage that is consistent with the facilitation of structured team discussions); and 3) identify actionable targets for future SSC quality improvement efforts.


Methods

We conducted a cross-sectional, multi-site observational study using the OR Black Box®. The system captures and synchronizes multiple OR audio-video streams. Observations were drawn from five large hospital systems in North America.

Human factors specialists reviewed complete surgical cases, documenting all performed pause points and manually transcribing observed SSC conduct into a structured Excel spreadsheet. For each pause point, we captured whether it occurred, its duration, the team members present, who initiated and who led the pause, the use and availability of cognitive aids, and item-level verbalization of core checklist items (e.g., patient identification, procedure name, allergies, antibiotics, equipment/implants, team/patient concerns). Additionally, qualitative descriptors of SSC conduct were captured and summarized for each pause point and site. These included observations on checklist sequence and flow, inclusivity of participation, and cognitive aid ergonomics (e.g., format, accessibility, workflow fit).

Quantitative and qualitative data were then synthesized to produce pause point- and site-specific summaries. Qualitative insights were used to help explain variation in quantitative metrics and to identify key features associated with high-quality SSC conduct.


Preliminary Results

As of September 10, 2025, 136 cases have been observed. The number of cases per hospital, coded by hospital number, is n1 = 28, n2 = 30, n3 = 16, n4 = 32, and n5 = 30. Observations will continue until n3 reaches 30 cases, for a total of 150 cases.

Quantitative metrics revealed substantial variation across hospitals. Briefing was initiated most consistently (mean 97%; range 93-100%), followed by timeout (mean 86%; range 71-100%) and debrief (mean 63%; range 30-100%). Overall, 55% of cases met the standard compliance definition (i.e., all three pause points were initiated, irrespective of quality), with site-level compliance ranging from 30% to 94%.
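The initiation and standard compliance metrics above can be illustrated with a minimal sketch. It assumes a hypothetical per-case record (`CaseRecord`) with one flag per pause point; this is not the study's actual spreadsheet layout, and the toy records are made up for illustration, not study data.

```python
from dataclasses import dataclass

PAUSE_POINTS = ("briefing", "timeout", "debrief")


@dataclass(frozen=True)
class CaseRecord:
    """One observed case (hypothetical schema, not the study's spreadsheet)."""
    hospital: int
    briefing: bool  # True if the pause point was initiated
    timeout: bool
    debrief: bool


def initiation_rate(cases, pause_point):
    """Share of cases in which the given pause point was initiated."""
    if not cases:
        raise ValueError("at least one case is required")
    return sum(getattr(c, pause_point) for c in cases) / len(cases)


def standard_compliance_rate(cases):
    """Share of cases in which all three pause points were initiated,
    irrespective of how well each one was conducted."""
    if not cases:
        raise ValueError("at least one case is required")
    compliant = sum(all(getattr(c, p) for p in PAUSE_POINTS) for c in cases)
    return compliant / len(cases)


# Made-up toy records for illustration only (NOT study data).
toy_cases = [
    CaseRecord(hospital=1, briefing=True, timeout=True, debrief=True),
    CaseRecord(hospital=1, briefing=True, timeout=True, debrief=False),
    CaseRecord(hospital=2, briefing=True, timeout=False, debrief=False),
    CaseRecord(hospital=2, briefing=True, timeout=True, debrief=True),
]

for pause in PAUSE_POINTS:
    print(pause, initiation_rate(toy_cases, pause))
print("standard compliance", standard_compliance_rate(toy_cases))
```

In this toy set every briefing is initiated, yet only half the cases meet the standard definition, which shows how a single missed pause point lowers case-level compliance even when individual initiation rates look high.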

Interprofessional involvement (defined as two or more team members from different professional roles verbally contributing to checklist items) was associated with superior SSC quality. Interprofessional involvement was generally low (<35% of cases at most site-pause combinations) but higher at four site-pause combinations. At Hospital 1, 74% of briefings and 63% of debriefs saw interprofessional involvement. Interprofessional involvement during timeouts was similarly high at Hospitals 3 (47%) and 4 (60%), and both hospitals completed timeouts in 100% of observed cases. Additionally, timeouts at these two hospitals demonstrated interprofessional involvement during checklist initiation. At Hospital 3, the surgeon signalled team readiness to proceed, prompting anesthesia to initiate timeout. At Hospital 4, the circulator initiated the process by presenting the patient's chart to the surgeon. These four site-pause combinations demonstrated more complete item coverage and more consistent discussion of team and patient concerns than other site-pause combinations.
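The interprofessional involvement definition above can be expressed as a short sketch. It assumes each pause point is coded as a list of hypothetical (role, checklist item) pairs, one per verbal contribution. Because contributors are identified by role only, two people in the same role count as one role. The coding scheme and toy data are illustrative assumptions, not the study's instrument or results.

```python
def is_interprofessional(contributions):
    """True if verbal contributions came from two or more distinct
    professional roles. `contributions` is an iterable of
    (role, checklist_item) pairs for a single pause point."""
    roles = {role for role, _item in contributions}
    return len(roles) >= 2


def involvement_rate(pause_points):
    """Share of observed pause points with interprofessional involvement."""
    if not pause_points:
        raise ValueError("at least one pause point is required")
    flagged = sum(is_interprofessional(p) for p in pause_points)
    return flagged / len(pause_points)


# Made-up toy timeouts for illustration only (NOT study data).
toy_timeouts = [
    [("surgeon", "procedure name"), ("anesthesia", "allergies")],
    [("circulating nurse", "patient identification"),
     ("circulating nurse", "antibiotics")],
    [("surgeon", "patient identification")],
    [("surgeon", "procedure name"),
     ("circulating nurse", "equipment/implants"),
     ("anesthesia", "antibiotics")],
]

print(involvement_rate(toy_timeouts))  # 0.5: the first and last timeouts qualify
```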

Use of SSC cognitive aids varied by hospital and pause point (0-100%). Across all sites, surgical teams consistently used a cognitive aid during only one pause point per case. Use of cognitive aids was associated with higher item completion rates. Hospital 1’s aid was integrated into the patient chart and used in 59% of briefings. At Hospitals 3 and 4, the aid took the form of a large wall-mounted poster and was used primarily during debriefs (93%) and timeouts (81%), respectively. Hospital 5 featured a built-in OR TV monitor that displayed pertinent patient information and was used during 100% of timeouts.

At Hospitals 2, 4, and 5, debrief practices deviated from both WHO guidance and those observed at Hospitals 1 and 3. Instead of facilitating team discussion, debriefs were often reduced to an electronic documentation task, typically completed solely by the attending surgeon and a circulating nurse. These sites also had lower debrief initiation rates (30-47%) compared to the rates at Hospitals 1 and 3 (96-100%).


Discussion

Across five hospitals, differences in how teams conducted the SSC may help explain both the variation in completion and why some hospitals excel at specific pause points. Two conduct features stood out. First, interprofessional involvement and shared leadership were common in higher-performing pauses. Teams that distributed checklist responsibilities (e.g., assigning specific roles or separating pause initiation from facilitation) showed more consistent initiation and higher item completion rates. Second, the usability and accessibility of cognitive aids strongly influenced their uptake, which in turn was associated with higher item coverage. Notably, Hospital 3 had implemented a human factors-informed usability redesign of its SSC process in 2023 and showed the highest rate of standard compliance (94%). In contrast, treating the debrief as a documentation “checkbox”, rather than a team discussion, was associated with lower initiation rates and limited engagement.

These findings point to practical targets for improving SSC quality. Assigning item ownership by professional role can promote accountability and consistency. Standardizing who prompts and who leads each pause point, without requiring them to be the same person, supports shared leadership. Cognitive aids should be clearly visible and easily accessible at the point of use to facilitate adoption. Debriefs should be protected as in-OR team discussions, not reduced to post-operative documentation. Meaningful measurement should move beyond overall compliance rates to include phase-specific initiation, item-level verbalization, interprofessional participation, and cognitive aid use. We hope these insights encourage hospitals to critically evaluate how the SSC is actually used in their settings and whether that use aligns with its intended purpose.
Event Type
Oral Presentations
Time
Monday, March 23
2:30pm - 3:00pm EDT
Location
Murray Hill West
Tracks
Hospital Environments