Presentation
Human Factors of AI Integration in Breast Cancer Diagnosis: Trust, Perceptions, and Workflow Adoption
Session: Poster Session 1
Description
Introduction: Breast cancer is among the most common cancers in women worldwide, with cases continuing to rise. Early detection through imaging improves outcomes but remains difficult due to heavy workloads, subtle findings, and variability in interpretation. Artificial intelligence (AI) has been introduced into clinical decision support systems to help address these challenges by assisting with image detection and classification. While AI shows promise in reducing errors and supporting radiologists, adoption depends on more than technical performance.
A major barrier is whether clinicians trust AI recommendations and feel confident using them. Trust is shaped not only by accuracy but also by the ability to understand and interpret the reasoning behind outputs. Explainability therefore plays a key role: clear, relevant explanations can build confidence, while poor designs risk confusion or reduced trust.
Most existing studies rely on surveys or static evaluations, which overlook how trust develops during real-world use. To address this gap, our study combines experimental tasks, behavioral measures, and interviews to examine how different explanation strategies affect clinicians’ trust in AI for breast cancer diagnosis. The goal is to generate human-centered insights to guide the design of more usable and trustworthy AI systems.
Methods: This mixed-methods study examined how clinicians perceive and adopt AI-based decision support systems for breast cancer diagnosis. We combined quantitative and qualitative approaches to capture both measurable changes in trust and performance and the reasoning behind clinicians’ responses.
The experimental design followed an interrupted time series with one baseline and four intervention phases. In the baseline, clinicians diagnosed breast ultrasound cases independently. In the intervention phases, they received AI-generated suggestions, either without explanation or with added features such as confidence scores or highlighted image areas. After each case, participants rated their agreement and trust, and provided independent diagnoses when disagreeing with AI. Post-session surveys measured perceived accuracy, trust, clarity, and workload.
A total of 28 clinicians, including radiologists and oncologists, were recruited through professional networks. All completed the experimental tasks, and 11 participated in follow-up interviews, which explored their thought processes, expectations, and experiences with AI explanations.
Data included pre-experiment demographics, system logs of diagnostic decisions, and post-session survey responses. Quantitative data were analyzed using statistical methods, including mixed-effects models, to assess trends in trust, performance, and workload. Interview transcripts were thematically analyzed to identify recurring patterns. This integrated approach linked behavioral outcomes with personal reflections, providing a richer view of how explainability shapes trust and adoption.
Ethical approval for the study was obtained from the Stevens Institute of Technology institutional review board (IRB ID 2024–001 (N)), and all participants provided informed consent prior to participation.
Results: Survey responses showed that clinicians generally evaluated both AI and explainable AI (XAI) conditions positively. Among all phases, Intervention 1 received the highest ratings for understandability, richness of information, and usefulness in guiding trust, while also producing the lowest levels of mental demand and stress. Intervention 3 also performed well in terms of trust and clarity but did not reduce workload as effectively. These findings suggest that while explanations may improve perceptions of transparency, simpler AI interfaces can be equally or more effective by maintaining usability without adding cognitive burden.
Statistical models confirmed these trends. Compared to the baseline AI condition, trust significantly decreased when clinicians diagnosed cases without AI support, but adding explanations did not meaningfully change trust levels. Similarly, diagnostic accuracy improved with AI assistance but was not further enhanced by explanations. Trust in AI was strongly predicted by perceived accuracy and the richness of information, while workload factors such as stress and mental demand showed no effect. Importantly, trust itself emerged as a significant predictor of diagnostic performance, indicating that clinicians who trusted the AI more were also more accurate in their decisions.
The qualitative analysis revealed eight themes that shed light on clinicians’ experiences and expectations of AI in breast cancer diagnosis.
Theme 1: Understanding and Perception of AI. Clinicians often struggled to distinguish AI from earlier tools like computer-aided detection, expressing uncertainty about its added value. They emphasized the need for clearer terminology and communication about AI’s specific capabilities, limitations, and intended role in practice.
Theme 2: AI as a Decision Support, Not a Replacement. Participants consistently viewed AI as a supportive tool rather than an autonomous system. They stressed the importance of protocols and policies to guide safe and ethical use, highlighted its value in augmenting expertise by providing a second opinion, and warned against over-reliance that could undermine clinical judgment.
Theme 3: AI Impact on the Radiology Workforce. Concerns emerged about AI threatening job security and discouraging trainees from entering radiology. Some feared overdependence could erode diagnostic skills, while exaggerated claims of replacement were seen as fueling anxiety within the field.
Theme 4: AI Helpfulness. Clinicians believed AI could be especially useful for trainees and less experienced users by reinforcing learning and boosting confidence. For experienced physicians, AI’s value lay more in validating decisions or supporting complex cases rather than routine tasks.
Theme 5: AI in Clinical Practice. Participants envisioned applications ranging from early cancer detection and case prioritization to tracking disease progression and managing information overload. Many saw AI as a way to save time on routine cases and help clinicians stay current with new medical knowledge.
Theme 6: Usability and Accessibility. Integration was emphasized as critical; clinicians wanted AI embedded directly into systems like PACS without extra steps or logins. They valued user-friendly, intuitive designs that presented information clearly. At the same time, financial and resource barriers, including licensing and implementation costs, were highlighted as significant obstacles.
Theme 7: Impact on Workload. Clinicians described both potential efficiency gains and workload relief, particularly in high-volume settings, as well as risks of time penalties if systems added complexity. Poorly integrated AI could create delays rather than reduce burden.
Theme 8: Factors Influencing Trust in AI. Trust emerged as the central theme. Clinicians emphasized the importance of peer and institutional endorsements, accountability, and rigorous validation through clinical trials. Trust was described as evolving over time through repeated positive experiences. Many also noted that explanations could improve trust but only when clear, actionable, and adaptable to different expertise levels.
Conclusion: This study highlights that while clinicians see strong potential for AI to support breast cancer diagnosis, their trust and adoption depend on more than technical performance. Across experiments and interviews, participants valued AI most when it provided accurate and relevant outputs, integrated smoothly into existing workflows, and reinforced rather than replaced their clinical judgment. Explanations alone did not guarantee higher trust or improved performance; in fact, simpler, streamlined interfaces were often rated more useful and less demanding than explanation-heavy ones. The strongest predictors of trust were perceived accuracy and clarity of information, which in turn enhanced diagnostic performance, creating a positive cycle between trust and effectiveness. Concerns about job security, skill erosion, and overreliance remain important barriers, underscoring the need for clear protocols, thoughtful integration, and transparent communication about AI’s role. The key takeaway for human factors researchers and system designers is that trust in clinical AI is dynamic and context-driven. Effective adoption requires a user-centered approach that prioritizes clarity, usability, and workflow compatibility over sheer complexity. Designing AI as a supportive partner, flexible, accurate, and transparent, offers the best path toward building clinician confidence and promoting responsible integration into healthcare practice.
Event Type
Poster Presentation
Time: Monday, March 23, 4:45pm - 6:15pm EDT
Location: Rhinelander Gallery
Digital Health
