Physician Reasoning on Diagnostic Cases With Large Language Models
Diagnostic Reasoning With Large Language Model Chat Bots
1 other identifier
interventional
50
1 country
1
Brief Summary
This study will evaluate the effect of providing access to GPT-4, a large language model, compared to traditional diagnostic decision support tools on performance on case-based diagnostic reasoning tasks.
Trial Health
Trial Health Score
Automated assessment based on enrollment pace, timeline, and geographic reach
50 participants targeted
Target at P25-P50 for not_applicable
Started Nov 2023
Shorter than P25 for not_applicable
1 active site
Health score is calculated from publicly available data and should be used for screening purposes only.
Study Timeline
Key milestones and dates
First Submitted
Initial submission to the registry
November 27, 2023
Completed
Study Start
First participant enrolled
November 29, 2023
Completed
First Posted
Study publicly available on registry
December 6, 2023
Completed
Primary Completion
Last participant's last visit for primary outcome
December 30, 2023
Completed
Study Completion
Last participant's last visit for all outcomes
December 30, 2023
Completed
February 20, 2024
February 1, 2024
1 month
November 27, 2023
February 15, 2024
Outcome Measures
Primary Outcomes (1)
Diagnostic reasoning
The primary outcome will be the percent correct (range: 0 to 100) for each case. For each case, participants will be asked for their top three diagnoses, along with findings from the case that support and oppose each diagnosis. Participants will receive 1 point for each plausible diagnosis. Findings supporting the diagnosis and findings opposing the diagnosis will also be graded on correctness, with 1 point for a partially correct response and 2 points for a completely correct response. Participants will then be asked to name their top diagnosis, earning 1 point for a reasonable response and 2 points for the most correct response. Finally, participants will be asked to name up to three next steps to further evaluate the patient, with 1 point awarded for a partially correct response and 2 points for a completely correct response. The primary outcome will be compared at the case level between the randomized groups.
During evaluation
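The rubric above can be illustrated with a minimal sketch of how a single case might be converted to a percent score. This is not the study's grading code. It assumes that supporting and opposing findings are graded once per listed diagnosis, that each of the up to three next steps is graded separately, and that the denominator is the maximum achievable under those assumptions (23 points); the registry text does not state any of these explicitly. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Assumed limits, taken from "three top diagnoses" and "up to 3 next steps".
MAX_DIAGNOSES = 3
MAX_NEXT_STEPS = 3


@dataclass
class DifferentialItem:
    plausible: bool   # 1 point if the diagnosis is plausible
    supporting: int   # 0, 1 (partially correct), or 2 (completely correct)
    opposing: int     # 0, 1 (partially correct), or 2 (completely correct)


@dataclass
class CaseResponse:
    differential: List[DifferentialItem]  # up to three diagnoses
    top_diagnosis: int                    # 0, 1 (reasonable), or 2 (most correct)
    next_steps: List[int]                 # up to three steps, each graded 0, 1, or 2


def _check_grade(value: int) -> int:
    """Reject grades outside the 0/1/2 scale used by the rubric."""
    if value not in (0, 1, 2):
        raise ValueError(f"grade must be 0, 1, or 2, got {value!r}")
    return value


def max_points() -> int:
    """Maximum achievable points under the assumptions stated above."""
    per_diagnosis = 1 + 2 + 2  # plausibility + supporting + opposing findings
    return MAX_DIAGNOSES * per_diagnosis + 2 + MAX_NEXT_STEPS * 2


def percent_correct(response: CaseResponse) -> float:
    """Return the case score as a percentage in the range 0 to 100."""
    if len(response.differential) > MAX_DIAGNOSES:
        raise ValueError("at most three diagnoses are graded")
    if len(response.next_steps) > MAX_NEXT_STEPS:
        raise ValueError("at most three next steps are graded")

    points = 0
    for item in response.differential:
        points += 1 if item.plausible else 0
        points += _check_grade(item.supporting)
        points += _check_grade(item.opposing)
    points += _check_grade(response.top_diagnosis)
    points += sum(_check_grade(grade) for grade in response.next_steps)
    return 100.0 * points / max_points()
```

Under these assumptions, a response with three plausible diagnoses, fully correct findings, the most correct top diagnosis, and three fully correct next steps scores 100, and an empty response scores 0. If the actual rubric weights the components differently, only `max_points` and the per-item sums would need to change.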
Secondary Outcomes (1)
Time Spent on Diagnosis
During evaluation
Study Arms (2)
GPT-4
ACTIVE COMPARATOR
Group will be given access to GPT-4.
Usual resources
NO INTERVENTION
Group will not be given access to GPT-4 but will be encouraged to use any resources they wish other than large language models (UpToDate, DynaMed, Google, etc.).
Eligibility Criteria
You may qualify if:
- Participants must be licensed physicians and have completed at least post-graduate year 2 (PGY2) of medical training.
- Training in internal medicine, family medicine, or emergency medicine.
You may not qualify if:
- Not currently practicing clinically.
Contact the study team to confirm eligibility.
Sponsors & Collaborators
- Stanford University (lead)
- Beth Israel Deaconess Medical Center (collaborator)
- University of Minnesota (collaborator)
Study Sites (1)
Stanford University
Palo Alto, California, 94304, United States
Related Publications (1)
Goh E, Gallo R, Hom J, Strong E, Weng Y, Kerman H, Cool JA, Kanjee Z, Parsons AS, Ahuja N, Horvitz E, Yang D, Milstein A, Olson APJ, Rodman A, Chen JH. Large Language Model Influence on Diagnostic Reasoning: A Randomized Clinical Trial. JAMA Netw Open. 2024 Oct 1;7(10):e2440969. doi: 10.1001/jamanetworkopen.2024.40969.
PMID: 39466245 (DERIVED)
Study Officials
- PRINCIPAL INVESTIGATOR
Jonathan H Chen, MD, PhD
Stanford University
- PRINCIPAL INVESTIGATOR
Adam Rodman, MD
Beth Israel Deaconess Medical Center
- PRINCIPAL INVESTIGATOR
Andrew Olson, MD
University of Minnesota
Study Design
- Study Type: interventional
- Phase: not applicable
- Allocation: RANDOMIZED
- Masking: SINGLE
- Who Masked: OUTCOMES ASSESSOR
- Masking Details: The grading of responses will be performed by assessors blinded to participant identity and treatment assignment.
- Purpose: DIAGNOSTIC
- Intervention Model: PARALLEL
- Sponsor Type: OTHER
- Responsible Party: PRINCIPAL INVESTIGATOR
- PI Title: Assistant Professor of Medicine
Study Record Dates
First Submitted
November 27, 2023
First Posted
December 6, 2023
Study Start
November 29, 2023
Primary Completion
December 30, 2023
Study Completion
December 30, 2023
Last Updated
February 20, 2024
Record last verified: 2024-02
Data Sharing
- IPD Sharing: Will not share