Brief Summary

This study will assess how immediate access to a customized version of GPT-4, a large language model, affects performance on case-based diagnostic reasoning tasks. It will compare immediate access with a two-step process in which participants first use traditional diagnostic decision support tools and only then gain access to the customized GPT-4 model.

Trial Health

On Track

Trial Health Score

Automated assessment based on enrollment pace, timeline, and geographic reach

Enrollment

participants targeted

Target between the 50th and 75th percentiles for trials with phase "not applicable"

Timeline

Completed

Started Dec 2024

Shorter than the 25th percentile for trials with phase "not applicable"

Geographic Reach

1 country

1 active site

Status

completed

Health score is calculated from publicly available data and should be used for screening purposes only.

Study Timeline

Key milestones and dates

Study duration: 1 month

Study Start (first participant enrolled): December 16, 2024 (completed; 1 month until next milestone)
Primary Completion (last participant's last visit for primary outcome): January 24, 2025 (completed; same day until next milestone)
Study Completion (last participant's last visit for all outcomes): January 24, 2025 (completed; 18 days until next milestone)
First Submitted (initial submission to the registry): February 11, 2025 (completed; 2 months until next milestone)
First Posted (study publicly available on registry): April 4, 2025 (completed)

Last Updated: April 4, 2025
Status Verified: March 1, 2025
Enrollment Period: 1 month
First QC Date: February 11, 2025
Last Update Submit: March 27, 2025

Conditions

Pathologic Processes, Disease

Keywords

Computer-assisted diagnosis, Large language models, Clinical reasoning

Outcome Measures

Primary Outcomes (1)

Diagnostic reasoning
The primary outcome will be the percentage of correct responses per case (range: 0 to 100). Each case is scored in four parts:
Differential diagnoses: participants provide their top three differential diagnoses and receive 1 point for each plausible diagnosis.
Supporting and opposing findings: participants list supporting and opposing findings for each diagnosis; these are graded on correctness, with 1 point for a partially correct response and 2 points for a completely correct response.
Top diagnosis: participants select their top diagnosis, earning 1 point for a reasonable choice and 2 points for the most accurate diagnosis.
Next steps: participants list up to three next steps for further patient evaluation, with 1 point for a partially correct response and 2 points for a completely correct response.
The primary outcome will be analyzed at the case level, comparing performance between the randomized study groups.
Through study completion, an average of 6 months
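
The scoring rubric above can be sketched as a small calculation. This is only an illustration, not the study's grading procedure: the record does not state the maximum number of points per case, so the sketch assumes that supporting and opposing findings are each graded once per diagnosis (0 to 2 points) and that each of up to three next steps is graded separately (0 to 2 points), which gives a maximum of 23 points. All names (`Differential`, `CaseResponse`, `case_score_percent`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

MAX_DIFFERENTIALS = 3  # "top three differential diagnoses"
MAX_NEXT_STEPS = 3     # "up to three next steps"


@dataclass
class Differential:
    plausible: bool  # 1 point if the diagnosis is plausible
    supporting: int  # 0 = incorrect, 1 = partially correct, 2 = completely correct
    opposing: int    # same 0-2 scale (assumed to be graded separately)


@dataclass
class CaseResponse:
    differentials: List[Differential] = field(default_factory=list)
    top_diagnosis: int = 0  # 0, 1 (reasonable), or 2 (most accurate)
    next_steps: List[int] = field(default_factory=list)  # each graded 0-2


def _check_grade(value: int) -> int:
    """Reject grades outside the 0/1/2 scale described in the rubric."""
    if value not in (0, 1, 2):
        raise ValueError(f"grade must be 0, 1, or 2, got {value!r}")
    return value


def max_points() -> int:
    """Maximum points per case under the assumptions stated above."""
    per_differential = 1 + 2 + 2  # plausibility + supporting + opposing
    return MAX_DIFFERENTIALS * per_differential + 2 + MAX_NEXT_STEPS * 2


def case_score_percent(resp: CaseResponse) -> float:
    """Return the case score as a percentage of the assumed maximum."""
    if len(resp.differentials) > MAX_DIFFERENTIALS:
        raise ValueError("at most three differential diagnoses are scored")
    if len(resp.next_steps) > MAX_NEXT_STEPS:
        raise ValueError("at most three next steps are scored")

    points = 0
    for d in resp.differentials:
        points += 1 if d.plausible else 0
        points += _check_grade(d.supporting) + _check_grade(d.opposing)
    points += _check_grade(resp.top_diagnosis)
    points += sum(_check_grade(s) for s in resp.next_steps)
    return 100.0 * points / max_points()


example = CaseResponse(
    differentials=[
        Differential(plausible=True, supporting=2, opposing=1),
        Differential(plausible=True, supporting=2, opposing=1),
        Differential(plausible=False, supporting=0, opposing=0),
    ],
    top_diagnosis=1,
    next_steps=[2, 1],
)
print(round(case_score_percent(example), 1))  # 12 of 23 points -> 52.2
```

If the actual rubric weights the components differently, only `max_points` and the per-component sums would need to change; the percentage normalization stays the same.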

Secondary Outcomes (5)

Time Spent Per Case
Through study completion, an average of 6 months
Prompt frequency
Through study completion, an average of 6 months
Sentiment
Through study completion, an average of 6 months
Participant Perceptions of AI in Clinical Reasoning
Through study completion, an average of 6 months
Customized GPT-4's diagnostic reasoning
Through study completion, an average of 6 months

Study Arms (2)

Immediate access to customized version of GPT-4

ACTIVE COMPARATOR

Group will be encouraged to immediately use a customized version of GPT-4.

Other: Immediate access to customized version of GPT-4

Conventional resources first, then granted access to customized version of GPT-4.

ACTIVE COMPARATOR

Group will be encouraged to first use any resources they wish besides large language models (UpToDate, PubMed, Google, etc.) and then will be granted access to a customized version of GPT-4.

Other: Access to customized version of GPT-4 following use of conventional resources

Interventions

Immediate access to customized version of GPT-4 (Other)

Group is given immediate access to a customized version of GPT-4 to support their diagnostic reasoning for each case.

Immediate access to customized version of GPT-4

Access to customized version of GPT-4 following use of conventional resources (Other)

Group is first encouraged to reason through diagnostic cases with the support of conventional resources. After they submit their answers for a case, they are given access to a customized version of GPT-4 and may change their initial answers.

Conventional resources first, then granted access to customized version of GPT-4.

Eligibility Criteria

Sex: All

Healthy Volunteers: Yes

Age Groups: Child (0-17), Adult (18-64), Older Adult (65+)

You may qualify if:

Participants must be licensed physicians and have completed at least post-graduate year 1 (PGY1) of medical training.
Training in internal medicine, family medicine, or emergency medicine.

You may not qualify if:

Not currently practicing clinically.
Participated in one of our previous studies that used the same six diagnostic cases.

Contact the study team to confirm eligibility.

Sponsors & Collaborators

Stanford University (lead)
Beth Israel Deaconess Medical Center (collaborator)

Study Sites (1)

Stanford University

Palo Alto, California, 94305, United States

MeSH Terms

Conditions

Pathologic Processes, Disease

Condition Hierarchy (Ancestors)

Pathological Conditions, Signs and Symptoms

Study Officials

Jonathan H Chen, MD, PhD
Stanford University
PRINCIPAL INVESTIGATOR

Study Design

Study Type: interventional
Phase: not applicable
Allocation: RANDOMIZED
Masking: SINGLE
Who Masked: OUTCOMES ASSESSOR
Masking Details: The grading of responses will be performed by assessors blinded to participant identity and treatment assignment.
Purpose: DIAGNOSTIC
Intervention Model: PARALLEL
Sponsor Type: OTHER
Responsible Party: PRINCIPAL INVESTIGATOR
PI Title: Assistant Professor of Medicine (Biomedical Informatics) and of Biomedical Data Science

Study Record Dates

First Submitted: February 11, 2025
First Posted: April 4, 2025
Study Start: December 16, 2024
Primary Completion: January 24, 2025
Study Completion: January 24, 2025
Last Updated: April 4, 2025

Record last verified: 2025-03

Data Sharing

IPD Sharing: Will not share

