NCT06911645

Brief Summary

This study will assess the impact of immediate access to a customized version of GPT-4, a large language model, on performance in case-based diagnostic reasoning tasks. Specifically, it will compare this approach to a two-step process where participants first use traditional diagnostic decision support tools to support their diagnostic reasoning before gaining access to the customized GPT-4 model.

Trial Health

87
On Track

Trial Health Score

Automated assessment based on enrollment pace, timeline, and geographic reach

Enrollment
70

participants targeted

Target at P50-P75 for trials with phase "not applicable"

Timeline
Completed

Started Dec 2024

Shorter than P25 for trials with phase "not applicable"

Geographic Reach
1 country

1 active site

Status
completed

Health score is calculated from publicly available data and should be used for screening purposes only.

Study Timeline

Key milestones and dates

Study Start

First participant enrolled

December 16, 2024

Completed
1 month until next milestone

Primary Completion

Last participant's last visit for primary outcome

January 24, 2025

Completed
Same day until next milestone

Study Completion

Last participant's last visit for all outcomes

January 24, 2025

Completed
18 days until next milestone

First Submitted

Initial submission to the registry

February 11, 2025

Completed
2 months until next milestone

First Posted

Study publicly available on registry

April 4, 2025

Completed

Last Updated

April 4, 2025

Status Verified

March 1, 2025

Enrollment Period

1 month

First QC Date

February 11, 2025

Last Update Submit

March 27, 2025

Conditions

Keywords

Computer-assisted diagnosis, Large language models, Clinical reasoning

Outcome Measures

Primary Outcomes (1)

  • Diagnostic reasoning

    The primary outcome will be the percentage of correct responses per case (range: 0 to 100). For each case, participants will be asked to provide their top three differential diagnoses, along with supporting and opposing findings for each. They will receive 1 point for each plausible diagnosis. Supporting and opposing findings will be graded based on correctness, with 1 point for a partially correct response and 2 points for a completely correct response. Participants will then select their top diagnosis, earning 1 point for a reasonable choice and 2 points for the most accurate diagnosis. Finally, they will list up to three next steps for further patient evaluation, with 1 point awarded for a partially correct response and 2 points for a completely correct response. The primary outcome will be analyzed at the case level, comparing performance between the randomized study groups.

    Through study completion, an average of 6 months
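    The scoring described above can be sketched in code. This is an illustrative sketch only, not the study team's grading procedure. It assumes that supporting and opposing findings are each graded once per differential diagnosis, that the next steps are graded as a single item, that an incorrect response earns 0 points, and that the percentage is points earned divided by the maximum available (19 under these assumptions). The registry entry does not state any of these details, and all names below are hypothetical.

```python
from dataclasses import dataclass
from typing import List

MAX_DIFFERENTIALS = 3
# Assumed maximum: 3 diagnoses x (1 plausible + 2 supporting + 2 opposing)
# + 2 for the top diagnosis + 2 for next steps = 19 points.
MAX_POINTS = MAX_DIFFERENTIALS * (1 + 2 + 2) + 2 + 2


@dataclass
class DifferentialItem:
    plausible: bool   # 1 point if the diagnosis is plausible
    supporting: int   # 0 (assumed), 1 partially correct, 2 completely correct
    opposing: int     # 0 (assumed), 1 partially correct, 2 completely correct


@dataclass
class CaseResponse:
    differentials: List[DifferentialItem]  # up to three entries
    top_diagnosis: int  # 0 (assumed), 1 reasonable, 2 most accurate
    next_steps: int     # 0 (assumed), 1 partial, 2 complete; assumed one item


def _check_grade(name: str, value: int) -> None:
    """Reject grades outside the 0-2 range used by the rubric."""
    if value not in (0, 1, 2):
        raise ValueError(f"{name} must be 0, 1, or 2, got {value}")


def case_score_percent(resp: CaseResponse) -> float:
    """Return the case score as a percentage (0 to 100) of assumed max points."""
    if len(resp.differentials) > MAX_DIFFERENTIALS:
        raise ValueError("at most three differential diagnoses are scored")
    points = 0
    for item in resp.differentials:
        _check_grade("supporting", item.supporting)
        _check_grade("opposing", item.opposing)
        points += (1 if item.plausible else 0) + item.supporting + item.opposing
    _check_grade("top_diagnosis", resp.top_diagnosis)
    _check_grade("next_steps", resp.next_steps)
    points += resp.top_diagnosis + resp.next_steps
    return 100.0 * points / MAX_POINTS


# Hypothetical graded response: 3 + (2+1+0) + (1+1+0) + 1 + 2 = 11 points.
example = CaseResponse(
    differentials=[
        DifferentialItem(plausible=True, supporting=2, opposing=1),
        DifferentialItem(plausible=True, supporting=1, opposing=1),
        DifferentialItem(plausible=True, supporting=0, opposing=0),
    ],
    top_diagnosis=1,
    next_steps=2,
)
print(round(case_score_percent(example), 1))
```

    If the actual rubric weights the next steps per step rather than as one item, only MAX_POINTS and the next_steps field would need to change.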

Secondary Outcomes (5)

  • Time Spent Per Case

    Through study completion, an average of 6 months

  • Prompt frequency

    Through study completion, an average of 6 months

  • Sentiment

    Through study completion, an average of 6 months

  • Participant Perceptions of AI in Clinical Reasoning

    Through study completion, an average of 6 months

  • Customized GPT-4's diagnostic reasoning

    Through study completion, an average of 6 months

Study Arms (2)

Immediate access to customized version of GPT-4

ACTIVE COMPARATOR

Group will be encouraged to immediately use a customized version of GPT-4.

Other: Immediate access to customized version of GPT-4

Conventional resources first, then granted access to customized version of GPT-4.

ACTIVE COMPARATOR

Group will be encouraged to first use any resources they wish other than large language models (UpToDate, PubMed, Google, etc.) and will then be granted access to a customized version of GPT-4.

Other: Access to customized version of GPT-4 following use of conventional resources

Interventions

Group is given immediate access to a customized version of GPT-4 to support their diagnostic reasoning for each case.

Immediate access to customized version of GPT-4

Group is first encouraged to reason through diagnostic cases with the support of conventional resources. After submitting their answers for a case, participants are given access to a customized version of GPT-4 and may revise their initial answers.

Conventional resources first, then granted access to customized version of GPT-4.

Eligibility Criteria

Sex: All
Healthy Volunteers: Yes
Age Groups: Child (0-17), Adult (18-64), Older Adult (65+)

You may qualify if:

  • Participants must be licensed physicians and have completed at least post-graduate year 1 (PGY1) of medical training.
  • Training in internal medicine, family medicine, or emergency medicine.

You may not qualify if:

  • Not currently practicing clinically.
  • Participated in one of our previous studies that used the same six diagnostic cases.

Contact the study team to confirm eligibility.

Sponsors & Collaborators

Study Sites (1)

Stanford University

Palo Alto, California, 94305, United States

MeSH Terms

Conditions

Pathologic Processes, Disease

Condition Hierarchy (Ancestors)

Pathological Conditions, Signs and Symptoms

Study Officials

  • Jonathan H Chen, MD, PhD

    Stanford University

    PRINCIPAL INVESTIGATOR

Study Design

Study Type
interventional
Phase
not applicable
Allocation
RANDOMIZED
Masking
SINGLE
Who Masked
OUTCOMES ASSESSOR
Masking Details
The grading of responses will be performed by assessors blinded to participant identity and treatment assignment.
Purpose
DIAGNOSTIC
Intervention Model
PARALLEL
Model Details: The trial will be designed as a randomized, two-arm, single-blind parallel group study.
Sponsor Type
OTHER
Responsible Party
PRINCIPAL INVESTIGATOR
PI Title
Assistant Professor of Medicine (Biomedical Informatics) and of Biomedical Data Science

Study Record Dates

First Submitted

February 11, 2025

First Posted

April 4, 2025

Study Start

December 16, 2024

Primary Completion

January 24, 2025

Study Completion

January 24, 2025

Last Updated

April 4, 2025

Record last verified: 2025-03

Data Sharing

IPD Sharing
Will not share
