NCT06963957

Brief Summary

This study aims to systematically measure the extent and patterns of automation bias among physicians when utilizing ChatGPT-4o in clinical decision-making.

Trial Health

87
On Track

Trial Health Score

Automated assessment based on enrollment pace, timeline, and geographic reach

Enrollment
44

participants targeted

Target between P25 and P50 for trials with phase "not applicable"

Timeline
Completed

Started Jun 2025

Shorter than P25 for trials with phase "not applicable"

Geographic Reach
1 country

1 active site

Status
completed

Health score is calculated from publicly available data and should be used for screening purposes only.

Study Timeline

Key milestones and dates

First Submitted

Initial submission to the registry

April 23, 2025

Completed
16 days until next milestone

First Posted

Study publicly available on registry

May 9, 2025

Completed
1 month until next milestone

Study Start

First participant enrolled

June 20, 2025

Completed
2 months until next milestone

Primary Completion

Last participant's last visit for primary outcome

August 15, 2025

Completed
Same day until next milestone

Study Completion

Last participant's last visit for all outcomes

August 15, 2025

Completed
Last Updated

August 22, 2025

Status Verified

August 1, 2025

Enrollment Period

2 months

First QC Date

April 23, 2025

Last Update Submit

August 21, 2025

Conditions

Keywords

clinical reasoning, large language models, automation bias, computer-assisted diagnosis

Outcome Measures

Primary Outcomes (1)

  • Diagnostic reasoning

    The primary outcome will be the percent correct for each case, ranging from 0 to 100%, where higher scores indicate better diagnostic performance. For each case, participants will be asked for their three leading diagnoses, the findings that support each diagnosis, and the findings that oppose each diagnosis. Participants will receive 1 point for each plausible diagnosis. Supporting and opposing findings will also be graded for correctness, with 1 point for each correct response. Participants will then be asked to name the single diagnosis they believe is most likely, earning 9 points for a reasonable response and 18 points for the most accurate response. Finally, participants will be asked to name up to 3 next steps to further evaluate the patient, with 0.5 points awarded for a partially correct response and 1 point for a completely correct response. The primary outcome will be compared at the case level between the randomized groups.

    Assessed at a single time point for each case, during the scheduled diagnostic reasoning evaluation session, which takes place between 0 and 4 days after participant enrollment.
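
The scoring rubric described above can be sketched in code. This is a minimal illustration, not the study team's grading tool: the class and function names are hypothetical, the maximum points available per case is not stated in the record and is therefore passed in as a parameter, and the sketch assumes that supporting and opposing findings are counted for every listed diagnosis.

```python
from dataclasses import dataclass, field
from typing import List

# Point values taken from the outcome description; the "incorrect" keys are assumed.
TOP_DIAGNOSIS_POINTS = {"incorrect": 0.0, "reasonable": 9.0, "most_accurate": 18.0}
NEXT_STEP_POINTS = {"incorrect": 0.0, "partial": 0.5, "complete": 1.0}


@dataclass
class DiagnosisEntry:
    """One of up to three leading diagnoses, as graded by an assessor."""
    plausible: bool
    correct_supporting: int = 0  # number of supporting findings graded correct
    correct_opposing: int = 0    # number of opposing findings graded correct


@dataclass
class CaseResponse:
    """A participant's graded answers for a single vignette (hypothetical structure)."""
    diagnoses: List[DiagnosisEntry]
    top_diagnosis: str                                    # key of TOP_DIAGNOSIS_POINTS
    next_steps: List[str] = field(default_factory=list)   # keys of NEXT_STEP_POINTS


def raw_points(response: CaseResponse) -> float:
    """Sum the rubric points for one case."""
    if len(response.diagnoses) > 3 or len(response.next_steps) > 3:
        raise ValueError("at most three diagnoses and three next steps are graded")
    points = 0.0
    for entry in response.diagnoses:
        points += 1.0 if entry.plausible else 0.0
        points += entry.correct_supporting + entry.correct_opposing
    points += TOP_DIAGNOSIS_POINTS[response.top_diagnosis]
    points += sum(NEXT_STEP_POINTS[step] for step in response.next_steps)
    return points


def percent_correct(response: CaseResponse, max_points: float) -> float:
    """Convert raw points to the 0-100% primary outcome; max_points is an assumption."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    return 100.0 * raw_points(response) / max_points


example = CaseResponse(
    diagnoses=[
        DiagnosisEntry(plausible=True, correct_supporting=1, correct_opposing=1),
        DiagnosisEntry(plausible=True, correct_supporting=1),
        DiagnosisEntry(plausible=False),
    ],
    top_diagnosis="reasonable",
    next_steps=["complete", "partial"],
)
print(raw_points(example))
```

Keeping the per-case maximum as an explicit argument avoids guessing a denominator that the registry entry does not provide.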

Secondary Outcomes (1)

  • Top choice diagnosis accuracy score

    Assessed at a single time point for each case, during the scheduled diagnostic reasoning evaluation session, which takes place between 0 and 4 days after participant enrollment.

Study Arms (2)

ChatGPT-4o Recommendations with Hallucinations

ACTIVE COMPARATOR

Participants will evaluate six clinical vignettes. During the trial, they will have access to clinical recommendations from a specific, commercially available LLM (ChatGPT-4o) in addition to conventional diagnostic resources. LLM recommendations for three vignettes will contain deliberately flawed diagnostic information, and recommendations for the other three vignettes will be accurate. The cases will be presented in random order.

Other: ChatGPT-4o Recommendations with Hallucinations

ChatGPT-4o Recommendations without Hallucinations

NO INTERVENTION

Participants will evaluate the same six clinical vignettes as in the intervention arm. During the trial, they will have access to clinical recommendations from a specific, commercially available LLM (ChatGPT-4o) in addition to conventional diagnostic resources. However, the LLM-generated recommendations will not contain any deliberately introduced errors. The cases will be presented in random order.
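
The two-arm design described above could be set up roughly as follows. This is only a sketch under stated assumptions: the record does not give the allocation ratio, the randomization method, or which three vignettes carry flawed recommendations, so the equal split, the seeded shuffle, the case labels, and the `FLAWED_CASES` set are all hypothetical.

```python
import random

ARMS = ("with_hallucinations", "without_hallucinations")
VIGNETTES = ("case_1", "case_2", "case_3", "case_4", "case_5", "case_6")
# Hypothetical choice of the three vignettes with deliberately flawed output.
FLAWED_CASES = frozenset({"case_1", "case_3", "case_5"})


def allocate_arms(participant_ids, seed):
    """Shuffle participants and split them evenly (assumed 1:1 allocation)."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {pid: (ARMS[0] if i < half else ARMS[1]) for i, pid in enumerate(ids)}


def build_session(arm, flawed_cases, rng):
    """Return the six vignettes in random order, each paired with a flag that is
    True when the participant sees a deliberately flawed recommendation."""
    order = list(VIGNETTES)
    rng.shuffle(order)
    return [(case, arm == ARMS[0] and case in flawed_cases) for case in order]


allocation = allocate_arms(range(44), seed=2025)
session = build_session(allocation[0], FLAWED_CASES, random.Random(7))
for case, flawed in session:
    print(case, "flawed" if flawed else "accurate")
```

Both arms see the same six vignettes; only the content of the LLM recommendations differs, which is what the boolean flag models here.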

Interventions

ChatGPT-4o's differential diagnoses of six clinical vignettes, three of which will contain deliberately introduced inaccurate information.

ChatGPT-4o Recommendations with Hallucinations

Eligibility Criteria

Sex: all
Healthy Volunteers: Yes
Age Groups: Child (0-17), Adult (18-64), Older Adult (65+)

You may qualify if:

  • Completed the Bachelor of Medicine, Bachelor of Surgery (MBBS) exam. The equivalent of the MBBS degree in the US and Canada is the Doctor of Medicine (MD).
  • Full or Provisionally Registered Medical Practitioners with the Pakistan Medical and Dental Council (PMDC).
  • Participants must have completed a structured training program on the use of ChatGPT (or a comparable large language model), totaling at least 10 hours of instruction. The program must include hands-on practice with LLMs, specifically prompt engineering and content evaluation.

You may not qualify if:

  • Any other Registered Medical Practitioners (Full or Provisional) with PMDC (e.g., Professionals with Bachelor of Dental Surgery or BDS).
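
The inclusion and exclusion criteria above can be expressed as a simple screening check. The sketch below is illustrative only: the `Applicant` fields and the `is_eligible` function are hypothetical names, and the exclusion of other PMDC-registered practitioners (for example, BDS holders) is modeled simply as not holding an MBBS.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    """Screening data for one prospective participant (hypothetical structure)."""
    has_mbbs: bool
    pmdc_registration: str            # "full", "provisional", or "none"
    llm_training_hours: float
    practiced_prompt_engineering: bool
    practiced_content_evaluation: bool


def is_eligible(applicant: Applicant) -> bool:
    """Apply the three inclusion criteria; non-MBBS practitioners are excluded."""
    registered = applicant.pmdc_registration in ("full", "provisional")
    trained = (
        applicant.llm_training_hours >= 10
        and applicant.practiced_prompt_engineering
        and applicant.practiced_content_evaluation
    )
    return applicant.has_mbbs and registered and trained


physician = Applicant(True, "provisional", 12, True, True)
dentist = Applicant(False, "full", 12, True, True)
print(is_eligible(physician), is_eligible(dentist))
```

A check like this only pre-screens; the record still directs applicants to the study team for confirmation.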

Contact the study team to confirm eligibility.

Sponsors & Collaborators

Study Sites (1)

Lahore University of Management Sciences

Lahore, Punjab Province, 54000, Pakistan

MeSH Terms

Conditions

Disease

Condition Hierarchy (Ancestors)

Pathologic Processes; Pathological Conditions, Signs and Symptoms

Study Officials

  • Ihsan Ayyub Qazi, PhD

    Lahore University of Management Sciences (LUMS)

    PRINCIPAL INVESTIGATOR
  • Ayesha Ali, PhD

    Lahore University of Management Sciences (LUMS)

    PRINCIPAL INVESTIGATOR
  • Muhammad Asadullah Khawaja, MBBS

    King Edward Medical University

    PRINCIPAL INVESTIGATOR
  • Ali Zafar Sheikh, MBBS

    Lahore General Hospital

    PRINCIPAL INVESTIGATOR
  • Muhammad Junaid Akhtar, MBBS

    Children's Hospital, Lahore

    PRINCIPAL INVESTIGATOR

Study Design

Study Type
interventional
Phase
not applicable
Allocation
RANDOMIZED
Masking
SINGLE
Who Masked
OUTCOMES ASSESSOR
Masking Details
Single (Outcomes Assessor)
Purpose
DIAGNOSTIC
Intervention Model
PARALLEL
Model Details: The trial will be designed as a randomized, two-arm, single-blind parallel group study.
Sponsor Type
OTHER
Responsible Party
PRINCIPAL INVESTIGATOR
PI Title
Full Professor, PhD

Study Record Dates

First Submitted

April 23, 2025

First Posted

May 9, 2025

Study Start

June 20, 2025

Primary Completion

August 15, 2025

Study Completion

August 15, 2025

Last Updated

August 22, 2025

Record last verified: 2025-08

Data Sharing

IPD Sharing
Will not share

Locations