NCT07179861

Brief Summary

This study evaluates how well anonymized artificial-intelligence (AI) tools perform on standardized pediatric case vignettes and whether showing AI suggestions can improve clinicians' answers. About 30 board-certified/eligible pediatric specialists at a single hospital complete a one-time session.

Participants are randomized to two groups. Group A (n≈15): physicians answer each vignette once. Group B (n≈15): physicians answer and rate confidence (1-10), then review anonymized suggestions from five different AI tools (tool names not shown) and may keep or change their answer; changes and confidence are recorded.

Primary focus: measure AI performance (diagnostic accuracy, medication-dosing accuracy, interpretation accuracy) overall and by difficulty tier, and record AI response time. Secondary focus: quantify how AI suggestions affect human performance (change in accuracy, direction of change, confidence shift, and time).

No patients or biospecimens are involved; risks are minimal (time and possible discomfort with performance review). Findings may inform safe, evidence-based ways to use AI alongside clinicians in pediatrics.

Trial Health

87
On Track

Trial Health Score

Automated assessment based on enrollment pace, timeline, and geographic reach

Enrollment
30

participants targeted

Target is below the 25th percentile (P25) for all trials

Timeline
Completed

Started Aug 2025

Shorter than P25 for all trials

Geographic Reach
1 country

1 active site

Status
Completed

Health score is calculated from publicly available data and should be used for screening purposes only.

Study Timeline

Key milestones and dates

Study Start

First participant enrolled

August 27, 2025

Completed
14 days until next milestone

Primary Completion

Last participant's last visit for primary outcome

September 10, 2025

Completed
1 day until next milestone

First Submitted

Initial submission to the registry

September 11, 2025

Completed
Next milestone on the same day

Study Completion

Last participant's last visit for all outcomes

September 11, 2025

Completed
7 days until next milestone

First Posted

Study publicly available on registry

September 18, 2025

Completed

Last Updated

September 23, 2025

Status Verified

September 1, 2025

Enrollment Period

14 days

First QC Date

September 11, 2025

Last Update Submit

September 17, 2025

Outcome Measures

Primary Outcomes (3)

  • AI Interpretation Accuracy (%)

    Proportion of correct laboratory/imaging interpretations or appropriate next-test selections, per AI tool and pooled; stratified by difficulty tier. Unit: percent (0-100).

    Day 1

  • AI Diagnostic Accuracy (%)

    Proportion of vignettes with a correct primary diagnosis produced by each anonymized AI tool and pooled across tools. Correctness is defined against a pre-specified reference answer key; results are also stratified by pre-defined difficulty tiers (easy/moderate/difficult/very difficult). Unit of measure: percent (0-100).

    Day 1

  • AI Medication-Dosing Accuracy (%)

    Proportion of dose recommendations meeting pediatric standards (weight- or BSA-based ranges, route, frequency) per reference rubric, per AI tool and pooled; stratified by difficulty tier. Unit: percent (0-100).

    Day 1
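The three primary outcomes share one calculation: percent correct per AI tool, pooled across tools, and stratified by difficulty tier. A minimal Python sketch of that tabulation follows. The record layout, the tool labels, and the choice to pool by counting every tool response equally are assumptions for illustration, not details taken from the study protocol.

```python
from collections import defaultdict


def accuracy_table(records):
    """Return {(tool, tier): percent correct}.

    Adds a "pooled" tool row and an "all" tier column. Pooling here counts
    every response equally across tools (an assumption).
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [correct, total]
    for tool, tier, correct in records:
        keys = ((tool, tier), (tool, "all"), ("pooled", tier), ("pooled", "all"))
        for key in keys:
            counts[key][0] += int(correct)
            counts[key][1] += 1
    return {key: 100.0 * c / n for key, (c, n) in counts.items()}


# Hypothetical scored responses: (tool label, difficulty tier, matches answer key?)
records = [
    ("tool_1", "easy", True),
    ("tool_1", "difficult", False),
    ("tool_2", "easy", True),
    ("tool_2", "difficult", True),
]
table = accuracy_table(records)
```

The same function would apply to diagnostic, dosing, and interpretation items, provided each response has already been scored against the reference key or rubric.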

Secondary Outcomes (5)

  • Change in Physician Diagnostic Accuracy (percentage points) (Group 2 only)

    Day 1: Baseline (pre-AI) and immediate Post-AI within the same session (0-15 min after baseline).

  • Confidence Shift (Δ on a 1-10 scale) (Group 2 only)

    Day 1: Baseline (pre-AI) and immediate Post-AI within the same session (0-15 min after baseline).

  • Answer-Change Frequency (%) (Group 2 only)

    Day 1

  • AI Response Time (seconds per vignette)

    Day 1

  • Net Benefit Index of AI Exposure (percentage points) (Group 2 only)

    Day 1
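The Group 2 secondary outcomes can all be derived from paired pre-AI and post-AI responses. The Python sketch below is illustrative only: the field names are hypothetical, and the record does not define the Net Benefit Index, so the version here (percent of items moved from wrong to right minus percent moved from right to wrong) is an assumed reading.

```python
def ai_exposure_summary(items):
    """Summarize paired pre-/post-AI responses for one physician or cohort."""
    n = len(items)
    pre_acc = 100.0 * sum(i["pre_correct"] for i in items) / n
    post_acc = 100.0 * sum(i["post_correct"] for i in items) / n
    changed = sum(i["pre_answer"] != i["post_answer"] for i in items)
    improved = sum((not i["pre_correct"]) and i["post_correct"] for i in items)
    worsened = sum(i["pre_correct"] and (not i["post_correct"]) for i in items)
    return {
        "accuracy_change_pp": post_acc - pre_acc,
        "confidence_shift": sum(i["post_conf"] - i["pre_conf"] for i in items) / n,
        "answer_change_pct": 100.0 * changed / n,
        # Assumed definition; the registry entry does not specify one.
        "net_benefit_pp": 100.0 * (improved - worsened) / n,
    }


# Hypothetical data: four vignette items answered before and after AI exposure.
items = [
    {"pre_answer": "A", "post_answer": "B", "pre_correct": False, "post_correct": True, "pre_conf": 4, "post_conf": 7},
    {"pre_answer": "A", "post_answer": "A", "pre_correct": True, "post_correct": True, "pre_conf": 8, "post_conf": 9},
    {"pre_answer": "C", "post_answer": "D", "pre_correct": True, "post_correct": False, "pre_conf": 6, "post_conf": 5},
    {"pre_answer": "B", "post_answer": "C", "pre_correct": False, "post_correct": True, "pre_conf": 5, "post_conf": 8},
]
summary = ai_exposure_summary(items)
```

Under this assumed definition the net benefit equals the change in accuracy, so a study-specific index would presumably weight or define the two directions of change differently.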

Study Arms (2)

Group/Cohort 1: Direct Answer (Physician-only)

Board-certified/eligible pediatric specialists answer each standardized vignette once, without confidence scoring and without viewing AI suggestions. Outcomes captured: Diagnostic accuracy, dosing accuracy, interpretation accuracy, completion time.

Group/Cohort 2: Confidence + AI Suggestions

Pediatric specialists first answer and rate confidence on a 1-10 scale; then they view anonymized suggestions from five distinct AI tools (tool names not shown) and may keep or revise their answer. All changes and confidence shifts are recorded. Outcomes captured: Pre- vs post-AI accuracy (and direction of change), dosing and interpretation accuracy changes, confidence shift, completion time.

Other: AI Suggestions (Anonymized 5-tool panel)
Other: Confidence Rating Task (1-10 Likert)

Interventions

Other: AI Suggestions (Anonymized 5-tool panel)

What: Display of AI-generated suggestions for each vignette, aggregated from five large language model tools (names not shown to participants).
When/Who: Shown only in Group 2, after the physician's initial answer and confidence score.
Purpose: Measure AI performance (primary) and quantify the effect of AI suggestions on physicians' answers (secondary).
Applies to: Group 2.

Group/Cohort 2: Confidence + AI Suggestions

Other: Confidence Rating Task (1-10 Likert)

What: Self-rated confidence for the initial answer on a 1-10 scale.
When/Who: Group 2 before viewing AI suggestions.
Purpose: Quantify confidence changes pre- vs post-AI and relate confidence to correctness.
Applies to: Group 2.

Group/Cohort 2: Confidence + AI Suggestions

Eligibility Criteria

Age
28 Years - 40 Years
Sex
All
Healthy Volunteers
Yes
Age Groups
Adult (18-64)
Sampling Method
Non-Probability Sample
Study Population

Board-certified/eligible general pediatrics specialists recruited from SBÜ Sultangazi Haseki Training and Research Hospital network.

You may qualify if:

  • Board-certified or board-eligible pediatric specialist (general pediatrics) within the first 10 years of specialist practice.
  • Actively practicing at the participating institution/network at the time of enrollment.
  • Able and willing to complete all vignette items individually in a single session and to follow study instructions for the assigned cohort (direct answers or confidence rating + viewing anonymized AI suggestions).
  • Fluent in Turkish and able to use a computer interface.
  • Provides written informed consent.

You may not qualify if:

  • Pediatric subspecialist practice as primary role (e.g., cardiology, infectious diseases, neurology, neonatology, etc.), to maintain a homogeneous general pediatrics cohort.
  • Prior access to or participation in creating the study vignettes, answer keys, or scoring rubrics; direct involvement with the study team.
  • Inability to complete the session without external help or use of non-protocol resources (internet/AI tools) during answering (outside of anonymized AI suggestions shown by the system in Group 2).
  • Failure to complete ≥90% of items or major protocol deviation (e.g., discussion with others during the task).
  • Any condition judged by investigators to interfere with valid participation (e.g., severe time constraints, inability to provide consent).

Contact the study team to confirm eligibility.

Study Sites (1)

SBÜ Sultangazi Haseki Training and Research Hospital

Istanbul, Sultangazi, 34010, Turkey (Türkiye)

Related Publications (5)

  • Su H, Sun Y, Li R, Zhang A, Yang Y, Xiao F, Duan Z, Chen J, Hu Q, Yang T, Xu B, Zhang Q, Zhao J, Li Y, Li H. Large Language Models in Medical Diagnostics: Scoping Review With Bibliometric Analysis. J Med Internet Res. 2025 Jun 9;27:e72062. doi: 10.2196/72062.

    PMID: 40489764 (Background)
  • Bicknell BT, Butler D, Whalen S, Ricks J, Dixon CJ, Clark AB, Spaedy O, Skelton A, Edupuganti N, Dzubinski L, Tate H, Dyess G, Lindeman B, Lehmann LS. ChatGPT-4 Omni Performance in USMLE Disciplines and Clinical Skills: Comparative Analysis. JMIR Med Educ. 2024 Nov 6;10:e63430. doi: 10.2196/63430.

    PMID: 39504445 (Background)
  • Cross JL, Choma MA, Onofrey JA. Bias in medical AI: Implications for clinical decision-making. PLOS Digit Health. 2024 Nov 7;3(11):e0000651. doi: 10.1371/journal.pdig.0000651. eCollection 2024 Nov.

    PMID: 39509461 (Background)
  • Cruz Rivera S, Liu X, Chan AW, Denniston AK, Calvert MJ; SPIRIT-AI and CONSORT-AI Working Group; SPIRIT-AI and CONSORT-AI Steering Group; SPIRIT-AI and CONSORT-AI Consensus Group. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat Med. 2020 Sep;26(9):1351-1363. doi: 10.1038/s41591-020-1037-7. Epub 2020 Sep 9.

    PMID: 32908284 (Background)
  • Vasey B, Nagendran M, Campbell B, Clifton DA, Collins GS, Denaxas S, Denniston AK, Faes L, Geerts B, Ibrahim M, Liu X, Mateen BA, Mathur P, McCradden MD, Morgan L, Ordish J, Rogers C, Saria S, Ting DSW, Watkinson P, Weber W, Wheatstone P, McCulloch P; DECIDE-AI expert group. Reporting guideline for the early stage clinical evaluation of decision support systems driven by artificial intelligence: DECIDE-AI. BMJ. 2022 May 18;377:e070904. doi: 10.1136/bmj-2022-070904.

    PMID: 35584845 (Background)

Study Design

Study Type
Observational
Observational Model
Cohort
Time Perspective
Cross-sectional
Sponsor Type
Other
Responsible Party
Principal Investigator
PI Title
MD - Pediatrician (Principal Investigator)

Study Record Dates

First Submitted

September 11, 2025

First Posted

September 18, 2025

Study Start

August 27, 2025

Primary Completion

September 10, 2025

Study Completion

September 11, 2025

Last Updated

September 23, 2025

Record last verified: 2025-09

Data Sharing

IPD Sharing
Will not share

IPD Description: No IPD will be shared. The dataset comprises detailed, item-level responses from a small, single-center cohort of pediatric specialists. Despite de-identification, the risk of re-identification is non-trivial given granular performance metrics and professional identifiers. The informed consent did not include permission to share raw individual responses outside the study team.

Plan to Share Supporting Materials: Yes

Supporting Documents: Study Protocol, Statistical Analysis Plan, scoring rubrics, redacted vignette templates, and analysis code.

Time Frame: Available within 6 months after the primary manuscript is published and for at least 36 months thereafter.

Access Criteria and URL: Materials will be provided upon reasonable request to the Principal Investigator (email: drberkerokay@gmail.com). A data use agreement will be required; use is limited to non-commercial research and aggregate reporting.
