NCT07378358

Brief Summary

This multicenter retrospective study aims to evaluate the diagnostic and therapeutic performance of three large language models (ChatGPT, Gemini and DeepSeek) using 800 archived inpatient medical records from urology departments across four tertiary hospitals. The study will focus on the accuracy and applicability of these models in disease recognition, preliminary diagnosis and treatment recommendation generation, in order to explore their potential value and limitations in supporting clinical decision-making in real-world settings.

Trial Health

57
Monitor

Trial Health Score

Automated assessment based on enrollment pace, timeline, and geographic reach

Trial has exceeded expected completion date
Enrollment
800

participants targeted

Target at P75+ for all trials

Timeline
Completed

Started Jan 2026

Shorter than P25 for all trials

Geographic Reach
1 country

1 active site

Status
recruiting

Health score is calculated from publicly available data and should be used for screening purposes only.

Study Timeline

Key milestones and dates

First Submitted

Initial submission to the registry

December 9, 2025

Completed
23 days until next milestone

Study Start

First participant enrolled

January 1, 2026

Completed
29 days until next milestone

First Posted

Study publicly available on registry

January 30, 2026

Completed
2 months until next milestone

Primary Completion

Last participant's last visit for primary outcome

April 1, 2026

Completed
2 months until next milestone

Study Completion

Last participant's last visit for all outcomes

June 1, 2026

Completed
Last Updated

January 30, 2026

Status Verified

January 1, 2026

Enrollment Period

3 months

First QC Date

December 9, 2025

Last Update Submit

January 26, 2026

Conditions

Keywords

Large Language Models, Urologic Diseases, Clinical Decision Support, Retrospective Study

Outcome Measures

Primary Outcomes (6)

  • Diagnostic Accuracy: Assessed by Top-1 accuracy

    Top-1: Proportion of cases where the model's first diagnosis matches the true primary diagnosis.

    Through study completion, an average of 3 months

  • Diagnostic Accuracy: Assessed by Top-3 accuracy

    Top-3: Proportion of cases where the true diagnosis appears in the model's top 3.

    Through study completion, an average of 3 months

  • Diagnostic Completeness

    Proportion of the model's diagnoses that overlap with all diagnoses (primary and secondary) in the case.

    Through study completion, an average of 3 months

  • Differential Diagnosis Quality

    Evaluated by experts using a Likert 5-point scale, considering factors like common disease coverage, logical clarity, and specificity

    Through study completion, an average of 3 months

  • Treatment Plan Quality

    Assesses whether the model's treatment suggestions align with clinical guidelines, scored by experts on completeness, appropriateness, and safety.

    Through study completion, an average of 3 months

  • Analysis Time

    Time taken by the AI model to provide diagnoses and treatment suggestions (in seconds), reflecting real-time capability.

    Through study completion, an average of 3 months
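The accuracy-type outcomes above reduce to simple proportions. The following is a minimal Python sketch, assuming diagnoses have already been normalized to comparable strings (the registry entry does not say how a match is judged). The function names `top_k_accuracy` and `completeness` and the example cases are hypothetical and are not study data. Completeness is read here as the share of the record's diagnoses that the model recovered, which is only one possible interpretation of the registry wording.

```python
from typing import Sequence


def top_k_accuracy(predictions: Sequence[Sequence[str]], truths: Sequence[str], k: int) -> float:
    """Share of cases whose true primary diagnosis is among the model's first k diagnoses."""
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    if not truths:
        raise ValueError("at least one case is required")
    hits = sum(1 for ranked, truth in zip(predictions, truths) if truth in ranked[:k])
    return hits / len(truths)


def completeness(model_dx: Sequence[str], record_dx: Sequence[str]) -> float:
    """Assumed reading: fraction of the record's diagnoses (primary and secondary) the model listed."""
    record = set(record_dx)
    if not record:
        raise ValueError("record must contain at least one diagnosis")
    return len(record & set(model_dx)) / len(record)


# Hypothetical, already-normalized example cases (not study data).
ranked_outputs = [
    ["ureteral calculus", "pyelonephritis", "renal colic"],
    ["cystitis", "bladder cancer", "urethral stricture"],
    ["prostatitis", "benign prostatic hyperplasia", "prostate cancer"],
    ["renal cyst", "hydronephrosis", "renal abscess"],
]
true_primary = [
    "ureteral calculus",
    "bladder cancer",
    "benign prostatic hyperplasia",
    "renal cell carcinoma",
]

top1 = top_k_accuracy(ranked_outputs, true_primary, k=1)  # 1 of 4 cases match
top3 = top_k_accuracy(ranked_outputs, true_primary, k=3)  # 3 of 4 cases match
overlap = completeness(
    ["ureteral calculus", "hydronephrosis"],
    ["ureteral calculus", "hydronephrosis", "hypertension"],
)  # 2 of 3 record diagnoses recovered
```

Exact string matching is the simplest possible rule; in practice, synonym handling or expert adjudication would probably be needed before these proportions are meaningful.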

Interventions

De-identified inpatient medical records were retrospectively collected from the urology departments of four tertiary hospitals (200 cases per site, 800 in total). Each case included standardized clinical information such as demographics, chief complaint, history of present illness, past medical history, physical examination, laboratory and imaging findings, discharge diagnosis and treatment plan. To simulate the role of an AI system in a "first-visit physician" scenario, all diagnostic conclusions, differential diagnoses and treatment plans were removed before being input into the models. Three large language models (ChatGPT, Gemini and DeepSeek) were prompted with a standardized instruction: "Based on the above clinical information, provide your preliminary diagnosis, differential diagnoses and treatment recommendations." Each model generated outputs including (i) primary and secondary diagnoses, (ii) differential diagnosis lists with reasoning and (iii) preliminary treatment suggesti
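The blinding and prompting step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the investigators' pipeline: the field names (such as `discharge_diagnosis`), the helpers `build_prompt` and `timed_call`, and the example record are hypothetical, and no real model API is called. `model_fn` stands in for whatever client was actually used. Only the instruction text is taken from the registry entry. The timing helper shows one way the "Analysis Time" outcome (in seconds) could be captured.

```python
import time
from typing import Callable, Dict, Tuple

# Instruction text quoted from the registry entry.
INSTRUCTION = (
    "Based on the above clinical information, provide your preliminary "
    "diagnosis, differential diagnoses and treatment recommendations."
)

# Hypothetical field names for the content withheld from the models.
WITHHELD_FIELDS = {"discharge_diagnosis", "differential_diagnoses", "treatment_plan"}


def build_prompt(record: Dict[str, str]) -> str:
    """Drop withheld fields, then append the standardized instruction."""
    visible = [f"{name}: {text}" for name, text in record.items() if name not in WITHHELD_FIELDS]
    return "\n".join(visible + ["", INSTRUCTION])


def timed_call(model_fn: Callable[[str], str], prompt: str) -> Tuple[str, float]:
    """Return the model output and the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    output = model_fn(prompt)
    return output, time.perf_counter() - start


# Hypothetical record and a stand-in model; neither comes from the study.
example_record = {
    "chief_complaint": "Left flank pain for 2 days",
    "imaging": "CT shows a 6 mm left ureteral stone",
    "discharge_diagnosis": "Left ureteral calculus",
    "treatment_plan": "Ureteroscopic lithotripsy",
}
prompt = build_prompt(example_record)
reply, seconds = timed_call(lambda p: "stub reply", prompt)
```

Keeping the withheld fields in a single set makes it easy to audit that no diagnostic conclusion leaks into the prompt.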

Eligibility Criteria

Age
18 Years+
Sex
All
Healthy Volunteers
No
Age Groups
Adult (18-64), Older Adult (65+)
Sampling Method
Non-Probability Sample
Study Population

The study population was drawn from the following institutions: The First Affiliated Hospital of Fujian Medical University, The Second Affiliated Hospital of Fujian Medical University, Shishi City Hospital and Shaowu City Hospital.

You may qualify if:

  • The case data is sourced from the four hospitals involved in the study, with complete and authentic diagnosis and treatment records.
  • Patients must be 18 years or older, with no gender restrictions.
  • Complete medical records, including the following core information: patient's basic information, present illness history, past medical history, physical examination, and auxiliary examinations (including laboratory and imaging tests).
  • A clear discharge diagnosis and treatment plan (including therapeutic measures and follow-up arrangements).
  • Medical records have been archived, with objective and accurate information that has not been altered.
  • The patient or their legal representative has provided informed consent, agreeing to the use of their anonymized medical data for research analysis.

You may not qualify if:

  • Medical records with significant missing information, such as key clinical details (present illness history, diagnostic or treatment records, etc.).
  • Cases where the diagnosis or treatment plan is unclear, or where treatment has not been fully completed for an initial diagnosis.
  • Cases where the primary diagnosis is not urological.
  • Cases with major errors or inconsistencies in the records that could affect further assessment.
  • Medical records in special formats or images that are not readable (e.g., handwritten notes, non-standard documentation).
  • Patients who have not signed the informed consent form or who refuse to allow their medical data to be used for research.
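The inclusion and exclusion criteria above amount to a record-level screening rule. A rough Python sketch is given below, assuming each criterion has already been reduced to a yes/no flag by a human reviewer. The `CaseRecord` fields and the `exclusion_reasons` helper are hypothetical simplifications, not part of the registered protocol.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CaseRecord:
    # All field names are hypothetical simplifications of the listed criteria.
    age: int
    consent_given: bool
    core_fields_complete: bool
    discharge_plan_clear: bool
    primary_dx_urological: bool
    machine_readable: bool


def exclusion_reasons(case: CaseRecord) -> List[str]:
    """Return the reasons a record fails screening; an empty list means eligible."""
    checks = [
        (case.age >= 18, "patient younger than 18"),
        (case.consent_given, "no informed consent"),
        (case.core_fields_complete, "key clinical information missing"),
        (case.discharge_plan_clear, "diagnosis or treatment plan unclear"),
        (case.primary_dx_urological, "primary diagnosis is not urological"),
        (case.machine_readable, "record format not readable"),
    ]
    return [reason for ok, reason in checks if not ok]


# Hypothetical examples (not study data).
eligible = CaseRecord(54, True, True, True, True, True)
excluded = CaseRecord(17, True, True, True, False, True)

eligible_reasons = exclusion_reasons(eligible)  # empty list: record passes
excluded_reasons = exclusion_reasons(excluded)  # two reasons, in check order
```

Returning the list of reasons rather than a single boolean makes it possible to report how many records were excluded under each criterion.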

Contact the study team to confirm eligibility.

Sponsors & Collaborators

Study Sites (1)

The First Affiliated Hospital of Fujian Medical University

Fuzhou, China

RECRUITING

MeSH Terms

Conditions

Urologic Diseases

Condition Hierarchy (Ancestors)

Female Urogenital Diseases, Female Urogenital Diseases and Pregnancy Complications, Urogenital Diseases, Male Urogenital Diseases

Central Study Contacts

Study Design

Study Type
observational
Observational Model
COHORT
Time Perspective
RETROSPECTIVE
Sponsor Type
OTHER
Responsible Party
SPONSOR

Study Record Dates

First Submitted

December 9, 2025

First Posted

January 30, 2026

Study Start

January 1, 2026

Primary Completion

April 1, 2026

Study Completion

June 1, 2026

Last Updated

January 30, 2026

Record last verified: 2026-01

Data Sharing

IPD Sharing
Will not share

Locations