Brief Summary

This multicenter retrospective study aims to evaluate the diagnostic and therapeutic performance of three large language models (ChatGPT, Gemini and DeepSeek) using 800 archived inpatient medical records from the urology departments of four tertiary hospitals. The study will focus on the accuracy and applicability of these models in disease recognition, preliminary diagnosis and treatment recommendation generation, in order to explore their potential value and limitations in supporting clinical decision-making in real-world settings.

Trial Health

Trial Health Score

Automated assessment based on enrollment pace, timeline, and geographic reach

Trial has exceeded expected completion date

Enrollment

800

participants targeted

Target at P75+ for all trials

Timeline

Completed

Started Jan 2026

Shorter than P25 for all trials

Geographic Reach

1 country

1 active site

Status

recruiting

Health score is calculated from publicly available data and should be used for screening purposes only.

Study Timeline

Key milestones and dates

5 months study duration

First Submitted

Initial submission to the registry

December 9, 2025

Completed

23 days until next milestone

Study Start

First participant enrolled

January 1, 2026

Completed

29 days until next milestone

First Posted

Study publicly available on registry

January 30, 2026

Completed

2 months until next milestone

Primary Completion

Last participant's last visit for primary outcome

April 1, 2026

Completed

2 months until next milestone

Study Completion

Last participant's last visit for all outcomes

June 1, 2026

Completed

Last Updated

January 30, 2026

Status Verified

January 1, 2026

Enrollment Period

3 months

First QC Date

December 9, 2025

Last Update Submit

January 26, 2026

Conditions

Urologic Diseases

Keywords

Large Language Models, Urologic Diseases, Clinical Decision Support, Retrospective Study

Outcome Measures

Primary Outcomes (6)

Diagnostic Accuracy: Assessed by Top-1 accuracy
Top-1: Proportion of cases where the model's first diagnosis matches the true primary diagnosis.
Time Frame: Through study completion, an average of 3 months

Diagnostic Accuracy: Assessed by Top-3 accuracy
Top-3: Proportion of cases where the true diagnosis appears in the model's top 3.
Time Frame: Through study completion, an average of 3 months

Diagnostic Completeness
Proportion of the model's diagnoses that overlap with all diagnoses (primary and secondary) in the case.
Time Frame: Through study completion, an average of 3 months

Differential Diagnosis Quality
Evaluated by experts using a 5-point Likert scale, considering factors such as common disease coverage, logical clarity, and specificity.
Time Frame: Through study completion, an average of 3 months

Treatment Plan Quality
Assesses whether the model's treatment suggestions align with clinical guidelines, scored by experts on completeness, appropriateness, and safety.
Time Frame: Through study completion, an average of 3 months

Analysis Time
Time taken by the AI model to provide diagnoses and treatment suggestions (in seconds), reflecting real-time capability.
Time Frame: Through study completion, an average of 3 months
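The three count-based measures above (Top-1, Top-3 and completeness) can be illustrated with a minimal Python sketch. It is not the study's scoring code: the function names, the exact-match comparison after lowercasing, and the choice of denominator for completeness are all assumptions made for illustration. In particular, the completeness definition is read literally here (share of the model's diagnoses that also appear in the case record); a recall-style reading would divide by the number of recorded diagnoses instead.

```python
from typing import Sequence


def _norm(label: str) -> str:
    """Lowercase and collapse whitespace so trivially different labels match.
    Assumption: a real evaluation would likely map labels to a coding system."""
    return " ".join(label.lower().split())


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    true_primary: Sequence[str],
    k: int,
) -> float:
    """Share of cases whose true primary diagnosis is among the first k predictions."""
    if len(ranked_predictions) != len(true_primary):
        raise ValueError("one ranked prediction list is required per case")
    if not true_primary:
        raise ValueError("at least one case is required")
    if k < 1:
        raise ValueError("k must be at least 1")
    hits = 0
    for ranked, truth in zip(ranked_predictions, true_primary):
        if _norm(truth) in {_norm(p) for p in ranked[:k]}:
            hits += 1
    return hits / len(true_primary)


def completeness(model_diagnoses: Sequence[str], case_diagnoses: Sequence[str]) -> float:
    """Share of the model's diagnoses that also appear among the case's
    primary and secondary diagnoses (literal reading; hypothetical choice)."""
    predicted = {_norm(d) for d in model_diagnoses}
    if not predicted:
        return 0.0
    recorded = {_norm(d) for d in case_diagnoses}
    return len(predicted & recorded) / len(predicted)
```

Calling `top_k_accuracy(preds, truths, 1)` and `top_k_accuracy(preds, truths, 3)` on the same ranked lists gives the two accuracy figures; Top-3 can never be lower than Top-1 for the same data, which makes a quick sanity check.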

Interventions

Large Language Model Assessment (ChatGPT, Gemini, DeepSeek)
Intervention Type: OTHER

De-identified inpatient medical records were retrospectively collected from the urology departments of four tertiary hospitals (200 cases per site, 800 in total). Each case included standardized clinical information such as demographics, chief complaint, history of present illness, past medical history, physical examination, laboratory and imaging findings, discharge diagnosis and treatment plan. To simulate the role of an AI system in a "first-visit physician" scenario, all diagnostic conclusions, differential diagnoses and treatment plans were removed before being input into the models. Three large language models (ChatGPT, Gemini and DeepSeek) were prompted with a standardized instruction: "Based on the above clinical information, provide your preliminary diagnosis, differential diagnoses and treatment recommendations." Each model generated outputs including (i) primary and secondary diagnoses, (ii) differential diagnosis lists with reasoning and (iii) preliminary treatment suggesti
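The input preparation described above (strip the conclusions, then append the standardized instruction) might look roughly like the following Python sketch. The record field names and the `build_prompt` helper are hypothetical; only the instruction text is taken from the description. How each model is actually called is not specified in the record and is left out.

```python
from typing import Mapping

# Hypothetical field names for the parts of a record that are withheld from
# the model: diagnostic conclusions, differential diagnoses, treatment plans.
REDACTED_FIELDS = ("discharge_diagnosis", "differential_diagnosis", "treatment_plan")

# Standardized instruction quoted in the intervention description.
INSTRUCTION = (
    "Based on the above clinical information, provide your preliminary "
    "diagnosis, differential diagnoses and treatment recommendations."
)


def build_prompt(record: Mapping[str, str]) -> str:
    """Render the non-redacted, non-empty fields as 'Label: text' lines,
    then append the instruction after a blank line."""
    lines = [
        f"{field.replace('_', ' ').title()}: {text.strip()}"
        for field, text in record.items()
        if field not in REDACTED_FIELDS and text.strip()
    ]
    return "\n".join(lines + ["", INSTRUCTION])
```

Keeping the redaction list in one constant makes it easy to check that no withheld field can reach a model, whichever of the three is being prompted.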

Eligibility Criteria

Age: 18 Years+
Sex: All
Healthy Volunteers: No
Age Groups: Adult (18-64), Older Adult (65+)
Sampling Method: Non-Probability Sample

Study Population

The study population was drawn from the following institutions: The First Affiliated Hospital of Fujian Medical University, The Second Affiliated Hospital of Fujian Medical University, Shishi City Hospital and Shaowu City Hospital.

You may qualify if:

The case data is sourced from the four hospitals involved in the study, with complete and authentic diagnosis and treatment records.
Patients must be 18 years or older, with no gender restrictions.
Complete medical records, including the following core information: patient's basic information, present illness history, past medical history, physical examination, and auxiliary examinations (including laboratory and imaging tests).
A clear discharge diagnosis and treatment plan (including therapeutic measures and follow-up arrangements).
Medical records have been archived, with objective and accurate information that has not been altered.
The patient or their legal representative has provided informed consent, agreeing to the use of their anonymized medical data for research analysis.

You may not qualify if:

Medical records with significant missing information, such as key clinical details (present illness history, diagnostic or treatment records, etc.).
Cases where the diagnosis or treatment plan is unclear, or where treatment has not been fully completed for an initial diagnosis.
Cases where the primary diagnosis is not urological.
Cases with major errors or inconsistencies in the records that could affect further assessment.
Medical records in special formats or images that are not readable (e.g., handwritten notes, non-standard documentation).
Patients who have not signed the informed consent form or who refuse to allow their medical data to be used for research.

Contact the study team to confirm eligibility.

Sponsors & Collaborators

First Affiliated Hospital of Fujian Medical University (lead)

Study Sites (1)

The First Affiliated Hospital of Fujian Medical University

Fuzhou, China

RECRUITING

MeSH Terms

Conditions

Urologic Diseases

Condition Hierarchy (Ancestors)

Female Urogenital Diseases, Female Urogenital Diseases and Pregnancy Complications, Urogenital Diseases, Male Urogenital Diseases

Central Study Contacts

Ning Xu

CONTACT

+86-13235907575 drxun@fjmu.edu.cn

Study Design

Study Type: observational
Observational Model: COHORT
Time Perspective: RETROSPECTIVE
Sponsor Type: OTHER
Responsible Party: SPONSOR

Study Record Dates

First Submitted

December 9, 2025

First Posted

January 30, 2026

Study Start

January 1, 2026

Primary Completion

April 1, 2026

Study Completion

June 1, 2026

Last Updated

January 30, 2026

Record last verified: 2026-01

Data Sharing

IPD Sharing: Will not share

Locations

CN(1)
