Brief Summary

The goal of this randomized controlled trial is to evaluate whether large language models (LLMs) improve laypeople's ability to self-diagnose and triage common diseases. The main questions it aims to answer are:

Does using an LLM help participants make more accurate self-diagnoses and care decisions for common illnesses, compared with their initial guess made without any help?
How much better do people perform when they work together with an LLM, compared with using a regular search engine, with the LLM working alone, or with how doctors would decide?

Researchers will compare participants who were randomly assigned to either the LLM group (using DeepSeek) or the search engine group to see whether the LLM-assisted approach leads to better clinical judgments.

Participants will:
Read one of 48 short, realistic health vignettes;
Make an initial guess about what might be wrong by listing up to three possible causes, ranked from most to least likely, and choose a care level: seek immediate care, see a doctor within one day, see a doctor within one week, or manage at home without medical care;
Use their assigned tool (either DeepSeek or a standard search engine) to look up information and update their guess and care decision;
Submit their final diagnosis and care choice after using the tool.

In addition, the study team evaluated the performance of four other AI models (GPT-4o, GPT-o1, DeepSeek-v3, and DeepSeek-r1) and 33 experienced general physicians on the same vignettes.

Trial Health

On Track

Trial Health Score

Automated assessment based on enrollment pace, timeline, and geographic reach

Enrollment

6,360

participants targeted

Target at P75+ for trials with phase: not applicable

Timeline

Completed

Started Apr 2025

Shorter than P25 for trials with phase: not applicable

Geographic Reach

1 country

1 active site

Status

completed

Health score is calculated from publicly available data and should be used for screening purposes only.

Study Timeline

Key milestones and dates

2 months study duration

Study Start

First participant enrolled

April 27, 2025

Completed

2 months until next milestone

Primary Completion

Last participant's last visit for primary outcome

July 1, 2025

Completed

Same day until next milestone

Study Completion

Last participant's last visit for all outcomes

July 1, 2025

Completed

5 months until next milestone

First Submitted

Initial submission to the registry

November 17, 2025

Completed

9 days until next milestone

First Posted

Study publicly available on registry

November 26, 2025

Completed

Last Updated

November 26, 2025

Status Verified

October 1, 2025

Enrollment Period

2 months

First QC Date

November 17, 2025

Last Update Submit

November 25, 2025

Conditions

Vignette Based Intervention
LLM-based AI Dialogue Bot

Outcome Measures

Primary Outcomes (2)

Top-3 Diagnostic Accuracy
The primary diagnostic outcome was defined as the proportion of participants who included the correct diagnosis in their top three differential diagnoses after using the assigned tool (LLM or search engine). Accuracy was assessed for each of the 48 clinical vignettes and aggregated across all participants in each group.
Time frame: Immediately after intervention (within the same survey session)

Triage Accuracy (4-class exact match)
Triage accuracy was defined as the proportion of participants who selected the triage level (emergent care, within one day, within one week, or self-care) that matched the reference standard. There were 12 vignettes per triage category.
Time frame: Immediately after intervention (within the same survey session)
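
The two primary outcomes are simple proportions, so they can be illustrated with a short Python sketch. This is not the study team's scoring procedure: the function names, the example responses, and the use of case-insensitive string matching to decide whether a diagnosis is "correct" are all assumptions made for illustration. The record does not say how free-text diagnoses were adjudicated.

```python
from typing import Sequence


def top_k_accuracy(ranked: Sequence[Sequence[str]],
                   reference: Sequence[str],
                   k: int = 3) -> float:
    """Proportion of responses whose first k ranked diagnoses include the reference.

    Matching by normalized string equality is an assumption of this sketch.
    """
    if len(ranked) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    for diagnoses, ref in zip(ranked, reference):
        candidates = {d.strip().lower() for d in diagnoses[:k]}
        if ref.strip().lower() in candidates:
            hits += 1
    return hits / len(reference)


def exact_match_accuracy(chosen: Sequence[str],
                         reference: Sequence[str]) -> float:
    """Proportion of responses whose triage level equals the reference level."""
    if len(chosen) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(c == r for c, r in zip(chosen, reference))
    return matches / len(reference)


# Hypothetical responses from three participants (not study data).
example_ranked = [
    ["influenza", "common cold", "pneumonia"],
    ["migraine"],
    ["gastritis", "appendicitis"],
]
example_reference = ["pneumonia", "tension headache", "appendicitis"]
example_triage = ["within one day", "self-care", "emergent care"]
example_triage_ref = ["within one day", "within one week", "emergent care"]

top3 = top_k_accuracy(example_ranked, example_reference, k=3)
triage = exact_match_accuracy(example_triage, example_triage_ref)
```

Setting `k=1` in the same function gives a top-1 measure, which counts only the first-ranked diagnosis.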

Secondary Outcomes (2)

Top-1 Diagnostic Accuracy
Time frame: Immediately after intervention (within the same survey session)

Triage Accuracy (2-class binary match)
Time frame: Immediately after intervention (within the same survey session)

Study Arms (2)

layperson-LLM integrated group

EXPERIMENTAL

After initially answering a clinical diagnosis and triage question without the aid of tools, the participants were asked to use a large language model (DeepSeek-v3 or DeepSeek-r1) to retrieve health information and then answer the same question again.

Behavioral: AI-assisted health information seeking

layperson-search engine group

ACTIVE COMPARATOR

After initially answering a clinical diagnosis and triage question without the aid of tools, the participants were asked to use a search engine to retrieve health information and then answer the same question again.

Behavioral: Conventional internet search for health information

Interventions

AI-assisted health information seeking (BEHAVIORAL)

Participants in this group used a large language model (DeepSeek) to search for medical information related to a clinical vignette after providing initial diagnostic and triage decisions. They were instructed to interact freely with the model to gather insights and then update their diagnoses and triage recommendations. The intervention simulates real-world use of AI tools for personal health decision-making.

layperson-LLM integrated group

Conventional internet search for health information (BEHAVIORAL)

Participants in this group used mainstream internet search engines (e.g., Baidu, Google, Bing) to look up information about the clinical vignette after making initial diagnostic and triage decisions. They were allowed to search freely but were not permitted to use any named AI chatbot or large language model platform. This group represents typical self-directed online health information seeking behavior.

layperson-search engine group

Eligibility Criteria

Age: 18 Years+

Sex: All

Healthy Volunteers: No

Age Groups: Adult (18-64), Older Adult (65+)

You may qualify if:

Age 18 years or older
Current resident of mainland China
History of high-quality participation in online surveys on Credamo platform (historical survey acceptance rate ≥ 80% and personal credit score ≥ 70)

You may not qualify if:

Incomplete survey responses
Failure on embedded quality-check items
Implausibly short completion time (<180 seconds for search engine group; <360 seconds for LLM group)
Provision of non-diagnostic or irrelevant responses (e.g., "unknown", "don't know")
Consistent pattern of identical responses across all items
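
The exclusion rules above amount to a per-response screening step, which can be sketched in Python as follows. The `Response` fields, group labels, and reason strings are hypothetical names chosen for this sketch. Only the thresholds (180 and 360 seconds) and the example non-diagnostic answers come from the criteria listed above, and the record does not describe how the study team actually implemented the checks.

```python
from dataclasses import dataclass
from typing import List

# Minimum plausible completion time per group, taken from the criteria above.
MIN_SECONDS = {"search_engine": 180, "llm": 360}
# Example non-diagnostic answers named in the criteria; a real list may be longer.
NON_DIAGNOSTIC = {"unknown", "don't know"}


@dataclass
class Response:
    group: str                    # "llm" or "search_engine" (hypothetical labels)
    complete: bool                # every survey item answered
    passed_quality_checks: bool   # embedded quality-check items answered correctly
    completion_seconds: float
    diagnoses: List[str]
    item_answers: List[str]       # answers across all survey items


def exclusion_reasons(r: Response) -> List[str]:
    """Return every exclusion criterion a response meets (empty list = keep)."""
    reasons = []
    if not r.complete:
        reasons.append("incomplete response")
    if not r.passed_quality_checks:
        reasons.append("failed quality check")
    if r.completion_seconds < MIN_SECONDS[r.group]:
        reasons.append("completion time below threshold")
    if any(d.strip().lower() in NON_DIAGNOSTIC for d in r.diagnoses):
        reasons.append("non-diagnostic answer")
    if len(r.item_answers) > 1 and len(set(r.item_answers)) == 1:
        reasons.append("identical answers across items")
    return reasons


# Hypothetical response: an LLM-group participant who finished in 300 seconds.
fast_llm = Response("llm", True, True, 300, ["influenza"], ["a", "b"])
fast_llm_reasons = exclusion_reasons(fast_llm)
```

Returning all matching reasons, rather than stopping at the first, makes it possible to tabulate how many responses each criterion removes.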

Contact the study team to confirm eligibility.

Sponsors & Collaborators

Huazhong University of Science and Technology (lead)

Study Sites (1)

Tongji Medical College of Huazhong University of Science & Technology School of Medicine and Health Management

Wuhan, Hubei, China

Study Officials

Chenxi Liu
Huazhong University of Science and Technology
PRINCIPAL INVESTIGATOR

Study Design

Study Type: interventional
Phase: not applicable
Allocation: RANDOMIZED
Masking: SINGLE
Who Masked: PARTICIPANT
Purpose: HEALTH SERVICES RESEARCH
Intervention Model: PARALLEL
Sponsor Type: OTHER
Responsible Party: PRINCIPAL INVESTIGATOR
PI Title: Co-Investigator

Study Record Dates

First Submitted

November 17, 2025

First Posted

November 26, 2025

Study Start

April 27, 2025

Primary Completion

July 1, 2025

Study Completion

July 1, 2025

Last Updated

November 26, 2025

Record last verified: 2025-10

Data Sharing

IPD Sharing: Will not share

Locations

CN(1)
