AI-Driven Genotype Prediction Using EHR and Multimodal Data
Predicting Patient Genotypes Using Electronic Health Records and Multimodal Data Through AI-Based Models
Study type: observational
Target enrollment: 100,000 participants
Countries: 1
Sites: 4
Brief Summary
This study explores whether electronic health records (EHR) and multimodal data (such as imaging, lab results, and clinical history) can be used to predict a patient's genotype. It will evaluate whether predictive models built on these non-genetic data can accurately infer genetic information that traditionally requires direct genetic testing.
Study Timeline
Key milestones and dates
Study Start
First participant enrolled
July 1, 2023
First Submitted
Initial submission to the registry
January 19, 2025
First Posted
Study publicly available on registry
January 24, 2025
Primary Completion
Last participant's last visit for primary outcome
June 1, 2025
Study Completion
Last participant's last visit for all outcomes
June 1, 2025
Conditions
Keywords
Outcome Measures
Primary Outcomes (2)
Area Under the Curve (AUC)
Area under the receiver operating characteristic (ROC) curve, used to quantify how well the model discriminates between classes. Unitless, expressed as a number between 0 and 1.
1 year
F1 Score
The F1 score is the harmonic mean of precision and sensitivity (recall). It measures how well the model identifies true positives while limiting both false positives and false negatives, which is especially useful when the classes are imbalanced (e.g., when healthy cases far outnumber disease cases). The F1 score ranges from 0 to 1, with 1 indicating perfect precision and recall.
1 year
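The two primary outcomes can be illustrated with a minimal sketch. It assumes a binary label per patient (1 = carries the genotype of interest, 0 = does not) and a model score between 0 and 1; the function names, the toy data, and the 0.5 decision threshold are illustrative assumptions, not details taken from the study record.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data (hypothetical): 3 carriers, 5 non-carriers.
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.7, 0.3, 0.8, 0.4, 0.2, 0.1, 0.05]
    # Assumed threshold of 0.5 turns scores into hard predictions.
    predictions = [1 if s >= 0.5 else 0 for s in scores]
    print("AUC:", roc_auc(labels, scores))
    print("F1: ", f1_score(labels, predictions))
```

The pairwise AUC is quadratic in the number of cases, which is fine for illustration; a cohort of the size targeted here would normally use a sort-based implementation from an established library.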
Secondary Outcomes (2)
Sensitivity (True Positive Rate)
1 year
Specificity (True Negative Rate)
1 year
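The secondary outcomes follow directly from a confusion matrix that compares the model's prediction with the reference genetic test result. The sketch below assumes binary carrier status for both; the class and function names and the toy data are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ConfusionCounts:
    tp: int  # predicted carrier, reference carrier
    fp: int  # predicted carrier, reference non-carrier
    tn: int  # predicted non-carrier, reference non-carrier
    fn: int  # predicted non-carrier, reference carrier


def confusion_counts(reference: Sequence[int], predicted: Sequence[int]) -> ConfusionCounts:
    """Tally agreement between reference genotype status and predictions."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    counts = ConfusionCounts(tp=0, fp=0, tn=0, fn=0)
    for ref, pred in zip(reference, predicted):
        if ref == 1 and pred == 1:
            counts.tp += 1
        elif ref == 0 and pred == 1:
            counts.fp += 1
        elif ref == 0 and pred == 0:
            counts.tn += 1
        else:
            counts.fn += 1
    return counts


def sensitivity(c: ConfusionCounts) -> float:
    """True positive rate: share of reference carriers the model finds."""
    if c.tp + c.fn == 0:
        raise ValueError("no reference-positive cases")
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """True negative rate: share of reference non-carriers the model clears."""
    if c.tn + c.fp == 0:
        raise ValueError("no reference-negative cases")
    return c.tn / (c.tn + c.fp)


if __name__ == "__main__":
    reference = [1, 1, 1, 0, 0, 0, 0, 0]  # hypothetical genetic test results
    predicted = [1, 1, 0, 1, 0, 0, 0, 0]  # hypothetical model output
    c = confusion_counts(reference, predicted)
    print(c, sensitivity(c), specificity(c))
```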
Study Arms (1)
AI-Based Genotype Prediction Using EHR and Multimodal Data
This cohort consists of patients whose historical health data, including electronic health records (EHR), clinical lab results, and multimodal imaging data (such as X-rays, MRIs, and CT scans), will be analyzed by an AI-based prediction model to predict their genotype. There are no active interventions in this cohort, as the study aims to use non-genetic health data to infer genetic information. Participants will not undergo new genetic testing as part of the study; they will provide their existing health data for analysis by the AI system. The goal of this group is to assess the accuracy of the AI model in predicting genotypes and identifying genetic predispositions to various diseases based on available health data.
Interventions
The intervention in this study involves an AI-based predictive model designed to analyze and integrate patient electronic health records (EHR), clinical lab results, and multimodal imaging data (e.g., X-rays, MRIs, CT scans). The AI model is trained to predict a patient's genotype based on these non-genetic data sources. This model uses machine learning algorithms to detect patterns and infer genetic information that would traditionally require direct genetic testing. There are no active treatments or genetic tests involved in this intervention; rather, the AI system serves as a tool to predict genetic information from available clinical data, offering a non-invasive and potentially more accessible alternative to genetic testing.
Eligibility Criteria
The study population will be selected from multiple healthcare centers that maintain comprehensive electronic health records (EHR) and have access to multimodal clinical data, including lab results, medical imaging (e.g., X-rays, MRIs, CT scans), and medical history. Participants will be individuals with a variety of health conditions for which genotype information is relevant, although no specific genetic characteristics will be used for selection. The focus will be on utilizing the available health data to predict genetic information through the AI model. The study aims to evaluate the accuracy and utility of using non-genetic data, such as EHR and multimodal imaging, for predicting patient genotypes, which may provide an alternative approach to traditional genetic testing methods.
You may qualify if:
- Participants must have comprehensive electronic health records (EHR), including medical history, lab results, and relevant imaging data (e.g., X-rays, MRIs, CT scans).
- Participants must have existing genetic testing data available for comparison, if applicable.
- Participants must be willing to provide consent for the use of their health data in the study.
- Participants must have no active intervention related to genetic testing or prediction during the study period.
- Participants should have complete and verifiable health data to allow for accurate prediction by the AI model.
You may not qualify if:
- Participants without available EHR, lab results, or imaging data.
- Participants with ambiguous, inaccurate, or unverifiable genetic testing results that cannot be used for comparison.
- Patients with significant discrepancies or missing data that would prevent the AI model from making accurate predictions.
Contact the study team to confirm eligibility.
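The inclusion and exclusion criteria above can be read as a screening rule applied to each candidate record. The following is a rough sketch only: the `PatientRecord` fields and the `screen` function are hypothetical names, and the handling of "if applicable" (a patient with no genetic test on file is not excluded on that ground) is an assumption rather than something the record states.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PatientRecord:
    # All field names are illustrative, not taken from the study protocol.
    has_ehr: bool
    has_lab_results: bool
    has_imaging: bool
    consented: bool
    data_complete: bool                    # no major gaps or discrepancies
    active_genetic_intervention: bool      # ongoing genetic testing/prediction
    genetic_result_usable: Optional[bool]  # None = no genetic test on file


def screen(record: PatientRecord) -> List[str]:
    """Return the reasons a record fails screening; empty list = eligible."""
    reasons: List[str] = []
    if not (record.has_ehr and record.has_lab_results and record.has_imaging):
        reasons.append("missing EHR, lab results, or imaging data")
    if not record.consented:
        reasons.append("no consent for use of health data")
    if record.active_genetic_intervention:
        reasons.append("active genetic testing or prediction intervention")
    if not record.data_complete:
        reasons.append("incomplete or inconsistent health data")
    # Assumption: only an existing but unusable result excludes a patient.
    if record.genetic_result_usable is False:
        reasons.append("ambiguous or unverifiable genetic testing result")
    return reasons


if __name__ == "__main__":
    candidate = PatientRecord(
        has_ehr=True, has_lab_results=True, has_imaging=False,
        consented=True, data_complete=True,
        active_genetic_intervention=False, genetic_result_usable=True,
    )
    print(screen(candidate))
```

Returning the list of reasons, rather than a single yes/no flag, makes it easier to audit why records were dropped from a retrospective cohort.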
Sponsors & Collaborators
Study Sites (4)
Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University
Guangzhou, Guangdong, China
Sun Yat-sen University Cancer Hospital
Guangzhou, Guangdong, China
First Affiliated Hospital of Wenzhou Medical University
Wenzhou, Zhejiang, China
Second Affiliated Hospital of Wenzhou Medical University
Wenzhou, Zhejiang, China
Central Study Contacts
Study Design
- Study Type: Observational
- Observational Model: Case-Only
- Time Perspective: Retrospective
- Sponsor Type: Other
- Responsible Party: Principal Investigator
- PI Title: Chief Scientist
Study Record Dates
First Submitted
January 19, 2025
First Posted
January 24, 2025
Study Start
July 1, 2023
Primary Completion
June 1, 2025
Study Completion
June 1, 2025
Last Updated
April 17, 2025
Record last verified: 2025-04
Data Sharing
- IPD Sharing: Will not share