Prediction of Infectious Diseases in LMICs Using Electronic Health Record Data
DiGi
1 other identifier
observational
1,000
1 country
1
Brief Summary
Dengue is a rapidly emerging infectious disease in South and Southeast Asia. Definitive diagnosis requires laboratory testing (PCR or antigen testing), which is often unavailable in the settings with the highest incidence. Correctly identifying patients who have dengue, and the small number of those patients who will progress to severe disease, is important to ensure prompt institution of appropriate treatment. Existing models use a combination of clinical and laboratory features. A model developed and tested on data from 397 patients admitted to the Hospital for Tropical Diseases in Bangkok in 2013-2014 used Bayesian modelling of laboratory variables (liver and full blood count) and clinical symptoms (including fever, petechiae, and bleeding) to distinguish dengue from other febrile illnesses. The resulting model had an AUC of 0.75, which improved to 0.8 when NS1 was included. Sequential Organ Failure Assessment (SOFA) scores, or modified versions of them, use vital sign and blood test (liver, renal and haematology) data and are good indicators of which patients are likely to die. However, they perform less well in moderately severe disease (e.g. predicting the need for ICU admission). These approaches are promising but are constrained by limited generalizability and by their reliance on multiple blood tests and clinical symptoms. A low-cost, easy-to-use tool able to rapidly diagnose dengue and predict disease severity would be of great value in the region. With modern machine learning methods this is now feasible, and previously identified barriers, such as the requirement for large amounts of training data, can now be overcome. For example, models can be created from large datasets and then optimized for smaller, different datasets (data from other locations or conditions, or with less input data). We have previously shown that data-driven machine learning algorithms can generalize across multiple United Kingdom (UK) National Health Service (NHS) Trusts (for predicting COVID-19).
Although the model was initially trained on data from over 77,000 patients, we created a version requiring only vital sign data and a bedside blood count that can predict COVID-19 diagnosis in patients presenting at UK hospitals. We have demonstrated that this model can be adapted to a lower middle-income country (LMIC) setting using data from two Vietnamese hospitals. The adapted models achieved AUROCs of around 0.75 and AUPRCs of around 0.89 (similar to UK sites, where much larger amounts of data were available). Performing "transfer learning", whereby a small subset of UK data was used to support model development in Vietnam, improved performance by 5-10%. We also found that using statistical methods to address missing values can further improve predictive performance by 2-5%. This machine learning model can also function as a "baseline model" and be adapted for a new task, in this case dengue.
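The summary reports model performance as AUC/AUROC values (0.75, 0.8). As a reading aid, the following is a minimal sketch of how an AUROC can be computed from predicted scores, using the pairwise (Mann-Whitney) definition. It is not the study's evaluation code; the function name `auroc` and the toy labels and scores are invented for illustration only.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a randomly chosen positive case is scored higher
    than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up example: 1 = dengue, 0 = other febrile illness.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
toy_auroc = auroc(toy_labels, toy_scores)
print(f"AUROC on toy data: {toy_auroc:.3f}")
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the values quoted above should be read.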
Trial Health
Trial Health Score
Automated assessment based on enrollment pace, timeline, and geographic reach
participants targeted
Target at P75+ for all trials
Started Nov 2024
Shorter than P25 for all trials
1 active site
Health score is calculated from publicly available data and should be used for screening purposes only.
Study Timeline
Key milestones and dates
Study Start
First participant enrolled
November 14, 2024
Completed
Primary Completion
Last participant's last visit for primary outcome
September 15, 2025
Completed
Study Completion
Last participant's last visit for all outcomes
September 15, 2025
Completed
First Submitted
Initial submission to the registry
November 15, 2025
Completed
First Posted
Study publicly available on registry
February 25, 2026
Completed
February 25, 2026
February 1, 2026
10 months
November 15, 2025
February 24, 2026
Conditions
Outcome Measures
Primary Outcomes (2)
Differentiate dengue from unspecified causes of acute febrile illness
To create AI models able to differentiate dengue from unspecified causes of acute febrile illness in terms of clinical diagnosis and characteristics
At baseline (time of initial clinical presentation)
Prediction of severe dengue
To predict the development of severe dengue using routinely available clinical data
At baseline (time of initial clinical presentation)
Study Arms (1)
Records of patients diagnosed with dengue and non-dengue infections
Medical records from 1 January 2016 to 30 September 2024
Interventions
No intervention
Eligibility Criteria
All anonymized medical records of adult inpatients and outpatients (aged ≥18 years) visiting the Hospital for Tropical Diseases in Bangkok between January 1, 2016, and September 30, 2024, will be included in this study.
You may qualify if:
- Dengue-related ICD codes: A90-94, A910, A911, A919, A970-972, A979
- Non-dengue ICD codes: R78.81, A79.9, A27, B34.9, A49.9
You may not qualify if:
- Medical records with significant missing values, as determined by the Principal Investigators (PIs) and co-investigators.
- Records of patients diagnosed with mixed infections (causative agents ≥ 2)
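The criteria above can be read as a record-level filter. The following is a minimal sketch of such a filter, not the investigators' actual pipeline. The function name `classify_record` and its parameters are hypothetical. Three points are assumptions: the range "A90-94" is expanded to the prefixes A90 through A94, "A970-972" to A970 through A972, and a record carrying both a dengue and a non-dengue code is treated as a mixed infection. The missing-value exclusion is left out because the record defers it to the investigators' judgement.

```python
from datetime import date
from typing import Iterable, Optional

# Codes copied from the eligibility criteria, with dots removed so that
# "R78.81" and "R7881" compare equal. Expanding the ranges "A90-94" and
# "A970-972" into individual prefixes is an ASSUMPTION.
DENGUE_PREFIXES = ("A90", "A91", "A92", "A93", "A94",
                   "A910", "A911", "A919",
                   "A970", "A971", "A972", "A979")
NON_DENGUE_PREFIXES = ("R7881", "A799", "A27", "B349", "A499")

WINDOW_START = date(2016, 1, 1)
WINDOW_END = date(2024, 9, 30)


def normalize(code: str) -> str:
    """Uppercase an ICD code and drop the dot, e.g. 'a91.0' -> 'A910'."""
    return code.strip().upper().replace(".", "")


def classify_record(age: int, visit_date: date,
                    icd_codes: Iterable[str]) -> Optional[str]:
    """Return 'dengue', 'non-dengue', or None if the record is not eligible."""
    if age < 18 or not (WINDOW_START <= visit_date <= WINDOW_END):
        return None
    codes = [normalize(c) for c in icd_codes]
    has_dengue = any(c.startswith(DENGUE_PREFIXES) for c in codes)
    has_other = any(c.startswith(NON_DENGUE_PREFIXES) for c in codes)
    if has_dengue and has_other:
        return None  # ASSUMPTION: treated as a mixed infection and excluded
    if has_dengue:
        return "dengue"
    if has_other:
        return "non-dengue"
    return None  # no code from either list


# Hypothetical records, for illustration only.
print(classify_record(34, date(2019, 7, 2), ["A91.0"]))         # dengue
print(classify_record(52, date(2021, 3, 15), ["B34.9"]))        # non-dengue
print(classify_record(40, date(2020, 1, 1), ["A97.1", "A27"]))  # None
```

Prefix matching is used so that a more specific subcode still matches its listed parent code. Whether that reflects the study's intended matching rule is not stated in the record.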
Sponsors & Collaborators
Study Sites (1)
Hospital for Tropical Diseases, Faculty of Tropical Medicine
Bangkok, 10400, Thailand
Related Publications (7)
Yang J, Dung NT, Thach PN, Phong NT, Phu VD, Phu KD, Yen LM, Thy DBX, Soltan AAS, Thwaites L, Clifton DA. Generalizability assessment of AI models across hospitals in a low-middle and high income country. Nat Commun. 2024 Sep 27;15(1):8270. doi: 10.1038/s41467-024-52618-6.
PMID: 39333515 (RESULT)
Yang J, Clifton L, Dung NT, Phong NT, Yen LM, Thy DBX, Soltan AAS, Thwaites L, Clifton DA. Mitigating machine learning bias between high income and low-middle income countries for enhanced model fairness and generalizability. Sci Rep. 2024 Jun 10;14(1):13318. doi: 10.1038/s41598-024-64210-5.
PMID: 38858466 (RESULT)
Soltan AAS, Yang J, Pattanshetty R, Novak A, Yang Y, Rohanian O, Beer S, Soltan MA, Thickett DR, Fairhead R, Zhu T, Eyre DW, Clifton DA; CURIAL Translational Collaborative. Real-world evaluation of rapid and laboratory-free COVID-19 triage for emergency care: external validation and pilot deployment of artificial intelligence driven screening. Lancet Digit Health. 2022 Apr;4(4):e266-e278. doi: 10.1016/S2589-7500(21)00272-7. Epub 2022 Mar 9.
PMID: 35279399 (RESULT)
Yang J, Soltan AAS, Clifton DA. Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening. NPJ Digit Med. 2022 Jun 7;5(1):69. doi: 10.1038/s41746-022-00614-9.
PMID: 35672368 (RESULT)
McBride A, Vuong NL, Van Hao N, Huy NQ, Chanh HQ, Chau NTX, Nguyet NM, Ming DK, Ngoc NT, Nhat PTH, Phong NT, Tai LTH, Tho PV, Trung DT, Tam DTH, Trieu HT, Geskus RB, Llewelyn MJ, Thwaites CL, Yacoub S. A modified Sequential Organ Failure Assessment score for dengue: development, evaluation and proposal for use in clinical trials. BMC Infect Dis. 2022 Sep 3;22(1):722. doi: 10.1186/s12879-022-07705-8.
PMID: 36057771 (RESULT)
Luvira V, Silachamroon U, Piyaphanee W, Lawpoolsri S, Chierakul W, Leaungwutiwong P, Thawornkuno C, Wattanagoon Y. Etiologies of Acute Undifferentiated Febrile Illness in Bangkok, Thailand. Am J Trop Med Hyg. 2019 Mar;100(3):622-629. doi: 10.4269/ajtmh.18-0407.
PMID: 30628565 (RESULT)
Sa-Ngamuang C, Haddawy P, Luvira V, Piyaphanee W, Iamsirithaworn S, Lawpoolsri S. Accuracy of dengue clinical diagnosis with and without NS1 antigen rapid test: Comparison between human and Bayesian network model decision. PLoS Negl Trop Dis. 2018 Jun 18;12(6):e0006573. doi: 10.1371/journal.pntd.0006573. eCollection 2018 Jun.
PMID: 29912875 (RESULT)
MeSH Terms
Conditions
Condition Hierarchy (Ancestors)
Study Design
- Study Type: observational
- Observational Model: OTHER
- Time Perspective: RETROSPECTIVE
- Sponsor Type: OTHER
- Responsible Party: SPONSOR
Study Record Dates
First Submitted
November 15, 2025
First Posted
February 25, 2026
Study Start
November 14, 2024
Primary Completion
September 15, 2025
Study Completion
September 15, 2025
Last Updated
February 25, 2026
Record last verified: 2026-02