NCT04574882

Brief Summary

This project seeks to identify and characterize features derived from digital data (e.g., social media, online search, mobile media) that are associated with coronary heart disease (CHD) and related risk factors, and to develop models that combine digital data with conventional predictive models to predict CHD risk and health care utilization.

Trial Health

87
On Track

Trial Health Score

Automated assessment based on enrollment pace, timeline, and geographic reach

Enrollment
781

participants targeted

Enrollment target at or above the 75th percentile (P75) of all trials

Timeline
Completed

Started Sep 2020

Duration longer than the 75th percentile (P75) of all trials

Geographic Reach
1 country

1 active site

Status
Completed

Health score is calculated from publicly available data and should be used for screening purposes only.

Study Timeline

Key milestones and dates

Study Start

First participant enrolled

September 25, 2020

Completed
3 days until next milestone

First Submitted

Initial submission to the registry

September 28, 2020

Completed
7 days until next milestone

First Posted

Study publicly available on registry

October 5, 2020

Completed
4.7 years until next milestone

Primary Completion

Last participant's last visit for primary outcome

May 30, 2025

Completed
2 days until next milestone

Study Completion

Last participant's last visit for all outcomes

June 1, 2025

Completed
5 months until next milestone

Results Posted

Study results publicly available

November 4, 2025

Completed

Last Updated

November 4, 2025

Status Verified

October 1, 2025

Enrollment Period

4.7 years

First QC Date

September 28, 2020

Results QC Date

August 4, 2025

Last Update Submit

October 8, 2025
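
The intervals quoted in the timeline above can be reproduced with simple date arithmetic. The following is a minimal sketch using only the dates listed in this record; treating the "Enrollment Period" as the span from study start to primary completion, and converting days to years with 365.25 days per year, are assumptions about how the page derives its figures.

```python
from datetime import date

# Milestone dates copied from the timeline in this record.
milestones = [
    ("Study Start", date(2020, 9, 25)),
    ("First Submitted", date(2020, 9, 28)),
    ("First Posted", date(2020, 10, 5)),
    ("Primary Completion", date(2025, 5, 30)),
    ("Study Completion", date(2025, 6, 1)),
    ("Results Posted", date(2025, 11, 4)),
]

# Days between each milestone and the next one.
gap_days = [
    (later_date - earlier_date).days
    for (_, earlier_date), (_, later_date) in zip(milestones, milestones[1:])
]

for (name, _), days in zip(milestones, gap_days):
    print(f"{name}: {days} days until next milestone")

# Assumption: enrollment period = study start through primary completion.
enrollment_days = (milestones[3][1] - milestones[0][1]).days
enrollment_years = round(enrollment_days / 365.25, 1)
print(f"Enrollment period: {enrollment_years} years")
```

Under these assumptions the short gaps match the day counts shown above, and the enrollment period rounds to the listed 4.7 years.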

Conditions

Keywords

digital data, digital health

Outcome Measures

Primary Outcomes (1)

  • Latent Dirichlet Allocation (LDA) Topics - Topics / Themes Discussed Between Patients With and Without Heart Disease

    The primary outcome is the set of topics and features derived from participants' language data using Latent Dirichlet Allocation (LDA), a method for clustering language data. For each participant, we included all available Facebook wall posts from the start of their account history through data collection, regardless of whether they occurred before or after a CHD diagnosis. Three types of linguistic features were examined: unigrams, Linguistic Inquiry and Word Count (LIWC) categories, and LDA topics. LDA, a systematic method to identify text-based themes, was applied to generate 200 clusters of co-occurring words ("topics"). Each language-derived feature was encoded as a normalized frequency count per user to enable consistent comparison across participants. For each feature type, we fit separate logistic regression models and calculated Pearson correlation coefficients to assess its association with, and predictive value for, cardiovascular case status (CHD presence vs absence).

    Through study completion, an average of 3 years
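
The analysis described above (per-user normalized frequencies, then a Pearson correlation and a logistic regression against case status) can be illustrated with a minimal sketch. This is not the study's pipeline, which the record does not publish: the toy posts, the two-word vocabulary, and the helper names `normalized_frequencies`, `pearson`, and `logistic_slope` are all hypothetical, and the LDA topic-fitting step is omitted. Under the same assumption, an LDA topic feature would be a per-user topic proportion passed through the same two statistics.

```python
import math
from collections import Counter

def normalized_frequencies(posts, vocabulary):
    """Relative frequency of each vocabulary word across one user's posts."""
    tokens = [tok for post in posts for tok in post.lower().split()]
    counts = Counter(tokens)
    total = len(tokens) or 1  # guard against users with no text
    return [counts[word] / total for word in vocabulary]

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def logistic_slope(xs, ys, learning_rate=0.5, epochs=500):
    """Fit a one-feature logistic regression by batch gradient descent.

    The feature is standardized first, so the returned slope is the change
    in log-odds per standard deviation of the feature.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    if std_x == 0:
        return 0.0
    zs = [(x - mean_x) / std_x for x in xs]
    slope, intercept = 0.0, 0.0
    for _ in range(epochs):
        preds = [1 / (1 + math.exp(-(intercept + slope * z))) for z in zs]
        grad_slope = sum((p - y) * z for p, y, z in zip(preds, ys, zs)) / n
        grad_intercept = sum(p - y for p, y in zip(preds, ys)) / n
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope

# Hypothetical toy data: (posts, case_status), 1 = CHD, 0 = no CHD.
users = [
    (["chest pain again today", "doctor visit for chest pain"], 1),
    (["hospital visit and more pain"], 1),
    (["great game last night", "pizza with friends"], 0),
    (["new job and a great day"], 0),
]
vocabulary = ["pain", "great"]
labels = [status for _, status in users]
features = [normalized_frequencies(posts, vocabulary) for posts, _ in users]

# One correlation and one separate logistic model per feature.
results = {}
for column, word in enumerate(vocabulary):
    values = [row[column] for row in features]
    results[word] = (pearson(values, labels), logistic_slope(values, labels))
    print(f"{word}: r={results[word][0]:+.2f}, slope={results[word][1]:+.2f}")
```

Normalizing by each user's total token count keeps heavy and light posters comparable, which is the stated purpose of the per-user encoding. With a toy sample this small the classes are perfectly separable, so the slope is only meaningful by its sign here.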

Other Outcomes (2)

  • CHD Event

    Through study completion, an average of 3 years

  • Health Care Utilization

    Through study completion, an average of 3 years

Study Arms (2)

Case

Patients ages 30-74 with and without CHD (ICD 10: I63, I20-I25) within the last 5 years.

Other: Survey

Control

Patients aged 30-74 who have non-cardiovascular-related history

Other: Survey

Interventions

Other: Survey

Interested participants may complete the informed consent online. After providing consent, participants are asked to share the digital data types that they use (Facebook, Instagram, Twitter, Google search, step data) and then complete a cross-sectional survey.

Case, Control

Eligibility Criteria

Age
30 Years - 74 Years
Sex
All
Healthy Volunteers
Yes
Age Groups
Adult (18-64), Older Adult (65+)
Sampling Method
Non-Probability Sample
Study Population

We will identify patients ages 30-74 with and without CHD (ICD 9: 414.0; ICD 10: I63, I20-I25).

You may qualify if:

  • 30-74 years of age
  • Willing to sign informed consent
  • Primarily English speaking (for language analysis)
  • Has an account on any of the following digital data platforms (Facebook, Instagram, Twitter, Reddit, or Google (Gmail)) or a smartphone or wearable device (such as Apple Health, Fitbit, Samsung Health, MapMyFitness, or Garmin), and is willing to share data
  • If has a social media account (Instagram or Facebook), willing to share historical and prospective data (60 days)
  • If has a Google (Gmail) account, willing to download and share a Google Takeout zip file
  • If has smartphone or wearable device, willing to share step data
  • Willing to share access to medical health records
  • Willing to share healthcare insurance information

You may not qualify if:

  • Does not use and post on digital data sources we are studying or unwilling to donate data
  • Patient is in severe distress, e.g. respiratory, physical, or emotional distress
  • Patient is intoxicated, unconscious, or unable to appropriately respond to questions

Contact the study team to confirm eligibility.

Sponsors & Collaborators

Study Sites (1)

University of Pennsylvania Health System

Philadelphia, Pennsylvania, 19101, United States

MeSH Terms

Conditions

Cardiovascular Diseases

Interventions

Surveys and Questionnaires

Intervention Hierarchy (Ancestors)

Data Collection; Epidemiologic Methods; Investigative Techniques; Health Care Evaluation Mechanisms; Quality of Health Care; Health Care Quality, Access, and Evaluation; Public Health; Environment and Public Health

Results Point of Contact

Title
Director of Research
Organization
University of Pennsylvania

Publication Agreements

PI is Sponsor Employee
No
Restrictive Agreement
No

Study Design

Study Type
Observational
Observational Model
Case-Control
Time Perspective
Cross-Sectional
Sponsor Type
Other
Responsible Party
Sponsor

Study Record Dates

First Submitted

September 28, 2020

First Posted

October 5, 2020

Study Start

September 25, 2020

Primary Completion

May 30, 2025

Study Completion

June 1, 2025

Last Updated

November 4, 2025

Results First Posted

November 4, 2025

Record last verified: 2025-10
