SleepSynth: Evaluating the use of Synthetic Data in Health Digital Twins

Sandra Kumi, Maxwell Hilton, Charles Snow, Richard K. Lomotey, Ralph Deters

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Scopus citations

Abstract

Health Digital Twins (HDTs) are virtual replicas of a patient's physical/actual data. The major setbacks for applying Machine Learning (ML) in HDTs are the limited availability of patient data due to privacy concerns and Artificial Intelligence (AI) bias. Given these shortcomings, synthetic data has been leveraged to address privacy issues and increase diversity in datasets. In this paper, we evaluate four synthetic data generation models, namely Gaussian Copula, Conditional Tabular Generative Adversarial Network (CTGAN), CopulaGAN, and Tabular Variational Autoencoder (TVAE), which are used to generate synthetic data from actual sleep data retrieved from a wearable device. Gaussian Copula performed best at capturing the correlations between variables in the real data, with a quality score of approximately 96%. Additionally, we evaluate the efficacy of the synthetic data generation models by training five well-known ML models on the generated synthetic data. Our experimental results show that the ML models trained on the synthetic data achieve a Mean Absolute Error (MAE) of less than 10% in the prediction of sleep quality score. The results from this work indicate that synthetic data could be used for ML tasks while preserving the privacy of data subjects.
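The workflow described in the abstract (fit a generator on real data, sample synthetic rows, train a model on the synthetic rows, and score it on the real rows with MAE) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the two columns (`hours`, `score`) are made-up stand-ins for the wearable sleep data, the two-variable Gaussian copula is a hand-rolled simplification of the Gaussian Copula model named above, and a least-squares line stands in for the five ML models.

```python
import math
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()  # standard normal, used for CDF and inverse CDF


def to_normal_scores(column):
    """Map a column to standard-normal scores via its empirical ranks."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        scores[i] = STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def sample_gaussian_copula(x, y, n_samples, rng):
    """Draw synthetic (x, y) pairs from a two-variable Gaussian copula."""
    rho = correlation(to_normal_scores(x), to_normal_scores(y))
    sorted_x, sorted_y = sorted(x), sorted(y)

    def quantile(sorted_col, u):
        # Empirical quantile: returns one of the observed values.
        return sorted_col[min(int(u * len(sorted_col)), len(sorted_col) - 1)]

    rows = []
    for _ in range(n_samples):
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(max(0.0, 1 - rho ** 2)) * rng.gauss(0, 1)
        rows.append((quantile(sorted_x, STD_NORMAL.cdf(z1)),
                     quantile(sorted_y, STD_NORMAL.cdf(z2))))
    return rows


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def mean_absolute_error(y_true, y_pred):
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])


# Hypothetical "real" data: hours slept and a sleep quality score.
rng = random.Random(0)
hours = [rng.uniform(4, 9) for _ in range(300)]
score = [10 * h + rng.gauss(0, 5) for h in hours]

# Train on synthetic rows, then evaluate on the real rows.
synthetic = sample_gaussian_copula(hours, score, 300, rng)
syn_hours, syn_score = zip(*synthetic)
slope, intercept = fit_line(syn_hours, syn_score)
mae = mean_absolute_error(score, [slope * h + intercept for h in hours])
print(f"MAE on real data for a model trained on synthetic data: {mae:.2f}")
```

Note that this sketch maps samples back through empirical quantiles, so each synthetic value is a copy of an observed value in its column. A generator intended to protect privacy would typically fit parametric marginal distributions instead, which is one reason the sketch should not be read as a privacy-preserving implementation.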

Original language: English (US)
Title of host publication: Proceedings - 2023 IEEE International Conference on Digital Health, ICDH 2023
Editors: Carl K. Chang, Rong N. Chang, Jing Fan, Geoffrey C. Fox, Zhi Jin, Graziano Pravadelli, Hossain Shahriar
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 121-130
Number of pages: 10
ISBN (Electronic): 9798350341034
State: Published - 2023
Event: 2023 IEEE International Conference on Digital Health, ICDH 2023 - Hybrid, Chicago, United States
Duration: Jul 2 2023 to Jul 8 2023

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Health Informatics