TY - GEN
T1 - Synthetic Data Digital Twins and Data Trusts Control for Privacy in Health Data Sharing
AU - Lomotey, Richard K.
AU - Kumi, Sandra
AU - Ray, Madhurima
AU - Deters, Ralph
N1 - Publisher Copyright: © 2024 ACM.
PY - 2024/6/21
Y1 - 2024/6/21
AB - Health data sharing is highly valuable for medical research because it can improve diagnostics, policy, medication, and more. At the same time, health data must be shared without compromising the privacy of patients and stakeholders. However, recent advances in AI/ML and sophisticated analytics have been shown to introduce biases that can easily identify patients from their healthcare data, which violates privacy. In this work, we sought to address this major issue by exploring two emerging topics that are gaining attention from industry, academia, and governments: digital twins and data trusts. First, we propose the use of digital twins (DTs) to generate synthetic records of patients' heart rate data. The DTs are virtual replicas of the actual data and were created using two synthetic data generative models: Gaussian Copula (GC) and Tabular Variational Autoencoder (TVAE). The GC and TVAE achieved maximum data quality scores of 88% and 96%, respectively. Next, we posit that the DTs should be shared with a data trusts layer. Data trusts are fiduciary frameworks that govern multi-party data sharing. The data trusts enforce access controls (based on criteria such as location, role, and policy) on the synthetic health data and report to the data subject. Preliminary evaluations show that merging the two techniques (i.e., synthetic data digital twins and data trusts) enforces better privacy for health data access. The synthetic data ensures greater anonymization, while the data trusts provide easy auditing, tracking, and efficient reporting to the patient or data subject. The paper also details the architectural design of the data trusts and evaluates the efficiency of the access control techniques.
UR - http://www.scopus.com/inward/record.url?scp=85197247137&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85197247137&partnerID=8YFLogxK
U2 - 10.1145/3643650.3658605
DO - 10.1145/3643650.3658605
M3 - Conference contribution
AN - SCOPUS:85197247137
T3 - SaT-CPS 2024 - Proceedings of the 2024 ACM Workshop on Secure and Trustworthy Cyber-Physical Systems
SP - 1
EP - 10
BT - SaT-CPS 2024 - Proceedings of the 2024 ACM Workshop on Secure and Trustworthy Cyber-Physical Systems
PB - Association for Computing Machinery, Inc
T2 - 4th ACM Workshop on Secure and Trustworthy Cyber-Physical Systems, SaT-CPS 2024, held in conjunction with the 14th ACM Conference on Data and Application Security and Privacy, CODASPY 2024
Y2 - 21 June 2024
ER -