TY - GEN
T1 - PRIVACY SENSITIVE SPEECH ANALYSIS USING FEDERATED LEARNING TO ASSESS DEPRESSION
AU - Suhas, B. N.
AU - Abdullah, Saeed
N1 - Publisher Copyright:
© 2022 IEEE
PY - 2022
Y1 - 2022
N2 - Recent studies have used speech signals to assess depression. However, speech features can lead to serious privacy concerns. To address these concerns, prior work has used privacy-preserving speech features. However, using a subset of features can lead to information loss and, consequently, non-optimal model performance. Furthermore, prior work relies on a centralized approach to support continuous model updates, posing privacy risks. This paper proposes to use Federated Learning (FL) to enable decentralized, privacy-preserving speech analysis to assess depression. Using an existing dataset (DAIC-WOZ), we show that FL models enable a robust assessment of depression with only 4-6% accuracy loss compared to a centralized approach. These models also outperform prior work using the same dataset. Furthermore, the FL models have short inference latency and small memory footprints while being energy-efficient. These models, thus, can be deployed on mobile devices for real-time, continuous, and privacy-preserving depression assessment at scale.
AB - Recent studies have used speech signals to assess depression. However, speech features can lead to serious privacy concerns. To address these concerns, prior work has used privacy-preserving speech features. However, using a subset of features can lead to information loss and, consequently, non-optimal model performance. Furthermore, prior work relies on a centralized approach to support continuous model updates, posing privacy risks. This paper proposes to use Federated Learning (FL) to enable decentralized, privacy-preserving speech analysis to assess depression. Using an existing dataset (DAIC-WOZ), we show that FL models enable a robust assessment of depression with only 4-6% accuracy loss compared to a centralized approach. These models also outperform prior work using the same dataset. Furthermore, the FL models have short inference latency and small memory footprints while being energy-efficient. These models, thus, can be deployed on mobile devices for real-time, continuous, and privacy-preserving depression assessment at scale.
UR - http://www.scopus.com/inward/record.url?scp=85131267390&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85131267390&partnerID=8YFLogxK
U2 - 10.1109/ICASSP43922.2022.9746827
DO - 10.1109/ICASSP43922.2022.9746827
M3 - Conference contribution
AN - SCOPUS:85131267390
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 6272
EP - 6276
BT - 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Y2 - 23 May 2022 through 27 May 2022
ER -