TY - GEN
T1 - Influenza-like symptom prediction by analyzing self-reported health status and human mobility behaviors
AU - Ma, Fenglong
AU - Zhong, Shiran
AU - Gao, Jing
AU - Bian, Ling
PY - 2019/9/4
Y1 - 2019/9/4
N2 - Human mobility behaviors are of great importance for predicting influenza-like symptoms. However, most existing studies focus on analyzing population-level outcomes rather than individual-level ones. One challenge for individual-level influenza symptom prediction is the shortage of sufficiently large datasets that contain both individual health status and mobility behavior information. In addition, the quality of the collected data is not high enough, owing to careless reporting and the low variation of reporting behaviors. Moreover, the number of individuals with influenza symptom onset is much smaller than the number without symptoms, i.e., the imbalanced data problem. These challenges further increase the difficulty of accurately predicting influenza-like symptoms. To address these challenges, in this paper we propose a novel and powerful selective ensemble support vector machines (SESVM) model. The proposed SESVM selects the best basic SVM classifier by running on randomly split sub-training sets, each consisting of the positive samples and a split of the negative ones. By randomly splitting the dataset multiple times, we obtain many predictions from the best basic SVM classifiers. SESVM finally aggregates all the predictions to produce the final results. We conduct experiments on a new longitudinal, individual, self-reported weekly survey dataset with mobility behaviors, and the results show that the proposed SESVM outperforms all the existing approaches for the influenza symptom prediction task.
AB - Human mobility behaviors are of great importance for predicting influenza-like symptoms. However, most existing studies focus on analyzing population-level outcomes rather than individual-level ones. One challenge for individual-level influenza symptom prediction is the shortage of sufficiently large datasets that contain both individual health status and mobility behavior information. In addition, the quality of the collected data is not high enough, owing to careless reporting and the low variation of reporting behaviors. Moreover, the number of individuals with influenza symptom onset is much smaller than the number without symptoms, i.e., the imbalanced data problem. These challenges further increase the difficulty of accurately predicting influenza-like symptoms. To address these challenges, in this paper we propose a novel and powerful selective ensemble support vector machines (SESVM) model. The proposed SESVM selects the best basic SVM classifier by running on randomly split sub-training sets, each consisting of the positive samples and a split of the negative ones. By randomly splitting the dataset multiple times, we obtain many predictions from the best basic SVM classifiers. SESVM finally aggregates all the predictions to produce the final results. We conduct experiments on a new longitudinal, individual, self-reported weekly survey dataset with mobility behaviors, and the results show that the proposed SESVM outperforms all the existing approaches for the influenza symptom prediction task.
UR - http://www.scopus.com/inward/record.url?scp=85073150830&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073150830&partnerID=8YFLogxK
U2 - 10.1145/3307339.3342141
DO - 10.1145/3307339.3342141
M3 - Conference contribution
T3 - ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
SP - 233
EP - 242
BT - ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
PB - Association for Computing Machinery, Inc
T2 - 10th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB 2019
Y2 - 7 September 2019 through 10 September 2019
ER -