TY - GEN
T1 - Unsupervised discovery of drug side-effects from heterogeneous data sources
AU - Ma, Fenglong
AU - Meng, Chuishi
AU - Xiao, Houping
AU - Li, Qi
AU - Gao, Jing
AU - Su, Lu
AU - Zhang, Aidong
N1 - Publisher Copyright:
© 2017 ACM.
PY - 2017/8/13
Y1 - 2017/8/13
N2 - Drug side-effects become a worldwide public health concern, which are the fourth leading cause of death in the United States. Pharmaceutical industry has paid tremendous effort to identify drug side-effects during the drug development. However, it is impossible and impractical to identify all of them. Fortunately, drug side-effects can also be reported on heterogeneous platforms (i.e., data sources), such as FDA Adverse Event Reporting System and various online communities. However, existing supervised and semi-supervised approaches are not practical as annotating labels are expensive in the medical field. In this paper, we propose a novel and effective unsupervised model Sifter to automatically discover drug side-effects. Sifter enhances the estimation on drug side-effects by learning from various online platforms and measuring platform-level and user-level quality simultaneously. In this way, Sifter demonstrates better performance compared with existing approaches in terms of correctly identifying drug side-effects. Experimental results on five real-world datasets show that Sifter can significantly improve the performance of identifying side-effects compared with the state-of-the-art approaches.
AB - Drug side-effects become a worldwide public health concern, which are the fourth leading cause of death in the United States. Pharmaceutical industry has paid tremendous effort to identify drug side-effects during the drug development. However, it is impossible and impractical to identify all of them. Fortunately, drug side-effects can also be reported on heterogeneous platforms (i.e., data sources), such as FDA Adverse Event Reporting System and various online communities. However, existing supervised and semi-supervised approaches are not practical as annotating labels are expensive in the medical field. In this paper, we propose a novel and effective unsupervised model Sifter to automatically discover drug side-effects. Sifter enhances the estimation on drug side-effects by learning from various online platforms and measuring platform-level and user-level quality simultaneously. In this way, Sifter demonstrates better performance compared with existing approaches in terms of correctly identifying drug side-effects. Experimental results on five real-world datasets show that Sifter can significantly improve the performance of identifying side-effects compared with the state-of-the-art approaches.
UR - http://www.scopus.com/inward/record.url?scp=85029026084&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85029026084&partnerID=8YFLogxK
U2 - 10.1145/3097983.3098129
DO - 10.1145/3097983.3098129
M3 - Conference contribution
AN - SCOPUS:85029026084
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 967
EP - 976
BT - KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
T2 - 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017
Y2 - 13 August 2017 through 17 August 2017
ER -