Unsupervised discovery of drug side-effects from heterogeneous data sources

Fenglong Ma, Chuishi Meng, Houping Xiao, Qi Li, Jing Gao, Lu Su, Aidong Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

41 Scopus citations

Abstract

Drug side-effects have become a worldwide public health concern and are the fourth leading cause of death in the United States. The pharmaceutical industry has invested tremendous effort in identifying drug side-effects during drug development. However, it is impossible and impractical to identify all of them. Fortunately, drug side-effects are also reported on heterogeneous platforms (i.e., data sources), such as the FDA Adverse Event Reporting System and various online communities. However, existing supervised and semi-supervised approaches are not practical because annotating labels is expensive in the medical field. In this paper, we propose Sifter, a novel and effective unsupervised model that automatically discovers drug side-effects. Sifter enhances the estimation of drug side-effects by learning from various online platforms and measuring platform-level and user-level quality simultaneously. In this way, Sifter identifies drug side-effects more accurately than existing approaches. Experimental results on five real-world datasets show that Sifter significantly outperforms state-of-the-art approaches at identifying side-effects.

Original language: English (US)
Title of host publication: KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 967-976
Number of pages: 10
ISBN (Electronic): 9781450348874
State: Published - Aug 13 2017
Event: 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017 - Halifax, Canada
Duration: Aug 13 2017 → Aug 17 2017

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: Part F129685

Other

Other: 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017
Country/Territory: Canada
City: Halifax
Period: 8/13/17 → 8/17/17

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
