TY - GEN
T1 - Cross-network clustering and cluster ranking for medical diagnosis
AU - Ni, Jingchao
AU - Fei, Hongliang
AU - Fan, Wei
AU - Zhang, Xiang
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/5/16
Y1 - 2017/5/16
N2 - Automating medical diagnosis is an important data mining problem: inferring the likely disease(s) from a set of observed symptoms. Algorithms for this problem are valuable as a supplement to a real diagnosis. Existing diagnosis methods typically perform the inference on a sparse bipartite graph whose two sets of nodes represent diseases and symptoms, respectively. By using this graph, existing methods implicitly assume that no direct dependency exists between diseases (or between symptoms), which may not hold in reality. To address this limitation, we introduce two domain networks, one encoding similarities between diseases and one encoding similarities between symptoms, to avoid information loss and to alleviate the sparsity of the bipartite graph. Based on the domain networks and the bipartite graph bridging them, we develop a novel algorithm, CCCR, that performs diagnosis by ranking symptom-disease clusters. Compared with existing approaches, CCCR is more accurate and more interpretable, since its results deliver rich information about how the inferred diseases are categorized. Experimental results on real-life datasets demonstrate the effectiveness of the proposed method.
UR - http://www.scopus.com/inward/record.url?scp=85021251872&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85021251872&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2017.65
DO - 10.1109/ICDE.2017.65
M3 - Conference contribution
AN - SCOPUS:85021251872
T3 - Proceedings - International Conference on Data Engineering
SP - 163
EP - 166
BT - Proceedings - 2017 IEEE 33rd International Conference on Data Engineering, ICDE 2017
PB - IEEE Computer Society
T2 - 33rd IEEE International Conference on Data Engineering, ICDE 2017
Y2 - 19 April 2017 through 22 April 2017
ER -