TY - GEN
T1 - Deep co-clustering
AU - Xu, Dongkuan
AU - Cheng, Wei
AU - Zong, Bo
AU - Ni, Jingchao
AU - Song, Dongjin
AU - Yu, Wenchao
AU - Chen, Yuncong
AU - Chen, Haifeng
AU - Zhang, Xiang
N1 - Funding Information:
This work was partially supported by the National Science Foundation grant IIS-1707548.
Publisher Copyright:
Copyright © 2019 by SIAM.
PY - 2019
Y1 - 2019
N2 - Co-clustering partitions instances and features simultaneously by leveraging the duality between them and it often yields impressive performance improvement over traditional clustering algorithms. The recent development in learning deep representations has demonstrated the advantage in extracting effective features. However, the research on leveraging deep learning frameworks for co-clustering is limited for two reasons: 1) current deep clustering approaches usually decouple feature learning and cluster assignment as two separate steps, which cannot yield the task-specific feature representation; 2) existing deep clustering approaches cannot learn representations for instances and features simultaneously. In this paper, we propose a deep learning model for co-clustering called DeepCC. DeepCC utilizes the deep autoencoder for dimension reduction, and employs a variant of Gaussian Mixture Model (GMM) to infer the cluster assignments. A mutual information loss is proposed to bridge the training of instances and features. DeepCC jointly optimizes the parameters of the deep autoencoder and the mixture model in an end-to-end fashion on both the instance and the feature spaces, which can help the deep autoencoder escape from local optima and the mixture model circumvent the Expectation-Maximization (EM) algorithm. To the best of our knowledge, DeepCC is the first deep learning model for co-clustering. Experimental results on various datasets demonstrate the effectiveness of DeepCC.
UR - http://www.scopus.com/inward/record.url?scp=85066110843&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066110843&partnerID=8YFLogxK
U2 - 10.1137/1.9781611975673.47
DO - 10.1137/1.9781611975673.47
M3 - Conference contribution
AN - SCOPUS:85066110843
T3 - SIAM International Conference on Data Mining, SDM 2019
SP - 414
EP - 422
BT - SIAM International Conference on Data Mining, SDM 2019
PB - Society for Industrial and Applied Mathematics Publications
T2 - 19th SIAM International Conference on Data Mining, SDM 2019
Y2 - 2 May 2019 through 4 May 2019
ER -