The goal of network clustering algorithms is to detect dense clusters in a network, providing a first step towards understanding large-scale biological networks. With numerous recent advances in biotechnology, large-scale genetic interaction data are widely available, but there is limited understanding of which clustering algorithms are most effective. To address this problem, we conducted a systematic study to compare and evaluate six clustering algorithms for analyzing genetic interaction networks, and investigated the factors that influence the choice of algorithm. The algorithms considered in this comparison are hierarchical clustering, topological overlap matrix, bi-clustering, Markov clustering, Bayesian discriminant analysis based community detection, and the variational Bayes approach to modularity. Both experimentally identified and synthetically constructed networks were used in this comparison. The accuracy of each algorithm was measured by the Jaccard index, comparing predicted gene modules with benchmark gene sets. The results suggest that the best choice of algorithm depends on the network topology and the evaluation criteria. Hierarchical clustering proved best at predicting protein complexes, Bayesian discriminant analysis based community detection proved best on epistatic miniarray profile (EMAP) datasets, and the variational Bayes approach to modularity was noticeably better than the other algorithms on the genome-scale networks.
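To make the accuracy measure concrete, the following is a minimal sketch of how a Jaccard index between predicted gene modules and benchmark gene sets might be computed. The abstract does not describe the exact scoring procedure, so the best-match averaging used here, the function names, and the gene identifiers are all illustrative assumptions, not the study's implementation.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two gene collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here: two empty sets score 0
    return len(a & b) / len(union)


def best_match_scores(predicted, benchmarks):
    """For each predicted module, the highest Jaccard index against any benchmark set.

    Assumption: the best-match rule is one common way to score modules;
    the study may have aggregated scores differently.
    """
    return [max((jaccard(m, b) for b in benchmarks), default=0.0) for m in predicted]


# Hypothetical gene identifiers, for illustration only.
predicted = [{"g1", "g2", "g3"}, {"g4", "g5"}]
benchmarks = [{"g1", "g2", "g3", "g6"}, {"g5", "g7"}]

scores = best_match_scores(predicted, benchmarks)
mean_score = sum(scores) / len(scores)
print(scores, mean_score)
```

A score of 1 means a predicted module matches a benchmark set exactly, and 0 means it shares no genes with any benchmark set.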
Frontiers in Bioscience - Elite
Published - Jan 1 2012
All Science Journal Classification (ASJC) codes
- General Biochemistry, Genetics and Molecular Biology
- General Immunology and Microbiology