TY - GEN
T1 - Hybridization of K-means and harmony search methods for web page clustering
AU - Forsati, Rana
AU - Meybodi, Mohammad Reza
AU - Mahdavi, Mehrdad
AU - Neiat, Azadeh Ghari
PY - 2008
Y1 - 2008
N2 - Clustering is currently one of the most crucial techniques for dealing with the massive amounts of heterogeneous information on the web, which are beyond human capacity to digest. Recent studies have shown that the most commonly used partitioning-based clustering algorithm, the K-means algorithm, is more suitable for large datasets. However, the K-means algorithm can converge to a locally optimal solution. In this paper we present novel harmony search clustering algorithms for document clustering based on the harmony search optimization method. By modeling clustering as an optimization problem, we first propose a pure harmony search based clustering algorithm that finds near-globally optimal clusters within a reasonable time. In contrast to the localized search of the K-means algorithm, the harmony search clustering algorithm performs a globalized search over the entire solution space. Harmony clustering is then integrated with the K-means algorithm in three ways to achieve better clustering. The proposed algorithms improve the K-means algorithm by making it less dependent on initial parameters such as randomly chosen initial cluster centers, and hence more stable. In our experiments, we applied the proposed algorithms and the K-means clustering algorithm to five different document datasets. Experimental results reveal that the proposed algorithms can find better clusters than K-means, or clusters of comparable quality, and converge to the best known optimum faster than K-means.
AB - Clustering is currently one of the most crucial techniques for dealing with the massive amounts of heterogeneous information on the web, which are beyond human capacity to digest. Recent studies have shown that the most commonly used partitioning-based clustering algorithm, the K-means algorithm, is more suitable for large datasets. However, the K-means algorithm can converge to a locally optimal solution. In this paper we present novel harmony search clustering algorithms for document clustering based on the harmony search optimization method. By modeling clustering as an optimization problem, we first propose a pure harmony search based clustering algorithm that finds near-globally optimal clusters within a reasonable time. In contrast to the localized search of the K-means algorithm, the harmony search clustering algorithm performs a globalized search over the entire solution space. Harmony clustering is then integrated with the K-means algorithm in three ways to achieve better clustering. The proposed algorithms improve the K-means algorithm by making it less dependent on initial parameters such as randomly chosen initial cluster centers, and hence more stable. In our experiments, we applied the proposed algorithms and the K-means clustering algorithm to five different document datasets. Experimental results reveal that the proposed algorithms can find better clusters than K-means, or clusters of comparable quality, and converge to the best known optimum faster than K-means.
UR - http://www.scopus.com/inward/record.url?scp=62949153901&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=62949153901&partnerID=8YFLogxK
U2 - 10.1109/WIIAT.2008.370
DO - 10.1109/WIIAT.2008.370
M3 - Conference contribution
AN - SCOPUS:62949153901
SN - 9780769534961
T3 - Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008
SP - 329
EP - 335
BT - Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008
T2 - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008
Y2 - 9 December 2008 through 12 December 2008
ER -