TY - JOUR
T1 - COMMUNITY EXTRACTION OF NETWORK DATA UNDER STOCHASTIC BLOCK MODELS
AU - Yuan, Quan
AU - Liu, Binghui
AU - Li, Danning
AU - Ma, Yanyuan
N1 - Publisher Copyright:
© 2024 Institute of Statistical Science. All rights reserved.
PY - 2024
Y1 - 2024
N2 - Most existing community discovery methods focus on partitioning all nodes of the network into communities. However, many real networks contain background nodes that do not belong to any community. In such a situation, typical methods tend to artificially split the background nodes and group them together with communities with relatively stronger connection, hence lead to dis- torted results. To avoid this, some community extraction methods have been de- veloped to achieve community discovery with background nodes, which are based on searching algorithms, hence have dificulties in handling large-scale networks due to high computational complexity. To this end, in this paper we propose some algorithms with polynomial complexity to achieve community extraction of large-scale networks. We rigorously show that the proposed algorithms have attractive theoretical properties. In particular, the estimators of the community labels using the proposed algorithms reaches the asymptotic minimax risk under the community extraction model, a specific stochastic block model. Then, we illustrate the advantages and feasibility of the proposed algorithms via extensive simulated networks and a political blog network.
AB - Most existing community discovery methods focus on partitioning all nodes of the network into communities. However, many real networks contain background nodes that do not belong to any community. In such a situation, typical methods tend to artificially split the background nodes and group them together with communities with relatively stronger connection, hence lead to dis- torted results. To avoid this, some community extraction methods have been de- veloped to achieve community discovery with background nodes, which are based on searching algorithms, hence have dificulties in handling large-scale networks due to high computational complexity. To this end, in this paper we propose some algorithms with polynomial complexity to achieve community extraction of large-scale networks. We rigorously show that the proposed algorithms have attractive theoretical properties. In particular, the estimators of the community labels using the proposed algorithms reaches the asymptotic minimax risk under the community extraction model, a specific stochastic block model. Then, we illustrate the advantages and feasibility of the proposed algorithms via extensive simulated networks and a political blog network.
UR - http://www.scopus.com/inward/record.url?scp=85189340567&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85189340567&partnerID=8YFLogxK
U2 - 10.5705/ss.202022.0372
DO - 10.5705/ss.202022.0372
M3 - Article
AN - SCOPUS:85189340567
SN - 1017-0405
JO - Statistica Sinica
JF - Statistica Sinica
ER -