TY - GEN
T1 - Random Walk Based Fake Account Detection in Online Social Networks
AU - Jia, Jinyuan
AU - Wang, Binghui
AU - Gong, Neil Zhenqiang
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/8/30
Y1 - 2017/8/30
N2 - Online social networks are known to be vulnerable to the so-called Sybil attack, in which an attacker maintains massive fake accounts (also called Sybils) and uses them to perform various malicious activities. Therefore, Sybil detection is a fundamental security research problem in online social networks. Random walk based methods, which leverage the structure of an online social network to distribute reputation scores for users, have been demonstrated to be promising in certain real-world online social networks. In particular, random walk based methods have three desired features: they can have theoretically guaranteed performance for online social networks that have the fast-mixing property, they are accurate when the social network has strong homophily property, and they can be scalable to large-scale online social networks. However, existing random walk based methods suffer from several key limitations: 1) they can only leverage either labeled benign users or labeled Sybils, but not both, 2) they have limited detection accuracy for weak-homophily social networks, and 3) they are not robust to label noise in the training dataset. In this work, we propose a new random walk based Sybil detection method called SybilWalk. SybilWalk addresses the limitations of existing random walk based methods while maintaining their desired features. We perform both theoretical and empirical evaluations to compare SybilWalk with previous random walk based methods. Theoretically, for online social networks with the fast-mixing property, SybilWalk has a tighter asymptotical bound on the number of Sybils that are falsely accepted into the social network than all existing random walk based methods. Empirically, we compare SybilWalk with previous random walk based methods using both social networks with synthesized Sybils and a large-scale Twitter dataset with real Sybils. Our empirical results demonstrate that 1) SybilWalk is substantially more accurate than existing random walk based methods for weakhomophily social networks, 2) SybilWalk is substantially more robust to label noise than existing random walk based methods, and 3) SybilWalk is as scalable as the most efficient existing random walk based methods. In particular, on the Twitter dataset, SybilWalk achieves a false positive rate of 1.3% and a false negative rate of 17.3%.
AB - Online social networks are known to be vulnerable to the so-called Sybil attack, in which an attacker maintains massive fake accounts (also called Sybils) and uses them to perform various malicious activities. Therefore, Sybil detection is a fundamental security research problem in online social networks. Random walk based methods, which leverage the structure of an online social network to distribute reputation scores for users, have been demonstrated to be promising in certain real-world online social networks. In particular, random walk based methods have three desired features: they can have theoretically guaranteed performance for online social networks that have the fast-mixing property, they are accurate when the social network has strong homophily property, and they can be scalable to large-scale online social networks. However, existing random walk based methods suffer from several key limitations: 1) they can only leverage either labeled benign users or labeled Sybils, but not both, 2) they have limited detection accuracy for weak-homophily social networks, and 3) they are not robust to label noise in the training dataset. In this work, we propose a new random walk based Sybil detection method called SybilWalk. SybilWalk addresses the limitations of existing random walk based methods while maintaining their desired features. We perform both theoretical and empirical evaluations to compare SybilWalk with previous random walk based methods. Theoretically, for online social networks with the fast-mixing property, SybilWalk has a tighter asymptotical bound on the number of Sybils that are falsely accepted into the social network than all existing random walk based methods. Empirically, we compare SybilWalk with previous random walk based methods using both social networks with synthesized Sybils and a large-scale Twitter dataset with real Sybils. Our empirical results demonstrate that 1) SybilWalk is substantially more accurate than existing random walk based methods for weakhomophily social networks, 2) SybilWalk is substantially more robust to label noise than existing random walk based methods, and 3) SybilWalk is as scalable as the most efficient existing random walk based methods. In particular, on the Twitter dataset, SybilWalk achieves a false positive rate of 1.3% and a false negative rate of 17.3%.
UR - http://www.scopus.com/inward/record.url?scp=85031665962&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85031665962&partnerID=8YFLogxK
U2 - 10.1109/DSN.2017.55
DO - 10.1109/DSN.2017.55
M3 - Conference contribution
AN - SCOPUS:85031665962
T3 - Proceedings - 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2017
SP - 273
EP - 284
BT - Proceedings - 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2017
Y2 - 26 June 2017 through 29 June 2017
ER -