TY - GEN
T1 - HIN2Vec
T2 - 26th ACM International Conference on Information and Knowledge Management, CIKM 2017
AU - Fu, Tao Yang
AU - Lee, Wang Chien
AU - Lei, Zhen
N1 - Publisher Copyright: © 2017 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2017/11/6
Y1 - 2017/11/6
N2 - In this paper, we propose a novel representation learning framework, namely HIN2Vec, for heterogeneous information networks (HINs). The core of the proposed framework is a neural network model, also called HIN2Vec, designed to capture the rich semantics embedded in HINs by exploiting different types of relationships among nodes. Given a set of relationships specified in forms of meta-paths in an HIN, HIN2Vec carries out multiple prediction training tasks jointly based on a target set of relationships to learn latent vectors of nodes and meta-paths in the HIN. In addition to model design, several issues unique to HIN2Vec, including regularization of meta-path vectors, node type selection in negative sampling, and cycles in random walks, are examined. To validate our ideas, we learn latent vectors of nodes using four large-scale real HIN datasets, including Blogcatalog, Yelp, DBLP and U.S. Patents, and use them as features for multi-label node classification and link prediction applications on those networks. Empirical results show that HIN2Vec soundly outperforms the state-of-the-art representation learning models for network data, including DeepWalk, LINE, node2vec, PTE, HINE and ESim, by 6.6% to 23.8% of micro-f1 in multi-label node classification and 5% to 70.8% of MAP in link prediction.
AB - In this paper, we propose a novel representation learning framework, namely HIN2Vec, for heterogeneous information networks (HINs). The core of the proposed framework is a neural network model, also called HIN2Vec, designed to capture the rich semantics embedded in HINs by exploiting different types of relationships among nodes. Given a set of relationships specified in forms of meta-paths in an HIN, HIN2Vec carries out multiple prediction training tasks jointly based on a target set of relationships to learn latent vectors of nodes and meta-paths in the HIN. In addition to model design, several issues unique to HIN2Vec, including regularization of meta-path vectors, node type selection in negative sampling, and cycles in random walks, are examined. To validate our ideas, we learn latent vectors of nodes using four large-scale real HIN datasets, including Blogcatalog, Yelp, DBLP and U.S. Patents, and use them as features for multi-label node classification and link prediction applications on those networks. Empirical results show that HIN2Vec soundly outperforms the state-of-the-art representation learning models for network data, including DeepWalk, LINE, node2vec, PTE, HINE and ESim, by 6.6% to 23.8% of micro-f1 in multi-label node classification and 5% to 70.8% of MAP in link prediction.
UR - http://www.scopus.com/inward/record.url?scp=85037340625&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85037340625&partnerID=8YFLogxK
U2 - 10.1145/3132847.3132953
DO - 10.1145/3132847.3132953
M3 - Conference contribution
AN - SCOPUS:85037340625
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 1797
EP - 1806
BT - CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 6 November 2017 through 10 November 2017
ER -