Investigating and Mitigating Degree-Related Biases in Graph Convoltuional Networks

Xianfeng Tang, Huaxiu Yao, Yiwei Sun, Yiqi Wang, Jiliang Tang, Charu Aggarwal, Prasenjit Mitra, Suhang Wang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

95 Scopus citations

Abstract

Graph Convolutional Networks (GCNs) show promising results for semi-supervised learning tasks on graphs, thus become favorable comparing with other approaches. Despite the remarkable success of GCNs, it is difficult to train GCNs with insufficient supervision. When labeled data are limited, the performance of GCNs becomes unsatisfying for low-degree nodes. While some prior work analyze successes and failures of GCNs on the entire model level, profiling GCNs on individual node level is still underexplored. In this paper, we analyze GCNs in regard to the node degree distribution. From empirical observation to theoretical proof, we confirm that GCNs are biased towards nodes with larger degrees with higher accuracy on them, even if high-degree nodes are underrepresented in most graphs. We further develop a novel Self-Supervised-Learning Degree-Specific GCN (SL-DSGCN) that mitigate the degree-related biases of GCNs from model and data aspects. Firstly, we propose a degree-specific GCN layer that captures both discrepancies and similarities of nodes with different degrees, which reduces the inner model-aspect biases of GCNs caused by sharing the same parameters with all nodes. Secondly, we design a self-supervised-learning algorithm that creates pseudo labels with uncertainty scores on unlabeled nodes with a Bayesian neural network. Pseudo labels increase the chance of connecting to labeled neighbors for low-degree nodes, thus reducing the biases of GCNs from the data perspective. Uncertainty scores are further exploited to weight pseudo labels dynamically in the stochastic gradient descent for SL-DSGCN. Experiments on three benchmark datasets show SL-DSGCN not only outperforms state-of-the-art self-training/self-supervised-learning GCN methods, but also improves GCN accuracy dramatically for low-degree nodes.

Original languageEnglish (US)
Title of host publicationCIKM 2020 - Proceedings of the 29th ACM International Conference on Information and Knowledge Management
PublisherAssociation for Computing Machinery
Pages1435-1444
Number of pages10
ISBN (Electronic)9781450368599
DOIs
StatePublished - Oct 19 2020
Event29th ACM International Conference on Information and Knowledge Management, CIKM 2020 - Virtual, Online, Ireland
Duration: Oct 19 2020Oct 23 2020

Publication series

NameInternational Conference on Information and Knowledge Management, Proceedings

Conference

Conference29th ACM International Conference on Information and Knowledge Management, CIKM 2020
Country/TerritoryIreland
CityVirtual, Online
Period10/19/2010/23/20

All Science Journal Classification (ASJC) codes

  • General Business, Management and Accounting
  • General Decision Sciences

Fingerprint

Dive into the research topics of 'Investigating and Mitigating Degree-Related Biases in Graph Convoltuional Networks'. Together they form a unique fingerprint.

Cite this