Active Learning for Graphs with Noisy Structures

Hongliang Chi, Cong Qi, Suhang Wang, Yao Ma

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Graph Neural Networks (GNNs) have seen significant success in tasks such as node classification, but that success is largely contingent on the availability of sufficient labeled nodes. Because labeling large-scale graphs is prohibitively costly, attention has turned to active learning on graphs, which aims to select the data whose labels most improve downstream model performance. Notably, most existing methods assume a reliable graph topology, whereas real-world graphs are often noisy. Designing a successful active learning framework for noisy graphs is therefore much needed but challenging, because selecting data for labeling and obtaining a clean graph are naturally interdependent tasks: selecting high-quality data requires a clean graph structure, while cleaning a noisy graph structure requires sufficient labeled data. To address this interdependence, we propose an active learning framework, GALClean, which iteratively performs data selection and graph purification together, each using the best information learned in the previous iteration. Importantly, we frame GALClean as an instance of the Expectation-Maximization algorithm, which provides a theoretical understanding of its design and mechanisms. This theory naturally leads to an enhanced version, GALClean+. Extensive experiments demonstrate the effectiveness and robustness of the proposed method across various types and levels of graph noise.
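
The alternation the abstract describes, selecting a node to label and then purifying the graph with what has been learned so far, can be illustrated with a toy loop. The sketch below is not GALClean or GALClean+ as published: it swaps in plain label propagation for the GNN, entropy-based selection for the data-selection step, and a simple "drop edges between confidently different predictions" rule for graph purification. All function names, the threshold `tau`, and the edge-set representation are assumptions made for illustration only.

```python
import math

def propagate(n_nodes, edges, labels, n_classes, n_steps=20):
    """Label propagation: average own and neighbour class probabilities."""
    nbrs = {v: [v] for v in range(n_nodes)}  # self-loop keeps averages defined
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    probs = [[1.0 / n_classes] * n_classes for _ in range(n_nodes)]

    def clamp(p):
        # Labeled nodes are fixed to a one-hot distribution.
        for v, c in labels.items():
            p[v] = [1.0 if k == c else 0.0 for k in range(n_classes)]

    for _ in range(n_steps):
        clamp(probs)
        probs = [[sum(probs[u][k] for u in nbrs[v]) / len(nbrs[v])
                  for k in range(n_classes)] for v in range(n_nodes)]
    clamp(probs)
    return probs

def select_node(probs, labels):
    """Pick the unlabeled node with the highest predictive entropy."""
    def entropy(p):
        return -sum(q * math.log(q) for q in p if q > 0)
    candidates = [v for v in range(len(probs)) if v not in labels]
    return max(candidates, key=lambda v: entropy(probs[v]))

def purify(noisy_edges, probs, tau=0.6):
    """Keep an edge unless both endpoints are confident and disagree.

    tau is a hypothetical confidence threshold, not a value from the paper.
    """
    kept = set()
    for i, j in noisy_edges:
        ci, cj = max(probs[i]), max(probs[j])
        disagree = probs[i].index(ci) != probs[j].index(cj)
        if disagree and ci > tau and cj > tau:
            continue
        kept.add((i, j))
    return kept

def active_learn(n_nodes, noisy_edges, oracle, n_classes, budget):
    """Alternate node selection and graph purification for `budget` rounds.

    budget must not exceed n_nodes, since each round labels one new node.
    """
    labels, edges = {}, set(noisy_edges)
    for _ in range(budget):
        probs = propagate(n_nodes, edges, labels, n_classes)
        v = select_node(probs, labels)
        labels[v] = oracle[v]              # query the (simulated) annotator
        probs = propagate(n_nodes, edges, labels, n_classes)
        # Always purify from the original noisy edges, so an edge dropped
        # in an early round can be restored once predictions improve.
        edges = purify(noisy_edges, probs)
    return labels, edges
```

Calling `active_learn(8, noisy_edges, oracle, 2, 4)` on a graph given as a set of `(i, j)` pairs with `i < j` returns the queried labels and a purified edge set that is a subset of the noisy input. Each round reuses the labels and the cleaned graph from the previous round, which is the interdependence the abstract points to.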

Original language: English (US)
Title of host publication: Proceedings of the 2024 SIAM International Conference on Data Mining, SDM 2024
Editors: Shashi Shekhar, Vagelis Papalexakis, Jing Gao, Zhe Jiang, Matteo Riondato
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 262-270
Number of pages: 9
ISBN (Electronic): 9781611978032
State: Published - 2024
Event: 2024 SIAM International Conference on Data Mining, SDM 2024 - Houston, United States
Duration: Apr 18 2024 - Apr 20 2024

Publication series

Name: Proceedings of the 2024 SIAM International Conference on Data Mining, SDM 2024

Conference

Conference: 2024 SIAM International Conference on Data Mining, SDM 2024
Country/Territory: United States
City: Houston
Period: 4/18/24 - 4/20/24

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Library and Information Sciences
