Fast approximate subgraph counting and enumeration

George M. Slota, Kamesh Madduri

Research output: Chapter in Book/Report/Conference proceedingConference contribution

39 Scopus citations

Abstract

We present a new shared-memory parallel algorithm and implementation called FASCIA for the problems of approximate subgraph counting and subgraph enumeration. The problem of subgraph counting refers to determining the frequency of occurrence of a given subgraph (or template) within a large network. This is a key graph analytic with applications in various domains. In bioinformatics, subgraph counting is used to detect and characterize local structure (motifs) in protein interaction networks. Exhaustive enumeration and exact counting is extremely compute-intensive, with running time growing exponentially with the number of vertices in the template. In this work, we apply the color coding technique to determine approximate counts of non-induced occurrences of the subgraph in the original network. Color coding gives a fixed-parameter algorithm for this problem, using a dynamic programming-based counting approach. Our new contributions are a multilevel shared-memory parallelization of the counting scheme and several optimizations to reduce the memory footprint. We show that approximate counts can be obtained for templates with up to 12 vertices, on networks with up to millions of vertices and edges. Prior work on this problem has only considered out-of-core parallelization on distributed platforms. With our new counting scheme, data layout optimizations, and multicore parallelism, we demonstrate a significant speedup over the current state-of-the-art for subgraph counting.

Original languageEnglish (US)
Title of host publicationProceedings
Subtitle of host publicationInternational Conference on Parallel Processing - The 42nd Annual Conference, ICPP 2013
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages210-219
Number of pages10
ISBN (Print)9780769551173
DOIs
StatePublished - 2013
Event42nd Annual International Conference on Parallel Processing, ICPP 2013 - Lyon, France
Duration: Oct 1 2013Oct 4 2013

Publication series

NameProceedings of the International Conference on Parallel Processing
ISSN (Print)0190-3918

Other

Other42nd Annual International Conference on Parallel Processing, ICPP 2013
Country/TerritoryFrance
CityLyon
Period10/1/1310/4/13

All Science Journal Classification (ASJC) codes

  • Software
  • General Mathematics
  • Hardware and Architecture

Fingerprint

Dive into the research topics of 'Fast approximate subgraph counting and enumeration'. Together they form a unique fingerprint.

Cite this