DSI: A method for indexing large graphs using distance set

Yubo Kou, Yukun Li, Xiaofeng Meng

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Recent years we have witnessed a great increase in modeling data as large graphs in multiple domains, such as XML, the semantic web, social network. In these circumstances, researchers are interested in querying the large graph like that: Given a large graph G, and a query Q, we report all the matches of Q in G. Since subgraph isomorphism checking is proved to be NP-Complete[1], it is infeasible to scan the whole large graph for answers, especially when the query's size is also large. Hence, the "filter-verification" approach is widely adopted. In this approach, researchers first index the neighborhood of each vertex in the large graph, then filter vertexes , and finally perform subgraph matching algorithms. Previous techniques mainly focus on efficient matching algorithms, paying little attention to indexing techniques. However, appropriate indexing techniques could help improve the efficiency of query response by generating less candidates. In this paper we investigate indexing techniques on large graphs, and propose an index structure DSI(Distance Set Index) to capture the neighborhood of each vertex. Through our distance set index, more vertexes could be pruned, resulting in a much smaller search space. Then a subgraph matching algorithm is performed in the search space. We have applied our index structure to real datasets and synthetic datasets. Extensive experiments demonstrate the efficiency and effectiveness of our indexing technique.

Original languageEnglish (US)
Title of host publicationWeb-Age Information Management - 11th International Conference, WAIM 2010, Proceedings
Pages297-308
Number of pages12
DOIs
StatePublished - 2010
Event11th International Conference on Web-Age Information Management, WAIM 2010 - Jiuzhaigou, China
Duration: Jul 15 2010Jul 17 2010

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume6184 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference11th International Conference on Web-Age Information Management, WAIM 2010
Country/TerritoryChina
CityJiuzhaigou
Period7/15/107/17/10

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'DSI: A method for indexing large graphs using distance set'. Together they form a unique fingerprint.

Cite this