## Abstract

We study supergraph search (SPS), that is, given a query graph qand a graph database G that contains a collection of graphs, returngraphs that have q as a supergraph from G. SPS has broad applicationsin bioinformatics, cheminformatics and other scientific andcommercial fields. Determining whether a graph is a subgraph (orsupergraph) of another is an NP-complete problem. Hence, it is intractableto compute SPS for large graph databases. Two separateindexing methods, a "filter + verify"-based method and a "prefixsharing"-based method, have been studied to efficiently computeSPS. To implement the above two methods, subgraph patterns aremined from the graph database to build an index. Those subgraphsare mined to optimize either the filtering gain or the prefix-sharinggain. However, no single subgraph-mining algorithm considersboth gains. This work is the first one to mine subgraphs to optimize boththe filtering gain and the prefix-sharing gain while processing SPSqueries. First, we show that the subgraph-mining problem is NPhard. Then, we propose two polynomial-time algorithms to solvethe problem with an approximation ratio of 1-1/e and 1/4 respectively. In addition, we construct a lattice-like index, LW-index, toorganize the selected subgraph patterns for fast index-lookup. Ourexperiments show that our approach improves the query processingtime for SPS queries by a factor of 3 to 10.

Original language | English (US) |
---|---|

Pages (from-to) | 829-840 |

Number of pages | 12 |

Journal | Proceedings of the VLDB Endowment |

Volume | 6 |

Issue number | 10 |

DOIs | |

State | Published - Aug 2013 |

Event | 39th International Conference on Very Large Data Bases, VLDB 2012 - Trento, Italy Duration: Aug 26 2013 → Aug 30 2013 |

## All Science Journal Classification (ASJC) codes

- Computer Science (miscellaneous)
- General Computer Science