TY - GEN
T1 - Massive-scale RDF processing using compressed bitmap indexes
AU - Madduri, Kamesh
AU - Wu, Kesheng
PY - 2011
Y1 - 2011
N2 - The Resource Description Framework (RDF) is a popular data model for representing linked data sets arising from the web, as well as large scientific data repositories such as UniProt. RDF data intrinsically represents a labeled and directed multi-graph. SPARQL is a query language for RDF that expresses subgraph pattern-finding queries on this implicit multigraph in a SQL-like syntax. SPARQL queries generate complex intermediate join queries; to compute these joins efficiently, this paper presents a new strategy based on bitmap indexes. We store the RDF data in column-oriented compressed bitmap structures, along with two dictionaries. We find that our bitmap index-based query evaluation approach is up to an order of magnitude faster than the state-of-the-art system RDF-3X, for a variety of SPARQL queries on gigascale RDF data sets.
AB - The Resource Description Framework (RDF) is a popular data model for representing linked data sets arising from the web, as well as large scientific data repositories such as UniProt. RDF data intrinsically represents a labeled and directed multi-graph. SPARQL is a query language for RDF that expresses subgraph pattern-finding queries on this implicit multigraph in a SQL-like syntax. SPARQL queries generate complex intermediate join queries; to compute these joins efficiently, this paper presents a new strategy based on bitmap indexes. We store the RDF data in column-oriented compressed bitmap structures, along with two dictionaries. We find that our bitmap index-based query evaluation approach is up to an order of magnitude faster than the state-of-the-art system RDF-3X, for a variety of SPARQL queries on gigascale RDF data sets.
UR - http://www.scopus.com/inward/record.url?scp=79961175154&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79961175154&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-22351-8_30
DO - 10.1007/978-3-642-22351-8_30
M3 - Conference contribution
AN - SCOPUS:79961175154
SN - 9783642223501
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 470
EP - 479
BT - Scientific and Statistical Database Management - 23rd International Conference, SSDBM 2011, Proceedings
T2 - 23rd International Conference on Scientific and Statistical Database Management, SSDBM 2011
Y2 - 20 July 2011 through 22 July 2011
ER -