A Multi-Query Optimization Algorithm Using Map Reduce

R Gomathi, S Logeswari, B Gomathy

Abstract


The need to store statements about web resources led to the emergence of semantic web
technology. The Resource Description Framework (RDF) is a World Wide Web Consortium (W3C)
standard for storing semantic web data. Existing frameworks do not scale to large RDF
graphs. This paper focuses on the problem of multi-query optimization of semantic web data.
A scalable framework for storing RDF graphs is designed using the Hadoop Distributed File
System, and the problem of multi-query optimization is revisited from the perspective of
SPARQL. Algorithms for multi-query optimization are proposed, and the optimized query is
executed through map reduce programming to produce the final result. Experiments were
conducted on the LUBM benchmark dataset. The algorithm is executed on the Jena data store
and the Hadoop framework. The efficiency and scalability of the algorithm are tested and
the results are documented.
Keywords: hadoop, map reduce, query optimization, resource description framework,
semantic web

IJADA (2017) 1–10 © JournalsPub 2017. All Rights Reserved
