Xin Wang

Workplace: School of Computer Science and Technology, Tianjin University, Tianjin, China

E-mail: wangx@tju.edu.cn

Research Interests: Computer systems and computational processes, Computer Architecture and Organization, Database Management System, Data Structures and Algorithms

Biography

Xin Wang was born in Tianjin, China, in 1981. He received his Ph.D. degree in Computer Science from Nankai University, Tianjin, China, in 2009. He has been an Assistant Professor in the School of Computer Science and Technology at Tianjin University, Tianjin, China, since July 2009. His representative publications include: “Storing and indexing RDF data in a column-oriented DBMS” (In Proc. of the 2nd International Workshop on Database Technology and Applications, 2010), “Efficient XPath evaluation using a structural summary index” (In Proc. of the 1st International Conference on Computer Science and Software Engineering, 2008), and “Towards an incremental approach to validation of native XML databases” (Journal of Computational Information Systems, 2007). His current research interests are semantic data management and database implementation. Prof. Wang is a member of the Association for Computing Machinery (ACM) and the China Computer Federation (CCF).

Author Articles
CHex: An Efficient RDF Storage and Indexing Scheme for Column-Oriented Databases

By Xin Wang, Shuyi Wang, Pufeng Du, Zhiyong Feng

DOI: https://doi.org/10.5815/ijmecs.2011.03.08, Pub. Date: 8 Jun. 2011

As increasingly large RDF data sets are being published on the Web, efficient RDF data management has become an essential factor in realizing the Semantic Web vision. However, most existing RDF storage schemes, which are built on top of row-store relational databases, are constrained in terms of efficiency and scalability. Still, the growing popularity of the RDF format in real-world applications arguably calls for an effort to deal with these drawbacks. In this paper, we propose a novel RDF storage and indexing scheme, called CHex, which uses the triple nature of RDF as an asset to implement sextuple indexing for a column-oriented database system. Using binary association tables (BATs) in the column-oriented data model, RDF data is indexed in six possible ways, one for each possible ordering of the three RDF elements. The sextuple indexing scheme in a column-oriented database not only provides efficient single triple pattern lookups, but also allows fast merge-joins for any pair of triple patterns. To evaluate the performance of our approach, we generate large-scale data sets of up to 13 million triples, and devise benchmark queries that cover important RDF join patterns. The experimental results show that our approach outperforms the row-oriented database systems by up to an order of magnitude and is even competitive with the best state-of-the-art native RDF store.
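The idea of sextuple indexing described in the abstract can be illustrated with a minimal Python sketch. It is not the CHex implementation: plain sorted lists stand in for BATs, and the sample triples and all function names (`build_sextuple_index`, `lookup`, `merge_join_subjects`) are assumptions made up for illustration. It only shows why keeping all six orderings of (subject, predicate, object) lets any single triple pattern be answered by a prefix search, and why two patterns sharing a variable can be merge-joined without re-sorting.

```python
from bisect import bisect_left
from itertools import permutations

# Toy data, invented for illustration only.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("alice", "worksAt", "tju"),
    ("bob", "knows", "carol"),
    ("bob", "worksAt", "tju"),
]

def build_sextuple_index(triples):
    """Return six sorted lists, one per ordering of (s, p, o)."""
    index = {}
    for order in permutations("spo"):
        pos = ["spo".index(c) for c in order]
        index["".join(order)] = sorted(tuple(t[i] for i in pos) for t in triples)
    return index

def lookup(index, s=None, p=None, o=None):
    """Answer one triple pattern; None marks an unbound position."""
    bound = {"s": s, "p": p, "o": o}
    fixed = [c for c in "spo" if bound[c] is not None]
    free = [c for c in "spo" if bound[c] is None]
    # Choose the ordering whose leading columns are the bound positions.
    order = "".join(fixed + free)
    rows = index[order]
    prefix = tuple(bound[c] for c in fixed)
    out = []
    for row in rows[bisect_left(rows, prefix):]:
        if row[:len(prefix)] != prefix:
            break
        # Reorder the matching row back to (s, p, o).
        out.append(tuple(row[order.index(c)] for c in "spo"))
    return out

def merge_join_subjects(left, right):
    """Merge-join two (s, p, o) lists that are both sorted by subject."""
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            subject = left[i][0]
            out.append(subject)
            # Skip past all rows with this subject on both sides.
            while i < len(left) and left[i][0] == subject:
                i += 1
            while j < len(right) and right[j][0] == subject:
                j += 1
    return out

index = build_sextuple_index(TRIPLES)
# Pattern (?x knows ?y) joined with (?x worksAt tju) on ?x.
knows = lookup(index, p="knows")                  # served by the "pso" ordering
works = lookup(index, p="worksAt", o="tju")       # served by the "pos" ordering
print(merge_join_subjects(knows, works))
```

In this sketch both inputs to the join come back already ordered by subject, because the chosen index orderings place the subject immediately after the bound columns. That is the property the abstract credits for fast merge-joins.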
