Article

Scalable reduction of large datasets to interesting subsets

Journal

JOURNAL OF WEB SEMANTICS
Volume 8, Issue 4, Pages 365-373

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.websem.2010.08.002

Keywords

Billion Triples Challenge; Scalability; Parallel; Inferencing; Query; Triplestore

Funding

  1. DARPA's Transformational Convergence Technology Office
  2. Lockheed Martin Advanced Technology Laboratories
  3. Fujitsu Laboratories of America

Abstract

With a huge amount of RDF data available on the web, the ability to find and access relevant information is crucial. Traditional approaches to storing, querying, and reasoning fall short when faced with web-scale data. We present a system that combines the computational power of large clusters for enabling large-scale reasoning and data access with an efficient data structure for storing and querying the accessed data on a traditional personal computer or other resource-constrained device. We present results of using this system to load the 2009 Billion Triples Challenge dataset, materialize RDFS inferences, extract an interesting subset of the data using a large cluster, and further analyze the extracted data using a personal computer, all in the order of tens of minutes. © 2010 Elsevier B.V. All rights reserved.
