Semantic Partitioning

Master's thesis, University of Bonn

SANSA Semantic Partitioning is a scalable and highly efficient application that first partitions RDF data (N-Triples) in memory and then passes the partitioned data to the SPARQL query engine layer, which evaluates queries against it. It is built on top of SANSA-Stack using Scala and Spark.
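
To make the partitioning step concrete, here is a minimal sketch in plain Scala collections (no Spark). It assumes that partitioning groups every triple under its subject, which is an assumption about the approach rather than something this README states. The names `SubjectPartitionSketch`, `Triple`, `parseLine`, and `partitionBySubject` are hypothetical and do not come from the project, and the line parser is deliberately naive.

```scala
// Hypothetical sketch: none of these names come from the project itself.
object SubjectPartitionSketch {
  final case class Triple(subject: String, predicate: String, obj: String)

  // Naive N-Triples line parser: drops the trailing "." and splits the rest
  // into subject, predicate, and object on the first two whitespace runs.
  // It does not validate IRIs, literals, or escapes.
  def parseLine(line: String): Option[Triple] = {
    val trimmed = line.trim
    if (trimmed.isEmpty || trimmed.startsWith("#") || !trimmed.endsWith(".")) None
    else {
      val body = trimmed.dropRight(1).trim
      body.split("\\s+", 3) match {
        case Array(s, p, o) => Some(Triple(s, p, o))
        case _              => None
      }
    }
  }

  // Assumed partitioning scheme: one entry per subject, holding all of its
  // (predicate, object) pairs.
  def partitionBySubject(lines: Seq[String]): Map[String, Seq[(String, String)]] =
    lines
      .flatMap(parseLine)
      .groupBy(_.subject)
      .map { case (subject, triples) =>
        subject -> triples.map(t => (t.predicate, t.obj))
      }
}
```

In a Spark setting the same grouping would be expressed over an RDD of triples instead of a local `Seq`; the local version is used here only to keep the sketch self-contained.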

Read here: Scala & Spark

Benchmarks

The datasets should be in N-Triples format.

./generate.sh --quiet --timing -u 1 --format NTRIPLES --consolidate Maximal --threads 8

./generate -fc -s nt -fn dataset_10MB -pc 100

direct download

Application Settings

VM Options

-DLogFilePath=/SANSA-Semantic-Partitioning/src/main/resources/log/console.log

Program Arguments

--input /SANSA-Semantic-Partitioning/src/main/resources/input/lubm/sample.nt
--queries /SANSA-Semantic-Partitioning/src/main/resources/queries/lubm/query-01.txt
--partitions /SANSA-Semantic-Partitioning/src/main/resources/output/partitioned-data/
--output /SANSA-Semantic-Partitioning/src/main/resources/output/query-result/
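
The four program arguments above can be read as `--key value` pairs. The following is a rough sketch of how such arguments might be parsed and checked. It does not show the application's actual argument handling; `ArgsSketch`, `Config`, and `parse` are hypothetical names introduced for illustration.

```scala
// Hypothetical sketch: the real application may parse its arguments differently.
object ArgsSketch {
  final case class Config(input: String, queries: String, partitions: String, output: String)

  // Reads "--key value" pairs and reports any of the four required keys that are missing.
  def parse(args: Array[String]): Either[String, Config] = {
    val pairs: Map[String, String] = args.toList
      .grouped(2)
      .collect { case List(key, value) if key.startsWith("--") => key.drop(2) -> value }
      .toMap

    val required = List("input", "queries", "partitions", "output")
    required.filterNot(pairs.contains) match {
      case Nil =>
        Right(Config(pairs("input"), pairs("queries"), pairs("partitions"), pairs("output")))
      case missing =>
        Left("Missing argument(s): " + missing.map("--" + _).mkString(", "))
    }
  }
}
```

Returning an `Either` keeps the failure case explicit, so a caller can print the message and exit before any Spark job is started.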

SPARQL Operators

Read here: SPARQL Operators

SPARQL Queries

Read here: SPARQL Queries

Deploy App on Cluster

Read here: App Deploy

Future Work

  • Support PREFIX declarations in SPARQL queries
  • Add more operators
  • Extend FILTER support to cover:
    • Math: +, -, *, /
    • SPARQL Tests: bound
    • SPARQL Accessors: str
    • Other: sameTerm, langMatches, regex
  • Show predicate in the final result (for flexibility)