What is SPARQL

SPARQL is a query language for semantic databases that store their data in the Resource Description Framework (RDF) format.

Tutorial

There are quite a few tutorials out there for SPARQL e.g.

This tutorial is for people who are new to semantic concepts and would like to work with an example that has a fair amount of data but not too much complexity in its structure.

Semantic Concepts

Personally I learned Semantic Concepts using Semantic MediaWiki see

A tutorial that uses SPARQL needs a slightly different touch, so even for those who know the talk above I'll explain some key concepts based on an example using:

Countries
Towns
Municipal Units

Triples

A semantic statement has the form

<subject> <predicate> <object>

e.g.

Dubai is-located-in AE

is such a semantic statement which is also called a Triple.

The natural language statement "Dubai is located in United Arab Emirates" has been purposely modified into a more "computer-ready" form. The predicate is written as is-located-in to make it a proper identifier, and the country name "United Arab Emirates" has been replaced by its two-letter United Nations Location Code AE. A triple has a natural graph representation:
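As a minimal sketch of that graph view, a triple can be modeled as a directed edge that runs from the subject node to the object node and is labeled with the predicate. The names `Triple` and `to_edge` are made up for this illustration and do not come from any library:

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One semantic statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str


def to_edge(triple: Triple) -> str:
    """Render a triple as a labeled, directed graph edge."""
    return f"({triple.subject}) --{triple.predicate}--> ({triple.object})"


# The example statement from the text above.
dubai = Triple("Dubai", "is-located-in", "AE")
print(to_edge(dubai))  # (Dubai) --is-located-in--> (AE)
```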

TripleStore

A Triplestore is a database that can store and query triples. In fact for educational purposes I have written a simple Triplestore myself:

https://github.com/BITPlan/org.sidif.triplestore

For that simple triplestore the triples are supplied in Simple Data Interchange Format. Again, that format is mostly for educational purposes, although it can also be used for small use cases with just a few thousand triples. Please also note that there is no SPARQL support in that project.
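To illustrate what "store and query triples" means, here is a rough Python sketch of an in-memory store with wildcard matching. It is not the API of the project linked above; the class `TinyTripleStore` and its methods are hypothetical, and the extra towns are illustrative data only:

```python
from typing import Iterator, Optional, Set, Tuple

TripleTuple = Tuple[str, str, str]


class TinyTripleStore:
    """Hypothetical in-memory store; None in a pattern acts as a wildcard."""

    def __init__(self) -> None:
        self._triples: Set[TripleTuple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Store one triple; duplicates are ignored because a set is used."""
        self._triples.add((subject, predicate, obj))

    def query(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[TripleTuple]:
        """Yield all stored triples matching the pattern, in sorted order."""
        pattern = (subject, predicate, obj)
        for triple in sorted(self._triples):
            if all(p is None or p == v for p, v in zip(pattern, triple)):
                yield triple


store = TinyTripleStore()
store.add("Dubai", "is-located-in", "AE")
store.add("Abu Dhabi", "is-located-in", "AE")  # illustrative extra data
store.add("Berlin", "is-located-in", "DE")     # illustrative extra data

# "Which subjects are located in AE?" expressed as a triple pattern.
towns_in_ae = [s for s, _, _ in store.query(predicate="is-located-in", obj="AE")]
print(towns_in_ae)  # ['Abu Dhabi', 'Dubai']
```

A linear scan like this is fine for a few thousand triples; real triplestores add indexes so that such patterns stay fast on much larger data.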

For anything beyond educational use, a triplestore is needed that can handle larger amounts of data and supports SPARQL. The Wikipedia List of Subject-Predicate-Object Databases shows you some options. For this tutorial we'll use Blazegraph.

Setting up the Blazegraph Triple Store


You need Java to be installed on your machine.

Download the blazegraph.jar file from https://www.blazegraph.com/download/ and start it with

java -jar blazegraph.jar

In fact, it's better to start the jar file with an option that allows larger XML files to be handled:

java -Djdk.xml.entityExpansionLimit=0 -jar blazegraph.jar

otherwise you might run into the error:

org.openrdf.rio.RDFParseException: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK

After a successful start you should see

Welcome to the Blazegraph(tm) Database.

Go to http://localhost:9999/blazegraph/ to get started.

And you might want to do just that and click that link.
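Besides the browser, the running server can also be queried from a script. The following Python sketch only uses the standard library. It assumes that the SPARQL endpoint of a locally started Blazegraph is reachable at `http://localhost:9999/blazegraph/sparql`; please check that against your installation. The helper names `build_request` and `run_query` are made up for this example:

```python
import json
import urllib.parse
import urllib.request

# Assumption: SPARQL endpoint of a locally started Blazegraph instance.
ENDPOINT = "http://localhost:9999/blazegraph/sparql"

# Counts all triples in the store.
COUNT_QUERY = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL protocol POST request that asks for JSON results."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def run_query(query: str, endpoint: str = ENDPOINT) -> dict:
    """Send the query and decode the JSON result (needs a running server)."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With the server running, `run_query(COUNT_QUERY)["results"]["bindings"]` should contain a single row with the number of triples currently stored.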

Where Blazegraph stores its data

By default, Blazegraph's journal file is blazegraph.jnl in the directory where you started the jar file. On my macOS laptop the initial file size is some 200 MBytes.

ls -l blazegraph.jnl 
-rw-r--r--  1 wf  staff  209715200  4 Jan 11:50 blazegraph.jnl
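The byte count in the listing is exactly 200 binary megabytes (200 x 1024 x 1024). A small Python sketch to check this, and to print the size of a journal file if one exists in the current directory (the helper `size_in_mib` is made up for this example):

```python
import os

BYTES_PER_MIB = 1024 * 1024


def size_in_mib(size_bytes: int) -> float:
    """Convert a size in bytes to binary megabytes (MiB)."""
    return size_bytes / BYTES_PER_MIB


# Size taken from the ls listing above.
print(size_in_mib(209715200))  # 200.0

# Optional: inspect a real journal file if one is present here.
journal = "blazegraph.jnl"
if os.path.exists(journal):
    print(f"{journal}: {size_in_mib(os.path.getsize(journal)):.1f} MiB")
```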

The Blazegraph Web UI

The Web UI shows the following tabs:

WELCOME
QUERY
UPDATE
EXPLORE
NAMESPACES
STATUS
PERFORMANCE

Let's start with the UPDATE tab to load some sample data.
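For a first experiment with a single statement, a SPARQL Update text can be generated with a few lines of Python. This is only a sketch: the namespace `http://example.org/` is a placeholder and not part of the tutorial's sample data, the helper `insert_data` is hypothetical, and the names used must not contain spaces because they become part of an IRI. The generated text should be usable as SPARQL Update input:

```python
# Placeholder namespace for illustration; not part of the sample data.
EX = "http://example.org/"


def insert_data(triples) -> str:
    """Build a SPARQL Update statement inserting the given triples as IRIs."""
    lines = [f"  <{EX}{s}> <{EX}{p}> <{EX}{o}> ." for s, p, o in triples]
    return "INSERT DATA {\n" + "\n".join(lines) + "\n}"


# The example triple from the Triples section.
statement = insert_data([("Dubai", "is-located-in", "AE")])
print(statement)
```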

The sample Data

A human-readable form of some of our sample data, together with its description, is available at:

RDF Version of the data
