SPARQL Tutorial

August 23rd, 2018

Wouter Beek (wouter@triply.cc)

Part I: The first SPARQL query

The first SPARQL query

select * {
  ?s ?p ?o.
}
limit 5
select *
Projection: the variables whose bindings are returned in the result set. * returns the bindings of all variables.
{ ?s ?p ?o. }
The graph pattern that is matched. Variables start with a question mark.
limit 5
Modifier on the result set: return at most 5 results.

Make the graph pattern more specific

select * {
  ?image <https://triply.cc/rce/def/locator> ?url.
}
limit 5
<https://triply.cc/rce/def/locator>
Instantiate variable ?p with an IRI.
?image and ?url
Use descriptive names for variables.

Change the projection

select ?url {
  ?image <https://triply.cc/rce/def/locator> ?url.
}
limit 5
?url
Only return the bindings for ?url in the projection.
?image
A hidden variable: it is matched by the graph pattern, but its bindings are not returned.

Introduce variables with bind

select * {
  ?image <https://triply.cc/rce/def/locator> ?url.
  bind(concat('<a href="',str(?image),'"><img src="',str(?url),'"></a>') as ?widget)
}
limit 5
bind(EXPRESSION as ?x)
The result of evaluating EXPRESSION is bound to ?x.
concat(STRING,…)
Evaluates to the concatenation of its string arguments. IRIs must first be converted to strings with str().
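
As a hedged illustration of bind and concat, the sketch below replaces the triple pattern with inline data (a values clause, which this tutorial does not otherwise use) so that it can be tried on any SPARQL 1.1 endpoint. The example.org IRIs are hypothetical. Because concat expects string arguments, the IRIs are converted with str() first.

```sparql
select ?widget {
  # Hypothetical inline data, so the query does not depend on any dataset.
  values (?image ?url) {
    (<https://example.org/image/1> <https://example.org/image/1.jpg>)
  }
  # str() turns each IRI into a plain string that concat can use.
  bind(concat('<a href="',str(?image),'"><img src="',str(?url),'"></a>') as ?widget)
}
```

This should return a single result in which ?widget is an HTML link that wraps an image element.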

Change the limit

select * {
  ?image <https://triply.cc/rce/def/locator> ?url.
}
limit 6

Add an offset

select * {
  ?image <https://triply.cc/rce/def/locator> ?url.
}
limit 5
offset 5
offset 5
Skips the first 5 results, returning the 6th through the 10th result.
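
The order of results is not guaranteed unless the query contains an order by clause (covered in Part II), so paging with limit and offset is only stable when the results are sorted. A minimal sketch of fetching the third page of 5 results, assuming that sorting by ?image and then ?url is acceptable:

```sparql
select * {
  ?image <https://triply.cc/rce/def/locator> ?url.
}
# Sort first, so that every page is cut from the same sequence.
order by ?image ?url
limit 5
# Skip the first two pages of 5 results each.
offset 10
```

This returns the 11th through the 15th result of the sorted sequence.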

Part II: Graph patterns

Graph pattern (2 triple patterns)

select * {
  ?image <https://triply.cc/rce/def/locator> ?url.
  ?image <http://xmlns.com/foaf/0.1/depicts> ?monument.
}
limit 5
?image
A shared variable connects two triple patterns. Connected triple patterns form a graph pattern.
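
To see how a shared variable joins two patterns, the following sketch uses two inline data blocks (values, an assumption made for illustration; the tutorial itself queries real triples) in place of the two triple patterns. The example.org IRIs are hypothetical.

```sparql
select * {
  # Stand-in for the locator pattern: two images, each with a URL.
  values (?image ?url) {
    (<https://example.org/image/1> "https://example.org/1.jpg")
    (<https://example.org/image/2> "https://example.org/2.jpg")
  }
  # Stand-in for the depicts pattern: only image 1 depicts a monument.
  values (?image ?monument) {
    (<https://example.org/image/1> <https://example.org/monument/1>)
  }
}
```

Only the row for image 1 is returned, because ?image must have the same binding in both blocks.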

Graph pattern (3 triple patterns)

select * {
  ?image <https://triply.cc/rce/def/locator> ?url.
  ?image <http://xmlns.com/foaf/0.1/depicts> ?monument.
  ?monument <http://www.w3.org/2000/01/rdf-schema#label> ?label.
}
limit 5
http://xmlns.com/foaf/0.1/depicts
A property from the Friend of a Friend (FOAF) vocabulary.
http://www.w3.org/2000/01/rdf-schema#
The namespace of the RDF Schema (RDFS) vocabulary.

Graph pattern (4 triple patterns)

select * {
  ?image <https://triply.cc/rce/def/locator> ?url.
  ?image <http://xmlns.com/foaf/0.1/depicts> ?monument.
  ?monument <http://www.w3.org/2000/01/rdf-schema#label> ?label.
  ?monument <https://triply.cc/rce/def/bouwjaar> ?year.
}
limit 5

Sort by year (oldest first)

select * {
  ?image <https://triply.cc/rce/def/locator> ?url.
  ?image <http://xmlns.com/foaf/0.1/depicts> ?monument.
  ?monument <http://www.w3.org/2000/01/rdf-schema#label> ?label.
  ?monument <https://triply.cc/rce/def/bouwjaar> ?year.
}
order by ?year
limit 5
order by ?x
Orders results by the bindings of ?x, in ascending order (the default).

Sort by year (newest first)

select * {
  ?image <https://triply.cc/rce/def/locator> ?url.
  ?image <http://xmlns.com/foaf/0.1/depicts> ?monument.
  ?monument <http://www.w3.org/2000/01/rdf-schema#label> ?label.
  ?monument <https://triply.cc/rce/def/bouwjaar> ?year.
}
order by desc(?year)
limit 5
order by desc(?x)
Orders results by the bindings of ?x, in descending order.
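
order by accepts more than one sort key. A small sketch, assuming that monuments built in the same year should be sorted by their label:

```sparql
select * {
  ?image <https://triply.cc/rce/def/locator> ?url.
  ?image <http://xmlns.com/foaf/0.1/depicts> ?monument.
  ?monument <http://www.w3.org/2000/01/rdf-schema#label> ?label.
  ?monument <https://triply.cc/rce/def/bouwjaar> ?year.
}
# Newest first; results with the same year are ordered by label.
order by desc(?year) ?label
limit 5
```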

Use abbreviations

prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix rce: <https://triply.cc/rce/def/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select * {
  ?image rce:locator ?url;
         foaf:depicts ?monument.
  ?monument rdfs:label ?label;
            rce:bouwjaar ?year.
}
order by desc(?year)
limit 5
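prefix ALIAS: <IRI>
Allows any IRI that starts with IRI to be abbreviated as ALIAS, a colon, and the remainder of the IRI.
; after an object term
Repeats the subject term of the preceding triple pattern, so only the predicate and object are written.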

Part III: GeoSPARQL

Linked Geodata

prefix geo: <http://www.opengis.net/ont/geosparql#>
select * {
  ?monument geo:hasGeometry ?geometry.
  ?geometry geo:asWKT ?shape.
}
limit 1000
geo:hasGeometry and geo:asWKT
GeoSPARQL, standardized by the Open Geospatial Consortium (OGC).

Property Path notation

prefix geo: <http://www.opengis.net/ont/geosparql#>
select * {
  ?monument geo:hasGeometry/geo:asWKT ?shape.
}
limit 1000
p/q
Follows property p and then property q, without binding the node in between to a variable.

Federated query

prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix owl: <http://www.w3.org/2002/07/owl#>
select * {
  {
    select * {
      ?monument geo:hasGeometry/geo:asWKT ?shape1;
                owl:sameAs ?pand.
    }
    limit 10
  }
  service <https://data.pdok.nl/sparql> {
    ?pand geo:hasGeometry/geo:asWKT ?shape2.
  }
}
Sub-select
The inner query is evaluated first; its results are then joined with the rest of the outer query.
service <IRI> { QUERY }
Execute QUERY on endpoint IRI.
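
The effect of a limit inside a sub-select can be tried without any endpoint. The sketch below is an illustration only: it uses inline data (values) with made-up numbers instead of monuments, and no service clause.

```sparql
select * {
  {
    # Inner query: keep only the two largest numbers.
    select ?n {
      values ?n { 1 2 3 4 5 }
    }
    order by desc(?n)
    limit 2
  }
  # Outer pattern: joined with the inner results on the shared variable ?n.
  values (?n ?name) { (4 "four") (5 "five") (1 "one") }
}
```

Two results are returned, for 4 and 5. The row for 1 is not returned, because the inner limit was applied before the join.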

Complex data model

prefix bag: <http://bag.basisregistraties.overheid.nl/def/bag#>
prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?shape {
  service <https://data.pdok.nl/sparql> {
    ?woonplaats rdfs:label "Amsterdam"@nl.
    ?openbareRuimte bag:bijbehorendeWoonplaats ?woonplaats;
                    bag:naamOpenbareRuimte "De Boelelaan".
    ?nummeraanduiding bag:bijbehorendeOpenbareRuimte ?openbareRuimte;
                      bag:huisnummer 1105;
                      bag:postcode "1081HV".
    ?verblijfsobject bag:hoofdadres ?nummeraanduiding;
                     bag:pandrelatering ?pand.
    ?pand geo:hasGeometry/geo:asWKT ?shape.
  }
}
limit 1
Data model of the Dutch Base Registry for Addresses and Buildings (BAG).

Large-scale queries

prefix bag: <http://bag.basisregistraties.overheid.nl/def/bag#>
prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?shape ?shapeLabel {
  ?verblijfsobject bag:hoofdadres/bag:bijbehorendeOpenbareRuimte/bag:bijbehorendeWoonplaats/bag:naamWoonplaats "Apeldoorn"^^xsd:string;
                   bag:pandrelatering ?pand.
  ?pand bag:geometriePand/geo:asWKT ?shape;
        bag:oorspronkelijkBouwjaar ?shapeLabel.
}
order by asc(?shapeLabel)
limit 50
Show the 50 oldest buildings in Apeldoorn.

Geo-3D query

  • Living space
  • Shopping
  • Offices
  • Education
  • Health
  • Sport

Average building values (WOZ)

Querying 4 datasets

Kadaster
BAG
Kamer van Koophandel (KvK)
Bedrijfsvestigingen
Rijksdienst voor het Cultureel Erfgoed (RCE)
Monumentenregister & Beeldbank
Rijksdienst voor Ondernemend Nederland (RVO)
Energielabels

Part IV: Querying the LOD Cloud

Linked Open Data Cloud

Dutch municipality → DBpedia 🕸

prefix brt: <http://brt.basisregistraties.overheid.nl/def/top10nl#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?shape (?vlag as ?shapeLabel) {
  ?place1 brt:isBAGwoonplaats true;
          geo:hasGeometry/geo:asWKT ?shape;
          rdfs:label "Swalmen"^^xsd:string.
  service <https://dbpedia.org/sparql> {
    ?place2 foaf:depiction ?vlag;
            rdfs:label "Swalmen"@nl.
  }
}
limit 1

service <URL> { A } means that graph pattern A is evaluated on a different SPARQL endpoint. The results are received from that endpoint and joined with the results of the rest of the query.
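
A remote endpoint can be unreachable or return an error, which by default makes the whole query fail. SPARQL 1.1 Federated Query defines the silent keyword for this case. The sketch below is a reduced variant of the query above, not the author's original, and only shows where the keyword goes:

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?place2 ?vlag {
  # 'silent' is an addition to the slide's query: errors raised while
  # calling the remote endpoint are ignored instead of failing the query.
  service silent <https://dbpedia.org/sparql> {
    ?place2 foaf:depiction ?vlag;
            rdfs:label "Swalmen"@nl.
  }
}
limit 1
```

If the call fails, the service clause contributes a single solution without bindings, so ?place2 and ?vlag are left unbound.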

Apeldoorn does not have a flag

prefix brt: <http://brt.basisregistraties.overheid.nl/def/top10nl#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?shape (?vlag as ?shapeLabel) {
  ?place1 brt:isBAGwoonplaats true;
          geo:hasGeometry/geo:asWKT ?shape;
          rdfs:label "Apeldoorn"^^xsd:string.
  service <https://dbpedia.org/sparql> {
    ?place2 rdfs:label "Apeldoorn"@nl.
    optional { ?place2 foaf:depiction ?vlag. }
  }
}
limit 1

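When you query the web, not all information is available all the time. optional { A } makes the query resilient against missing information: a result is kept even when A has no match, and the variables that occur only in A are then left unbound.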
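
A self-contained sketch of the same behaviour, using inline data (values) with hypothetical example.org IRIs instead of the remote endpoint. Only one of the two places has a flag.

```sparql
select ?place ?flag {
  # Two places, given as hypothetical inline data.
  values ?place {
    <https://example.org/Swalmen>
    <https://example.org/Apeldoorn>
  }
  # Flag information exists for Swalmen only.
  optional {
    values (?place ?flag) {
      (<https://example.org/Swalmen> <https://example.org/Swalmen.svg>)
    }
  }
}
```

Both places are returned: ?flag is bound for Swalmen and left unbound for Apeldoorn. Without optional, the Apeldoorn row would be dropped.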

Population density

Thank you for your attention!