Querying Knowledge Graphs

Wouter Beek (wouter@triply.cc)

Core conceptual content
Core technical content

Graph → Table

SELECT queries

SELECT query

  • RDF data is stored in a graph.
  • A SELECT query creates a tabular view by matching a pattern against the graph data.

Components of a SELECT query

Projection
Specifies the columns of the table.
Pattern
Specifies how the cells of the table are filled.
Modifier
Specifies additional operations over the table.

SELECT queries

Triple Patterns

Our first SELECT query

select ?s ?p ?o {
  ?s ?p ?o
}
limit 25
Projection (columns)
select ?s ?p ?o
Pattern (graph match)
{ ?s ?p ?o }
Modifier
limit 25

Table of Pokémon + image links

select ?pokemon ?image {
  ?pokemon <http://xmlns.com/foaf/0.1/depiction> ?image
}
limit 25
<http://xmlns.com/foaf/0.1/depiction>
Match specific arcs in the graph.
?pokemon and ?image
Use descriptive names for variables.

Abbreviated IRI notation

prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix id: <https://triply.cc/academy/pokemon/id/pokemon/>
select ?pokemon ?image {
  ?pokemon foaf:depiction ?image
}
limit 25
Abbreviated query notation
Allows foaf:depiction to be written instead of <http://xmlns.com/foaf/0.1/depiction>.
Abbreviated result notation
Display id:flareon instead of <https://triply.cc/academy/pokemon/id/pokemon/flareon>.

Invert the projection

prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix id: <https://triply.cc/academy/pokemon/id/pokemon/>
select ?image ?pokemon {
  ?pokemon foaf:depiction ?image
}
limit 25
?image
The first column contains the image links.
?pokemon
The second column contains the Pokémon.

Change the projection

prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix id: <https://triply.cc/academy/pokemon/id/pokemon/>
select ?image {
  ?pokemon foaf:depiction ?image
}
limit 25
?image
Only return the column for image links.
?pokemon
A hidden variable: one whose bindings are not returned.

The generic projection

prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix id: <https://triply.cc/academy/pokemon/id/pokemon/>
select * {
  ?pokemon foaf:depiction ?image
}
limit 25
select *
Return columns for all variables. Columns appear in unspecified order.

Introduce a variable

prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix id: <https://triply.cc/academy/pokemon/id/pokemon/>
select * {
  ?pokemon foaf:depiction ?image
  bind('Hi!' as ?widget)
}
limit 25
bind(VALUE as VARIABLE)
Add a column with values that are not matched in the graph.

HTML template

prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix id: <https://triply.cc/academy/pokemon/id/pokemon/>
select * {
  ?pokemon foaf:depiction ?image
  bind('<img src="{{image}}">' as ?widget)
}
limit 25
{{VARIABLE}}
Insert the value bound to ?VARIABLE into the template string.
Triply-specific feature
The query is still standards-compliant SPARQL 1.1, but other SPARQL editors return the template string as-is, without performing the HTML templating.
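
A portable alternative, as a minimal sketch: the same widget string can be assembled with the standard concat and str functions, so that no template expansion is needed. Whether the resulting string is rendered as HTML still depends on the editor; that part is an assumption about the tool, not about SPARQL.

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>
select * {
  ?pokemon foaf:depiction ?image
  # str(?image) turns the image IRI (or literal) into a plain string,
  # which concat then wraps in an <img> element.
  bind(concat('<img src="', str(?image), '">') as ?widget)
}
limit 25
```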

Change the number of results

prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix id: <https://triply.cc/academy/pokemon/id/pokemon/>
select * {
  ?pokemon foaf:depiction ?image
  bind('<img src="{{image}}">' as ?widget)
}
limit 250
limit 250
Return at most 250 rows.

Add an offset

prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix id: <https://triply.cc/academy/pokemon/id/pokemon/>
select ?pokemon ?image {
  ?pokemon foaf:depiction ?image
}
limit 25
offset 250
offset 250
Skip the first 250 rows; with limit 25, return the 251st through the 275th row.
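
Limit and offset together give paging. The sketch below asks for page 3 at 25 rows per page, so it skips 2 × 25 = 50 rows. It adds an order by clause (introduced later in this deck) because without a fixed order an endpoint may return rows in a different order on each request, and pages could then overlap.

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>
select ?pokemon ?image {
  ?pokemon foaf:depiction ?image
}
# A stable sort order keeps the pages consistent between requests.
order by ?pokemon
limit 25
offset 50
```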

Summary

Construct    Purpose                       Examples
Prefix       Abbreviate syntax             prefix ex: <https://example.com/>
Projection   Select columns                select ?x ?y
                                           select *
Pattern      Match cell values             { ?s ?p ?o }
                                           { ?s ex:p ?o }
Binding      Introduce new variables       bind('Hi!' as ?widget)
Template     Return HTML widgets           bind('<img src="{{image}}">' as ?widget)
Limit        Set a maximum number of rows  limit 10
Offset       Skip a number of rows         offset 10

SELECT queries

Graph patterns

Graph Pattern: 1 Triple Pattern

select * {
  ?pokemon foaf:depiction ?image
}
limit 25
  • Graph patterns contain zero or more Triple Patterns.
  • We leave out prefix declarations from now on.
  • We leave out widget bindings (bind) from now on.

Graph Pattern: 2 Triple Patterns

select * {
  ?pokemon foaf:depiction ?image.
  ?pokemon pokemon:cry ?cry.
}
limit 25
?pokemon
A shared variable connects two or more triple patterns.
. (dot)
Marks the end of a Triple Pattern. (The dot after the last Triple Pattern is optional.)

Graph Pattern: 3 Triple Patterns

select * {
  ?pokemon foaf:depiction ?image.
  ?pokemon pokemon:cry ?cry.
  ?pokemon rdfs:label ?name.
}
limit 25

Vocabularies:

Friend of a Friend (FOAF)
foaf:depiction
RDF Schema (RDFS)
rdfs:label

Graph Pattern: Abbreviated notation

select * {
  ?pokemon foaf:depiction ?image;
           pokemon:cry ?cry;
           rdfs:label ?name.
}
limit 25
; (semi-colon)
Repeat the previous subject term.
, (comma)
Repeat the previous subject and predicate terms.

Multiple values

select * {
  ?pokemon foaf:depiction ?image;
           pokemon:cry ?cry;
           pokemon:name ?name1, ?name2.
  filter(?name1 != ?name2)
}
limit 25
, (comma)
Repeat the previous subject and predicate terms.
filter( … )
A non-graph restriction that is added to the pattern.
X != Y
X and Y must not be the same.

Filter by language

select * {
  ?pokemon foaf:depiction ?image;
           pokemon:cry ?cry;
           pokemon:name ?name1, ?name2.
  filter(lang(?name1) = 'ja-ja' && lang(?name2) = 'en-us')
}
limit 25
lang(…)
Returns the language of a language-tagged string.
filter( A && B )
Both condition A and condition B must hold.
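
A hedged variation: comparing lang(…) with = only matches the exact tag as it is stored in the data. The standard langMatches function compares case-insensitively and treats its second argument as a language range. The sketch reuses pokemon:name from the query above and, like the surrounding slides, leaves out prefix declarations.

```sparql
select * {
  ?pokemon pokemon:name ?name.
  # The range 'en' accepts 'en' as well as regional tags such as 'en-us' or 'en-GB'.
  filter(langMatches(lang(?name), 'en'))
}
limit 25
```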

Graph Pattern: 5 Triple Patterns

select * {
  ?pokemon foaf:depiction ?image;
           pokemon:cry ?cry;
           rdfs:label ?name;
           pokemon:type ?type.
  ?type rdfs:label ?typeName.
}
limit 25

Property Path

select * {
  ?pokemon foaf:depiction ?image;
           pokemon:cry ?cry;
           rdfs:label ?name;
           pokemon:type/rdfs:label ?typeName.
}
limit 25
P/Q
Sequence: first follow P, then follow Q.
P|Q
Choice: follow P or follow Q.
P+
Follow P one or more times.
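
The query above only uses the sequence operator. As a small sketch of the choice operator, the pattern below accepts a name from either of two properties that appear earlier in this deck. Whether both properties are present for every Pokémon is an assumption about the dataset; prefix declarations are again left out.

```sparql
select * {
  # Choice: ?name is matched through rdfs:label or through pokemon:name.
  ?pokemon rdfs:label|pokemon:name ?name.
}
limit 25
```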

Making the query more specific

select * {
  ?pokemon foaf:depiction ?image;
           pokemon:cry ?cry;
           rdfs:label ?name;
           pokemon:type/rdfs:label "Dragon Type"^^xsd:string.
}
limit 25
Instantiating a variable makes the query more specific.

Sort rows

select * {
  ?pokemon foaf:depiction ?image;
           rdfs:label ?name;
           pokemon:happiness ?happiness.
}
order by ?happiness
limit 25
order by ?x
Sorts rows from least happy to most happy Pokémon.
order by ?x ?y ?z
It is possible to sort by multiple criteria.

Inversely sort rows

select * {
  ?pokemon foaf:depiction ?image;
           rdfs:label ?name;
           pokemon:happiness ?happiness.
}
order by desc(?happiness)
limit 25
order by desc(?x)
Inversely sort rows (descending).
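
Sorting by multiple criteria can be combined with desc(…). A minimal sketch, using the same properties as the query above: the happiest Pokémon come first, and Pokémon with equal happiness are sorted by name in ascending order.

```sparql
select * {
  ?pokemon foaf:depiction ?image;
           rdfs:label ?name;
           pokemon:happiness ?happiness.
}
# First criterion: happiness, descending. Second criterion (tie-breaker): name, ascending.
order by desc(?happiness) ?name
limit 25
```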

GeoSPARQL

GeoSPARQL: Features & Shapes

select * {
  ?feature geo:hasGeometry/geo:asWKT ?shape;
           rdfs:label ?shapeLabel
}
limit 1
geo:hasGeometry and geo:asWKT
GeoSPARQL, standardized by the Open Geospatial Consortium (OGC).
?xLabel
Popup for the shape bound to ?x.
?xColor
Color of the shape bound to ?x.

Anonymous nodes

select * {
  [ geo:hasGeometry/geo:asWKT ?shape;
    rdfs:label ?shapeLabel
  ]
}
limit 1
[ … ]
Anonymous node: a subject that is written as neither an IRI nor a named variable. In a pattern it behaves like a variable whose bindings are not returned.
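
For comparison, a sketch of the same pattern written with a named variable. The variable name ?feature is taken from the previous slide; it is kept out of the projection, so the result table has the same two columns as the anonymous-node version.

```sparql
select ?shape ?shapeLabel {
  # ?feature plays the role of the anonymous node [ … ]: it is matched but not returned.
  ?feature geo:hasGeometry/geo:asWKT ?shape;
           rdfs:label ?shapeLabel
}
limit 1
```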

Find a building

select * {
  service <https://data.pdok.nl/sparql> {
    [
      bag:hoofdadres [
        bag:bijbehorendeOpenbareRuimte [
          bag:bijbehorendeWoonplaats/rdfs:label "Amsterdam"@nl;
          bag:naamOpenbareRuimte "De Boelelaan"
        ];
        bag:huisnummer 1105;
        bag:postcode "1081HV"
      ];
      bag:pandrelatering/geo:hasGeometry/geo:asWKT ?shape
    ]
  }
}
limit 1
service <URL> { … }
Run the (partial) query on a different endpoint.

Data from the Dutch Base Registry of Addresses and Buildings (BAG), published by the Cadastre (Kadaster).

SPARQL ↔ JSON

{
  "hoofdadres": {
    "bijbehorendeOpenbareRuimte": {
      "bijbehorendeWoonplaats": { "label": "Amsterdam" },
      "naamOpenbareRuimte": "De Boelelaan"
    },
    "huisnummer": 1105,
    "postcode": "1081HV"
  }
}
[
  bag:hoofdadres [
    bag:bijbehorendeOpenbareRuimte [
      bag:bijbehorendeWoonplaats/rdfs:label "Amsterdam"@nl;
      bag:naamOpenbareRuimte "De Boelelaan"
    ];
    bag:huisnummer 1105;
    bag:postcode "1081HV"
  ];
]

Oldest buildings

select * {
  [ bag:hoofdadres/bag:bijbehorendeOpenbareRuimte/bag:bijbehorendeWoonplaats/bag:naamWoonplaats "Apeldoorn";
    bag:pandrelatering [
      bag:oorspronkelijkBouwjaar ?shapeLabel;
      bag:geometriePand/geo:asWKT ?shape
    ]
  ]
}
order by ?shapeLabel
limit 50

Show the 50 oldest buildings in Apeldoorn.

GeoSPARQL 3D

  • Living space
  • Shopping
  • Offices
  • Education
  • Health
  • Sport
3D shapes
Supported by GeoSPARQL, but not very common (yet).
?xHeight
The height by which the 2D shape bound to ?x is extruded.

Federated Querying

Linked Open Data Cloud

Dutch municipality → DBpedia 🕸

select ?shape ?shapeLabel {
  ?place1 brt:isBAGwoonplaats true;
          geo:hasGeometry/geo:asWKT ?shape;
          rdfs:label "Swalmen"^^xsd:string.
  service <https://dbpedia.org/sparql> {
    ?place2 foaf:depiction ?vlag;
            rdfs:label "Swalmen"@nl.
  }
}
limit 1

service <URL> { A } means that pattern A is executed on a different SPARQL endpoint. The results are received from that endpoint and integrated into the overall query results.

Apeldoorn does not have a flag

select ?shape ?shapeLabel {
  ?place1 brt:isBAGwoonplaats true;
          geo:hasGeometry/geo:asWKT ?shape;
          rdfs:label "Apeldoorn"^^xsd:string.
  service <https://dbpedia.org/sparql> {
    ?place2 rdfs:label "Apeldoorn"@nl.
    optional { ?place2 foaf:depiction ?vlag. }
  }
}
limit 1

When you query the web, not all information is available all the time. optional { A } keeps a row in the results even when pattern A has no match, which makes the query resilient against missing information.
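
A hedged sketch of how a missing optional value can be replaced by a default, using the standard coalesce function. It is written to run directly on the DBpedia endpoint rather than through service; the variable names ?place, ?image and ?flag and the fallback text are illustrative choices, and the presence of the Dutch label in DBpedia is an assumption.

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?place ?flag {
  ?place rdfs:label "Apeldoorn"@nl.
  optional { ?place foaf:depiction ?image. }
  # If ?image is unbound, str(?image) raises an error and coalesce
  # falls through to the fallback string.
  bind(coalesce(str(?image), 'no image found') as ?flag)
}
limit 1
```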

Aggregates

Concatenate all values

select
  (group_concat(concat(str(?name),' (',lang(?name),')');separator='</li><li>') as ?names)
  (concat('<ol><li>',?names,'</li></ol><img src="',?image,'">') as ?widget)
{
  ?pokemon foaf:depiction ?image;
           pokemon:cry ?cry;
           pokemon:name ?name.
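}
group by ?pokemon ?image
limit 25
concat(…)
Concatenate all arguments into one new string value.
group_concat(…;separator=…)
Concatenate all bindings within a group, interspersed with separators, into one new string value.
group by ?pokemon ?image
Form one group (one row) per Pokémon and image, so that ?image may be used outside an aggregate.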

Aggregate sum

select ?id (sum(?employees) as ?totalEmployees) (sample(?name) as ?companyName) {
  ?record kvk:kvknummer ?id;
          kvk:plaats "Zwolle";
          schema:legalName ?name;
          schema:numberOfEmployees ?employees
}
group by ?id
order by desc(?totalEmployees)
limit 25

Using Dutch Chamber of Commerce data.

group by ?id
Form one group per company (?id); each aggregate is computed per group.
(sum(?employees) as ?totalEmployees)
Sum the number of employees per company (?id). The result needs a new variable name, because ?employees is already used in the pattern.
(sample(?name) as ?companyName)
Since a company (?id) can have multiple names (?name), select one arbitrarily.
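
Other aggregates follow the same shape. As a sketch under the assumption that one company number can occur on several records, the query below counts the records per company and uses having to keep only companies with more than one record. The variable name ?records is illustrative; the kvk: properties are the ones used above, and prefix declarations are left out.

```sparql
select ?id (count(?record) as ?records) {
  ?record kvk:kvknummer ?id;
          kvk:plaats "Zwolle".
}
# One group per company number; count is computed within each group.
group by ?id
# having filters groups, in the way that filter restricts individual matches.
having (count(?record) > 1)
order by desc(?records)
limit 25
```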

Examples

Average building values (WOZ)

Zwolle in 3D

Kadaster
BAG
Kamer van Koophandel (KvK)
Bedrijfsvestigingen
Rijksdienst voor het Cultureel Erfgoed (RCE)
Monumentenregister & Beeldbank
Rijksdienst voor Ondernemend Nederland (RVO)
Energielabels

Thank you for your attention!