Linked Data

Wouter Beek

January 24th, 2018


The first query

select * {
  ?s ?p ?o .
}
limit 10
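To make the idea of a triple pattern concrete, here is a minimal Python sketch, not tied to any real triple store. It treats a graph as a list of (subject, predicate, object) tuples and uses None where the query uses a variable. The data and the `ex:` identifiers are made up for illustration.

```python
# Toy in-memory graph: each statement is a (subject, predicate, object) tuple.
# All identifiers below are hypothetical.
TRIPLES = [
    ("ex:pathway1", "rdfs:label", "Example pathway"),
    ("ex:pathway1", "ex:organismName", "Homo sapiens"),
    ("ex:gene1", "ex:isPartOf", "ex:pathway1"),
]


def match(pattern, triples):
    """Return the triples that fit a pattern; None acts as a variable."""
    return [
        triple
        for triple in triples
        if all(p is None or p == v for p, v in zip(pattern, triple))
    ]


# Same spirit as `select * { ?s ?p ?o } limit 10`: every position is a variable.
first_ten = match((None, None, None), TRIPLES)[:10]
for s, p, o in first_ten:
    print(s, p, o)
```

Fixing one position, for example `match((None, "rdfs:label", None), TRIPLES)`, narrows the result to the statements that use that predicate.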

Search by label

prefix wp: <>
prefix xsd: <>
select ?s {
  ?s wp:organismName "Homo sapiens"^^xsd:string .
}
limit 10

Property path

prefix dct: <>
prefix wp: <>
prefix xsd: <>
select ?s {
  ?s dct:isPartOf/wp:organismName "Homo sapiens"^^xsd:string .
}
limit 10
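A sequence path such as `dct:isPartOf/wp:organismName` can be read as two hops chained together. The Python sketch below illustrates that reading on made-up data. The `ex:` identifiers and the helper names `follow` and `sequence_path` are assumptions for illustration only, not part of any SPARQL engine.

```python
# Hypothetical toy data: two genes, each part of one pathway.
TRIPLES = [
    ("ex:gene1", "dct:isPartOf", "ex:pathway1"),
    ("ex:gene2", "dct:isPartOf", "ex:pathway2"),
    ("ex:pathway1", "wp:organismName", "Homo sapiens"),
    ("ex:pathway2", "wp:organismName", "Mus musculus"),
]


def follow(start_nodes, predicate, triples):
    """One hop: every object reachable from start_nodes via predicate."""
    return {o for s, p, o in triples if p == predicate and s in start_nodes}


def sequence_path(subject, predicates, triples):
    """Evaluate p1/p2/... from one subject by chaining single hops."""
    frontier = {subject}
    for predicate in predicates:
        frontier = follow(frontier, predicate, triples)
    return frontier


subjects = sorted({s for s, _, _ in TRIPLES})
humans = [
    s
    for s in subjects
    if "Homo sapiens"
    in sequence_path(s, ["dct:isPartOf", "wp:organismName"], TRIPLES)
]
print(humans)
```

The intermediate node (the pathway) never gets a variable name, which is what the `/` operator saves you from writing.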

Match any graph structure

prefix dct: <>
prefix obo: <>
prefix path: <>
prefix rdfs: <>
prefix wp: <>
select * {
  ?interaction dct:isPartOf ?organism ;
               wp:source ?source ;
               wp:target ?target .
  ?source rdfs:label ?sourceName .
  ?target rdfs:label ?targetName ;
          wp:bdbWikidata ?wikidata .
  ?organism wp:organism obo:NCBITaxon_9606 ;
            wp:organismName ?organismName .
}
limit 1
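Matching a graph structure amounts to finding variable bindings that satisfy all triple patterns at once. The following Python sketch shows a naive way to do this by extending bindings one pattern at a time. It is a simplification for illustration, not how a production SPARQL engine plans queries; the data and the function names `unify` and `solve` are hypothetical.

```python
# Hypothetical toy data: interaction i1 has a source and a target, i2 only a source.
TRIPLES = [
    ("ex:i1", "wp:source", "ex:a"),
    ("ex:i1", "wp:target", "ex:b"),
    ("ex:a", "rdfs:label", "A"),
    ("ex:b", "rdfs:label", "B"),
    ("ex:i2", "wp:source", "ex:b"),
]


def unify(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def solve(patterns, triples):
    """Return every binding that satisfies all patterns (nested-loop join)."""
    bindings = [{}]
    for pattern in patterns:
        extended_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = unify(pattern, triple, binding)
                if extended is not None:
                    extended_bindings.append(extended)
        bindings = extended_bindings
    return bindings


solutions = solve(
    [
        ("?interaction", "wp:source", "?source"),
        ("?interaction", "wp:target", "?target"),
        ("?source", "rdfs:label", "?sourceName"),
        ("?target", "rdfs:label", "?targetName"),
    ],
    TRIPLES,
)
print(solutions)
```

Only `ex:i1` survives, because `ex:i2` has no `wp:target` statement; shared variables are what tie the patterns into one structure.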

Link to Wikidata

prefix direct: <>
prefix rdfs: <>
prefix wikidata: <>
select * {
  wikidata:Q27102800 direct:P274 ?molecule ;
                     direct:P2067 ?mass ;
                     rdfs:label ?label .
}
limit 10

Aggregate query

prefix dct: <>
prefix rdfs: <>
select ?s ?label (count(?article) as ?n) {
  ?s dct:bibliographicCitation ?article ;
     rdfs:label ?label .
}
group by ?s ?label
order by desc(?n)
limit 10
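The group by / count / order by / limit pipeline can be mimicked in a few lines of Python, which may help when reading the query. This is a sketch on made-up data; the `ex:` identifiers and labels are hypothetical.

```python
from collections import Counter

# Hypothetical (subject, article) pairs, standing in for
# `?s dct:bibliographicCitation ?article`.
CITATIONS = [
    ("ex:p1", "ex:article1"),
    ("ex:p1", "ex:article2"),
    ("ex:p2", "ex:article3"),
    ("ex:p1", "ex:article3"),
]
# Hypothetical labels, standing in for `?s rdfs:label ?label`.
LABELS = {"ex:p1": "Pathway one", "ex:p2": "Pathway two"}

# group by ?s ?label, count(?article)
counts = Counter((s, LABELS[s]) for s, _article in CITATIONS)

# order by desc(?n) limit 10
top = counts.most_common(10)
for (s, label), n in top:
    print(s, label, n)
```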

What is LOD?

WWW (Web of documents)

LOD (Web of data)

The 5 stars of LOD

Source: Tim Berners-Lee

Let's look at some Linked Open Data

What is the LOD Cloud?

Who uses Linked Data? (1/2)

Who uses Linked Data? (2/2)

  • 20M web sites
  • 35% of pages in search index
  • 50% of US/EU eCommerce emails
  • 800B small graphs of ~25 statements
Source: A.W. Moore & R.V. Guha, Google Research

LOD Cloud: 2014

LOD Cloud: 2017

LOD Laundromat

L. Rietveld, W. Beek, S. Schlobach, “LOD Lab: Experiments at LOD Scale”, International Semantic Web Conference, 2015 (Best Paper Award).
Best Linked Open Data Application, 2015

F. Ilievski, W. Beek, M. van Erp, L. Rietveld, S. Schlobach, “LOTUS: Adaptive Text Search for Big Linked Data”, ESWC 2016.

Nice, but expensive to scale to the full web.

Scalable Knowledge Graph Solutions

Query a Large Knowledge Graph Using Commodity Hardware

Lowering the cost of access

  • Store Linked Data in 1 file
  • 28,362,198,927 triples (>650K data documents)
  • €305,- hardware cost (524GB disk; 15.7GB RAM)
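The figures above can be turned into per-triple numbers with a little arithmetic. The sketch below assumes that the 524GB disk figure is the space the data occupies and that GB means 10^9 bytes; neither is stated on the slide. Under those assumptions it comes out at roughly 18 bytes per triple and about 11 euros of hardware per billion triples.

```python
# Figures taken from the slide.
TRIPLES = 28_362_198_927
COST_EUR = 305

# Assumption: 524GB is the data footprint, in decimal gigabytes.
DISK_BYTES = 524 * 10**9

bytes_per_triple = DISK_BYTES / TRIPLES
eur_per_billion_triples = COST_EUR / (TRIPLES / 10**9)

print(f"{bytes_per_triple:.1f} bytes per triple")
print(f"{eur_per_billion_triples:.2f} EUR per billion triples")
```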

J.D. Fernández, W. Beek, M.A. Martínez-Prieto, M. Arias, “LOD-a-lot”, International Semantic Web Conference, 2017

Up to 1K HDT files

Header Dictionary Triples (HDT)

Fernández, Martínez-Prieto, Gutiérrez, “Binary RDF representation for publication and exchange (HDT)”, ISWC, 2013.
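The name describes three parts: a header with metadata, a dictionary that maps each term to an integer, and the triples rewritten as integer IDs. The Python sketch below illustrates only that basic idea on made-up data. It is a strong simplification: the actual HDT format uses a more elaborate dictionary layout and a far more compact triple encoding, and the function names `encode` and `decode` are hypothetical.

```python
# Hypothetical toy data.
TRIPLES = [
    ("ex:obama", "rdf:type", "ex:Person"),
    ("ex:obama", "rdfs:label", "Barack Obama"),
    ("ex:biden", "rdf:type", "ex:Person"),
]


def encode(triples):
    """Split triples into a header, a term dictionary, and ID triples."""
    # Dictionary: every distinct term is stored once, in sorted order.
    terms = sorted({term for triple in triples for term in triple})
    term_to_id = {term: i for i, term in enumerate(terms, start=1)}
    # Triples: the same statements, written as small integers and sorted.
    id_triples = sorted(
        tuple(term_to_id[term] for term in triple) for triple in triples
    )
    # Header: simple metadata about the data set.
    header = {"triples": len(id_triples), "terms": len(terms)}
    return header, terms, id_triples


def decode(terms, id_triples):
    """Turn ID triples back into term triples using the dictionary."""
    return [tuple(terms[i - 1] for i in triple) for triple in id_triples]


header, dictionary, id_triples = encode(TRIPLES)
print(header)
print(id_triples)
```

Repeated terms such as `ex:obama` and `rdf:type` are stored once in the dictionary, which is where much of the space saving in this kind of encoding comes from.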

Empirical Semantics

(Measuring Meaning)

‘Barack Obama’ in the LOD Cloud

owl:sameAs links
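Because owl:sameAs is symmetric and transitive, a set of links partitions names into groups that all denote the same thing. One common way to compute those groups is a union-find structure, sketched below in Python. The identifiers are made up, and this is an illustration of the general technique, not necessarily the method used for the results in this talk.

```python
def identity_sets(same_as_pairs):
    """Group names connected by sameAs links (union-find with path halving)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted((sorted(group) for group in groups.values()), key=lambda g: g[0])


# Hypothetical links between names from three imaginary data sets a, b, and c.
LINKS = [
    ("a:Obama", "b:Obama"),
    ("b:Obama", "c:Obama"),
    ("a:Biden", "b:Biden"),
]
sets = identity_sets(LINKS)
print(sets)
```

A single wrong link merges two groups entirely, which is why cleaning these links matters.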

Data cleaning (owl:sameAs)

ባራክ_ኦባማ

‘Barack Obama’ after community detection

purple: person; orange: government; green: president; blue: senator

Thank you!