Why I Use LDF

& Why You Should Do So Too!

May 30th, 2016

Wouter Beek (w.g.j.beek@vu.nl)

Why isn't everybody using the SW all day long today?

Metcalfe's Law

The value of a network is proportional to the square of the number of connected nodes

So... how many connected nodes does the SW have?

Data growth is exponential

SW growth is linear

Allocative Efficiency

What the consumer is willing to pay should equal the marginal cost of production

LOD Laundromat

W. Beek & L. Rietveld & H.R. Bazoobandi & J. Wielemaker & S. Schlobach, “LOD Laundromat: A Uniform Way of Publishing Other People’s Dirty Data”, ISWC 2014

LOD Laundromat stats

  • >650K LDF endpoints
  • Millions of LDF queries each month
  • Hundreds of users
  • All served from disk!

L. Rietveld & R. Verborgh & W. Beek & M. Vander Sande & S. Schlobach, “Linked Data-as-a-Service: The Semantic Web Redeployed”, ESWC 2015

SW layer cake

Alt. SW layer cake

W. Beek & L. Rietveld & S. Schlobach & F. van Harmelen, “LOD Laundromat: Why the Semantic Web Needs Centralization (Even If We Don't Like It)”, IEEE Internet Computing, 20 (2), pp. 78-81, 2016

Why LDF is great

(According to Wouter)

(1/2) Trade-off

  • For many use cases, SPARQL is too difficult & expensive
  • LDF is single-line SPARQL
  • Single-line SPARQL is easy & cheap
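
A single-line SPARQL query amounts to one triple pattern. As a minimal sketch, assuming a hypothetical endpoint URL and the `subject`/`predicate`/`object` parameter names that many LDF servers happen to use (a real client should read them from the server's hypermedia controls), such a request could be built like this:

```python
from urllib.parse import urlencode

# Hypothetical endpoint used for illustration; not a real LDF server.
ENDPOINT = "http://example.org/ldf/dataset"


def triple_pattern_url(endpoint, subject=None, predicate=None, obj=None):
    """Build the request URL for one triple pattern; None means 'variable'."""
    # These parameter names are an assumption, not guaranteed by every server.
    params = {"subject": subject, "predicate": predicate, "object": obj}
    bound = {name: value for name, value in params.items() if value is not None}
    return endpoint + ("?" + urlencode(bound) if bound else "")


# Everything known about London: the pattern <London> ?p ?o.
# The full IRI behind the London resource is assumed here.
url = triple_pattern_url(ENDPOINT, subject="http://dbpedia.org/resource/London")
print(url)
```

The server only ever has to answer one pattern at a time, which is what keeps it easy & cheap.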

(2/2) Honesty

  • SPARQL gives you 10,000 results describing dbp:London
  • SPARQL does not tell you that 10,000 is an arbitrary limit, enforced by the data publisher because SPARQL is too difficult & expensive
  • (Luckily the data publisher did not set the limit to 67!)
  • LDF tells you how many results there are (useful for query optimization)
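
One way a client might use those counts for query optimization is to evaluate the most selective triple pattern first. The sketch below is only an illustration: the patterns and the numbers in `FAKE_COUNTS` are made up and stand in for the totals a server would report.

```python
def order_by_estimated_count(patterns, estimate_count):
    """Return triple patterns sorted from most to least selective."""
    return sorted(patterns, key=estimate_count)


# Made-up counts standing in for the totals reported in LDF responses.
FAKE_COUNTS = {
    ("?city", "rdf:type", "dbo:City"): 20000,
    ("?city", "dbo:country", "dbr:United_Kingdom"): 900,
    ("?city", "rdfs:label", "?label"): 12000000,
}

# Start with the pattern that matches the fewest triples.
plan = order_by_estimated_count(FAKE_COUNTS.keys(), FAKE_COUNTS.get)
for pattern in plan:
    print(FAKE_COUNTS[pattern], pattern)
```

With a silently truncated SPARQL result there is no such number to plan with.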

Current challenges

(1/3) Findability

LDF ≠ dereferencing

LDF decouples naming from locating (like SPARQL)

(2/3) Federation

Will federation scale to millions of endpoints?

(3/3) Hydra

  • More complicated than a ‘dumb’ API
  • But... the benefit rises as more endpoints are queried
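
To make the trade-off concrete: with Hydra-style controls the client reads the URL template and the variable names from each response instead of hard-coding them. The sketch below uses hand-written, heavily simplified stand-ins for such controls (`SERVER_A` and `SERVER_B` are hypothetical, and real responses express this in RDF); it only handles templates of the form `base{?a,b,c}`.

```python
from urllib.parse import quote

# Simplified, hypothetical controls of two servers that name their
# variables differently. "mapping" goes from RDF role to variable name.
SERVER_A = {
    "template": "http://a.example.org/ldf{?subject,predicate,object}",
    "mapping": {"rdf:subject": "subject", "rdf:predicate": "predicate",
                "rdf:object": "object"},
}
SERVER_B = {
    "template": "http://b.example.org/data{?s,p,o}",
    "mapping": {"rdf:subject": "s", "rdf:predicate": "p", "rdf:object": "o"},
}


def expand(controls, bindings):
    """Fill in a form-style URI template such as 'base{?a,b,c}'."""
    base, _, rest = controls["template"].partition("{?")
    variables = rest.rstrip("}").split(",")
    # Translate RDF roles into whatever names this server chose.
    values = {controls["mapping"][role]: value
              for role, value in bindings.items()}
    pairs = [f"{name}={quote(values[name], safe='')}"
             for name in variables if name in values]
    return base + ("?" + "&".join(pairs) if pairs else "")


# The same client code works against both servers.
query = {"rdf:object": "http://dbpedia.org/resource/London"}
urls = [expand(server, query) for server in (SERVER_A, SERVER_B)]
for url in urls:
    print(url)
```

The extra indirection costs more than a 'dumb' API for one endpoint, but it is written once and reused for every further endpoint.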