WDS Data Stewardship Award 2018

November 7th, 2018

Wouter Beek (w.g.j.beek@vu.nl, wouter@triply.cc)

LOD

Linked Open Data

80%

Time data scientists spend on finding & cleaning data.

Most data is:

  • not findable
  • not accessible
  • not interpretable
  • not reusable

LOD Laundromat


http://lodlaundromat.org

Published at DANS

https://doi.org/10.17026/dans-znh-bcg3

>65K datasets, >38B facts

Academic use cases

Reproducible research
  • L. Rietveld, W. Beek & S. Schlobach, “LOD Lab: Experiments at LOD Scale”, ISWC 2015. Best Paper Award.
Large-scale data cleaning
  • W. Beek, F. Ilievski, J. Debattista, S. Schlobach & J. Wielemaker, “Literally better: Analyzing and Improving the Quality of Literals”, Semantic Web Journal 2017.
Semantic search engines
  • F. Ilievski, W. Beek, M. van Erp, L. Rietveld & S. Schlobach, “LOTUS: Adaptive Text Search for Big Linked Data”, ESWC 2016. Best LOD Application Award.
Large-scale querying
  • J. Fernández, W. Beek, M. Martínez-Prieto & M. Arias, “LOD-a-lot: A Queryable Dump of the LOD Cloud”, ISWC 2017.
  • W. Beek, J. Fernández & R. Verborgh, “LOD-a-lot: A Single-file Enabler for Data Science”, 13th Int. Conf. on Semantic Systems 2017.
  • W. Beek, L. Rietveld, S. Schlobach & F. van Harmelen, “LOD Laundromat: Why the Semantic Web Needs Centralization (Even If We Don't Like It)”, IEEE Internet Computing 2016.
  • L. Rietveld, R. Verborgh, W. Beek, M. Vander Sande & S. Schlobach, “Linked Data-as-a-Service: The Semantic Web Redeployed”, ESWC 2015.
Erroneous link detection
  • W. Beek, J. Raad, J. Wielemaker & F. van Harmelen, “sameAs.cc: The Closure of 500M owl:sameAs Statements”, ESWC 2018. Best Resource Paper Award.
  • J. Raad, W. Beek, F. van Harmelen, N. Pernelle & F. Saïs, “Detecting Erroneous Identity Links on the Web using Network Metrics”, ISWC 2018.

‘Barack Obama’ in LOD

But are these links correct?

http://als.dbpedia.org/resource/Barack_Obama
http://am.dbpedia.org/resource/ባራክ_ኦባማ
http://data.nytimes.com/obama_barack_per
http://viaf.org/viaf/52010985
http://yago-knowledge.org/resource/Barack_Obama
http://rdf.freebase.com/ns/m.02mjmr
http://dbpedia.org/resource/Administration_of_Barack_Obama
http://dbpedia.org/resource/Barack_Obama_Cabinet
http://dbpedia.org/resource/Barack_Obama_presidency
http://yago-knowledge.org/resource/Presidency_of_Barack_Obama
http://rdf.freebase.com/ns/m.05b6w1g
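
How IRIs like these end up in one identity set can be illustrated with a small sketch. owl:sameAs is reflexive, symmetric and transitive, so a single link between a "person" IRI and a "presidency" IRI is enough to merge both groups. The code below computes that closure with a union-find structure. It is a minimal illustration only: the function name `sameas_closure` and the specific link pairs are assumptions made up for this example (the slide lists the IRIs, not which pairs are linked), and this is not presented as the sameAs.cc implementation.

```python
from collections import defaultdict


def sameas_closure(links):
    """Group IRIs into identity sets under the reflexive, symmetric and
    transitive closure of the given owl:sameAs pairs (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    sets = defaultdict(set)
    for iri in list(parent):
        sets[find(iri)].add(iri)
    return list(sets.values())


# Hypothetical owl:sameAs pairs over IRIs from the slide (assumed, not real data).
PERSON_ALS = "http://als.dbpedia.org/resource/Barack_Obama"
PERSON_VIAF = "http://viaf.org/viaf/52010985"
PERSON_NYT = "http://data.nytimes.com/obama_barack_per"
PRESIDENCY_DBP = "http://dbpedia.org/resource/Barack_Obama_presidency"
PRESIDENCY_YAGO = "http://yago-knowledge.org/resource/Presidency_of_Barack_Obama"

links = [
    (PERSON_ALS, PERSON_VIAF),
    (PERSON_VIAF, PERSON_NYT),
    (PRESIDENCY_DBP, PRESIDENCY_YAGO),
    (PERSON_NYT, PRESIDENCY_DBP),  # one questionable link merges both groups
]

identity_sets = sameas_closure(links)
for identity_set in identity_sets:
    print(sorted(identity_set))
```

With these assumed links, the person and the presidency land in the same identity set, which is exactly the situation the question "are these links correct?" points at.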

Cluster detection for ‘Barack Obama’

  • person
  • senator
  • president
  • government
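
The idea of splitting one identity set into clusters such as those listed above can be sketched on a toy graph. The code below treats owl:sameAs links as undirected edges, finds bridges (edges whose removal disconnects the graph) and reports the groups that remain once those edges are cut. This is a deliberately simple stand-in chosen for illustration, not the network-metric method of the cited ISWC 2018 paper; the function names and the link pairs are hypothetical.

```python
from collections import defaultdict


def find_bridges(edges):
    """Return the edges whose removal disconnects the undirected graph."""
    graph = defaultdict(set)
    for a, b in edges:
        if a != b:
            graph[a].add(b)
            graph[b].add(a)

    order, low, bridges = {}, {}, []

    def visit(node, parent):
        order[node] = low[node] = len(order)
        for neighbour in graph[node]:
            if neighbour == parent:
                continue
            if neighbour in order:
                low[node] = min(low[node], order[neighbour])
            else:
                visit(neighbour, node)
                low[node] = min(low[node], low[neighbour])
                if low[neighbour] > order[node]:
                    bridges.append((node, neighbour))

    for node in list(graph):
        if node not in order:
            visit(node, None)
    return bridges


def split_on_bridges(edges):
    """Connected components that remain after all bridge edges are cut."""
    cut = {frozenset(edge) for edge in find_bridges(edges)}
    graph = {node: set() for edge in edges for node in edge}
    for a, b in edges:
        if frozenset((a, b)) not in cut:
            graph[a].add(b)
            graph[b].add(a)

    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        cluster, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in cluster:
                cluster.add(node)
                stack.extend(graph[node] - cluster)
        seen |= cluster
        clusters.append(cluster)
    return clusters


# Hypothetical links: two densely linked groups joined by a single edge.
ALS = "http://als.dbpedia.org/resource/Barack_Obama"
VIAF = "http://viaf.org/viaf/52010985"
NYT = "http://data.nytimes.com/obama_barack_per"
PRES = "http://dbpedia.org/resource/Barack_Obama_presidency"
YAGO_PRES = "http://yago-knowledge.org/resource/Presidency_of_Barack_Obama"
ADMIN = "http://dbpedia.org/resource/Administration_of_Barack_Obama"

edges = [
    (ALS, VIAF), (VIAF, NYT), (NYT, ALS),            # "person" group
    (PRES, YAGO_PRES), (YAGO_PRES, ADMIN), (ADMIN, PRES),  # "presidency" group
    (NYT, PRES),                                     # suspect link between them
]

bridges = find_bridges(edges)
clusters = split_on_bridges(edges)
print("candidate erroneous links:", bridges)
print("clusters:", [sorted(cluster) for cluster in clusters])
```

On this toy input the only bridge is the link between the two groups, so cutting it yields one cluster for the person and one for the presidency. Real identity sets are far larger and noisier, which is why a single structural rule like this should be read as an intuition aid rather than a detection method.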

Commercial use case

https://demo.triply.cc

Triply Cloud (https://demo.triply.cc)

Triply FAIR Paper

Triply Linked Data Browser

Thank you for your attention!