Meaning on the Web


Wouter Beek w.g.j.beek@vu.nl

Stefan Schlobach k.s.schlobach@vu.nl

April 24th, 2018

APIs:

Similarity:

sim(C,D) := |CExt(C) ∩ CExt(D)| / |CExt(C) ∪ CExt(D)|
CExt(C) := {I(x) ∈ IR |〈I(x),I(C)〉∈ Ext(I(rdf:type))}

Instances of ‘dbo:Book’

№ statements: https://hdt.lod.labs.vu.nl/triple/count?p=rdf:type&o=dbo:Book

Statements: https://hdt.lod.labs.vu.nl/triple?p=rdf:type&o=dbo:Book

Instances of ‘schema:Book’

№ statements: https://hdt.lod.labs.vu.nl/triple/count?p=rdf:type&o=schema:Book

Statements: https://hdt.lod.labs.vu.nl/triple?p=rdf:type&o=schema:Book
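The four URLs above follow one pattern: a base path, an optional /count suffix, and a triple pattern given as query parameters. A minimal sketch that only builds these URLs (it does not call the service, and the helper name triple_url is hypothetical):

```python
from urllib.parse import urlencode

# Base path taken from the URLs listed above.
BASE = "https://hdt.lod.labs.vu.nl/triple"


def triple_url(count=False, **pattern):
    """Build a triple-pattern URL; `pattern` holds query parameters such as p and o."""
    path = BASE + ("/count" if count else "")
    # Keep ':' unescaped so prefixed names such as rdf:type stay readable.
    return path + "?" + urlencode(pattern, safe=":")


for cls in ("dbo:Book", "schema:Book"):
    print(triple_url(count=True, p="rdf:type", o=cls))
    print(triple_url(p="rdf:type", o=cls))
```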

Part I: Formal Meaning

Truth-conditional Semantics

The meaning of a sentence can be reduced to the truth conditions of that sentence.

4.024 To understand a proposition means to know what is the case if it is true. (One can understand it, therefore, without knowing whether it is true.) [Tractatus Logico-Philosophicus]

Model theory

A formal theory that implements a truth-conditional semantics, by equating the meaning of a sentence to the set of models/interpretations that make the sentence true.

Models/interpretations are commonly expressed in set theory.

Example

The Louvre owns the Mona Lisa.

True in model I₁:

  • I₁(‘The Louvre’) = a
  • I₁(‘Mona Lisa’) = b
  • I₁(‘owns’) = {〈a,b〉}

False in model I₂:

  • I₂(‘The Louvre’) = a
  • I₂(‘Mona Lisa’) = b
  • I₂(‘owns’) = {〈b,a〉}

The meaning of the sentence is the collection of all and only those interpretations that make the sentence true, i.e., {I₁,…}.
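The example can be replayed mechanically. The sketch below assumes an interpretation is a plain dictionary that maps names to objects and the predicate to a set of pairs; the function name is_true is illustrative only.

```python
def is_true(interp, subject, predicate, obj):
    """A sentence holds iff the pair of denotations is in the predicate's relation."""
    return (interp[subject], interp[obj]) in interp[predicate]


# The two interpretations from the example.
I1 = {"The Louvre": "a", "Mona Lisa": "b", "owns": {("a", "b")}}
I2 = {"The Louvre": "a", "Mona Lisa": "b", "owns": {("b", "a")}}

sentence = ("The Louvre", "owns", "Mona Lisa")

# The meaning of the sentence, restricted to the interpretations at hand:
# the names of those that make it true.
models = [name for name, interp in (("I1", I1), ("I2", I2))
          if is_true(interp, *sentence)]
print(models)  # ['I1']
```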

Open World Assumption (OWA)

  • I(‘The Louvre’) = a
  • I(‘Mona Lisa’) = b
  • I(‘owns’) = {〈a,b〉}
  • I(‘exhibits’) = {〈a,b〉}

Owning and exhibiting have the same model-theoretic meaning.

However, there can be assertions that we have not yet seen or that have not yet been asserted (OWA). E.g., assertions that describe a museum exhibiting an artwork that it does not own.

RDF has 2 interpretation functions

  • Intension function (I): maps RDF terms to resources.
  • Extension function (Ext): maps properties to extensions. (Properties are resources.)

〈s,p,o〉 is true iff 〈I(s),I(o)〉 ∈ Ext(I(p))

Example

‘owning’ and ‘exhibiting’ have the same extension, but they are not the same property.

  • I(‘The Louvre’) = l
  • I(‘Mona Lisa’) = m
  • I(‘owns’) = o
  • I(‘exhibits’) = e
  • Ext(o) = Ext(e) = {〈l,m〉}

Due to the OWA, in RDF there is more to meaning than truth conditions alone!
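A minimal sketch of the two-function setup, assuming resources are represented by short strings and both functions by dictionaries. It shows that equal extensions do not make two properties the same resource.

```python
# Intension function: RDF terms -> resources.
I = {"The Louvre": "l", "Mona Lisa": "m", "owns": "o", "exhibits": "e"}

# Extension function: property resources -> sets of resource pairs.
Ext = {"o": {("l", "m")}, "e": {("l", "m")}}


def is_true(s, p, o):
    """<s,p,o> is true iff <I(s),I(o)> is in Ext(I(p))."""
    return (I[s], I[o]) in Ext[I[p]]


same_extension = Ext[I["owns"]] == Ext[I["exhibits"]]
same_property = I["owns"] == I["exhibits"]

print(is_true("The Louvre", "owns", "Mona Lisa"))      # True
print(is_true("The Louvre", "exhibits", "Mona Lisa"))  # True
print(same_extension, same_property)                   # True False
```

Adding a pair to Ext["e"] alone (a museum exhibiting a work it does not own) would separate the two extensions without touching I.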

Special cases (1/2)

rdfs:Class rdf:type rdfs:Class .
  • I(rdfs:Class) = 🐕
  • I(rdf:type) = 🐘
  • Ext(🐘) = {〈🐕,🐕〉,…}

This would not be possible with only an extension function.

Axiom of regularity: ∀x (x≠∅ → ∃y∈x (x∩y=∅))

Special cases (2/2)

owl:sameAs owl:sameAs owl:sameAs .
  • I(owl:sameAs) = ~
  • Ext(~) = {〈~,~〉,…}
〈owl:sameAs,owl:sameAs,owl:sameAs〉 is true iff
〈I(owl:sameAs),I(owl:sameAs)〉 ∈ Ext(I(owl:sameAs))

(Notice that an RDF graph is a special kind of graph, i.e., vertices and edges overlap.)
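Both special cases can be checked with the same truth condition. In this sketch the resources are plain atoms (the strings "C", "T" and "S" stand in for the symbols used above), so no set ever has to contain itself: self-application goes through Ext.

```python
# Intension function: terms -> atomic resources.
I = {"rdfs:Class": "C", "rdf:type": "T", "owl:sameAs": "S"}

# Extension function: a property resource may occur inside its own extension,
# because the resource itself is an atom, not a set.
Ext = {
    "T": {("C", "C")},  # rdfs:Class rdf:type rdfs:Class .
    "S": {("S", "S")},  # owl:sameAs owl:sameAs owl:sameAs .
}


def is_true(s, p, o):
    """<s,p,o> is true iff <I(s),I(o)> is in Ext(I(p))."""
    return (I[s], I[o]) in Ext[I[p]]


print(is_true("rdfs:Class", "rdf:type", "rdfs:Class"))
print(is_true("owl:sameAs", "owl:sameAs", "owl:sameAs"))
```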

Part II: Social Meaning

Formal semantics cannot capture all meaning

Graph G₁

id:store def:sells id:tent    .
id:tent  def:costs "¥150,000" .
id:tent  rdf:type  id:Product .

Graph G₂

fy:aHup   pe:ko9sap_ fy:jufn12     .
fy:jufn12 pe:oao9_   "Ufou"        .
fy:jufn12 rdf:type   fyufnt:tmffqt .

Graphs G₁ and G₂ are true in the same models.
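One way to make this concrete is to check that G₂ is G₁ under a one-to-one renaming of terms. The sketch below makes two simplifying assumptions: the triples of both graphs are listed in corresponding order, and every term except rdf:type (including the literal) is treated as a freely renameable name.

```python
G1 = [
    ("id:store", "def:sells", "id:tent"),
    ("id:tent", "def:costs", '"¥150,000"'),
    ("id:tent", "rdf:type", "id:Product"),
]
G2 = [
    ("fy:aHup", "pe:ko9sap_", "fy:jufn12"),
    ("fy:jufn12", "pe:oao9_", '"Ufou"'),
    ("fy:jufn12", "rdf:type", "fyufnt:tmffqt"),
]

# Terms with a fixed formal meaning must map to themselves.
FIXED = {"rdf:type"}


def renaming(g1, g2):
    """Return a one-to-one term mapping that turns g1 into g2, or None."""
    if len(g1) != len(g2):
        return None
    forward, backward = {}, {}
    for t1, t2 in zip(g1, g2):
        for a, b in zip(t1, t2):
            if a in FIXED or b in FIXED:
                if a != b:
                    return None
                continue
            # Each term must always map to the same partner, in both directions.
            if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
                return None
    return forward


print(renaming(G1, G2))
```

A formal reasoner sees only this shared structure; the readable names in G₁ carry no formal weight.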

Social Meaning

“An RDF graph may contain "defining information" that is opaque to logical reasoners. This information may be used by human interpreters of RDF information.”
“Human publishers of RDF content commit themselves to the mechanically-inferred social obligations.”
“The meaning of an RDF document includes the social meaning, the formal meaning, and the social meaning of the formal entailments.”

Quoted from an early working version of the RDF standard.

2 notions of meaning

Formal meaning

Social meaning

How to analyze non-formal / social meaning?

Part III: Empirical Semantics

Empirical Semantics

The empirical (i.e., non-analytical) analysis of meaning.

(We still use model theory and other formalisms in order to describe the outcomes of our analyses. But we do not use formalisms in order to prescribe what a given expression should mean.)

Hypothesis: Names do not encode any meaning

Graph G₁

id:store def:sells id:tent    .
id:tent  def:costs "¥150,000" .
id:tent  rdf:type  id:Product .

Graph G₂

fy:aHup   pe:ko9sap_ fy:jufn12     .
fy:jufn12 pe:oao9_   "Ufou"        .
fy:jufn12 rdf:type   fyufnt:tmffqt .

Information Theory / compression

compress(SEMANTICS) + compress(NAMES) - compress(SEMANTICS + NAMES) = mutual information
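The formula can be approximated with any off-the-shelf compressor. The sketch below uses zlib as a stand-in (the compressor, and the way names and semantics are serialized to bytes, are assumptions here, not the setup used in the experiments).

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (maximum compression level)."""
    return len(zlib.compress(data, 9))


def mutual_information_estimate(semantics: bytes, names: bytes) -> int:
    """compress(SEMANTICS) + compress(NAMES) - compress(SEMANTICS + NAMES)."""
    return (compressed_size(semantics)
            + compressed_size(names)
            - compressed_size(semantics + names))


# Toy serializations, purely illustrative.
names = b"id:store id:tent id:Product def:sells def:costs"
semantics = b"0 3 1 . 1 4 L . 1 T 2 ."
print(mutual_information_estimate(semantics, names))
```

A value close to zero suggests that knowing one part does not help compress the other; a clearly positive value suggests shared information, which would count against the hypothesis.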

Refuting the hypothesis that names have no meaning

Part IV: Empirical Semantics: experimental setup

lodlaundromat.org


lodsearch.org

Nice, but expensive to scale to the full web, i.e., tens of billions of statements.

LOD-a-lot

Scalable Knowledge Graph Solutions

Query a Large Knowledge Graph Using Commodity Hardware

Lowering the cost of access

  • Store Linked Data in 1 file
  • 28,362,198,927 triples (>650K data documents)
  • €305,- hardware cost (524GB disk; 15.7GB RAM)

Header Dictionary Triples (HDT)

Fernández & Martínez-Prieto & Gutiérrez, “Binary RDF representation for publication and exchange (HDT)”, ISWC, 2013.

Part V: Identity

2 notions of identity

Formal meaning

a=b ↔ ∀P(Pa ↔ Pb) (OWL 2 specification)

Social meaning

“Include links to other URIs, to discover more things.” Tim Berners-Lee, 4th Linked Data principle

Example

How formal meaning and social meaning collide

Apply Empirical Semantics: How is owl:sameAs used in practice?

[Figure labels: bbc:sameAs, bbc:sameAs, owl:sameAs, ?]

The terms over which identity is defined

Identity closure

558,943,116 owl:sameAs triples

E.g., the equivalence set for ‘Barack Obama’

Formal semantics: all these identifiers denote the same thing.
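Because owl:sameAs is reflexive, symmetric and transitive, the identity closure partitions the terms into equivalence sets. A minimal union-find sketch follows; the function name and the sample links are hypothetical, and this is not necessarily how the closure over the full dataset was computed.

```python
from collections import defaultdict


def identity_closure(same_as_pairs):
    """Group terms into equivalence sets under the closure of the given pairs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    sets = defaultdict(set)
    for term in list(parent):
        sets[find(term)].add(term)
    return list(sets.values())


# Made-up links between placeholder terms, for illustration only.
links = [("ex:a", "ex:b"), ("ex:b", "ex:c"), ("ex:d", "ex:e")]
for equivalence_set in identity_closure(links):
    print(sorted(equivalence_set))
```

A single erroneous link is enough to merge two otherwise separate sets, which is how identifiers for a person and for an administration can end up in one equivalence set.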

But are they really the same thing?

Some IRIs refer to a person:

  • http://als.dbpedia.org/resource/Barack_Obama
  • http://am.dbpedia.org/resource/ባራክ_ኦባማ
  • http://data.nytimes.com/obama_barack_per
  • http://nl.dbpedia.org/resource/Barack_Obama
  • http://rdf.freebase.com/ns/m.02mjmr
  • http://viaf.org/viaf/52010985
  • http://yago-knowledge.org/resource/Barack_Obama

Other IRIs refer to an administration:

  • http://dbpedia.org/resource/Administration_of_Barack_Obama
  • http://dbpedia.org/resource/Barack_Obama_Cabinet
  • http://dbpedia.org/resource/Barack_Obama_presidency
  • http://rdf.freebase.com/ns/m.05b6w1g
  • http://wikidata.dbpedia.org/resource/Q1379733
  • http://yago-knowledge.org/resource/Presidency_of_Barack_Obama

An alternative semantics for instance identity

‘Barack Obama’ after community detection

purple: person; orange: government; green: president; blue: senator

An alternative semantics for class identity

Similarity:

sim(C,D) := |CExt(C) ∩ CExt(D)| / |CExt(C) ∪ CExt(D)|
CExt(C) := {x ∈ RDF-T |〈I(x),I(C)〉∈ Ext(I(rdf:type))}
RDF-T := (B ∪ L ∪ U)
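The similarity measure is the Jaccard index of the two class extensions. A minimal sketch over an in-memory list of triples, following the term-based definition (x ∈ RDF-T); the sample terms are hypothetical, and returning 0.0 when both extensions are empty is an assumption, since the definition leaves that case open.

```python
def class_extension(triples, cls):
    """Terms x for which <x, rdf:type, cls> is asserted."""
    return {s for s, p, o in triples if p == "rdf:type" and o == cls}


def sim(triples, c, d):
    """|CExt(C) ∩ CExt(D)| / |CExt(C) ∪ CExt(D)|."""
    ext_c = class_extension(triples, c)
    ext_d = class_extension(triples, d)
    union = ext_c | ext_d
    if not union:
        return 0.0  # assumption: similarity of two empty classes is 0
    return len(ext_c & ext_d) / len(union)


# Hypothetical instances, for illustration only.
triples = [
    ("ex:b1", "rdf:type", "dbo:Book"),
    ("ex:b2", "rdf:type", "dbo:Book"),
    ("ex:b2", "rdf:type", "schema:Book"),
    ("ex:b3", "rdf:type", "schema:Book"),
]
print(sim(triples, "dbo:Book", "schema:Book"))
```

Here one of three distinct instances is shared, so the two classes come out as one-third similar rather than simply "same" or "different".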