RDF

The Resource Description Framework (RDF) is a W3C[1] standard for representing knowledge in graph form.

  • Multigraph means that it can contain loops and multiple edges.
  • Directed means that the edges have a direction.
  • Labeled means that both the edges and the vertices have a label.
A labeled directed multigraph
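To make the three properties concrete, here is a minimal Python sketch of a labeled directed multigraph stored as a plain list of edges. The vertex and edge labels (`Alice`, `knows`, and so on) are invented for illustration and do not come from any real dataset.

```python
from collections import Counter
from typing import NamedTuple


class Edge(NamedTuple):
    source: str  # label of the start vertex
    label: str   # label of the edge
    target: str  # label of the end vertex


# Hypothetical toy data. Edges live in a list, so the same ordered pair of
# vertices may be connected several times (multigraph).
edges = [
    Edge("Alice", "knows", "Bob"),
    Edge("Alice", "worksWith", "Bob"),  # second edge between the same pair
    Edge("Bob", "knows", "Alice"),      # opposite direction: a distinct edge
    Edge("Alice", "admires", "Alice"),  # loop: source and target coincide
]

# Loops are edges whose two ends are the same vertex.
loops = [e for e in edges if e.source == e.target]

# Multiple edges: ordered vertex pairs that appear more than once.
pair_counts = Counter((e.source, e.target) for e in edges)
parallel_pairs = {pair: n for pair, n in pair_counts.items() if n > 1}

print(loops)
print(parallel_pairs)
```

Because direction matters, `("Alice", "Bob")` and `("Bob", "Alice")` are counted as different pairs.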

RDF triples

RDF graphs are represented as RDF triples of the form Subject - Predicate - Object, abbreviated (S, P, O): the subject is the starting node, the predicate is the edge, and the object is the destination node.

An RDF triple

The subject represents the resource to be described (this can be a document, a person, a physical object, or an abstract concept).

The object represents the value of the property; it can be a resource or a literal[2].
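The subject, predicate, and object roles can be sketched in a few lines of Python. This is an illustrative data structure, not the API of any RDF library: the class names `Resource`, `Literal`, and `Triple`, the helper `match`, and the example URIs are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List, NamedTuple, Optional, Union


@dataclass(frozen=True)
class Resource:
    """A node or predicate identified by a URI."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """A plain value, optionally typed (datatype is a URI string)."""
    value: str
    datatype: Optional[str] = None


class Triple(NamedTuple):
    subject: Resource              # the resource being described
    predicate: Resource            # the property, itself a resource
    obj: Union[Resource, Literal]  # the value: a resource or a literal


# Illustrative namespaces and data (assumed for this sketch).
EX = "http://example.org/"
DBO = "http://dbpedia.org/ontology/"

triples = [
    Triple(Resource(EX + "TelecomParis"), Resource(DBO + "location"),
           Resource(EX + "Palaiseau")),
    Triple(Resource(EX + "TelecomParis"), Resource(DBO + "name"),
           Literal("Télécom Paris")),
]


def match(graph: List[Triple], subject=None, predicate=None, obj=None) -> List[Triple]:
    """Return the triples that agree with every argument that is not None."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


for t in match(triples, predicate=Resource(DBO + "location")):
    print(t.subject.uri, "->", t.obj.uri)
```

Leaving an argument of `match` as `None` treats that position as a wildcard, which is the basic idea behind querying a set of triples by pattern.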

Format

Resources are identified uniquely and permanently by URIs[3]. Subjects can also be anonymous nodes (also called blank nodes).

There are many formats[4] for representing RDF graphs. Turtle (.ttl), a format standardized by the W3C, offers a good compromise between human readability and file size.

@prefix ex: <http://example.org/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

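# A city and its English name ("a" is Turtle shorthand for rdf:type)
ex:Palaiseau a dbo:City ;
    dbo:name "Palaiseau"@en .

# A school located in that city, with its coordinates
ex:TelecomParis a dbo:EducationalInstitution ;
    dbo:name "Télécom Paris"@en ;
    dbo:location ex:Palaiseau ;
    geo:lat "48.7132"^^xsd:decimal ;
    geo:long "2.2002"^^xsd:decimal .

# Another institution, of which the school is a member
ex:IpParis a dbo:EducationalInstitution .
ex:TelecomParis dbo:Member ex:IpParis .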

An RDF graph in Turtle format
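The `@prefix` declarations at the top of the Turtle listing are only abbreviations: each prefixed name stands for a full URI. The following Python sketch shows the expansion under that reading. The function name `expand` is hypothetical, and the sketch ignores the finer points of Turtle syntax such as escapes and relative URIs.

```python
# Prefix table copied from the @prefix lines of the Turtle listing.
PREFIXES = {
    "ex": "http://example.org/",
    "dbo": "http://dbpedia.org/ontology/",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(prefixed_name: str, prefixes: dict = PREFIXES) -> str:
    """Turn a name such as 'xsd:date' into the full URI it abbreviates."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return prefixes[prefix] + local


print(expand("xsd:date"))      # http://www.w3.org/2001/XMLSchema#date
print(expand("ex:Palaiseau"))  # http://example.org/Palaiseau
```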

The W3C also created SPARQL[5], a standard query language that allows users to read, modify, or delete RDF data.

Semantic Web

The Semantic Web is an extension of the Web standardized by the W3C. It encourages the use of data formats and protocols based on the RDF model. It enables the sharing and reuse of data between multiple applications, companies, and user groups.

The goal is to create a web of data that can be processed directly and indirectly by machines to help their users create new knowledge.

The Semantic Web aims to link and structure information on the Internet to provide easy access to the knowledge it already contains.

An example

An ontology is a data model containing concepts and relationships that can be used to model a set of knowledge in a given domain.

An online store can use an ontology to structure its product catalog, with concepts such as phones, computers, etc.

Using an ontology allows semantic relationships to be used to improve the search engine. If a customer searches for "smartphone", the engine understands that this includes terms such as "phone" and "iPhone" and can also recommend all accessories compatible with smartphones.
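As a rough illustration of the store scenario, the Python sketch below expands a search term using a tiny hand-written hierarchy. Everything in it is hypothetical: the concept names, the `BROADER` and `COMPATIBLE_WITH` tables, and the helper functions are invented for this example, and a real store would rely on a proper ontology rather than two dictionaries.

```python
# Hypothetical toy ontology: each concept points to its broader concept.
BROADER = {
    "smartphone": "phone",
    "iPhone": "smartphone",
    "laptop": "computer",
}

# Hypothetical "compatible with" relation: accessory -> product concept.
COMPATIBLE_WITH = {
    "phone case": "smartphone",
    "wireless charger": "iPhone",
    "laptop sleeve": "laptop",
}


def ancestors(concept):
    """Broader concepts, from the nearest to the most general."""
    chain = []
    while concept in BROADER:
        concept = BROADER[concept]
        chain.append(concept)
    return chain


def descendants(concept):
    """Narrower concepts, found by walking the hierarchy downwards."""
    result = []
    for child, parent in BROADER.items():
        if parent == concept:
            result.append(child)
            result.extend(descendants(child))
    return result


def related_terms(concept):
    """The searched concept together with its broader and narrower terms."""
    return list(reversed(ancestors(concept))) + [concept] + descendants(concept)


def accessories_for(concept):
    """Accessories compatible with the concept or any narrower concept."""
    targets = {concept, *descendants(concept)}
    return sorted(a for a, t in COMPATIBLE_WITH.items() if t in targets)


print(related_terms("smartphone"))
print(accessories_for("smartphone"))
```

The point of the design is that the search code never hard-codes product names: adding a new concept to the hierarchy is enough for it to appear in results.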

Wikidata

Wikidata is a free knowledge base designed to centralize data from the Wikimedia movement[6].

It contains structured data linked, for example, to Wikipedia pages. This database can be queried using SPARQL. For example, you can search for the 100 most populous cities in the world, or the 10 most populous cities in Essonne…

SPARQL query on Wikidata giving the 10 most populous cities in Essonne

The Wikidata dataset uses opaque identifiers to represent relationships and objects, and these identifiers are not easily readable by humans. For example, in the query above, wdt:P131 corresponds to the "located in the administrative territorial entity" relationship and wd:Q3368 corresponds to the Essonne object.
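The query itself is not reproduced in the text, so the following Python sketch is a reconstruction under assumptions, not the author's original query. It uses wdt:P131 and wd:Q3368 as described above, and additionally assumes wdt:P1082 (population) and the label service of the public Wikidata endpoint. The helper names `build_query` and `build_url` are hypothetical. The code only builds the query text and the request URL; it does not contact the network.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Public SPARQL endpoint of the Wikidata Query Service.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(region_qid: str = "Q3368", limit: int = 10) -> str:
    """Most populous places located in a region (assumed query shape)."""
    return f"""
SELECT ?city ?cityLabel ?population WHERE {{
  ?city wdt:P131 wd:{region_qid} ;
        wdt:P1082 ?population .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY DESC(?population)
LIMIT {limit}
""".strip()


def build_url(query: str) -> str:
    """URL of a GET request that asks the endpoint for JSON results."""
    return WIKIDATA_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


query = build_query()
url = build_url(query)
print(query)

# Sanity check: decoding the URL gives back the exact query text.
assert parse_qs(urlsplit(url).query)["query"][0] == query
```

The pattern matches every item directly located in Essonne that has a population, so it may return places that are not cities in the strict sense; narrowing it further would require an extra constraint on the type of item.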

This query on Wikidata gives us the following results:

Results for the 10 most populous cities in Essonne


  1. World Wide Web Consortium (web standardization body responsible for the HTML, CSS, PNG, and SVG standards, among others). More specifically, RDF represents knowledge as labeled directed multigraphs. ↩︎

  2. A character string, a number, or a date. The predicate is a property associated with the subject, with the object as its value; it is itself a resource. ↩︎

  3. Uniform Resource Identifier. Literals can be ordinary (untyped) or typed to express the nature of the value. For example, someone's date of birth will be typed as xsd:date, which is a concise way of writing <http://www.w3.org/2001/XMLSchema#date> using a prefix. ↩︎

  4. RDF/XML, N3, N-Triples… ↩︎

  5. SPARQL Protocol and RDF Query Language ↩︎

  6. For example, Wikipedia, Wikisource, etc. ↩︎