Last edited 7 months ago
by Serena Cericola

Share-VDE and the Semantic Web


Overview

This document illustrates how Share-VDE URIs must be constructed and resolved depending on the requested destination format and language.

It describes how Share-VDE constructs and resolves URIs according to Semantic Web best practices, so that the resources Share-VDE exposes can be accessed as widely and as appropriately as possible, entirely through the HTTP protocol.

At its core, the solution described in this document implements the strategies indicated by the Cool URIs for the Semantic Web document, in particular those in section 4.3.

Basic requirements for URIs in the Semantic Web

From: https://www.w3.org/TR/cooluris/:

1. Be on the Web.

Given only a URI, machines and people should be able to retrieve a description about the resource identified by the URI from the Web. Such a look-up mechanism is important to establish shared understanding of what a URI identifies. Machines should get RDF data and humans should get a readable representation, such as HTML. The standard Web transfer protocol, HTTP, should be used.

2. Be unambiguous.

There should be no confusion between identifiers for Web documents and identifiers for other resources. URIs are meant to identify only one of them, so one URI can't stand for both a Web document and a real-world object.

URIs forwarding to different documents

Share-VDE offers several representations of the same resource: at a minimum HTML, JSON, and RDF.

Nevertheless, in line with requirement no. 1 above, the best way to offer both the RDF or JSON description of a real-world object (like an author or a work) and the related Web page is to expose one URI for all of them.

Such a single URI is disambiguated through the HTTP mechanism known as content negotiation. Share-VDE achieves content negotiation by introducing a new architectural block: the Frontier Reverse Proxy.

The Frontier Reverse Proxy (FRP)

In essence, the FRP is an HTTP reverse proxy: it applies all the content negotiation logic and holds all the redirection rules needed to apply the Semantic Web patterns to URI exposure.

The FRP is the Share-VDE entry point for all requests.

In general terms, the FRP will mainly be responsible for implementing a URI exposure strategy that covers several use cases, such as:

  • Generic resource request (https://svde.org/agents/17282), with the Accept HTTP header specialising the desired type of resource (content negotiation: Web page, JSON or RDF)
  • Page type resource direct request (https://svde.org/pages/agents/17282.html)
  • JSON type resource direct request (https://svde.org/data/agents/17282.json)
  • RDF type resource direct request (https://svde.org/data/agents/17282.rdf)
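From the client side, the use cases above could be exercised as in the following minimal sketch. It only builds the requests and does not send them; the helper names `generic_request` and `direct_request` are hypothetical, while the host and paths are taken from the examples in the list.

```python
from urllib.request import Request

BASE = "https://svde.org"  # host used in the examples above


def generic_request(resource_path: str, media_type: str) -> Request:
    """Build a generic resource request that relies on content negotiation."""
    return Request(f"{BASE}/{resource_path}", headers={"Accept": media_type})


def direct_request(resource_path: str, suffix: str) -> Request:
    """Build a direct request for one specific representation."""
    # Web pages live under /pages, JSON and RDF under /data (see the list above).
    prefix = "pages" if suffix == "html" else "data"
    return Request(f"{BASE}/{prefix}/{resource_path}.{suffix}")


negotiated = generic_request("agents/17282", "application/rdf+xml")
page = direct_request("agents/17282", "html")
print(negotiated.full_url, negotiated.get_header("Accept"))
print(page.full_url)
```

The generic form keeps one URI per resource and leaves the choice of representation to the Accept header; the direct forms name the representation in the URI itself.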

The following diagram explains the concept:

[Diagram: 303 URIs forwarding to Different Documents]

In greater detail, these responsibilities translate to:

  1. Map requests whose Accept header matches one of the offered content types (i.e. text/html, application/json or application/rdf+xml), and redirect the client accordingly to the service exposing the requested resource format, with a 303 (See Other) status code;
  2. Map requests whose resource suffix matches one of the offered formats (i.e. .html, .json or .rdf), and redirect the client accordingly to the service exposing the requested resource format, with a 303 (See Other) status code;
  3. If both the Accept header and the resource suffix are present and they diverge, the header takes priority in detecting the destination format;
  4. If neither an Accept header nor a format suffix is present, or if they do not match any of the offered formats, a response with a 406 (Not Acceptable) status code is returned.
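The four rules above can be sketched as a small format-detection routine. This is an illustration under stated assumptions, not the FRP's actual configuration: the function names are hypothetical, q-values and wildcards in the Accept header are ignored, and falling back to the suffix when the Accept header names no offered format is an assumption the rules do not spell out.

```python
import re
from typing import Optional, Tuple

# Offered formats, as listed in rules 1 and 2.
MEDIA_TYPES = {
    "text/html": "html",
    "application/json": "json",
    "application/rdf+xml": "rdf",
}
SUFFIXES = {"html", "json", "rdf"}


def format_from_accept(accept: Optional[str]) -> Optional[str]:
    """Return the first offered format named in the Accept header, if any.

    Simplification (assumption): q-values and wildcards like */* are ignored.
    """
    if not accept:
        return None
    for item in accept.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in MEDIA_TYPES:
            return MEDIA_TYPES[media_type]
    return None


def format_from_suffix(path: str) -> Optional[str]:
    """Return the offered format named by the path suffix, if any."""
    match = re.search(r"\.([A-Za-z0-9]+)$", path)
    if match and match.group(1).lower() in SUFFIXES:
        return match.group(1).lower()
    return None


def negotiate(path: str, accept: Optional[str] = None) -> Tuple[int, Optional[str]]:
    """Return (status code, destination format) for a request."""
    # Rule 3: the header is consulted first, so it wins when both are present.
    fmt = format_from_accept(accept) or format_from_suffix(path)
    # Rules 1 and 2 lead to a 303 redirect; rule 4 leads to 406.
    return (303, fmt) if fmt else (406, None)


print(negotiate("/agents/217.json", "application/rdf+xml"))  # (303, 'rdf')
print(negotiate("/agents/217.xyz"))                          # (406, None)
```

Checking the header before the suffix is what gives the header its priority; no explicit comparison between the two is needed.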

Resources URI mapping patterns

As a consequence of the above, the resources Share-VDE exposes are either human-readable (Web pages) or machine-readable (JSON and RDF).

For machine-readable formats, the FRP will apply the following mapping patterns, using the agents resource type as an example:

Frontier Reverse Proxy URI redirection patterns for JSON and RDF resources
Request           | Accept header value  | Response status code  | Target location
/agents/217[.*]   | application/json     | 303 (See Other)       | /data/agents/217.json
/agents/217.json  | n.a.                 | 303 (See Other)       | /data/agents/217.json
/agents/217[.*]   | application/rdf+xml  | 303 (See Other)       | /data/agents/217.rdf
/agents/217.rdf   | n.a.                 | 303 (See Other)       | /data/agents/217.rdf
/agents/217       | n.a.                 | 406 (Not Acceptable)  | n.a.
/agents/217.xyz   | n.a.                 | 406 (Not Acceptable)  | n.a.


The same rules apply to resource collections, like /agents, and to attributes and entities belonging to top-level resources, like /agents/217/works. The [.*] portion of a URI stands for an optional extension suffix.
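Once the destination format is known, the target location for JSON and RDF follows mechanically from the request path. The sketch below is one possible reading of the patterns: `data_target` is a hypothetical name, and applying the same `/data` prefix and suffix to nested paths such as /agents/217/works is an extrapolation of "the same rules apply".

```python
import re


def data_target(path: str, fmt: str) -> str:
    """Map a request path to its /data target for a machine-readable format."""
    if fmt not in ("json", "rdf"):
        raise ValueError(f"not a machine-readable format: {fmt}")
    # Drop the optional [.*] extension suffix, whatever it is.
    stem = re.sub(r"\.[A-Za-z0-9]+$", "", path)
    return f"/data{stem}.{fmt}"


print(data_target("/agents/217", "json"))        # /data/agents/217.json
print(data_target("/agents/217.xyz", "rdf"))     # /data/agents/217.rdf
print(data_target("/agents/217/works", "json"))  # /data/agents/217/works.json
```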

For Web pages, on the other hand, the FRP and the Share-VDE front-end application, called Neoaves, will collaborate closely on URI forwarding and transformation. Neoaves will receive from the FRP a semi-finished URI as the 303 target location, and will transform it into a form that is intelligible to both humans and search engines like Google, improving SEO.

Frontier Reverse Proxy URI redirection patterns for HTML resources
Request             | Accept header value  | Response status code  | Target location     | URI transformation by Neoaves
/agents/201[.*]     | text/html            | 303 (See Other)       | /agents-201         | /lewis-carroll-a201
/agents/201.html    | n.a.                 | 303 (See Other)       | /agents-201         | /lewis-carroll-a201
/agents/201         | n.a.                 | 406 (Not Acceptable)  | n.a.                | n.a.
/agents/201.xyz     | n.a.                 | 406 (Not Acceptable)  | n.a.                | n.a.
/agents/201/opuses  | text/html            | 303 (See Other)       | /agents-201/opuses  | /lewis-carroll-a201/original-work-by

This behaviour for HTML resources has been discussed and agreed upon in a meeting with Samhaeng (the developers of Neoaves).
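The FRP side of the HTML patterns, that is the semi-finished target location, can be sketched as follows. The function name `semi_finished_uri` is hypothetical, and the handling of a bare collection path is an assumption. The subsequent transformation by Neoaves is deliberately left out, since its rules are not described here.

```python
import re


def semi_finished_uri(path: str) -> str:
    """Build the semi-finished 303 target that the FRP hands over to Neoaves."""
    # Drop the optional [.*] extension suffix.
    stem = re.sub(r"\.[A-Za-z0-9]+$", "", path)
    segments = stem.strip("/").split("/")
    if len(segments) < 2:
        # Assumption: a collection path has no identifier to join, so keep it as is.
        return "/" + "/".join(segments)
    # Join the resource type and the identifier with a hyphen: agents/201 -> agents-201.
    head = f"{segments[0]}-{segments[1]}"
    return "/" + "/".join([head, *segments[2:]])


print(semi_finished_uri("/agents/201.html"))    # /agents-201
print(semi_finished_uri("/agents/201/opuses"))  # /agents-201/opuses
```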

The FRP will, however, also be equipped with further rules that let it proxy static Web pages not related to resources, like /about.html or /faq.html, bypassing the unnecessary URI redirection logic.
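A minimal sketch of this bypass, assuming the static pages are known in advance: the set below contains only the two examples from the text, and both `STATIC_PAGES` and `route` are hypothetical names.

```python
# Static pages named in the text; the real list is not specified here.
STATIC_PAGES = {"/about.html", "/faq.html"}


def route(path: str) -> str:
    """Decide whether a request is proxied directly or enters the redirection logic."""
    # The static check runs first, so these pages never reach content negotiation.
    if path in STATIC_PAGES:
        return "proxy"
    return "redirect-logic"


print(route("/about.html"))  # proxy
print(route("/agents/217"))  # redirect-logic
```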

At the time of writing (July 7th, 2021), the URI redirection and transformation patterns for the HTML format of the whole resource base are under analysis; they will be implemented as soon as the analysis is complete.