
The LOD Platform Technology

Revision as of 15:12, 22 February 2024 by Anna Lionetti


The LOD (Linked Open Data) Platform is an integrated technological ecosystem for the management of bibliographic, archival and museum catalogues and for their conversion to linked data according to version 2.0 of the BIBFRAME ontology (https://www.loc.gov/bibframe/docs/bibframe2-model.html), extensible as needed for specific purposes.

The core of the LOD Platform was designed in the EU-funded ALIADA project, with the aim of creating a scalable, configurable framework that can adapt to ontologies from different domains and automate the entire process of creating and publishing linked open data, regardless of the source data format.

The aim of this framework is to open the possibilities offered by linked data to libraries, archives and museums by providing greater interoperability, visibility and availability for all types of resources.

Applying the LOD Platform requires a careful analysis of the standards, formats and models used by the target institution. Its coverage, based on BIBFRAME 2.0 as the core ontology, can be enriched with a suite of additional ontologies, such as Schema.org, PROV-O, MADS, RDFS, LC vocabularies and RDA vocabularies, and it is flexible enough to accommodate further ontologies, vocabularies and modelling according to specific needs.

By incorporating standards, models and technologies recognized as key elements for the creation of new processes of management and use of knowledge, the LOD Platform allows:

  • the creation of a data structure based on Agent, Work, Instance, Item, Place entities, as defined by BIBFRAME, and extensible to reconcile other entities;
  • data enrichment through the connection with external data sources;
  • reconciliation and clustering of entities created from the original data;
  • the conversion of data to RDF (Resource Description Framework), the standard model recommended by the W3C for linked open data;
  • delivery of converted and enriched data to the target institution for reuse in their systems;
  • the publication of the dataset in linked data on RDF storage (triplestore);
  • the creation of a discovery portal with a web user interface based on BIBFRAME or other ontologies defined in specific projects.
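To make the BIBFRAME entity structure listed above concrete, here is a minimal sketch, for illustration only: the entity URIs and helper function are invented, and only the namespace URIs are the real BIBFRAME 2.0 and RDF ones. It builds a tiny Work/Instance/Item graph as N-Triples using just the Python standard library.

```python
# Illustrative sketch: a minimal BIBFRAME-style Work/Instance/Item graph
# serialized as N-Triples. The namespace URIs are real; the example entity
# URIs and the `triple` helper are hypothetical, not the platform's code.

BF = "http://id.loc.gov/ontologies/bibframe/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def triple(s, p, o, literal=False):
    """Serialize one triple in N-Triples syntax."""
    obj = f'"{o}"' if literal else f"<{o}>"
    return f"<{s}> <{p}> {obj} ."

base = "http://example.org/entity/"  # hypothetical URI base
work, instance, item = base + "work1", base + "inst1", base + "item1"

graph = [
    triple(work, RDF_TYPE, BF + "Work"),
    triple(work, BF + "title", "Il nome della rosa", literal=True),
    triple(instance, RDF_TYPE, BF + "Instance"),
    triple(instance, BF + "instanceOf", work),
    triple(item, RDF_TYPE, BF + "Item"),
    triple(item, BF + "itemOf", instance),
]
print("\n".join(graph))
```

In a real deployment the graph would of course be produced by the conversion pipeline and loaded into the triplestore, not assembled by hand.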

High level steps

In the implementation of a system that uses the LOD Platform, data from libraries, archives and museums are transformed into linked data through entity identification, reconciliation and enrichment processes.

Attributes are used to uniquely identify a person, work or other entity, and variant forms are reconciled to form a cluster of data referring to the same entity. The data are then further reconciled and enriched against external sources to create a network of information and resources. The result is an open database of relationships and a Cluster Knowledge Base (CKB) in RDF.
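The reconciliation of variant forms described above can be sketched roughly as follows. This is a deliberately simplified stand-in, not the platform's actual matching logic: the normalization rule (lowercase, strip punctuation, sort name parts) and the record fields are assumptions for illustration.

```python
# Illustrative sketch: cluster records whose normalized agent names match.
# A real reconciliation pipeline uses far richer attributes (dates, roles,
# external identifiers); this toy version keys only on the name form.
import re
from collections import defaultdict

def normalize(name: str) -> str:
    """Reduce variant name forms to a comparable key."""
    parts = re.sub(r"[^\w\s]", " ", name.lower()).split()
    return " ".join(sorted(parts))

def cluster(records):
    """Group records referring to the same entity under one cluster key."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[normalize(rec["name"])].append(rec["id"])
    return dict(clusters)

records = [
    {"id": "rec1", "name": "Eco, Umberto"},
    {"id": "rec2", "name": "Umberto Eco"},
    {"id": "rec3", "name": "Calvino, Italo"},
]
print(cluster(records))
# rec1 and rec2 fall into the same cluster; rec3 forms its own
```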

The database uses the semantic web paradigms but allows the target institution to manage their data independently, and is able to provide:

  • enrichment of data with URIs, both for the original library records and for the output linked data entities; examples of sources for URI enrichment are ISNI, VIAF, FAST, GeoNames, LC Classification, LCSH, LC NAF, Wikidata;  
  • conversion of data to RDF using the BIBFRAME vocabulary and other ontologies;
  • creation of a virtual discovery platform with web user interface;
  • creation of a database of relationships and clusters accessible in RDF through a triplestore;
  • implementation of tools for direct interaction with the data, permitting the validation, update, long-term control and maintenance of the clusters and of the URIs identifying the entities (see below);  
  • batch/automated data updating procedures;
  • batch/automated data dissemination to libraries;
  • progressive implementation of additional workflows, such as APIs for ILSs, back-conversion for local acquisition and administration systems, and reporting.
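URI enrichment, the first item in the list above, amounts to linking a local entity URI to its equivalents in external authority files. The sketch below illustrates the idea with owl:sameAs links; the local URI and the external identifier values are invented placeholders, since in a real pipeline they would come from lookup and matching against sources such as VIAF or ISNI.

```python
# Illustrative sketch: enrich a local entity URI with owl:sameAs links to
# external authorities. Identifier values are placeholders, not real records.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def same_as_triples(local_uri, external_uris):
    """Emit one owl:sameAs triple per external URI, in N-Triples syntax."""
    return [f"<{local_uri}> <{SAME_AS}> <{ext}> ." for ext in external_uris]

enrichment = same_as_triples(
    "http://example.org/entity/agent1",           # hypothetical local URI
    [
        "http://viaf.org/viaf/0000000",           # placeholder VIAF cluster
        "http://isni.org/isni/0000000000000000",  # placeholder ISNI
    ],
)
print("\n".join(enrichment))
```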


The goal is to ensure that a large amount of data, which often remains hidden or unexpressed in closed silos (“containers”), finally reveals its richness within existing collections.

Benefits

The LOD Platform provides various environments and interfaces for the creation and enrichment of data, and offers workflows that respond to the different needs of librarians, archivists and museum operators, professionals, scholars, researchers and students.


There are several advantages:

  • integration of the processes of a collaborative environment with local systems and tools;
  • integration into the semantic web while maintaining ownership and control of the data, benefiting from the simplified administration of the environment and a large pool of data;
  • integration of library/archive/museum data into the collaborative environment and pool of data;
  • standards and infrastructures for "future-proof" data, i.e. ensuring that they are compatible with the structure of linked data and the semantic web;
  • enrichment of data with further information and relationships not previously expressed in the established metadata formats in use (e.g. MARC), increasing the possibilities of discovery for all types of resources;
  • creation of an environment that is useful both for end users and for professionals (librarians, archivists, museum operators);
  • wider and more direct interaction with, and editing of, linked data entities by librarians through the Cluster Knowledge Base Editor (more details in the next section);
  • advanced search interfaces that improve the user experience and provide broader search results;
  • exposure of data that would otherwise remain hidden in silos, giving end users access to a large amount of information that the library can both import and export.


This approach fully harnesses the potential of linked data, connecting library information to the advantage of scholars, patrons and all library users in a dynamic research environment that unlocks new ways of accessing knowledge.

Added values

It is particularly relevant to highlight that the LOD Platform is currently being enhanced with a module dedicated to editing and updating entities in the Cluster Knowledge Base (CKB). This Cluster Knowledge Base editor, named JCricket, is conceived as a collaborative environment with different levels of access to and interaction with the data, enabling several manual and automatic actions on the clusters of entities saved in the database, including the creation, modification and merging of clusters of works, agents and so on.

JCricket consists of two main layers:  

  1. automatic checks and updates of the data, performed by the LOD system;
  2. manual checks and edits of the data, performed by the user through a web interface.
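One of the cluster actions JCricket enables, the merge, could be sketched as follows. The dict-based data model and function are hypothetical assumptions for illustration, not JCricket's real internals: the point is simply that one cluster survives with its URI and members from both, while the absorbed cluster's URI is returned so it can be tracked as deprecated.

```python
# Illustrative sketch: merge two entity clusters, keeping the survivor's URI
# and reporting the absorbed cluster's URI so it can later be redirected.
# This dict model is an assumption, not JCricket's actual data structures.

def merge_clusters(survivor: dict, absorbed: dict):
    """Merge `absorbed` into `survivor`; return merged cluster + deprecated URI."""
    merged = {
        "uri": survivor["uri"],
        "members": sorted(set(survivor["members"]) | set(absorbed["members"])),
    }
    return merged, absorbed["uri"]

a = {"uri": "http://example.org/work/1", "members": ["rec1", "rec2"]}
b = {"uri": "http://example.org/work/2", "members": ["rec3"]}
merged, deprecated = merge_clusters(a, b)
print(merged["members"])  # ['rec1', 'rec2', 'rec3']
print(deprecated)         # http://example.org/work/2
```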


All changes to entities, whether automatic or manual, are recorded in the Entity Registry, a source (also available in RDF) that tracks the updates of each entity, especially when a change has an impact on the persistent entity URI.
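The role of such a registry can be sketched as follows; the class, field names and API here are assumptions for illustration only, not the Entity Registry's actual implementation. Each logged event flags whether the change affected the persistent entity URI, which is the case a consumer of the data most needs to notice.

```python
# Illustrative sketch: a minimal entity registry that logs each change to an
# entity and flags updates that alter the persistent URI (e.g. after a merge).
# Class, fields and method names are hypothetical, for illustration only.
from datetime import datetime, timezone

class EntityRegistry:
    def __init__(self):
        self.events = []

    def record(self, entity_uri, action, new_uri=None):
        """Log one change; `uri_changed` marks updates that alter the URI."""
        self.events.append({
            "entity": entity_uri,
            "action": action,
            "new_uri": new_uri,
            "uri_changed": new_uri is not None and new_uri != entity_uri,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })

    def history(self, entity_uri):
        """Return all logged events for one entity, in order."""
        return [e for e in self.events if e["entity"] == entity_uri]

reg = EntityRegistry()
reg.record("http://example.org/work/2", "modify")
reg.record("http://example.org/work/2", "merge",
           new_uri="http://example.org/work/1")
print([e["uri_changed"] for e in reg.history("http://example.org/work/2")])
# [False, True]
```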