ShareFamily:NewsAndUpdates/December2023


Latest revision as of 07:32, 31 July 2024


You can download a printable version at https://bit.ly/SFBulletin_n8_Dec2023

Citation: Share Family Team, Share Family Bulletin 8 (December 2023), https://bit.ly/SFBulletin_n8_Dec2023

Introduction

As 2023 comes to an end, we take a moment to look back at the achievements and challenges within the Share Family initiative during the past year. This report serves as a comprehensive reflection on the accomplishments of this year, while also offering insights into the exciting goals we've set for the upcoming period. It provides a glimpse into both the highlights of the past year and our shared vision for the future within the Share Family community.

The international reach of the Share Family, spread across different countries and bibliographic traditions, stimulates mutual exchanges and gives us the occasion to improve our processes, data treatment and outputs.

The efforts and resources devoted to the development of the initiative (both technical and in terms of cooperation and outreach) are always rewarded by the added value of working side by side with member institutions and partners in the linked data community. This co-operative approach, with time, expertise and costs shared across the community for the benefit of all members, is creating a truly global network of interconnected institutions and information resources that transcends traditional cataloging boundaries.

A useful perspective on how we carry on this family of initiatives can be found in the Share-VDE Executive Summary: its outline of the role of SVDE in linked open data for libraries is reflected in the overall approach to all the branches of the Share Family.

In this context, we also find it useful to provide a comprehensive overview of the main stepping stones that have marked the path of the Share Family of initiatives over the years, described in detail in the next sections.

This extensive update reports, among others, the following prominent achievements:

  • the extension of the community of members and parallel pilots;
  • the publication of new branches of the Share Family dedicated to specific initiatives, supported by dedicated entity discovery portals and linked data infrastructure (the portals for the BNB - British National Bibliography, Parsifal, and LILLIT);
  • the involvement in the international library and cultural heritage community through cross-institutional working groups;
  • the development of the JCricket collaborative entity editor for shared cataloguing;
  • the continuous enhancement of the system components and upgrade to a new version 3 of the technology in use, towards the support of production workflows;
  • the extension of LOD Platform technology capabilities for the integration with third parties.

To cope with all this, the Share Family internal team of metadata experts and technicians is constantly working to improve the internal organisation of work and is increasingly formalising project rules and methodology to streamline development processes and achieve better results.

News from members, promotion and outreach

The Share Family linked data ecosystem comprises several collaborative LOD - Linked Open Data environments (also called “tenants” in Share Family jargon).

The different characteristics of each field represented in the various environments are a useful asset that can be used to the advantage not only of the Share Family as a whole, but also of each single discipline.

The tenants currently active and the corresponding discovery websites are:

  • Share-VDE (Virtual Discovery Environment);
  • SHARE Catalogue - the Italian network of university libraries;
  • the PCC data pool - the Program for Cooperative Cataloging (PCC) Catalogue in Linked Open Data;
  • National Bibliographies in Linked Open Data;
  • Parsifal - the LOD portal of the URBE consortium (Roman Union of Ecclesiastical Libraries);
  • LILLIT - portal for Italian illustrated books with Linked Open Data descriptions and illustrations of Italian editions printed in the 16th-18th centuries.
loghi.jpg

The tenant infrastructure is designed to be flexible: the data of participating members can be shown at the tenant level or in an institutional customised sub-portal of a pre-existing tenant (also called “skin portal” in Share Family jargon). This can be done according to the needs and policies of the institution or consortium.

sf infrastr.jpg

The Share Family team is committed to advancing the technical developments of each tenant and also to fostering user engagement through outreach initiatives. The main facts around membership and outreach within the Share Family are reported below.

New Share-VDE members and ongoing pilots

In 2023 Berkeley Law Library joined Share-VDE, while, in an exciting development, Toronto University Library and Lehigh University will be joining Share-VDE in 2024.

The National Taiwan University Library will join the Share Family too: its participation will be particularly valuable to enrich LOD Platform processes with the treatment of non-Latin scripts, thus expanding the wealth of data at the disposal of the Share Family community.

In the framework of the cooperation with library consortia that the Share Family is bringing forward, we are working on a linked open data pilot with the IPLC - IvyPlus Libraries’ Confederation, using the IPLC data lake hosted in the MARC-based POD - Platform for Open Data. We are working to provide the Share Family LOD Platform environment and discovery portal to the IPLC institutions, to assess the compatibility with the POD Aggregator scope and to carry forward the aspects complementary to the BIBFRAME-based components of the Share Family. The initial phase started in October 2023 with the set-up of the dedicated IPLC infrastructure tenant, which will include a dedicated discovery portal. Once the ingestion of the 100M+ records from the 13 IPLC institutions is concluded, POD members will undertake an assessment period to test the output.

The participation of new institutions signifies a commitment to advancing research and educational opportunities, and we look forward to the contributions they will make to our growing community of libraries and institutions.

The British National Bibliography in NatBib tenant

As part of the NatBib - National Bibliographies tenant, the British Library announced its Linked Open Data BNB - British National Bibliography in beta version, now available for exploration at https://bl.natbib-lod.org.

CAVEAT: as of early January 2024, we will be updating the technical infrastructure of https://bl.natbib-lod.org, therefore the portal might be unstable.

For further insights into the benefits of the collaboration between the Share Family and the British Library and future developments, we invite you to visit their Digital Scholarship blog at https://blogs.bl.uk/digital-scholarship/2023/07/share-family-british-national-bibliography.html.

The PCC data pool hosted by the Share Family infrastructure

The cooperation with the Program for Cooperative Cataloging - initiated in 2020 as part of the LD4P pilot cataloguing activities and enabled by the cooperation with OCLC - for the management of a linked data pool of PCC records has continued with a test phase by a dedicated task group formed by volunteer PCC members. The task group has tested the PCC data pool discovery portal https://pcc-lod.org/ and reported back input for adjustments that will greatly improve our work.

The feedback received from the task group was analyzed and mostly incorporated by the Share Family team. The output will be visible after re-indexing of the PCC data pool, which is expected to take place in early 2024. The PCC will also be involved in a testing phase of the JCricket entity editor. To present the new edition of the data as well as JCricket, we plan to establish a communication channel with the PCC group for ongoing updates and training in the use of JCricket.

SHARE Catalogue: the Italian network of university libraries

The SHARE Catalogue initiative has completed the work on UNIMARC - BIBFRAME direct mapping and conversion (with no intermediate steps through MARC), and will share this work with the linked data community through a Wikibase instance https://unimarc2bibframe.wikibase.cloud/ that will be enriched and documented.

The Share Family team has been working hard to incorporate the developments derived from the UNIMARC - BIBFRAME direct mapping and to prepare for the switch from the discovery portal version 1.0 to 2.0, which is expected in early 2024. The switch also entails the transition of the back-end infrastructure to AWS - Amazon Web Services, which better supports system scalability and robustness.

Parsifal: the LOD portal of the URBE consortium

On May 11, 2023 the Roman Union of Ecclesiastical Libraries (URBE) and @CULT (the technological branch of the Share Family, developing and maintaining its LOD Platform technology) announced the release of PARSIFAL, a cutting-edge linked data management system and entity discovery platform. PARSIFAL aims to enhance the exchange and interoperability of bibliographic data and authority items, significantly increasing accessibility to the abundant resources of the 16 participating URBE libraries. With a vast collection of 2.8 million records regularly updated, PARSIFAL offers users a streamlined research experience, enriched by diverse catalogues.

Much work has been done to give Parsifal libraries a shared authority system built on collaboration between libraries, which guarantees each member autonomy in local data treatment while sharing the data quality effort with the whole Parsifal community. This is achieved through unified access for the maintenance of the Central Authority Catalogue, whereby each library accesses the same system to increase the quality level of the catalogue. The authority data, commonly created or enriched by all URBE cataloguers, becomes part of the clustering/conversion pipeline, to optimize the libraries' production processes and increase the quality of the data published on the Parsifal portal.

LILLIT: the portal for Italian illustrated books 1501-1800

The LILLIT portal for Italian illustrated books 1501-1800 is a collaborative effort by Sapienza University, ICCU (Central Institute for the Single Catalog of Italian Libraries), the Central Institute for Graphics and @Cult. Developed through a Sapienza University of Rome project, LILLIT offers an intuitive interface, advanced search options, and access to digital copies. Utilizing Linked Open Data, the portal displays enhanced information on 16th-18th century editions, highlighting engraving techniques and creators. Operating on the BIBFRAME bibliographic model, LILLIT seamlessly aligns with the technological and conceptual innovation of the other Share Family initiatives, bringing new possibilities to the bibliographic landscape.

Towards new domains: Share Art, Share Music, Share MIA

The Share Family institutions and collaborative networks of libraries are engaging in discussions to establish three specialized shared discovery environments: Share Art, Share Music, and Share MIA (Manuscripts, Incunabula, and Ancient Books). These initiatives are designed to cater to the specific needs of the art, music, and ancient book domains, providing a wealth of resources and knowledge to users, and further strengthening the interconnected bibliographic data network through the use of linked open data technologies. This will translate into dedicated tenants (branches) of the Share Family infrastructure.

On this topic, two presentations were given at IFLA WLIC 2023 and at the IFLA 2023 Satellite Conference; they can be found in the Resources section of this wiki.

New brochure website for the Share Family

We are thrilled to introduce the new Share Family website, www.share-family.org, a digital hub designed to connect our community. This website serves as a window into the heart of the Share Family, offering a comprehensive overview of our mission and values, a general presentation of our goals and technology. The website also serves as a dynamic platform for keeping everyone informed about our latest activities, events, and advancements.

To delve even further into the intricacies of our project, we also encourage you to explore the other sections of this wiki, a resource which provides more in-depth information, including technical documents, a new brochure that can be freely distributed, and recent updates available both in presentation format and as a demo video.

Conferences and events

In 2023, the Share Family team and member institutions had the opportunity to showcase the initiative and highlight its latest developments at numerous conferences. These presentations have served as a platform for sharing insights, best practices, and the collective achievements of the Share Family community. You can find slides and, often, recordings of these events on this page of the Share Family wiki.

Work exchange with National Library of Finland member Serafia Kari

From April to May 2023 we welcomed Serafia Kari, a Service Designer for the Finna service within the National Library of Finland. Her main goal during her visit was to conduct a usability study focusing on the Share-VDE portal's discovery interface (www.svde.org). This study aimed to assess the information-sharing capabilities of the portal from the perspective of users from different backgrounds. Serafia recently had the opportunity to share her findings in a presentation titled "Usability Study of Share-VDE" at the BIBFRAME Workshop in Europe, held in Brussels on September 19, 2023. For those eager to delve deeper into her research, the slides and a recording of the session are available for review at https://www.bfwe.eu/brussels_2023.

Share Family community work and cooperation

We are very committed to encouraging the cooperation within the Share Family membership, to demonstrate the invaluable benefit of being part of an initiative developed and driven by libraries and for libraries. To support the constant exchange of ideas and principles that, together, define the vision, aims and progress of Share Family environments and their tools, we have created new materials and accompanying documents to guide member institutions into this process.

The wiki members’ area dedicated to participating institutions has been enriched with documentation on how to provide feedback on the various linked data entity discovery environments. Structured processes supported by ad hoc tools and testing environments have been shared with members that are actively involved in exchanges with the Share Family staff to collaboratively improve the system functioning.

Share-VDE and Share Family Working Groups

Share Family activities are organised into several work strands. Being a community initiative, the goals and desired outcomes are defined by the participating institutions through active engagement in different working groups. The current working groups are guided by the Advisory Council, which plays a key role in shaping the future of the initiative and provides advice on how to develop use cases and establish priorities, and ensures communication among member institutions.

Each member institution has a seat in the Advisory Council and this governance model based on direct participation of member institutions steering the initiative represents one of the major strengths and core values of the Share Family.

Also, each community of institutions participating in an individual Share Family initiative can establish its own governance group that determines the policy for the treatment and processing of the data.

Among the major achievements of the Working Groups in cooperation with the Share Family team it is worth mentioning those carried out by the SEI – Sapientia Entity Identification Working Group and the User Experience – User Interface Working Group cooperating with the National Bibliographies Working Group.

Sapientia Entity Identification Working Group

The SEI – Sapientia Entity Identification Working Group has worked on the creation of the Share-VDE Ontology, which is an extension to BIBFRAME. While the ontology supports the discovery functionality of Share-VDE and the Share Family search systems, it may be re-used in any system requiring a bridge among BIBFRAME, IFLA LRM and RDA. Classes of the Share-VDE ontology include svde:Opus, svde:OpusType, and svde:Work. The Share-VDE ontology achieves interoperability among the major bibliographic models by asserting that bibliographic entities are described by attribute sets. The attribute set modeling approach is a departure from the conceptual modeling that has informed the development of nearly all modern linked data models. A few steps remain to complete the ontology, and its preliminary version has been published; see https://doi.org/10.5281/zenodo.8332350. The latest public updates on the ontology were shared at the SWIB conference (slides and recording available).

User Experience – User Interface Working Group

The analysis for improvements of the Share Family entity discovery portals has been devised by the User Experience – User Interface Working Group cooperating with the National Bibliographies Working Group. After testing and using the beta version of Share-VDE 2.0 and the National Bibliographies portal, working group members have been analysing the front-end layer of the entity discovery portals according to a set of evaluation criteria including, for example, usability and accessibility, consistency of design and navigation, overall user friendliness, and integration with SVDE features and local library discoveries. In parallel, the groups have been collecting issues and adjustments that could be made to the display of information on the entity discovery portals, along with proposals for new features that will enhance the user experience.

This analysis, which will incorporate inputs also from the PCC, will output suggestions for improvements in the overall orchestration of the front-end layer and will continue with the exploration of further areas for enhancements and new features.

IFLA, Linked data and BIBFRAME communities

The Share Family team and member institutions continue to nurture the liaisons with the library information experts and linked data and BIBFRAME communities, to increase interoperability to the advantage of new bibliographic and linked data workflows and to enhance existing ones.

Tiziana Possemato participates in the IFLA Bibliographic Section through various working groups of this division, co-chairing with Annette Dortmund the National Bibliographies and New Technologies working group that will be dedicated to national bibliographies metadata and ICT technologies like AI, linked open data, the semantic web. This opportunity will also foster the interconnections of the Share Family National Bibliographies working group and the related tenant https://natbib-lod.org.

The exchanges with LD4P - Linked Data 4 Production will move towards establishing a permanent connection with the Sinopia BIBFRAME editor environment for the mutual exchange of data in BIBFRAME/RDF.

More broadly, Share-VDE takes part in the BIG - BIBFRAME Interoperability Group established to create guidelines for data exchange in BIBFRAME, thus expanding the methods and tools shared by linked data nodes for BIBFRAME interoperability.

LOD Platform developments

We are striving to enhance the Share Family LOD Platform technology towards production workflows that make available advanced tools and components to support linked data management systems and entity discovery platforms of institutions and consortia.

The increasing complexity of the Share Family system poses several challenges that we constantly tackle by improving the organisation of our teamwork and the planning of developments.

An overview of the major intertwined work strands follows.

JCricket entity editor

JCricket is the Share Family tool for collaborative entity curation shared across member institutions. It primarily serves as an entity editor, enabling - according to the BIBFRAME ontology - the creation of new entities, entity modification, and the application of merge and split functions to improve the quality of automated LOD Platform clustering processes. With JCricket, it becomes possible to manage or create any kind of entity, each representing what, in traditional cataloging, was known as a "bibliographic set".
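To make the merge and split functions concrete, here is a minimal sketch in Python. It is not JCricket code: the `Cluster` class, its fields and the record identifiers are hypothetical, and an entity is reduced to an id, a label and the set of source records that contributed to it.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """A toy entity: an id, a display label and the source records behind it."""
    cluster_id: str
    label: str
    record_ids: set[str] = field(default_factory=set)


def merge(target: Cluster, other: Cluster) -> Cluster:
    """Fold `other` into `target`, keeping the target's id and label."""
    return Cluster(target.cluster_id, target.label,
                   target.record_ids | other.record_ids)


def split(cluster: Cluster, moved: set[str],
          new_id: str, new_label: str) -> tuple[Cluster, Cluster]:
    """Move the records in `moved` out of `cluster` into a new cluster."""
    if not moved <= cluster.record_ids:
        raise ValueError("can only split off records the cluster contains")
    kept = Cluster(cluster.cluster_id, cluster.label,
                   cluster.record_ids - moved)
    new = Cluster(new_id, new_label, set(moved))
    return kept, new
```

A merge corrects two clusters that describe the same entity; a split undoes an automated clustering decision that grouped records of different entities together.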

Also, the availability of this tool enables new data workflows that are being analysed to support the integration and use of JCricket even outside of the Share Family LOD Platform. We are considering several scenarios where linked data systems (e.g. the Library of Congress Marva or LD4P Sinopia BIBFRAME editors) can simultaneously operate locally with their own tools and share and cooperatively edit linked data resources across different environments using JCricket.

JCricket operates on the Share-VDE CKB - Cluster Knowledge Base (and on the CKB of the other Share Family tenants), which is not a local data storage for a single library, but rather the outcome of complex integration processes that generate entities from local data in traditional formats, such as MARC, and new formats, like RDF. Its optimal application is within a large data pool formed through contributions from multiple libraries, such as Share-VDE; therefore, it does not impact original data that reside in member libraries’ systems (unless libraries want to use ad hoc APIs for entity updates both in SVDE and in their systems).

The following picture illustrates JCricket workflow from a conceptual standpoint:

jcr.jpg

The system fetches MARC records from member libraries for ingestion and BIBFRAME conversion; linked data entities are created from this process and stored in the Cluster Knowledge Base. Entities are in fact clusters, i.e. entities created through the aggregation of data from multiple institutions (Provenances).
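The idea of a cluster aggregating data from several Provenances can be illustrated with a deliberately simple sketch. The matching key used here (a normalised name) and the function names are assumptions made for illustration only; the actual LOD Platform clustering processes are considerably more elaborate.

```python
from collections import defaultdict


def normalise(name: str) -> str:
    """Crude matching key: lowercase, drop punctuation, sort the words."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " "
                      for ch in name.lower())
    return " ".join(sorted(cleaned.split()))


def cluster_records(records):
    """Group (provenance, record_id, name) tuples by normalised name.

    Returns a dict mapping each key to the list of (provenance, record_id)
    pairs that contributed to that cluster.
    """
    clusters = defaultdict(list)
    for provenance, record_id, name in records:
        clusters[normalise(name)].append((provenance, record_id))
    return dict(clusters)
```

With this key, "Eco, Umberto" from one library and "Umberto Eco" from another end up in the same cluster, each keeping a pointer to its Provenance.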

Using JCricket, member libraries are able to authenticate in a dedicated area of the search portal and manually edit potential errors found in entities that are created during the conversion of records contributed by member libraries. These edits on entities will be saved in the system.

Also, JCricket APIs allow communication with member libraries’ systems, to notify the changes made on the entities in the CKB: this can be done in case the libraries want to use this service to be informed about the changes in the clusters to which the library data contributed, or even to optionally activate the entity update APIs.
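As a rough sketch of this notification logic, the snippet below works out which libraries should hear about a change to a cluster. The function name, the payload fields and the notion of a subscription list are hypothetical and do not describe the real JCricket APIs.

```python
def build_notifications(change, cluster_members, subscribed):
    """Build one message per subscribed library that contributed to the cluster.

    change:          dict with hypothetical keys "cluster_id" and "action"
    cluster_members: iterable of (provenance, record_id) pairs
    subscribed:      provenances that opted in to change notifications
    """
    contributors = {provenance for provenance, _record_id in cluster_members}
    targets = sorted(contributors & set(subscribed))
    return [
        {"library": library,
         "cluster_id": change["cluster_id"],
         "action": change["action"]}
        for library in targets
    ]
```

Libraries that contributed to the cluster but did not opt in receive nothing, which mirrors the optional nature of the service.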

JCricket is integrated in the discovery portal web interface, for authenticated users.

jcr.jpg

Several demos were given this year, showing the advancements of the JCricket user interface, which will be released in a test version shortly.

The work on this key component of the LOD Platform technology and of the vision of Share Family environments for cooperative linked data sharing was one of the most intense throughout 2023. A test version of JCricket is soon to be released for user validation.

Third-party integration

The evolution of the Share Family technology encompasses the ability to mutually integrate the data produced by the LOD Platform with external systems, notably with local ILS and Library Service Platforms and authority sources.

As to ILS and LSP integration, it’s worth mentioning some advancements:

  • the new authority services for MARC-based workflows – designed with SVDE AIMS Working Group and further input by Stanford University Library – have been completed and are available for institutions willing to test and use them. Also, the AIMS Working Group will reconvene in 2024 to analyse and devise the authority control features for RDF / linked data based workflows;
  • the integration of Alma circulation APIs for local library services is almost completed;
  • the integration with the native BIBFRAME cataloguing editor Sinopia is progressing: the parser for incoming RDF data from Sinopia to be clustered by Share-VDE processes is under development;
  • the connection to FOLIO ILS has been analysed to correlate FOLIO Inventory data with Share-VDE data, and to integrate JCricket user interface in FOLIO. A possible model for ILS/LSP interaction through FOLIO was presented by Andrea Gazzarini from the Share Family team and Sebastian Hammer from Index Data at WOLFcon 2023, to engage discussion within the linked data community on how to pursue this connection.

As to integration with authority systems, several data sources are being investigated and in some cases the initial integration steps have been completed:

  • LD4P Questioning Authority lookup tool;
  • Wikidata for mutual enrichment of entity IDs (initial specifications were shaped by SVDE working groups);
  • ISNI for mutual enrichment of entity IDs (initial specifications were shaped by SVDE working groups).

The resulting scenario will translate into an integrated, “hybrid” operational ecosystem, based on a variety of tools and diverse data sources, including traditional workflows (e.g. new authority services for MARC workflows) as well as advanced models for data exchange (e.g. those envisaged above to operate simultaneously through JCricket and local BIBFRAME editors, both within the Share Family system and locally).

Data processing and BIBFRAME conversion

Catalogues ingestion and ongoing data updates

Much work is going on to complete the full workflow of data ingestion, conversion, regular updates of data exports from libraries and publication to the discovery portal of the Share-VDE tenant.

The very complex architecture of the Share Family, made of different tenants each one residing on a separate branch of the system infrastructure with its own installation of databases and system components, entails articulated processes that often take some time to become stable, due to the impact they have on the whole system. In this scenario, the completion of the module handling ongoing updates (“delta updates” in Share Family jargon) done in 2023 is a significant milestone adding value to a system moving towards the support of stable production workflows.
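The principle behind a delta update can be shown with a small sketch: instead of reloading a whole catalogue, only the additions, changes and deletions since the last export are applied. The operation names and the record store below are assumptions for illustration and do not reflect the actual module.

```python
def apply_delta(store: dict, delta: list) -> dict:
    """Return a copy of `store` with the delta operations applied in order.

    Each delta entry is a tuple (operation, record_id, payload), where
    operation is "add", "update" or "delete" (payload is ignored on delete).
    """
    updated = dict(store)
    for operation, record_id, payload in delta:
        if operation in ("add", "update"):
            updated[record_id] = payload
        elif operation == "delete":
            updated.pop(record_id, None)  # deleting a missing record is a no-op
        else:
            raise ValueError(f"unknown operation: {operation}")
    return updated
```

Because the function returns a new dictionary, the previous state stays available for comparison if an update needs to be checked or rolled back.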

The full workflow of ingestion and ongoing updates has been tested for the beta version of the BNB – British National Bibliography skin portal on NatBib tenant, and is being fine tuned for the future steps towards bringing the BNB in production. The BNB experience will also be useful to measure the performance of the system in view of the much larger scale process for the Share-VDE tenant.

It’s also important to mention that we anticipate the release of a new version 3 of the software, which supports JCricket and introduces major changes. This release will trigger a new ingestion of libraries’ catalogues and the connection to the delta updates module; pausing the load of library catalogues with the previous version of the software was preparatory to this major change. The new version 3 release is expected in early 2024, starting with the Share-VDE tenant; the subsequent steps (i.e. delta updates module set-up and data ingestion, publication of new data on the svde.org portal) will follow.

Granularization of the Cluster Knowledge Base and new conversion model

So far, the support of linked data conversion from multiple input formats has been managed through multiple conversion pipelines.

As the system evolved over time, it became clear that a single “source of truth” is needed to orchestrate a conversion workflow that is as frictionless as possible. Several analysis and implementation actions have therefore been undertaken to revise the current conversion model and streamline this process by extending input data capabilities and making the conversion component format-agnostic.

To achieve this, the new conversion model should support:

  • a finer granularity level of the CKB in compliance with BIBFRAME granularity;
  • a “format-agnostic” CKB with extended input data capabilities to converge all input formats into one conversion source (e.g. MARC21, UNIMARC, native BIBFRAME/RDF such as that from LD4P Sinopia application profiles, etc.);
  • one single conversion pipeline from the CKB – removing the conversion pipeline based on MARC.
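The requirements above can be pictured as a small sketch in which each input format has its own reader, all readers produce the same intermediate record, and a single conversion function works only on that intermediate form. The dictionary-based record shapes and the function names are toy stand-ins, not the CKB's actual data structures; the only real-world facts relied on are that the title proper sits in MARC21 field 245 $a and UNIMARC field 200 $a, and that BIBFRAME expresses it as bf:title / bf:mainTitle.

```python
def read_marc21(record: dict) -> dict:
    # MARC21 carries the title proper in field 245, subfield a
    return {"title": record["245"]["a"]}


def read_unimarc(record: dict) -> dict:
    # UNIMARC carries the title proper in field 200, subfield a
    return {"title": record["200"]["a"]}


def read_bibframe(record: dict) -> dict:
    # Simplified BIBFRAME shape: bf:title -> bf:mainTitle
    return {"title": record["bf:title"]["bf:mainTitle"]}


# One reader per input format; adding a format means adding one entry here.
READERS = {
    "marc21": read_marc21,
    "unimarc": read_unimarc,
    "bibframe": read_bibframe,
}


def convert(common: dict) -> dict:
    """The single conversion step, unaware of the original input format."""
    return {"bf:title": {"bf:mainTitle": common["title"]}}


def ingest(input_format: str, record: dict) -> dict:
    """Read a record in any supported format, then run the one pipeline."""
    try:
        reader = READERS[input_format]
    except KeyError:
        raise ValueError(f"unsupported input format: {input_format}")
    return convert(reader(record))
```

In this arrangement a MARC21 record and a UNIMARC record describing the same title yield identical output, and no conversion logic is duplicated per format.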

In addition, the granularization of the CKB will facilitate the implementation of the Share-VDE Ontology extensions that have been introduced during 2023 by the work of the SEI – Sapientia Entity Identification Working Group (see the current version of the SVDE Ontology at https://zenodo.org/doi/10.5281/zenodo.8332350), and the ongoing improvements to the clustering algorithm that will derive from members’ and users’ tests with the data.

Finally, the review of the CKB will facilitate the integration of data from other domains, including archival and museum domains.

Once the final conversion pipeline is in place, the triple store will make available the stable version of the data, which will be even more suitable for reuse outside the Share system.

This work will continue in 2024.

Subjects and Concepts

Collaboration with numerous libraries has contributed to improving the data processing of subject-related information. Currently, we identify and display the most general level, which is the Related Subjects. These subjects may encompass various linguistic variations, but these variations will not be explicitly categorized as such; typically, they will be considered as Related Subjects.

concepts.jpg

In the future, we plan to highlight more specific subsets of equivalent subjects or linguistic variations. This selection will be based on the original data provided by the libraries. Variations will only be identified as such if the library has accurately flagged them in their own records. Additionally, further analysis is being conducted to enhance the development of Concepts and their display in all Share Family tenants.

topics.jpg

Non-Latin scripts

Within the Share Family community, a significant and ongoing discussion revolves around the processing of non-Latin scripts. This encompassing dialogue includes active participation from members within the Advisory Council. Testing of data provided by the National Taiwan University Library will be carried out by the LD4P Non-Latin Script Material affinity group.

Also, an experiment is being carried out with a test portal supporting Arabic script. It has given us the occasion to test further with non-Latin scripts and to apply to the LOD Platform features that will be propagated to all other Share Family tenants, i.e.:

  • the creation of Subject entities starting from authority subjects (i.e. not only from bib. record access points);
  • an improved version of Subject management, including related subjects.

Goals for the next period

By adopting BIBFRAME as the main ontology in compatibility with IFLA-LRM, the Share Family takes advantage of the potentials of linked open data to facilitate interoperability among data pools, in coexistence with MARC. Leveraging this approach, we work to ultimately:

  • consolidate outputs and collaborative tools to enhance workflows and services for consortia or individual libraries;
  • transform library catalogs into research tools with structured access and visibility to original language research in all disciplines;
  • serve as an authoritative data source and contribute to a new bibliographic ecosystem where data modeling, enrichment and sharing are handled collectively;
  • apply and support open metadata policies.

As always, this will be achieved by being independent of local practices and of ILS/LSP local systems, and through the international collaboration at the heart of the Share Family vision.

We will continue to cooperate with any kind of system or initiative that will be eager to become “Share-ready”.

Your opinion is always welcome: to provide feedback on the Share Family discovery website or, as an external user, to report bugs and suggestions, reach out through the forum https://forum.svde.org/ or send a message to helpdesk@svde.org.

For general information on the initiative, contact info@svde.org.