Last edited 6 months ago
by Serena Cericola

Query Languages

Revision as of 14:33, 20 February 2024 by Andrea Gazzarini

Introduction

The Share-VDE Search API supports three different query languages[1]. Each has a different purpose, as briefly illustrated in the following diagram:

query languages.png

SVDEQL

SVDEQL is a pseudo-natural query language used for querying the Share-VDE dataset.

The query language has been implemented to fulfil the Share-VDE advanced search requirements. As a consequence, it is not a general-purpose query language; it is strictly tied to the entities that can be searched through the advanced search.

The query language is exposed through both the RESTful and the GraphQL APIs.

Syntax

Initial Token: Advanced or Simple Search?

If there is an initial token which identifies the entity we want to query:

  • agents whose: for querying agents without specifying the type in advance
  • people whose: for querying people
  • families whose: for querying families
  • meetings whose: for querying meetings
  • organisations whose: for querying organisations
  • opuses whose: for querying opuses
  • publications whose: for querying publications

then the full SVDEQL syntax is expected, as described in the following sections. Otherwise, if the query consists only of query terms, a plain "simple" term search is executed. At the time of writing, term search is available for all entities except works, instances, publications and items.
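The distinction between the two modes can be sketched as follows. This is only an illustration: the function name classify_query is hypothetical, and it assumes the initial token must appear in lowercase at the very start of the query, which the text above does not state explicitly.

```python
from typing import Optional, Tuple

# Initial tokens listed above; each one is followed by the keyword "whose".
ENTITY_TOKENS = ("agents", "people", "families", "meetings",
                 "organisations", "opuses", "publications")

def classify_query(query: str) -> Tuple[str, Optional[str]]:
    """Return ("advanced", entity) or ("simple", None).

    Assumption: the token is the first word of the query, in lowercase,
    and is immediately followed by the word "whose".
    """
    words = query.split()
    if len(words) >= 2 and words[0] in ENTITY_TOKENS and words[1] == "whose":
        return "advanced", words[0]
    return "simple", None

print(classify_query("people whose name contains Carroll"))  # ('advanced', 'people')
print(classify_query("Lewis Carroll"))                        # ('simple', None)
```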

When used in GraphQL, the SvdeQL type encapsulates the information needed for issuing a query:

  • the query
  • the number of results we want to get back in the returned page
  • the start offset within the overall results
  • a flag which forces a "partial match" logic (i.e. user entered query terms are considered optional)

If an advanced search is triggered (that is, one of the initial tokens above is detected), the following sections apply.

Clauses

After the declaration above, there must be at least one clause with the following syntax:

<attribute> <predicate> <value> where

  • the attribute is a valid attribute for the requested entity (e.g. it's not possible to use "dissolutionYear" in a person query). See below for a list of valid attributes
  • the predicate is a valid predicate for the attribute above (e.g. it's not possible to use "begins with" for a numeric attribute)
  • the value is a valid value according to the attribute (e.g. the value of a numeric attribute must be numeric)

When there are multiple clauses, they must be separated by a boolean operator (in uppercase):

  • AND
  • OR
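As an illustration of the clause rules above, the following sketch assembles a query string from (attribute, predicate, value) triples. The helper name build_svdeql is hypothetical; it only concatenates text and does not check that attributes or predicates are valid for the entity.

```python
def build_svdeql(entity_token, clauses, operators=()):
    """Join clauses with uppercase AND / OR operators.

    entity_token: one of the initial tokens, e.g. "agents"
    clauses:      list of (attribute, predicate, value) tuples
    operators:    one boolean operator between each pair of clauses
    """
    if not clauses:
        raise ValueError("at least one clause is required")
    if len(operators) != len(clauses) - 1:
        raise ValueError("need exactly one operator between consecutive clauses")
    parts = []
    for i, (attribute, predicate, value) in enumerate(clauses):
        if i > 0:
            op = operators[i - 1].upper()  # operators must be uppercase
            if op not in ("AND", "OR"):
                raise ValueError(f"unsupported operator: {op}")
            parts.append(op)
        parts.append(f"{attribute} {predicate} {value}")
    return f"{entity_token} whose " + " ".join(parts)

query = build_svdeql(
    "agents",
    [("beginningDate", "is", 1992),
     ("endingDate", "is in range", "from 2000 to 2010")],
    ["and"],
)
print(query)
# agents whose beginningDate is 1992 AND endingDate is in range from 2000 to 2010
```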

Some predicates can be expressed in different forms. Here's a list of them:

  • doesn't contain, does not contain
  • doesn't begin with, does not begin with
  • doesn't match, does not match
  • isn't in range, is not in range
  • exactly matches, matches
  • isn't, is not

The tables below show only one form of each predicate, but any of the variants listed above can be used in its place.
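A client that wants to compare or log queries in a single form could map each variant to the form used in the tables. The sketch below is a naive, purely textual illustration (the name normalise_predicates is hypothetical); it would also rewrite matching text inside quoted phrases, so it is not a parser.

```python
import re

# Variant -> form used in the tables (pairs taken from the list above).
PREDICATE_VARIANTS = {
    "does not contain": "doesn't contain",
    "does not begin with": "doesn't begin with",
    "does not match": "doesn't match",
    "is not in range": "isn't in range",
    "exactly matches": "matches",
    "is not": "isn't",
}

def normalise_predicates(query: str) -> str:
    # Replace longer variants first so "is not in range" wins over "is not".
    for variant in sorted(PREDICATE_VARIANTS, key=len, reverse=True):
        pattern = rf"\b{re.escape(variant)}\b"
        query = re.sub(pattern, PREDICATE_VARIANTS[variant], query)
    return query

print(normalise_predicates(
    "people whose name does not contain Carroll AND birthDate is not in range from 1900 to 1950"
))
# people whose name doesn't contain Carroll AND birthDate isn't in range from 1900 to 1950
```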

Entities

Agents

Agents refer generically to organisations, people, families, meetings without explicitly indicating the specific type. As a consequence of that, the available attributes are a superset which contains things valid for all agents.

Valid clauses
Attribute Predicate Value
name Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
description Fulltext search predicates[2] terms, phrases[3]
beginningDate is, isn't, is in range, isn't in range range[4] or a numeric value
endingDate is, isn't, is in range, isn't in range range[4] or a numeric value
location is, isn't URI[5]
Examples
  • agents whose name contains Carroll
  • agents whose name contains "Lewis Carroll"
  • agents whose beginningDate is 1992 AND endingDate is in range from 2000 to 2010
  • agents whose beginningDate is in range from 1982 to 1999
  • agents whose location is https://svde.org/places/2387273
  • agents whose location is https://svde.org/places/2387273 AND name contains Carroll

People

Valid clauses
Attribute Predicate Value
firstName Fulltext search predicates[2] terms, phrases[3]
lastName Fulltext search predicates[2] terms, phrases[3]
name Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
description Fulltext search predicates[2] terms, phrases[3]
birthDate is, isn't, is in range, isn't in range range[4] or a numeric value
deathDate is, isn't, is in range, isn't in range range[4] or a numeric value
occupation is, isn't URI[5]
birthPlace is, isn't URI[5]
deathPlace is, isn't URI[5]
Examples
  • people whose name contains Carroll
  • people whose name contains "Lewis Carroll"
  • people whose birthDate is 1992
  • people whose birthDate is in range from 1982 to 1999
  • people whose deathDate is in range to 1999
  • people whose deathDate is in range from 1982
  • people whose birthPlace is https://svde.org/places/2387273
  • people whose deathPlace is https://svde.org/places/2387273 AND name contains Carroll
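The range examples above (both bounds, only an upper bound, only a lower bound) can be produced with a small helper. This is a sketch; the name range_clause and its parameters are illustrative and not part of the API.

```python
from typing import Optional

def range_clause(attribute: str, start: Optional[int] = None,
                 end: Optional[int] = None, negate: bool = False) -> str:
    """Render an "is in range" clause with one or both bounds."""
    if start is None and end is None:
        raise ValueError("a range needs at least one bound")
    predicate = "isn't in range" if negate else "is in range"
    bounds = []
    if start is not None:
        bounds.append(f"from {start}")
    if end is not None:
        bounds.append(f"to {end}")
    return f"{attribute} {predicate} {' '.join(bounds)}"

print("people whose " + range_clause("birthDate", 1982, 1999))
# people whose birthDate is in range from 1982 to 1999
print("people whose " + range_clause("deathDate", end=1999))
# people whose deathDate is in range to 1999
```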

Families

Valid clauses
Attribute Predicate Value
name Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
description Fulltext search predicates[2] terms, phrases[3]
startDate is, isn't, is in range, isn't in range range[4] or a numeric value
endDate is, isn't, is in range, isn't in range range[4] or a numeric value
Examples
  • families whose name contains kennedy
  • families whose name contains "Kennedy family"
  • families whose startDate is 1992
  • families whose endDate is in range from 1982 to 1999

Organisations

Valid clauses
Attribute Predicate Value
name Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
description Fulltext search predicates[2] terms, phrases[3]
foundingYear is, isn't, is in range, isn't in range range[4] or a numeric value
dissolutionYear is, isn't, is in range, isn't in range range[4] or a numeric value
location is, isn't URI[5]
Examples
  • organisations whose name contains international
  • organisations whose name contains "International company"
  • organisations whose foundingYear is 1992
  • organisations whose dissolutionYear is in range from 1982 to 1999

Meetings

Valid clauses
Attribute Predicate Value
name Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
description Fulltext search predicates[2] terms, phrases[3]
year is, isn't, is in range, isn't in range range[4] or a numeric value
location is, isn't URI[5]
Examples
  • meetings whose name contains annual
  • meetings whose name contains "annual conference of BIBLIO"
  • meetings whose year is 1992
  • meetings whose year is in range from 1982 to 1999

Opuses

Valid clauses
Attribute Predicate Value
title Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
year is, isn't, is in range, isn't in range range[4] or a numeric value
contributor is, isn't, is known, is unknown See below
work is, isn't, is unknown URI[5]
genre is, isn't, is known URI[5]

The contributor attribute has its own specific syntax:

contributor   
    is / isn't    
    (any type | any person | any meeting | any organisation | any family | <URI>)   
    (as <relator code> | in any role)

or

contributor is known / is unknown 

where

  • <URI>: the resource (contributor) URI[5]
  • <relator code>: the relator code in case we want to search for a specific role.
Examples (contributor attribute)
  • opuses whose contributor is any type in any role
  • opuses whose contributor is any type as aut
  • opuses whose contributor is any person in any role
  • opuses whose contributor is any person as aut
  • opuses whose contributor is http://dbpedia.org/resource/MarioRossi in any role
  • opuses whose contributor is http://dbpedia.org/resource/MarioRossi as aut
  • opuses whose contributor isn't any type in any role
  • opuses whose contributor isn't any type as aut
  • opuses whose contributor isn't any person in any role
  • opuses whose contributor isn't any person as aut
  • opuses whose contributor isn't http://dbpedia.org/resource/MarioRossi in any role
  • opuses whose contributor isn't http://dbpedia.org/resource/MarioRossi as aut
  • opuses whose contributor is not any type in any role
  • opuses whose contributor is not any type as aut
  • opuses whose contributor is not any person in any role
  • opuses whose contributor is not any person as aut
  • opuses whose contributor is not http://dbpedia.org/resource/MarioRossi in any role
  • opuses whose contributor is not http://dbpedia.org/resource/MarioRossi as aut
  • opuses whose contributor is known
  • opuses whose contributor is unknown
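The contributor examples above all follow the same pattern, which can be sketched as a small builder. The function name contributor_clause is hypothetical, and the URI check is a simplistic assumption made only for this illustration.

```python
from typing import Optional

AGENT_KINDS = ("any type", "any person", "any meeting",
               "any organisation", "any family")

def contributor_clause(who: str, role: Optional[str] = None,
                       negate: bool = False) -> str:
    """Render "contributor is/isn't <who> (as <role> | in any role)".

    who:  one of AGENT_KINDS or a contributor URI
    role: a relator code such as "aut", or None for "in any role"
    """
    if who not in AGENT_KINDS and not who.startswith(("http://", "https://")):
        raise ValueError(f"expected an agent kind or a URI, got: {who!r}")
    predicate = "isn't" if negate else "is"
    role_part = f"as {role}" if role else "in any role"
    return f"contributor {predicate} {who} {role_part}"

print("opuses whose " + contributor_clause("any person", "aut"))
# opuses whose contributor is any person as aut
print("opuses whose " + contributor_clause("any type", negate=True))
# opuses whose contributor isn't any type in any role
```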

Publications

A publication is a logical entity which groups

  • 1 Instance
  • the corresponding Items
  • the instance parent Work
Valid clauses
Attribute Predicate Value
title Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
publicationPlace is, isn't URI[5]
format is, isn't URI[5]
publicationYear is, isn't, is in range, isn't in range range[4] or a numeric value
note Fulltext search predicates[2] terms, phrases[3]
isbnOrIssn Fulltext search predicates[2] terms, phrases[3]
eanOrIsmn Fulltext search predicates[2] terms, phrases[3]
language is, isn't URI[5]
subject is, isn't URI[5]
holdingInstitution Fulltext search predicates[2] terms, phrases[3]
barcode is, isn't text
classification Fulltext search predicates[2] terms, phrases[3]
contributor is, isn't, is known, is unknown See the contributor attribute in Opuses (above)
anyField contains terms, phrases
library is, isn't URI[5]
opusType is, isn't URI[5]
printOnlineChoice is, isn't print, online
auctionExhibition is, isn't auction, exhibition

The following table lists the attribute ownership within the Publication entity

Attribute Entity
title Instance
identifier Work, Instance
publicationPlace Instance
format Instance
publicationYear Instance
note Instance
isbnOrIssn Instance
eanOrIsmn Instance
language Work
holdingInstitution Item
barcode Item
classification Work
contributor Work, Instance
subject Work
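For client code that needs to know which underlying entity a Publication attribute belongs to, the ownership table can be kept as a simple lookup. The sketch below copies the rows of the table above; the names OWNERSHIP and attributes_owned_by are illustrative only.

```python
from typing import Dict, List, Tuple

# Attribute -> owning entities, copied from the ownership table above.
OWNERSHIP: Dict[str, Tuple[str, ...]] = {
    "title": ("Instance",),
    "identifier": ("Work", "Instance"),
    "publicationPlace": ("Instance",),
    "format": ("Instance",),
    "publicationYear": ("Instance",),
    "note": ("Instance",),
    "isbnOrIssn": ("Instance",),
    "eanOrIsmn": ("Instance",),
    "language": ("Work",),
    "holdingInstitution": ("Item",),
    "barcode": ("Item",),
    "classification": ("Work",),
    "contributor": ("Work", "Instance"),
    "subject": ("Work",),
}

def attributes_owned_by(entity: str) -> List[str]:
    """Return the Publication attributes owned by the given entity, sorted."""
    return sorted(a for a, owners in OWNERSHIP.items() if entity in owners)

print(attributes_owned_by("Item"))  # ['barcode', 'holdingInstitution']
```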

Instances

Instead of querying publications, a user with editing capabilities can also query their component parts directly, that is, instances (as well as works and items).

Valid clauses
Attribute Predicate Value
title Fulltext search predicates[6] terms, phrases[7]
identifier Fulltext search predicates[6] terms, phrases[7]
publicationPlace is, isn't URI[8]
format is, isn't URI[8]
publicationType is, isn't URI[8]
publicationYear is, isn't, is in range, isn't in range range[9] or a numeric value
note Fulltext search predicates[6] terms, phrases[7]
isbnOrIssn Fulltext search predicates[6] terms, phrases[7]
eanOrIsmn Fulltext search predicates[6] terms, phrases[7]
contributor is, isn't, is known, is unknown See the contributor attribute in Opuses (above)
anyField contains terms, phrases
printOnlineChoice is, isn't print, online

Works

Valid clauses
Attribute Predicate Value
identifier Fulltext search predicates[10] terms, phrases[11]
language is, isn't URI[12]
subject is, isn't URI[12]
classification Fulltext search predicates[10] terms, phrases[11]
contributor is, isn't, is known, is unknown See the contributor attribute in Opuses (above)
anyField contains terms, phrases

Items

Valid clauses
Attribute Predicate Value
identifier Fulltext search predicates[13] terms, phrases[14]
holdingInstitution Fulltext search predicates[13] terms, phrases[14]
barcode is, isn't text
anyField contains terms, phrases
library is, isn't URI[15]

StructQL

StructQL is a structured, JSON-like query language used for querying the Share-VDE dataset. It has been implemented to fulfil the Share-VDE advanced search requirements.

As a consequence, it is not a general-purpose query language; it is strictly tied to the entities that can be searched through the advanced search.

The language is exposed only in the GraphQL API. The whole syntax is part of the GraphQL schema, which can be browsed using the GraphiQL[16] interface available in our SIT environment.

Syntax

Target Entity type

The requested entity type is driven by the specific GraphQL operation. For example, the people(...) operation returns Person entities, the families(...) operation returns Family entities, and so on.

Clauses

There must be at least one clause with the following syntax:

{ 
    q: [ 
        { <attribute> : {p: <predicate>, o: <value>}, op: <boolean operator> } 
    ] 
}

where

  • attribute is a valid attribute for the requested entity (e.g. it's not possible to use "dissolutionYear" in a person query). See below for a list of valid attributes
  • predicate is a valid predicate for the attribute above (e.g. it's not possible to use "begins with" for a numeric attribute)
  • value is a valid value according to the attribute (e.g. the value of a numeric attribute must be numeric)
  • the boolean operator (and, or) is mandatory only when another clause follows, for example:
{ 
    q: [ 
        { name : {p: CONTAINS, o: "Carroll"}, op:and}, 
        { name : {p: CONTAINS, o: "Lewis"}} 
    ] 
}
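A client could generate this structure programmatically. The following is a minimal sketch that renders the q argument as text, assuming string values are double-quoted and predicates and operators are written as bare tokens, as in the example above. The helper names structql_value and structql_query are hypothetical.

```python
def structql_value(value):
    # Strings are double-quoted (with escaping); numbers are written as-is.
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return str(value)

def structql_query(clauses, operators=()):
    """Render a StructQL query.

    clauses:   list of (attribute, PREDICATE, value) tuples
    operators: "and" / "or", one between each pair of consecutive clauses
    """
    if not clauses:
        raise ValueError("at least one clause is required")
    if len(operators) != len(clauses) - 1:
        raise ValueError("need one operator between consecutive clauses")
    rendered = []
    for i, (attribute, predicate, value) in enumerate(clauses):
        clause = f"{attribute}: {{p: {predicate}, o: {structql_value(value)}}}"
        if i < len(operators):  # op is only needed when another clause follows
            clause += f", op: {operators[i]}"
        rendered.append(f"{{ {clause} }}")
    return "{ q: [ " + ", ".join(rendered) + " ] }"

print(structql_query(
    [("name", "CONTAINS", "Carroll"), ("name", "CONTAINS", "Lewis")],
    ["and"],
))
# { q: [ { name: {p: CONTAINS, o: "Carroll"}, op: and }, { name: {p: CONTAINS, o: "Lewis"} } ] }
```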

Entities

Agents

Valid clauses
Attribute Predicate Value
name Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
description Fulltext search predicates[2] terms, phrases[3]
beginningDate IS, ISNT range[4] or a numeric value
endingDate IS, ISNT range[4] or a numeric value
location IS, ISNT URI[5]

Examples

{
    { name : {p: CONTAINS, o: "Carroll"}}
}

{
    { name : {p: CONTAINS, o: "\"Lewis Carroll\""}}
}

{
    { beginningDate : {p: IS, o: 1992}}
}

{
    { endingDate : {p: IS, from: 1982, to: 1999}}
}

{
    { location : {p: IS, o:"https://svde.org/places/2387273"}}
}

{
    { location : {p: IS, o:"https://svde.org/places/2387273"}, op: and },
    { name : {p: CONTAINS, o:"Carroll"} }
}

People

Valid clauses
Attribute Predicate Value
firstName Fulltext search predicates[2] terms, phrases[3]
lastName Fulltext search predicates[2] terms, phrases[3]
name Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
description Fulltext search predicates[2] terms, phrases[3]
birthDate IS, ISNT range[4] or a numeric value
deathDate IS, ISNT range[4] or a numeric value
occupation IS, ISNT URI[5]
birthPlace IS, ISNT URI[5]
deathPlace IS, ISNT URI[5]
Examples

See the examples above.

Family

Valid clauses
Attribute Predicate Value
name Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
description Fulltext search predicates[2] terms, phrases[3]
startDate IS, ISNT range[4] or a numeric value
endDate IS, ISNT range[4] or a numeric value
Examples

See the examples above.

Organisation

Valid clauses
Attribute Predicate Value
name Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
description Fulltext search predicates[2] terms, phrases[3]
foundingYear IS, ISNT range[4] or a numeric value
dissolutionYear IS, ISNT range[4] or a numeric value
location IS, ISNT URI[5]
Examples

See the examples above.

Meeting

Valid clauses
Attribute Predicate Value
name Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
description Fulltext search predicates[2] terms, phrases[3]
year IS, ISNT range[4] or a numeric value
location IS, ISNT URI[5]
Examples

See the examples above.

Opuses

Valid clauses
Attribute Predicate Value
title Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
year IS, ISNT range[4] or a numeric value
contributor IS, ISNT, IS_KNOWN, IS_UNKNOWN See below
work IS, ISNT URI[5]
genre IS, ISNT URI[5]

The contributor attribute has its own specific syntax, which follows this pseudo-syntax:

contributor   
   IS / ISNT 
   (ANY | any person | any meeting | any organisation | any family | <URI>)
   (as <relator code> | ANY)

or contributor IS_KNOWN / IS_UNKNOWN where

  • ANY: a special placeholder for indicating (depending on the context) any agent type or any role
  • <URI>: the resource (contributor) URI[5]
  • <relator code>: the relator code in case we want to search for a specific role.

Publications

A publication is a logical entity which groups

  • an instance
  • the corresponding items
  • the parent work
Valid clauses
Attribute Predicate Value
title Fulltext search predicates[2] terms, phrases[3]
identifier Fulltext search predicates[2] terms, phrases[3]
publicationPlace IS, ISNT URI[5]
format IS, ISNT URI[5]
publicationYear IS, ISNT range[4] or a numeric value
note Fulltext search predicates[2] terms, phrases[3]
isbnOrIssn Fulltext search predicates[2] terms, phrases[3]
eanOrIsmn Fulltext search predicates[2] terms, phrases[3]
language IS, ISNT URI[5]
availability IS, ISNT URI[5]
subject IS, ISNT URI[5]
holdingInstitution Fulltext search predicates[2] terms, phrases[3]
barcode IS, ISNT text
classification Fulltext search predicates[2] terms, phrases[3]
contributor IS, ISNT, IS_KNOWN, IS_UNKNOWN See the contributor attribute in Opuses (above)
subject (not yet supported)
anyField CONTAINS terms, phrases
library IS, ISNT URI[5]
opusType IS, ISNT URI[5]
printOnlineChoice IS, ISNT print, online
auctionExhibition IS, ISNT auction, exhibition

The following table lists the attribute ownership within the Publication entity

Attribute Entity
title Instance
identifier Work, Instance
publicationPlace Instance
format Instance
publicationYear Instance
note Instance
isbnOrIssn Instance
eanOrIsmn Instance
language Work
availability Item
holdingInstitution Item
barcode Item
classification Work
contributor Work, Instance
subject Work
Examples
{
    q: [
        { publicationPlace: { p: IS, o: "https://svde.org/places/837463"}}
    ]
}

{
    q: [
        { publicationYear: { p: IS, from: 1993, to:2001 }}
    ]
}

{
    q: [
        { contributor: { p: IS,agentType: "ANY", role:"ANY"}}
    ]
}

{
    q: [
        { contributor: { p: IS,agentType: "ANY", role:"aut"}}
    ]
}

{
    q: [
        { contributor: { p: IS,agentType: "https://svde.org/agentTypes/Person", role:"ANY"}}
    ]
}

{
    q: [
        { contributor: { p: IS,agentType: "https://svde.org/agentTypes/Person", role:"aut"}}
    ]
}

{
    q: [
        { contributor: { p: IS, uri: "https://svde.org/agents/2837273",role:"ANY"}}
    ]
}

{
    q: [
        { contributor: { p: IS, uri: "https://svde.org/agents/2837273",role:"aut"}}
    ]
}

{
    q: [
        { contributor: { p: IS_KNOWN }}
    ]
}

{
    q: [
        { contributor: { p: IS_UNKNOWN }}
    ]
}
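The contributor clauses shown in the examples above can be generated with a small helper. This is a sketch only: the field names p, agentType, uri and role are taken from the examples, while the function name structql_contributor and its validation rules are assumptions made for illustration.

```python
from typing import Optional

def structql_contributor(predicate: str = "IS",
                         agent_type: Optional[str] = None,
                         uri: Optional[str] = None,
                         role: str = "ANY") -> str:
    """Render a single StructQL contributor clause."""
    if predicate in ("IS_KNOWN", "IS_UNKNOWN"):
        # These predicates take no target and no role.
        return f"{{ contributor: {{ p: {predicate} }} }}"
    if predicate not in ("IS", "ISNT"):
        raise ValueError(f"unsupported predicate: {predicate}")
    if (agent_type is None) == (uri is None):
        raise ValueError("pass exactly one of agent_type or uri")
    target = f'uri: "{uri}"' if uri is not None else f'agentType: "{agent_type}"'
    return f'{{ contributor: {{ p: {predicate}, {target}, role: "{role}" }} }}'

print(structql_contributor(agent_type="ANY", role="aut"))
# { contributor: { p: IS, agentType: "ANY", role: "aut" } }
print(structql_contributor("IS_KNOWN"))
# { contributor: { p: IS_KNOWN } }
```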

TermsQL (TQL)

TQL is not a query language in itself: the name denotes a query composed only of search terms, used in typeahead contexts. Typeahead search is not available for all entities. The following entities / endpoints support it:

Core Entities/Endpoints

  • /agents
  • /people
  • /organisations
  • /meetings
  • /families

Controlled Vocabulary Entities/Endpoints

  • /agentTypes
  • /availabilities
  • /formats
  • /forms
  • /genres
  • /languages
  • /occupations
  • /places
  • /roles
  • /subjectTypes

Note that only the endpoints above support typeahead search. This matters because in some cases the same entity can also be accessed through other endpoints. For example, occupations can be accessed through:

  • /occupations
  • /people/201/occupations

Only the first endpoint can trigger a typeahead search.

A typeahead search accepts the following parameters

  • mode: must be set to typeahead
  • fuzzy: enables fuzzy logic; if the original terms entered by the user don't produce any results, the search is repeated after applying a correction to them
  • edits: the maximum number of corrections (in terms of character edits) the fuzzy logic applies to the original terms entered by the user
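As an illustration, a typeahead request URL could be assembled as follows. The parameter names mode, fuzzy and edits come from the list above; the name of the parameter carrying the search terms ("q" here) and the base URL are assumptions, since this page does not specify them.

```python
from urllib.parse import urlencode

def typeahead_url(base_url, endpoint, terms, fuzzy=False, edits=None):
    """Build a typeahead URL for one of the supported endpoints."""
    # "q" is an assumed parameter name for the search terms.
    params = {"q": terms, "mode": "typeahead"}
    if fuzzy:
        params["fuzzy"] = "true"
        if edits is not None:
            params["edits"] = edits
    return f"{base_url.rstrip('/')}{endpoint}?{urlencode(params)}"

# https://example.org/api is a placeholder, not a real Share-VDE address.
print(typeahead_url("https://example.org/api", "/people", "carr", fuzzy=True, edits=2))
# https://example.org/api/people?q=carr&mode=typeahead&fuzzy=true&edits=2
```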

When a typeahead search is requested, the system executes a first round by looking for matches using

  • the original terms entered by the user
  • the headings associated with the language of the requestor. The search language is negotiated through the Accept-Language HTTP header and defaults to EN (English)

In case of zero results,

  • if the fuzzy parameter is set to true, then a second search is executed according to the value of the edits parameter (defaults to 1). A boolean flag in the response indicates whether the fuzzy logic has been enabled for that specific search
  • a third search is executed using the original terms but this time the target headings are those associated with the other available languages
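The fallback sequence described above can be sketched as follows. This is not the actual implementation: search is a stand-in callable for the real index, and the sketch assumes that the fuzzy round targets the requestor's language and that the third round runs only when the earlier rounds return nothing.

```python
def typeahead(search, terms, language="en", other_languages=(),
              fuzzy=False, edits=1):
    """Run the typeahead rounds in order and stop at the first one with hits.

    search(terms, languages, edits) -> list of hits (stand-in for the index).
    """
    # Round 1: original terms against headings in the requestor's language.
    hits = search(terms, [language], 0)
    if hits:
        return {"hits": hits, "fuzzy": False}
    # Round 2 (only if requested): retry allowing up to `edits` character edits.
    if fuzzy:
        hits = search(terms, [language], edits)
        if hits:
            return {"hits": hits, "fuzzy": True}
    # Round 3: original terms against headings in the other languages.
    hits = search(terms, list(other_languages), 0)
    return {"hits": hits, "fuzzy": False}

# A toy index keyed by (terms, languages, edits), used only for this demo.
TOY_RESULTS = {("carol", ("en",), 1): ["Carroll, Lewis"]}

def toy_search(terms, languages, edits):
    return TOY_RESULTS.get((terms, tuple(languages), edits), [])

print(typeahead(toy_search, "carol", fuzzy=True))
# {'hits': ['Carroll, Lewis'], 'fuzzy': True}
```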

Examples

-------

  1. https://docs.google.com/presentation/d/1tjc6J_HOPtcbSvERcMwD9BX5DlCokT_zywyC-LOaScg/edit#slide=id.p1
  2. Fulltext search predicates: contains, doesn't contain, matches, doesn't match, begins with, doesn't begin with
  3. Some predicates, like "begins with" or "doesn't begin with", don't allow a mix of phrases and terms in the value, because it wouldn't make sense
  4. Range queries can have both bounds (e.g. "is in range from 1982 to 1999") or just one of them (e.g. "is in range from 1928" or "is in range to 1999")
  5. URIs are supposed to be Share-VDE URIs (e.g. https://svde.org/places/273623)
  6. Fulltext search predicates: contains, doesn't contain, matches, doesn't match, begins with, doesn't begin with
  7. Some predicates, like "begins with" or "doesn't begin with", don't allow a mix of phrases and terms in the value, because it wouldn't make sense
  8. URIs are supposed to be Share-VDE URIs (e.g. https://svde.org/places/273623)
  9. Range queries can have both bounds (e.g. "is in range from 1982 to 1999") or just one of them (e.g. "is in range from 1928" or "is in range to 1999")
  10. Fulltext search predicates: contains, doesn't contain, matches, doesn't match, begins with, doesn't begin with
  11. Some predicates, like "begins with" or "doesn't begin with", don't allow a mix of phrases and terms in the value, because it wouldn't make sense
  12. URIs are supposed to be Share-VDE URIs (e.g. https://svde.org/places/273623)
  13. Fulltext search predicates: contains, doesn't contain, matches, doesn't match, begins with, doesn't begin with
  14. Some predicates, like "begins with" or "doesn't begin with", don't allow a mix of phrases and terms in the value, because it wouldn't make sense
  15. URIs are supposed to be Share-VDE URIs (e.g. https://svde.org/places/273623)
  16. https://uat3-base-svde.atcult.it/api/graphiql