Wikidata:WikiProject GLAM data registry

GLAM institutions (galleries, libraries, archives and museums) have made considerable amounts of data available online. This data is important to digital humanists and citizen scientists, among others. However, its findability and accessibility present a major hurdle to its use: there are no established registries that list available data resources.

In this respect, Wikidata is useful, as it can serve as an open, publicly editable, performant and sufficiently flexible registry for GLAM data resources.

By 'data resources', we mean either APIs or data dumps. The aim of this WikiProject is twofold:

  • on the one hand, to develop a viable data model that can adequately describe data dumps and APIs;
  • on the other, to invite participants to describe data resources in a collective, more or less systematic effort.

Relevant similar projects include:

The data model should strive for compatibility with:

  • DCAT 3
  • api-catalog
  • Henk Alkemade; Gustavo Candela; Steven Claeyssens; et al. (7 July 2025), Datasheets for Digital Cultural Heritage Datasets, 2, doi:10.5281/ZENODO.15828222, Wikidata Q137188431
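
To make the compatibility goal concrete, here is a minimal sketch of how the Wikidata properties used on this page might be paired with DCAT terms. The pairing itself is an assumption made for illustration, not an agreed part of the data model, and the name `WIKIDATA_TO_DCAT` and the function `to_dcat` are hypothetical.

```python
# Hypothetical pairing of Wikidata properties with DCAT / Dublin Core terms.
# Every pairing below is an assumption for illustration only.
WIKIDATA_TO_DCAT = {
    "P6269": "dcat:endpointURL",         # API endpoint URL
    "P973": "dcat:endpointDescription",  # described at URL (as a qualifier)
    "P2700": "dcterms:conformsTo",       # protocol
    "P2701": "dcterms:format",           # file format
    "P4945": "dcat:downloadURL",         # download URL
    "P1325": "dcat:accessURL",           # external data available at URL
}


def to_dcat(statements: dict[str, str]) -> dict[str, str]:
    """Translate {property ID: value} into {DCAT term: value}.

    Property IDs without a pairing are skipped rather than guessed.
    """
    return {
        WIKIDATA_TO_DCAT[pid]: value
        for pid, value in statements.items()
        if pid in WIKIDATA_TO_DCAT
    }


example = to_dcat({
    "P6269": "http://openaccess-api.clevelandart.org",
    "P31": "unmapped value",  # dropped: no pairing defined
})
print(example)
```

Keeping the pairing in a single table makes it easy to revise once the data model settles.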

Data structure

Data dump

One way to refer to data dumps could be by download URL (P4945) or external data available at URL (P1325).
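
If data dumps were recorded with either of these properties directly on the institution's item, they could be listed with a query along the following lines. This is a sketch under that assumption; the helper name `dump_query` and its parameters are hypothetical.

```python
def dump_query(glam_class: str = "Q33506", limit: int = 100) -> str:
    """Build a SPARQL query listing items of a class that carry a
    download URL (P4945) or external data available at URL (P1325).

    Assumes the URL is stated directly on the institution's item.
    """
    # Guard against anything that is not a plain Wikidata item ID.
    if not (glam_class.startswith("Q") and glam_class[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {glam_class!r}")
    return f"""
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?glam ?url WHERE {{
  ?glam wdt:P31/wdt:P279* wd:{glam_class} ;
        wdt:P4945|wdt:P1325 ?url .
}}
LIMIT {int(limit)}
""".strip()


print(dump_query())
```

The property path `wdt:P4945|wdt:P1325` matches either property, so both ways of referring to a dump are covered by one pattern.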

Qualifiers:

API

Property

| Title | ID | Data type | Description | Examples | Inverse |
| --- | --- | --- | --- | --- | --- |
| API endpoint URL | P6269 | URL | web API: base URL of a web service | Cleveland Museum of Art <API endpoint URL> "http://openaccess-api.clevelandart.org" | - |

Qualifiers

| Title | ID | Data type | Description | Examples | Inverse |
| --- | --- | --- | --- | --- | --- |
| protocol | P2700 | Item | communication protocol: communication protocol to use to access a dataset or service | Cleveland Museum of Art <API endpoint URL> "http://openaccess-api.clevelandart.org"; <protocol> HTTP | - |
| file format | P2701 | Item | file format: file format, compression type, or ontology used in a file | Cleveland Museum of Art <API endpoint URL> "http://openaccess-api.clevelandart.org"; <file format> JSON | - |
| described at URL | P973 | URL | URL: item is described at the following URL | Cleveland Museum of Art <API endpoint URL> "http://openaccess-api.clevelandart.org"; <described at URL> "https://openaccess-api.clevelandart.org/" | - |
| described at URL | P973 | URL | URL: item is described at the following URL | Cleveland Museum of Art <user manual URL> "http://openaccess-api.clevelandart.org"; <described at URL> "https://openaccess-api.clevelandart.org/" | - |
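
The statement-plus-qualifiers structure described above can be sketched as a small data class. The class name `ApiEndpointStatement` and its fields are hypothetical. The rendering only reproduces the notation used in the examples and is not an import format for Wikidata.

```python
from dataclasses import dataclass, field


@dataclass
class ApiEndpointStatement:
    """One API endpoint URL (P6269) statement together with its qualifiers."""
    subject: str                      # label of the institution
    endpoint_url: str                 # value of API endpoint URL
    qualifiers: dict[str, str] = field(default_factory=dict)

    def render(self) -> str:
        """Render the statement in the notation used in the examples."""
        lines = [f'{self.subject} <API endpoint URL> "{self.endpoint_url}"']
        for label, value in self.qualifiers.items():
            # URL values are quoted, item labels are left bare.
            is_url = value.startswith(("http://", "https://"))
            shown = f'"{value}"' if is_url else value
            lines.append(f"<{label}> {shown}")
        return "\n".join(lines)


stmt = ApiEndpointStatement(
    subject="Cleveland Museum of Art",
    endpoint_url="http://openaccess-api.clevelandart.org",
    qualifiers={
        "protocol": "HTTP",
        "file format": "JSON",
        "described at URL": "https://openaccess-api.clevelandart.org/",
    },
)
print(stmt.render())
```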

Queries

Here is a query, written for QLever, to get the REST API endpoints of museums in Berlin:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX pr: <http://www.wikidata.org/prop/reference/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
SELECT ?glam ?glamLabel ?endpointUrl WHERE {
  # Instances of museum (Q33506) or of its subclasses, located in Berlin (Q64)
  ?glam wdt:P31/wdt:P279* wd:Q33506 ;
        wdt:P131* wd:Q64 ;
        p:P6269 ?s .                  # API endpoint URL statement
  ?s ps:P6269 ?endpointUrl .          # the endpoint URL itself
  ?s pq:P2700 wd:Q749568 .            # protocol qualifier restricted to REST
  ?glam rdfs:label ?glamLabel .
  FILTER (LANG(?glamLabel) = "de")    # German labels only
}
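
A query like this can also be run from a script. The following is a minimal sketch using only the Python standard library. It assumes the public Wikidata Query Service endpoint and the standard SPARQL JSON results format. The function names and the User-Agent string are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the public Wikidata Query Service.
# Another SPARQL endpoint for Wikidata could be substituted.
ENDPOINT = "https://query.wikidata.org/sparql"


def run_query(query: str, endpoint: str = ENDPOINT) -> dict:
    """Send a SPARQL query via HTTP GET and return the decoded JSON result."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(url, headers={
        "Accept": "application/sparql-results+json",
        "User-Agent": "glam-data-registry-example/0.1 (hypothetical)",
    })
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def endpoints_by_glam(results: dict) -> dict[str, list[str]]:
    """Group ?endpointUrl values by ?glamLabel from a SPARQL JSON result."""
    grouped: dict[str, list[str]] = {}
    for row in results["results"]["bindings"]:
        label = row["glamLabel"]["value"]
        grouped.setdefault(label, []).append(row["endpointUrl"]["value"])
    return grouped
```

Calling `endpoints_by_glam(run_query(query))` with the query text above would return a mapping from each museum's label to its endpoint URLs. Parsing is kept separate from fetching so that it can be exercised without network access.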

Here is a breakdown of GLAM APIs grouped by protocol, written for the Wikidata Query Service:

#defaultView:BarChart
SELECT ?protocol ?protocolLabel (COUNT(DISTINCT ?s) AS ?n) WHERE {
  # library (Q7075), museum (Q33506), archive (Q166118)
  VALUES ?glamCategories { wd:Q7075 wd:Q33506 wd:Q166118 }
  ?glam wdt:P31/wdt:P279* ?glamCategories .
  ?glam p:P6269 ?s ;
    wdt:P6269 ?api .
  # Statements without a protocol qualifier are kept and form their own group
  OPTIONAL { ?s pq:P2700 ?protocol . }
  OPTIONAL { ?s pq:P973 ?described . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?protocol ?protocolLabel
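
The same breakdown could be computed client-side from query results. This is a sketch assuming the rows have already been reduced to (statement ID, protocol label) pairs. The function name `count_by_protocol` and the sample IDs are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def count_by_protocol(pairs: Iterable[tuple[str, Optional[str]]]) -> Counter:
    """Count distinct endpoint statements per protocol label.

    A protocol of None stands for a statement without a protocol qualifier.
    Repeated (statement, protocol) rows are counted once.
    """
    return Counter(protocol for _, protocol in set(pairs))


rows = [
    ("statement-1", "HTTP"),
    ("statement-1", "HTTP"),   # duplicate row, counted once
    ("statement-2", "HTTP"),
    ("statement-3", None),     # no protocol qualifier
]
print(count_by_protocol(rows))
```

Statements that lack a protocol qualifier are counted under `None`, which makes gaps in the registry visible.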

Subpages

Participants

The participants listed below can be notified using the following template in discussions:
{{Ping project|GLAM data registry}}