Database – Technologies

Frontend

The Thesaurus frontend is a web interface developed by the project itself and offers all the functions of a modern and user-friendly database: a free-text search across the entire database, an expert search in which individual fields or several fields can be searched in combination, and numerous filter options. In keeping with the Thesaurus' central interest in images of antiquities, navigation from record to record is via the images wherever possible. There are convenient full-screen views of individual records as well as a direct image and object comparison between two records. If a record focuses on a single illustration, its context on the plate or book page is always immediately recognisable, and navigation to the records of neighbouring illustrations can also take place directly via the image.

The frontend was developed with Vue.js 2 and Vuetify. A switch to Vue.js 3 is planned in the near future. The free-text search is based on an Elasticsearch index.
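
As an illustration of how a free-text search and an expert search over individual fields could be expressed against an Elasticsearch index, the following minimal sketch builds a Query DSL request body. The function name and the field names are hypothetical and do not reflect the project's actual index schema; only the Query DSL keys (`bool`, `must`, `multi_match`, `match`, `match_all`) are standard Elasticsearch.

```python
import json


def build_search_body(free_text=None, field_terms=None, size=20):
    """Build an Elasticsearch Query DSL body (illustrative sketch only).

    free_text   -- text searched across the index's default fields
    field_terms -- dict of {field name: value} for the expert search;
                   the field names are hypothetical
    """
    must = []
    if free_text:
        # Without an explicit "fields" list, multi_match falls back to the
        # index's default field setting.
        must.append({"multi_match": {"query": free_text}})
    # Expert search: every field/value pair has to match in addition.
    for field, value in (field_terms or {}).items():
        must.append({"match": {field: value}})
    if not must:
        # No criteria given: return all records.
        return {"query": {"match_all": {}}, "size": size}
    return {"query": {"bool": {"must": must}}, "size": size}


if __name__ == "__main__":
    # "artist" is an assumed field name used only for this example.
    body = build_search_body("Laocoon", {"artist": "Marco Dente"})
    print(json.dumps(body, indent=2))
```

The resulting dictionary could be sent as the JSON body of a search request; combining all clauses under `must` means a record has to satisfy the free-text part and every field condition at once.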

RDF, CIDOC CRM, Triplestore

One goal of the Thesaurus in the digital domain is to publish the data collected by the project in the Semantic Web, i.e. to enable their further dissemination and processing according to the FAIR principles (Findable, Accessible, Interoperable, Reusable). This requires two things: the transfer of the data recorded by the Thesaurus from heidICON's relational data model (see below) into an RDF model compatible with CIDOC CRM, which has become standard in the cultural heritage sector, and the establishment of a public SPARQL endpoint through which the semantically modelled data can be automatically queried and further used. To this end, the heidICON data model was first mapped conceptually to CIDOC CRM, and the corresponding software pipeline was then programmed in Python.
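
To make the idea of such a conversion concrete, here is a minimal sketch of how one flat record might be turned into CIDOC CRM statements serialised as N-Triples. It is not the project's pipeline: the base URI, the record keys (`id`, `title`) and the choice of `E22_Human-Made_Object`, `P102_has_title` and `E35_Title` are assumptions made for illustration, and the class names follow recent CIDOC CRM versions.

```python
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "https://example.org/thesaurus/"  # hypothetical base URI


def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record):
    """Map a flat record (assumed keys: 'id', 'title') to N-Triples lines."""
    obj = "<" + BASE + "object/" + record["id"] + ">"
    title = "<" + BASE + "object/" + record["id"] + "/title>"
    label = escape_literal(record["title"])
    return [
        # The recorded object is typed as a human-made object.
        f"{obj} <{RDF_TYPE}> <{CRM}E22_Human-Made_Object> .",
        # Its title becomes a node of its own, linked by P102.
        f"{obj} <{CRM}P102_has_title> {title} .",
        f"{title} <{RDF_TYPE}> <{CRM}E35_Title> .",
        # The title string itself is attached as a plain label.
        f'{title} <{RDFS_LABEL}> "{label}" .',
    ]


if __name__ == "__main__":
    for line in record_to_ntriples({"id": "42", "title": "Laocoon group"}):
        print(line)
```

A relational column thus becomes a small graph: the object, a separate title node, and the statements connecting them. A real mapping would cover many more fields and would typically use an RDF library rather than string formatting.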

This data conversion also opens up new possibilities for processing the data in different and more varied ways within the project and for enriching it with external information, e.g. for persons and places that do not receive their own records within the Thesaurus because heidICON's data model does not provide for this. For this reason, the Thesaurus itself accesses the CIDOC-CRM-mapped RDF data stored in a local triplestore, selectively enriches it with external data (e.g. from Wikidata) and forwards it to the Thesaurus frontend. Currently, Blazegraph is used as the triplestore.
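
The flow described here (query the local triplestore, then selectively add external data) could look roughly like the following sketch. The endpoint URL is an assumption based on Blazegraph's default local setup, the query presumes that persons carry `owl:sameAs` links to Wikidata (which the source does not state), and `merge_external` is a hypothetical helper. The request is only constructed, not sent.

```python
import urllib.parse
import urllib.request

# Assumed local endpoint (Blazegraph's default port and namespace "kb").
LOCAL_ENDPOINT = "http://localhost:9999/blazegraph/namespace/kb/sparql"

# Assumption: entities are linked to Wikidata via owl:sameAs.
PERSON_QUERY = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?person ?label ?wikidata WHERE {
  ?person rdfs:label ?label ;
          owl:sameAs ?wikidata .
  FILTER(STRSTARTS(STR(?wikidata), "http://www.wikidata.org/entity/"))
}
LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def merge_external(local, external):
    """Add external values only for keys that the local data lacks."""
    merged = dict(external)
    merged.update(local)  # local project data always takes precedence
    return merged


if __name__ == "__main__":
    request = build_sparql_request(LOCAL_ENDPOINT, PERSON_QUERY)
    print(request.full_url)
    print(merge_external({"label": "Rome"}, {"label": "Roma", "country": "Italy"}))
```

Letting local values override external ones keeps the project's own data authoritative while still filling gaps, which is one plausible reading of "selectively enriches".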

Comprehensive documentation of the mapping and the setup of the SPARQL endpoint will follow later in 2023. The software developed by the project will also be published.

Backend

In order to meet the requirements formulated by the initiative for a national research data infrastructure (NFDI), the project is entering into a partnership with Heidelberg University Library. As the main pillar of the DFG-funded ‘Fachinformationsdienste’ arthistoricum.net for art history and Propylaeum for classical studies, Heidelberg University Library offers a digital infrastructure for the backend of the Thesaurus database that is designed for the long term and has been tested in numerous applications and cooperations.

The object and image database heidICON serves the Thesaurus as an instrument for data collection. Both the heidICON data model and the database management system currently in use, Easydb (version 5), are generically designed and implemented so that they can do justice to a wide range of topics, and in particular to the potential complexities of cultural heritage objects such as those the Thesaurus investigates. The data model incorporates authority data from the GND and other providers, and is also largely mapped to the LIDO data exchange format, so that standardised data exports to aggregators such as the Deutsche Digitale Bibliothek, Europeana or the Graphikportal are possible. In addition, the heidICON data can be fully accessed via the Easydb API. Finally, all data from heidICON and from the digitisation platform DWork are integrated into a powerful backup and long-term archiving system (heiARCHIVE).