Publish to triple store so you can SPARQL your linked data [module proposal]

Intro

This idea is posted here to get feedback from the Omeka S community. Is this a user story you can relate to? Is the described functionality already present in an existing Omeka S module? Is the description/requirements section clear enough for a developer to work on a module?

User story

I’d like to query my linked data (managed in Omeka S) via SPARQL. This opens up the possibility of federated queries, which combine data from multiple distributed SPARQL endpoints into a single query result.
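A federated query might look like the sketch below: local Omeka S data is enriched with labels from Wikidata via the SERVICE keyword. The local triples and item IRIs are hypothetical, and it assumes creators are stored as Wikidata IRIs; the Wikidata endpoint URL is real.

```sparql
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?item ?title ?creatorLabel WHERE {
  # Triples from the local Omeka S triple store
  ?item dcterms:title   ?title ;
        dcterms:creator ?creator .   # assumed to be a Wikidata IRI

  # Enrich the result with data from a remote endpoint
  SERVICE <https://query.wikidata.org/sparql> {
    ?creator rdfs:label ?creatorLabel .
    FILTER (lang(?creatorLabel) = "en")
  }
}
LIMIT 10
```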

Concepts

A triple store is a specialized database used for storing and managing RDF (Resource Description Framework) data. RDF data is represented in the form of subject-predicate-object triples, where the subject represents a resource, the predicate represents a relationship or property, and the object represents the value or another resource.
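For example, a single triple in Turtle syntax (the IRIs and values are made up for illustration):

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .

# subject: an Omeka S item (hypothetical IRI)
<https://example.org/omeka/api/items/42>
    dcterms:title   "Map of Gouda, 1585" ;                     # predicate + literal object
    dcterms:creator <https://example.org/omeka/api/items/7> .  # object: another resource
```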

A triple store efficiently indexes and stores these triples, making it possible to query and retrieve information about resources and their relationships. This allows for flexible and scalable management of linked data and semantic information, making triple stores an essential component in the world of Linked Data and Semantic Web applications.

SPARQL (SPARQL Protocol and RDF Query Language) is a query language designed specifically for querying RDF data stored in triple stores. It enables users to construct complex queries that retrieve specific information from the triple store based on patterns in the RDF data. With SPARQL, users can ask questions and request data using graph patterns, filtering, aggregations, and other query features, similar to how SQL is used to query relational databases.
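A minimal SELECT query, for comparison with SQL (the property is Dublin Core's title; results depend on the data in the store):

```sparql
PREFIX dcterms: <http://purl.org/dc/terms/>

# Find every resource that has a title, like
# "SELECT id, title FROM items" in SQL.
SELECT ?resource ?title WHERE {
  ?resource dcterms:title ?title .
}
ORDER BY ?title
LIMIT 25
```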

Examples (proof of concept)

All data of the Gouda Time Machine is available in a triple store (GraphDB) and can be queried via https://www.goudatijdmachine.nl/sparql/.

This has been accomplished by ‘harvesting’ all JSON-LD, converting it to N-Triples and importing these into a triple store like GraphDB. The ‘harvest and convert’ part can be accomplished via the Linked Data Sets module, but harvesting can take a lot of time. An approach like the Advanced Search adapter for Solr by @Daniel_KM might be more efficient: the Solr module indexes the metadata of each item in Solr when it’s created or updated (and, I guess, deleted), so the index is kept up to date in near real time. A near-real-time, up-to-date triple store would be nice.
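To illustrate the convert step: each Omeka S item is already exposed as JSON-LD via the REST API, and that representation maps directly onto N-Triples. The item IRI below is a hypothetical example.

```turtle
# The item's JSON-LD, e.g.
#   { "@id": "https://example.org/omeka/api/items/42",
#     "http://purl.org/dc/terms/title": "Map of Gouda, 1585" }
# converts to this single N-Triples statement:
<https://example.org/omeka/api/items/42> <http://purl.org/dc/terms/title> "Map of Gouda, 1585" .
```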

Module requirements

  • be able to INSERT/DELETE data by using standard SPARQL queries (no triple-store-dependent APIs).
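A sketch of the kind of standard SPARQL 1.1 Update request such a module could send to the endpoint whenever an item is saved: first remove the item's existing triples, then insert its current metadata. The item IRI and triples are hypothetical examples.

```sparql
PREFIX dcterms: <http://purl.org/dc/terms/>

# Remove all existing triples for the item...
DELETE WHERE {
  <https://example.org/omeka/api/items/42> ?p ?o .
} ;

# ...then insert the item's current metadata.
INSERT DATA {
  <https://example.org/omeka/api/items/42>
      dcterms:title "Map of Gouda, 1585" .
}
```

On item deletion, only the DELETE WHERE part would be issued. Because this is standard SPARQL 1.1 Update, it should work against GraphDB, Fuseki, Virtuoso, or any other conformant store.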

Yes, I’m working on it for the project Manioc of the Université des Antilles (currently managed externally with Greenstone), but probably only for display, not insert/delete.