diff --git a/federated-catalogue/src/docs/architecture/chapters/05_building_block_view.adoc b/federated-catalogue/src/docs/architecture/chapters/05_building_block_view.adoc
index fdfbcbd..4caefa1 100644
--- a/federated-catalogue/src/docs/architecture/chapters/05_building_block_view.adoc
+++ b/federated-catalogue/src/docs/architecture/chapters/05_building_block_view.adoc
@@ -74,9 +74,21 @@ The Graph Database can be considered as a kind of search index. The single sourc
 
 Generically returning the Self-Description files containing claims that influence query response is not possible. To get the relevant Self-Description files, the query to the Graph Database can be formulated to return the Gaia-X entity that is the __credentialSubject__ of a Verifiable Credential. Then this can be used as a filter for the Self-Description endpoint, to download the Self-Description file.
 
-Neo4j is used as implementation of the Graph database.
+The graph database backend is **pluggable**. The implementation is selected via the `graphstore.impl` configuration property:
 
-[IMPORTANT]
+[options="header",cols="1,2,2"]
+|===
+| Value | Backend | Query Language
+| `neo4j` | Neo4j (default) | openCypher
+| `fuseki` | Apache Jena Fuseki | SPARQL-star
+| `dummy` | No-op (for testing) | —
+|===
+
+Both backends implement the same `GraphStore` interface. Core services contain no backend-specific logic.
+
+===== Neo4j backend
+
+[IMPORTANT]
 .Limitation: queries to non-Enterprise Neo4j Graph database return empty record when no results found, rather than empty list.
 ====
 When there is no data in the Graph database, i.e., no claims extracted from Self-Description, there is still a configuration node for the neosemantics modulefootnote:[https://neo4j.com/labs/neosemantics/4.0/config/[Configuring Neo4j to use RDF data]], which enables Neo4j to support the RDF data model, which is required here.
@@ -89,6 +101,10 @@ However this revoke operation is only supported in Neo4j enterprise.footnote:[ht
 We chose not to implement a workaround that involves query rewriting, as this may have harmful side effects.
 ====
 
+===== Fuseki backend
+
+The Fuseki backend uses Apache Jena with RDF-star. Claims are stored as RDF-star triple nodes: `<< subject predicate object >> cred:credentialSubject credentialSubjectUri`. Queries must use SPARQL-star syntax. Results are shuffled unless the query contains an `ORDER BY` clause.
+
 ==== File Store
 
 The File Store is responsible to persist all file based content submitted to the catalogue. These are Self-Descriptions and Schemas.
@@ -249,7 +265,7 @@ This component is used by the _Verification Service_ to extract Claims from Self
 
 ==== Graph Store
 
-The Graph Store is the interface for interacting with the Graph Database. It receives claims (extracted from Self-Descriptions) and adds them to the graph database. The Graph Database only contains claims from active Self-Descriptions and offers an openCypher query interface. External ontologies are not queried when processing requests.
+The Graph Store is the interface for interacting with the Graph Database. It receives claims (extracted from Self-Descriptions) and adds them to the graph database. The Graph Database only contains claims from active Self-Descriptions and offers a query interface (openCypher for the Neo4j backend, SPARQL-star for the Fuseki backend). External ontologies are not queried when processing requests.
 
 [source,java]
 ----
diff --git a/federated-catalogue/src/docs/architecture/chapters/06_runtime_view.adoc b/federated-catalogue/src/docs/architecture/chapters/06_runtime_view.adoc
index a00fd56..066a230 100644
--- a/federated-catalogue/src/docs/architecture/chapters/06_runtime_view.adoc
+++ b/federated-catalogue/src/docs/architecture/chapters/06_runtime_view.adoc
@@ -771,6 +771,10 @@ When calling `GET /query` a HTML page with a query input form is displayed.
 
 ==== Query the catalogue
 
+The query endpoint accepts an optional `queryLanguage` query parameter to select the query language. The default is `OPENCYPHER` (for the Neo4j backend). When the Fuseki backend is configured, use `queryLanguage=SPARQL` to issue SPARQL-star queries. Queries in a language not supported by the active backend are rejected with HTTP 501.
+
+Only read queries are allowed. Write queries (e.g., `CREATE`/`DELETE` in openCypher, `INSERT`/`DELETE` in SPARQL) are rejected with HTTP 500.
+
 [mermaid, width=2000]
 ....
 sequenceDiagram
@@ -778,7 +782,7 @@ sequenceDiagram
     actor User
     participant API
     participant Graph as Graph-DB
-    User ->> + API: POST /query
+    User ->> + API: POST /query?queryLanguage={OPENCYPHER|SPARQL}
     API ->> + Graph: queryData(sdQuery)
     Graph ->> Graph: validateQuery()
     Note right of Graph: Only read queries are allowed
diff --git a/federated-catalogue/src/docs/architecture/chapters/07_deployment_view.adoc b/federated-catalogue/src/docs/architecture/chapters/07_deployment_view.adoc
index 69a5005..3aec893 100644
--- a/federated-catalogue/src/docs/architecture/chapters/07_deployment_view.adoc
+++ b/federated-catalogue/src/docs/architecture/chapters/07_deployment_view.adoc
@@ -13,13 +13,13 @@ ifndef::imagesdir[:imagesdir: ../../images]
 
 == Deployment View
 
-The Federated Catalogue is provided as a container based application. The different components are shown in <<img_deployment_view>>. The main component is the `Federated Catalogue Server`. It requires a PostgreSQL-Serverfootnote:[https://www.postgresql.org/[PostgreSQL: The world's most advanced open source database]] instance to store Self-Description metadata and a Neo4jfootnote:[https://neo4j.com/[Neo4j Graph Data Platform | Graph Database Management System]] instance to store the Self-Description graph and to support the query interface of the catalogue.
+The Federated Catalogue is provided as a container based application. The different components are shown in <<img_deployment_view>>. The main component is the `Federated Catalogue Server`. It requires a PostgreSQL-Serverfootnote:[https://www.postgresql.org/[PostgreSQL: The world's most advanced open source database]] instance to store Self-Description metadata and a graph database instance to store the Self-Description graph and to support the query interface of the catalogue. The graph database backend is pluggable, selected via the `graphstore.impl` configuration property. Supported backends are Neo4jfootnote:[https://neo4j.com/[Neo4j Graph Data Platform | Graph Database Management System]] (openCypher queries) and Apache Jena Fusekifootnote:[https://jena.apache.org/documentation/fuseki2/[Apache Jena Fuseki]] (SPARQL-star queries).
 
 [#img_deployment_view,reftext='Figure {counter:refnum}']
 image::07_deployment_view.png[title="Deployment View"]
 
 In addition, a Keycloak instance is needed to handle authentication. Keycloak, as well as the portal, used to interact with the Federated Catalogue, is considered out of scope for the Federated Catalogue itself.
 
-<<img_deployment_view>> shows the default ports for the components' interaction. The connection settings (hostnames, ports) are configurable. It is recommended to replace the default credentials. To store persistent data, the `Federated Catalogue Server`, `PostgreSQL` as well as `Neo4J` require a volume, which must be mounted into the container.
+<<img_deployment_view>> shows the default ports for the components' interaction. The connection settings (hostnames, ports) are configurable. It is recommended to replace the default credentials. To store persistent data, the `Federated Catalogue Server`, `PostgreSQL` as well as the graph database (Neo4j or Fuseki) require a volume, which must be mounted into the container.
 
 Kubernetesfootnote:[https://kubernetes.io/[Kubernetes]] is the recommended infrastructure for production deployment. For development, Dockerfootnote:[https://www.docker.com/[Docker: Accelerated, Containerized Application Development]] in combination with docker-compose can be used.
diff --git a/federated-catalogue/src/docs/architecture/chapters/08_concepts.adoc b/federated-catalogue/src/docs/architecture/chapters/08_concepts.adoc
index 0d1e42c..5865896 100644
--- a/federated-catalogue/src/docs/architecture/chapters/08_concepts.adoc
+++ b/federated-catalogue/src/docs/architecture/chapters/08_concepts.adoc
@@ -18,7 +18,7 @@ Data is persisted in the following components:
 
 * Federated Catalogue Server: Self-Description and Schema files (Filesystem)
 * Metadata Store: Self-Description and Schema metadata (PostgreSQL)
-* Graph store: Self-Description graph (Neo4j)
+* Graph store: Self-Description graph (Neo4j or Fuseki, depending on `graphstore.impl` configuration)
 
 During the deployment it must be ensured that volumes are mounted to store the persistent data in the mentioned components.
 
diff --git a/federated-catalogue/src/docs/architecture/chapters/13_appendix.adoc b/federated-catalogue/src/docs/architecture/chapters/13_appendix.adoc
index 1b68854..a0a8b43 100644
--- a/federated-catalogue/src/docs/architecture/chapters/13_appendix.adoc
+++ b/federated-catalogue/src/docs/architecture/chapters/13_appendix.adoc
@@ -15,6 +15,8 @@ ifndef::imagesdir[:imagesdir: ../../images]
 
 These examples demonstrate how to interact with the xref:05_building_block_view#_graph_database[Graph Database].
 
+NOTE: The examples below use **openCypher** syntax, which applies to the **Neo4j** backend (`graphstore.impl=neo4j`). When using the **Fuseki** backend (`graphstore.impl=fuseki`), equivalent queries must be expressed in **SPARQL-star** syntax via the `POST /query?queryLanguage=SPARQL` endpoint. See <<sparql_star_examples>> for SPARQL-star equivalents.
+
 All of the following examples have the same structure:
 
 Data::
@@ -279,3 +281,44 @@ MATCH (n:ServiceOffering) WHERE n.name = "Elastic Search DB" RETURN n.uri LIMIT
 ----
 [{n.uri=http://w3id.org/gaia-x/indiv#serviceElasticSearch.json}]
 ----
+
+[[sparql_star_examples]]
+==== SPARQL-star Examples (Fuseki backend)
+
+The following examples show equivalent queries for the Fuseki backend using SPARQL-star syntax. These are sent via `POST /query?queryLanguage=SPARQL`.
+
+===== All triples
+
+Retrieve all triples stored in the graph:
+
+[source,sparql]
+----
+SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 100
+----
+
+===== All claims with credential subject
+
+Retrieve all Self-Description claims together with the credential subject they belong to, using RDF-star syntax:
+
+[source,sparql]
+----
+PREFIX cred:
+SELECT ?s ?p ?o ?cs
+WHERE {
+  << ?s ?p ?o >> cred:credentialSubject ?cs
+}
+LIMIT 100
+----
+
+===== Claims for a specific credential subject
+
+Retrieve claims belonging to a specific Self-Description:
+
+[source,sparql]
+----
+PREFIX cred:
+SELECT ?s ?p ?o
+WHERE {
+  << ?s ?p ?o >> cred:credentialSubject
+}
+----
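
As a rough client-side illustration of the `POST /query?queryLanguage=SPARQL` call used by the examples above, the following Python sketch builds such a request without sending it. The base URL, the `build_query_request` helper, and the JSON body shape with a `statement` field are assumptions made for illustration; this change does not confirm them, so they should be checked against the deployed catalogue's API.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL of a catalogue server; not taken from the documentation.
BASE_URL = "http://localhost:8081"


def build_query_request(statement, query_language="OPENCYPHER", base_url=BASE_URL):
    """Build (without sending) a POST /query request for the given query language."""
    if query_language not in ("OPENCYPHER", "SPARQL"):
        raise ValueError(f"unsupported query language: {query_language}")
    # The queryLanguage parameter selects the language, as described in the runtime view.
    url = f"{base_url}/query?{urlencode({'queryLanguage': query_language})}"
    # Assumed body shape: a JSON object carrying the query text in "statement".
    body = json.dumps({"statement": statement}).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


sparql = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 100"
request = build_query_request(sparql, "SPARQL")
print(request.get_method(), request.full_url)
```

Actually sending the request (for example with `urllib.request.urlopen(request)`) would need a running catalogue with the Fuseki backend configured and, most likely, an access token from the Keycloak instance mentioned in the deployment view.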