5. Self-Description Definition

Gaia-X Self-Descriptions (SD) describe Entities from the Gaia-X Conceptual Model in a machine-interpretable format. This includes Self-Descriptions for the Participants themselves, as well as for the Resources and Service Offerings of the Providers. Well-defined Self-Description Schemas (which Federations can extend for their domain) ensure a unified representation of Self-Descriptions. Self-Descriptions make it possible to find and compare Entities inside Gaia-X.

flowchart LR
    entity[Entity]
    sd[Self-Description]
    credential[Verifiable Credentials]
    schema[Schema]
    issuer[Issuer]
    entity -- described by --> sd
    sd -- contains --> credential
    sd -- validated against --> schema
    credential -- signed by --> issuer

Overview of Self-Descriptions. The terms Verifiable Credential, Schema and Issuer are explained in more detail in the following sections.

Self-Descriptions in combination with trustworthy verification mechanisms empower Participants in their decision-making processes. Specifically, Self-Descriptions can be used for:

Tool-assisted evaluation, selection, composition and orchestration of Services and Resources
Enforcement, continuous validation and trust monitoring together with usage policies
Negotiation of contractual terms

The Participants (particularly Providers) are responsible for the creation of their Self-Descriptions. In addition to self-declared information by Participants about themselves or their offerings, a Self-Description may comprise statements issued and signed by trusted parties.

5.1 Self-Description Structure

Self-Descriptions are W3C Verifiable Presentations in the JSON-LD format. Self-Descriptions comprise one or more Verifiable Credentials. Verifiable Credentials themselves contain a set of Claims: assertions about Entities expressed in the RDF data model. Both Verifiable Credentials and Verifiable Presentations come with cryptographic signatures to increase the level of trust. Note that the Verifiable Credentials inside a Self-Description may be signed by different (trusted) parties.
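The nesting described above can be sketched as plain Python dictionaries. This is only an illustration of the general shape of a Verifiable Presentation wrapping a Verifiable Credential; all identifiers under example.com, the hostedOn claim and the proof placeholders are assumptions, and a real Self-Description would also reference a JSON-LD context that defines the Gaia-X vocabulary.

```python
# Minimal sketch: a Verifiable Presentation containing one Verifiable Credential.
# All URLs below are hypothetical and not defined by Gaia-X.
credential = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiableCredential"],
    "issuer": "https://example.com/participants/provider-a",
    "issuanceDate": "2022-01-01T00:00:00Z",
    # The Claims: statements about the Entity identified by "id".
    "credentialSubject": {
        "id": "https://example.com/services/service-a",
        "hostedOn": {"id": "https://example.com/resources/datacenter-b"},
    },
    # Placeholder only; a real proof is produced by a signature suite.
    "proof": {"type": "placeholder"},
}

self_description = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiablePresentation"],
    "verifiableCredential": [credential],
    "proof": {"type": "placeholder"},
}


def issuers(presentation: dict) -> list:
    """Return the distinct issuers of all credentials in a presentation."""
    return sorted({vc["issuer"] for vc in presentation["verifiableCredential"]})
```

Calling issuers(self_description) lists every party that signed a credential, which makes visible that one Self-Description may combine statements from several issuers.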

Self-Descriptions contain verifiable credentials about the attributes of Entities and relations to other Entities based on subject-predicate-object triples (cf. the RDF data model). The possible attributes and relations to be used in a Self-Description come from Self-Description Schemas (see next section).

Cross-referencing between Self-Descriptions is enabled by unique Identifiers for the Entities. Identifiers in Gaia-X are URIs and follow the specification of RFC 3986.

Every Self-Description has one Entity as its main topic. This Entity must have a Gaia-X compliant Identifier. Self-Descriptions can additionally describe “anonymous entities” if they are required for the Self-Description and if these Entities do not yet have their own Self-Description. This mainly serves to provide mandatory information that would normally go into a dedicated Self-Description which is not yet available.

These “anonymous entities” are defined via blank-nodes in the RDF data model and do not have identifiers. Anonymous entities from different Self-Descriptions are not merged when they are loaded into a joint Self-Description Graph (see below). Thus, there can be duplicate anonymous entities.
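The effect of not merging anonymous entities can be illustrated with a small sketch. It assumes triples are represented as Python tuples and blank nodes as strings starting with "_:"; the ex: names and the merge strategy (prefixing each blank-node label with a document id) are illustrative assumptions, not a prescribed Gaia-X mechanism.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def is_blank(term: str) -> bool:
    """Blank nodes are written with the conventional '_:' prefix."""
    return term.startswith("_:")


def scope_blank_nodes(triples: Iterable[Triple], doc_id: str) -> List[Triple]:
    """Rename blank nodes so that they stay local to their source document."""
    def scope(term: str) -> str:
        return f"_:{doc_id}.{term[2:]}" if is_blank(term) else term

    return [(scope(s), p, scope(o)) for s, p, o in triples]


def merge(documents: Dict[str, List[Triple]]) -> Set[Triple]:
    """Load several Self-Descriptions into one joint graph."""
    graph: Set[Triple] = set()
    for doc_id, triples in documents.items():
        graph.update(scope_blank_nodes(triples, doc_id))
    return graph


# Two Self-Descriptions that each describe an anonymous data center.
sd_a = [("ex:ServiceA", "ex:hostedOn", "_:dc"), ("_:dc", "ex:country", "DE")]
sd_b = [("ex:ServiceB", "ex:hostedOn", "_:dc"), ("_:dc", "ex:country", "DE")]
graph = merge({"sdA": sd_a, "sdB": sd_b})
blank_nodes = {t for triple in graph for t in triple if is_blank(t)}
```

Although both documents use the same local label and the same attribute values, the joint graph contains two distinct anonymous data centers, which is the duplication mentioned above.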

5.2 Self-Description Schema

A Self-Description Schema is a collection of class data schemas describing Gaia-X entities. Each data schema is part of an inheritance hierarchy that has Participant, Service Offering or Resource as its top-level super-class schema, and it defines a set of attributes available to describe the entity. Self-Descriptions must follow a common structure and well-defined semantics. Only in this way can it be assured that entities can be found and compared within Gaia-X. This structure is formally described in Self-Description Schemas.

The basic set of Self-Description data schemas is defined within the Service Characteristics Working Group. Individual Gaia-X Federations can extend the schema for their application domain. Such extensions must make an explicit reference to the organization that is responsible for the development and maintenance of the extension.

The Self-Description Schema defines the entities that are recognized within Gaia-X. Those entities form an inheritance structure in which each Entity inherits from one Entity of the Conceptual Model. This inheritance hierarchy is called the Self-Description Taxonomy. Derived classes substantiate the basic entities of the Conceptual Model with more detailed information. For each class, the schema defines the properties that an instance of this class can have. A property value can be of a primitive datatype (called datatype properties in RDF), an instance of an auxiliary class (e.g., a class describing an address, containing attributes like city and street), a reference to a controlled vocabulary (well-defined terms within Gaia-X), or a reference to another entity inside the Gaia-X Federation. In addition to the allowed attributes and their types, the Self-Description Schema defines the cardinality of each attribute, i.e., whether the attribute is mandatory (there must be at least one value) and whether it may have multiple values.
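The cardinality rules can be illustrated with a minimal sketch. The class AttributeRule, the rule table and the attribute name keyword are hypothetical and serve only to show the idea of a lower and an upper bound per attribute; the text above does not prescribe how the schema is encoded or checked.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AttributeRule:
    min_count: int = 0                # 1 or more means the attribute is mandatory
    max_count: Optional[int] = None   # None means multiple values are allowed


# Hypothetical rules for a hypothetical Service Offering class.
SERVICE_OFFERING_RULES: Dict[str, AttributeRule] = {
    "providedBy": AttributeRule(min_count=1, max_count=1),
    "keyword": AttributeRule(),
}


def cardinality_errors(instance: Dict[str, list],
                       rules: Dict[str, AttributeRule]) -> List[str]:
    """Compare the number of values per attribute against the schema rules."""
    errors = []
    for name, rule in rules.items():
        count = len(instance.get(name, []))
        if count < rule.min_count:
            errors.append(f"{name}: expected at least {rule.min_count}, got {count}")
        if rule.max_count is not None and count > rule.max_count:
            errors.append(f"{name}: expected at most {rule.max_count}, got {count}")
    return errors


missing = cardinality_errors({"keyword": ["db", "rest"]}, SERVICE_OFFERING_RULES)
valid = cardinality_errors({"providedBy": ["ex:ProviderA"]}, SERVICE_OFFERING_RULES)
```

Here missing reports the absent mandatory attribute, while valid is empty because the optional attribute may be omitted.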

Multiple inheritance is allowed and encouraged in the Self-Description Taxonomy as well as for the instances in a Self-Description. That way, deeply nested specializations in the schema hierarchy can be avoided. For example, a REST-based database service could inherit from both database and REST-based service instead of creating a specialized class.
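The REST-based database example can be sketched as follows. The taxonomy is modelled as a plain dictionary from class name to direct super-classes; the class names Database and RestService are illustrative assumptions and not taken from an actual Gaia-X schema.

```python
from typing import Dict, Set

# Hypothetical excerpt of a taxonomy: class name -> direct super-classes.
TAXONOMY: Dict[str, Set[str]] = {
    "ServiceOffering": set(),
    "Database": {"ServiceOffering"},
    "RestService": {"ServiceOffering"},
}


def ancestors(cls: str, taxonomy: Dict[str, Set[str]]) -> Set[str]:
    """Collect all direct and indirect super-classes of a class."""
    seen: Set[str] = set()
    stack = [cls]
    while stack:
        for parent in taxonomy[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_instance_of(types: Set[str], cls: str, taxonomy: Dict[str, Set[str]]) -> bool:
    """An instance may carry several types; any of them can match."""
    return any(t == cls or cls in ancestors(t, taxonomy) for t in types)


# The instance is typed with both classes instead of a dedicated
# "RestDatabase" class deep in the hierarchy.
rest_database_types = {"Database", "RestService"}
```

A query for databases and a query for REST-based services would both find this instance, without the taxonomy needing a combined class.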

Gaia-X aims at building upon existing schemas, preferably those that have been standardized or at least widely adopted.

For frequently used attribute values, it is recommended that they be maintained in the same governed process as Self-Description Schemas, i.e., by giving them unambiguous identifiers maintained in Controlled Vocabularies. Examples include standards against which a Participant or a Resource has been certified, or classification schemes, e.g., for the sector in which a Provider is doing their business.

5.3 Self-Description Graph

5.3.1 Relations between Self-Descriptions

A Self-Description of one Entity may contain typed relations to other Entities. For instance, the Self-Description of Entity ServiceA may specify that this service is hosted on DataCenterB (another Entity) formalized as RDF triple (ServiceA, hostedOn, DataCenterB). Entities (e.g., DataCenterB) are referred to by their respective Identifier.

In the example, the property hostedOn constitutes the type of the relation linking the two Self-Descriptions. Many other relation types between different Entities (not just between Services and Resources) are conceivable such as providedBy or operatedBy and are defined in the respective Self-Description data schemas. Relations between Entities may also cross organizational boundaries, for instance, when DataCenterB is operatedBy a different Participant than ServiceA.

The formalized (in RDF) relations between Self-Descriptions always have a direction (viz., from subject to value) and Self-Descriptions shall only contain one direction as defined in the respective Self-Description Schema. Reverse relations such as (DataCenterB, hosts, ServiceA) shall not be included in the Self-Description of DataCenterB but may be inferred and queried by suitable systems (e.g., a Catalogue) evaluating all Self-Descriptions and their relations. This also holds for (semantically) symmetric relations such as collocatedTo or nearTo where only one direction shall be used in the Self-Descriptions.
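How a system such as a Catalogue might derive the reverse direction can be sketched as below. The lookup tables are assumptions built from the relation names used in this section; where inverse and symmetric relations are actually declared is left to the respective schema.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Assumed declarations: which relation is the inverse of which,
# and which relations are semantically symmetric.
INVERSE = {"hostedOn": "hosts"}
SYMMETRIC = {"collocatedTo", "nearTo"}


def with_inferred(triples: Set[Triple]) -> Set[Triple]:
    """Return the stated triples plus the reverse relations they imply."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in INVERSE:
            inferred.add((o, INVERSE[p], s))
        if p in SYMMETRIC:
            inferred.add((o, p, s))
    return inferred


stated = {
    ("ServiceA", "hostedOn", "DataCenterB"),
    ("DataCenterB", "nearTo", "DataCenterC"),
}
graph = with_inferred(stated)
```

The Self-Descriptions only state two triples, while the evaluated graph also answers the questions "what does DataCenterB host?" and "what is DataCenterC near to?".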

The set of all Self-Descriptions in the “active” lifecycle state, together with all their typed relations with each other, is called the Self-Description Graph (with Self-Descriptions as the vertices of the graph and the relations between Self-Descriptions as its edges). By following the relations between Self-Descriptions in the Self-Description Graph, advanced queries across individual Self-Descriptions become possible. Such functionality will typically be implemented by Catalogues and may further involve Certification aspects and Usage Policies relating to the Self-Descriptions. For instance, a Consumer will be able to use Catalogue Services to require that a Service Instance must not depend on other Service Instances deployed on Nodes outside a Consumer-specified list of acceptable countries.
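The country constraint mentioned above amounts to a traversal of the Self-Description Graph. The following sketch assumes the graph is available as a list of triples and that the relations are named dependsOn, hostedOn and locatedIn; dependsOn and locatedIn are hypothetical names chosen for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]
Index = Dict[Tuple[str, str], List[str]]


def build_index(triples: Iterable[Triple]) -> Index:
    """Index objects by (subject, predicate) for fast edge lookups."""
    index: Index = defaultdict(list)
    for s, p, o in triples:
        index[(s, p)].append(o)
    return index


def reachable(index: Index, start: str, predicates: Tuple[str, ...]) -> Set[str]:
    """All entities reachable from start by following the given relations."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for predicate in predicates:
            for nxt in index.get((node, predicate), []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return seen


def satisfies_country_policy(index: Index, service: str, allowed: Set[str]) -> bool:
    """True if nothing the service depends on is located outside allowed."""
    for entity in reachable(index, service, ("dependsOn", "hostedOn")):
        for country in index.get((entity, "locatedIn"), []):
            if country not in allowed:
                return False
    return True


index = build_index([
    ("ServiceA", "dependsOn", "ServiceC"),
    ("ServiceA", "hostedOn", "DataCenterB"),
    ("DataCenterB", "locatedIn", "DE"),
    ("ServiceC", "hostedOn", "DataCenterD"),
    ("DataCenterD", "locatedIn", "US"),
])
```

With this data, ServiceA is rejected for the allowed set {"DE", "FR"} because its dependency ServiceC is hosted in a data center located elsewhere, even though ServiceA's own Self-Description only mentions DataCenterB.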

5.3.2 Representing Claims in the Self-Description Graph

Self-Descriptions are Verifiable Presentations collecting Verifiable Credentials about the Entity from various issuers with metadata about the issuer itself, date and time of issuance, expiry date, and so on. In order to also include this information in the Self-Description Graph (e.g., to allow searches and queries based on this metadata), the simple graph model indicated above (Self-Description --property--> Self-Description) is extended by permitting so-called “edge properties”. This means that the edges of the Self-Description Graph are endowed with additional attributes besides their type such as the origin of the claim, the issuer, and others.

Graphs with edge properties are supported by labelled-property graph databases (e.g., based on GQL² or OpenCypher) as well as in semantic triple-stores that support the RDF-star/SPARQL-star extension³.

Labelled-Property Graph Representation. Labelled-property graphs allow edge properties as shown in the following (simplified) example.

(Bob, hasCar[claimedBy: Alice], redCar)
(redCar, hasColor[claimedBy: Alice], "red")

RDF-star Representation. In RDF-star, edge properties are added by using the respective original triples (1 and 2 below) as the subject of another triple (3 and 4 below):

[1]   (Bob, hasCar, redCar)
[2]   (redCar, hasColor, "red")

[3]   ( (Bob, hasCar, redCar), claimedBy, Alice )
[4]   ( (redCar, hasColor, "red"), claimedBy, Alice )

Such a representation enables complex queries mixing normal properties of Entities with metadata about the claims. The following examples, one set in GQL and one in SPARQL-star, each present two queries. The first is a simple query that searches for all persons owning a red car (without taking into account any claim metadata). The second query looks for all persons owning a red car where Alice has verified this ownership relation.

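GQL
MATCH (person)-[:hasCar]->(car)
WHERE car.hasColor = "red"
RETURN person, car

MATCH (person)-[rel:hasCar]->(car)
WHERE car.hasColor = "red" AND rel.claimedBy = "Alice"
RETURN person, car

SPARQL-star
# The empty prefix and its namespace are assumptions for illustration.
PREFIX : <http://example.org/>
SELECT ?person ?car
WHERE { ?person :hasCar ?car . ?car :hasColor "red" . }

PREFIX : <http://example.org/>
SELECT ?person ?car
WHERE { << ?person :hasCar ?car >> :claimedBy :Alice . ?car :hasColor "red" . }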

This approach has the advantage that users can both write simple queries that do not consider claim structure information, and also complex queries that take the claims into account. Note that simple queries are not impacted by the presence (or absence) of claim structure or metadata information.
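The same two queries can be mimicked over an in-memory list of edges to show why simple queries are unaffected by claim metadata. The Edge class and the additional owner Dave are assumptions added for this sketch; only Bob, Alice and redCar come from the example above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Edge:
    subject: str
    predicate: str
    obj: str
    # Edge properties such as the party that made the claim.
    properties: dict = field(default_factory=dict)


edges = [
    Edge("Bob", "hasCar", "redCar", {"claimedBy": "Alice"}),
    Edge("redCar", "hasColor", "red", {"claimedBy": "Alice"}),
    # Hypothetical second owner whose claim carries no metadata.
    Edge("Dave", "hasCar", "daveCar"),
    Edge("daveCar", "hasColor", "red"),
]


def red_car_owners(edges: List[Edge], claimed_by: Optional[str] = None) -> List[str]:
    """Owners of red cars, optionally restricted to claims made by one party."""
    red_cars = {e.subject for e in edges
                if e.predicate == "hasColor" and e.obj == "red"}
    return sorted(
        e.subject for e in edges
        if e.predicate == "hasCar" and e.obj in red_cars
        and (claimed_by is None or e.properties.get("claimedBy") == claimed_by)
    )


simple = red_car_owners(edges)              # ignores claim metadata
verified = red_car_owners(edges, "Alice")   # requires Alice's claim
```

The simple query returns both owners regardless of whether edge properties are present, while the second query narrows the result to ownership relations claimed by Alice.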

The disadvantage is that fewer query languages support edge properties. For example, SPARQL-star (in draft status) would have to be used instead of SPARQL. Another disadvantage is that only claims “one level deep” can be represented straightforwardly via edge properties. Deeply nested claims (such as Bob attesting that Alice has verified that a particular person owns a red car) would require a more involved representation.

5.4 Self-Description Lifecycle

Although a Self-Description is a standalone document, its validity is influenced by external factors like the list of Trust Anchors or Compliance Rules. Since claims inside Self-Descriptions are protected by cryptographic signatures, they are immutable and cannot be changed once published. The lifecycle state is therefore managed outside the Self-Description itself. It can be computed by applying different rules and a catalogue might display or filter Self-Descriptions based on those rules.

Self-Description

With regard to Gaia-X, a self-description is defined as a signed or unsigned array of one or more verifiable credentials.
A self-description can be:

gaia-x-compliant if it fulfils the rules of the Gaia-X Trust Framework
non-gaia-x-compliant otherwise.

Verifiable Credential

A verifiable credential is a signed array of one or more objects and can be:

expired if the expiration datetime is earlier than the current datetime, or if the certificate containing the key used to sign the claim has expired.
revoked if the key used to sign the array is revoked. See below.
deprecated if another verifiable credential with the same identifier and the same signature issuer has a newer issuance datetime.
user-defined by using the credentialStatus. Warning: because this parameter does not yet have a controlled vocabulary, Gaia-X advises against using it.
active only if none of the above applies. Note: active does not imply gaia-x-compliant.
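The states listed above can be computed from data outside the credential itself, as in the following sketch. The Credential fields are hypothetical names, and the order in which the conditions are checked is an assumption, since the list does not define a precedence between states.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class Credential:
    id: str
    issuer: str
    issuance: datetime
    expiration: Optional[datetime]
    certificate_expiration: datetime
    key_revoked: bool = False
    credential_status: Optional[str] = None   # user-defined, not advised


def status(vc: Credential, known: Iterable[Credential], now: datetime) -> str:
    """Derive the lifecycle state of vc (check order is an assumption)."""
    if (vc.expiration is not None and vc.expiration < now) \
            or vc.certificate_expiration < now:
        return "expired"
    if vc.key_revoked:
        return "revoked"
    # Same identifier and same issuer, but a newer issuance datetime.
    if any(o.id == vc.id and o.issuer == vc.issuer and o.issuance > vc.issuance
           for o in known):
        return "deprecated"
    if vc.credential_status is not None:
        return vc.credential_status
    return "active"


utc = timezone.utc
now = datetime(2023, 1, 1, tzinfo=utc)
base = Credential(
    id="vc-1", issuer="issuer-a",
    issuance=datetime(2022, 1, 1, tzinfo=utc),
    expiration=datetime(2024, 1, 1, tzinfo=utc),
    certificate_expiration=datetime(2025, 1, 1, tzinfo=utc),
)
newer = replace(base, issuance=datetime(2022, 6, 1, tzinfo=utc))
```

With these values, status(base, [base], now) yields "active", and adding newer to the known credentials turns base into "deprecated" without changing the signed document itself.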

Note to credential issuers: in Gaia-X, the credential lifecycle is managed through the lifecycle of the key used to sign the credential. It is therefore recommended to use a different keypair for every credential, so that each credential can be revoked individually, and to ensure that the certificate expiration datetime is always later than the credential expiration.

Keypair

A keypair is defined as a tuple of corresponding asymmetric encryption public and private keys and can be:

revoked if the key used to sign the array is revoked. Revocation status can be verified from the following sources, in order of priority from highest to lowest:
- the Gaia-X Registry
- if defined by the ecosystem’s federators, the ecosystem local registry
- if provided by the keypair issuer, a revocation status endpoint like OCSP extracted from the JWK x5u parameter
active only if none of the above applies.
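The prioritized lookup can be sketched as a chain of sources queried in order. The interpretation that the first source with an answer decides, and that a key unknown to all sources counts as active, is an assumption; the three stub functions stand in for the Gaia-X Registry, an ecosystem registry and an issuer endpoint and do not reflect any real API.

```python
from typing import Callable, Optional, Sequence

# A source returns True (revoked), False (not revoked) or None (no answer).
RevocationSource = Callable[[str], Optional[bool]]


def keypair_state(key_id: str, sources: Sequence[RevocationSource]) -> str:
    """Ask the sources from highest to lowest priority; first answer wins."""
    for source in sources:
        answer = source(key_id)
        if answer is not None:
            return "revoked" if answer else "active"
    return "active"   # assumption: unknown to all sources means not revoked


# Hypothetical stand-ins for the three kinds of sources listed above.
def gaia_x_registry(key_id: str) -> Optional[bool]:
    return {"key-1": True}.get(key_id)


def ecosystem_registry(key_id: str) -> Optional[bool]:
    return {"key-1": False, "key-2": True}.get(key_id)


def issuer_endpoint(key_id: str) -> Optional[bool]:
    return {"key-3": False}.get(key_id)


SOURCES = [gaia_x_registry, ecosystem_registry, issuer_endpoint]
```

For key-1 the higher-priority registry reports a revocation, so the conflicting answer of the ecosystem registry is never consulted.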

5.5 Self-Description creation

5.5.1 Collecting claims

The first step is to collect Verifiable Credentials (i.e., sets of claims signed by their issuer). Claims can be self-signed using a keypair issued to the creator of the Self-Description by one of the Trust Anchors, or they can be signed by a third party that is one of the Trust Anchors.
The list of Trust Anchors and the scope of each are defined in the Gaia-X Trust Framework.

Collecting signed claims

5.5.2 Gaia-X verification

Using the collected signed claims, a participant can submit them for verification to the Gaia-X Compliance service and receive in return a signed Claim with the result.
The Gaia-X Registry and Gaia-X Compliance are developed as an Open Source project inside the Gaia-X Association.
Version 1.0 of the services is deployed by the Gaia-X Association. Version 2.0 of those services will be distributed. Version 3.0 of those services will be decentralized.

Validating signed Claims using the Gaia-X Trust Framework

5.5.3 Federation governance

Using the same general workflow as in the previous step, every federation is free to extend Gaia-X governance and add custom rules and checks.

Federation extending Gaia-X governance

¹ W3C (2012). Web Ontology Language (OWL). https://www.w3.org/OWL/
² Graph Query Language GQL. https://www.gqlstandards.org/
³ RDF-star and SPARQL-star (editor's draft). https://w3c.github.io/rdf-star/cg-spec/editors_draft.html