5. Self-Description Definition

Gaia-X Self-Descriptions (SD) describe Entities from the Gaia-X Conceptual Model in a machine-interpretable format. This includes Self-Descriptions for the Participants themselves, as well as for the Resources and Service Offerings of the Providers. Well-defined Self-Description Schemas (which Federations can extend for their domain) ensure a unified representation of Self-Descriptions. Self-Descriptions make it possible to find and compare Entities inside Gaia-X.

flowchart LR
    entity[Entity]
    sd[Self-Description]
    credential[Verifiable Credentials]
    schema[Schema]
    issuer[Issuer]
    entity -- described by --> sd
    sd -- contains --> credential
    sd -- validated against --> schema
    credential -- signed by --> issuer

Overview of Self-Descriptions. The terms Verifiable Credential, Schema and Issuer are explained in more detail in the following sections.

Self-Descriptions in combination with trustworthy verification mechanisms empower Participants in their decision-making processes. Specifically, Self-Descriptions can be used for:

Tool-assisted evaluation, selection, composition and orchestration of Services and Resources
Enforcement, continuous validation and trust monitoring together with usage policies
Negotiation of contractual terms

The Participants (particularly Providers) are responsible for the creation of their Self-Descriptions. In addition to self-declared information by Participants about themselves or their offerings, a Self-Description may comprise statements issued and signed by trusted parties.

5.1 Self-Description Structure

Self-Descriptions are W3C Verifiable Presentations in the JSON-LD format. A Self-Description consists of a list of Verifiable Credentials. Verifiable Credentials themselves contain a list of Claims: assertions about Entities expressed in the RDF data model. Both Verifiable Credentials and Verifiable Presentations carry cryptographic signatures to increase the level of trust. Note that the Verifiable Credentials inside a Self-Description may be signed by different (trusted) parties. For example, a certification assessment body may assert a certification result for a service in a Verifiable Credential. This credential can then be included in the Self-Description of that service.
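The nesting described above can be sketched as a plain Python dictionary that serializes to JSON. This is only an illustration under stated assumptions: the identifiers, the attribute names inside `credentialSubject` and the helper functions are hypothetical, the proofs are placeholders without any real signature value, and a real JSON-LD document would additionally need a context that defines the attribute terms.

```python
import json

VC_CONTEXT = "https://www.w3.org/2018/credentials/v1"

def placeholder_proof(signer: str, created: str) -> dict:
    """Stand-in for a real proof; no cryptography is performed here."""
    return {
        "type": "JsonWebKey2020",
        "proofPurpose": "assertionMethod",
        "verificationMethod": signer,
        "created": created,
    }

def credential(issuer: str, subject_id: str, claims: dict, created: str) -> dict:
    """A Verifiable Credential: metadata, claims about one subject, and a proof."""
    return {
        "@context": [VC_CONTEXT],
        "type": ["VerifiableCredential"],
        "issuer": issuer,
        "issuanceDate": created,
        "credentialSubject": {"id": subject_id, **claims},
        "proof": placeholder_proof(issuer, created),
    }

# Hypothetical identifiers, chosen only for this sketch.
SERVICE = "https://example.com/services/service-a"
PROVIDER = "https://example.com/provider"
ASSESSOR = "https://example.org/certification-body"

# A Self-Description is a Verifiable Presentation that bundles credentials
# from different issuers and is itself signed by the Participant issuing it.
self_description = {
    "@context": [VC_CONTEXT],
    "type": ["VerifiablePresentation"],
    "verifiableCredential": [
        credential(PROVIDER, SERVICE, {"serviceTitle": "Service A"}, "2021-05-03T12:15:31Z"),
        credential(ASSESSOR, SERVICE, {"hasCertificate": "ISO 27001"}, "2021-03-01T14:01:46Z"),
    ],
    "proof": placeholder_proof(PROVIDER, "2021-05-03T12:15:31Z"),
}

print(json.dumps(self_description, indent=2))
```

Both credentials describe the same subject, but each carries the proof of its own issuer, which is what allows third-party statements to sit next to self-declared ones.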

flowchart
    subgraph "Self-Description (Verifiable Presentation)"
        metadata1[Metadata]
        vc1[Verifiable Credential - 1..*]
        proof1[Proof Info - 1..*]
        subgraph vc [Verifiable Credential]
            direction LR
            metadata2[Metadata]
            claim[Claim - 1..*]
            proof2[Proof Info - 1..*]
        end
    end
    vc1 -- 1 .. * --> vc

Self-Description assembly model

Verifiable Presentations and Verifiable Credentials can be expressed as a graph. Below is an example of a Verifiable Presentation Graph, in which a trusted party “DUV” asserts that a resource is certified according to ISO 27001.

graph TB
    subgraph "Self-Description (Verifiable Presentation)"
        subgraph Credential Graph
            cred789[Credential789] -- credentialSubject --> nodeAbc[NodeABC]
            cred789 -- proof --> signaturexyz[SignatureXYZ]
            nodeAbc -- hasNumberOfCores --> cores[1024]
            nodeAbc -- hasRAMCapacity --> ram[2TB]
        end
        subgraph Proof Graph
            sigxyz[SignatureXYZ] -- issuer --> provider567[NodeProvider567]
            sigxyz -- created --> date2(2021-05-03 12:15:31)
        end
        subgraph Credential Graph
            cred123[Credential123] -- credentialSubject --> node[NodeABC]
            cred123 -- proof --> signature[Signature456]
            node -- hasCertificate --> iso(ISO 27001)
        end
        subgraph Proof Graph
            sig456[Signature456] -- issuer --> duv[DUV]
            sig456 -- created --> date(2021-03-01 14:01:46)
        end
        signature ---> sig456
        signaturexyz ---> sigxyz
    end
    %% classDef attribute fill:#f9f,stroke:#333,stroke-width:4px,rx:50%,ry:50%; %%
    %% classDef attribute stroke-width:2px,rx:50%,ry:50%;
    %% class iso,date attribute;

Example of Verifiable Credentials from different issuers in the same Self-Description. The DUV organization asserts the certification of a Node according to ISO 27001. The provider itself supplies additional technical details. The individual elements and their relations are shown as a graph (non-normative visualization).

Self-Descriptions contain Verifiable Credentials about the attributes of Entities and their relations to other Entities, based on subject-predicate-object triples (cf. the RDF data model). The attributes and relations that may be used in a Self-Description come from Self-Description Schemas (see next section). Leaving out the syntactic sugar of JSON-LD, the following triples represent the payload information about the Entity NodeABC from the above figure. Typically, the Provider NodeProvider567 and the DUV organization have dedicated Self-Descriptions for their respective identifiers.

(NodeABC, isA, Node)
(NodeABC, providedBy, NodeProvider567)
(NodeABC, hasNumberOfCores, 1024)
(NodeABC, hasRAMCapacity, 2TB)
(NodeABC, hasCertificate, ISO 27001)

Each of these assertions is called a claim in the Verifiable Credentials data model.
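As a minimal sketch, the claims listed above can be held as plain subject-predicate-object tuples and grouped per subject. The `describe` helper is hypothetical and only illustrates the data model; a real implementation would more likely use an RDF library and full URIs instead of short names.

```python
from collections import defaultdict

# The five claims about NodeABC, written as (subject, predicate, object).
claims = [
    ("NodeABC", "isA", "Node"),
    ("NodeABC", "providedBy", "NodeProvider567"),
    ("NodeABC", "hasNumberOfCores", 1024),
    ("NodeABC", "hasRAMCapacity", "2TB"),
    ("NodeABC", "hasCertificate", "ISO 27001"),
]

def describe(subject, triples):
    """Collect predicate -> list of objects for a single subject."""
    description = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            description[p].append(o)
    return dict(description)

node = describe("NodeABC", claims)
print(node["providedBy"])
```

Values are kept in lists because a predicate may occur more than once for the same subject, for example when a node holds several certificates.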

Cross-referencing between Self-Descriptions is enabled by unique Identifiers for the Entities. Identifiers in Gaia-X are URIs and follow the specification of RFC 3986. Depending on the prefix of the URI, different technical systems are defined to ensure the uniqueness of Identifiers. For example, using a domain name as part of the Identifier, with methods like DID:DNS, enables the domain owner to control the Identifiers itself, e.g., did:dns:example.com#z6MljvBkt9ETnxJGAFPKGgYHb33q9oNHLX7BiYSPcXqG6gZ9.
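The role of the URI prefix can be illustrated with the Python standard library. The sketch below splits an Identifier into its scheme and, for `did:` Identifiers, into the method and the method-specific part. The function name and the returned dictionary layout are assumptions made for this example; the code does not resolve or verify anything.

```python
from urllib.parse import urlsplit

def split_identifier(identifier: str) -> dict:
    """Split a URI-style Identifier into the parts that determine how it is resolved."""
    parts = urlsplit(identifier)
    if not parts.scheme:
        raise ValueError(f"not an absolute URI: {identifier!r}")
    result = {"scheme": parts.scheme, "fragment": parts.fragment or None}
    if parts.scheme == "did":
        # For DIDs the path has the form "<method>:<method-specific-id>".
        method, _, method_specific_id = parts.path.partition(":")
        result["method"] = method
        result["method_specific_id"] = method_specific_id
    return result

info = split_identifier(
    "did:dns:example.com#z6MljvBkt9ETnxJGAFPKGgYHb33q9oNHLX7BiYSPcXqG6gZ9"
)
print(info["method"], info["method_specific_id"])
```

For the example Identifier from the text, the method-specific part is the domain name, which is what puts the domain owner in control of the Identifier.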

Every Self-Description has one Entity as its main topic. This Entity must have a Gaia-X compliant Identifier. Self-Descriptions can additionally describe “anonymous entities” if these are required for the Self-Description and do not yet have their own Self-Description. The main reason is to provide mandatory information that would normally go into a dedicated Self-Description which is not available so far; for example, the provider company of the hosting infrastructure must be described. These “anonymous entities” are defined via blank nodes in the RDF data model and do not have identifiers. Anonymous entities from different Self-Descriptions are not merged when they are loaded into a joint Self-Description Graph (see below), so duplicate anonymous entities can occur.
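The non-merging of anonymous entities can be sketched as follows. The example assumes the common RDF convention of writing blank nodes as `_:label` and treats such labels as local to one Self-Description; the loader function, the attribute names and the renaming scheme are hypothetical.

```python
def load_into_joint_graph(self_descriptions):
    """Merge triples from several Self-Descriptions, keeping blank nodes separate."""
    joint = set()
    for sd_index, triples in enumerate(self_descriptions):

        def scoped(term):
            # A blank-node label is only meaningful inside its own document,
            # so it is renamed with a per-document prefix before merging.
            if isinstance(term, str) and term.startswith("_:"):
                return f"_:sd{sd_index}_{term[2:]}"
            return term

        for s, p, o in triples:
            joint.add((scoped(s), p, scoped(o)))
    return joint

# Two Self-Descriptions that each describe their provider anonymously.
sd_a = [("ServiceA", "providedBy", "_:provider"), ("_:provider", "legalName", "ACME Hosting")]
sd_b = [("ServiceB", "providedBy", "_:provider"), ("_:provider", "legalName", "ACME Hosting")]

joint_graph = load_into_joint_graph([sd_a, sd_b])
anonymous_providers = {s for s, p, _ in joint_graph if p == "legalName"}
print(len(anonymous_providers))
```

Although both documents describe what is probably the same company, the joint graph contains two distinct anonymous entities. Giving the company its own Self-Description and Identifier is what would remove the duplicate.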

5.2 Self-Description Schema

A Self-Description Schema is a collection of class data schemas describing Gaia-X entities. Each data schema is part of an inheritance hierarchy that has Participant, Service Offering or Resource as its top-level super-class, and it defines the set of attributes available to describe the entity. Self-Descriptions must follow a common structure and well-defined semantics; only then can entities be found and compared within Gaia-X. This structure is formally described in Self-Description Schemas.

The basic set of Self-Description data schemas is defined within the Service Characteristics Working Group. Individual Gaia-X Federations can extend the schema for their application domain. Such extensions must make an explicit reference to the organization that is responsible for the development and maintenance of the extension. The Self-Description Schema defines the entities that are recognized within Gaia-X. Those entities form an inheritance structure in which each Entity inherits from one Entity of the Conceptual Model. We call this inheritance hierarchy the Self-Description Taxonomy. Derived classes substantiate the basic Entities of the Conceptual Model with more detailed information. For each class, properties are defined that an instance of this class can have. Those properties include attributes, which can have

* plain values of a primitive datatype (called datatype properties in the W3C Web Ontology Language OWL¹, which is used to define terms for the RDF data model),
* values that are instances of auxiliary classes (e.g., a class describing an address, containing attributes like city and street), or
* values reused by referencing a controlled vocabulary (well-defined terms within Gaia-X).

The latter two are called object properties in OWL.

Properties also include relationships used for referencing another entity inside the Gaia-X Federation. In addition to the allowed attributes and their types, the Self-Description Schema defines the cardinality of each attribute: whether the attribute is mandatory (at least one value must be given) and whether it may have multiple values.
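A cardinality rule of this kind boils down to a minimum and a maximum number of values per attribute. The sketch below checks a set of attribute values against such rules. The schema fragment is invented for illustration and is not taken from an actual Gaia-X schema; in practice, such constraints would typically be expressed in a dedicated schema language rather than in Python.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Cardinality:
    min_count: int = 0                # 0 = optional, 1 = mandatory
    max_count: Optional[int] = None   # None = any number of values allowed

# Hypothetical schema fragment for a Node.
NODE_SCHEMA = {
    "providedBy": Cardinality(min_count=1, max_count=1),
    "hasNumberOfCores": Cardinality(max_count=1),
    "hasCertificate": Cardinality(),
}

def check_cardinality(values_by_attribute, schema):
    """Return a list of human-readable cardinality violations."""
    errors = []
    for attribute, rule in schema.items():
        count = len(values_by_attribute.get(attribute, []))
        if count < rule.min_count:
            errors.append(f"{attribute}: expected at least {rule.min_count}, found {count}")
        if rule.max_count is not None and count > rule.max_count:
            errors.append(f"{attribute}: expected at most {rule.max_count}, found {count}")
    return errors

valid = {"providedBy": ["NodeProvider567"], "hasCertificate": ["ISO 27001", "ISO 9001"]}
invalid = {"hasNumberOfCores": [1024, 2048]}
print(check_cardinality(valid, NODE_SCHEMA))
print(check_cardinality(invalid, NODE_SCHEMA))
```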

Multiple inheritance is allowed and encouraged in the Self-Description Taxonomy as well as for the instances in a Self-Description. That way, deeply nested specializations in the schema hierarchy can be avoided. For example, a REST-based database service could inherit from both database and REST-based service instead of creating a specialized class.
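The effect of giving an instance several types can be sketched with a small taxonomy table. The class names follow the example in the text, while the attribute names and the `allowed_attributes` helper are assumptions made for illustration.

```python
# Hypothetical taxonomy: class name -> parent classes and the attributes it introduces.
TAXONOMY = {
    "ServiceOffering": {"parents": [], "attributes": {"providedBy"}},
    "Database": {"parents": ["ServiceOffering"], "attributes": {"queryLanguage"}},
    "RestService": {"parents": ["ServiceOffering"], "attributes": {"endpointUrl"}},
}

def allowed_attributes(types):
    """Union of the attributes of all given types and all of their ancestors."""
    seen, result = set(), set()
    stack = list(types)
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # a shared ancestor is visited only once
        seen.add(name)
        result |= TAXONOMY[name]["attributes"]
        stack.extend(TAXONOMY[name]["parents"])
    return result

# A REST-based database service is typed with both classes;
# no dedicated "RestDatabase" class is needed.
print(sorted(allowed_attributes(["Database", "RestService"])))
```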

Gaia-X aims at building upon existing schemas, preferably those that have been standardized or at least widely adopted.

For frequently used attribute values, it is recommended that they be maintained in the same governed process as Self-Description Schemas, i.e., by giving them unambiguous identifiers maintained in Controlled Vocabularies. Examples include standards against which a Participant or a Resource has been certified, or classification schemes, e.g., for the sector in which a Provider is doing their business.

5.3 Cryptographic Signatures in Self-Descriptions

To ensure the integrity and authenticity of a Self-Description, the overall Self-Description (a Verifiable Presentation) must be cryptographically signed by the Participant that issues it. This prevents undetected tampering and makes the origin of the Self-Description technically verifiable. Inside the Self-Description, each Verifiable Credential must be individually signed as well; the signer can be a third party different from the issuer of the overall Self-Description. A Verifiable Credential can also be signed by multiple parties to increase the trust level.

The signature mechanism used is Linked Data Signature with the JsonWebKey2020 suite. It generates a JSON object to be included in the Verifiable Credential or Presentation. The JSON object comprises the following fields:

type: The field is set to "JsonWebKey2020". See https://w3c-ccg.github.io/lds-jws2020/ for more details.
proofPurpose: The field is set to "assertionMethod".
verificationMethod: The field identifies the party that has issued the proof. It contains either a) the Gaia-X compliant Identifier of the Participant that is signing or b) the digest (fingerprint) of an X.509 certificate containing the key material. The Gaia-X compliant Identifier of the signing party can be resolved to a Self-Description that contains the public key used for the signature. An X.509 certificate digest is matched against known certificates, for example certificates from the Gaia-X Registry, to resolve the public key.
created: The field contains the creation date in the ISO 8601 format.
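The four fields listed above can be assembled as in the following sketch. It only builds the metadata part of the proof object as described in this section; the actual signature value and the signing step are deliberately left out, and the function name and the example Identifier are hypothetical.

```python
from datetime import datetime, timezone

def build_proof_metadata(verification_method: str, created: datetime) -> dict:
    """Assemble the proof fields described above (without a signature value)."""
    if created.tzinfo is None:
        raise ValueError("created must be timezone-aware")
    return {
        "type": "JsonWebKey2020",
        "proofPurpose": "assertionMethod",
        # Either a Gaia-X compliant Identifier or an X.509 certificate digest.
        "verificationMethod": verification_method,
        # ISO 8601 timestamp, normalized to UTC.
        "created": created.astimezone(timezone.utc).isoformat(timespec="seconds"),
    }

proof = build_proof_metadata(
    "did:dns:example.com#key-1",  # hypothetical Identifier of the signing Participant
    datetime(2021, 3, 1, 14, 1, 46, tzinfo=timezone.utc),
)
print(proof["created"])
```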

A Verifiable Credential is Gaia-X conformant if the Issuer of the Verifiable Credential has itself an identity coming from one of the Trust Anchors. See the section on the Trust Framework for more details on Trust Anchors and the Registry.

flowchart LR
    part["Participants"]
    subgraph ta[Trust Anchors]
        tsp{{Trust Service Provider}}
    end
    subgraph sd[Self-Descriptions]
        cred[Verifiable Credential]
    end
    tsp -- proves identity of --> part
    part -- hold --> sd
    part -- issue --> cred
    part -- verify --> cred

5.4 Self-Description Graph

5.4.1 Relations between Self-Descriptions

A Self-Description of one Entity may contain typed relations to other Entities. For instance, the Self-Description of Entity ServiceA may specify that this service is hosted on DataCenterB (another Entity), formalized as the RDF triple (ServiceA, hostedOn, DataCenterB). Entities (e.g., DataCenterB) are referred to by their respective Identifier.

In the example, the property hostedOn constitutes the type of the relation linking the two Self-Descriptions. Many other relation types between different Entities (not just between Services and Resources) are conceivable such as providedBy or operatedBy and are defined in the respective Self-Description data schemas. Relations between Entities may also cross organizational boundaries, for instance, when DataCenterB is operatedBy a different Participant than ServiceA.

The formalized (in RDF) relations between Self-Descriptions always have a direction (viz., from subject to value) and Self-Descriptions shall only contain one direction as defined in the respective Self-Description Schema. Reverse relations such as (DataCenterB, hosts, ServiceA) shall not be included in the Self-Description of DataCenterB but may be inferred and queried by suitable systems (e.g., a Catalogue) evaluating all Self-Descriptions and their relations. This also holds for (semantically) symmetric relations such as collocatedTo or nearTo where only one direction shall be used in the Self-Descriptions.
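How a Catalogue might derive the reverse direction can be sketched as follows. The mapping from `hostedOn` to `hosts` and the two symmetric relations are taken from the examples in the text; the function and the table layout are assumptions, and a real system would read this information from the Self-Description Schemas.

```python
# Relation types as named in the text; the inverse mapping itself is illustrative.
INVERSE_OF = {"hostedOn": "hosts"}
SYMMETRIC = {"collocatedTo", "nearTo"}

def with_inferred_relations(triples):
    """Return the stated triples plus the reverse directions that can be inferred."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in INVERSE_OF:
            inferred.add((o, INVERSE_OF[p], s))
        elif p in SYMMETRIC:
            inferred.add((o, p, s))
    return inferred

stated = {
    ("ServiceA", "hostedOn", "DataCenterB"),
    ("DataCenterB", "nearTo", "DataCenterC"),
}
graph = with_inferred_relations(stated)
print(("DataCenterB", "hosts", "ServiceA") in graph)
```

The Self-Descriptions themselves keep only the stated direction; the inferred triples exist only in the evaluating system.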

The set of all Self-Descriptions in the “active” lifecycle state, together with all their typed relations to each other, is called the Self-Description Graph (with Self-Descriptions as the vertices of the graph and the relations between Self-Descriptions as its edges). By following the relations between Self-Descriptions in the Self-Description Graph, advanced queries across individual Self-Descriptions become possible. Such functionality will typically be implemented by Catalogues and may further involve Certification aspects and Usage Policies relating to the Self-Descriptions. This will enable a Consumer, for instance, to use Catalogue Services to require that a Service Instance must not depend on other Service Instances deployed on Nodes outside a Consumer-specified list of acceptable countries.
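The country restriction mentioned above amounts to a traversal of the Self-Description Graph. The sketch below walks the dependency and hosting relations of a service and reports whether any reachable Entity is located in a country outside the accepted list. Apart from `hostedOn`, the relation names (`dependsOn`, `locatedIn`), the entity names and the graph layout are hypothetical.

```python
def violates_country_policy(service, graph, allowed_countries):
    """graph maps each entity to a list of (predicate, value) pairs."""
    stack, seen = [service], set()
    while stack:
        entity = stack.pop()
        if entity in seen:
            continue  # guard against cyclic dependencies
        seen.add(entity)
        for predicate, value in graph.get(entity, []):
            if predicate == "locatedIn" and value not in allowed_countries:
                return True
            if predicate in ("dependsOn", "hostedOn"):
                stack.append(value)
    return False

sd_graph = {
    "ServiceA": [("hostedOn", "DataCenterB"), ("dependsOn", "ServiceC")],
    "DataCenterB": [("locatedIn", "DE")],
    "ServiceC": [("hostedOn", "NodeX")],
    "NodeX": [("locatedIn", "US")],
}

print(violates_country_policy("ServiceA", sd_graph, {"DE", "FR"}))
```

ServiceA itself is hosted in an accepted country, but the check still fails for the list {"DE", "FR"} because of the indirect dependency on ServiceC. This is the kind of query that requires following relations across several Self-Descriptions.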

5.4.2 Representing Claims in the Self-Description Graph

Self-Descriptions are Verifiable Presentations that collect Verifiable Credentials about the Entity from various issuers, together with metadata about the issuer itself, the date and time of issuance, the expiry date, and so on. In order to also include this information in the Self-Description Graph (e.g., to allow searches and queries based on this metadata), the simple graph model indicated above (Self-Description --property--> Self-Description) is extended by permitting so-called “edge properties”. This means that, besides their type, the edges of the Self-Description Graph carry additional attributes such as the origin of the claim, the issuer, and others.

Graphs with edge properties are supported by labelled-property graph databases (e.g., based on GQL² or OpenCypher) as well as in semantic triple-stores that support the RDF-Star/SPARQL-Star extension³.

Labelled-Property Graph Representation. Labelled-property graphs allow edge properties as shown in the following (simplified) example.

(NodeProvider567, provides[claimedBy: NodeProvider567], NodeABC)
(NodeABC, hasCertificate[claimedBy: DUV], [certificateType:"ISO27001"])

RDF-Star Representation. In RDF-Star, edge properties are added by using the respective original triples (1 and 2 below) as the subject of another triple (3 and 4 below):

[1]   (NodeProvider567, provides, NodeABC)
[2]   (NodeABC, hasCertificate, "ISO27001")

[3]   ( (NodeProvider567, provides, NodeABC), claimedBy, NodeProvider567 )
[4]   ( (NodeABC, hasCertificate, "ISO27001"), claimedBy, DUV )

Such a representation allows complex queries mixing normal properties of Entities with metadata about the claims.

GQL 
MATCH (provider)-[:provides]->(node),
      (node)-[rel:hasCertificate]->(certificate)
WHERE certificate.certificateType = "ISO27001"
  AND rel.claimedBy = DUV
RETURN provider, node

SPARQL-Star
SELECT ?provider ?node
WHERE { ?provider provides ?node .
        ?node hasCertificate "ISO27001" .
        <<?node hasCertificate "ISO27001">> claimedBy DUV }

The approach has the advantage that users can write both simple queries that do not consider claim structure information and complex queries that take the claims into account. Note that simple queries are not affected by the presence (or absence) of claim structure or metadata information.
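The difference between a simple and a claim-aware query can be illustrated without a graph database. In the sketch below, each edge carries a dictionary of edge properties, and the hypothetical `match` helper filters on them only when asked to. The data mirrors the simplified example above; the helper is not part of any query language.

```python
# Each edge: (subject, predicate, object, edge_properties)
edges = [
    ("NodeProvider567", "provides", "NodeABC", {"claimedBy": "NodeProvider567"}),
    ("NodeABC", "hasCertificate", "ISO27001", {"claimedBy": "DUV"}),
]

def match(edges, predicate, obj=None, **edge_properties):
    """Yield (subject, object) pairs; edge properties are checked only if given."""
    for s, p, o, props in edges:
        if p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        if all(props.get(key) == value for key, value in edge_properties.items()):
            yield s, o

# Simple query: ignores who made the claim.
certified = {s for s, _ in match(edges, "hasCertificate", "ISO27001")}

# Claim-aware query: only certificates asserted by DUV count.
certified_by_duv = {s for s, _ in match(edges, "hasCertificate", "ISO27001", claimedBy="DUV")}
providers = {(s, o) for s, o in match(edges, "provides") if o in certified_by_duv}
print(providers)
```

The simple query keeps working unchanged whether or not the edges carry claim metadata, which is the property described above.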

The disadvantage is that fewer query languages support edge properties. For example, SPARQL-Star (in draft status) would have to be used instead of SPARQL. Another disadvantage is that only claims “one level deep” can be represented straightforwardly via edge properties. Deeply nested claims (such as Bob attesting that Alice has verified that a particular person owns a red car) would require a more involved representation.

5.5 Self-Description Lifecycle

Since Self-Descriptions are protected by cryptographic signatures, they are immutable and cannot be changed once published. The lifecycle state is therefore managed outside of the Self-Description itself, for example (but not only) in Catalogues.

The lifecycle of a Self-Description depends on the lifecycle of the Verifiable Credentials contained within it. Furthermore, both depend on the lifecycle status of the certificate (public/private key pair) used to sign them. The following table shows the possible states in the lifecycle of Self-Descriptions, Verifiable Credentials and certificates.

State	Self-Description	Verifiable Credential	Certificate (Key Pair)	Comment
Active	x	x	x	All Verification Rules are passed.
Partially Active	x	-	-	Some claims inside the Self-Description are not in the active state.
Expired	x	x	x	Too old and must be renewed.
Deprecated	x	-	x	Replaced by a newer version.
Revoked	x	x	x	Revoked by the issuer or a trusted party.
Inconsistent	x	x	-	Internally inconsistent or incompatible with existing information.
Unverifiable	x	x	x	The trust level could not be verified.

Table with the possible lifecycle states for Self-Descriptions, Verifiable Credentials and Certificates (Key Pairs). The lifecycle is the result of Verification Rules. The full list of Verification Rules and the aggregation of their result into an overall state is the subject of ongoing specification work. An x indicates support for the lifecycle state; a - indicates that the state is not supported for the respective Self-Description, Verifiable Credential or Certificate.
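For tooling, the table above can be encoded as a small lookup structure. The sketch below transcribes the x and - entries and answers which states apply to which kind of artifact. The names `State`, `SUPPORTED` and `states_for` are assumptions; the aggregation of Verification Rules into a state is not modelled, since it is still being specified.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "Active"
    PARTIALLY_ACTIVE = "Partially Active"
    EXPIRED = "Expired"
    DEPRECATED = "Deprecated"
    REVOKED = "Revoked"
    INCONSISTENT = "Inconsistent"
    UNVERIFIABLE = "Unverifiable"

ARTIFACTS = ("Self-Description", "Verifiable Credential", "Certificate")

# One row per state, one flag per artifact column (True = "x", False = "-").
SUPPORTED = {
    State.ACTIVE: (True, True, True),
    State.PARTIALLY_ACTIVE: (True, False, False),
    State.EXPIRED: (True, True, True),
    State.DEPRECATED: (True, False, True),
    State.REVOKED: (True, True, True),
    State.INCONSISTENT: (True, True, False),
    State.UNVERIFIABLE: (True, True, True),
}

def states_for(artifact: str):
    """List the lifecycle states that the table marks as supported for an artifact."""
    column = ARTIFACTS.index(artifact)
    return [state for state, flags in SUPPORTED.items() if flags[column]]

print([state.value for state in states_for("Verifiable Credential")])
```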

5.6 Self-Description creation

5.6.1 Collecting claims

The first step is to collect Verifiable Credentials (each a set of claims signed by its issuer). Claims can be self-signed, using a keypair issued to the creator of the Self-Description by one of the Trust Anchors, or they can be signed directly by a third party that is itself one of the Trust Anchors.
The list of Trust Anchors and the scope of each are defined in the Gaia-X Trust Framework.

Collecting signed claims

5.6.2 Gaia-X verification

A Participant can submit the collected signed claims to the Gaia-X Compliance service for verification and receives in return a signed Claim with the result.
The Gaia-X Registry and Gaia-X Compliance services are developed as open-source projects inside the Gaia-X Association.
Version 1.0 of the services is deployed by the Gaia-X Association. Version 2.0 of those services will be distributed. Version 3.0 of those services will be decentralized.

Validating signed Claims using the Gaia-X Trust Framework

5.6.3 Federation governance

Following the same general workflow as in the previous step, every Federation is free to extend Gaia-X governance and add custom rules and checks.

Federation extending Gaia-X governance

¹ W3C (2012). Web Ontology Language (OWL). https://www.w3.org/OWL/
² Graph Query Language GQL. https://www.gqlstandards.org/
³ RDF-Star/SPARQL-Star extension (editor's draft). https://w3c.github.io/rdf-star/cg-spec/editors_draft.html