Skip to content

Data Catalog Publishing

This specification describes how metadata files representing Datasets and Data Services are published to a Trust Framework's Data Catalog.

Data Providers submit metadata to the Directory by publishing at a public URL or pasting the content directly into a Directory web form. The Directory ingests these Data Catalog Entries and makes them available to other participants in the Trust Framework through searchable catalogs, downloadable feeds, and SPARQL endpoints.

Note: This document uses US English. To align with W3C and other prevalent standards, IB1 uses US English in its technical specifications and technical documentation.

RDF Prefixes

This specification uses the following prefixes:

  dcat:    http://www.w3.org/ns/dcat#
  dcterms: http://purl.org/dc/terms/
  ib1:     https://registry.trust.ib1.org/ns/1.0#

Dependencies

A Scheme using this specification must also adopt the following specification:

Publishing Metadata

Members publish Data Catalog Entries describing the Datasets and Data Services they offer by submitting metadata files to the Directory using one of the following methods:

1. Hosted URL

A metadata file may be published by making it available at a publicly accessible HTTPS URL. These files must meet the following requirements:

  • Use HTTPS with a certificate issued by a widely trusted public Certificate Authority
  • Return HTTP 200 on success (redirects are not permitted)
  • Contain valid RDF according to DCAT, using one of the supported serialization formats (Turtle, RDF/XML, or JSON-LD), served with the correct Content-Type header
  • Be accessible without authentication or access control

Providers are encouraged to group related Data Catalog Entries into a small number of metadata files. The Directory may impose a limit on the number of registered URLs per Member, but this must be no fewer than 16.

2. Web Form Submission

The Directory also provides a web form that allows Members to paste metadata content for inclusion in the catalog. This is intended for the convenient publication of a small number of Data Services.

A size limit may apply, which may be no less than 64,000 Unicode characters.

Catalog Updates

The Directory updates its catalog on a regular schedule, typically daily. Each update involves the following steps:

  1. Registered metadata files are fetched.
  2. If a file cannot be fetched or contains no valid entries, its previously accepted entries remain unchanged in the Data Catalog.
  3. For each Data Catalog Entry:
    • Validation checks are applied (see Validation).
    • Permitted change rules are enforced (see Permitted Changes).
    • The previous version of an Entry remains in the catalog if it does not pass validation or change rules.
    • Data quality checks may be applied (see Quality Checks).
  4. After processing all entries, the catalog is atomically updated.

If a metadata file contains no valid entries, the Directory retains the entries from the most recent successful version. This safeguards against accidental deletion caused by an empty or invalid file. To remove all entries from a metadata file, the file must be explicitly deleted using the Directory interface.

Members are notified of errors or warnings via the Directory dashboard and email.

Permitted Changes

To maintain the stability and integrity of metadata in the catalog, the following constraints apply:

  • The RDF subject URL (identifier) of a Data Catalog Entry must not change.
  • The value of the dcterms:conformsTo and ib1:dataSchema terms (or their absence) must not change.
  • Data Catalog Entries must not be moved between metadata files or between submission methods.

Validation

Each metadata file is validated to ensure it:

  • Is well-formed RDF in a supported serialization format
  • Contains all required metadata fields
  • Includes a dcterms:publisher value that exactly matches the Member’s registered URL in the Directory
  • References only Schemes that the Member has joined and for which all relevant Agreements are in place
  • Maintains conformance with the same Scheme Catalog Requirements Document if updating an existing entry

Additional validation rules may be introduced by other Specifications adopted by a Scheme.

Quality Checks

The Directory may apply non-blocking data quality checks to support discoverability and standardization. These may include:

  • Use of dcat:keyword is consistent within the Scheme
  • Use of appropriate schema definitions (ib1:dataSchema)

Quality issues do not prevent inclusion but are recorded and notified to the submitting Member.

Additional Specifications may define further quality or consistency checks.

Record Removal and Archiving

A metadata record is removed from the catalog when:

  • The URL of the corresponding metadata file is deleted from the Directory
  • The entry is explicitly removed from the metadata file
  • A web form submission is withdrawn

Removals are processed during the next catalog update.

The Directory may retain archived copies of removed or rejected entries for audit and operational purposes. These archived copies are not visible in public views but may be accessed by the Trust Framework Operator as needed for those purposes.

Catalog Access

The Directory may expose Data Catalog metadata through multiple mechanisms:

  • A SPARQL endpoint for querying the Catalog
  • Downloadable DCAT feeds, grouped by Scheme, purpose, or access policy