RDF diffing documentation
Overview
Model2owl provides functionality for generating difference reports between two versions of a model. It currently supports AsciiDoc format for human-readable reports and JSON-based outputs for machine-readable purposes. The feature is powered by the RDF Differ tool. Reports can be generated directly within Model2owl (locally, offline) or through the automated workflow provided by the model2owl-boilerplate project (online).
Usage
This section explains, step by step, how to generate a diff report with model2owl. The overall process is the same whether model2owl is used locally or in an automated workflow, although some details differ (see the How to use page).
- Prepare OWL core (and SHACL shapes) for the old version: The former version of the Semantic Data Specification for the processed model needs to be prepared (read more about the expected input in the Diffing process section).
- Prepare OWL core (and SHACL shapes) for the new version: The new version of the Semantic Data Specification for the processed model needs to be prepared (read more about the expected input in the Diffing process section).
- Generate diff and report(s): The user needs to decide which report(s) should be generated and run the generation. If using model2owl-boilerplate, the formats are already predefined as AsciiDoc and JSON.
- Inspect the reports: Reports are available in the specified destination or in the predefined directory when using the model2owl-boilerplate workflow.
Diffing process
Model2owl uses the RDF Differ
tool to calculate differences between two RDF files and to generate
diff reports in AsciiDoc and JSON formats. It compares either two OWL core files
or two pairs consisting of an OWL core file and a SHACL shapes file. When SHACL
files are provided, the diff report will be augmented with additional
information about the domain, range, and cardinality. The scope of comparison is
defined in an application profile suitable for comparing OWL ontologies.
Model2owl integrates this tool via its CLI client and provides a dedicated set
of commands for interacting with it (see the descriptions of the run-rdf-diff
and merge-owl-shacl commands in the
Functional commands section).
Further details on how the RDF Differ tool operates can be found in the
project documentation.
Diff reports
Model2owl produces a set of files summarising the differences between two versions of a model, collectively known as diff reports. These reports can be generated in three formats:
- AsciiDoc: suitable for integration into external documentation workflows.
- HTML: a standalone, human-readable format.
- JSON: a machine-readable format following the SPARQL 1.1 Query Results JSON Format.
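Because the JSON report follows the SPARQL 1.1 Query Results JSON Format, it can be read with any JSON parser. The sketch below flattens such a document into plain rows. The variable names in the sample (`instance`, `property`, `oldValue`, `newValue`) are assumptions made for illustration; the actual variables depend on the queries used by RDF Differ.

```python
import json

# Hypothetical sample document. Only the envelope (head/vars,
# results/bindings) comes from the SPARQL 1.1 JSON results format;
# the variable names and values are made up for this sketch.
SAMPLE = """
{
  "head": {"vars": ["instance", "property", "oldValue", "newValue"]},
  "results": {"bindings": [
    {
      "instance": {"type": "uri", "value": "http://example.org/hasPart"},
      "property": {"type": "uri",
                   "value": "http://www.w3.org/2004/02/skos/core#definition"},
      "oldValue": {"type": "literal", "xml:lang": "en", "value": "Old text"},
      "newValue": {"type": "literal", "xml:lang": "en", "value": "New text"}
    }
  ]}
}
"""


def rows_from_sparql_json(text):
    """Flatten a SPARQL 1.1 JSON result into a list of plain dicts."""
    doc = json.loads(text)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable may be unbound in a solution, in which case its key
        # is absent from the binding; represent that as None.
        rows.append(
            {v: binding[v]["value"] if v in binding else None for v in variables}
        )
    return rows


rows = rows_from_sparql_json(SAMPLE)
```

Each row is then an ordinary dictionary keyed by variable name, which is convenient for filtering or exporting to other formats.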
Each report’s content mirrors the types of changes detected. The fundamental
building block of any report is a diff describing a change to a property of a
model instance. One example is a modification to the skos:definition value of a
resource of type owl:ObjectProperty.
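To make the idea of a property-level diff concrete, here is a minimal sketch that treats each model version as a set of (subject, property, value) tuples and groups removed and added values by instance and property. This is an illustration only, not how RDF Differ computes its results: the function name, the `ex:` identifiers, and the use of prefixed names as plain strings are all assumptions.

```python
from collections import defaultdict


def property_changes(old_triples, new_triples):
    """Group removed and added values by (subject, property)."""
    changes = defaultdict(lambda: {"removed": set(), "added": set()})
    # Triples present only in the old version were removed.
    for s, p, o in old_triples - new_triples:
        changes[(s, p)]["removed"].add(o)
    # Triples present only in the new version were added.
    for s, p, o in new_triples - old_triples:
        changes[(s, p)]["added"].add(o)
    return dict(changes)


# Hypothetical data: one object property whose definition was reworded.
old = {
    ("ex:hasPart", "rdf:type", "owl:ObjectProperty"),
    ("ex:hasPart", "skos:definition", "Relates a whole to its part."),
}
new = {
    ("ex:hasPart", "rdf:type", "owl:ObjectProperty"),
    ("ex:hasPart", "skos:definition", "Relates a whole to one of its parts."),
}

changes = property_changes(old, new)
```

Here the unchanged rdf:type triple produces no entry, while the skos:definition of `ex:hasPart` appears once with one removed and one added value.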
AsciiDoc and HTML reports follow an identical structure. Each includes, among other sections, a general statistics section and a detailed section in which the actual differences are shown together with relevant metadata. A full explanation of the report scope, supported output formats, and content is available in the dedicated RDF Differ documentation section.
Automated diffing workflow in the model2owl-boilerplate repository
The workflow in the model2owl-boilerplate repository calculates differences as described in the Diffing process section. It can process one or more ontology modules.
The process followed in the workflow is similar to the one described in the Usage section, with some differences:
- The new versions of the model files need to be committed to the repository where the workflow is configured. It’s possible to compare a file in the current revision with the one committed in another revision or even another repository.
- Results are committed to the repository (the same branch where the workflow was triggered) and saved in the diff-reports directory.
- Alternatively, diff reports can be generated between the current version and past versions without modifying files. In this case, the diff configuration is specified per run via workflow dispatch.
The workflow uses a default configuration intended for demonstration purposes.
The configuration can be adjusted using the diff-config.env file or
GitHub Variables
as described in the RDF diffing configuration section.
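For readers who want to inspect such a configuration outside the workflow, the following sketch parses simple `KEY=VALUE` lines of the kind typically found in an `.env` file. It assumes the file uses plain assignments with optional quotes and `#` comments; the key names in the example are hypothetical and are not the actual settings of diff-config.env.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        config[key.strip()] = value
    return config


# Hypothetical content; the real keys are listed in the
# RDF diffing configuration section.
EXAMPLE = """
# illustrative settings only
EXAMPLE_OLD_DIR="old-artefacts"
EXAMPLE_OUTPUT_DIR=diff-reports
"""

config = parse_env_text(EXAMPLE)
```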
The workflow is called automatically by the transform workflow after generating new OWL/SHACL artefacts. It can also be triggered manually via workflow dispatch.
Note: The workflow dispatch option is only available after the workflow file has been merged to the default branch of the repository (as per GitHub Actions specifications).
Multiple ontology modules processing
The workflow can process multiple modules in a single run. Modules can be specified
explicitly via the modules parameter as a comma-separated list, or auto-detected
from the repository structure. Modules are processed in parallel, and the reports for
each module are stored in a dedicated subdirectory (e.g., diff-reports/epo_core/, diff-reports/epo_cat/).
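The module selection described above can be sketched as follows. This is a rough illustration, not the workflow's actual implementation: it assumes that auto-detection means "every subdirectory of the root directory for new files is a module", and the function names are hypothetical.

```python
from pathlib import Path


def resolve_modules(modules_param, new_root):
    """Return module names from a comma-separated list, or auto-detect them."""
    names = [m.strip() for m in modules_param.split(",") if m.strip()]
    if names:
        return names
    # Assumption: each subdirectory of the root directory is one module.
    return sorted(p.name for p in Path(new_root).iterdir() if p.is_dir())


def report_dirs(modules, output_root="diff-reports"):
    """Map each module to its dedicated report subdirectory."""
    return {m: f"{output_root}/{m}/" for m in modules}
```

For example, `report_dirs(resolve_modules("epo_core,epo_cat", "."))` yields the two subdirectories mentioned above, while an empty modules value falls back to scanning the given root directory.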
Manual workflow dispatch
When triggering the workflow manually, the following options are available in the GitHub Actions UI:
| Option | Description | Default | Example |
|---|---|---|---|
| | Comma-separated list of module names to compare. Leave empty to process all modules. | (empty) | |
| | Git revision (branch, tag, or commit SHA) to compare against. | | |
| | URL of an external public repository for the old version. Leave empty to compare within the same repository. | (empty) | |
| | Root directory for old OWL/SHACL files. Expected to contain subdirectories for ontology modules. | | |
| | Root directory for new OWL/SHACL files. Expected to contain subdirectories for ontology modules. | | |
| | Output directory for generated diff reports. | | |
Generated files are committed to the repository. The commit message includes
the number of modules processed and the run number, following the Conventional
Commits format (e.g., chore(ci): diff 2 module(s) #143 [skip ci]).
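Tools that post-process the repository history may want to recognise these commits. The sketch below builds and parses messages of the shape shown in the example above; the function names are hypothetical, and the pattern is inferred from that single example rather than taken from the workflow source.

```python
import re

# Pattern inferred from the documented example message.
_DIFF_COMMIT = re.compile(r"^chore\(ci\): diff (\d+) module\(s\) #(\d+) \[skip ci\]$")


def diff_commit_message(module_count, run_number):
    """Build a commit message in the documented shape."""
    return f"chore(ci): diff {module_count} module(s) #{run_number} [skip ci]"


def parse_diff_commit(message):
    """Return (module_count, run_number), or None if the message differs."""
    match = _DIFF_COMMIT.match(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```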
Each run produces a summary on the GitHub Actions page that shows which modules were processed and links to their reports. Failed modules (e.g., those with no preexisting files to compare against) are noted accordingly. For more information on navigating workflow runs, see Using the visualization graph.
Further details about the workflow can be found in the project’s README.