RDF Canonicalization API
Introduction
The RDF Dataset Canonicalization specification defines a process for creating a canonical serialization of an RDF dataset (a collection of graphs). Canonicalization is typically required when you need to determine whether one serialization of a graph (or collection of graphs) is isomorphic to another. Scenarios such as signing RDF data independently of its serialization or format rely on generating a canonical form of the data as input to cryptographic functions.
DotNetRDF includes an implementation that currently conforms to the 30 November 2023 draft of the specification.
This implementation is provided by the class RdfCanonicalizer.
Creating an RdfCanonicalizer
The constructor for RdfCanonicalizer takes a single optional argument: the moniker of the hashing algorithm to use when comparing nodes.
The default value for this argument is SHA256 (which is the default algorithm defined by the specification).
Canonicalization
The RdfCanonicalizer provides a single public Canonicalize method which accepts the dataset to be canonicalized as an ITripleStore instance.
This method creates a new in-memory ITripleStore instance that contains the canonical form of the input dataset.
The input dataset is not modified by this process.
The return value is an RdfCanonicalizer.CanonicalizedRdfDataset instance which provides the following properties:
InputDataset - the ITripleStore instance provided as input to the canonicalizer.
OutputDataset - the ITripleStore instance created by the canonicalizer containing the canonical form of the input dataset.
SerializedNQuads - the serialization of OutputDataset in NQuads format (conforming to the W3C NQuads 1.1 specification).
IssuedIdentifiersMap - a dictionary mapping the blank node identifiers in the input dataset to their canonical counterparts in the output dataset.
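
The following is a minimal sketch of how the pieces described above might fit together. It is not taken from the library's own documentation. It assumes that RdfCanonicalizer lives in the VDS.RDF namespace and that the dataset is loaded with NQuadsParser from VDS.RDF.Parsing. The sample data, the example.org IRIs and the LoadStore helper are hypothetical and exist only for illustration. The sketch canonicalizes two serializations of the same dataset that differ only in their blank node labels, then compares the SerializedNQuads values to check for isomorphism.

```csharp
using System;
using System.IO;
using VDS.RDF;
using VDS.RDF.Parsing;

// Two hypothetical serializations of the same dataset.
// They differ only in the labels chosen for the blank nodes.
const string firstNQuads =
    "_:a <http://example.org/knows> _:b .\n" +
    "_:b <http://example.org/name> \"Bob\" .\n";
const string secondNQuads =
    "_:x <http://example.org/knows> _:y .\n" +
    "_:y <http://example.org/name> \"Bob\" .\n";

// Calling the constructor with no argument selects the default SHA256 algorithm.
// A fresh canonicalizer is created per dataset to avoid assuming instances are reusable.
var first = new RdfCanonicalizer().Canonicalize(LoadStore(firstNQuads));
var second = new RdfCanonicalizer().Canonicalize(LoadStore(secondNQuads));

// The canonical N-Quads text is the value to hash, sign or compare.
Console.WriteLine(first.SerializedNQuads);

// Isomorphic datasets should yield identical canonical serializations.
bool isomorphic = first.SerializedNQuads == second.SerializedNQuads;
Console.WriteLine($"Isomorphic: {isomorphic}");

// Show how each input blank node identifier was relabelled.
foreach (var entry in first.IssuedIdentifiersMap)
{
    Console.WriteLine($"{entry.Key} -> {entry.Value}");
}

// Hypothetical helper: parse an N-Quads string into a new in-memory store.
static ITripleStore LoadStore(string nquads)
{
    var store = new TripleStore();
    new NQuadsParser().Load(store, new StringReader(nquads));
    return store;
}
```

Comparing SerializedNQuads as strings is one straightforward way to test two datasets for isomorphism. The same string can also be passed to a hash or signature function when the goal is to sign the data independently of its original serialization.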