Class GraphMatcher
Implements a Graph Isomorphism Algorithm.
Inheritance
Inherited Members
Namespace: VDS.RDF
Assembly: dotNetRdf.dll
Syntax
public class GraphMatcher
Remarks
The algorithm used to determine Graph equality is based in part on a Iterative Vertex Classification Algorithm described in a Technical Report from HP by Jeremy J Carroll - Matching RDF Graphs but has been expanded upon significantly to use a variety of techniques.
Graph Equality is determined according to the following algorithm, we refer to the first graph as the Source Graph and the second graph as the Target Graph:.
- If both graphs are null they are considered equal
- If only one of the given graph is null then they are not equal
- If the given graphs are reference equal then they are equal
- If the given graphs have a different number of Triples they are not equal
- If the given graphs have a different number of quoted triples, they are not equal
- Declare a list of triples which are the asserted triples of the second graph called TargetTriples
- Declare a set of triples which are the quoted triples of the second graph called TargetQuoted
- Declare two dictionaries of Nodes to Integers which are called SourceClassification and TargetClassification
- For Each Triple asserted in the Source Graph
- If it is a ground triple and cannot be found and removed from TargetTriples then graphs are not equal since the triple does not exist in both graphs
- If it contains blank nodes track the number of usages of this blank node in SourceClassification
- For Each Triple quoted in the Source Graph
- If it is a ground triple and cannot be found and removed from TargetQuoted then graphs are not equal since the triple is not quoted in both graphs
- If it contains blank nodes track the number of usages of this blank node in SourceClassification
- If there are any triples remaining in TargetTriples which are ground triples then graphs are not equal since the Source Graph does not contain them
- If there are any triples remaining in TargetQuoted which are ground triples then graphs are not equal since the Source Graph does not quote them
- If all the triples from both graphs were ground triples (i.e. there were no blank nodes) then the graphs are equal
- Iterate over the remaining triples in TargetTriples and TargetQuoted and populate the TargetClassification
- If the count of the two classifications is different the graphs are not equal since there are differing numbers of blank nodes in the Graph
- Now build two additional dictionaries of Integers to Integers which are called SourceDegreeClassification and TargetDegreeClassification. Iterate over SourceClassification and TargetClassification such that the corresponding degree classifications contain a mapping of the number of blank nodes with a given degree
- If the count of the two degree classifications is different the graphs are not equal since there are not the same range of blank node degrees in both graphs
- For All classifications in SourceDegreeClassification there must be a matching classification in TargetDegreeClassification else the graphs are not equal
- From this point in the algorithm, we can merge the remaining quoted and asserted triples to be matched into a single set for source and target triples.
- Then build a possible mapping using the following rules:
- Any blank bode used only once (single-use) in the Source Graph should be mapped to an equivalent blank bode in the Target Graph. If this is not possible then the graphs are not equal
- Any blank node with a unique degree in the Source Graph should be mapped to an equivalent blank node in the Target Graph. If this is not possible then the graphs are not equal
- Any blank node used with unique constants (two other ground terms in a triple) in the Source Graph should be mapped to an equivalent blank bode in the Target Graph. If this is not possible then the graphs are not equal.
- Build up lists of dependent pairs of blank Nodes for both graphs
- Use these lists to determine if there are any independent nodes not yet mapped in the Source Graph. These should be mapped to equivalent blank nodes in the Target Graph, if this is not possible the graphs are not equal
- Important: Keep a copy of the mapping up to this point as a Base Mapping for use as a fallback in later steps
- Use the dependency information and existing mappings to generate a possible mapping
- If a complete possible mapping (there is a mapping for each blank node from the Source Graph to the Target Graph) then test this mapping. If it succeeds then the graphs are equal
- If we don't yet have a mapping take a divide and conquer approach:
- Take the not yet mapped blank nodes for each graph and sub-divide them into their isolated sub-graphs
- If there are at least 2 isolated sub-graphs proceed to divide and conquer
- For Each Isolated Sub-Graph from the Source Graph
- Consider each possible isolated sub-graph of the same size from the target graph, if there are none then graphs are not equal. If there is a single possible equal isolated sub-graph add the mappings for all involved blank nodes.
- If we now have a complete possible mapping (there is a mapping for each blank node from the Source Graph to the Target Graph) then test the mapping. Return success/failure depending on whether the mapping is valid.
- Important: Keep a copy of the mapping up to this point as a Base Mapping for use as a base for the brute force step
- If we still don't have a complete mapping we now fallback to the Base Mapping and use it as a basis for brute forcing the possible solution space and testing every possibility until either a mapping works or we find the graphs to be non-equal
Properties
| Improve this Doc View SourceMapping
Gets the Blank Node Mapping found between the Graphs (if one was found).
Declaration
public Dictionary<INode, INode> Mapping { get; }
Property Value
Type | Description |
---|---|
System.Collections.Generic.Dictionary<TKey, TValue><INode, INode> |
Remarks
The algorithm used to determine Graph equality is based in part on a Iterative Vertex Classification Algorithm described in a Technical Report from HP by Jeremy J Carroll - Matching RDF Graphs but has been expanded upon significantly to use a variety of techniques.
Graph Equality is determined according to the following algorithm, we refer to the first graph as the Source Graph and the second graph as the Target Graph:.
- If both graphs are null they are considered equal
- If only one of the given graph is null then they are not equal
- If the given graphs are reference equal then they are equal
- If the given graphs have a different number of Triples they are not equal
- If the given graphs have a different number of quoted triples, they are not equal
- Declare a list of triples which are the asserted triples of the second graph called TargetTriples
- Declare a set of triples which are the quoted triples of the second graph called TargetQuoted
- Declare two dictionaries of Nodes to Integers which are called SourceClassification and TargetClassification
- For Each Triple asserted in the Source Graph
- If it is a ground triple and cannot be found and removed from TargetTriples then graphs are not equal since the triple does not exist in both graphs
- If it contains blank nodes track the number of usages of this blank node in SourceClassification
- For Each Triple quoted in the Source Graph
- If it is a ground triple and cannot be found and removed from TargetQuoted then graphs are not equal since the triple is not quoted in both graphs
- If it contains blank nodes track the number of usages of this blank node in SourceClassification
- If there are any triples remaining in TargetTriples which are ground triples then graphs are not equal since the Source Graph does not contain them
- If there are any triples remaining in TargetQuoted which are ground triples then graphs are not equal since the Source Graph does not quote them
- If all the triples from both graphs were ground triples (i.e. there were no blank nodes) then the graphs are equal
- Iterate over the remaining triples in TargetTriples and TargetQuoted and populate the TargetClassification
- If the count of the two classifications is different the graphs are not equal since there are differing numbers of blank nodes in the Graph
- Now build two additional dictionaries of Integers to Integers which are called SourceDegreeClassification and TargetDegreeClassification. Iterate over SourceClassification and TargetClassification such that the corresponding degree classifications contain a mapping of the number of blank nodes with a given degree
- If the count of the two degree classifications is different the graphs are not equal since there are not the same range of blank node degrees in both graphs
- For All classifications in SourceDegreeClassification there must be a matching classification in TargetDegreeClassification else the graphs are not equal
- From this point in the algorithm, we can merge the remaining quoted and asserted triples to be matched into a single set for source and target triples.
- Then build a possible mapping using the following rules:
- Any blank bode used only once (single-use) in the Source Graph should be mapped to an equivalent blank bode in the Target Graph. If this is not possible then the graphs are not equal
- Any blank node with a unique degree in the Source Graph should be mapped to an equivalent blank node in the Target Graph. If this is not possible then the graphs are not equal
- Any blank node used with unique constants (two other ground terms in a triple) in the Source Graph should be mapped to an equivalent blank bode in the Target Graph. If this is not possible then the graphs are not equal.
- Build up lists of dependent pairs of blank Nodes for both graphs
- Use these lists to determine if there are any independent nodes not yet mapped in the Source Graph. These should be mapped to equivalent blank nodes in the Target Graph, if this is not possible the graphs are not equal
- Important: Keep a copy of the mapping up to this point as a Base Mapping for use as a fallback in later steps
- Use the dependency information and existing mappings to generate a possible mapping
- If a complete possible mapping (there is a mapping for each blank node from the Source Graph to the Target Graph) then test this mapping. If it succeeds then the graphs are equal
- If we don't yet have a mapping take a divide and conquer approach:
- Take the not yet mapped blank nodes for each graph and sub-divide them into their isolated sub-graphs
- If there are at least 2 isolated sub-graphs proceed to divide and conquer
- For Each Isolated Sub-Graph from the Source Graph
- Consider each possible isolated sub-graph of the same size from the target graph, if there are none then graphs are not equal. If there is a single possible equal isolated sub-graph add the mappings for all involved blank nodes.
- If we now have a complete possible mapping (there is a mapping for each blank node from the Source Graph to the Target Graph) then test the mapping. Return success/failure depending on whether the mapping is valid.
- Important: Keep a copy of the mapping up to this point as a Base Mapping for use as a base for the brute force step
- If we still don't have a complete mapping we now fallback to the Base Mapping and use it as a basis for brute forcing the possible solution space and testing every possibility until either a mapping works or we find the graphs to be non-equal
Methods
| Improve this Doc View SourceEquals(IGraph, IGraph)
Compares two Graphs for equality.
Declaration
public bool Equals(IGraph g, IGraph h)
Parameters
Type | Name | Description |
---|---|---|
IGraph | g | Graph. |
IGraph | h | Graph. |
Returns
Type | Description |
---|---|
System.Boolean |
Remarks
The algorithm used to determine Graph equality is based in part on a Iterative Vertex Classification Algorithm described in a Technical Report from HP by Jeremy J Carroll - Matching RDF Graphs but has been expanded upon significantly to use a variety of techniques.
Graph Equality is determined according to the following algorithm, we refer to the first graph as the Source Graph and the second graph as the Target Graph:.
- If both graphs are null they are considered equal
- If only one of the given graph is null then they are not equal
- If the given graphs are reference equal then they are equal
- If the given graphs have a different number of Triples they are not equal
- If the given graphs have a different number of quoted triples, they are not equal
- Declare a list of triples which are the asserted triples of the second graph called TargetTriples
- Declare a set of triples which are the quoted triples of the second graph called TargetQuoted
- Declare two dictionaries of Nodes to Integers which are called SourceClassification and TargetClassification
- For Each Triple asserted in the Source Graph
- If it is a ground triple and cannot be found and removed from TargetTriples then graphs are not equal since the triple does not exist in both graphs
- If it contains blank nodes track the number of usages of this blank node in SourceClassification
- For Each Triple quoted in the Source Graph
- If it is a ground triple and cannot be found and removed from TargetQuoted then graphs are not equal since the triple is not quoted in both graphs
- If it contains blank nodes track the number of usages of this blank node in SourceClassification
- If there are any triples remaining in TargetTriples which are ground triples then graphs are not equal since the Source Graph does not contain them
- If there are any triples remaining in TargetQuoted which are ground triples then graphs are not equal since the Source Graph does not quote them
- If all the triples from both graphs were ground triples (i.e. there were no blank nodes) then the graphs are equal
- Iterate over the remaining triples in TargetTriples and TargetQuoted and populate the TargetClassification
- If the count of the two classifications is different the graphs are not equal since there are differing numbers of blank nodes in the Graph
- Now build two additional dictionaries of Integers to Integers which are called SourceDegreeClassification and TargetDegreeClassification. Iterate over SourceClassification and TargetClassification such that the corresponding degree classifications contain a mapping of the number of blank nodes with a given degree
- If the count of the two degree classifications is different the graphs are not equal since there are not the same range of blank node degrees in both graphs
- For All classifications in SourceDegreeClassification there must be a matching classification in TargetDegreeClassification else the graphs are not equal
- From this point in the algorithm, we can merge the remaining quoted and asserted triples to be matched into a single set for source and target triples.
- Then build a possible mapping using the following rules:
- Any blank bode used only once (single-use) in the Source Graph should be mapped to an equivalent blank bode in the Target Graph. If this is not possible then the graphs are not equal
- Any blank node with a unique degree in the Source Graph should be mapped to an equivalent blank node in the Target Graph. If this is not possible then the graphs are not equal
- Any blank node used with unique constants (two other ground terms in a triple) in the Source Graph should be mapped to an equivalent blank bode in the Target Graph. If this is not possible then the graphs are not equal.
- Build up lists of dependent pairs of blank Nodes for both graphs
- Use these lists to determine if there are any independent nodes not yet mapped in the Source Graph. These should be mapped to equivalent blank nodes in the Target Graph, if this is not possible the graphs are not equal
- Important: Keep a copy of the mapping up to this point as a Base Mapping for use as a fallback in later steps
- Use the dependency information and existing mappings to generate a possible mapping
- If a complete possible mapping (there is a mapping for each blank node from the Source Graph to the Target Graph) then test this mapping. If it succeeds then the graphs are equal
- If we don't yet have a mapping take a divide and conquer approach:
- Take the not yet mapped blank nodes for each graph and sub-divide them into their isolated sub-graphs
- If there are at least 2 isolated sub-graphs proceed to divide and conquer
- For Each Isolated Sub-Graph from the Source Graph
- Consider each possible isolated sub-graph of the same size from the target graph, if there are none then graphs are not equal. If there is a single possible equal isolated sub-graph add the mappings for all involved blank nodes.
- If we now have a complete possible mapping (there is a mapping for each blank node from the Source Graph to the Target Graph) then test the mapping. Return success/failure depending on whether the mapping is valid.
- Important: Keep a copy of the mapping up to this point as a Base Mapping for use as a base for the brute force step
- If we still don't have a complete mapping we now fallback to the Base Mapping and use it as a basis for brute forcing the possible solution space and testing every possibility until either a mapping works or we find the graphs to be non-equal
GenerateMappings(Dictionary<INode, INode>, Dictionary<INode, List<INode>>)
Helper method for brute forcing the possible mappings.
Declaration
public static IEnumerable<Dictionary<INode, INode>> GenerateMappings(Dictionary<INode, INode> baseMapping, Dictionary<INode, List<INode>> possibleMappings)
Parameters
Type | Name | Description |
---|---|---|
System.Collections.Generic.Dictionary<TKey, TValue><INode, INode> | baseMapping | Base Mapping. |
System.Collections.Generic.Dictionary<TKey, TValue><INode, System.Collections.Generic.List<T><INode>> | possibleMappings | Possible Mappings. |
Returns
Type | Description |
---|---|
System.Collections.Generic.IEnumerable<T><System.Collections.Generic.Dictionary<TKey, TValue><INode, INode>> |
Remarks
The base mapping at the time of the initial call should contain known good mappings.