Minimize Memory Usage

Depending on how it is used and the data you are working with, dotNetRDF can be very memory hungry, resulting in a large memory footprint for relatively small volumes of data or a footprint that grows ever larger over time.

This is not down to memory leaks, but rather to internal features of the API that are designed to make things faster for the average user. If you are reading this page then you are not an average user and want to tailor your usage of dotNetRDF to better manage its memory footprint. dotNetRDF has you covered: there are a number of features you can disable or configure, and alternative ways of working, that will reduce your memory usage.

URI Interning

Our URI Factory feature is used to improve the speed of URI comparisons. However, if the data you are working with has a lot of unique URIs, or you work with data for short periods before throwing it away, then this cache can use a lot of memory over time.

If you are working with the default root URI factory rather than managing your own instances of the IUriFactory interface, then you can disable this feature completely like so:

UriFactory.InternUris = false;

Or you can choose to periodically clean up the memory used:

UriFactory.Clear();

However, you may find it more beneficial to create scoped URI Factory instances (e.g. one for each separate run of a processing pipeline) and discard these when the code using them is finished, allowing all of the URIs they interned to be garbage collected.
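For example, a minimal sketch assuming the CachingUriFactory implementation of IUriFactory (the filesToProcess collection and ProcessFile helper are hypothetical placeholders):

// A scoped factory interns URIs independently of the root UriFactory
IUriFactory factory = new CachingUriFactory();

foreach (string file in filesToProcess)
{
    // Hypothetical per-file processing that creates URIs via factory.Create(...)
    ProcessFile(file, factory);
}

// Release all interned URIs for garbage collection once the run is finished
factory.Clear();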

Triple Indexing

By default the standard Graph implementation builds a number of triple indexes behind the scenes. These help make the GetTriplesWithX() methods and SPARQL queries run very fast on in-memory data. However, depending on what you are doing with the graph, you may not need indexes at all, or may only need certain indexes.

By default indexed graphs in dotNetRDF will incorporate 6 indices:

  • Subject
  • Predicate
  • Object
  • Subject-Predicate
  • Subject-Object
  • Predicate-Object

The first three are considered simple indices and the last three compound indices.

Use No Indices

If you don't need indices at all you can avoid them completely by using the NonIndexedGraph in place of the standard Graph, as shown below.
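
A minimal sketch; NonIndexedGraph is a drop-in replacement for the standard Graph:

// No indexes are built, so asserting triples costs less memory,
// but lookups fall back to scanning the whole graph
IGraph g = new NonIndexedGraph();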

Use Only Simple Indexes

If you can live with only simple indices and don't need compound indices, you can disable them like so:

Options.FullTripleIndexing = false;

Note: Indexed graph implementations inspect this setting only when a graph is first created, so you must set it prior to creating your graph.

With full triple indexing turned off, only simple indices will be created for newly instantiated graphs.
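
Putting this together, a minimal sketch of the required ordering:

// Disable compound indexing before any graphs are created
Options.FullTripleIndexing = false;

// This graph will build only the subject, predicate and object indices
Graph g = new Graph();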

Use Specific Indexes

The standard graph allows you to configure the underlying BaseTripleCollection used, so you can create a graph that only uses the indexes you want:

// Create a Triple Collection with only a subject index.
// The six boolean flags toggle the subject, predicate, object,
// subject-predicate, subject-object and predicate-object indices respectively.
BaseTripleCollection tripleCollection = new TreeIndexedTripleCollection(true, false, false, false, false, false, MultiDictionaryMode.AVL);

// Create a Graph using the customized triple collection
Graph g = new Graph(tripleCollection);
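
With this configuration, lookups by subject (e.g. GetTriplesWithSubject()) remain fast, while lookups that would have used the disabled indices fall back to scanning the collection.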

Stream Processing

If you can work with RDF/SPARQL Results in a stream then you can leverage the Handlers API as an alternative to loading your data fully into memory. This API gives you complete control over what happens to triples/results as they are generated.
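
For example, a minimal sketch using the built-in CountHandler to count the triples in a file (the file name is a placeholder) without ever materializing a graph in memory:

using System;
using VDS.RDF.Parsing;
using VDS.RDF.Parsing.Handlers;

// The handler receives each triple as it is parsed and keeps only a
// running count, so memory usage stays flat regardless of input size
CountHandler handler = new CountHandler();
FileLoader.Load(handler, "input.ttl");
Console.WriteLine("Parsed " + handler.Count + " triples");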

Token Queue Mode

Token-based parsers accept a TokenQueueMode which defines how the input data is tokenized; the choice of mode can have a significant effect on memory usage. The global default can be controlled via the Options.DefaultTokenQueueMode property.

The available modes are:

  • QueueAllBeforeParsing: As the name suggests, in this mode the input data is completely tokenized before any parsing takes place. This is the most expensive mode memory-wise, but may be more performant for very small inputs.
  • SynchronousBufferDuringParsing: This is the default. During parsing the tokenizer buffers some number of tokens ahead of the parser, stops buffering when the buffer is full, and resumes once the parser starts reading tokens again. This makes it the most stable and memory-efficient mode.
  • AsynchronousBufferDuringParsing: In this mode the tokenizer buffers as many tokens as it can, asynchronously with the parser, so it can consume very large amounts of memory. However, if the parser is very fast this may be the fastest way to parse.
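
For example, to explicitly ensure the memory-efficient default is used globally (set this before creating your parsers):

// Applies to all token-based parsers created after this point
Options.DefaultTokenQueueMode = TokenQueueMode.SynchronousBufferDuringParsing;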