Add QuadStoreStorageProvider: dotNetRDF IStorageProvider compatibility for Leviathan SPARQL #4
…torage, IUpdateableStorage Co-authored-by: aabs <157775+aabs@users.noreply.github.com>
…oreStorageProvider Co-authored-by: aabs <157775+aabs@users.noreply.github.com>
Pull request overview
This PR adds a QuadStoreStorageProvider adapter class to TripleStore.Core that wraps the existing QuadStore and implements the dotNetRDF IStorageProvider, IQueryableStorage, and IUpdateableStorage interfaces, enabling the store to be used as a first-class backend for the Leviathan SPARQL engine and other dotNetRDF tooling.
Changes:
- New `QuadStoreStorageProvider` adapter implementing three dotNetRDF storage interfaces with graph CRUD, SPARQL query support via Leviathan, and documented limitations (append-only, no graph deletion, no SPARQL Update).
- New `QuadStoreStorageProviderTests` test class with 38 tests covering capabilities, graph CRUD, SPARQL querying, node serialization round-trips, and integration scenarios.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `src/TripleStore.Core/QuadStoreStorageProvider.cs` | New adapter class implementing dotNetRDF storage interfaces with helper methods for node serialization and dual-format URI querying. |
| `test/TripleStore.Tests/QuadStoreStorageProviderTests.cs` | New test file covering the full public surface of `QuadStoreStorageProvider`, including round-trip and integration tests. |
        .Replace("\\\"", "\"")
        .Replace("\\n", "\n")
        .Replace("\\r", "\r")
        .Replace("\\\\", "\\");
The `UnescapeLiteral` method uses sequential `String.Replace` calls in the wrong order, causing data corruption for literals that contain a backslash followed by the letter `n` or `r`.
Consider a literal value that is the two-character string `\n` (backslash + letter 'n', NOT a newline). `EscapeLiteral` correctly stores it as `\\n` (three chars: `\`, `\`, `n`). However, `UnescapeLiteral` then processes it as follows:
- Step 1, replace `\"` with `"`: no match.
- Step 2, replace `\n` with a newline: finds the 2-char sequence `\n` at positions 1–2 of `\\n` and replaces it, yielding `\<newline>` (two chars: backslash + actual newline). This is wrong.
- Step 3, replace `\\` with `\`: no match on the remaining string.

Final result: `\<newline>` instead of the correct `\n` (backslash + n).
The root cause is that chained `String.Replace` calls cannot tell whether a backslash is itself escaped: the second backslash of `\\n` is consumed as the first character of a `\n` escape sequence.
Reordering so that `\\` → `\` runs first, followed by `\"`, `\n`, and `\r`, does not fix this case: `\\n` first becomes `\n`, which the later step then turns into a newline. Chained `String.Replace` calls are not correct for mutually dependent escape sequences in any order; the correct solution is a character-by-character scanner that processes each `\X` token as a single unit.
There is no test covering this case (a literal that contains a backslash followed by n or r as literal characters).
Suggested change:

    -    .Replace("\\\"", "\"")
    -    .Replace("\\n", "\n")
    -    .Replace("\\r", "\r")
    -    .Replace("\\\\", "\\");
    +    .Replace("\\\\", "\\")
    +    .Replace("\\\"", "\"")
    +    .Replace("\\n", "\n")
    +    .Replace("\\r", "\r");
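
The character-by-character scanner mentioned in this comment could look roughly like the sketch below. It is not the PR's code: the class name `LiteralEscaping` and both method names are hypothetical, and it assumes the store escapes exactly four characters (backslash, double quote, newline, carriage return), which is what the quoted `Replace` chain suggests.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of QuadStoreStorageProvider.
public static class LiteralEscaping
{
    // Assumed escape set: backslash, double quote, newline, carriage return.
    public static string Escape(string value)
    {
        var sb = new StringBuilder(value.Length + 8);
        foreach (var c in value)
        {
            switch (c)
            {
                case '\\': sb.Append("\\\\"); break;
                case '"':  sb.Append("\\\""); break;
                case '\n': sb.Append("\\n");  break;
                case '\r': sb.Append("\\r");  break;
                default:   sb.Append(c);      break;
            }
        }
        return sb.ToString();
    }

    // Single left-to-right pass: every backslash consumes exactly one
    // following character, so an escaped backslash can never be re-read
    // as the start of another escape sequence.
    public static string Unescape(string escaped)
    {
        var sb = new StringBuilder(escaped.Length);
        for (var i = 0; i < escaped.Length; i++)
        {
            var c = escaped[i];
            if (c != '\\' || i + 1 == escaped.Length)
            {
                sb.Append(c); // ordinary char, or a trailing lone backslash kept as-is
                continue;
            }

            var next = escaped[++i];
            switch (next)
            {
                case 'n':  sb.Append('\n'); break;
                case 'r':  sb.Append('\r'); break;
                case '"':  sb.Append('"');  break;
                case '\\': sb.Append('\\'); break;
                default:   sb.Append('\\').Append(next); break; // unknown escape: keep verbatim
            }
        }
        return sb.ToString();
    }
}
```

With this shape, a literal consisting of backslash + `n` is stored as `\\n` and decoded back to backslash + `n`, because the scanner pairs the first backslash with the second before it ever looks at the `n`.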
    /// <inheritdoc/>
    public IEnumerable<string> ListGraphNames()
    {
        return _store.Query()
            .Select(q => q.graph)
            .Distinct();
    }
When the store contains data with angle-bracketed graph URIs (e.g. <http://example.org/g>), ListGraphNames() (line 197–202) returns those angle-bracketed strings as-is. In CreateQueryProcessor() (line 347), these raw strings are then passed to LoadGraph(IGraph g, string graphUri) (line 105).
In LoadGraph(IGraph, string), after populating the graph, the code tries Uri.TryCreate(graphUri, UriKind.Absolute, out var uri) to set g.BaseUri. An angle-bracketed string like <http://example.org/g> is not a valid absolute URI, so TryCreate returns false and g.BaseUri is never set. The graph added to the in-memory triple store will have a null BaseUri, so when the graph is added to the TripleStore via tripleStore.Add(g), it will be treated as the default graph instead of the named graph http://example.org/g.
Additionally, ListGraphs() (line 187) filters with Uri.TryCreate(g, UriKind.Absolute, out _), which also silently drops angle-bracketed URIs. This creates an inconsistency where ListGraphNames() and ListGraphs() can return different numbers of graphs when some are stored with angle brackets.
Both ListGraphNames() and ListGraphs() should normalise the raw stored graph string by stripping angle brackets before returning.
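
A minimal sketch of that normalisation step, assuming it is enough to strip a single enclosing pair of angle brackets. The `GraphNames` class and its methods are hypothetical names introduced here, not members of the PR.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of QuadStoreStorageProvider.
public static class GraphNames
{
    // Strips one enclosing pair of angle brackets, if present;
    // any other input is returned unchanged.
    public static string Normalize(string raw)
    {
        if (raw != null && raw.Length >= 2 && raw[0] == '<' && raw[raw.Length - 1] == '>')
        {
            return raw.Substring(1, raw.Length - 2);
        }
        return raw;
    }

    // Normalises before de-duplicating, so the bracketed and bare
    // spellings of the same graph collapse into one entry.
    public static IEnumerable<string> NormalizeAll(IEnumerable<string> rawNames)
    {
        return rawNames.Select(Normalize).Distinct();
    }
}
```

Applying `Normalize` before the existing `Distinct()` call, and before `Uri.TryCreate`, would let both listing methods agree on the set of graphs and would give `LoadGraph` a string that parses as an absolute URI.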
    private LeviathanQueryProcessor CreateQueryProcessor()
    {
        var tripleStore = new VDS.RDF.TripleStore();
        // Materialise the graph names eagerly to release the QuadStore read lock
        // before calling LoadGraph (which also acquires the same lock).
        var graphNames = ListGraphNames().ToList();
        foreach (var graphName in graphNames)
        {
            var g = new Graph();
            LoadGraph(g, graphName);
            tripleStore.Add(g, mergeIfExists: true);
        }
        return new LeviathanQueryProcessor(new InMemoryDataset(tripleStore, unionDefaultGraph: true));
    }
`CreateQueryProcessor()` is called on every invocation of `Query()` and rebuilds a full in-memory snapshot of the entire store, loading every graph each time. For a store with many or large graphs, this means a complete data copy per query, so every query costs at least O(n) in store size, however simple the query is.
Consider caching the LeviathanQueryProcessor (or the underlying InMemoryDataset) and invalidating it on writes, or at minimum documenting this as a known performance limitation.
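
One possible shape for such a cache is sketched below, as a generic wrapper rather than anything tied to dotNetRDF types. `InvalidatingCache<T>` is a hypothetical name; in the provider, `T` would be the `LeviathanQueryProcessor` (or the dataset), the factory would be `CreateQueryProcessor`, and each write path would call `Invalidate()`. The sketch assumes the cached snapshot is safe to share between concurrent queries, which would need checking.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration; not part of the PR.
public sealed class InvalidatingCache<T> where T : class
{
    private readonly Func<T> _build;
    private readonly object _gate = new object();
    private int _version;            // bumped on every write
    private int _builtVersion = -1;  // version the cached value was built from
    private T? _value;

    public InvalidatingCache(Func<T> build)
    {
        _build = build ?? throw new ArgumentNullException(nameof(build));
    }

    // Never blocks, so it can be called from inside a write path
    // without waiting on a rebuild that is in progress.
    public void Invalidate()
    {
        Interlocked.Increment(ref _version);
    }

    // Rebuilds only when a write has happened since the last build.
    public T Get()
    {
        lock (_gate)
        {
            var current = Volatile.Read(ref _version);
            if (_value == null || _builtVersion != current)
            {
                _value = _build();
                _builtVersion = current;
            }
            return _value;
        }
    }
}
```

The version counter is read before the rebuild starts, so a write that lands during a rebuild leaves the cache marked stale and the next `Get()` rebuilds again. Read-heavy workloads then pay the full copy once per write rather than once per query.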
    [Fact]
    public void RoundTrip_LiteralWithSpecialChars()
    {
        var factory = new NodeFactory();
        var node = factory.CreateLiteralNode("line1\nline2");
        var str = QuadStoreStorageProvider.NodeToString(node);
        str.Should().Be("\"line1\\nline2\"");
        var back = QuadStoreStorageProvider.StringToNode(str, factory);
        back.Should().BeAssignableTo<ILiteralNode>();
        ((ILiteralNode)back).Value.Should().Be("line1\nline2");
    }
The `RoundTrip_LiteralWithSpecialChars` test only covers a literal with an actual newline character. It does not cover a literal containing a backslash followed by the letter n (the two-character sequence `\n`), which triggers the `UnescapeLiteral` ordering bug described in the corresponding production-code comment. A test case for this would be `factory.CreateLiteralNode("value\\ntext")` (where `\\n` in C# source is backslash + n, not a newline). Similarly, literals containing `\r` (backslash + r) or `\"` (backslash + quote) are not tested.
QuadStore lacked the dotNetRDF storage interfaces required to plug into the Leviathan SPARQL engine and broader dotNetRDF tooling. This PR adds a `QuadStoreStorageProvider` adapter that implements `IStorageProvider`, `IQueryableStorage`, and `IUpdateableStorage`.

New: `QuadStoreStorageProvider`
- `IStorageProvider`: full graph CRUD: `LoadGraph` (4 overloads: `Uri`/`string` × `IGraph`/`IRdfHandler`), `SaveGraph`, `UpdateGraph` (3 overloads), `DeleteGraph`, `ListGraphs`/`ListGraphNames`
- `IQueryableStorage`: SPARQL SELECT/ASK/CONSTRUCT/DESCRIBE via Leviathan (`LeviathanQueryProcessor` over an `InMemoryDataset` snapshot)
- `IUpdateableStorage`: declared but throws `RdfStorageException`, because QuadStore is append-only
- Capabilities: `DeleteSupported=false`, `UpdateSupported=true` (additions only), `ListGraphsSupported=true`, `IOBehaviour.IsQuadStore`

Notable implementation details
- Graph URIs may be stored in two formats (`http://…` vs `<http://…>`); `QueryByGraph` tries both and deduplicates
- Plain literals carry an `xsd:string` `DataType`; language-tagged literals carry `rdf:langString`. `NodeToString` checks `Language` before `DataType`, and uses `AbsoluteUri` string comparison (not `Uri.Equals`) to avoid .NET's fragment-ignoring equality bug
- `QuadStore` uses `ReaderWriterLockSlim(NoRecursion)`; `CreateQueryProcessor` materialises `ListGraphNames().ToList()` before calling `LoadGraph` to prevent re-entrant lock acquisition

Limitations (documented on class)
- Append-only: `UpdateGraph` applies additions only
- No SPARQL Update (`IUpdateableStorage.Update` throws)