Added click to the pip install line. Looks like huggingface-hub 1.13.0
added a CLI dependency on click but didn't declare it as a hard
requirement (or it's only required for the CLI entrypoint). Needed to
unblock the build.
Replaces the URL-based PDF downloads in tg-load-sample-documents with
seven curated, locally bundled documents covering diverse topics (recipes,
Belgian beer, trade routes, corporate scandals, pets, fortifications,
Bronze Age collapse). Documents are packaged as data files within
trustgraph-cli and loaded from metadata.json, removing the dependency
on external URLs and the doc-cache mechanism.
Replace removed `user` parameter with `workspace` support following
the tenancy axis change in #840. Adds -w/--workspace flag and
$TRUSTGRAPH_WORKSPACE env var.
Replace hallucinated relative imports with correct absolute imports
across the ontology query package, and fix OntologyMatcher reference
to match the actual class name OntologyMatcherForQueries. Simplify
test to use standard imports instead of importlib hack.
Cosmetic, but simpler imports provides undeterministic imports in a dev
environment, and also means we're properly testing linkage
Convert the SPARQL algebra evaluator from eager list-based evaluation to
lazy async generators so results stream incrementally. This lets Slice
terminate early (via generator cleanup) and avoids materialising full
result sets for streamable operators like Project, Filter, Union, and
Extend. Blocking operators (Join, LeftJoin, OrderBy, Group) materialise
at their boundary then yield.
Add bind join optimization for Join nodes where one side is small
(VALUES/ToMultiSet): instead of materialising both sides independently
and hash-joining, iterate the small side's bindings and evaluate the
large side with those bindings pre-seeded. This turns wildcard BGP
queries into selective ones — e.g. VALUES ?x { <uri> } joined with a
BGP now queries the triple store with ?x bound rather than fetching
all triples.
Add TriplesClient.query_gen() async generator that wraps the existing
streaming callback API via an asyncio.Queue bridge, yielding individual
Triple objects as batches arrive.
Add streaming request path in the SPARQL query service that batches
solutions from the live async generator and sends them as they fill.
Fix FILTER IN/NOT IN: rdflib represents these as RelationalExpression
nodes with op="IN", not as Builtin_IN — handle both representations.
Fix Builtin_IN/Builtin_NOTIN dispatch ordering so the specific handlers
are checked before the generic Builtin_ prefix match.
Fix VALUES handling for rdflib's two representations: positional
(var/value) and dict-based (res).
The Slice evaluator was propagating the SPARQL LIMIT value as the
inner limit for child evaluations, starving LeftJoin (OPTIONAL) and
other operators of results. The safety limit parameter should flow
through unchanged; LIMIT/OFFSET are applied only at the Slice node.
Add 30+ SPARQL 1.1 built-in functions and the MINUS algebra operator to the
custom SPARQL query backend.
String functions:
- SUBSTR (2-arg and 3-arg forms), STRBEFORE, STRAFTER
- REPLACE (regex with flags), ENCODE_FOR_URI
Numeric functions:
- FLOOR, CEIL, ROUND, ABS
Date/time accessors:
- YEAR, MONTH, DAY, HOURS, MINUTES, SECONDS
- NOW, TZ
Hash functions:
- MD5, SHA1, SHA256, SHA512
Term constructors:
- IRI/URI, BNODE, UUID, STRUUID
Other functions:
- LANGMATCHES, RAND
- EXISTS / NOT EXISTS (with async pre-evaluation to bridge the
sync expression evaluator and async algebra evaluator)
Algebra:
- MINUS set-difference operator
- HAVING already works via rdflib's Filter mapping (verified)
Fix SPARQL ORDER handling
Includes 653 lines of new unit tests covering all added functionality
across expressions, solutions, and algebra layers.