---
layout: default
title: "Graph Contexts Technical Specification"
parent: "Tech Specs"
---

# Graph Contexts Technical Specification

## Overview

This specification describes changes to TrustGraph's core graph primitives to
align with RDF 1.2 and support full RDF Dataset semantics. This is a breaking
change for the 2.x release series.

### Versioning

- **2.0**: Early adopter release. Core features available, may not be fully
  production-ready.
- **2.1 / 2.2**: Production release. Stability and completeness validated.

Flexibility on maturity is intentional - early adopters can access new
capabilities before all features are production-hardened.

## Goals

The primary goal of this work is to support metadata about facts (statements):

- **Temporal information**: Associate facts with time metadata
  - When a fact was believed to be true
  - When a fact became true
  - When a fact was discovered to be false

- **Provenance/Sources**: Track which sources support a fact
  - "This fact was supported by source X"
  - Link facts back to their origin documents

- **Veracity/Trust**: Record assertions about truth
  - "Person P asserted this was true"
  - "Person Q claims this is false"
  - Enable trust scoring and conflict detection

**Hypothesis**: Reification (RDF-star / quoted triples) is the key mechanism
to achieve these outcomes, as all require making statements about statements.

## Background

To express "the fact (Alice knows Bob) was discovered on 2024-01-15" or
"source X supports the claim (Y causes Z)", you need to reference an edge
as a thing you can make statements about. Plain triples don't support this
directly: a triple has no identity of its own that another triple can refer to.

### Current Limitations

The current `Value` class in `trustgraph-base/trustgraph/schema/core/primitives.py`
can represent:
- URI nodes (`is_uri=True`)
- Literal values (`is_uri=False`)

The `type` field exists but is not used to represent XSD datatypes.

## Technical Design

### RDF Features to Support

#### Core Features (Related to Reification Goals)

These features are directly related to the temporal, provenance, and veracity
goals:

1. **RDF 1.2 Quoted Triples (RDF-star)**
   - Edges that point at other edges
   - A Triple can appear as the subject or object of another Triple
   - Enables statements about statements (reification)
   - Core mechanism for annotating individual facts

2. **RDF Dataset / Named Graphs**
   - Support for multiple named graphs within a dataset
   - Each graph identified by an IRI
   - Moves from triples (s, p, o) to quads (s, p, o, g)
   - Includes a default graph plus zero or more named graphs
   - The graph IRI can be a subject in statements, e.g.:
     ```
     <graph-source-A> <discoveredOn> "2024-01-15"
     <graph-source-A> <hasVeracity> "high"
     ```
   - Note: Named graphs are a separate feature from reification. In RDF
     generally they have uses beyond statement annotation (partitioning,
     access control, dataset organization) and should be treated as a
     distinct capability. In TrustGraph they are not a security boundary
     (see Security Considerations).

3. **Blank Nodes** (Limited Support)
   - Anonymous nodes without a global URI
   - Supported for compatibility when loading external RDF data
   - **Limited status**: No guarantees about stable identity after loading
   - Find them via wildcard queries (match by connections, not by ID)
   - Not a first-class feature - don't rely on precise blank node handling

#### Opportunistic Fixes (2.0 Breaking Change)

These features are not directly related to the reification goals but are
valuable improvements to include while making breaking changes:

4. **Literal Datatypes**
   - Properly use the `type` field for XSD datatypes
   - Examples: xsd:string, xsd:integer, xsd:dateTime, etc.
   - Fixes current limitation: cannot represent dates or integers properly

5. **Language Tags**
   - Support for language attributes on string literals (@en, @fr, etc.)
   - Note: A literal carries either a language tag OR an explicit datatype,
     not both (a language-tagged string implicitly has datatype
     rdf:langString)
   - Important for AI/multilingual use cases
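
As an illustration of why typed literals matter for comparisons, here is a minimal sketch that converts a literal's lexical form into a native Python value. The function name `literal_to_python`, the handful of datatypes it handles, and the use of full XSD datatype IRIs are assumptions made for this example. They are not part of the TrustGraph API.

```python
from datetime import date, datetime

XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"

def literal_to_python(value: str, datatype: str = "", language: str = ""):
    """Convert a literal's lexical form to a Python value (illustrative only)."""
    # A language tag may only be combined with rdf:langString (or no datatype).
    if language and datatype not in ("", RDF_LANGSTRING):
        raise ValueError("a literal carries a language tag or a datatype, not both")
    if language or not datatype or datatype == XSD + "string":
        return value                      # plain or language-tagged string
    if datatype == XSD + "integer":
        return int(value)
    if datatype == XSD + "date":
        return date.fromisoformat(value)
    if datatype == XSD + "dateTime":
        return datetime.fromisoformat(value)
    return value                          # unknown datatype: keep the lexical form
```

With values converted this way, a date or numeric comparison operates on real dates and numbers rather than on strings.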

### Data Models

#### Term (rename from Value)

The `Value` class will be renamed to `Term` to better reflect RDF terminology.
This rename serves two purposes:
1. Aligns naming with RDF concepts (a "Term" can be an IRI, literal, blank
   node, or quoted triple - not just a "value")
2. Forces code review at the breaking change interface - any code still
   referencing `Value` is visibly broken and needs updating

A Term can represent:

- **IRI/URI** - A named node/resource
- **Blank Node** - An anonymous node with local scope
- **Literal** - A data value with either:
  - A datatype (XSD type), OR
  - A language tag
- **Quoted Triple** - A triple used as a term (RDF 1.2)

##### Chosen Approach: Single Class with Type Discriminator

Serialization requirements drive the structure - a type discriminator is needed
in the wire format regardless of the Python representation. A single class with
a type field is the natural fit and aligns with the current `Value` pattern.

Single-character type codes provide compact serialization:

```python
from dataclasses import dataclass

# Term type constants
IRI = "i"      # IRI/URI node
BLANK = "b"    # Blank node
LITERAL = "l"  # Literal value
TRIPLE = "t"   # Quoted triple (RDF-star)

@dataclass
class Term:
    type: str = ""  # One of: IRI, BLANK, LITERAL, TRIPLE

    # For IRI terms (type == IRI)
    iri: str = ""

    # For blank nodes (type == BLANK)
    id: str = ""

    # For literals (type == LITERAL)
    value: str = ""
    datatype: str = ""   # XSD datatype URI (mutually exclusive with language)
    language: str = ""   # Language tag (mutually exclusive with datatype)

    # For quoted triples (type == TRIPLE)
    triple: "Triple | None" = None
```

Usage examples:

```python
# IRI term
node = Term(type=IRI, iri="http://example.org/Alice")

# Literal with datatype (full XSD datatype IRI, matching the field comment)
age = Term(type=LITERAL, value="42",
           datatype="http://www.w3.org/2001/XMLSchema#integer")

# Literal with language tag
label = Term(type=LITERAL, value="Hello", language="en")

# Blank node
anon = Term(type=BLANK, id="_:b1")

# Quoted triple (statement about a statement)
inner = Triple(
    s=Term(type=IRI, iri="http://example.org/Alice"),
    p=Term(type=IRI, iri="http://example.org/knows"),
    o=Term(type=IRI, iri="http://example.org/Bob"),
)
reified = Term(type=TRIPLE, triple=inner)
```
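
Because a single class holds the fields for every term kind, nothing in the dataclass itself stops a caller from filling in the wrong combination. The following sketch shows one way such a check might look. The `validate_term` helper is hypothetical, and the class definitions are repeated (using `typing.Optional`) only so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Optional

IRI, BLANK, LITERAL, TRIPLE = "i", "b", "l", "t"

@dataclass
class Triple:
    s: Optional["Term"] = None
    p: Optional["Term"] = None
    o: Optional["Term"] = None
    g: Optional[str] = None

@dataclass
class Term:
    type: str = ""
    iri: str = ""
    id: str = ""
    value: str = ""
    datatype: str = ""
    language: str = ""
    triple: Optional[Triple] = None

def validate_term(term: Term) -> list:
    """Return a list of problems; an empty list means the term looks well-formed."""
    problems = []
    if term.type == IRI:
        if not term.iri:
            problems.append("IRI term has empty iri")
    elif term.type == BLANK:
        if not term.id:
            problems.append("blank node has empty id")
    elif term.type == LITERAL:
        if term.datatype and term.language:
            problems.append("literal has both datatype and language")
    elif term.type == TRIPLE:
        if term.triple is None:
            problems.append("quoted triple term has no triple")
        else:
            # Recurse into the quoted triple's own terms.
            for name in ("s", "p", "o"):
                inner = getattr(term.triple, name)
                if inner is None:
                    problems.append(f"quoted triple is missing {name}")
                else:
                    problems.extend(validate_term(inner))
    else:
        problems.append(f"unknown term type {term.type!r}")
    return problems
```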

##### Alternatives Considered

**Option B: Union of specialized classes** (`Term = IRI | BlankNode | Literal | QuotedTriple`)
- Rejected: Serialization would still need a type discriminator, adding complexity

**Option C: Base class with subclasses**
- Rejected: Same serialization issue, plus dataclass inheritance quirks

#### Triple / Quad

The `Triple` class gains an optional graph field to become a quad:

```python
@dataclass
class Triple:
    s: Term | None = None    # Subject
    p: Term | None = None    # Predicate
    o: Term | None = None    # Object
    g: str | None = None     # Graph name (IRI), None = default graph
```

Design decisions:
- **Field name**: `g` for consistency with `s`, `p`, `o`
- **Optional**: `None` means the default graph (unnamed)
- **Type**: Plain string (IRI) rather than Term
  - Graph names are always IRIs
  - Blank nodes as graph names ruled out (too confusing)
  - No need for the full Term machinery

Note: The class name stays `Triple` even though it's technically a quad now.
This avoids churn and "triple" is still the common terminology for the s/p/o
portion. The graph context is metadata about where the triple lives.
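
To make the wire-format argument concrete, here is a rough sketch of how a quad might be encoded as a JSON-compatible dictionary with the type discriminator. The key names (`t`, `i`, `d`, `v`, `dt`, `ln`, `tr`) and the helper functions are assumptions chosen for illustration. The actual translator format may differ.

```python
import json
from dataclasses import dataclass
from typing import Optional

IRI, BLANK, LITERAL, TRIPLE = "i", "b", "l", "t"

@dataclass
class Triple:
    s: Optional["Term"] = None
    p: Optional["Term"] = None
    o: Optional["Term"] = None
    g: Optional[str] = None

@dataclass
class Term:
    type: str = ""
    iri: str = ""
    id: str = ""
    value: str = ""
    datatype: str = ""
    language: str = ""
    triple: Optional[Triple] = None

def term_to_wire(t: Optional[Term]) -> dict:
    """Encode only the fields relevant to the term's type."""
    if t is None:
        raise ValueError("cannot encode a missing term")
    if t.type == IRI:
        return {"t": IRI, "i": t.iri}
    if t.type == BLANK:
        return {"t": BLANK, "d": t.id}
    if t.type == LITERAL:
        out = {"t": LITERAL, "v": t.value}
        if t.datatype:
            out["dt"] = t.datatype
        if t.language:
            out["ln"] = t.language
        return out
    if t.type == TRIPLE and t.triple is not None:
        return {"t": TRIPLE, "tr": triple_to_wire(t.triple)}
    raise ValueError(f"cannot encode term of type {t.type!r}")

def triple_to_wire(q: Triple) -> dict:
    """Encode s/p/o, and include "g" only for a named graph."""
    out = {k: term_to_wire(getattr(q, k)) for k in ("s", "p", "o")}
    if q.g is not None:          # the default graph is represented by omitting "g"
        out["g"] = q.g
    return out
```

The result passes through `json.dumps` unchanged, and a quoted triple nests naturally because `term_to_wire` and `triple_to_wire` call each other.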

### Candidate Query Patterns

The current query engine accepts combinations of S, P, O terms. With quoted
triples, a triple itself becomes a valid term in those positions. Below are
candidate query patterns that support the original goals.

#### Graph Parameter Semantics

Following SPARQL conventions for backward compatibility:

- **`g` omitted / None**: Query the default graph only
- **`g` = specific IRI**: Query that named graph only
- **`g` = wildcard / `*`**: Query across all graphs (comparable to SPARQL
  `GRAPH ?g { ... }`, which in SPARQL itself ranges over named graphs only)

This keeps simple queries simple and makes named graph queries opt-in.

Cross-graph queries (g=wildcard) are fully supported. The Cassandra schema
includes dedicated tables (SPOG, POSG, OSPG) where g is a clustering column
rather than a partition key, enabling efficient queries across all graphs.
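
The three cases of the graph parameter can be sketched as a small predicate. This is illustrative only: the `ALL_GRAPHS` marker and the `graph_matches` function are hypothetical names, and the sketch assumes that the wildcard also covers the default graph.

```python
from typing import Optional

ALL_GRAPHS = "*"   # assumed wildcard marker; the concrete query syntax is still open

def graph_matches(query_g: Optional[str], quad_g: Optional[str]) -> bool:
    """Decide whether a stored quad is visible to a query's graph parameter."""
    if query_g is None:
        return quad_g is None      # g omitted: default graph only
    if query_g == ALL_GRAPHS:
        return True                # assumption: default graph and every named graph
    return quad_g == query_g       # specific IRI: that named graph only
```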

#### Temporal Queries

**Find all facts discovered after a given date:**
```
S: ?                                    # any quoted triple
P: <discoveredOn>
O: > "2024-01-15"^^xsd:date             # date comparison
```

**Find when a specific fact was believed true:**
```
S: << <Alice> <knows> <Bob> >>          # quoted triple as subject
P: <believedTrueFrom>
O: ?                                    # returns the date
```

**Find facts that became false:**
```
S: ?                                    # any quoted triple
P: <discoveredFalseOn>
O: ?                                    # has any value (exists)
```

#### Provenance Queries

**Find all facts supported by a specific source:**
```
S: ?                                    # any quoted triple
P: <supportedBy>
O: <source:document-123>
```

**Find which sources support a specific fact:**
```
S: << <DrugA> <treats> <DiseaseB> >>    # quoted triple as subject
P: <supportedBy>
O: ?                                    # returns source IRIs
```

#### Veracity Queries

**Find assertions a person marked as true:**
```
S: ?                                    # any quoted triple
P: <assertedTrueBy>
O: <person:Alice>
```

**Find conflicting assertions (same fact, different veracity):**
```
# First query: facts asserted true
S: ?
P: <assertedTrueBy>
O: ?

# Second query: facts asserted false
S: ?
P: <assertedFalseBy>
O: ?

# Application logic: find intersection of subjects
```
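
The application-side intersection could look roughly like the sketch below. It assumes each query returns `(fact, person)` pairs, where `fact` is any hashable key for the quoted triple (for example its serialized form). The function name `find_conflicts` is hypothetical.

```python
def find_conflicts(asserted_true, asserted_false):
    """Map each contested fact to (people asserting true, people asserting false)."""
    true_by, false_by = {}, {}
    for fact, person in asserted_true:
        true_by.setdefault(fact, set()).add(person)
    for fact, person in asserted_false:
        false_by.setdefault(fact, set()).add(person)
    # A fact is contested when it appears as a subject in both result sets.
    contested = true_by.keys() & false_by.keys()
    return {fact: (true_by[fact], false_by[fact]) for fact in contested}
```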

**Find facts with trust score below threshold:**
```
S: ?                                    # any quoted triple
P: <trustScore>
O: < 0.5                                # numeric comparison
```

### Architecture

Significant changes are required across multiple components:

#### This Repository (trustgraph)

- **Schema primitives** (`trustgraph-base/trustgraph/schema/core/primitives.py`)
  - Value → Term rename
  - New Term structure with type discriminator
  - Triple gains `g` field for graph context

- **Message translators** (`trustgraph-base/trustgraph/messaging/translators/`)
  - Update for new Term/Triple structures
  - Serialization/deserialization for new fields

- **Gateway components**
  - Handle new Term and quad structures

- **Knowledge cores**
  - Core changes to support quads and reification

- **Knowledge manager**
  - Schema changes propagate here

- **Storage layers**
  - Cassandra: Schema redesign (see Implementation Details)
  - Other backends: Deferred to later phases

- **Command-line utilities**
  - Update for new data structures

- **REST API documentation**
  - OpenAPI spec updates

#### Client Libraries and External Repositories

- **Python API** (this repo)
  - Client library updates for new structures

- **TypeScript APIs** (separate repo)
  - Client library updates

- **Workbench** (separate repo)
  - Significant state management changes

### APIs

#### REST API

- Documented in OpenAPI spec
- Will need updates for new Term/Triple structures
- New endpoints may be needed for graph context operations

#### Python API (this repo)

- Client library changes to match new primitives
- Breaking changes to Term (was Value) and Triple

#### TypeScript API (separate repo)

- Parallel changes to Python API
- Separate release coordination

#### Workbench (separate repo)

- Significant state management changes
- UI updates for graph context features

### Implementation Details

#### Phased Storage Implementation

Multiple graph store backends exist (Cassandra, Neo4j, etc.). Implementation
will proceed in phases:

1. **Phase 1: Cassandra**
   - Start with the home-grown Cassandra store
   - Full control over the storage layer enables rapid iteration
   - Schema will be redesigned from scratch for quads + reification
   - Validate the data model and query patterns against real use cases

#### Cassandra Schema Design

Cassandra requires multiple tables to support different query access patterns
(each table efficiently queries by its partition key + clustering columns).

##### Query Patterns

With quads (g, s, p, o), each position can be specified or wildcard, giving
16 possible query patterns:

| # | g | s | p | o | Description |
|---|---|---|---|---|-------------|
| 1 | ? | ? | ? | ? | All quads |
| 2 | ? | ? | ? | o | By object |
| 3 | ? | ? | p | ? | By predicate |
| 4 | ? | ? | p | o | By predicate + object |
| 5 | ? | s | ? | ? | By subject |
| 6 | ? | s | ? | o | By subject + object |
| 7 | ? | s | p | ? | By subject + predicate |
| 8 | ? | s | p | o | Full triple (which graphs?) |
| 9 | g | ? | ? | ? | By graph |
| 10 | g | ? | ? | o | By graph + object |
| 11 | g | ? | p | ? | By graph + predicate |
| 12 | g | ? | p | o | By graph + predicate + object |
| 13 | g | s | ? | ? | By graph + subject |
| 14 | g | s | ? | o | By graph + subject + object |
| 15 | g | s | p | ? | By graph + subject + predicate |
| 16 | g | s | p | o | Exact quad |
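
The numbering in the table corresponds to counting in binary over (g, s, p, o), which a short sketch can reproduce. The names `POSITIONS` and `enumerate_patterns` are illustrative, not existing code.

```python
from itertools import product

POSITIONS = ("g", "s", "p", "o")

def enumerate_patterns():
    """Yield (number, pattern); pattern maps each position to True when bound."""
    for number, bound in enumerate(product((False, True), repeat=4), start=1):
        yield number, dict(zip(POSITIONS, bound))

patterns = dict(enumerate_patterns())
```

Here `patterns[6]`, for example, has `s` and `o` bound and `g` and `p` as wildcards, matching row 6 above.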

##### Table Design

Cassandra constraint: A query is efficient only when it supplies the full
partition key and then restricts clustering columns in declaration order,
left to right. For g-wildcard queries, g must therefore be the last clustering
column. For g-specified queries, g in the partition key is more efficient.

**Two table families needed:**

**Family A: g-wildcard queries** (g in clustering columns)

| Table | Partition | Clustering | Supports patterns |
|-------|-----------|------------|-------------------|
| SPOG | (user, collection, s) | p, o, g | 5, 7, 8 |
| POSG | (user, collection, p) | o, s, g | 3, 4 |
| OSPG | (user, collection, o) | s, p, g | 2, 6 |

**Family B: g-specified queries** (g in partition key)

| Table | Partition | Clustering | Supports patterns |
|-------|-----------|------------|-------------------|
| GSPO | (user, collection, g, s) | p, o | 13, 15, 16 |
| GPOS | (user, collection, g, p) | o, s | 11, 12 |
| GOSP | (user, collection, g, o) | s, p | 10, 14 |

**Collection table** (for iteration and bulk deletion)

| Table | Partition | Clustering | Purpose |
|-------|-----------|------------|---------|
| COLL | (user, collection) | g, s, p, o | Enumerate all quads in a collection (pattern 1); g is the first clustering column, so it can also serve by-graph queries (pattern 9) |

Pattern 9 supplies only g, so it cannot use GSPO, whose partition key also
requires s.
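
A query planner could pick a table from the set of bound positions along the following lines. This is a sketch, not the implementation: `choose_table` is a hypothetical helper, and it routes the by-graph query (pattern 9) to COLL on the assumption that a query without s cannot address GSPO's `(user, collection, g, s)` partition key.

```python
def choose_table(g: bool, s: bool, p: bool, o: bool) -> str:
    """Pick a table for a query, given which positions are bound."""
    if not (s or p or o):
        return "COLL"                  # patterns 1 and 9: nothing but (maybe) g
    if s and (p or not o):
        order = "SPO"                  # s, s+p, s+p+o
    elif p:
        order = "POS"                  # p, p+o
    else:
        order = "OSP"                  # o, s+o
    # Family B puts g first (partition key); family A puts it last (clustering).
    return "G" + order if g else order + "G"
```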

##### Write and Delete Paths

**Write path**: Insert into all 7 tables.

**Delete collection path**:
1. Iterate COLL table for `(user, collection)`
2. For each quad, delete from all 6 query tables
3. Delete from COLL table (or range delete)

**Delete single quad path**: Delete from all 7 tables directly.
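
The fan-out on the write path can be sketched without any driver code by computing the key each table would receive. The `LAYOUT` mapping restates the table definitions above. `rows_for_quad` and the `DEFAULT_GRAPH` placeholder are assumptions for illustration: Cassandra primary key columns cannot be null, so some reserved non-null value has to stand in for the default graph.

```python
DEFAULT_GRAPH = ""   # assumed placeholder; any reserved non-null value would do

LAYOUT = {
    # table: (extra partition columns after user/collection, clustering columns)
    "SPOG": ("s", "pog"),
    "POSG": ("p", "osg"),
    "OSPG": ("o", "spg"),
    "GSPO": ("gs", "po"),
    "GPOS": ("gp", "os"),
    "GOSP": ("go", "sp"),
    "COLL": ("", "gspo"),
}

def rows_for_quad(user, collection, s, p, o, g=None):
    """Return {table: (partition_key, clustering_key)} for one quad."""
    quad = {"s": s, "p": p, "o": o, "g": DEFAULT_GRAPH if g is None else g}
    rows = {}
    for table, (part, clust) in LAYOUT.items():
        partition = (user, collection) + tuple(quad[c] for c in part)
        clustering = tuple(quad[c] for c in clust)
        rows[table] = (partition, clustering)
    return rows
```

Deleting a single quad would use the same seven keys, which is why the collection-delete path can rebuild every key from a COLL row.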

##### Storage Cost

Each quad is stored 7 times. This is the cost of flexible querying combined
with efficient collection deletion.

##### Quoted Triples in Storage

Subject or object can be a triple itself. Options:

**Option A: Serialize quoted triples to canonical string**
```
S: "<<http://ex/Alice|http://ex/knows|http://ex/Bob>>"
P: http://ex/discoveredOn
O: "2024-01-15"
G: null
```
- Store quoted triple as serialized string in S or O columns
- Query by exact match on serialized form
- Pro: Simple, fits existing index patterns
- Con: Can't query "find triples where quoted subject's predicate is X"

**Option B: Triple IDs / Hashes**
```
Triple table:
  id: hash(s,p,o,g)
  s, p, o, g: ...

Metadata table:
  subject_triple_id: <hash>
  p: http://ex/discoveredOn
  o: "2024-01-15"
```
- Assign each triple an ID (hash of components)
- Reification metadata references triples by ID
- Pro: Clean separation, can index triple IDs
- Con: Requires computing/managing triple identity, two-phase lookups

**Recommendation**: Start with Option A (serialized strings) for simplicity.
Option B may be needed if advanced query patterns over quoted triple
components are required.
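
Both options can be illustrated in a few lines. The sketch below follows the `<<s|p|o>>` layout shown for Option A and uses a SHA-256 digest for Option B. The function names, the choice of hash, and the length-prefix scheme are assumptions for this example.

```python
import hashlib

def quoted_triple_key(s: str, p: str, o: str) -> str:
    """Option A: canonical string form, following the <<s|p|o>> layout above."""
    return f"<<{s}|{p}|{o}>>"

def quoted_triple_id(s: str, p: str, o: str, g: str = "") -> str:
    """Option B: a stable identifier derived by hashing the components."""
    h = hashlib.sha256()
    for part in (s, p, o, g):
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(4, "big"))   # length prefix avoids ambiguous joins
        h.update(data)
    return h.hexdigest()
```

One caveat for Option A: a literal object that itself contains `|` or `>>` would need escaping to keep the serialized form unambiguous. The length prefix in the hash sidesteps the same problem for Option B.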

2. **Phase 2+: Other Backends**
   - Neo4j and other stores implemented in subsequent stages
   - Lessons learned from Cassandra inform these implementations

This approach de-risks the design by validating on a fully-controlled backend
before committing to implementations across all stores.

#### Value → Term Rename

The `Value` class will be renamed to `Term`. This affects ~78 files across
the codebase. The rename acts as a forcing function: any code still using
`Value` is immediately identifiable as needing review/update for 2.0
compatibility.

## Security Considerations

Named graphs are not a security feature. Users and collections remain the
security boundaries. Named graphs are purely for data organization and
reification support.

## Performance Considerations

- Quoted triples add nesting depth, which may slow serialization and queries
- Graph-scoped queries rely on the tables with g in the partition key (GSPO,
  GPOS, GOSP)
- Each quad is written to 7 Cassandra tables, so write volume and storage grow
  by roughly that factor

### Vector Store Boundary

Vector stores always reference IRIs only:
- Never edges (quoted triples)
- Never literal values
- Never blank nodes

This keeps the vector store simple - it handles semantic similarity of named
entities. The graph structure handles relationships, reification, and metadata.
Quoted triples and named graphs don't complicate vector operations.

## Testing Strategy

Use the existing test strategy. Because this is a breaking change, the
end-to-end test suite needs particular attention to validate that the new
structures work correctly across all components.

## Migration Plan

- 2.0 is a breaking release; no backward compatibility required
- Existing data may need migration to new schema (TBD based on final design)
- Consider migration tooling for converting existing triples

## Open Questions

- **Blank nodes**: Limited support confirmed. May need to decide on
  skolemization strategy (generate IRIs on load, or preserve blank node IDs).
- **Query syntax**: What is the concrete syntax for specifying quoted triples
  in queries? Need to define the query API.
- ~~**Predicate vocabulary**~~: Resolved. Any valid RDF predicates permitted,
  including custom user-defined. Minimal assumptions about RDF validity.
  Very few locked-in values (e.g., `rdfs:label` used in some places).
  Strategy: avoid locking anything in unless absolutely necessary.
- ~~**Vector store impact**~~: Resolved. Vector stores always point to IRIs
  only - never edges, literals, or blank nodes. Quoted triples and
  reification don't affect the vector store.
- ~~**Named graph semantics**~~: Resolved. Queries default to the default
  graph (matches SPARQL behavior, backward compatible). Explicit graph
  parameter required to query named graphs or all graphs.

## References

- [RDF 1.2 Concepts](https://www.w3.org/TR/rdf12-concepts/)
- [RDF-star and SPARQL-star](https://w3c.github.io/rdf-star/)
- [RDF Dataset](https://www.w3.org/TR/rdf11-concepts/#section-dataset)
Feat: TrustGraph i18n & Documentation Translation Updates (#781) Native CLI i18n: The TrustGraph CLI has built-in translation support that dynamically loads language strings. You can test and use different languages by simply passing the --lang flag (e.g., --lang es for Spanish, --lang ru for Russian) or by configuring your environment's LANG variable. Automated Docs Translations: This PR introduces autonomously translated Markdown documentation into several target languages, including Spanish, Swahili, Portuguese, Turkish, Hindi, Hebrew, Arabic, Simplified Chinese, and Russian. 2026-04-14 07:07:58 -04:00			`---`
			`layout: default`
			`title: "Graph Contexts Technical Specification"`
			`parent: "Tech Specs"`
			`---`

Merge 2.0 to master (#651) 2026-02-28 11:03:14 +00:00			`# Graph Contexts Technical Specification`

			`## Overview`

			`This specification describes changes to TrustGraph's core graph primitives to`
			`align with RDF 1.2 and support full RDF Dataset semantics. This is a breaking`
			`change for the 2.x release series.`

			`### Versioning`

			`- 2.0: Early adopter release. Core features available, may not be fully`
			`production-ready.`
			`- 2.1 / 2.2: Production release. Stability and completeness validated.`

			`Flexibility on maturity is intentional - early adopters can access new`
			`capabilities before all features are production-hardened.`

			`## Goals`

			`The primary goals for this work are to enable metadata about facts/statements:`

			`- Temporal information: Associate facts with time metadata`
			`- When a fact was believed to be true`
			`- When a fact became true`
			`- When a fact was discovered to be false`

			`- Provenance/Sources: Track which sources support a fact`
			`- "This fact was supported by source X"`
			`- Link facts back to their origin documents`

			`- Veracity/Trust: Record assertions about truth`
			`- "Person P asserted this was true"`
			`- "Person Q claims this is false"`
			`- Enable trust scoring and conflict detection`

			`Hypothesis: Reification (RDF-star / quoted triples) is the key mechanism`
			`to achieve these outcomes, as all require making statements about statements.`

			`## Background`

			`To express "the fact (Alice knows Bob) was discovered on 2024-01-15" or`
			`"source X supports the claim (Y causes Z)", you need to reference an edge`
			`as a thing you can make statements about. Standard triples don't support this.`

			`### Current Limitations`

			The current `Value` class in `trustgraph-base/trustgraph/schema/core/primitives.py`
			`can represent:`
			- URI nodes (`is_uri=True`)
			- Literal values (`is_uri=False`)

			The `type` field exists but is not used to represent XSD datatypes.

			`## Technical Design`

			`### RDF Features to Support`

			`#### Core Features (Related to Reification Goals)`

			`These features are directly related to the temporal, provenance, and veracity`
			`goals:`

			`1. RDF 1.2 Quoted Triples (RDF-star)`
			`- Edges that point at other edges`
			`- A Triple can appear as the subject or object of another Triple`
			`- Enables statements about statements (reification)`
			`- Core mechanism for annotating individual facts`

			`2. RDF Dataset / Named Graphs`
			`- Support for multiple named graphs within a dataset`
			`- Each graph identified by an IRI`
			`- Moves from triples (s, p, o) to quads (s, p, o, g)`
			`- Includes a default graph plus zero or more named graphs`
			`- The graph IRI can be a subject in statements, e.g.:`
			```
			`<graph-source-A> <discoveredOn> "2024-01-15"`
			`<graph-source-A> <hasVeracity> "high"`
			```
			`- Note: Named graphs are a separate feature from reification. They have`
			`uses beyond statement annotation (partitioning, access control, dataset`
			`organization) and should be treated as a distinct capability.`

			`3. Blank Nodes (Limited Support)`
			`- Anonymous nodes without a global URI`
			`- Supported for compatibility when loading external RDF data`
			`- Limited status: No guarantees about stable identity after loading`
			`- Find them via wildcard queries (match by connections, not by ID)`
			`- Not a first-class feature - don't rely on precise blank node handling`

			`#### Opportunistic Fixes (2.0 Breaking Change)`

			`These features are not directly related to the reification goals but are`
			`valuable improvements to include while making breaking changes:`

			`4. Literal Datatypes`
			- Properly use the `type` field for XSD datatypes
			`- Examples: xsd:string, xsd:integer, xsd:dateTime, etc.`
			`- Fixes current limitation: cannot represent dates or integers properly`

			`5. Language Tags`
			`- Support for language attributes on string literals (@en, @fr, etc.)`
			`- Note: A literal has either a language tag OR a datatype, not both`
			`(except for rdf:langString)`
			`- Important for AI/multilingual use cases`

			`### Data Models`

			`#### Term (rename from Value)`

			The `Value` class will be renamed to `Term` to better reflect RDF terminology.
			`This rename serves two purposes:`
			`1. Aligns naming with RDF concepts (a "Term" can be an IRI, literal, blank`
			`node, or quoted triple - not just a "value")`
			`2. Forces code review at the breaking change interface - any code still`
			referencing `Value` is visibly broken and needs updating

			`A Term can represent:`

			`- IRI/URI - A named node/resource`
			`- Blank Node - An anonymous node with local scope`
			`- Literal - A data value with either:`
			`- A datatype (XSD type), OR`
			`- A language tag`
			`- Quoted Triple - A triple used as a term (RDF 1.2)`

			`##### Chosen Approach: Single Class with Type Discriminator`

			`Serialization requirements drive the structure - a type discriminator is needed`
			`in the wire format regardless of the Python representation. A single class with`
			a type field is the natural fit and aligns with the current `Value` pattern.

			`Single-character type codes provide compact serialization:`

			```python
			`from dataclasses import dataclass`

			`# Term type constants`
			`IRI = "i" # IRI/URI node`
			`BLANK = "b" # Blank node`
			`LITERAL = "l" # Literal value`
			`TRIPLE = "t" # Quoted triple (RDF-star)`

			`@dataclass`
			`class Term:`
			`type: str = "" # One of: IRI, BLANK, LITERAL, TRIPLE`

			`# For IRI terms (type == IRI)`
			`iri: str = ""`

			`# For blank nodes (type == BLANK)`
			`id: str = ""`

			`# For literals (type == LITERAL)`
			`value: str = ""`
			`datatype: str = "" # XSD datatype URI (mutually exclusive with language)`
			`language: str = "" # Language tag (mutually exclusive with datatype)`

			`# For quoted triples (type == TRIPLE)`
			`triple: "Triple \| None" = None`
			```

			`Usage examples:`

			```python
			`# IRI term`
			`node = Term(type=IRI, iri="http://example.org/Alice")`

			`# Literal with datatype`
			`age = Term(type=LITERAL, value="42", datatype="xsd:integer")`

			`# Literal with language tag`
			`label = Term(type=LITERAL, value="Hello", language="en")`

			`# Blank node`
			`anon = Term(type=BLANK, id="_:b1")`

			`# Quoted triple (statement about a statement)`
			`inner = Triple(`
			`s=Term(type=IRI, iri="http://example.org/Alice"),`
			`p=Term(type=IRI, iri="http://example.org/knows"),`
			`o=Term(type=IRI, iri="http://example.org/Bob"),`
			`)`
			`reified = Term(type=TRIPLE, triple=inner)`
			```

			`##### Alternatives Considered`

			Option B: Union of specialized classes (`Term = IRI \| BlankNode \| Literal \| QuotedTriple`)
			`- Rejected: Serialization would still need a type discriminator, adding complexity`

			`Option C: Base class with subclasses`
			`- Rejected: Same serialization issue, plus dataclass inheritance quirks`

			`#### Triple / Quad`

			The `Triple` class gains an optional graph field to become a quad:

			```python
			`@dataclass`
			`class Triple:`
			`s: Term \| None = None # Subject`
			`p: Term \| None = None # Predicate`
			`o: Term \| None = None # Object`
			`g: str \| None = None # Graph name (IRI), None = default graph`
			```

			`Design decisions:`
			- Field name: `g` for consistency with `s`, `p`, `o`
			- Optional: `None` means the default graph (unnamed)
			`- Type: Plain string (IRI) rather than Term`
			`- Graph names are always IRIs`
			`- Blank nodes as graph names ruled out (too confusing)`
			`- No need for the full Term machinery`

			Note: The class name stays `Triple` even though it's technically a quad now.
			`This avoids churn and "triple" is still the common terminology for the s/p/o`
			`portion. The graph context is metadata about where the triple lives.`

			`### Candidate Query Patterns`

			`The current query engine accepts combinations of S, P, O terms. With quoted`
			`triples, a triple itself becomes a valid term in those positions. Below are`
			`candidate query patterns that support the original goals.`

			`#### Graph Parameter Semantics`

			`Following SPARQL conventions for backward compatibility:`

			- `g` omitted / None: Query the default graph only
			- `g` = specific IRI: Query that named graph only
			- *`g` = wildcard / ``**: Query across all graphs (equivalent to SPARQL
			`GRAPH ?g { ... }`)

			`This keeps simple queries simple and makes named graph queries opt-in.`

			`Cross-graph queries (g=wildcard) are fully supported. The Cassandra schema`
			`includes dedicated tables (SPOG, POSG, OSPG) where g is a clustering column`
			`rather than a partition key, enabling efficient queries across all graphs.`

			`#### Temporal Queries`

			`Find all facts discovered after a given date:`
			```
			`S: ? # any quoted triple`
			`P: <discoveredOn>`
			`O: > "2024-01-15"^^xsd:date # date comparison`
			```

			`Find when a specific fact was believed true:`
			```
			`S: << <Alice> <knows> <Bob> >> # quoted triple as subject`
			`P: <believedTrueFrom>`
			`O: ? # returns the date`
			```

			`Find facts that became false:`
			```
			`S: ? # any quoted triple`
			`P: <discoveredFalseOn>`
			`O: ? # has any value (exists)`
			```

			`#### Provenance Queries`

			`Find all facts supported by a specific source:`
			```
			`S: ? # any quoted triple`
			`P: <supportedBy>`
			`O: <source:document-123>`
			```

			`Find which sources support a specific fact:`
			```
			`S: << <DrugA> <treats> <DiseaseB> >> # quoted triple as subject`
			`P: <supportedBy>`
			`O: ? # returns source IRIs`
			```

			`#### Veracity Queries`

			`Find assertions a person marked as true:`
			```
			`S: ? # any quoted triple`
			`P: <assertedTrueBy>`
			`O: <person:Alice>`
			```

			`Find conflicting assertions (same fact, different veracity):`
			```
			`# First query: facts asserted true`
			`S: ?`
			`P: <assertedTrueBy>`
			`O: ?`

			`# Second query: facts asserted false`
			`S: ?`
			`P: <assertedFalseBy>`
			`O: ?`

			`# Application logic: find intersection of subjects`
			```

			`Find facts with trust score below threshold:`
			```
			`S: ? # any quoted triple`
			`P: <trustScore>`
			`O: < 0.5 # numeric comparison`
			```

			`### Architecture`

			`Significant changes required across multiple components:`

			`#### This Repository (trustgraph)`

			- Schema primitives (`trustgraph-base/trustgraph/schema/core/primitives.py`)
			`- Value → Term rename`
			`- New Term structure with type discriminator`
			- Triple gains `g` field for graph context

			- Message translators (`trustgraph-base/trustgraph/messaging/translators/`)
			`- Update for new Term/Triple structures`
			`- Serialization/deserialization for new fields`

			`- Gateway components`
			`- Handle new Term and quad structures`

			`- Knowledge cores`
			`- Core changes to support quads and reification`

			`- Knowledge manager`
			`- Schema changes propagate here`

			`- Storage layers`
			`- Cassandra: Schema redesign (see Implementation Details)`
			`- Other backends: Deferred to later phases`

			`- Command-line utilities`
			`- Update for new data structures`

			`- REST API documentation`
			`- OpenAPI spec updates`

			`#### External Repositories`

			`- Python API (this repo)`
			`- Client library updates for new structures`

			`- TypeScript APIs (separate repo)`
			`- Client library updates`

			`- Workbench (separate repo)`
			`- Significant state management changes`

			`### APIs`

			`#### REST API`

			`- Documented in OpenAPI spec`
			`- Will need updates for new Term/Triple structures`
			`- New endpoints may be needed for graph context operations`

			`#### Python API (this repo)`

			`- Client library changes to match new primitives`
			`- Breaking changes to Term (was Value) and Triple`

			`#### TypeScript API (separate repo)`

			`- Parallel changes to Python API`
			`- Separate release coordination`

			`#### Workbench (separate repo)`

			`- Significant state management changes`
			`- UI updates for graph context features`

### Implementation Details

#### Phased Storage Implementation

Multiple graph store backends exist (Cassandra, Neo4j, etc.). Implementation
will proceed in phases:

1. Phase 1: Cassandra
   - Start with the home-grown Cassandra store
   - Full control over the storage layer enables rapid iteration
   - Schema will be redesigned from scratch for quads + reification
   - Validate the data model and query patterns against real use cases

#### Cassandra Schema Design

Cassandra requires multiple tables to support different query access patterns
(each table efficiently queries by its partition key + clustering columns).

##### Query Patterns

With quads (g, s, p, o), each position can be specified or wildcard, giving
16 possible query patterns:

| # | g | s | p | o | Description |
|---|---|---|---|---|-------------|
| 1 | ? | ? | ? | ? | All quads |
| 2 | ? | ? | ? | o | By object |
| 3 | ? | ? | p | ? | By predicate |
| 4 | ? | ? | p | o | By predicate + object |
| 5 | ? | s | ? | ? | By subject |
| 6 | ? | s | ? | o | By subject + object |
| 7 | ? | s | p | ? | By subject + predicate |
| 8 | ? | s | p | o | Full triple (which graphs?) |
| 9 | g | ? | ? | ? | By graph |
| 10 | g | ? | ? | o | By graph + object |
| 11 | g | ? | p | ? | By graph + predicate |
| 12 | g | ? | p | o | By graph + predicate + object |
| 13 | g | s | ? | ? | By graph + subject |
| 14 | g | s | ? | o | By graph + subject + object |
| 15 | g | s | p | ? | By graph + subject + predicate |
| 16 | g | s | p | o | Exact quad |
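
The numbering in the table follows a binary count over the four positions, with
g as the most significant bit. A minimal sketch that regenerates the 16 rows is
shown here; the names `POSITIONS`, `enumerate_patterns`, and `render` are
illustrative only and are not part of the TrustGraph codebase.

```python
from itertools import product

# Position order matches the table columns: g, s, p, o.
POSITIONS = ("g", "s", "p", "o")


def enumerate_patterns():
    """Yield (number, pattern) pairs, where pattern maps position -> bound?"""
    # product((False, True), repeat=4) counts in binary with g as the
    # most significant bit, which reproduces the table's row order.
    for number, bound in enumerate(product((False, True), repeat=4), start=1):
        yield number, dict(zip(POSITIONS, bound))


def render(pattern):
    """Render a pattern the way the table does, with '?' for wildcards."""
    return " ".join(name if pattern[name] else "?" for name in POSITIONS)


patterns = dict(enumerate_patterns())

if __name__ == "__main__":
    for number, pattern in patterns.items():
        print(f"{number:2d}  {render(pattern)}")
```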

##### Table Design

**Cassandra constraint**: You can only efficiently query by partition key, then
filter on clustering columns left-to-right. For g-wildcard queries, g must be
a clustering column. For g-specified queries, g in the partition key is more
efficient.

**Two table families needed:**

**Family A: g-wildcard queries** (g in clustering columns)

| Table | Partition | Clustering | Supports patterns |
|-------|-----------|------------|-------------------|
| SPOG | (user, collection, s) | p, o, g | 5, 7, 8 |
| POSG | (user, collection, p) | o, s, g | 3, 4 |
| OSPG | (user, collection, o) | s, p, g | 2, 6 |

**Family B: g-specified queries** (g in partition key)

| Table | Partition | Clustering | Supports patterns |
|-------|-----------|------------|-------------------|
| GSPO | (user, collection, g, s) | p, o | 9, 13, 15, 16 |
| GPOS | (user, collection, g, p) | o, s | 11, 12 |
| GOSP | (user, collection, g, o) | s, p | 10, 14 |

**Collection table** (for iteration and bulk deletion)

| Table | Partition | Clustering | Purpose |
|-------|-----------|------------|---------|
| COLL | (user, collection) | g, s, p, o | Enumerate all quads in collection |
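
As an illustration of how a query layer might route a request to one of these
tables, here is a minimal sketch. It copies the pattern-to-table mapping from
the tables above; `TABLE_FOR_PATTERN` and `select_table` are hypothetical names,
and routing pattern 1 to COLL is an assumption based on that table's stated
purpose. Note that the table lists pattern 9 under GSPO, whose partition key
also contains s, so a graph-only lookup there would span partitions; an
implementation may prefer COLL (clustered by g first) for that case. The sketch
follows the table as written.

```python
# Keys are the set of bound positions; values are table names from the
# design above. frozenset("sp") is the set {"s", "p"}.
TABLE_FOR_PATTERN = {
    frozenset(): "COLL",         # pattern 1 (assumption: full enumeration)
    frozenset("o"): "OSPG",      # pattern 2
    frozenset("p"): "POSG",      # pattern 3
    frozenset("po"): "POSG",     # pattern 4
    frozenset("s"): "SPOG",      # pattern 5
    frozenset("so"): "OSPG",     # pattern 6
    frozenset("sp"): "SPOG",     # pattern 7
    frozenset("spo"): "SPOG",    # pattern 8
    frozenset("g"): "GSPO",      # pattern 9 (as listed in the table)
    frozenset("go"): "GOSP",     # pattern 10
    frozenset("gp"): "GPOS",     # pattern 11
    frozenset("gpo"): "GPOS",    # pattern 12
    frozenset("gs"): "GSPO",     # pattern 13
    frozenset("gso"): "GOSP",    # pattern 14
    frozenset("gsp"): "GSPO",    # pattern 15
    frozenset("gspo"): "GSPO",   # pattern 16
}


def select_table(g=None, s=None, p=None, o=None):
    """Return the table name for a query; None means 'wildcard' here.

    Assumption: a real implementation needs a separate sentinel for the
    default graph, since None is used for wildcards in this sketch.
    """
    bound = frozenset(
        name
        for name, value in (("g", g), ("s", s), ("p", p), ("o", o))
        if value is not None
    )
    return TABLE_FOR_PATTERN[bound]
```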

##### Write and Delete Paths

**Write path**: Insert into all 7 tables.

**Delete collection path**:
1. Iterate COLL table for `(user, collection)`
2. For each quad, delete from all 6 query tables
3. Delete from COLL table (or range delete)

**Delete single quad path**: Delete from all 7 tables directly.
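
The three paths can be sketched with an in-memory stand-in for the seven
tables. This is not Cassandra code: `InMemoryQuadStore` is a hypothetical class,
and every table is keyed only by `(user, collection)` for brevity, ignoring the
per-table partition and clustering layout described above.

```python
QUERY_TABLES = ("SPOG", "POSG", "OSPG", "GSPO", "GPOS", "GOSP")
ALL_TABLES = QUERY_TABLES + ("COLL",)


class InMemoryQuadStore:
    """Toy model of the 7-table layout; each table maps a key to a set of quads."""

    def __init__(self):
        self.tables = {name: {} for name in ALL_TABLES}

    def insert(self, user, collection, quad):
        # Write path: the quad goes into all 7 tables.
        for rows in self.tables.values():
            rows.setdefault((user, collection), set()).add(quad)

    def delete_quad(self, user, collection, quad):
        # Single-quad delete: remove from all 7 tables directly.
        for rows in self.tables.values():
            rows.get((user, collection), set()).discard(quad)

    def delete_collection(self, user, collection):
        key = (user, collection)
        # 1. Iterate the COLL table for (user, collection).
        quads = list(self.tables["COLL"].get(key, ()))
        # 2. For each quad, delete from the 6 query tables.
        for quad in quads:
            for name in QUERY_TABLES:
                self.tables[name].get(key, set()).discard(quad)
        # 3. Drop the COLL entry in one step (the "range delete").
        self.tables["COLL"].pop(key, None)

    def count(self, table, user, collection):
        return len(self.tables[table].get((user, collection), ()))
```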

##### Storage Cost

Each quad is stored 7 times. This is the cost of flexible querying combined
with efficient collection deletion.

##### Quoted Triples in Storage

Subject or object can be a triple itself. Options:

**Option A: Serialize quoted triples to canonical string**
```
S: "<<http://ex/Alice|http://ex/knows|http://ex/Bob>>"
P: http://ex/discoveredOn
O: "2024-01-15"
G: null
```
- Store quoted triple as serialized string in S or O columns
- Query by exact match on serialized form
- Pro: Simple, fits existing index patterns
- Con: Can't query "find triples where quoted subject's predicate is X"

**Option B: Triple IDs / Hashes**
```
Triple table:
  id: hash(s,p,o,g)
  s, p, o, g: ...

Metadata table:
  subject_triple_id: <hash>
  p: http://ex/discoveredOn
  o: "2024-01-15"
```
- Assign each triple an ID (hash of components)
- Reification metadata references triples by ID
- Pro: Clean separation, can index triple IDs
- Con: Requires computing/managing triple identity, two-phase lookups

**Recommendation**: Start with Option A (serialized strings) for simplicity.
Option B may be needed if advanced query patterns over quoted triple
components are required.
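
To make the two options concrete, here is a minimal sketch of both. The
`<<s|p|o>>` shape is taken from the Option A example; the function names, the
choice of SHA-256, and the unit-separator delimiter are assumptions made for
illustration. The sketch does not address escaping of `|` inside terms or nested
quoted triples, which a real canonical form would need to define.

```python
import hashlib


def serialize_quoted(s, p, o):
    """Option A: canonical string for a quoted triple (no escaping handled)."""
    return f"<<{s}|{p}|{o}>>"


def triple_id(s, p, o, g=None):
    """Option B: stable ID derived from the quad's components.

    Assumption: components are joined with the ASCII unit separator so that
    ("ab", "c") and ("a", "bc") cannot collide; the default graph is encoded
    as an empty string.
    """
    payload = "\x1f".join([s, p, o, g or ""])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Option A row: the quoted triple is stored as a plain string in the S column.
row = {
    "s": serialize_quoted("http://ex/Alice", "http://ex/knows", "http://ex/Bob"),
    "p": "http://ex/discoveredOn",
    "o": "2024-01-15",
    "g": None,
}
```

With Option A the serialized string can be looked up by exact match through the
existing indexes; with Option B the metadata row would carry the output of
`triple_id` instead, at the cost of a second lookup to resolve it.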

2. Phase 2+: Other Backends
   - Neo4j and other stores implemented in subsequent stages
   - Lessons learned from Cassandra inform these implementations

This approach de-risks the design by validating on a fully-controlled backend
before committing to implementations across all stores.

#### Value → Term Rename

The `Value` class will be renamed to `Term`. This affects ~78 files across
the codebase. The rename acts as a forcing function: any code still using
`Value` is immediately identifiable as needing review/update for 2.0
compatibility.
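
One way to list the code that still needs review is a simple scan for the old
name. The sketch below is a hypothetical helper, not an existing TrustGraph
tool; it matches `Value` as a whole word, so identifiers such as `ValueError`
are not reported.

```python
import re
from pathlib import Path

# Whole-word match: "Value" is reported, "ValueError" and "MyValue" are not.
VALUE_USAGE = re.compile(r"\bValue\b")


def find_value_usages(root):
    """Return (path, line number, line text) for each remaining use of Value."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if VALUE_USAGE.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```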

## Security Considerations

Named graphs are not a security feature. Users and collections remain the
security boundaries. Named graphs are purely for data organization and
reification support.

## Performance Considerations

- Quoted triples add nesting depth - may impact query performance
- Named graph indexing strategies needed for efficient graph-scoped queries
- Cassandra schema design will need to accommodate quad storage efficiently

### Vector Store Boundary

Vector stores always reference IRIs only:
- Never edges (quoted triples)
- Never literal values
- Never blank nodes

This keeps the vector store simple - it handles semantic similarity of named
entities. The graph structure handles relationships, reification, and metadata.
Quoted triples and named graphs don't complicate vector operations.
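
The boundary amounts to a filter applied before anything reaches the vector
store. A minimal sketch follows, assuming a `Term` with a string `type`
discriminator; the discriminator values (`"iri"`, `"literal"`, `"blank"`,
`"triple"`) and the function name are hypothetical and may not match the final
schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    # Hypothetical discriminator values: "iri", "literal", "blank", "triple".
    type: str
    value: str


def embeddable_terms(terms):
    """Keep only IRI terms; quoted triples, literals, and blank nodes are dropped."""
    return [term for term in terms if term.type == "iri"]


candidates = [
    Term("iri", "http://ex/Alice"),
    Term("literal", "2024-01-15"),
    Term("blank", "_:b0"),
    Term("triple", "<<http://ex/Alice|http://ex/knows|http://ex/Bob>>"),
]
```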

## Testing Strategy

Use the existing test strategy. Because this is a breaking change, focus
heavily on the end-to-end test suite to validate that the new structures work
correctly across all components.

## Migration Plan

- 2.0 is a breaking release; no backward compatibility required
- Existing data may need migration to new schema (TBD based on final design)
- Consider migration tooling for converting existing triples

## Open Questions

- **Blank nodes**: Limited support confirmed. May need to decide on
  skolemization strategy (generate IRIs on load, or preserve blank node IDs).
- **Query syntax**: What is the concrete syntax for specifying quoted triples
  in queries? Need to define the query API.
- ~~**Predicate vocabulary**~~: Resolved. Any valid RDF predicates permitted,
  including custom user-defined. Minimal assumptions about RDF validity.
  Very few locked-in values (e.g., `rdfs:label` used in some places).
  Strategy: avoid locking anything in unless absolutely necessary.
- ~~**Vector store impact**~~: Resolved. Vector stores always point to IRIs
  only - never edges, literals, or blank nodes. Quoted triples and
  reification don't affect the vector store.
- ~~**Named graph semantics**~~: Resolved. Queries default to the default
  graph (matches SPARQL behavior, backward compatible). Explicit graph
  parameter required to query named graphs or all graphs.
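
The resolved named graph semantics can be illustrated with a small in-memory
query function. This is a sketch under assumptions: the default graph is
represented as `None` (in line with `G: null` in the storage example), and
`ALL_GRAPHS` and `query_quads` are hypothetical names, not the eventual query
API.

```python
DEFAULT_GRAPH = None    # assumption: default graph has no graph name
ALL_GRAPHS = object()   # sentinel meaning "search every graph"


def query_quads(quads, s=None, p=None, o=None, graph=DEFAULT_GRAPH):
    """Match (g, s, p, o) tuples; s, p, o set to None act as wildcards.

    With no graph argument, only the default graph is searched. A named
    graph or ALL_GRAPHS must be requested explicitly.
    """
    matches = []
    for g, qs, qp, qo in quads:
        if graph is not ALL_GRAPHS and g != graph:
            continue
        if (s is None or qs == s) and (p is None or qp == p) and (o is None or qo == o):
            matches.append((g, qs, qp, qo))
    return matches


quads = [
    (None, "http://ex/Alice", "http://ex/knows", "http://ex/Bob"),
    ("http://ex/g1", "http://ex/Alice", "http://ex/knows", "http://ex/Carol"),
    ("http://ex/g2", "http://ex/Dave", "http://ex/knows", "http://ex/Erin"),
]
```

Calling `query_quads(quads, s="http://ex/Alice")` only sees the default-graph
quad, which is what keeps existing triple-only callers working unchanged.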

## References

- [RDF 1.2 Concepts](https://www.w3.org/TR/rdf12-concepts/)
- [RDF-star and SPARQL-star](https://w3c.github.io/rdf-star/)
- [RDF Dataset](https://www.w3.org/TR/rdf11-concepts/#section-dataset)