trustgraph/tests/contract
cybermaggedon c23e28aa66
Fix Metadata/EntityEmbeddings schema migration tail and add regression tests (#777)
The Metadata dataclass dropped its `metadata: list[Triple]` field
and EntityEmbeddings/ChunkEmbeddings settled on a singular
`vector: list[float]` field, but several call sites kept passing
`Metadata(metadata=...)` and `EntityEmbeddings(vectors=...)`. The
bugs were latent until a websocket client first hit
`/api/v1/flow/default/import/entity-contexts`, at which point the
dispatcher TypeError'd on construction.

Production fixes (5 call sites on the same migration tail):

  * trustgraph-flow gateway dispatchers entity_contexts_import.py
    and graph_embeddings_import.py — drop the stale
    Metadata(metadata=...) kwarg; switch graph_embeddings_import
    to the singular `vector` wire key.
  * trustgraph-base messaging translators knowledge.py and
    document_loading.py — fix decode side to read the singular
    `"vector"` key, matching what their own encode sides have
    always written.
  * trustgraph-flow tables/knowledge.py — fix Cassandra row
    deserialiser to construct EntityEmbeddings(vector=...)
    instead of vectors=.
  * trustgraph-flow gateway core_import/core_export — switch the
    kg-core msgpack wire format to the singular `"v"`/`"vector"`
    key and drop the dead `m["m"]` envelope field that referenced
    the removed Metadata.metadata triples list (it was a
    guaranteed KeyError on the export side).

Defense-in-depth regression coverage (32 new tests across 7 files):

  * tests/contract/test_schema_field_contracts.py — pin the field
    set of Metadata, EntityEmbeddings, ChunkEmbeddings,
    EntityContext so any future schema rename fails CI loudly
    with a clear diff.
  * tests/unit/test_translators/test_knowledge_translator_roundtrip.py
    and test_document_embeddings_translator_roundtrip.py -
    encode→decode round-trip the affected translators end to end,
    locking in the singular `"vector"` wire key.
  * tests/unit/test_gateway/test_entity_contexts_import_dispatcher.py
    and test_graph_embeddings_import_dispatcher.py — exercise the
    websocket dispatchers' receive() path with realistic
    payloads, the direct regression test for the original
    production crash.
  * tests/unit/test_gateway/test_core_import_export_roundtrip.py
    — pack/unpack the kg-core msgpack format through the real
    dispatcher classes (with KnowledgeRequestor mocked),
    including a full export→import round-trip.
  * tests/unit/test_tables/test_knowledge_table_store.py —
    exercise the Cassandra row → schema conversion via __new__ to
    bypass the live cluster connection.

Also fixes an unrelated leaked-coroutine RuntimeWarning in
test_gateway/test_service.py::test_run_method_calls_web_run_app: the
mocked aiohttp.web.run_app now closes the coroutine that Api.run() hands
it, mirroring what the real run_app would do, instead of leaving it for
the GC to complain about.
2026-04-10 20:43:45 +01:00
__init__.py Extending test coverage (#434) 2025-07-14 17:54:04 +01:00
conftest.py Add multi-pattern orchestrator with plan-then-execute and supervisor (#739) 2026-03-31 00:32:49 +01:00
README.md Extending test coverage (#434) 2025-07-14 17:54:04 +01:00
test_document_embeddings_contract.py Pub/sub abstraction: decouple from Pulsar (#751) 2026-04-01 20:16:53 +01:00
test_message_contracts.py Add multi-pattern orchestrator with plan-then-execute and supervisor (#739) 2026-03-31 00:32:49 +01:00
test_orchestrator_contracts.py Update tests for agent-orchestrator (#745) 2026-03-31 13:12:26 +01:00
test_provenance_wire_format.py Update tests for agent-orchestrator (#745) 2026-03-31 13:12:26 +01:00
test_rows_cassandra_contracts.py Remove redundant metadata (#685) 2026-03-11 10:51:39 +00:00
test_rows_graphql_query_contracts.py Structured data 2 (#645) 2026-02-23 15:56:29 +00:00
test_schema_field_contracts.py Fix Metadata/EntityEmbeddings schema migration tail and add regression tests (#777) 2026-04-10 20:43:45 +01:00
test_structured_data_contracts.py Remove redundant metadata (#685) 2026-03-11 10:51:39 +00:00
test_translator_completion_flags.py Pub/sub abstraction: decouple from Pulsar (#751) 2026-04-01 20:16:53 +01:00

Contract Tests for TrustGraph

This directory contains contract tests that verify service interface contracts, message schemas, and API compatibility across the TrustGraph microservices architecture.

Overview

Contract tests ensure that:

  • Message schemas remain compatible across service versions
  • API interfaces stay stable for consumers
  • Service communication contracts are maintained
  • Schema evolution doesn't break existing integrations

Test Categories

1. Pulsar Message Schema Contracts (test_message_contracts.py)

Tests the contracts for all Pulsar message schemas used in TrustGraph service communication.

Coverage:

  • Text Completion Messages: TextCompletionRequest, TextCompletionResponse
  • Document RAG Messages: DocumentRagQuery, DocumentRagResponse
  • Agent Messages: AgentRequest, AgentResponse, AgentStep
  • Graph Messages: Chunk, Triple, Triples, EntityContext
  • Common Messages: Metadata, Value, Error schemas
  • Message Routing: Properties, correlation IDs, routing keys
  • Schema Evolution: Backward/forward compatibility testing
  • Serialization: Schema validation and data integrity

Key Features:

  • Schema Validation: Ensures all message schemas accept valid data and reject invalid data
  • Field Contracts: Validates required vs optional fields and type constraints
  • Nested Schema Support: Tests complex schemas with embedded objects and arrays
  • Routing Contracts: Validates message properties and routing conventions
  • Evolution Testing: Backward compatibility and schema versioning support
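
As a rough illustration of the field-contract idea, the sketch below pins the field set of a schema class so that a rename fails with a readable diff. `ExampleEntityEmbeddings`, `field_names` and `check_field_contract` are hypothetical names invented for this example; the real classes live in trustgraph.schema and may have different fields.

```python
from dataclasses import dataclass, field, fields


# Hypothetical stand-in for a schema class; not the real TrustGraph definition.
@dataclass
class ExampleEntityEmbeddings:
    entity: str = ""
    vector: list[float] = field(default_factory=list)


def field_names(schema_cls) -> set[str]:
    """Return the set of dataclass field names declared on schema_cls."""
    return {f.name for f in fields(schema_cls)}


def check_field_contract(schema_cls, expected: set[str]) -> None:
    """Raise AssertionError with a readable diff if the field set drifted."""
    actual = field_names(schema_cls)
    missing = expected - actual
    unexpected = actual - expected
    assert not missing and not unexpected, (
        f"{schema_cls.__name__} field drift: "
        f"missing={sorted(missing)} unexpected={sorted(unexpected)}"
    )


# Passes while the class still declares exactly these two fields.
check_field_contract(ExampleEntityEmbeddings, {"entity", "vector"})
```

Pinning the whole set, rather than checking one field at a time, means both a removed field and an unexpectedly added one show up in the same failure message.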

Running Contract Tests

Run All Contract Tests

pytest tests/contract/ -m contract

Run Specific Contract Test Categories

# Message schema contracts
pytest tests/contract/test_message_contracts.py -v

# Specific test class
pytest tests/contract/test_message_contracts.py::TestTextCompletionMessageContracts -v

# Schema evolution tests
pytest tests/contract/test_message_contracts.py::TestSchemaEvolutionContracts -v

Run with Coverage

pytest tests/contract/ -m contract --cov=trustgraph.schema --cov-report=html

Contract Test Patterns

1. Schema Validation Pattern

@pytest.mark.contract
def test_schema_contract(self, sample_message_data):
    """Test that schema accepts valid data and rejects invalid data"""
    # Arrange
    valid_data = sample_message_data["SchemaName"]
    
    # Act & Assert
    assert validate_schema_contract(SchemaClass, valid_data)
    
    # Test field constraints
    instance = SchemaClass(**valid_data)
    assert hasattr(instance, 'required_field')
    assert isinstance(instance.required_field, expected_type)
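
The pattern above relies on a `validate_schema_contract` helper whose implementation is not shown here. The following is only a guess at what such a helper might do, written against a hypothetical `ExampleRequest` dataclass; the name `validate_schema_contract_sketch` is chosen to make clear it is not the project's own function.

```python
from dataclasses import dataclass


# Hypothetical schema used only for this illustration.
@dataclass
class ExampleRequest:
    system: str
    prompt: str


def validate_schema_contract_sketch(schema_cls, data: dict) -> bool:
    """Return True if schema_cls can be built from data and every supplied
    key reads back unchanged; return False if construction fails."""
    try:
        instance = schema_cls(**data)
    except (TypeError, ValueError):
        return False
    return all(getattr(instance, key) == value for key, value in data.items())


valid = {"system": "You are helpful.", "prompt": "Hello"}
stale = {"system": "You are helpful.", "prompts": "Hello"}  # renamed key

print(validate_schema_contract_sketch(ExampleRequest, valid))  # True
print(validate_schema_contract_sketch(ExampleRequest, stale))  # False
```

The `stale` payload fails because a dataclass constructor raises TypeError on an unknown keyword argument, which is the kind of failure a stale kwarg at a call site produces.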

2. Serialization Contract Pattern

@pytest.mark.contract  
def test_serialization_contract(self, sample_message_data):
    """Test schema serialization/deserialization contracts"""
    # Arrange
    data = sample_message_data["SchemaName"]
    
    # Act & Assert
    assert serialize_deserialize_test(SchemaClass, data)
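
The `serialize_deserialize_test` helper used above is likewise not defined in this README. A minimal sketch of a round-trip check might look like the following, assuming dataclass schemas and using JSON purely as a stand-in wire encoding (the real transport may encode messages differently). `ExampleChunkEmbeddings` and `roundtrip_sketch` are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass, field


# Hypothetical schema used only for this illustration.
@dataclass
class ExampleChunkEmbeddings:
    chunk_id: str = ""
    vector: list[float] = field(default_factory=list)


def roundtrip_sketch(schema_cls, data: dict) -> bool:
    """Encode an instance to JSON, decode it again, and compare for equality."""
    original = schema_cls(**data)
    wire = json.dumps(asdict(original))        # encode side
    restored = schema_cls(**json.loads(wire))  # decode side
    return restored == original


sample = {"chunk_id": "c1", "vector": [0.1, 0.2, 0.3]}
assert roundtrip_sketch(ExampleChunkEmbeddings, sample)
```

Because the decode side rebuilds the object from the keys the encode side wrote, any mismatch between the two (for example `vector` on one side and `vectors` on the other) makes the round trip fail instead of passing silently.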

3. Evolution Contract Pattern

@pytest.mark.contract
def test_backward_compatibility_contract(self, schema_evolution_data):
    """Test that new schema versions accept old data formats"""
    # Arrange
    old_version_data = schema_evolution_data["SchemaName_v1"]
    
    # Act - Should work with current schema
    instance = CurrentSchema(**old_version_data)
    
    # Assert - Required fields maintained
    assert instance.required_field == expected_value
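
To make the evolution pattern concrete, here is a small self-contained sketch in which a newer schema version adds an optional field with a default, so a payload written against the older version still constructs. `ExampleQueryV2` and the default value of 20 are invented for the example and are not taken from the TrustGraph schemas.

```python
from dataclasses import dataclass


# Hypothetical "version 2" schema: doc_limit was added after version 1.
@dataclass
class ExampleQueryV2:
    query: str
    collection: str
    doc_limit: int = 20  # assumed default; keeps version 1 payloads valid


# A payload shaped like the older schema, with no doc_limit key.
old_version_data = {"query": "What is X?", "collection": "default"}

instance = ExampleQueryV2(**old_version_data)

# Fields present in the old format are preserved unchanged.
assert instance.query == "What is X?"
assert instance.collection == "default"
# The newly added field falls back to its default.
assert instance.doc_limit == 20
```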

Schema Registry

The contract tests maintain a registry of all TrustGraph schemas:

schema_registry = {
    # Text Completion
    "TextCompletionRequest": TextCompletionRequest,
    "TextCompletionResponse": TextCompletionResponse,
    
    # Document RAG  
    "DocumentRagQuery": DocumentRagQuery,
    "DocumentRagResponse": DocumentRagResponse,
    
    # Agent
    "AgentRequest": AgentRequest,
    "AgentResponse": AgentResponse,
    
    # Graph/Knowledge
    "Chunk": Chunk,
    "Triple": Triple,
    "Triples": Triples,
    "Value": Value,
    
    # Common
    "Metadata": Metadata,
    "Error": Error,
}
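
A registry like this can also be checked for internal consistency. The sketch below assumes the registered schemas are dataclasses and uses two hypothetical classes in place of the real ones; `registry_problems` is an illustrative helper, not part of conftest.py.

```python
from dataclasses import dataclass, is_dataclass


# Hypothetical stand-ins for registered schema classes.
@dataclass
class ExampleError:
    type: str = ""
    message: str = ""


@dataclass
class ExampleValue:
    value: str = ""
    is_uri: bool = False


example_registry = {
    "ExampleError": ExampleError,
    "ExampleValue": ExampleValue,
}


def registry_problems(registry: dict) -> list[str]:
    """List entries whose key differs from the class name or that are not dataclasses."""
    problems = []
    for name, cls in registry.items():
        if cls.__name__ != name:
            problems.append(f"{name}: key does not match class name {cls.__name__}")
        if not is_dataclass(cls):
            problems.append(f"{name}: not a dataclass")
    return problems


assert registry_problems(example_registry) == []
```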

Message Contract Specifications

Text Completion Service Contract

TextCompletionRequest:
  required_fields: [system, prompt]
  field_types:
    system: string
    prompt: string

TextCompletionResponse:
  required_fields: [error, response, model]  
  field_types:
    error: Error | null
    response: string | null
    in_token: integer | null
    out_token: integer | null
    model: string

Document RAG Service Contract

DocumentRagQuery:
  required_fields: [query, user, collection]
  field_types:
    query: string
    user: string
    collection: string
    doc_limit: integer

DocumentRagResponse:
  required_fields: [error, response]
  field_types:
    error: Error | null
    response: string | null

Agent Service Contract

AgentRequest:
  required_fields: [question, history]
  field_types:
    question: string
    plan: string
    state: string
    history: Array<AgentStep>

AgentResponse:
  required_fields: [error]
  field_types:
    answer: string | null
    error: Error | null
    thought: string | null
    observation: string | null
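
One way to check a specification like the ones above against a schema class is to compare its `required_fields` list with the fields that have no default. This sketch assumes that "required" means "declared without a default", which is an interpretation rather than something the specifications state; `ExampleCompletionRequest`, its `streaming` field and `required_field_names` are hypothetical.

```python
from dataclasses import MISSING, dataclass, fields


# Hypothetical schema; "streaming" is an invented optional field.
@dataclass
class ExampleCompletionRequest:
    system: str
    prompt: str
    streaming: bool = False


def required_field_names(schema_cls) -> set[str]:
    """Return names of fields that have neither a default nor a default factory."""
    return {
        f.name
        for f in fields(schema_cls)
        if f.default is MISSING and f.default_factory is MISSING
    }


# Specification written in the same shape as the contracts above.
spec = {"required_fields": ["system", "prompt"]}

assert required_field_names(ExampleCompletionRequest) == set(spec["required_fields"])
```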

Best Practices

Contract Test Design

  1. Test Both Valid and Invalid Data: Ensure schemas accept valid data and reject invalid data
  2. Verify Field Constraints: Test type constraints, required vs optional fields
  3. Test Nested Schemas: Validate complex objects with embedded schemas
  4. Test Array Fields: Ensure array serialization maintains order and content
  5. Test Optional Fields: Verify optional field handling in serialization

Schema Evolution

  1. Backward Compatibility: New schema versions must accept old message formats
  2. Required Field Stability: Required fields should never become optional or be removed
  3. Additive Changes: New fields should be optional to maintain compatibility
  4. Deprecation Strategy: Plan a deprecation path for schema changes
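
Rules 2 and 3 above can be expressed as a simple comparison between two versions of a schema. The sketch below describes each version as a mapping from field name to a flag saying whether the field is required. `evolution_violations` and the sample field names are hypothetical, and removal of an optional field is deliberately left unflagged because the rules above do not cover it.

```python
def evolution_violations(old: dict[str, bool], new: dict[str, bool]) -> list[str]:
    """Compare two schema versions (field name -> True if required) and
    list violations of the required-field and additive-change rules."""
    violations = []
    for name, was_required in old.items():
        if name not in new:
            if was_required:
                violations.append(f"required field removed: {name}")
        elif was_required and not new[name]:
            violations.append(f"required field became optional: {name}")
    for name, is_required in new.items():
        if name not in old and is_required:
            violations.append(f"new field is required: {name}")
    return violations


v1 = {"query": True, "doc_limit": False}
v2_ok = {"query": True, "doc_limit": False, "collection": False}
v2_bad = {"doc_limit": False, "collection": True}

assert evolution_violations(v1, v2_ok) == []
print(evolution_violations(v1, v2_bad))
```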

Error Handling

  1. Error Schema Consistency: All error responses use a consistent Error schema
  2. Error Type Contracts: Error types follow naming conventions
  3. Error Message Format: Error messages provide actionable information

Adding New Contract Tests

When adding new message schemas or modifying existing ones:

  1. Add to Schema Registry: Update conftest.py schema registry
  2. Add Sample Data: Create valid sample data in conftest.py
  3. Create Contract Tests: Follow existing patterns for validation
  4. Test Evolution: Add backward compatibility tests
  5. Update Documentation: Document schema contracts in this README

Integration with CI/CD

Contract tests should be run:

  • On every commit to detect breaking changes early
  • Before releases to ensure API stability
  • On schema changes to validate compatibility
  • On dependency updates to catch breaking changes

# CI/CD pipeline command
pytest tests/contract/ -m contract --junitxml=contract-test-results.xml

Contract Test Results

Contract tests provide:

  • Schema Compatibility Reports: Which schemas pass/fail validation
  • Breaking Change Detection: Identifies contract violations
  • Evolution Validation: Confirms backward compatibility
  • Field Constraint Verification: Validates data type contracts

This ensures that TrustGraph services can evolve independently while maintaining stable, compatible interfaces for all service communication.