# Integration Test Pattern for TrustGraph

This directory contains integration tests that verify the coordination between multiple TrustGraph services and components, following the patterns outlined in TEST_STRATEGY.md.

## Integration Test Approach

Integration tests focus on service-to-service communication patterns and end-to-end message flows while still using mocks for external infrastructure.

### Key Principles

  1. Test Service Coordination: Verify that services work together correctly
  2. Mock External Dependencies: Use mocks for databases, APIs, and infrastructure
  3. Real Business Logic: Exercise actual service logic and data transformations
  4. Error Propagation: Test how errors flow through the system
  5. Configuration Testing: Verify services respond correctly to different configurations

## Test Structure

### Fixtures (conftest.py)

Common fixtures for integration tests:

  • mock_pulsar_client: Mock Pulsar messaging client
  • mock_flow_context: Mock flow context for service coordination
  • integration_config: Standard configuration for integration tests
  • sample_documents: Test document collections
  • sample_embeddings: Test embedding vectors
  • sample_queries: Test query sets
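As a hedged sketch of what such fixtures might look like in `conftest.py` — the mocked method names (`send`, `subscribe`) and sample values are invented for illustration, not TrustGraph's actual interfaces:

```python
from unittest.mock import AsyncMock

import pytest


def make_mock_pulsar_client():
    """Factory form, so tests can also build a fresh mock inline."""
    client = AsyncMock()
    client.send.return_value = None              # publish succeeds silently
    client.subscribe.return_value = AsyncMock()  # consumer handle
    return client


@pytest.fixture
def mock_pulsar_client():
    return make_mock_pulsar_client()


@pytest.fixture
def sample_embeddings():
    # Small, deterministic vectors keep assertions reproducible
    return [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
```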

## Test Patterns

### 1. End-to-End Flow Testing

```python
@pytest.mark.integration
@pytest.mark.asyncio
async def test_service_end_to_end_flow(self, service_instance, mock_clients):
    """Test complete service pipeline from input to output"""
    # Arrange - Set up realistic test data
    # Act - Execute the full service workflow
    # Assert - Verify coordination between all components
```
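The skeleton above can be fleshed out into a runnable sketch. The `DocumentRagService` class and its client method names below are invented for illustration; real TrustGraph services will differ:

```python
from unittest.mock import AsyncMock

import pytest


class DocumentRagService:
    """Toy stand-in for a real TrustGraph service (illustrative only)."""

    def __init__(self, embeddings_client, doc_store):
        self.embeddings_client = embeddings_client
        self.doc_store = doc_store

    async def query(self, question):
        # Real business logic runs here; only the I/O clients are mocked
        vector = await self.embeddings_client.embed(question)
        return await self.doc_store.search(vector, limit=3)


@pytest.mark.integration
@pytest.mark.asyncio
async def test_service_end_to_end_flow():
    # Arrange - mock the external dependencies with realistic responses
    embeddings = AsyncMock()
    embeddings.embed.return_value = [0.1, 0.2, 0.3]
    docs = AsyncMock()
    docs.search.return_value = ["doc-a", "doc-b"]
    service = DocumentRagService(embeddings, docs)

    # Act - execute the full workflow
    result = await service.query("what is a knowledge graph?")

    # Assert - verify coordination between the components
    embeddings.embed.assert_awaited_once_with("what is a knowledge graph?")
    docs.search.assert_awaited_once_with([0.1, 0.2, 0.3], limit=3)
    assert result == ["doc-a", "doc-b"]
```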

### 2. Error Propagation Testing

```python
@pytest.mark.integration
@pytest.mark.asyncio
async def test_service_error_handling(self, service_instance, mock_clients):
    """Test how errors propagate through service coordination"""
    # Arrange - Set up failure scenarios
    # Act - Execute service with failing dependency
    # Assert - Verify proper error handling and cleanup
```
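A minimal concrete version of this pattern, using `side_effect` to inject a failure. The `EmbeddingsUnavailable` exception and pipeline shape are invented for illustration:

```python
from unittest.mock import AsyncMock

import pytest


class EmbeddingsUnavailable(Exception):
    """Hypothetical failure type, for illustration only."""


@pytest.mark.integration
@pytest.mark.asyncio
async def test_service_error_handling():
    # Arrange - make the embeddings dependency fail
    embeddings = AsyncMock()
    embeddings.embed.side_effect = EmbeddingsUnavailable("connection refused")
    docs = AsyncMock()

    async def run_pipeline(question):
        vector = await embeddings.embed(question)
        return await docs.search(vector)

    # Act / Assert - the failure surfaces to the caller...
    with pytest.raises(EmbeddingsUnavailable):
        await run_pipeline("anything")

    # ...and the downstream dependency is never reached
    docs.search.assert_not_awaited()
```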

### 3. Configuration Testing

```python
@pytest.mark.integration
@pytest.mark.asyncio
async def test_service_configuration_scenarios(self, service_instance):
    """Test service behavior with different configurations"""
    # Test multiple configuration scenarios
    # Verify service adapts correctly to each configuration
```
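Multiple configuration scenarios are usually expressed with `pytest.mark.parametrize`. In this sketch the `doc_limit` setting and retrieval helper are invented for illustration:

```python
from unittest.mock import AsyncMock

import pytest


@pytest.mark.integration
@pytest.mark.asyncio
@pytest.mark.parametrize("doc_limit", [1, 5, 20])
async def test_service_configuration_scenarios(doc_limit):
    # Arrange - a store that returns more hits than any configured limit
    docs = AsyncMock()
    docs.search.return_value = [f"doc-{i}" for i in range(50)]

    async def retrieve(query, limit):
        hits = await docs.search(query)
        return hits[:limit]  # the service applies its configured limit

    # Act
    result = await retrieve("q", doc_limit)

    # Assert - the configuration is honoured
    assert len(result) == doc_limit
```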

## Running Integration Tests

### Run All Integration Tests

```bash
pytest tests/integration/ -m integration
```

### Run a Specific Test

```bash
pytest tests/integration/test_document_rag_integration.py::TestDocumentRagIntegration::test_document_rag_end_to_end_flow -v
```

### Run with Coverage (Skip Coverage Requirement)

```bash
pytest tests/integration/ -m integration --cov=trustgraph --cov-fail-under=0
```

### Run Slow Tests

```bash
pytest tests/integration/ -m "integration and slow"
```

### Skip Slow Tests

```bash
pytest tests/integration/ -m "integration and not slow"
```

## Examples: Integration Test Implementations

### 1. Document RAG Integration Test

`test_document_rag_integration.py` demonstrates the integration test pattern:

#### What It Tests

  • Service Coordination: Embeddings → Document Retrieval → Prompt Generation
  • Error Handling: Failure scenarios for each service dependency
  • Configuration: Different document limits, users, and collections
  • Performance: Large document set handling

#### Key Features

  • Realistic Data Flow: Uses actual service logic with mocked dependencies
  • Multiple Scenarios: Success, failure, and edge cases
  • Verbose Logging: Tests logging functionality
  • Multi-User Support: Tests user and collection isolation

#### Test Coverage

  • End-to-end happy path
  • No documents found scenario
  • Service failure scenarios (embeddings, documents, prompt)
  • Configuration variations
  • Multi-user isolation
  • Performance testing
  • Verbose logging

### 2. Text Completion Integration Test

`test_text_completion_integration.py` demonstrates external API integration testing:

#### What It Tests

  • External API Integration: OpenAI API connectivity and authentication
  • Rate Limiting: Proper handling of API rate limits and retries
  • Error Handling: API failures, connection timeouts, and error propagation
  • Token Tracking: Accurate input/output token counting and metrics
  • Configuration: Different model parameters and settings
  • Concurrency: Multiple simultaneous API requests

#### Key Features

  • Realistic Mock Responses: Uses actual OpenAI API response structures
  • Authentication Testing: API key validation and base URL configuration
  • Error Scenarios: Rate limits, connection failures, invalid requests
  • Performance Metrics: Timing and token usage validation
  • Model Flexibility: Tests different GPT models and parameters

#### Test Coverage

  • Successful text completion generation
  • Multiple model configurations (GPT-3.5, GPT-4, GPT-4-turbo)
  • Rate limit handling (RateLimitError → TooManyRequests)
  • API error handling and propagation
  • Token counting accuracy
  • Prompt construction and parameter validation
  • Authentication patterns and API key validation
  • Concurrent request processing
  • Response content extraction and validation
  • Performance timing measurements
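The rate-limit coverage item above (RateLimitError → TooManyRequests) amounts to translating a provider-specific error into the service's own retryable type. A hedged sketch, where both exception classes are stand-ins rather than the real OpenAI SDK or TrustGraph types:

```python
from unittest.mock import AsyncMock

import pytest


class RateLimitError(Exception):
    """Stand-in for the provider SDK's rate-limit exception."""


class TooManyRequests(Exception):
    """Stand-in for the service's retryable-error type."""


async def complete(client, prompt):
    # Translate provider-specific rate limiting into a uniform
    # retryable exception so callers can back off consistently
    try:
        return await client.create(prompt=prompt)
    except RateLimitError as e:
        raise TooManyRequests(str(e)) from e


@pytest.mark.integration
@pytest.mark.asyncio
async def test_rate_limit_is_translated():
    client = AsyncMock()
    client.create.side_effect = RateLimitError("429")
    with pytest.raises(TooManyRequests):
        await complete(client, "hello")
```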

### 3. Agent Manager Integration Test

`test_agent_manager_integration.py` demonstrates complex service coordination testing:

#### What It Tests

  • ReAct Pattern: Think-Act-Observe cycles with multi-step reasoning
  • Tool Coordination: Selection and execution of different tools (knowledge query, text completion, MCP tools)
  • Conversation State: Management of conversation history and context
  • Multi-Service Integration: Coordination between prompt, graph RAG, and tool services
  • Error Handling: Tool failures, unknown tools, and error propagation
  • Configuration Management: Dynamic tool loading and configuration

#### Key Features

  • Complex Coordination: Tests agent reasoning with multiple tool options
  • Stateful Processing: Maintains conversation history across interactions
  • Dynamic Tool Selection: Tests tool selection based on context and reasoning
  • Callback Pattern: Tests think/observe callback mechanisms
  • JSON Serialization: Handles complex data structures in prompts
  • Performance Testing: Large conversation history handling

#### Test Coverage

  • Basic reasoning cycle with tool selection
  • Final answer generation (ending ReAct cycle)
  • Full ReAct cycle with tool execution
  • Conversation history management
  • Multiple tool coordination and selection
  • Tool argument validation and processing
  • Error handling (unknown tools, execution failures)
  • Context integration and additional prompting
  • Empty tool configuration handling
  • Tool response processing and cleanup
  • Performance with large conversation history
  • JSON serialization in complex prompts
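A Think-Act-Observe cycle with mocked dependencies can be sketched minimally as follows. The `react_step` helper, its JSON decision format, and the tool names are all invented for illustration; the actual agent manager's interfaces will differ:

```python
import json
from unittest.mock import AsyncMock

import pytest


async def react_step(llm, tools, history):
    """One Think-Act-Observe cycle (simplified sketch)."""
    # Think: ask the LLM what to do next, given the history so far
    decision = json.loads(await llm.complete(json.dumps(history)))
    if decision["action"] == "final-answer":
        return decision["answer"]
    # Act: run the selected tool; Observe: record the result
    observation = await tools[decision["action"]](decision["args"])
    history.append({"action": decision["action"], "observation": observation})
    return None


@pytest.mark.integration
@pytest.mark.asyncio
async def test_react_tool_then_answer():
    llm = AsyncMock()
    llm.complete.side_effect = [
        '{"action": "knowledge-query", "args": "cats"}',
        '{"action": "final-answer", "answer": "42"}',
    ]
    tools = {"knowledge-query": AsyncMock(return_value="cats are mammals")}
    history = []
    # First cycle selects a tool and records the observation
    assert await react_step(llm, tools, history) is None
    assert history[0]["observation"] == "cats are mammals"
    # Second cycle produces the final answer, ending the loop
    assert await react_step(llm, tools, history) == "42"
```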

### 4. Knowledge Graph Extract → Store Pipeline Integration Test

`test_kg_extract_store_integration.py` demonstrates multi-stage pipeline testing:

#### What It Tests

  • Text-to-Graph Transformation: Complete pipeline from text chunks to graph triples
  • Entity Extraction: Definition extraction with proper URI generation
  • Relationship Extraction: Subject-predicate-object relationship extraction
  • Graph Database Integration: Storage coordination with Cassandra knowledge store
  • Data Validation: Entity filtering, validation, and consistency checks
  • Pipeline Coordination: Multi-stage processing with proper data flow

#### Key Features

  • Multi-Stage Pipeline: Tests definitions → relationships → storage coordination
  • Graph Data Structures: RDF triples, entity contexts, and graph embeddings
  • URI Generation: Consistent entity URI creation across pipeline stages
  • Data Transformation: Complex text analysis to structured graph data
  • Batch Processing: Large document set processing performance
  • Error Resilience: Graceful handling of extraction failures

#### Test Coverage

  • Definitions extraction pipeline (text → entities + definitions)
  • Relationships extraction pipeline (text → subject-predicate-object)
  • URI generation consistency between processors
  • Triple generation from definitions and relationships
  • Knowledge store integration (triples and embeddings storage)
  • End-to-end pipeline coordination
  • Error handling in extraction services
  • Empty and invalid extraction results handling
  • Entity filtering and validation
  • Large batch processing performance
  • Metadata propagation through pipeline stages
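URI generation consistency between processors can be illustrated with a toy helper: as long as both processors derive URIs from the same deterministic function, the same entity label always maps to the same node. The `example.org` namespace and function names are placeholders, not TrustGraph's real scheme:

```python
from urllib.parse import quote

BASE = "http://example.org/entity/"  # placeholder namespace


def entity_uri(label):
    # Deterministic slugging keeps the same entity mapping to the same
    # URI in both the definitions and relationships processors
    return BASE + quote(label.lower().replace(" ", "-"))


def make_triple(subject, predicate, obj):
    # An RDF-style (subject, predicate, object) triple of URIs
    return (entity_uri(subject), entity_uri(predicate), entity_uri(obj))
```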

## Best Practices

### Test Organization

  • Group related tests in classes
  • Use descriptive test names that explain the scenario
  • Follow the Arrange-Act-Assert pattern
  • Use appropriate pytest markers (@pytest.mark.integration, @pytest.mark.slow)

### Mock Strategy

  • Mock external services (databases, APIs, message brokers)
  • Use real service logic and data transformations
  • Create realistic mock responses that match actual service behavior
  • Reset mocks between tests to ensure isolation
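Resetting mocks between tests matters most when a mock is shared at module scope; with function-scoped fixtures a fresh mock per test makes the reset implicit. A sketch of the explicit form (names invented for illustration):

```python
from unittest.mock import AsyncMock

# A mock shared across several tests (module scope)
shared_client = AsyncMock()


def reset_shared_mocks():
    # Call between tests (e.g. from an autouse fixture) so one test's
    # recorded calls and configured side effects never leak into the next
    shared_client.reset_mock(return_value=True, side_effect=True)
```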

### Test Data

  • Use realistic test data that reflects actual usage patterns
  • Create reusable fixtures for common test scenarios
  • Test with various data sizes and edge cases
  • Include both success and failure scenarios

### Error Testing

  • Test each dependency failure scenario
  • Verify proper error propagation and cleanup
  • Test timeout and retry mechanisms
  • Validate error response formats

### Performance Testing

  • Mark performance tests with @pytest.mark.slow
  • Test with realistic data volumes
  • Set reasonable performance expectations
  • Monitor resource usage during tests

## Adding New Integration Tests

  1. Identify Service Dependencies: Map out which services your target service coordinates with
  2. Create Mock Fixtures: Set up mocks for each dependency in conftest.py
  3. Design Test Scenarios: Plan happy path, error cases, and edge conditions
  4. Implement Tests: Follow the established patterns in this directory
  5. Add Documentation: Update this README with your new test patterns

## Test Markers

  • @pytest.mark.integration: Marks tests as integration tests
  • @pytest.mark.slow: Marks tests that take longer to run
  • @pytest.mark.asyncio: Required for async test functions
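Custom markers like `integration` and `slow` are typically registered in the pytest configuration so that `pytest --strict-markers` accepts them. A sketch assuming `pyproject.toml`; the project may use `pytest.ini` or `setup.cfg` instead (`asyncio` is registered automatically by the pytest-asyncio plugin):

```toml
[tool.pytest.ini_options]
markers = [
    "integration: integration tests coordinating multiple services",
    "slow: longer-running tests",
]
```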

## Future Enhancements

  • Add tests with real test containers for database integration
  • Implement contract testing for service interfaces
  • Add performance benchmarking for critical paths
  • Create integration test templates for common service patterns