Integration Test Pattern for TrustGraph
This directory contains integration tests that verify the coordination between multiple TrustGraph services and components, following the patterns outlined in TEST_STRATEGY.md.
Integration Test Approach
Integration tests focus on service-to-service communication patterns and end-to-end message flows while still using mocks for external infrastructure.
Key Principles
- Test Service Coordination: Verify that services work together correctly
- Mock External Dependencies: Use mocks for databases, APIs, and infrastructure
- Real Business Logic: Exercise actual service logic and data transformations
- Error Propagation: Test how errors flow through the system
- Configuration Testing: Verify services respond correctly to different configurations
Test Structure
Fixtures (conftest.py)
Common fixtures for integration tests:
- mock_pulsar_client: Mock Pulsar messaging client
- mock_flow_context: Mock flow context for service coordination
- integration_config: Standard configuration for integration tests
- sample_documents: Test document collections
- sample_embeddings: Test embedding vectors
- sample_queries: Test query sets
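As a rough sketch of what such fixtures might provide, the factories below build the same kinds of objects; the names mirror the fixture list, but the shapes (config keys, sample text) are illustrative, not the actual conftest.py contents:

```python
# Hypothetical factories mirroring the conftest.py fixtures listed above.
# Field names and sample values are assumptions, not the real fixture data.
from unittest.mock import AsyncMock, MagicMock

def make_mock_pulsar_client():
    """Mock Pulsar messaging client: producer/consumer calls are AsyncMocks."""
    client = MagicMock()
    client.create_producer.return_value = AsyncMock()
    client.subscribe.return_value = AsyncMock()
    return client

def make_integration_config():
    """Standard configuration shared across integration tests."""
    return {"user": "test-user", "collection": "test-collection", "doc-limit": 10}

def make_sample_documents():
    """Small, realistic document collection for retrieval tests."""
    return ["TrustGraph is a knowledge platform.",
            "Embeddings map text to vectors."]
```

In a real conftest.py these would be wrapped in @pytest.fixture functions so each test gets a fresh, isolated instance.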
Test Patterns
1. End-to-End Flow Testing
@pytest.mark.integration
@pytest.mark.asyncio
async def test_service_end_to_end_flow(self, service_instance, mock_clients):
"""Test complete service pipeline from input to output"""
# Arrange - Set up realistic test data
# Act - Execute the full service workflow
# Assert - Verify coordination between all components
2. Error Propagation Testing
@pytest.mark.integration
@pytest.mark.asyncio
async def test_service_error_handling(self, service_instance, mock_clients):
"""Test how errors propagate through service coordination"""
# Arrange - Set up failure scenarios
# Act - Execute service with failing dependency
# Assert - Verify proper error handling and cleanup
3. Configuration Testing
@pytest.mark.integration
@pytest.mark.asyncio
async def test_service_configuration_scenarios(self, service_instance):
"""Test service behavior with different configurations"""
# Test multiple configuration scenarios
# Verify service adapts correctly to each configuration
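To make the error-propagation pattern concrete, here is a minimal, self-contained sketch; the service and client names are illustrative stand-ins, not the real TrustGraph classes:

```python
# Sketch of pattern 2: a failing upstream dependency, with assertions that
# the error propagates and downstream clients are never invoked.
# DocumentRagService here is a stand-in, not a real TrustGraph class.
import asyncio
from unittest.mock import AsyncMock

class DocumentRagService:
    def __init__(self, embeddings_client, doc_client):
        self.embeddings = embeddings_client
        self.docs = doc_client

    async def query(self, text):
        vec = await self.embeddings.embed(text)
        return await self.docs.retrieve(vec)

async def run_error_scenario():
    # Arrange - embeddings dependency fails
    embeddings = AsyncMock()
    embeddings.embed.side_effect = RuntimeError("embeddings service down")
    docs = AsyncMock()
    service = DocumentRagService(embeddings, docs)
    # Act / Assert - error surfaces, downstream client stays untouched
    try:
        await service.query("what is trustgraph?")
    except RuntimeError as exc:
        assert docs.retrieve.call_count == 0
        return str(exc)
```

In the real suite this body would sit inside an async test method carrying the @pytest.mark.integration and @pytest.mark.asyncio markers shown above.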
Running Integration Tests
Run All Integration Tests
pytest tests/integration/ -m integration
Run Specific Test
pytest tests/integration/test_document_rag_integration.py::TestDocumentRagIntegration::test_document_rag_end_to_end_flow -v
Run with Coverage (Minimum Coverage Threshold Disabled)
pytest tests/integration/ -m integration --cov=trustgraph --cov-fail-under=0
Run Slow Tests
pytest tests/integration/ -m "integration and slow"
Skip Slow Tests
pytest tests/integration/ -m "integration and not slow"
Examples: Integration Test Implementations
1. Document RAG Integration Test
The test_document_rag_integration.py file demonstrates the integration test pattern:
What It Tests
- Service Coordination: Embeddings → Document Retrieval → Prompt Generation
- Error Handling: Failure scenarios for each service dependency
- Configuration: Different document limits, users, and collections
- Performance: Large document set handling
Key Features
- Realistic Data Flow: Uses actual service logic with mocked dependencies
- Multiple Scenarios: Success, failure, and edge cases
- Verbose Logging: Tests logging functionality
- Multi-User Support: Tests user and collection isolation
Test Coverage
- ✅ End-to-end happy path
- ✅ No documents found scenario
- ✅ Service failure scenarios (embeddings, documents, prompt)
- ✅ Configuration variations
- ✅ Multi-user isolation
- ✅ Performance testing
- ✅ Verbose logging
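The coordination chain this test exercises (embeddings → document retrieval → prompt generation) can be sketched as follows; the function and client names are assumptions for illustration, not the production API:

```python
# Sketch of the Document RAG coordination: each stage's output feeds the
# next, and the test asserts the handoff. Names are illustrative.
import asyncio
from unittest.mock import AsyncMock

async def document_rag_flow(embeddings, retriever, prompter, question):
    vec = await embeddings.embed(question)                  # stage 1
    docs = await retriever.query(vec, limit=5)              # stage 2
    return await prompter.complete(question=question, documents=docs)  # stage 3

async def demo():
    embeddings = AsyncMock(); embeddings.embed.return_value = [0.1, 0.2]
    retriever = AsyncMock(); retriever.query.return_value = ["doc-1", "doc-2"]
    prompter = AsyncMock(); prompter.complete.return_value = "answer"
    answer = await document_rag_flow(embeddings, retriever, prompter, "q?")
    # Verify coordination: the retriever received exactly the embedding
    # produced upstream, not just "was called at some point"
    retriever.query.assert_awaited_once_with([0.1, 0.2], limit=5)
    return answer
```

The key assertion style is worth noting: integration tests check *what was passed between* services, not just each service in isolation.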
2. Text Completion Integration Test
The test_text_completion_integration.py file demonstrates external API integration testing:
What It Tests
- External API Integration: OpenAI API connectivity and authentication
- Rate Limiting: Proper handling of API rate limits and retries
- Error Handling: API failures, connection timeouts, and error propagation
- Token Tracking: Accurate input/output token counting and metrics
- Configuration: Different model parameters and settings
- Concurrency: Multiple simultaneous API requests
Key Features
- Realistic Mock Responses: Uses actual OpenAI API response structures
- Authentication Testing: API key validation and base URL configuration
- Error Scenarios: Rate limits, connection failures, invalid requests
- Performance Metrics: Timing and token usage validation
- Model Flexibility: Tests different GPT models and parameters
Test Coverage
- ✅ Successful text completion generation
- ✅ Multiple model configurations (GPT-3.5, GPT-4, GPT-4-turbo)
- ✅ Rate limit handling (RateLimitError → TooManyRequests)
- ✅ API error handling and propagation
- ✅ Token counting accuracy
- ✅ Prompt construction and parameter validation
- ✅ Authentication patterns and API key validation
- ✅ Concurrent request processing
- ✅ Response content extraction and validation
- ✅ Performance timing measurements
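The rate-limit translation (RateLimitError → TooManyRequests) can be sketched as below; both exception classes are local stand-ins here, since the real ones live in the OpenAI SDK and the TrustGraph gateway respectively:

```python
# Sketch of the rate-limit mapping the tests exercise: catch the provider's
# rate-limit error and re-raise the gateway's error. Both exception classes
# are stand-ins for the real SDK / TrustGraph types.
class RateLimitError(Exception):
    pass

class TooManyRequests(Exception):
    pass

def complete_with_rate_limit_mapping(api_call, prompt):
    """Call the provider; translate its rate-limit error to the gateway's."""
    try:
        return api_call(prompt)
    except RateLimitError as exc:
        raise TooManyRequests(str(exc)) from exc
```

A test then injects a mock api_call that raises RateLimitError and asserts that TooManyRequests (and only that) reaches the caller.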
3. Agent Manager Integration Test
The test_agent_manager_integration.py file demonstrates complex service coordination testing:
What It Tests
- ReAct Pattern: Think-Act-Observe cycles with multi-step reasoning
- Tool Coordination: Selection and execution of different tools (knowledge query, text completion, MCP tools)
- Conversation State: Management of conversation history and context
- Multi-Service Integration: Coordination between prompt, graph RAG, and tool services
- Error Handling: Tool failures, unknown tools, and error propagation
- Configuration Management: Dynamic tool loading and configuration
Key Features
- Complex Coordination: Tests agent reasoning with multiple tool options
- Stateful Processing: Maintains conversation history across interactions
- Dynamic Tool Selection: Tests tool selection based on context and reasoning
- Callback Pattern: Tests think/observe callback mechanisms
- JSON Serialization: Handles complex data structures in prompts
- Performance Testing: Large conversation history handling
Test Coverage
- ✅ Basic reasoning cycle with tool selection
- ✅ Final answer generation (ending ReAct cycle)
- ✅ Full ReAct cycle with tool execution
- ✅ Conversation history management
- ✅ Multiple tool coordination and selection
- ✅ Tool argument validation and processing
- ✅ Error handling (unknown tools, execution failures)
- ✅ Context integration and additional prompting
- ✅ Empty tool configuration handling
- ✅ Tool response processing and cleanup
- ✅ Performance with large conversation history
- ✅ JSON serialization in complex prompts
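The Think-Act-Observe cycle with callbacks can be sketched as a small loop; the reasoner/tool interfaces and dictionary keys below are assumptions for illustration, not the real agent-manager-react API:

```python
# Sketch of a ReAct cycle with think/observe callbacks. The step dictionary
# shape ("thought", "action", "final-answer") is illustrative only.
import asyncio
from unittest.mock import AsyncMock

async def react_cycle(reasoner, tools, question, on_think, on_observe,
                      max_steps=5):
    history = []
    for _ in range(max_steps):
        step = await reasoner.next_step(question, history)
        on_think(step["thought"])                                # Think
        if "final-answer" in step:
            return step["final-answer"]                          # cycle ends
        observation = await tools[step["action"]].invoke(**step["arguments"])
        on_observe(observation)                                  # Observe
        history.append((step, observation))                      # feeds next Think
    raise RuntimeError("agent exceeded max reasoning steps")

async def demo():
    reasoner = AsyncMock()
    reasoner.next_step.side_effect = [
        {"thought": "need a lookup", "action": "knowledge-query",
         "arguments": {"query": "capital of France"}},
        {"thought": "I can answer now", "final-answer": "Paris"},
    ]
    tools = {"knowledge-query": AsyncMock()}
    tools["knowledge-query"].invoke.return_value = "Paris is the capital of France."
    thoughts, observations = [], []
    answer = await react_cycle(reasoner, tools, "What is the capital of France?",
                               thoughts.append, observations.append)
    return answer, thoughts, observations
```

Using side_effect with a list is what lets a single mock drive a multi-step cycle: each reasoning step pops the next canned response.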
4. Knowledge Graph Extract → Store Pipeline Integration Test
The test_kg_extract_store_integration.py file demonstrates multi-stage pipeline testing:
What It Tests
- Text-to-Graph Transformation: Complete pipeline from text chunks to graph triples
- Entity Extraction: Definition extraction with proper URI generation
- Relationship Extraction: Subject-predicate-object relationship extraction
- Graph Database Integration: Storage coordination with Cassandra knowledge store
- Data Validation: Entity filtering, validation, and consistency checks
- Pipeline Coordination: Multi-stage processing with proper data flow
Key Features
- Multi-Stage Pipeline: Tests definitions → relationships → storage coordination
- Graph Data Structures: RDF triples, entity contexts, and graph embeddings
- URI Generation: Consistent entity URI creation across pipeline stages
- Data Transformation: Complex text analysis to structured graph data
- Batch Processing: Large document set processing performance
- Error Resilience: Graceful handling of extraction failures
Test Coverage
- ✅ Definitions extraction pipeline (text → entities + definitions)
- ✅ Relationships extraction pipeline (text → subject-predicate-object)
- ✅ URI generation consistency between processors
- ✅ Triple generation from definitions and relationships
- ✅ Knowledge store integration (triples and embeddings storage)
- ✅ End-to-end pipeline coordination
- ✅ Error handling in extraction services
- ✅ Empty and invalid extraction results handling
- ✅ Entity filtering and validation
- ✅ Large batch processing performance
- ✅ Metadata propagation through pipeline stages
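A rough sketch of the definitions → triples stage is shown below; the URI base and predicate choices are illustrative, not the production vocabulary, but the shape (consistent entity URIs, one label triple and one definition triple per entity) matches what the tests check:

```python
# Hypothetical sketch of the definitions -> triples stage. The URI base
# and predicate URIs are assumptions, not TrustGraph's real vocabulary.
from urllib.parse import quote

def entity_uri(label):
    """Deterministic URI generation so both processors agree on entity IDs."""
    return "http://trustgraph.ai/e/" + quote(label.lower().replace(" ", "-"))

def definitions_to_triples(definitions):
    """Turn (entity, definition) pairs into RDF-style (s, p, o) triples."""
    triples = []
    for entity, definition in definitions:
        uri = entity_uri(entity)
        triples.append((uri, "http://www.w3.org/2000/01/rdf-schema#label",
                        entity))
        triples.append((uri, "http://www.w3.org/2004/02/skos/core#definition",
                        definition))
    return triples
```

The URI-consistency tests above boil down to asserting that entity_uri (or its real counterpart) returns identical URIs when called from the definitions processor and the relationships processor.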
Best Practices
Test Organization
- Group related tests in classes
- Use descriptive test names that explain the scenario
- Follow the Arrange-Act-Assert pattern
- Use appropriate pytest markers (@pytest.mark.integration, @pytest.mark.slow)
Mock Strategy
- Mock external services (databases, APIs, message brokers)
- Use real service logic and data transformations
- Create realistic mock responses that match actual service behavior
- Reset mocks between tests to ensure isolation
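The "reset mocks between tests" point can be illustrated with stdlib mocks; the client name and response shape here are assumptions:

```python
# Demonstrates mock isolation: realistic canned responses, and reset_mock()
# so call history from one test cannot leak into the next.
import asyncio
from unittest.mock import AsyncMock

def make_service_client():
    # Mock response shaped like a real service reply, not a bare string
    return AsyncMock(return_value={"status": "ok", "body": "response"})

client = make_service_client()
asyncio.run(client("ping"))     # first "test" uses the mock...
client.reset_mock()             # ...then a fixture teardown wipes its history
```

In practice the reset happens automatically when fixtures are function-scoped, since each test gets a freshly constructed mock.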
Test Data
- Use realistic test data that reflects actual usage patterns
- Create reusable fixtures for common test scenarios
- Test with various data sizes and edge cases
- Include both success and failure scenarios
Error Testing
- Test each dependency failure scenario
- Verify proper error propagation and cleanup
- Test timeout and retry mechanisms
- Validate error response formats
Performance Testing
- Mark performance tests with @pytest.mark.slow
- Test with realistic data volumes
- Set reasonable performance expectations
- Monitor resource usage during tests
Adding New Integration Tests
- Identify Service Dependencies: Map out which services your target service coordinates with
- Create Mock Fixtures: Set up mocks for each dependency in conftest.py
- Design Test Scenarios: Plan happy path, error cases, and edge conditions
- Implement Tests: Follow the established patterns in this directory
- Add Documentation: Update this README with your new test patterns
Test Markers
- @pytest.mark.integration: Marks tests as integration tests
- @pytest.mark.slow: Marks tests that take longer to run
- @pytest.mark.asyncio: Required for async test functions
Future Enhancements
- Add tests with real test containers for database integration
- Implement contract testing for service interfaces
- Add performance benchmarking for critical paths
- Create integration test templates for common service patterns