# Integration Test Pattern for TrustGraph
This directory contains integration tests that verify the coordination between multiple TrustGraph services and components, following the patterns outlined in [TEST_STRATEGY.md](../../TEST_STRATEGY.md).
## Integration Test Approach
Integration tests focus on **service-to-service communication patterns** and **end-to-end message flows** while still using mocks for external infrastructure.
### Key Principles
1. **Test Service Coordination**: Verify that services work together correctly
2. **Mock External Dependencies**: Use mocks for databases, APIs, and infrastructure
3. **Real Business Logic**: Exercise actual service logic and data transformations
4. **Error Propagation**: Test how errors flow through the system
5. **Configuration Testing**: Verify services respond correctly to different configurations
## Test Structure
### Fixtures (conftest.py)
Common fixtures for integration tests:
- `mock_pulsar_client`: Mock Pulsar messaging client
- `mock_flow_context`: Mock flow context for service coordination
- `integration_config`: Standard configuration for integration tests
- `sample_documents`: Test document collections
- `sample_embeddings`: Test embedding vectors
- `sample_queries`: Test query sets
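As an illustration, fixtures along these lines might be defined. The mock shapes and configuration keys here are assumptions for illustration; see `conftest.py` for the real definitions.

```python
# Hypothetical sketch of two of the fixtures above; the real versions live
# in conftest.py and may differ in shape.
from unittest.mock import AsyncMock, MagicMock

import pytest


def make_pulsar_client_mock():
    """Build a Pulsar client mock whose producer/consumer ops can be awaited."""
    client = MagicMock()
    client.create_producer.return_value = AsyncMock()
    client.subscribe.return_value = AsyncMock()
    return client


@pytest.fixture
def mock_pulsar_client():
    return make_pulsar_client_mock()


@pytest.fixture
def integration_config():
    # Minimal configuration; the real keys depend on the service under test
    return {"user": "test-user", "collection": "test-collection"}
```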
### Test Patterns
#### 1. End-to-End Flow Testing
```python
@pytest.mark.integration
@pytest.mark.asyncio
async def test_service_end_to_end_flow(self, service_instance, mock_clients):
    """Test complete service pipeline from input to output"""
    # Arrange - Set up realistic test data
    # Act - Execute the full service workflow
    # Assert - Verify coordination between all components
```
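A filled-in version of that skeleton might look like the following. The service class and dependency names are toy stand-ins; the real services coordinate over Pulsar messages rather than direct method calls.

```python
import asyncio
from unittest.mock import AsyncMock


class ToyRagService:
    """Toy stand-in for a RAG service: embed, retrieve, then prompt."""

    def __init__(self, embeddings, documents, prompt):
        self.embeddings = embeddings
        self.documents = documents
        self.prompt = prompt

    async def query(self, text):
        vec = await self.embeddings.embed(text)
        docs = await self.documents.search(vec)
        return await self.prompt.complete(text, docs)


async def test_end_to_end_flow():
    # Arrange - set up realistic data on each mocked dependency
    embeddings = AsyncMock()
    embeddings.embed.return_value = [0.1, 0.2, 0.3]
    documents = AsyncMock()
    documents.search.return_value = ["doc-1", "doc-2"]
    prompt = AsyncMock()
    prompt.complete.return_value = "answer"

    # Act - execute the full service workflow
    service = ToyRagService(embeddings, documents, prompt)
    result = await service.query("what is TrustGraph?")

    # Assert - verify coordination between all components
    embeddings.embed.assert_awaited_once_with("what is TrustGraph?")
    documents.search.assert_awaited_once_with([0.1, 0.2, 0.3])
    prompt.complete.assert_awaited_once_with("what is TrustGraph?", ["doc-1", "doc-2"])
    assert result == "answer"
```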
#### 2. Error Propagation Testing
```python
@pytest.mark.integration
@pytest.mark.asyncio
async def test_service_error_handling(self, service_instance, mock_clients):
    """Test how errors propagate through service coordination"""
    # Arrange - Set up failure scenarios
    # Act - Execute service with failing dependency
    # Assert - Verify proper error handling and cleanup
```
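Filled in, the failure scenario typically injects an exception with `side_effect` and then checks that downstream dependencies were never touched. The pipeline function below is a toy stand-in for the service under test.

```python
import asyncio
from unittest.mock import AsyncMock


async def toy_pipeline(embeddings, documents, prompt, text):
    # Toy stand-in: each stage awaits the previous stage's output
    vec = await embeddings.embed(text)
    docs = await documents.search(vec)
    return await prompt.complete(text, docs)


async def test_embeddings_failure_propagates():
    # Arrange - the embeddings dependency fails
    embeddings = AsyncMock()
    embeddings.embed.side_effect = RuntimeError("embeddings unavailable")
    documents, prompt = AsyncMock(), AsyncMock()

    # Act - execute with the failing dependency
    try:
        await toy_pipeline(embeddings, documents, prompt, "query")
        raise AssertionError("expected RuntimeError")
    except RuntimeError:
        pass

    # Assert - the failure stopped the pipeline before downstream calls
    documents.search.assert_not_awaited()
    prompt.complete.assert_not_awaited()
```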
#### 3. Configuration Testing
```python
@pytest.mark.integration
@pytest.mark.asyncio
async def test_service_configuration_scenarios(self, service_instance):
    """Test service behavior with different configurations"""
    # Test multiple configuration scenarios
    # Verify service adapts correctly to each configuration
```
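One common way to cover several configurations is `pytest.mark.parametrize`. A sketch with a toy document-limit check follows; the `doc-limit` config key and the helper are assumptions, not the real service's interface.

```python
import pytest


def apply_doc_limit(docs, config):
    # Toy stand-in for retrieval logic that honours a configured limit
    return docs[: config["doc-limit"]]


@pytest.mark.integration
@pytest.mark.parametrize("doc_limit", [1, 10, 100])
def test_doc_limit_respected(doc_limit):
    docs = [f"doc-{i}" for i in range(500)]
    result = apply_doc_limit(docs, {"doc-limit": doc_limit})
    assert len(result) == doc_limit
```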
## Running Integration Tests
### Run All Integration Tests
```bash
pytest tests/integration/ -m integration
```
### Run Specific Test
```bash
pytest tests/integration/test_document_rag_integration.py::TestDocumentRagIntegration::test_document_rag_end_to_end_flow -v
```
### Run with Coverage (Skip Coverage Requirement)
```bash
pytest tests/integration/ -m integration --cov=trustgraph --cov-fail-under=0
```
### Run Slow Tests
```bash
pytest tests/integration/ -m "integration and slow"
```
### Skip Slow Tests
```bash
pytest tests/integration/ -m "integration and not slow"
```
## Examples: Integration Test Implementations
### 1. Document RAG Integration Test
The tests in `test_document_rag_integration.py` demonstrate the core integration test pattern.
#### What It Tests
- **Service Coordination**: Embeddings → Document Retrieval → Prompt Generation
- **Error Handling**: Failure scenarios for each service dependency
- **Configuration**: Different document limits, users, and collections
- **Performance**: Large document set handling
#### Key Features
- **Realistic Data Flow**: Uses actual service logic with mocked dependencies
- **Multiple Scenarios**: Success, failure, and edge cases
- **Verbose Logging**: Tests logging functionality
- **Multi-User Support**: Tests user and collection isolation
#### Test Coverage
- ✅ End-to-end happy path
- ✅ No documents found scenario
- ✅ Service failure scenarios (embeddings, documents, prompt)
- ✅ Configuration variations
- ✅ Multi-user isolation
- ✅ Performance testing
- ✅ Verbose logging
### 2. Text Completion Integration Test
The tests in `test_text_completion_integration.py` demonstrate external API integration testing.
#### What It Tests
- **External API Integration**: OpenAI API connectivity and authentication
- **Rate Limiting**: Proper handling of API rate limits and retries
- **Error Handling**: API failures, connection timeouts, and error propagation
- **Token Tracking**: Accurate input/output token counting and metrics
- **Configuration**: Different model parameters and settings
- **Concurrency**: Multiple simultaneous API requests
#### Key Features
- **Realistic Mock Responses**: Uses actual OpenAI API response structures
- **Authentication Testing**: API key validation and base URL configuration
- **Error Scenarios**: Rate limits, connection failures, invalid requests
- **Performance Metrics**: Timing and token usage validation
- **Model Flexibility**: Tests different GPT models and parameters
#### Test Coverage
- ✅ Successful text completion generation
- ✅ Multiple model configurations (GPT-3.5, GPT-4, GPT-4-turbo)
- ✅ Rate limit handling (RateLimitError → TooManyRequests)
- ✅ API error handling and propagation
- ✅ Token counting accuracy
- ✅ Prompt construction and parameter validation
- ✅ Authentication patterns and API key validation
- ✅ Concurrent request processing
- ✅ Response content extraction and validation
- ✅ Performance timing measurements
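The rate-limit translation exercised above can be sketched as a wrapper. Both exception classes below are local stand-ins: the real `RateLimitError` comes from the OpenAI SDK, and `TooManyRequests` from the TrustGraph codebase.

```python
class RateLimitError(Exception):
    """Stand-in for the OpenAI SDK's rate-limit exception."""


class TooManyRequests(Exception):
    """Stand-in for TrustGraph's own back-pressure exception."""


def translate_rate_limits(call):
    """Wrap a completion call so SDK rate limits surface as TooManyRequests."""
    def wrapped(*args, **kwargs):
        try:
            return call(*args, **kwargs)
        except RateLimitError as exc:
            raise TooManyRequests(str(exc)) from exc
    return wrapped
```

Tests can then assert on `TooManyRequests` without coupling to the SDK's exception hierarchy.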
### 3. Agent Manager Integration Test
The tests in `test_agent_manager_integration.py` demonstrate complex service coordination testing.
#### What It Tests
- **ReAct Pattern**: Think-Act-Observe cycles with multi-step reasoning
- **Tool Coordination**: Selection and execution of different tools (knowledge query, text completion, MCP tools)
- **Conversation State**: Management of conversation history and context
- **Multi-Service Integration**: Coordination between prompt, graph RAG, and tool services
- **Error Handling**: Tool failures, unknown tools, and error propagation
- **Configuration Management**: Dynamic tool loading and configuration
#### Key Features
- **Complex Coordination**: Tests agent reasoning with multiple tool options
- **Stateful Processing**: Maintains conversation history across interactions
- **Dynamic Tool Selection**: Tests tool selection based on context and reasoning
- **Callback Pattern**: Tests think/observe callback mechanisms
- **JSON Serialization**: Handles complex data structures in prompts
- **Performance Testing**: Large conversation history handling
#### Test Coverage
- ✅ Basic reasoning cycle with tool selection
- ✅ Final answer generation (ending ReAct cycle)
- ✅ Full ReAct cycle with tool execution
- ✅ Conversation history management
- ✅ Multiple tool coordination and selection
- ✅ Tool argument validation and processing
- ✅ Error handling (unknown tools, execution failures)
- ✅ Context integration and additional prompting
- ✅ Empty tool configuration handling
- ✅ Tool response processing and cleanup
- ✅ Performance with large conversation history
- ✅ JSON serialization in complex prompts
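The Think-Act-Observe cycle under test can be sketched as a minimal loop. The step format and tool names below are illustrative, not the real agent-manager-react message schema.

```python
def react_step(history, tools, reason):
    """One cycle: think about the history, act with a tool, observe output."""
    thought, action, payload = reason(history)
    if action == "final-answer":
        return history + [("answer", payload)], True
    observation = tools[action](payload)
    return history + [("thought", thought), ("observation", observation)], False


def run_agent(question, tools, reason, max_steps=5):
    """Drive react_step until the reasoner emits a final answer."""
    history = [("question", question)]
    for _ in range(max_steps):
        history, done = react_step(history, tools, reason)
        if done:
            return history[-1][1]
    raise RuntimeError("agent exceeded max_steps without a final answer")
```

In the tests, `reason` and the tool callables are mocks, which makes both the tool-selection path and the final-answer path easy to drive deterministically.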
### 4. Knowledge Graph Extract → Store Pipeline Integration Test
The tests in `test_kg_extract_store_integration.py` demonstrate multi-stage pipeline testing.
#### What It Tests
- **Text-to-Graph Transformation**: Complete pipeline from text chunks to graph triples
- **Entity Extraction**: Definition extraction with proper URI generation
- **Relationship Extraction**: Subject-predicate-object relationship extraction
- **Graph Database Integration**: Storage coordination with Cassandra knowledge store
- **Data Validation**: Entity filtering, validation, and consistency checks
- **Pipeline Coordination**: Multi-stage processing with proper data flow
#### Key Features
- **Multi-Stage Pipeline**: Tests definitions → relationships → storage coordination
- **Graph Data Structures**: RDF triples, entity contexts, and graph embeddings
- **URI Generation**: Consistent entity URI creation across pipeline stages
- **Data Transformation**: Complex text analysis to structured graph data
- **Batch Processing**: Large document set processing performance
- **Error Resilience**: Graceful handling of extraction failures
#### Test Coverage
- ✅ Definitions extraction pipeline (text → entities + definitions)
- ✅ Relationships extraction pipeline (text → subject-predicate-object)
- ✅ URI generation consistency between processors
- ✅ Triple generation from definitions and relationships
- ✅ Knowledge store integration (triples and embeddings storage)
- ✅ End-to-end pipeline coordination
- ✅ Error handling in extraction services
- ✅ Empty and invalid extraction results handling
- ✅ Entity filtering and validation
- ✅ Large batch processing performance
- ✅ Metadata propagation through pipeline stages
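The URI-consistency property can be sketched like this. The namespace and slug rules are assumptions, not the real processors' scheme; the predicate URIs are standard RDFS/SKOS terms.

```python
from urllib.parse import quote

ENTITY_NS = "http://example.org/entity/"  # hypothetical namespace


def entity_uri(label):
    # Deterministic normalisation so every pipeline stage derives the same URI
    return ENTITY_NS + quote(label.strip().lower().replace(" ", "-"))


def definition_triples(entity, definition):
    """Turn one extracted definition into subject-predicate-object triples."""
    uri = entity_uri(entity)
    return [
        (uri, "http://www.w3.org/2000/01/rdf-schema#label", entity),
        (uri, "http://www.w3.org/2004/02/skos/core#definition", definition),
    ]
```

The key invariant the tests check is that the definitions and relationships extractors, run independently, mint identical URIs for the same entity label.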
## Best Practices
### Test Organization
- Group related tests in classes
- Use descriptive test names that explain the scenario
- Follow the Arrange-Act-Assert pattern
- Use appropriate pytest markers (`@pytest.mark.integration`, `@pytest.mark.slow`)
### Mock Strategy
- Mock external services (databases, APIs, message brokers)
- Use real service logic and data transformations
- Create realistic mock responses that match actual service behavior
- Reset mocks between tests to ensure isolation
### Test Data
- Use realistic test data that reflects actual usage patterns
- Create reusable fixtures for common test scenarios
- Test with various data sizes and edge cases
- Include both success and failure scenarios
### Error Testing
- Test each dependency failure scenario
- Verify proper error propagation and cleanup
- Test timeout and retry mechanisms
- Validate error response formats
### Performance Testing
- Mark performance tests with `@pytest.mark.slow`
- Test with realistic data volumes
- Set reasonable performance expectations
- Monitor resource usage during tests
## Adding New Integration Tests
1. **Identify Service Dependencies**: Map out which services your target service coordinates with
2. **Create Mock Fixtures**: Set up mocks for each dependency in conftest.py
3. **Design Test Scenarios**: Plan happy path, error cases, and edge conditions
4. **Implement Tests**: Follow the established patterns in this directory
5. **Add Documentation**: Update this README with your new test patterns
## Test Markers
- `@pytest.mark.integration`: Marks tests as integration tests
- `@pytest.mark.slow`: Marks tests that take longer to run
- `@pytest.mark.asyncio`: Required for async test functions
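pytest warns about unknown marks unless custom markers are registered. One way to register them (a sketch; the project may instead declare them in `pyproject.toml` or `pytest.ini`) is a `pytest_configure` hook in `conftest.py`:

```python
def pytest_configure(config):
    # Register custom markers so pytest does not warn about unknown marks
    config.addinivalue_line("markers", "integration: integration test")
    config.addinivalue_line("markers", "slow: long-running test")
```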
## Future Enhancements
- Add tests with real test containers for database integration
- Implement contract testing for service interfaces
- Add performance benchmarking for critical paths
- Create integration test templates for common service patterns