plano/tests/e2e
Adil Hafeez 3a6a672c9d Add mock-based E2E tests and gate live tests to main/nightly
Introduce a new mock-based E2E test suite that uses pytest_httpserver to
simulate LLM provider responses, eliminating the need for real API keys
on PR builds. The mock tests cover model alias routing, protocol
transformation (OpenAI↔Anthropic), Responses API passthrough/translation,
streaming, tool calls, thinking mode, and multi-turn state management.

CI changes:
- Add mock-e2e-tests job (zero secrets, runs on every PR)
- Gate all live E2E jobs to main pushes + nightly schedule
- Scope secrets to only the keys each job actually needs
- Add daily cron schedule for full live test coverage

Also relaxes exact-match assertions in live e2e tests to structural
checks (non-null, non-empty) since LLM output is non-deterministic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 19:33:48 +00:00
.vscode better model names (#517) 2025-07-11 16:42:16 -07:00
common.py Rename all arch references to plano (#745) 2026-02-13 15:16:56 -08:00
common_scripts.sh Use intent model from archfc to pick prompt gateway (#328) 2024-12-20 13:25:01 -08:00
config_memory_state_v1_responses.yaml rename cli to plano (#647) 2025-12-23 18:37:58 -08:00
docker-compose.yaml Overhaul demos directory: cleanup, restructure, and standardize configs (#760) 2026-02-17 03:09:28 -08:00
pyproject.toml improve e2e tests (#731) 2026-02-09 13:20:06 -08:00
README.md Upgrade CI, Docker, and demos to Python 3.14 (#759) 2026-02-15 10:22:33 -08:00
response.hex Add support for Amazon Bedrock Converse and ConverseStream (#588) 2025-10-22 11:31:21 -07:00
response_with_tools.hex Add support for Amazon Bedrock Converse and ConverseStream (#588) 2025-10-22 11:31:21 -07:00
run_e2e_tests.sh Overhaul demos directory: cleanup, restructure, and standardize configs (#760) 2026-02-17 03:09:28 -08:00
run_model_alias_tests.sh Overhaul demos directory: cleanup, restructure, and standardize configs (#760) 2026-02-17 03:09:28 -08:00
run_prompt_gateway_tests.sh Overhaul demos directory: cleanup, restructure, and standardize configs (#760) 2026-02-17 03:09:28 -08:00
run_responses_state_tests.sh Rename all arch references to plano (#745) 2026-02-13 15:16:56 -08:00
test_model_alias_routing.py Add mock-based E2E tests and gate live tests to main/nightly 2026-02-18 19:33:48 +00:00
test_openai_responses_api_client.py disable bedrock tests (#732) 2026-02-10 00:34:00 -08:00
test_openai_responses_api_client_with_state.py enable state management for v1/responses (#631) 2025-12-17 12:18:38 -08:00
test_prompt_gateway.py Add mock-based E2E tests and gate live tests to main/nightly 2026-02-18 19:33:48 +00:00
uv.lock Fix code scanning and dependabot security alerts (#756) 2026-02-14 12:27:07 -08:00

e2e tests

End-to-end (e2e) tests for the Plano LLM gateway and prompt gateway.

To run the e2e tests successfully, run_e2e_tests.sh prepares the environment as follows:

  1. build and start the weather_forecast demo (using docker compose)
  2. build, install, and start the model server asynchronously (using uv)
  3. build and start the Plano gateway (using docker compose)
  4. wait for the model server to be ready
  5. wait for the Plano gateway to be ready
  6. start the e2e tests (using uv)
    1. run the LLM gateway tests for LLM routing
    2. run the prompt gateway tests for function calling, parameter gathering, and summarization
  7. clean up
    1. stop the Plano gateway
    2. stop the model server
    3. stop the weather_forecast demo
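
The "wait for ... to be ready" steps above (4 and 5) can be pictured as a simple polling loop. The following is only a minimal sketch in Python, not the logic that run_e2e_tests.sh actually uses; the helper names http_ok and wait_until_ready are hypothetical, and any health-check URL passed in is an assumption that must be replaced with the real endpoint of the service being started.

```python
import time
from typing import Callable
from urllib.request import urlopen


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET request to `url` answers with a 2xx status."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError and HTTPError are OSError subclasses; connection refused,
        # timeouts, and non-2xx responses all count as "not ready yet".
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A caller would pass a probe such as `lambda: http_ok(url)` with the service's own health URL, and abort the test run when wait_until_ready returns False. Taking the probe and the sleep function as parameters keeps the loop testable without a running service.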

How to run

To run the tests locally, make sure the following requirements are met.

Requirements

  • Python 3.10+
  • uv
  • Docker
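
The requirements above can be checked before starting a run. This is a hedged sketch rather than part of the test suite: the function name missing_requirements is hypothetical, and it assumes the uv and Docker command-line tools are installed under the executable names uv and docker.

```python
import shutil
import sys

# Assumed executable names for the tools listed in the requirements.
REQUIRED_TOOLS = ("uv", "docker")
MIN_PYTHON = (3, 10)


def missing_requirements(version=sys.version_info, which=shutil.which) -> list:
    """Return a list of human-readable problems; an empty list means all is well."""
    problems = []
    if tuple(version[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems
```

Calling `missing_requirements()` with no arguments inspects the current interpreter and PATH. Note that this only confirms the executables exist; it does not verify that the Docker daemon is running.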

Running tests locally

sh run_e2e_tests.sh