plano

mirror of https://github.com/katanemo/plano.git synced 2026-04-29 10:56:35 +02:00

Author	SHA1	Message	Date
Musa	980faef6be	Redis-backed session cache for cross-replica model affinity (#879) * add pluggable session cache with Redis backend * add Redis session affinity demos (Docker Compose and Kubernetes) * address PR review feedback on session cache * document Redis session cache backend for model affinity * sync rendered config reference with session_cache addition * add tenant-scoped Redis session cache keys and remove dead log_affinity_hit - Add tenant_header to SessionCacheConfig; when set, cache keys are scoped as plano:affinity:{tenant_id}:{session_id} for multi-tenant isolation - Thread tenant_id through RouterService, routing_service, and llm handlers - Use Cow<'_, str> in session_key to avoid allocation when no tenant is set - Remove unused log_affinity_hit (logging was already inlined at call sites) * remove session_affinity_redis and session_affinity_redis_k8s demos	2026-04-13 19:30:47 -07:00
Adil Hafeez	8dedf0bec1	Model affinity for consistent model selection in agentic loops (#827)	2026-04-08 17:32:02 -07:00
Musa	978b1ea722	Add first-class Xiaomi provider support (#863) * feat(provider): add xiaomi as first-class provider * feat(demos): add xiaomi mimo integration demo * refactor(demos): remove Xiaomi MiMo integration demo and update documentation * updating model list and adding the xiamoi models --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-389.local>	2026-04-04 09:58:36 -07:00
Musa	f68c21f8df	Handle null prefer in inline routing policy (#856) * Handle null prefer in inline routing policy * Use serde defaulting for null selection preference * Add tests for default selection policy behavior in routing preferences	2026-03-31 17:41:25 -07:00
Adil Hafeez	af98c11a6d	restructure model_metrics_sources to type + provider (#855)	2026-03-30 17:12:20 -07:00
Adil Hafeez	e5751d6b13	model routing: cost/latency ranking with ranked fallback list (#849)	2026-03-30 13:46:52 -07:00
Adil Hafeez	1f23c573bf	add output filter chain (#822)	2026-03-18 17:58:20 -07:00
Adil Hafeez	bc059aed4d	Unified overrides for custom router and orchestrator models (#820) * support configurable orchestrator model via orchestration config section * add self-hosting docs and demo for Plano-Orchestrator * list all Plano-Orchestrator model variants in docs * use overrides for custom routing and orchestration model * update docs * update orchestrator model name * rename arch provider to plano, use llm_routing_model and agent_orchestration_model * regenerate rendered config reference	2026-03-15 09:36:11 -07:00
Musa	2bde21ff57	add Custom Trace Attributes to extend observability (#708 ) * add custom trace attributes * refactor: prefix custom trace attributes and update schema handlers tests configs * refactor: rename custom_attribute_prefixes to span_attribute_header_prefixes in configuration and related handlers * docs: add section on custom span attributes * refactor: update tracing configuration to use span attributes and adjust related handlers * docs: custom span attributes section to include static attributes and clarify configuration * add custom trace attributes * refactor: prefix custom trace attributes and update schema handlers tests configs * refactor: rename custom_attribute_prefixes to span_attribute_header_prefixes in configuration and related handlers * docs: add section on custom span attributes * refactor: update tracing configuration to use span attributes and adjust related handlers * docs: custom span attributes section to include static attributes and clarify configuration * refactor: remove TraceCollector usage and enhance logging with structured attributes * refactor: custom trace attribute extraction to improve clarity --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-02-25 16:27:20 -08:00
Adil Hafeez	ba651aaf71	Rename all arch references to plano (#745) * Rename all arch references to plano across the codebase Complete rebrand from "Arch"/"archgw" to "Plano" including: - Config files: arch_config_schema.yaml, workflow, demo configs - Environment variables: ARCH_CONFIG_* → PLANO_CONFIG_* - Python CLI: variables, functions, file paths, docker mounts - Rust crates: config paths, log messages, metadata keys - Docker/build: Dockerfile, supervisord, .dockerignore, .gitignore - Docker Compose: volume mounts and env vars across all demos/tests - GitHub workflows: job/step names - Shell scripts: log messages - Demos: Python code, READMEs, VS Code configs, Grafana dashboard - Docs: RST includes, code comments, config references - Package metadata: package.json, pyproject.toml, uv.lock External URLs (docs.archgw.com, github.com/katanemo/archgw) left as-is. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Update remaining arch references in docs - Rename RST cross-reference labels: arch_access_logging, arch_overview_tracing, arch_overview_threading → plano_* - Update label references in request_lifecycle.rst - Rename arch_config_state_storage_example.yaml → plano_config_state_storage_example.yaml - Update config YAML comments: "Arch creates/uses" → "Plano creates/uses" - Update "the Arch gateway" → "the Plano gateway" in configuration_reference.rst - Update arch_config_schema.yaml reference in provider_models.py - Rename arch_agent_router → plano_agent_router in config example Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix remaining arch references found in second pass - config/docker-compose.dev.yaml: ARCH_CONFIG_FILE → PLANO_CONFIG_FILE, arch_config.yaml → plano_config.yaml, archgw_logs → plano_logs - config/test_passthrough.yaml: container mount path - tests/e2e/docker-compose.yaml: source file path (was still arch_config.yaml) - cli/planoai/core.py: comment and log message - crates/brightstaff/src/tracing/constants.rs: doc comment - tests/{e2e,archgw}/common.py: get_arch_messages → get_plano_messages, arch_state/arch_messages variables renamed - tests/{e2e,archgw}/test_prompt_gateway.py: updated imports and usages - demos/shared/test_runner/{common,test_demos}.py: same renames - tests/e2e/test_model_alias_routing.py: docstring - .dockerignore: archgw_modelserver → plano_modelserver - demos/use_cases/claude_code_router/pretty_model_resolution.sh: container name Note: x-arch-* HTTP header values and Rust constant names intentionally preserved for backwards compatibility with existing deployments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 15:16:56 -08:00
Adil Hafeez	46de89590b	use standard tracing and logging in brightstaff (#721)	2026-02-09 13:33:27 -08:00
Salman Paracha	2941392ed1	Adding support for wildcard models in the model_providers config (#696 ) * cleaning up plano cli commands * adding support for wildcard model providers * fixing compile errors * fixing bugs related to default model provider, provider hint and duplicates in the model provider list * fixed cargo fmt issues * updating tests to always include the model id * using default for the prompt_gateway path * fixed the model name, as gpt-5-mini-2025-08-07 wasn't in the config * making sure that all aliases and models match the config * fixed the config generator to allow for base_url providers LLMs to include wildcard models * re-ran the models list utility and added a shell script to run it * updating docs to mention wildcard model providers * updated provider_models.json to yaml, added that file to our docs for reference * updating the build docs to use the new root-based build --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2026-01-28 17:47:33 -08:00
Tang Quoc Thai	4d53297c17	feat: add passthrough_auth option for forwarding client Authorization header (#687) * feat: add passthrough_auth option for forwarding client Authorization header * fix tests * Update comment to reflect upstream forwarding * Apply suggestions from code review --------- Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com> Co-authored-by: Adil Hafeez <adil@katanemo.com>	2026-01-14 15:06:28 -08:00
Adil Hafeez	ab391f96c7	don't include internal models in /v1/models endpoint (#685)	2026-01-09 16:57:41 -08:00
Adil Hafeez	ca95ffb63d	cargo clippy (#660)	2025-12-25 21:08:37 -08:00
Salman Paracha	e224cba3e3	Update docs to Plano (#639)	2025-12-23 17:14:50 -08:00
Adil Hafeez	15fbb6c3af	plano orchestration using plano orchestration 4b model (#637)	2025-12-22 18:05:49 -08:00
Adil Hafeez	2f9121407b	Use mcp tools for filter chain (#621 ) * agents framework demo * more changes * add more changes * pending changes * fix tests * fix more * rebase with main and better handle error from mcp * add trace for filters * add test for client error, server error and for mcp error * update schema validate code and rename kind => type in agent_filter * fix agent description and pre-commit * fix tests * add provider specific request parsing in agents chat * fix precommit and tests * cleanup demo * update readme * fix pre-commit * refactor tracing * fix fmt * fix: handle MessageContent enum in responses API conversion - Update request.rs to handle new MessageContent enum structure from main - MessageContent can now be Text(String) or Items(Vec<InputContent>) - Handle new InputItem variants (ItemReference, FunctionCallOutput) - Fixes compilation error after merging latest main (#632) * address pr feedback * fix span * fix build * update openai version	2025-12-17 17:30:14 -08:00
Shuguang Chen	cb82a83c7b	orchestration integration (#623) * orchestration integration * Convert compact json to spaced json	2025-12-17 17:20:19 -08:00
Salman Paracha	d5a273f740	enable state management for v1/responses (#631 ) * first commit with tests to enable state mamangement via memory * fixed logs to follow the conversational flow a bit better * added support for supabase * added the state_storage_v1_responses flag, and use that to store state appropriately * cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo * fixed mixed inputs from openai v1/responses api (#632) * fixed mixed inputs from openai v1/responses api * removing tracing from model-alias-rouing * handling additional input types from openairs --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local> * resolving PR comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-17 12:18:38 -08:00
Salman Paracha	cdfcfb9169	support base_url path for model providers (#608) * adding support for base_url * updated docs * fixed tests for config generator * making fixes based on PR comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>	2025-10-29 17:08:07 -07:00
Salman Paracha	9407ae6af7	Add support for Amazon Bedrock Converse and ConverseStream (#588 ) * first commit to get Bedrock Converse API working. Next commit support for streaming and binary frames * adding translation from BedrockBinaryFrameDecoder to AnthropicMessagesEvent * Claude Code works with Amazon Bedrock * added tests for openai streaming from bedrock * PR comments fixed * adding support for bedrock in docs as supported provider * cargo fmt * revertted to chatgpt models for claude code routing --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local> Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>	2025-10-22 11:31:21 -07:00
Adil Hafeez	96e0732089	add support for agents (#564)	2025-10-14 14:01:11 -07:00
Salman Paracha	226139e907	adding support for Qwen models and fixed issue with passing PATH vari… (#583) * adding support for Qwen models and fixed issue with passing PATH variable * don't need to have qwen in the model alias routing example * fixed base_url for qwen --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>	2025-10-01 21:57:58 -07:00
Salman Paracha	045a5e9751	adding support for moonshot and z-ai (#578) * adding support for moonshot and z-ai * Revert unwanted changes to arch_config.yaml --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>	2025-09-30 12:24:06 -07:00
Salman Paracha	fbe82351c0	Salmanap/fix docs new providers model alias (#571) * fixed docs and added ollama as a first-class LLM provider * matching the LLM routing section on the README.md to the docs * updated the section on preference-based routing --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>	2025-09-19 10:19:57 -07:00
Salman Paracha	8d0b468345	draft commit to add support for xAI, TogehterAI, AzureOpenAI (#570) * draft commit to add support for xAI, LambdaAI, TogehterAI, AzureOpenAI * fixing failing tests and updating rederend config file * Update arch_config_with_aliases.yaml * adding the AZURE_API_KEY to the GH workflow for e2e * fixing GH secerts * adding valdiating for azure_openai --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>	2025-09-18 18:36:30 -07:00
Salman Paracha	4eb2b410c5	adding support for model aliases in archgw (#566) * adding support for model aliases in archgw * fixed PR based on feedback * removing README. Not relevant for PR --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>	2025-09-16 11:12:08 -07:00
Salman Paracha	fb0581fd39	add support for v1/messages and transformations (#558) * pushing draft PR * transformations are working. Now need to add some tests next * updated tests and added necessary response transformations for Anthropics' message response object * fixed bugs for integration tests * fixed doc tests * fixed serialization issues with enums on response * adding some debug logs to help * fixed issues with non-streaming responses * updated the stream_context to update response bytes * the serialized bytes length must be set in the response side * fixed the debug statement that was causing the integration tests for wasm to fail * fixing json parsing errors * intentionally removing the headers * making sure that we convert the raw bytes to the correct provider type upstream * fixing non-streaming responses to tranform correctly * /v1/messages works with transformations to and from /v1/chat/completions * updating the CLI and demos to support anthropic vs. claude * adding the anthropic key to the preference based routing tests * fixed test cases and added more structured logs * fixed integration tests and cleaned up logs * added python client tests for anthropic and openai * cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo * fixing the tests. python dependency order was broken * updated the openAI client to fix demos * removed the raw response debug statement * fixed the dup cloning issue and cleaned up the ProviderRequestType enum and traits * fixing logs * moved away from string literals to consts * fixed streaming from Anthropic Client to OpenAI * removed debug statement that would likely trip up integration tests * fixed integration tests for llm_gateway * cleaned up test cases and removed unnecessary crates * fixing comments from PR * fixed bug whereby we were sending an OpenAIChatCompletions request object to llm_gateway even though the request may have been AnthropicMessages --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-9.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-10.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-41.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>	2025-09-10 07:40:30 -07:00
Salman Paracha	89ab51697a	updating the implementation of /v1/chat/completions to use the generi… (#548 ) * updating the implementation of /v1/chat/completions to use the generic provider interfaces * saving changes, although we will need a small re-factor after this as well * more refactoring changes, getting close * more refactoring changes to avoid unecessary re-direction and duplication * more clean up * more refactoring * more refactoring to clean code and make stream_context.rs work * removing unecessary trait implemenations * some more clean-up * fixed bugs * fixing test cases, and making sure all references to the ChatCOmpletions* objects point to the new types * refactored changes to support enum dispatch * removed the dependency on try_streaming_from_bytes into a try_from trait implementation * updated readme based on new usage * updated code based on code review comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-2.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>	2025-08-20 12:55:29 -07:00
Adil Hafeez	d341f4365b	In request path use same format for usage preferences as arch_config (#533)	2025-07-21 18:31:19 -07:00
Adil Hafeez	a7fddf30f9	better model names (#517)	2025-07-11 16:42:16 -07:00
Adil Hafeez	147908ba7e	make arch-router cluster optional (#518)	2025-07-08 00:33:40 -07:00
Adil Hafeez	00dc95e034	Add support for updating model preferences (#510)	2025-07-02 14:08:19 -07:00
Adil Hafeez	aa9d747fa9	add support for gemini (#505)	2025-06-11 15:15:00 -07:00
Adil Hafeez	6c53510f49	Introduce hermesllm library to handle llm message translation (#501)	2025-06-10 12:53:27 -07:00
Adil Hafeez	0d190a6e5c	update code to use new json based system prompt for routing (#493)	2025-05-30 17:40:46 -07:00
Adil Hafeez	8d12a9a6e0	add arch provider (#494)	2025-05-30 17:12:52 -07:00
Adil Hafeez	f5e77bbe65	add support for claude and add first class support for groq and deepseek (#479)	2025-05-22 22:55:46 -07:00
Adil Hafeez	27c0f2fdce	Introduce brightstaff a new terminal service for llm routing (#477)	2025-05-19 09:59:22 -07:00
Adil Hafeez	84cd1df7bf	add preliminary support for llm agents (#432)	2025-03-19 15:21:34 -07:00
Adil Hafeez	e40b13be05	Update arch_config and add tests for arch config file (#407)	2025-02-14 19:28:10 -08:00
Adil Hafeez	8de6eacfbd	spotify demo with optimized context window code change (#397)	2025-02-07 19:14:15 -08:00
Adil Hafeez	2bd61d628c	add ability to specify custom http headers in api endpoint (#386)	2025-02-06 11:48:09 -08:00
Adil Hafeez	07ef3149b8	add support for using custom upstream llm (#365)	2025-01-17 18:25:55 -08:00
Shuguang Chen	ba7279becb	Use intent model from archfc to pick prompt gateway (#328)	2024-12-20 13:25:01 -08:00
José Ulises Niño Rivera	d002b2042a	Break apart common_types mod (#334) Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>	2024-12-06 17:25:42 -08:00
Adil Hafeez	a54db1a098	update getting started guide and add llm gateway and prompt gateway samples (#330)	2024-12-06 14:37:33 -08:00
José Ulises Niño Rivera	be8c3c9ea3	Remove blanket unused imports from the common crate (#292) * Remove blanket unused imports from the common crate Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com> * updatE Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com> --------- Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>	2024-11-25 17:19:06 -08:00
Adil Hafeez	a72bb804eb	add support for jaeger tracing (#229)	2024-11-07 22:11:00 -06:00

53 commits