plano

mirror of https://github.com/katanemo/plano.git synced 2026-04-28 18:36:34 +02:00

Author	SHA1	Message	Date
Musa	897fda2deb	fix(routing): auto-migrate v0.3.0 inline routing_preferences to v0.4.0 top-level (#912 ) * fix(routing): auto-migrate v0.3.0 inline routing_preferences to v0.4.0 top-level Lift inline routing_preferences under each model_provider into the top-level routing_preferences list with merged models[] and bump version to v0.4.0, with a deprecation warning. Existing v0.3.0 demo configs (Claude Code, Codex, preference_based_routing, etc.) keep working unchanged. Schema flags the inline shape as deprecated but still accepts it. Docs and skills updated to canonical top-level multi-model form. * test(common): bump reference config assertion to v0.4.0 The rendered reference config was bumped to v0.4.0 when its inline routing_preferences were lifted to the top level; align the configuration deserialization test with that change. * fix(config_generator): bump version to v0.4.0 up front in migration Move the v0.3.0 -> v0.4.0 version bump to the top of migrate_inline_routing_preferences so it runs unconditionally, including for configs that already declare top-level routing_preferences at v0.3.0. Previously the bump only fired when inline migration produced entries, leaving top-level v0.3.0 configs rejected by brightstaff's v0.4.0 gate. Tests updated to cover the new behavior and to confirm we never downgrade newer versions. * fix(config_generator): gate routing_preferences migration on version < v0.4.0 Short-circuit the migration when the config already declares v0.4.0 or newer. Anything at v0.4.0+ is assumed to be on the canonical top-level shape and is passed through untouched, including stray inline preferences (which are the author's bug to fix). Only v0.3.0 and older configs are rewritten and bumped.	2026-04-24 12:31:44 -07:00
Adil Hafeez	6701195a5d	add overrides.disable_signals to skip CPU-heavy signal analysis (#906 )	2026-04-23 11:38:29 -07:00
Adil Hafeez	90b926c2ce	use plano-orchestrator for LLM routing, remove arch-router (#886 )	2026-04-15 16:41:42 -07:00
Musa	980faef6be	Redis-backed session cache for cross-replica model affinity (#879 ) Some checks failed CI / pre-commit (push) Has been cancelled Details CI / plano-tools-tests (push) Has been cancelled Details CI / native-smoke-test (push) Has been cancelled Details CI / docker-build (push) Has been cancelled Details CI / validate-config (push) Has been cancelled Details Publish docker image (latest) / build-arm64 (push) Has been cancelled Details Publish docker image (latest) / build-amd64 (push) Has been cancelled Details Build and Deploy Documentation / build (push) Has been cancelled Details CI / security-scan (push) Has been cancelled Details CI / test-prompt-gateway (push) Has been cancelled Details CI / test-model-alias-routing (push) Has been cancelled Details CI / test-responses-api-with-state (push) Has been cancelled Details CI / e2e-plano-tests (3.10) (push) Has been cancelled Details CI / e2e-plano-tests (3.11) (push) Has been cancelled Details CI / e2e-plano-tests (3.12) (push) Has been cancelled Details CI / e2e-plano-tests (3.13) (push) Has been cancelled Details CI / e2e-plano-tests (3.14) (push) Has been cancelled Details CI / e2e-demo-preference (push) Has been cancelled Details CI / e2e-demo-currency (push) Has been cancelled Details Publish docker image (latest) / create-manifest (push) Has been cancelled Details * add pluggable session cache with Redis backend * add Redis session affinity demos (Docker Compose and Kubernetes) * address PR review feedback on session cache * document Redis session cache backend for model affinity * sync rendered config reference with session_cache addition * add tenant-scoped Redis session cache keys and remove dead log_affinity_hit - Add tenant_header to SessionCacheConfig; when set, cache keys are scoped as plano:affinity:{tenant_id}:{session_id} for multi-tenant isolation - Thread tenant_id through RouterService, routing_service, and llm handlers - Use Cow<'_, str> in session_key to avoid allocation when no tenant is set - Remove unused log_affinity_hit (logging was already inlined at call sites) * remove session_affinity_redis and session_affinity_redis_k8s demos	2026-04-13 19:30:47 -07:00
Adil Hafeez	8dedf0bec1	Model affinity for consistent model selection in agentic loops (#827 ) Some checks are pending CI / pre-commit (push) Waiting to run Details CI / plano-tools-tests (push) Waiting to run Details CI / native-smoke-test (push) Waiting to run Details CI / docker-build (push) Waiting to run Details CI / validate-config (push) Waiting to run Details CI / security-scan (push) Blocked by required conditions Details CI / test-prompt-gateway (push) Blocked by required conditions Details CI / test-model-alias-routing (push) Blocked by required conditions Details CI / test-responses-api-with-state (push) Blocked by required conditions Details CI / e2e-plano-tests (3.10) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.11) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.12) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.13) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.14) (push) Blocked by required conditions Details CI / e2e-demo-preference (push) Blocked by required conditions Details CI / e2e-demo-currency (push) Blocked by required conditions Details Publish docker image (latest) / build-arm64 (push) Waiting to run Details Publish docker image (latest) / build-amd64 (push) Waiting to run Details Publish docker image (latest) / create-manifest (push) Blocked by required conditions Details Build and Deploy Documentation / build (push) Waiting to run Details	2026-04-08 17:32:02 -07:00
Musa	82f34f82f2	Update black hook for Python 3.14 (#857 ) * Update pre-commit black to latest release * Reformat Python files for new black version	2026-03-31 13:18:45 -07:00
Adil Hafeez	e5751d6b13	model routing: cost/latency ranking with ranked fallback list (#849 )	2026-03-30 13:46:52 -07:00
Musa	3a531ce22a	expand configuration reference with missing fields (#851 )	2026-03-30 12:25:05 -07:00
Adil Hafeez	1f23c573bf	add output filter chain (#822 )	2026-03-18 17:58:20 -07:00
Adil Hafeez	bc059aed4d	Unified overrides for custom router and orchestrator models (#820 ) * support configurable orchestrator model via orchestration config section * add self-hosting docs and demo for Plano-Orchestrator * list all Plano-Orchestrator model variants in docs * use overrides for custom routing and orchestration model * update docs * update orchestrator model name * rename arch provider to plano, use llm_routing_model and agent_orchestration_model * regenerate rendered config reference	2026-03-15 09:36:11 -07:00
Adil Hafeez	f63d5de02c	Run plano natively by default (#744 )	2026-03-05 07:35:25 -08:00
Adil Hafeez	70ad56a258	remove exposed example passwords from documentation (#779 ) * remove exposed example passwords from documentation Replace hardcoded example password (MyPass#123/MyPass%23123) and project-specific Supabase references (postgres.myproject) with generic placeholders in docs. https://claude.ai/code/session_01H5wj3VH1Jh28kzepEwdDCx * remove hardcoded FlightAware AeroAPI key from flights.py https://claude.ai/code/session_01H5wj3VH1Jh28kzepEwdDCx --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-02-25 13:14:36 -08:00
Adil Hafeez	ba651aaf71	Rename all arch references to plano (#745 ) * Rename all arch references to plano across the codebase Complete rebrand from "Arch"/"archgw" to "Plano" including: - Config files: arch_config_schema.yaml, workflow, demo configs - Environment variables: ARCH_CONFIG_* → PLANO_CONFIG_* - Python CLI: variables, functions, file paths, docker mounts - Rust crates: config paths, log messages, metadata keys - Docker/build: Dockerfile, supervisord, .dockerignore, .gitignore - Docker Compose: volume mounts and env vars across all demos/tests - GitHub workflows: job/step names - Shell scripts: log messages - Demos: Python code, READMEs, VS Code configs, Grafana dashboard - Docs: RST includes, code comments, config references - Package metadata: package.json, pyproject.toml, uv.lock External URLs (docs.archgw.com, github.com/katanemo/archgw) left as-is. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Update remaining arch references in docs - Rename RST cross-reference labels: arch_access_logging, arch_overview_tracing, arch_overview_threading → plano_* - Update label references in request_lifecycle.rst - Rename arch_config_state_storage_example.yaml → plano_config_state_storage_example.yaml - Update config YAML comments: "Arch creates/uses" → "Plano creates/uses" - Update "the Arch gateway" → "the Plano gateway" in configuration_reference.rst - Update arch_config_schema.yaml reference in provider_models.py - Rename arch_agent_router → plano_agent_router in config example Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix remaining arch references found in second pass - config/docker-compose.dev.yaml: ARCH_CONFIG_FILE → PLANO_CONFIG_FILE, arch_config.yaml → plano_config.yaml, archgw_logs → plano_logs - config/test_passthrough.yaml: container mount path - tests/e2e/docker-compose.yaml: source file path (was still arch_config.yaml) - cli/planoai/core.py: comment and log message - crates/brightstaff/src/tracing/constants.rs: doc comment - tests/{e2e,archgw}/common.py: get_arch_messages → get_plano_messages, arch_state/arch_messages variables renamed - tests/{e2e,archgw}/test_prompt_gateway.py: updated imports and usages - demos/shared/test_runner/{common,test_demos}.py: same renames - tests/e2e/test_model_alias_routing.py: docstring - .dockerignore: archgw_modelserver → plano_modelserver - demos/use_cases/claude_code_router/pretty_model_resolution.sh: container name Note: x-arch-* HTTP header values and Rust constant names intentionally preserved for backwards compatibility with existing deployments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 15:16:56 -08:00
Adil Hafeez	46de89590b	use standard tracing and logging in brightstaff (#721 )	2026-02-09 13:33:27 -08:00
Adil Hafeez	062825f26e	add envoy retries (#712 ) * add envoy retries * add missing file * fix tests --------- Co-authored-by: Adil Hafeez <adil.hafeez10@t-mobile.com>	2026-01-28 20:31:01 -08:00
Tang Quoc Thai	4d53297c17	feat: add passthrough_auth option for forwarding client Authorization header (#687 ) * feat: add passthrough_auth option for forwarding client Authorization header * fix tests * Update comment to reflect upstream forwarding * Apply suggestions from code review --------- Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com> Co-authored-by: Adil Hafeez <adil@katanemo.com>	2026-01-14 15:06:28 -08:00
Adil Hafeez	ab391f96c7	don't include internal models in /v1/models endpoint (#685 )	2026-01-09 16:57:41 -08:00
Salman Paracha	e224cba3e3	Update docs to Plano (#639 )	2025-12-23 17:14:50 -08:00
Adil Hafeez	15fbb6c3af	plano orchestration using plano orchestration 4b model (#637 )	2025-12-22 18:05:49 -08:00
Salman Paracha	d5a273f740	enable state management for v1/responses (#631 ) * first commit with tests to enable state mamangement via memory * fixed logs to follow the conversational flow a bit better * added support for supabase * added the state_storage_v1_responses flag, and use that to store state appropriately * cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo * fixed mixed inputs from openai v1/responses api (#632) * fixed mixed inputs from openai v1/responses api * removing tracing from model-alias-rouing * handling additional input types from openairs --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local> * resolving PR comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-17 12:18:38 -08:00
Salman Paracha	88c2bd1851	removing model_server python module to brightstaff (function calling) (#615 ) * adding function_calling functionality via rust * fixed rendered YAML file * removed model_server from envoy.template and forwarding traffic to bright_staff * fixed bugs in function_calling.rs that were breaking tests. All good now * updating e2e test to clean up disk usage * removing Arch* models to be used as a default model if one is not specified * if the user sets arch-function base_url we should honor it * fixing demos as we needed to pin to a particular version of huggingface_hub else the chatbot ui wouldn't build * adding a constant for Arch-Function model name * fixing some edge cases with calls made to Arch-Function * fixed JSON parsing issues in function_calling.rs * fixed bug where the raw response from Arch-Function was re-encoded * removed debug from supervisord.conf * commenting out disk cleanup * adding back disk space --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-11-22 12:55:00 -08:00
Adil Hafeez	96e0732089	add support for agents (#564 )	2025-10-14 14:01:11 -07:00
Salman Paracha	8d0b468345	draft commit to add support for xAI, TogehterAI, AzureOpenAI (#570 ) * draft commit to add support for xAI, LambdaAI, TogehterAI, AzureOpenAI * fixing failing tests and updating rederend config file * Update arch_config_with_aliases.yaml * adding the AZURE_API_KEY to the GH workflow for e2e * fixing GH secerts * adding valdiating for azure_openai --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>	2025-09-18 18:36:30 -07:00
Adil Hafeez	a7fddf30f9	better model names (#517 )	2025-07-11 16:42:16 -07:00
Mat Sylvia	e7b0de2a72	Tweak readme docs for minor nits (#461 ) Co-authored-by: darkdatter <msylvia@tradestax.io>	2025-04-12 23:52:20 -07:00
Adil Hafeez	e40b13be05	Update arch_config and add tests for arch config file (#407 )	2025-02-14 19:28:10 -08:00
Adil Hafeez	07ef3149b8	add support for using custom upstream llm (#365 )	2025-01-17 18:25:55 -08:00
Adil Hafeez	a54db1a098	update getting started guide and add llm gateway and prompt gateway samples (#330 )	2024-12-06 14:37:33 -08:00
Adil Hafeez	e462e393b1	Use large github action machine to run e2e tests (#230 )	2024-10-30 17:54:51 -07:00
Adil Hafeez	285aa1419b	Split listener (#141 )	2024-10-08 16:24:08 -07:00
Shuguang Chen	b30ad791f7	Fix errors and improve Doc (#143 ) * Fix link issues and add icons * Improve Doc * fix test * making minor modifications to shuguangs' doc changes --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local> Co-authored-by: Adil Hafeez <adil@katanemo.com>	2024-10-08 13:18:34 -07:00
Shuguang Chen	5c7567584d	Doc Update (#129 ) * init update * Update terminology.rst * fix the branch to create an index.html, and fix pre-commit issues * Doc update * made several changes to the docs after Shuguang's revision * fixing pre-commit issues * fixed the reference file to the final prompt config file * added google analytics --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2024-10-06 16:54:34 -07:00

32 commits