plano

mirror of https://github.com/katanemo/plano.git synced 2026-06-02 14:35:14 +02:00

Author	SHA1	Message	Date
Adil Hafeez	c90b699c90	fix: surface real upstream error messages from orchestrator HTTP client `post_and_extract_content` was unconditionally deserializing the upstream response body as a `ChatCompletionsResponse`, which meant 4xx/5xx error bodies (OpenAI-style `{"error": {...}}` envelopes) failed with confusing messages like `missing field 'id' at line 1 column 391`. The real upstream message (e.g. "This model's maximum context length is 32768 tokens...") only appeared once as a warn log and then got buried in the generic "Failed to parse JSON response" path. Now we: - Check the HTTP status before attempting to parse the success body. - On non-2xx, extract a human-readable message from the OpenAI-style error envelope (or fall back to a UTF-8-safe truncated raw body). - Return a dedicated `HttpError::Upstream { status, message }` variant so callers can log / surface / retry based on the real status code. - Truncate raw bodies in warn logs to 512 bytes (UTF-8-safe) to avoid flooding logs with oversized JSON or HTML error pages.	2026-04-17 18:41:15 -07:00
Adil Hafeez	321c28da37	fix: truncate oversized user messages in orchestrator routing prompt The orchestrator trimmer had a bypass that kept the latest user message whole even when it alone exceeded the configured token budget. This caused brightstaff to send a ~500KB prompt to the Plano-Orchestrator model, which rejected it with a 400 "context length exceeded" from the upstream 32K-token window. Brightstaff then surfaced a confusing "missing field id" parse error instead of the real upstream message. Fix the bypass by trimming the overflowing user message from the end toward the beginning until it fits in the remaining token budget. The beginning of the message (where user intent usually lives) is preserved and the tail is dropped. Added a UTF-8-safe byte-truncation helper and a regression test that mirrors the production payload (a single ~500KB user message with a small budget).	2026-04-17 18:00:02 -07:00
Adil Hafeez	37600fd07a	fix: passthrough_auth accepts Anthropic x-api-key and normalizes to upstream format (#892 )	2026-04-17 17:23:05 -07:00
Adil Hafeez	0f67b2c806	planoai obs: live LLM observability TUI (#891 )	2026-04-17 14:03:47 -07:00
Adil Hafeez	1f701258cb	Zero-config planoai up: pass-through proxy with auto-detected providers (#890 )	2026-04-17 13:11:12 -07:00
Adil Hafeez	711e4dd07d	Add DigitalOcean as a first-class LLM provider (#889 )	2026-04-17 12:25:34 -07:00
Adil Hafeez	90b926c2ce	use plano-orchestrator for LLM routing, remove arch-router (#886 )	2026-04-15 16:41:42 -07:00
Musa	980faef6be	Redis-backed session cache for cross-replica model affinity (#879 ) * add pluggable session cache with Redis backend * add Redis session affinity demos (Docker Compose and Kubernetes) * address PR review feedback on session cache * document Redis session cache backend for model affinity * sync rendered config reference with session_cache addition * add tenant-scoped Redis session cache keys and remove dead log_affinity_hit - Add tenant_header to SessionCacheConfig; when set, cache keys are scoped as plano:affinity:{tenant_id}:{session_id} for multi-tenant isolation - Thread tenant_id through RouterService, routing_service, and llm handlers - Use Cow<'_, str> in session_key to avoid allocation when no tenant is set - Remove unused log_affinity_hit (logging was already inlined at call sites) * remove session_affinity_redis and session_affinity_redis_k8s demos	2026-04-13 19:30:47 -07:00
Adil Hafeez	8dedf0bec1	Model affinity for consistent model selection in agentic loops (#827 )	2026-04-08 17:32:02 -07:00
Musa	978b1ea722	Add first-class Xiaomi provider support (#863 ) * feat(provider): add xiaomi as first-class provider * feat(demos): add xiaomi mimo integration demo * refactor(demos): remove Xiaomi MiMo integration demo and update documentation * updating model list and adding the xiaomi models --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-389.local>	2026-04-04 09:58:36 -07:00
Adil Hafeez	7606c55b4b	support developer role in chat completions API (#867 )	2026-04-02 18:10:32 -07:00
Musa	f68c21f8df	Handle null prefer in inline routing policy (#856 ) * Handle null prefer in inline routing policy * Use serde defaulting for null selection preference * Add tests for default selection policy behavior in routing preferences	2026-03-31 17:41:25 -07:00
Musa	3dbda9741e	fix: route Perplexity OpenAI endpoints without /v1 (#854 ) * fix: route Perplexity OpenAI paths without /v1 * add tests for Perplexity provider handling in LLM module * refactor: use constant for Perplexity provider prefix in LLM module * moving const to top of file	2026-03-31 17:40:42 -07:00
Adil Hafeez	d8f4fd76e3	replace production panics with graceful error handling in common crate (#844 )	2026-03-31 14:28:11 -07:00
Adil Hafeez	af98c11a6d	restructure model_metrics_sources to type + provider (#855 )	2026-03-30 17:12:20 -07:00
Adil Hafeez	e5751d6b13	model routing: cost/latency ranking with ranked fallback list (#849 )	2026-03-30 13:46:52 -07:00
Salman Paracha	69df124c47	the orchestrator had a bug where it was setting the wrong headers for archfc.katanemo.dev (#839 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-389.local>	2026-03-20 00:40:47 -07:00
Adil Hafeez	1ad3e0f64e	refactor brightstaff (#736 )	2026-03-19 17:58:33 -07:00
Adil Hafeez	1f23c573bf	add output filter chain (#822 )	2026-03-18 17:58:20 -07:00
Salman Paracha	4bb5c6404f	adding new supported models to plano (#829 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-389.local>	2026-03-15 12:37:20 -07:00
Adil Hafeez	bc059aed4d	Unified overrides for custom router and orchestrator models (#820 ) * support configurable orchestrator model via orchestration config section * add self-hosting docs and demo for Plano-Orchestrator * list all Plano-Orchestrator model variants in docs * use overrides for custom routing and orchestration model * update docs * update orchestrator model name * rename arch provider to plano, use llm_routing_model and agent_orchestration_model * regenerate rendered config reference	2026-03-15 09:36:11 -07:00
Musa	6610097659	Support for Codex via Plano (#808 ) * Add Codex CLI support; xAI response improvements * Add native Plano running check and update CLI agent error handling * adding PR suggestions for transformations and code quality * message extraction logic in ResponsesAPIRequest * xAI support for Responses API by routing to native endpoint + refactor code	2026-03-10 20:54:14 -07:00
Adil Hafeez	97b7a390ef	support inline routing_policy in request body (#811 ) (#815 )	2026-03-10 12:23:18 -07:00
Adil Hafeez	028a2cd196	add routing service (#814 ) fixes https://github.com/katanemo/plano/issues/810	2026-03-09 16:32:16 -07:00
Musa	2bde21ff57	add Custom Trace Attributes to extend observability (#708 ) * add custom trace attributes * refactor: prefix custom trace attributes and update schema handlers tests configs * refactor: rename custom_attribute_prefixes to span_attribute_header_prefixes in configuration and related handlers * docs: add section on custom span attributes * refactor: update tracing configuration to use span attributes and adjust related handlers * docs: custom span attributes section to include static attributes and clarify configuration * refactor: remove TraceCollector usage and enhance logging with structured attributes * refactor: custom trace attribute extraction to improve clarity --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-02-25 16:27:20 -08:00
Syed A. Hashmi	54bc8e5e52	[ISSUE 706]: Standardize returned errors from Plano (#772 ) * [ISSUE 706]: Standardize returned errors from Plano * Standardized errors in chat completion	2026-02-24 14:34:33 -08:00
Adil Hafeez	baeee56f6b	Make model field optional in request types, resolve from default provider (#768 )	2026-02-18 04:43:59 -08:00
Adil Hafeez	1df43872a6	Fix code scanning and dependabot security alerts (#756 ) * Fix code scanning and dependabot security alerts Code scanning fixes (14 alerts): - Fix XSS in OG image route by validating request origin against allowlist - Fix incomplete URL sanitization in blog layout using exact hostname matching - Bind port-check socket to 127.0.0.1 instead of 0.0.0.0 - Add explicit permissions to 7 GitHub Actions workflows Dependabot fixes: - Update @isaacs/brace-expansion 5.0.0 -> 5.0.1 (CVE-2026-25547) - Update bytes 1.10.1 -> 1.11.1 (CVE-2026-25541) - Update time 0.3.41 -> 0.3.47 (CVE-2026-25727) - Update cryptography 45.0.7 -> 46.0.5 (CVE-2026-26007) - Update python-multipart 0.0.20 -> 0.0.22 (CVE-2026-24486) - Update urllib3 2.6.2 -> 2.6.3 in test lockfiles (CVE-2026-21441) - Update Werkzeug 3.1.4 -> 3.1.5 (CVE-2026-21860) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Address PR review feedback - Replace plano.katanemo.com with planoai.dev in allowed hosts - Add planoai.dev to OG route and blog layout allowlists - Revert socket bind to 0.0.0.0 (intentional for port-in-use check) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-14 12:27:07 -08:00
Adil Hafeez	ba651aaf71	Rename all arch references to plano (#745 ) * Rename all arch references to plano across the codebase Complete rebrand from "Arch"/"archgw" to "Plano" including: - Config files: arch_config_schema.yaml, workflow, demo configs - Environment variables: ARCH_CONFIG_* → PLANO_CONFIG_* - Python CLI: variables, functions, file paths, docker mounts - Rust crates: config paths, log messages, metadata keys - Docker/build: Dockerfile, supervisord, .dockerignore, .gitignore - Docker Compose: volume mounts and env vars across all demos/tests - GitHub workflows: job/step names - Shell scripts: log messages - Demos: Python code, READMEs, VS Code configs, Grafana dashboard - Docs: RST includes, code comments, config references - Package metadata: package.json, pyproject.toml, uv.lock External URLs (docs.archgw.com, github.com/katanemo/archgw) left as-is. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Update remaining arch references in docs - Rename RST cross-reference labels: arch_access_logging, arch_overview_tracing, arch_overview_threading → plano_* - Update label references in request_lifecycle.rst - Rename arch_config_state_storage_example.yaml → plano_config_state_storage_example.yaml - Update config YAML comments: "Arch creates/uses" → "Plano creates/uses" - Update "the Arch gateway" → "the Plano gateway" in configuration_reference.rst - Update arch_config_schema.yaml reference in provider_models.py - Rename arch_agent_router → plano_agent_router in config example Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix remaining arch references found in second pass - config/docker-compose.dev.yaml: ARCH_CONFIG_FILE → PLANO_CONFIG_FILE, arch_config.yaml → plano_config.yaml, archgw_logs → plano_logs - config/test_passthrough.yaml: container mount path - tests/e2e/docker-compose.yaml: source file path (was still arch_config.yaml) - cli/planoai/core.py: comment and log message - crates/brightstaff/src/tracing/constants.rs: doc comment - tests/{e2e,archgw}/common.py: get_arch_messages → get_plano_messages, arch_state/arch_messages variables renamed - tests/{e2e,archgw}/test_prompt_gateway.py: updated imports and usages - demos/shared/test_runner/{common,test_demos}.py: same renames - tests/e2e/test_model_alias_routing.py: docstring - .dockerignore: archgw_modelserver → plano_modelserver - demos/use_cases/claude_code_router/pretty_model_resolution.sh: container name Note: x-arch-* HTTP header values and Rust constant names intentionally preserved for backwards compatibility with existing deployments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 15:16:56 -08:00
Salman Paracha	0557f7ff98	updated the models list to include models like Opus 4.6 (#753 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2026-02-13 15:08:11 -08:00
Musa	e3bf2b7f71	Introduce brand new CLI experience with tracing and quickstart (#724 ) Release hardens tracing and routing: clearer CLI, modular internals, updated demos/docs/tests, and improved multi-agent reliability. Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>	2026-02-10 13:17:43 -08:00
Adil Hafeez	46de89590b	use standard tracing and logging in brightstaff (#721 )	2026-02-09 13:33:27 -08:00
Adil Hafeez	e41aa0a617	upgrade rust to 1.93.0 and fix pre-commit (#720 )	2026-02-02 11:03:12 -08:00
Salman Paracha	2941392ed1	Adding support for wildcard models in the model_providers config (#696 ) * cleaning up plano cli commands * adding support for wildcard model providers * fixing compile errors * fixing bugs related to default model provider, provider hint and duplicates in the model provider list * fixed cargo fmt issues * updating tests to always include the model id * using default for the prompt_gateway path * fixed the model name, as gpt-5-mini-2025-08-07 wasn't in the config * making sure that all aliases and models match the config * fixed the config generator to allow for base_url providers LLMs to include wildcard models * re-ran the models list utility and added a shell script to run it * updating docs to mention wildcard model providers * updated provider_models.json to yaml, added that file to our docs for reference * updating the build docs to use the new root-based build --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2026-01-28 17:47:33 -08:00
Salman Paracha	cdc1d7cee2	making Messages.Content optional, and having the upstream LLM fail if the right fields aren't set (#699 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2026-01-16 16:24:03 -08:00
Adil Hafeez	626f556cc6	reduce number of info statements in pipeline processor (#698 ) Co-authored-by: Adil Hafeez <adil.hafeez10@t-mobile.com>	2026-01-16 15:38:43 -08:00
Tang Quoc Thai	4d53297c17	feat: add passthrough_auth option for forwarding client Authorization header (#687 ) * feat: add passthrough_auth option for forwarding client Authorization header * fix tests * Update comment to reflect upstream forwarding * Apply suggestions from code review --------- Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com> Co-authored-by: Adil Hafeez <adil@katanemo.com>	2026-01-14 15:06:28 -08:00
Adil Hafeez	ab391f96c7	don't include internal models in /v1/models endpoint (#685 )	2026-01-09 16:57:41 -08:00
Adil Hafeez	11fb4cd633	remove unnecessary clones from code (#682 )	2026-01-08 15:11:05 -08:00
Adil Hafeez	78b2ae0cf7	pass request_id in orchestrator and routing model (#678 )	2026-01-07 12:04:10 -08:00
Salman Paracha	b4543ba56c	Introduce signals change (#655 ) * adding support for signals * reducing false positives for signals like positive interaction * adding docs. Still need to fix the messages list, but waiting on PR #621 * Improve frustration detection: normalize contractions and refine punctuation * Further refine test cases with longer messages * minor doc changes * fixing echo statement for build * fixing the messages construction and using the trait for signals * update signals docs * fixed some minor doc changes * added more tests and fixed documentation. PR 100% ready * made fixes based on PR comments * Optimize latency 1. replace sliding window approach with trigram containment check 2. add code to pre-compute ngrams for patterns * removed some debug statements to make tests easier to read * PR comments to make ObservableStreamProcessor accept optional Vec<Messages> * fixed PR comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local> Co-authored-by: MeiyuZhong <mariazhong9612@gmail.com> Co-authored-by: nehcgs <54548843+nehcgs@users.noreply.github.com>	2026-01-07 11:20:44 -08:00
Adil Hafeez	57327ba667	ensure that request id is consistent (#677 ) * ensure that request id is consistent * remove test debug/info statements	2026-01-07 08:44:41 -08:00
Adil Hafeez	ca95ffb63d	cargo clippy (#660 )	2025-12-25 21:08:37 -08:00
Salman Paracha	e224cba3e3	Update docs to Plano (#639 )	2025-12-23 17:14:50 -08:00
Adil Hafeez	15fbb6c3af	plano orchestration using plano orchestration 4b model (#637 )	2025-12-22 18:05:49 -08:00
Salman Paracha	48bbc7cce7	fixed reasoning failures (#634 ) * fixed reasoning failures * adding debugging * made several fixes for transmission issues for SSeEvents, incomplete handling of json types by anthropic, and wrote a bunch of tests * removed debugging from supervisord.conf --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-18 11:02:59 -08:00
Adil Hafeez	2f9121407b	Use mcp tools for filter chain (#621 ) * agents framework demo * more changes * add more changes * pending changes * fix tests * fix more * rebase with main and better handle error from mcp * add trace for filters * add test for client error, server error and for mcp error * update schema validate code and rename kind => type in agent_filter * fix agent description and pre-commit * fix tests * add provider specific request parsing in agents chat * fix precommit and tests * cleanup demo * update readme * fix pre-commit * refactor tracing * fix fmt * fix: handle MessageContent enum in responses API conversion - Update request.rs to handle new MessageContent enum structure from main - MessageContent can now be Text(String) or Items(Vec<InputContent>) - Handle new InputItem variants (ItemReference, FunctionCallOutput) - Fixes compilation error after merging latest main (#632) * address pr feedback * fix span * fix build * update openai version	2025-12-17 17:30:14 -08:00
Shuguang Chen	cb82a83c7b	orchestration integration (#623 ) * orchestration integration * Convert compact json to spaced json	2025-12-17 17:20:19 -08:00
Salman Paracha	d5a273f740	enable state management for v1/responses (#631 ) * first commit with tests to enable state management via memory * fixed logs to follow the conversational flow a bit better * added support for supabase * added the state_storage_v1_responses flag, and use that to store state appropriately * cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo * fixed mixed inputs from openai v1/responses api (#632) * fixed mixed inputs from openai v1/responses api * removing tracing from model-alias-routing * handling additional input types from openairs --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local> * resolving PR comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-17 12:18:38 -08:00
Salman Paracha	33e90dd338	fixed mixed inputs from openai v1/responses api (#632 ) * fixed mixed inputs from openai v1/responses api * removing tracing from model-alias-routing * handling additional input types from openairs --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-16 13:39:13 -08:00
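
Commits c90b699c90 and 321c28da37 above both mention a UTF-8-safe byte-truncation helper (for capping raw error bodies in warn logs at 512 bytes, and for trimming an oversized user message from the end). The log does not show the helper itself, so the following is only a minimal sketch of how such a helper could look using the standard library; the name `truncate_utf8` and its signature are assumptions, not the project's actual code.

```rust
/// Hypothetical helper (name and signature are assumptions, not taken from
/// the plano source): returns the longest prefix of `s` that is at most
/// `max_bytes` bytes long and ends on a UTF-8 character boundary.
fn truncate_utf8(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    // Walk backwards from the byte limit until we land on a character
    // boundary. Index 0 is always a boundary, so the loop terminates.
    let mut end = max_bytes;
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

Slicing a `&str` at an arbitrary byte offset panics when the offset falls inside a multi-byte character, which is why the sketch backs up to the nearest boundary first. Keeping the prefix and dropping the tail matches the behaviour described in 321c28da37, where the beginning of the message is preserved.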


149 commits
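
Commit 980faef6be above describes tenant-scoped cache keys of the form `plano:affinity:{tenant_id}:{session_id}` and the use of `Cow<'_, str>` in `session_key` to avoid an allocation when no tenant is set. A rough sketch of that idea follows. The function signature is an assumption, and so is the choice to return the bare session ID when no tenant is present; the log does not say what the unscoped key looks like.

```rust
use std::borrow::Cow;

/// Hypothetical sketch of a tenant-aware cache key builder. The signature
/// and the unscoped fallback (the bare session id) are assumptions; only
/// the scoped key format comes from the commit message.
fn session_key<'a>(tenant_id: Option<&str>, session_id: &'a str) -> Cow<'a, str> {
    match tenant_id {
        // A tenant is present: build the scoped key, which needs a new String.
        Some(tenant) => Cow::Owned(format!("plano:affinity:{tenant}:{session_id}")),
        // No tenant: borrow the caller's string, so nothing is allocated.
        None => Cow::Borrowed(session_id),
    }
}
```

Returning `Cow` lets callers treat both cases as a `&str` through deref while only the multi-tenant path pays for the allocation.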