Commit graph

145 commits

Author SHA1 Message Date
Adil Hafeez
4845d83100 fix: address review findings from refactoring PR
- Replace unreachable!() with proper error return in orchestrator agent chain
- Remove incorrect #[allow(dead_code)] on routing_provider_name
- Change SerError alias to _ (trait import for method resolution only)
- Remove dead commented-out code in pipeline.rs
- Replace unwrap()s with expect/if-let in LLM handler filter paths
- Make find_listener synchronous (no await needed)
- Unify message truncation logic via streaming::truncate_message

Made-with: Cursor
2026-03-18 18:26:05 -07:00
Adil Hafeez
8ed4b36087 Merge remote-tracking branch 'origin/main' into adil/refactor_brightstaff
Made-with: Cursor

# Conflicts:
#	crates/brightstaff/src/handlers/agents/orchestrator.rs
#	crates/brightstaff/src/handlers/agents/pipeline.rs
#	crates/brightstaff/src/handlers/llm.rs
#	crates/brightstaff/src/main.rs
2026-03-18 18:14:38 -07:00
Adil Hafeez
c7d8ba7556 Merge remote-tracking branch 'origin/main' into adil/refactor_brightstaff
Made-with: Cursor

# Conflicts:
#	crates/brightstaff/src/main.rs
#	crates/brightstaff/src/router/plano_orchestrator.rs
2026-03-18 17:59:20 -07:00
Adil Hafeez
1f23c573bf
add output filter chain (#822) 2026-03-18 17:58:20 -07:00
Salman Paracha
4bb5c6404f
adding new supported models to plano (#829)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-389.local>
2026-03-15 12:37:20 -07:00
Adil Hafeez
bc059aed4d
Unified overrides for custom router and orchestrator models (#820)
* support configurable orchestrator model via orchestration config section

* add self-hosting docs and demo for Plano-Orchestrator

* list all Plano-Orchestrator model variants in docs

* use overrides for custom routing and orchestration model

* update docs

* update orchestrator model name

* rename arch provider to plano, use llm_routing_model and agent_orchestration_model

* regenerate rendered config reference
2026-03-15 09:36:11 -07:00
Adil Hafeez
084f23744a fix: restore config load log and state merge JSON payload 2026-03-11 19:16:10 +00:00
Adil Hafeez
bcb7f60005 merge main and resolve conflicts 2026-03-11 18:57:36 +00:00
Musa
6610097659
Support for Codex via Plano (#808)
* Add Codex CLI support; xAI response improvements

* Add native Plano running check and update CLI agent error handling

* adding PR suggestions for transformations and code quality

* message extraction logic in ResponsesAPIRequest

* xAI support for Responses API by routing to native endpoint + refactor code
2026-03-10 20:54:14 -07:00
Adil Hafeez
97b7a390ef
support inline routing_policy in request body (#811) (#815) 2026-03-10 12:23:18 -07:00
Adil Hafeez
028a2cd196
add routing service (#814)
fixes https://github.com/katanemo/plano/issues/810
2026-03-09 16:32:16 -07:00
Adil Hafeez
dd74df6ca8 refactor: decompose orchestrator handler, deduplicate headers, fix unwraps 2026-03-06 23:04:35 +00:00
Adil Hafeez
2c7d3a9c6c fix: cargo fmt 2026-03-06 14:54:02 +00:00
Adil Hafeez
a0513fe191 fix: restore span attributes, OTEL_SERVICE_NAME, config path, and error status 2026-03-06 14:48:32 +00:00
Adil Hafeez
9748cdd857 fix: use correct PLANO_CONFIG_PATH_RENDERED env var name 2026-03-06 14:22:06 +00:00
Adil Hafeez
96a2ebf37a fix: add missing ResponseHandler methods and use BrightStaffError 2026-03-06 14:10:35 +00:00
Adil Hafeez
66e55e1621 refactor: pass AppState to handlers, add shared HTTP client, fix remaining unwraps
- Pass Arc<AppState> directly to llm_chat and agent_chat instead of
  destructuring into individual parameters
- Add shared reqwest::Client to AppState for connection pooling on
  upstream LLM requests
- Fix unwrap panics in pipeline.rs: get_new_session_id now returns
  Result, invoke_agent to_bytes properly handled
- Fix unwrap panics in orchestrator.rs: strip_prefix and pop
- Fix unwrap panics in response.rs: SSE parsing no longer panics
- Fix unwrap panics in router services: serialization errors propagated
- Convert old-style string-format debug log in state/mod.rs to
  structured tracing fields

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 12:54:00 +00:00
Adil Hafeez
6fcecb60c3 fix: restore LLM span recordings and replace unwrap panics with proper errors
- Restore llm.model, llm.tools, llm.user_message_preview, llm.temperature
  span field recordings that were lost during refactor
- Replace agents.as_ref().unwrap() and agent_map.get().unwrap() in
  orchestrator with proper error returns
- Replace from_endpoint().unwrap() with ok_or_else returning 400
- Replace to_bytes().unwrap() with match returning 500

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 12:54:00 +00:00
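
Note on 6fcecb60c3: the unwrap replacements listed above follow a common Rust pattern, sketched here with the standard library only. This is an illustration, not the code from the commit. `HandlerError`, `find_agent`, and `body_to_string` are hypothetical names, and the status codes are taken from the bullets only where they state one (400 for the endpoint lookup, 500 for the body read); the 400 on the agent lookup is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical error type carrying an HTTP-style status code.
#[derive(Debug, PartialEq)]
pub struct HandlerError {
    pub status: u16,
    pub message: String,
}

/// Look up an agent endpoint by name without panicking.
/// A missing agent map or an unknown name becomes an error value
/// (status 400 here is an assumption) instead of an `unwrap()` panic.
pub fn find_agent<'a>(
    agents: Option<&'a HashMap<String, String>>,
    name: &str,
) -> Result<&'a String, HandlerError> {
    let map = agents.ok_or_else(|| HandlerError {
        status: 400,
        message: "no agents configured".to_string(),
    })?;
    map.get(name).ok_or_else(|| HandlerError {
        status: 400,
        message: format!("unknown agent: {name}"),
    })
}

/// Turn a fallible body read into a 500-style error with `match`,
/// mirroring the "to_bytes().unwrap() with match returning 500" bullet.
pub fn body_to_string(raw: Result<Vec<u8>, String>) -> Result<String, HandlerError> {
    match raw {
        Ok(bytes) => Ok(String::from_utf8_lossy(&bytes).into_owned()),
        Err(e) => Err(HandlerError {
            status: 500,
            message: format!("failed to read body: {e}"),
        }),
    }
}
```

Callers can then propagate the error with `?` and map `status` onto the HTTP response at a single place, rather than crashing the request task.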
Adil Hafeez
ebe0e18e3b more cleanups 2026-03-06 12:54:00 +00:00
Adil Hafeez
3fdd8a3a35 refactor brightstaff 2026-03-06 12:54:00 +00:00
Musa
2bde21ff57
add Custom Trace Attributes to extend observability (#708)
* add custom trace attributes

* refactor: prefix custom trace attributes and update schema handlers tests configs

* refactor: rename custom_attribute_prefixes to span_attribute_header_prefixes in configuration and related handlers

* docs: add section on custom span attributes

* refactor: update tracing configuration to use span attributes and adjust related handlers

* docs: custom span attributes section to include static attributes and clarify configuration

* refactor: remove TraceCollector usage and enhance logging with structured attributes

* refactor: custom trace attribute extraction to improve clarity

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-25 16:27:20 -08:00
Syed A. Hashmi
54bc8e5e52
[ISSUE 706]: Standardize returned errors from Plano (#772)
* [ISSUE 706]: Standardize returned errors from Plano

* Standardized errors in chat completion
2026-02-24 14:34:33 -08:00
Adil Hafeez
baeee56f6b
Make model field optional in request types, resolve from default provider (#768) 2026-02-18 04:43:59 -08:00
Adil Hafeez
1df43872a6
Fix code scanning and dependabot security alerts (#756)
* Fix code scanning and dependabot security alerts

Code scanning fixes (14 alerts):
- Fix XSS in OG image route by validating request origin against allowlist
- Fix incomplete URL sanitization in blog layout using exact hostname matching
- Bind port-check socket to 127.0.0.1 instead of 0.0.0.0
- Add explicit permissions to 7 GitHub Actions workflows

Dependabot fixes:
- Update @isaacs/brace-expansion 5.0.0 -> 5.0.1 (CVE-2026-25547)
- Update bytes 1.10.1 -> 1.11.1 (CVE-2026-25541)
- Update time 0.3.41 -> 0.3.47 (CVE-2026-25727)
- Update cryptography 45.0.7 -> 46.0.5 (CVE-2026-26007)
- Update python-multipart 0.0.20 -> 0.0.22 (CVE-2026-24486)
- Update urllib3 2.6.2 -> 2.6.3 in test lockfiles (CVE-2026-21441)
- Update Werkzeug 3.1.4 -> 3.1.5 (CVE-2026-21860)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Address PR review feedback

- Replace plano.katanemo.com with planoai.dev in allowed hosts
- Add planoai.dev to OG route and blog layout allowlists
- Revert socket bind to 0.0.0.0 (intentional for port-in-use check)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 12:27:07 -08:00
Adil Hafeez
ba651aaf71
Rename all arch references to plano (#745)
* Rename all arch references to plano across the codebase

Complete rebrand from "Arch"/"archgw" to "Plano" including:
- Config files: arch_config_schema.yaml, workflow, demo configs
- Environment variables: ARCH_CONFIG_* → PLANO_CONFIG_*
- Python CLI: variables, functions, file paths, docker mounts
- Rust crates: config paths, log messages, metadata keys
- Docker/build: Dockerfile, supervisord, .dockerignore, .gitignore
- Docker Compose: volume mounts and env vars across all demos/tests
- GitHub workflows: job/step names
- Shell scripts: log messages
- Demos: Python code, READMEs, VS Code configs, Grafana dashboard
- Docs: RST includes, code comments, config references
- Package metadata: package.json, pyproject.toml, uv.lock

External URLs (docs.archgw.com, github.com/katanemo/archgw) left as-is.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update remaining arch references in docs

- Rename RST cross-reference labels: arch_access_logging, arch_overview_tracing, arch_overview_threading → plano_*
- Update label references in request_lifecycle.rst
- Rename arch_config_state_storage_example.yaml → plano_config_state_storage_example.yaml
- Update config YAML comments: "Arch creates/uses" → "Plano creates/uses"
- Update "the Arch gateway" → "the Plano gateway" in configuration_reference.rst
- Update arch_config_schema.yaml reference in provider_models.py
- Rename arch_agent_router → plano_agent_router in config example

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix remaining arch references found in second pass

- config/docker-compose.dev.yaml: ARCH_CONFIG_FILE → PLANO_CONFIG_FILE,
  arch_config.yaml → plano_config.yaml, archgw_logs → plano_logs
- config/test_passthrough.yaml: container mount path
- tests/e2e/docker-compose.yaml: source file path (was still arch_config.yaml)
- cli/planoai/core.py: comment and log message
- crates/brightstaff/src/tracing/constants.rs: doc comment
- tests/{e2e,archgw}/common.py: get_arch_messages → get_plano_messages,
  arch_state/arch_messages variables renamed
- tests/{e2e,archgw}/test_prompt_gateway.py: updated imports and usages
- demos/shared/test_runner/{common,test_demos}.py: same renames
- tests/e2e/test_model_alias_routing.py: docstring
- .dockerignore: archgw_modelserver → plano_modelserver
- demos/use_cases/claude_code_router/pretty_model_resolution.sh: container name

Note: x-arch-* HTTP header values and Rust constant names intentionally
preserved for backwards compatibility with existing deployments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 15:16:56 -08:00
Salman Paracha
0557f7ff98
updated the models list to include models like Opus 4.6 (#753)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2026-02-13 15:08:11 -08:00
Musa
e3bf2b7f71
Introduce brand new CLI experience with tracing and quickstart (#724)
Release hardens tracing and routing: clearer CLI, modular internals, updated demos/docs/tests, and improved multi-agent reliability.

Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>
2026-02-10 13:17:43 -08:00
Adil Hafeez
46de89590b
use standard tracing and logging in brightstaff (#721) 2026-02-09 13:33:27 -08:00
Adil Hafeez
e41aa0a617
upgrade rust to 1.93.0 and fix pre-commit (#720) 2026-02-02 11:03:12 -08:00
Salman Paracha
2941392ed1
Adding support for wildcard models in the model_providers config (#696)
* cleaning up plano cli commands

* adding support for wildcard model providers

* fixing compile errors

* fixing bugs related to default model provider, provider hint and duplicates in the model provider list

* fixed cargo fmt issues

* updating tests to always include the model id

* using default for the prompt_gateway path

* fixed the model name, as gpt-5-mini-2025-08-07 wasn't in the config

* making sure that all aliases and models match the config

* fixed the config generator to allow for base_url providers LLMs to include wildcard models

* re-ran the models list utility and added a shell script to run it

* updating docs to mention wildcard model providers

* updated provider_models.json to yaml, added that file to our docs for reference

* updating the build docs to use the new root-based build

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2026-01-28 17:47:33 -08:00
Salman Paracha
cdc1d7cee2
making Messages.Content optional, and having the upstream LLM fail if the right fields aren't set (#699)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2026-01-16 16:24:03 -08:00
Adil Hafeez
626f556cc6
reduce number of info statements in pipeline processor (#698)
Co-authored-by: Adil Hafeez <adil.hafeez10@t-mobile.com>
2026-01-16 15:38:43 -08:00
Tang Quoc Thai
4d53297c17
feat: add passthrough_auth option for forwarding client Authorization header (#687)
* feat: add passthrough_auth option for forwarding client Authorization header

* fix tests

* Update comment to reflect upstream forwarding

* Apply suggestions from code review

---------

Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2026-01-14 15:06:28 -08:00
Adil Hafeez
ab391f96c7
don't include internal models in /v1/models endpoint (#685) 2026-01-09 16:57:41 -08:00
Adil Hafeez
11fb4cd633
remove unnecessary clones from code (#682) 2026-01-08 15:11:05 -08:00
Adil Hafeez
78b2ae0cf7
pass request_id in orchestrator and routing model (#678) 2026-01-07 12:04:10 -08:00
Salman Paracha
b4543ba56c
Introduce signals change (#655)
* adding support for signals

* reducing false positives for signals like positive interaction

* adding docs. Still need to fix the messages list, but waiting on PR #621

* Improve frustration detection: normalize contractions and refine punctuation

* Further refine test cases with longer messages

* minor doc changes

* fixing echo statement for build

* fixing the messages construction and using the trait for signals

* update signals docs

* fixed some minor doc changes

* added more tests and fixed documentation. PR 100% ready

* made fixes based on PR comments

* Optimize latency

1. replace sliding window approach with trigram containment check
2. add code to pre-compute ngrams for patterns

* removed some debug statements to make tests easier to read

* PR comments to make ObservableStreamProcessor accept optional Vec<Messages>

* fixed PR comments

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
Co-authored-by: MeiyuZhong <mariazhong9612@gmail.com>
Co-authored-by: nehcgs <54548843+nehcgs@users.noreply.github.com>
2026-01-07 11:20:44 -08:00
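
Note on b4543ba56c: the "Optimize latency" bullets mention a trigram containment check with pre-computed n-grams for patterns. A minimal sketch of that idea follows. It assumes character trigrams over lowercased text, which the log does not confirm. `trigrams`, `Pattern`, and `containment` are hypothetical names, and any threshold applied to the score would also be an assumption.

```rust
use std::collections::HashSet;

/// Character trigrams of a lowercased string (assumed tokenization).
pub fn trigrams(text: &str) -> HashSet<String> {
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    chars
        .windows(3)
        .map(|w| w.iter().collect::<String>())
        .collect()
}

/// A signal pattern whose trigrams are computed once, at construction.
pub struct Pattern {
    pub text: String,
    grams: HashSet<String>,
}

impl Pattern {
    pub fn new(text: &str) -> Self {
        Pattern {
            text: text.to_string(),
            grams: trigrams(text),
        }
    }

    /// Fraction of the pattern's trigrams that also occur in the message.
    /// Returns 0.0 for patterns shorter than three characters.
    pub fn containment(&self, message_grams: &HashSet<String>) -> f64 {
        if self.grams.is_empty() {
            return 0.0;
        }
        let hits = self
            .grams
            .iter()
            .filter(|g| message_grams.contains(*g))
            .count();
        hits as f64 / self.grams.len() as f64
    }
}
```

The message's trigram set is built once per message and reused across all patterns, so each pattern check is a series of hash lookups rather than a scan of the message with a sliding window.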
Adil Hafeez
57327ba667
ensure that request id is consistent (#677)
* ensure that request id is consistent

* remove test debug/info statements
2026-01-07 08:44:41 -08:00
Adil Hafeez
ca95ffb63d
cargo clippy (#660) 2025-12-25 21:08:37 -08:00
Salman Paracha
e224cba3e3
Update docs to Plano (#639) 2025-12-23 17:14:50 -08:00
Adil Hafeez
15fbb6c3af
plano orchestration using plano orchestration 4b model (#637) 2025-12-22 18:05:49 -08:00
Salman Paracha
48bbc7cce7
fixed reasoning failures (#634)
* fixed reasoning failures

* adding debugging

* made several fixes for transmission issues for SSeEvents, incomplete handling of json types by anthropic, and wrote a bunch of tests

* removed debugging from supervisord.conf

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-12-18 11:02:59 -08:00
Adil Hafeez
2f9121407b
Use mcp tools for filter chain (#621)
* agents framework demo

* more changes

* add more changes

* pending changes

* fix tests

* fix more

* rebase with main and better handle error from mcp

* add trace for filters

* add test for client error, server error and for mcp error

* update schema validate code and rename kind => type in agent_filter

* fix agent description and pre-commit

* fix tests

* add provider specific request parsing in agents chat

* fix precommit and tests

* cleanup demo

* update readme

* fix pre-commit

* refactor tracing

* fix fmt

* fix: handle MessageContent enum in responses API conversion

- Update request.rs to handle new MessageContent enum structure from main
- MessageContent can now be Text(String) or Items(Vec<InputContent>)
- Handle new InputItem variants (ItemReference, FunctionCallOutput)
- Fixes compilation error after merging latest main (#632)

* address pr feedback

* fix span

* fix build

* update openai version
2025-12-17 17:30:14 -08:00
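
Note on 2f9121407b: the fix bullets say that MessageContent can be Text(String) or Items(Vec<InputContent>). The sketch below shows how a conversion might handle both shapes. Only the two MessageContent variants come from the log. The `InputContent` variants shown and the `extract_text` helper are hypothetical stand-ins, since the real definitions are not shown here.

```rust
/// Hypothetical stand-in for the real InputContent type.
#[derive(Debug, Clone, PartialEq)]
pub enum InputContent {
    /// Assumed text-bearing variant.
    InputText { text: String },
    /// Placeholder for any non-text item.
    Other,
}

/// The two shapes named in the commit message.
#[derive(Debug, Clone, PartialEq)]
pub enum MessageContent {
    Text(String),
    Items(Vec<InputContent>),
}

/// Flatten either shape to plain text, skipping non-text items.
pub fn extract_text(content: &MessageContent) -> String {
    match content {
        MessageContent::Text(s) => s.clone(),
        MessageContent::Items(items) => items
            .iter()
            .filter_map(|item| match item {
                InputContent::InputText { text } => Some(text.as_str()),
                InputContent::Other => None,
            })
            .collect::<Vec<_>>()
            .join("\n"),
    }
}
```

Matching exhaustively on the enum means that adding a variant later produces a compile error at every conversion site, which is how a change like this tends to surface after a merge.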
Shuguang Chen
cb82a83c7b
orchestration integration (#623)
* orchestration integration

* Convert compact json to spaced json
2025-12-17 17:20:19 -08:00
Salman Paracha
d5a273f740
enable state management for v1/responses (#631)
* first commit with tests to enable state management via memory

* fixed logs to follow the conversational flow a bit better

* added support for supabase

* added the state_storage_v1_responses flag, and use that to store state appropriately

* cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo

* fixed mixed inputs from openai v1/responses api (#632)

* fixed mixed inputs from openai v1/responses api

* removing tracing from model-alias-routing

* handling additional input types from openairs

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>

* resolving PR comments

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-12-17 12:18:38 -08:00
Salman Paracha
33e90dd338
fixed mixed inputs from openai v1/responses api (#632)
* fixed mixed inputs from openai v1/responses api

* removing tracing from model-alias-routing

* handling additional input types from openairs

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-12-16 13:39:13 -08:00
Salman Paracha
a79f55f313
Improve end to end tracing (#628)
* adding canonical tracing support via bright-staff

* improved formatting for tools in the traces

* removing anthropic from the currency exchange demo

* using Envoy to transport traces, not calling OTEL directly

* moving otel collector cluster outside tracing if/else

* minor fixes to not write to the OTEL collector if tracing is disabled

* fixed PR comments and added more trace attributes

* more fixes based on PR comments

* more clean up based on PR comments

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-12-11 15:21:57 -08:00
Adil Hafeez
367f48bf1e
handle agent error better (#627) 2025-12-10 11:20:00 -08:00
Salman Paracha
a448c6e9cb
Add support for v1/responses API (#622)
* making first commit. still need to work on streaming responses

* stream buffer implementation with tests

* adding grok API keys to workflow

* fixed changes based on code review

* adding support for bedrock models

* fixed issues with translation to claude code

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-12-03 14:58:26 -08:00
Salman Paracha
88c2bd1851
removing model_server python module to brightstaff (function calling) (#615)
* adding function_calling functionality via rust

* fixed rendered YAML file

* removed model_server from envoy.template and forwarding traffic to bright_staff

* fixed bugs in function_calling.rs that were breaking tests. All good now

* updating e2e test to clean up disk usage

* removing Arch* models to be used as a default model if one is not specified

* if the user sets arch-function base_url we should honor it

* fixing demos as we needed to pin to a particular version of huggingface_hub else the chatbot ui wouldn't build

* adding a constant for Arch-Function model name

* fixing some edge cases with calls made to Arch-Function

* fixed JSON parsing issues in function_calling.rs

* fixed bug where the raw response from Arch-Function was re-encoded

* removed debug from supervisord.conf

* commenting out disk cleanup

* adding back disk space

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-11-22 12:55:00 -08:00