Commit graph

35 commits

Author SHA1 Message Date
Adil Hafeez
baab4e793c
add native mode smoke test to CI 2026-03-03 15:15:57 -08:00
Adil Hafeez
53d11ae235
add --docker flag to E2E tests and demo scripts 2026-03-03 15:08:50 -08:00
Adil Hafeez
473996d35d
Overhaul demos directory: cleanup, restructure, and standardize configs (#760) 2026-02-17 03:09:28 -08:00
Adil Hafeez
c3591bcbf3
Upgrade CI, Docker, and demos to Python 3.14 (#759)
Update all GitHub Actions workflows and Dockerfiles to use Python 3.14
as the default version. Remove the upper bound on requires-python in
model_choice_with_test_harness to allow 3.14+. The CLI's
requires-python stays at >=3.10 for broad compatibility.
2026-02-15 10:22:33 -08:00
Adil Hafeez
1df43872a6
Fix code scanning and dependabot security alerts (#756)
* Fix code scanning and dependabot security alerts

Code scanning fixes (14 alerts):
- Fix XSS in OG image route by validating request origin against allowlist
- Fix incomplete URL sanitization in blog layout using exact hostname matching
- Bind port-check socket to 127.0.0.1 instead of 0.0.0.0
- Add explicit permissions to 7 GitHub Actions workflows

Dependabot fixes:
- Update @isaacs/brace-expansion 5.0.0 -> 5.0.1 (CVE-2026-25547)
- Update bytes 1.10.1 -> 1.11.1 (CVE-2026-25541)
- Update time 0.3.41 -> 0.3.47 (CVE-2026-25727)
- Update cryptography 45.0.7 -> 46.0.5 (CVE-2026-26007)
- Update python-multipart 0.0.20 -> 0.0.22 (CVE-2026-24486)
- Update urllib3 2.6.2 -> 2.6.3 in test lockfiles (CVE-2026-21441)
- Update Werkzeug 3.1.4 -> 3.1.5 (CVE-2026-21860)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Address PR review feedback

- Replace plano.katanemo.com with planoai.dev in allowed hosts
- Add planoai.dev to OG route and blog layout allowlists
- Revert socket bind to 0.0.0.0 (intentional for port-in-use check)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 12:27:07 -08:00
Adil Hafeez
ba651aaf71
Rename all arch references to plano (#745)
* Rename all arch references to plano across the codebase

Complete rebrand from "Arch"/"archgw" to "Plano" including:
- Config files: arch_config_schema.yaml, workflow, demo configs
- Environment variables: ARCH_CONFIG_* → PLANO_CONFIG_*
- Python CLI: variables, functions, file paths, docker mounts
- Rust crates: config paths, log messages, metadata keys
- Docker/build: Dockerfile, supervisord, .dockerignore, .gitignore
- Docker Compose: volume mounts and env vars across all demos/tests
- GitHub workflows: job/step names
- Shell scripts: log messages
- Demos: Python code, READMEs, VS Code configs, Grafana dashboard
- Docs: RST includes, code comments, config references
- Package metadata: package.json, pyproject.toml, uv.lock

External URLs (docs.archgw.com, github.com/katanemo/archgw) left as-is.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update remaining arch references in docs

- Rename RST cross-reference labels: arch_access_logging, arch_overview_tracing, arch_overview_threading → plano_*
- Update label references in request_lifecycle.rst
- Rename arch_config_state_storage_example.yaml → plano_config_state_storage_example.yaml
- Update config YAML comments: "Arch creates/uses" → "Plano creates/uses"
- Update "the Arch gateway" → "the Plano gateway" in configuration_reference.rst
- Update arch_config_schema.yaml reference in provider_models.py
- Rename arch_agent_router → plano_agent_router in config example

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix remaining arch references found in second pass

- config/docker-compose.dev.yaml: ARCH_CONFIG_FILE → PLANO_CONFIG_FILE,
  arch_config.yaml → plano_config.yaml, archgw_logs → plano_logs
- config/test_passthrough.yaml: container mount path
- tests/e2e/docker-compose.yaml: source file path (was still arch_config.yaml)
- cli/planoai/core.py: comment and log message
- crates/brightstaff/src/tracing/constants.rs: doc comment
- tests/{e2e,archgw}/common.py: get_arch_messages → get_plano_messages,
  arch_state/arch_messages variables renamed
- tests/{e2e,archgw}/test_prompt_gateway.py: updated imports and usages
- demos/shared/test_runner/{common,test_demos}.py: same renames
- tests/e2e/test_model_alias_routing.py: docstring
- .dockerignore: archgw_modelserver → plano_modelserver
- demos/use_cases/claude_code_router/pretty_model_resolution.sh: container name

Note: x-arch-* HTTP header values and Rust constant names intentionally
preserved for backwards compatibility with existing deployments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 15:16:56 -08:00
Adil Hafeez
5394ef5770
disable bedrock tests (#732) 2026-02-10 00:34:00 -08:00
Adil Hafeez
46de89590b
use standard tracing and logging in brightstaff (#721) 2026-02-09 13:33:27 -08:00
Adil Hafeez
4d9ed74b68
improve e2e tests (#731)
* fix build break

docs build was breaking because requirements file was getting ignored from .dockerignore

* improve e2e tests time

* fix: bump GH Actions to latest versions (checkout@v4, setup-python@v5, build-push-action@v6)

* more improvements

* fix perm

* more improvements

* parallel runs
2026-02-09 13:20:06 -08:00
Salman Paracha
2941392ed1
Adding support for wildcard models in the model_providers config (#696)
* cleaning up plano cli commands

* adding support for wildcard model providers

* fixing compile errors

* fixing bugs related to default model provider, provider hint and duplicates in the model provider list

* fixed cargo fmt issues

* updating tests to always include the model id

* using default for the prompt_gateway path

* fixed the model name, as gpt-5-mini-2025-08-07 wasn't in the config

* making sure that all aliases and models match the config

* fixed the config generator to allow for base_url providers LLMs to include wildcard models

* re-ran the models list utility and added a shell script to run it

* updating docs to mention wildcard model providers

* updated provider_models.json to yaml, added that file to our docs for reference

* updating the build docs to use the new root-based build

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2026-01-28 17:47:33 -08:00
Salman Paracha
cdc1d7cee2
making Messages.Content optional, and having the upstream LLM fail if the right fields aren't set (#699)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2026-01-16 16:24:03 -08:00
Adil Hafeez
053e2b3a74
use uv instead of poetry (#663) 2025-12-26 11:21:42 -08:00
Adil Hafeez
88d14a205b
restructure cli (#656) 2025-12-25 14:55:29 -08:00
Adil Hafeez
e8170f76ca
rename to planoai (#650) 2025-12-23 19:26:51 -08:00
Adil Hafeez
e7ce00b5a7
rename cli to plano (#647) 2025-12-23 18:37:58 -08:00
Salman Paracha
48bbc7cce7
fixed reasoning failures (#634)
* fixed reasoning failures

* adding debugging

* made several fixes for transmission issues for SSeEvents, incomplete handling of json types by anthropic, and wrote a bunch of tests

* removed debugging from supervisord.conf

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-12-18 11:02:59 -08:00
Adil Hafeez
2f9121407b
Use mcp tools for filter chain (#621)
* agents framework demo

* more changes

* add more changes

* pending changes

* fix tests

* fix more

* rebase with main and better handle error from mcp

* add trace for filters

* add test for client error, server error and for mcp error

* update schema validate code and rename kind => type in agent_filter

* fix agent description and pre-commit

* fix tests

* add provider specific request parsing in agents chat

* fix precommit and tests

* cleanup demo

* update readme

* fix pre-commit

* refactor tracing

* fix fmt

* fix: handle MessageContent enum in responses API conversion

- Update request.rs to handle new MessageContent enum structure from main
- MessageContent can now be Text(String) or Items(Vec<InputContent>)
- Handle new InputItem variants (ItemReference, FunctionCallOutput)
- Fixes compilation error after merging latest main (#632)

* address pr feedback

* fix span

* fix build

* update openai version
2025-12-17 17:30:14 -08:00
Salman Paracha
d5a273f740
enable state management for v1/responses (#631)
* first commit with tests to enable state management via memory

* fixed logs to follow the conversational flow a bit better

* added support for supabase

* added the state_storage_v1_responses flag, and use that to store state appropriately

* cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo

* fixed mixed inputs from openai v1/responses api (#632)

* fixed mixed inputs from openai v1/responses api

* removing tracing from model-alias-routing

* handling additional input types from openairs

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>

* resolving PR comments

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-12-17 12:18:38 -08:00
Salman Paracha
33e90dd338
fixed mixed inputs from openai v1/responses api (#632)
* fixed mixed inputs from openai v1/responses api

* removing tracing from model-alias-routing

* handling additional input types from openairs

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-12-16 13:39:13 -08:00
Salman Paracha
a448c6e9cb
Add support for v1/responses API (#622)
* making first commit. still need to work on streaming responses

* making first commit. still need to work on streaming responses

* stream buffer implementation with tests

* adding grok API keys to workflow

* fixed changes based on code review

* adding support for bedrock models

* fixed issues with translation to claude code

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-12-03 14:58:26 -08:00
Salman Paracha
d37af7605c
removing model_server. buh bye (#619) 2025-11-22 15:04:41 -08:00
Salman Paracha
88c2bd1851
removing model_server python module to brightstaff (function calling) (#615)
* adding function_calling functionality via rust

* fixed rendered YAML file

* removed model_server from envoy.template and forwarding traffic to bright_staff

* fixed bugs in function_calling.rs that were breaking tests. All good now

* updating e2e test to clean up disk usage

* removing Arch* models to be used as a default model if one is not specified

* if the user sets arch-function base_url we should honor it

* fixing demos as we needed to pin to a particular version of huggingface_hub else the chatbot ui wouldn't build

* adding a constant for Arch-Function model name

* fixing some edge cases with calls made to Arch-Function

* fixed JSON parsing issues in function_calling.rs

* fixed bug where the raw response from Arch-Function was re-encoded

* removed debug from supervisord.conf

* commenting out disk cleanup

* adding back disk space

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-11-22 12:55:00 -08:00
Salman Paracha
7a6f87de3e
fixed test and docs for deployment (#595)
* fixed test and docs for deployment

* updating the main logo image

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-10-22 14:13:16 -07:00
Salman Paracha
9407ae6af7
Add support for Amazon Bedrock Converse and ConverseStream (#588)
* first commit to get Bedrock Converse API working. Next commit support for streaming and binary frames

* adding translation from BedrockBinaryFrameDecoder to AnthropicMessagesEvent

* Claude Code works with Amazon Bedrock

* added tests for openai streaming from bedrock

* PR comments fixed

* adding support for bedrock in docs as supported provider

* cargo fmt

* reverted to chatgpt models for claude code routing

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>
2025-10-22 11:31:21 -07:00
Salman Paracha
f00870dccb
adding support for claude code routing (#575)
* fixed for claude code routing. first commit

* removing redundant enum tags for cache_control

* making sure that claude code can run via the archgw cli

* fixing broken config

* adding a README.md and updated the cli to use more of our defined patterns for params

* fixed config.yaml

* minor fixes to make sure PR is clean. Ready to ship

* adding claude-sonnet-4-5 to the config

* fixes based on PR

* fixed alias for README

* fixed 400 error handling tests, now that we write temperature to 1.0 for GPT-5

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-257.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-09-29 19:23:08 -07:00
Salman Paracha
03c2cf6f0d
fixed changes related to max_tokens and processing http error codes like 400 properly (#574)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-257.local>
2025-09-25 17:00:37 -07:00
Salman Paracha
4eb2b410c5
adding support for model aliases in archgw (#566)
* adding support for model aliases in archgw

* fixed PR based on feedback

* removing README. Not relevant for PR

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>
2025-09-16 11:12:08 -07:00
Salman Paracha
fb0581fd39
add support for v1/messages and transformations (#558)
* pushing draft PR

* transformations are working. Now need to add some tests next

* updated tests and added necessary response transformations for Anthropic's message response object

* fixed bugs for integration tests

* fixed doc tests

* fixed serialization issues with enums on response

* adding some debug logs to help

* fixed issues with non-streaming responses

* updated the stream_context to update response bytes

* the serialized bytes length must be set in the response side

* fixed the debug statement that was causing the integration tests for wasm to fail

* fixing json parsing errors

* intentionally removing the headers

* making sure that we convert the raw bytes to the correct provider type upstream

* fixing non-streaming responses to transform correctly

* /v1/messages works with transformations to and from /v1/chat/completions

* updating the CLI and demos to support anthropic vs. claude

* adding the anthropic key to the preference based routing tests

* fixed test cases and added more structured logs

* fixed integration tests and cleaned up logs

* added python client tests for anthropic and openai

* cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo

* fixing the tests. python dependency order was broken

* updated the openAI client to fix demos

* removed the raw response debug statement

* fixed the dup cloning issue and cleaned up the ProviderRequestType enum and traits

* fixing logs

* moved away from string literals to consts

* fixed streaming from Anthropic Client to OpenAI

* removed debug statement that would likely trip up integration tests

* fixed integration tests for llm_gateway

* cleaned up test cases and removed unnecessary crates

* fixing comments from PR

* fixed bug whereby we were sending an OpenAIChatCompletions request object to llm_gateway even though the request may have been AnthropicMessages

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-9.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-10.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-41.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>
2025-09-10 07:40:30 -07:00
Adil Hafeez
a7fddf30f9
better model names (#517) 2025-07-11 16:42:16 -07:00
Adil Hafeez
6c53510f49
Introduce hermesllm library to handle llm message translation (#501) 2025-06-10 12:53:27 -07:00
Shuguang Chen
7d4b261a68
Integrate Arch-Function-Chat (#449) 2025-04-15 14:39:12 -07:00
Salman Paracha
f31aa59fac
fixed issue with groq LLMs that require the openai in the /v1/chat/co… (#460)
* fixed issue with groq LLMs that require the openai in the /v1/chat/completions path. My first change

* updated the GH actions with keys for Groq

* adding missing groq API keys

* add llama-3.2-3b-preview to the model based on adding groq to the demo

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2025-04-13 14:00:16 -07:00
Shuguang Chen
e77fc47225
Handle intent matching better in arch gateway (#391) 2025-03-04 12:49:13 -08:00
Salman Paracha
b3c95a6698
refactor demos (#398) 2025-02-07 18:45:42 -08:00
Shuguang Chen
ba7279becb
Use intent model from archfc to pick prompt gateway (#328) 2024-12-20 13:25:01 -08:00