Commit graph

225 commits

Author SHA1 Message Date
Salman Paracha
7cba42f887
fixing the README for multi-agent orchestration (#722)
* fixing the README for multi-agent orchestration

* fixed issues where the multi-intent queries weren't being properly handled by GPT-40

* more fixes for the README.md and tracing visuals

* removed remnant Arch README.md

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2026-02-02 14:35:49 -08:00
Adil Hafeez
d8b4c800e6
release 0.4.4 (#713) 2026-01-28 20:45:10 -08:00
Adil Hafeez
062825f26e
add envoy retries (#712)
* add envoy retries

* add missing file

* fix tests

---------

Co-authored-by: Adil Hafeez <adil.hafeez10@t-mobile.com>
2026-01-28 20:31:01 -08:00
Salman Paracha
2941392ed1
Adding support for wildcard models in the model_providers config (#696)
* cleaning up plano cli commands

* adding support for wildcard model providers

* fixing compile errors

* fixing bugs related to default model provider, provider hint and duplicates in the model provider list

* fixed cargo fmt issues

* updating tests to always include the model id

* using default for the prompt_gateway path

* fixed the model name, as gpt-5-mini-2025-08-07 wasn't in the config

* making sure that all aliases and models match the config

* fixed the config generator to allow for base_url providers LLMs to include wildcard models

* re-ran the models list utility and added a shell script to run it

* updating docs to mention wildcard model providers

* updated provider_models.json to yaml, added that file to our docs for reference

* updating the build docs to use the new root-based build

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2026-01-28 17:47:33 -08:00
Adil Hafeez
da5cbc29b7
release 0.4.3 (#701) 2026-01-18 00:07:46 -08:00
Musa
f2141cbdcb
demo: add multi-framework agent demo (#688) 2026-01-17 15:39:06 -08:00
MeiyuZhong
6a29f8398f
http-filter: add fully http based demo (remove MCP) (#681)
* http-filter: add fully http based demo (remove MCP)

* Fix pre-commit formatting

* add instructions about uv build/sync in cli (#675)

* Musa/demo fix (#676)

* fix demo with travel agent

* Update .gitignore

* remove sse chunk rendering

* ensure that request id is consistent (#677)

* ensure that request id is consistent

* remove test debug/info statements

* Introduce signals change (#655)

* adding support for signals

* reducing false positives for signals like positive interaction

* adding docs. Still need to fix the messages list, but waiting on PR #621

* Improve frustration detection: normalize contractions and refine punctuation

* Further refine test cases with longer messages

* minor doc changes

* fixing echo statement for build

* fixing the messages construction and using the trait for signals

* update signals docs

* fixed some minor doc changes

* added more tests and fixed docuemtnation. PR 100% ready

* made fixes based on PR comments

* Optimize latency

1. replace sliding window approach with trigram containment check
2. add code to pre-compute ngrams for patterns

* removed some debug statements to make tests easier to read

* PR comments to make ObservableStreamProcessor accept optonal Vec<Messagges>

* fixed PR comments

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
Co-authored-by: MeiyuZhong <mariazhong9612@gmail.com>
Co-authored-by: nehcgs <54548843+nehcgs@users.noreply.github.com>

* pass request_id in orchestrator and routing model (#678)

* release 0.4.2 (#679)

* tweaks to web and docs to align to 0.4.2 (#680)

* tweaks to web and docs to align to 0.4.2

* made our release banner clickable

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>

* Added request id across agents

* fix http_filter agent: request id + pre-commit

---------

Co-authored-by: Adil Hafeez <adil@katanemo.com>
Co-authored-by: Musa <malikmusa1323@gmail.com>
Co-authored-by: Salman Paracha <salman.paracha@gmail.com>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
Co-authored-by: nehcgs <54548843+nehcgs@users.noreply.github.com>
2026-01-13 13:51:12 -08:00
Adil Hafeez
b7fba7a97f
release 0.4.2 (#679) 2026-01-07 13:02:06 -08:00
Adil Hafeez
78b2ae0cf7
pass request_id in orchestrator and routing model (#678) 2026-01-07 12:04:10 -08:00
Adil Hafeez
57327ba667
ensure that request id is consistent (#677)
* ensure that request id is consistent

* remove test debug/info statements
2026-01-07 08:44:41 -08:00
Musa
b45c7aba86
Musa/demo fix (#676)
* fix demo with travel agent

* Update .gitignore

* remove sse chunk rendering
2026-01-06 14:32:06 -08:00
Adil Hafeez
41aa4abaeb
release 0.4.1 (#670) 2026-01-01 23:39:18 -08:00
Adil Hafeez
77cdc7f6ef
Revert "release 0.4.1 (#666)" (#669)
This reverts commit 77df5160d8.
2025-12-30 15:28:30 -08:00
Adil Hafeez
77df5160d8
release 0.4.1 (#666) 2025-12-28 14:29:19 -08:00
Adil Hafeez
053e2b3a74
use uv instead of poetry (#663) 2025-12-26 11:21:42 -08:00
Adil Hafeez
88d14a205b
restructure cli (#656) 2025-12-25 14:55:29 -08:00
Adil Hafeez
a56bb9d190
add open-web-ui-ref (#653) 2025-12-24 10:44:41 -08:00
Adil Hafeez
911a799bc6
update mcp_filter docs and talk about docker build and jaeger ui (#652) 2025-12-24 10:38:08 -08:00
Adil Hafeez
9366070d76
rename plano => planoai 2025-12-23 19:31:43 -08:00
Adil Hafeez
e8170f76ca
rename to planoai (#650) 2025-12-23 19:26:51 -08:00
Adil Hafeez
e7ce00b5a7
rename cli to plano (#647) 2025-12-23 18:37:58 -08:00
Salman Paracha
e224cba3e3
Update docs to Plano (#639) 2025-12-23 17:14:50 -08:00
Adil Hafeez
15fbb6c3af
plano orchestration using plano orchestration 4b model (#637) 2025-12-22 18:05:49 -08:00
Adil Hafeez
2f9121407b
Use mcp tools for filter chain (#621)
* agents framework demo

* more changes

* add more changes

* pending changes

* fix tests

* fix more

* rebase with main and better handle error from mcp

* add trace for filters

* add test for client error, server error and for mcp error

* update schema validate code and rename kind => type in agent_filter

* fix agent description and pre-commit

* fix tests

* add provider specific request parsing in agents chat

* fix precommit and tests

* cleanup demo

* update readme

* fix pre-commit

* refactor tracing

* fix fmt

* fix: handle MessageContent enum in responses API conversion

- Update request.rs to handle new MessageContent enum structure from main
- MessageContent can now be Text(String) or Items(Vec<InputContent>)
- Handle new InputItem variants (ItemReference, FunctionCallOutput)
- Fixes compilation error after merging latest main (#632)

* address pr feedback

* fix span

* fix build

* update openai version
2025-12-17 17:30:14 -08:00
Salman Paracha
33e90dd338
fixed mixed inputs from openai v1/responses api (#632)
* fixed mixed inputs from openai v1/responses api

* removing tracing from model-alias-rouing

* handling additional input types from openairs

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-12-16 13:39:13 -08:00
Salman Paracha
a79f55f313
Improve end to end tracing (#628)
* adding canonical tracing support via bright-staff

* improved formatting for tools in the traces

* removing anthropic from the currency exchange demo

* using Envoy to transport traces, not calling OTEL directly

* moving otel collcetor cluster outside tracing if/else

* minor fixes to not write to the OTEL collector if tracing is disabled

* fixed PR comments and added more trace attributes

* more fixes based on PR comments

* more clean up based on PR comments

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-12-11 15:21:57 -08:00
Adil Hafeez
8adb9795d8
release 0.3.22 (#629) 2025-12-11 11:20:19 -08:00
Adil Hafeez
09c0b999b2
release 0.3.21 (#626) 2025-12-03 17:12:34 -08:00
Salman Paracha
a448c6e9cb
Add support for v1/responses API (#622)
* making first commit. still need to work on streaming respones

* making first commit. still need to work on streaming respones

* stream buffer implementation with tests

* adding grok API keys to workflow

* fixed changes based on code review

* adding support for bedrock models

* fixed issues with translation to claude code

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-12-03 14:58:26 -08:00
Adil Hafeez
b01a81927d
release 0.3.20 (#620) 2025-11-22 19:29:04 -08:00
Salman Paracha
d37af7605c
removing model_server. buh bye (#619) 2025-11-22 15:04:41 -08:00
Salman Paracha
88c2bd1851
removing model_server python module to brightstaff (function calling) (#615)
* adding function_calling functionality via rust

* fixed rendered YAML file

* removed model_server from envoy.template and forwarding traffic to bright_staff

* fixed bugs in function_calling.rs that were breaking tests. All good now

* updating e2e test to clean up disk usage

* removing Arch* models to be used as a default model if one is not specified

* if the user sets arch-function base_url we should honor it

* fixing demos as we needed to pin to a particular version of huggingface_hub else the chatbot ui wouldn't build

* adding a constant for Arch-Function model name

* fixing some edge cases with calls made to Arch-Function

* fixed JSON parsing issues in function_calling.rs

* fixed bug where the raw response from Arch-Function was re-encoded

* removed debug from supervisord.conf

* commenting out disk cleanup

* adding back disk space

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-11-22 12:55:00 -08:00
Adil Hafeez
126b029345
release 0.3.18 (#611) 2025-10-31 12:24:49 -07:00
Salman Paracha
cdfcfb9169
support base_url path for model providers (#608)
* adding support for base_url

* updated docs

* fixed tests for config generator

* making fixes based on PR comments

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-10-29 17:08:07 -07:00
Salman Paracha
5108013df4
fixing a bug where by we were writing the cluster_name for an upstream LLM more than once (#607)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-10-27 17:01:59 -07:00
Adil Hafeez
f26bb05d35
release 0.3.17 (#604) 2025-10-24 17:52:15 -07:00
Salman Paracha
566e7b9c09
fixed bug in Bedrock translation code and dramatically improved tracing for outbound LLM traffic (#601)
* dramatically improve LLM traces and fixed bug with Bedrock translation from claude code

* addressing comments

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-10-24 14:07:05 -07:00
Adil Hafeez
0ee0912a73
fix config generator bug (#599) 2025-10-23 18:50:23 -07:00
Adil Hafeez
6d70545459
release 0.3.16 (#596) 2025-10-22 14:43:33 -07:00
Salman Paracha
9407ae6af7
Add support for Amazon Bedrock Converse and ConverseStream (#588)
* first commit to get Bedrock Converse API working. Next commit support for streaming and binary frames

* adding translation from BedrockBinaryFrameDecoder to AnthropicMessagesEvent

* Claude Code works with Amazon Bedrock

* added tests for openai streaming from bedrock

* PR comments fixed

* adding support for bedrock in docs as supported provider

* cargo fmt

* revertted to chatgpt models for claude code routing

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>
2025-10-22 11:31:21 -07:00
Adil Hafeez
96e0732089
add support for agents (#564) 2025-10-14 14:01:11 -07:00
Salman Paracha
dbeaa51aa7
renaming branch (#582)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-10-01 08:20:16 -07:00
Adil Hafeez
cd563c2706
release 0.3.15 (#579) 2025-09-30 13:44:11 -07:00
Adil Hafeez
7df1b8cdb0
release 0.3.14 (#577) 2025-09-29 23:11:43 -07:00
Salman Paracha
cf23aefddd
fixing README for claude code and adding a helper script to show model selection (#576)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-09-29 21:20:52 -07:00
Salman Paracha
f00870dccb
adding support for claude code routing (#575)
* fixed for claude code routing. first commit

* removing redundant enum tags for cache_control

* making sure that claude code can run via the archgw cli

* fixing broken config

* adding a README.md and updated the cli to use more of our defined patterns for params

* fixed config.yaml

* minor fixes to make sure PR is clean. Ready to ship

* adding claude-sonnet-4-5 to the config

* fixes based on PR

* fixed alias for README

* fixed 400 error handling tests, now that we write temperature to 1.0 for GPT-5

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-257.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-09-29 19:23:08 -07:00
Salman Paracha
03c2cf6f0d
fixed changes related to max_tokens and processing http error codes like 400 properly (#574)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-257.local>
2025-09-25 17:00:37 -07:00
Adil Hafeez
7ce8d44d8e
release 0.3.13 (#572) 2025-09-19 11:26:49 -07:00
Salman Paracha
fbe82351c0
Salmanap/fix docs new providers model alias (#571)
* fixed docs and added ollama as a first-class LLM provider

* matching the LLM routing section on the README.md to the docs

* updated the section on preference-based routing

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>
2025-09-19 10:19:57 -07:00
Salman Paracha
8d0b468345
draft commit to add support for xAI, TogehterAI, AzureOpenAI (#570)
* draft commit to add support for xAI, LambdaAI, TogehterAI, AzureOpenAI

* fixing failing tests and updating rederend config file

* Update arch_config_with_aliases.yaml

* adding the AZURE_API_KEY to the GH workflow for e2e

* fixing GH secerts

* adding valdiating for azure_openai

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>
2025-09-18 18:36:30 -07:00