40 commits

Author SHA1 Message Date
Syed Hashmi
cea43c5da5
docs: address signals flywheel review feedback
Addresses review comments on #910:

- Shorten the paper citation to (Chen et al., 2026) per common cite
  practice (replacing the full author list form).
- Replace the Why Signals Matter section with the review-suggested
  rewrite verbatim: more formal intro framing, renumbered steps to
  Instrument / Sample & triage / Data Construction / Model Optimization
  / Deploy, removes 'routing decisions' from the data-construction
  step, and adds DPO/RLHF/SFT as model-optimization examples.
- Renders tau and O(messages) as proper math glyphs via the sphinx
  built-in :math: role (enabled by adding sphinx.ext.mathjax to
  conf.py). Using the RST role form rather than raw $...$ inline so
  sphinx only injects MathJax on pages that actually have math,
  instead of loading ~1MB of JS on every page.

Build verified locally: sphinx-build produces no warnings on the
changed files and the rendered HTML wraps tau and O(messages) in
MathJax-ready <span class="math">\(\tau\)</span> containers.

Made-with: Cursor
2026-04-24 12:05:48 -07:00
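As a rough illustration of the MathJax change described in cea43c5da5, the sketch below shows a conf.py fragment that enables sphinx.ext.mathjax, plus the inline role text that an RST source would carry. The math_role helper is hypothetical and exists only to print example role strings; the project's real conf.py presumably lists other extensions too.

```python
# conf.py (sketch): enable MathJax so the built-in :math: role renders as math.
extensions = [
    "sphinx.ext.mathjax",  # extension named in the commit message
]


def math_role(latex: str) -> str:
    """Hypothetical helper: build the RST inline-math role text for a LaTeX snippet."""
    return f":math:`{latex}`"


# Role strings as they might appear in a .rst file (illustrative only).
print(math_role(r"\tau"))
print(math_role("O(messages)"))
```

Writing math with the role form, rather than raw $...$, keeps the markup explicit in the RST source, which is the approach the commit message describes.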
Syed Hashmi
ae629d3635
docs: reframe signals intro around the improvement flywheel
Addresses review feedback on #910:

- Replace the triage-only framing at the top with an instrument -> sample
  & triage -> construct data -> optimize -> deploy flywheel that explains
  why signals matter, not just what they surface. Paper's 82% / 1.52x
  numbers move into step 2 of the flywheel where they belong.
- Remove the 'Signals vs Response Quality' section. Per review, signals
  and response quality overlap rather than complement each other, so the
  comparison is misleading.
- Borrow the per-category summaries and leaf-type descriptions verbatim
  from the katanemo/signals reference implementation (module docstrings)
  so the documentation and the detector contract stay in sync. Drops the
  hand-crafted examples that were not strictly accurate (e.g. 'semantic
  overlap is high' for rephrase, 'user explicitly corrects the agent'
  for correction).

Made-with: Cursor
2026-04-24 10:59:37 -07:00
Syed Hashmi
ca4a9e57f2
docs: align signals page with paper taxonomy
Updates docs/source/concepts/signals.rst and the tracing guide's signals
subsection to reflect the three-layer taxonomy shipped in #903:

- Introduces the paper reference (arXiv:2604.00356) and the three layers
  (interaction, execution, environment) with all 20 leaf signal types in
  three reference tables
- Documents the new layered OTel attribute set
  (signals.interaction.*, signals.execution.*, signals.environment.*)
  and marks the legacy aggregate keys (signals.follow_up.repair.*,
  signals.frustration.*, signals.repetition.count,
  signals.escalation.requested, signals.positive_feedback.count) as
  deprecated-but-still-emitted
- Adds a Span Events section describing the per-instance signal.<type>
  events with confidence / snippet / metadata attributes
- Fixes the flag marker reference ([!] in the code vs 🚩 in the old docs)
- Updates all example attributes, dashboard queries, and alert rules to
  use the layered keys
- Updates the tracing guide's behavioral-signals subsection to match
- Notes that the triage sampler is a planned follow-up and today sampling
  is consumer-side via observability-platform filters

Build verified locally: sphinx-build produces no warnings on these files.

Made-with: Cursor
2026-04-23 12:44:27 -07:00
Musa
978b1ea722
Add first-class Xiaomi provider support (#863)
* feat(provider): add xiaomi as first-class provider

* feat(demos): add xiaomi mimo integration demo

* refactor(demos): remove Xiaomi MiMo integration demo and update documentation

* updating model list and adding the xiaomi models

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-389.local>
2026-04-04 09:58:36 -07:00
Adil Hafeez
1f23c573bf
add output filter chain (#822)
2026-03-18 17:58:20 -07:00
Adil Hafeez
f63d5de02c
Run plano natively by default (#744)
2026-03-05 07:35:25 -08:00
Adil Hafeez
473996d35d
Overhaul demos directory: cleanup, restructure, and standardize configs (#760)
2026-02-17 03:09:28 -08:00
Adil Hafeez
ba651aaf71
Rename all arch references to plano (#745)
* Rename all arch references to plano across the codebase

Complete rebrand from "Arch"/"archgw" to "Plano" including:
- Config files: arch_config_schema.yaml, workflow, demo configs
- Environment variables: ARCH_CONFIG_* → PLANO_CONFIG_*
- Python CLI: variables, functions, file paths, docker mounts
- Rust crates: config paths, log messages, metadata keys
- Docker/build: Dockerfile, supervisord, .dockerignore, .gitignore
- Docker Compose: volume mounts and env vars across all demos/tests
- GitHub workflows: job/step names
- Shell scripts: log messages
- Demos: Python code, READMEs, VS Code configs, Grafana dashboard
- Docs: RST includes, code comments, config references
- Package metadata: package.json, pyproject.toml, uv.lock

External URLs (docs.archgw.com, github.com/katanemo/archgw) left as-is.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update remaining arch references in docs

- Rename RST cross-reference labels: arch_access_logging, arch_overview_tracing, arch_overview_threading → plano_*
- Update label references in request_lifecycle.rst
- Rename arch_config_state_storage_example.yaml → plano_config_state_storage_example.yaml
- Update config YAML comments: "Arch creates/uses" → "Plano creates/uses"
- Update "the Arch gateway" → "the Plano gateway" in configuration_reference.rst
- Update arch_config_schema.yaml reference in provider_models.py
- Rename arch_agent_router → plano_agent_router in config example

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix remaining arch references found in second pass

- config/docker-compose.dev.yaml: ARCH_CONFIG_FILE → PLANO_CONFIG_FILE,
  arch_config.yaml → plano_config.yaml, archgw_logs → plano_logs
- config/test_passthrough.yaml: container mount path
- tests/e2e/docker-compose.yaml: source file path (was still arch_config.yaml)
- cli/planoai/core.py: comment and log message
- crates/brightstaff/src/tracing/constants.rs: doc comment
- tests/{e2e,archgw}/common.py: get_arch_messages → get_plano_messages,
  arch_state/arch_messages variables renamed
- tests/{e2e,archgw}/test_prompt_gateway.py: updated imports and usages
- demos/shared/test_runner/{common,test_demos}.py: same renames
- tests/e2e/test_model_alias_routing.py: docstring
- .dockerignore: archgw_modelserver → plano_modelserver
- demos/use_cases/claude_code_router/pretty_model_resolution.sh: container name

Note: x-arch-* HTTP header values and Rust constant names intentionally
preserved for backwards compatibility with existing deployments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 15:16:56 -08:00
Salman Paracha
2a36dd7376
fixing the build scripts for documentation (#711)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2026-01-28 18:55:35 -08:00
Salman Paracha
2941392ed1
Adding support for wildcard models in the model_providers config (#696)
* cleaning up plano cli commands

* adding support for wildcard model providers

* fixing compile errors

* fixing bugs related to default model provider, provider hint and duplicates in the model provider list

* fixed cargo fmt issues

* updating tests to always include the model id

* using default for the prompt_gateway path

* fixed the model name, as gpt-5-mini-2025-08-07 wasn't in the config

* making sure that all aliases and models match the config

* fixed the config generator to allow for base_url providers LLMs to include wildcard models

* re-ran the models list utility and added a shell script to run it

* updating docs to mention wildcard model providers

* updated provider_models.json to yaml, added that file to our docs for reference

* updating the build docs to use the new root-based build

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2026-01-28 17:47:33 -08:00
Tang Quoc Thai
4d53297c17
feat: add passthrough_auth option for forwarding client Authorization header (#687)
* feat: add passthrough_auth option for forwarding client Authorization header

* fix tests

* Update comment to reflect upstream forwarding

* Apply suggestions from code review

---------

Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2026-01-14 15:06:28 -08:00
Salman Paracha
b4543ba56c
Introduce signals change (#655)
* adding support for signals

* reducing false positives for signals like positive interaction

* adding docs. Still need to fix the messages list, but waiting on PR #621

* Improve frustration detection: normalize contractions and refine punctuation

* Further refine test cases with longer messages

* minor doc changes

* fixing echo statement for build

* fixing the messages construction and using the trait for signals

* update signals docs

* fixed some minor doc changes

* added more tests and fixed documentation. PR 100% ready

* made fixes based on PR comments

* Optimize latency

1. replace sliding window approach with trigram containment check
2. add code to pre-compute ngrams for patterns

* removed some debug statements to make tests easier to read

* PR comments to make ObservableStreamProcessor accept optional Vec<Messagges>

* fixed PR comments

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
Co-authored-by: MeiyuZhong <mariazhong9612@gmail.com>
Co-authored-by: nehcgs <54548843+nehcgs@users.noreply.github.com>
2026-01-07 11:20:44 -08:00
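The "Optimize latency" note in b4543ba56c (#655) mentions a trigram containment check with pre-computed n-grams for patterns. The commit does not show the implementation, so the following is only a minimal sketch of that general technique. Character-level trigrams, the 0.8 threshold, and the names char_trigrams and PatternIndex are all assumptions made for illustration, not the project's actual code.

```python
from __future__ import annotations


def char_trigrams(text: str) -> set[str]:
    """Return the character trigrams of lowercased, whitespace-collapsed text."""
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + 3] for i in range(len(normalized) - 2)}


class PatternIndex:
    """Hypothetical helper: stores each pattern with trigrams computed once."""

    def __init__(self, patterns: list[str]) -> None:
        # Pre-compute n-grams per pattern so matching never re-tokenizes them.
        self._patterns = [(p, char_trigrams(p)) for p in patterns]

    def matches(self, message: str, threshold: float = 0.8) -> list[str]:
        """Return patterns whose trigram containment in the message meets the threshold."""
        message_grams = char_trigrams(message)
        hits = []
        for pattern, grams in self._patterns:
            if not grams:
                continue  # patterns shorter than three characters have no trigrams
            # Containment: fraction of the pattern's trigrams found in the message.
            containment = len(grams & message_grams) / len(grams)
            if containment >= threshold:
                hits.append(pattern)
        return hits


index = PatternIndex(["this is not working", "thank you"])
print(index.matches("Ugh, this is not working at all"))
```

Compared with sliding a window over the message for every pattern, this tokenizes the message once and reduces each pattern comparison to a set intersection, which is one plausible reason such a change would cut latency.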
Salman Paracha
e224cba3e3
Update docs to Plano (#639)
2025-12-23 17:14:50 -08:00
Salman Paracha
cdfcfb9169
support base_url path for model providers (#608)
* adding support for base_url

* updated docs

* fixed tests for config generator

* making fixes based on PR comments

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-10-29 17:08:07 -07:00
Salman Paracha
9407ae6af7
Add support for Amazon Bedrock Converse and ConverseStream (#588)
* first commit to get Bedrock Converse API working. Next commit support for streaming and binary frames

* adding translation from BedrockBinaryFrameDecoder to AnthropicMessagesEvent

* Claude Code works with Amazon Bedrock

* added tests for openai streaming from bedrock

* PR comments fixed

* adding support for bedrock in docs as supported provider

* cargo fmt

* reverted to chatgpt models for claude code routing

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>
2025-10-22 11:31:21 -07:00
Salman Paracha
03d8cc1894
fixing docs (#584)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-10-01 22:26:54 -07:00
Salman Paracha
226139e907
adding support for Qwen models and fixed issue with passing PATH vari… (#583)
* adding support for Qwen models and fixed issue with passing PATH variable

* don't need to have qwen in the model alias routing example

* fixed base_url for qwen

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-10-01 21:57:58 -07:00
Salman Paracha
045a5e9751
adding support for moonshot and z-ai (#578)
* adding support for moonshot and z-ai

* Revert unwanted changes to arch_config.yaml

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
2025-09-30 12:24:06 -07:00
Salman Paracha
fbe82351c0
Salmanap/fix docs new providers model alias (#571)
* fixed docs and added ollama as a first-class LLM provider

* matching the LLM routing section on the README.md to the docs

* updated the section on preference-based routing

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>
2025-09-19 10:19:57 -07:00
Musa
d215724864
Update llm_provider.rst (#543)
2025-07-27 09:26:12 -07:00
Salman Paracha
5e65572573
updating the messaging to call ourselves the edge and AI gateway for … (#527)
* updating the messaging to call ourselves the edge and AI gateway for agents

* updating README to tidy up some language

* updating README to tidy up some language

* updating README to tidy up some language

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-329.local>
2025-07-12 03:25:09 -07:00
Adil Hafeez
a7fddf30f9
better model names (#517)
2025-07-11 16:42:16 -07:00
Adil Hafeez
0f139baf13
use consistent version across all arch_config files (#497)
2025-05-31 01:11:14 -07:00
Mat Sylvia
e7b0de2a72
Tweak readme docs for minor nits (#461)
Co-authored-by: darkdatter <msylvia@tradestax.io>
2025-04-12 23:52:20 -07:00
Ikko Eltociear Ashimine
49e8216061
docs: update llm_provider.rst (#448)
minor fix
2025-03-28 14:35:55 -07:00
Salman Paracha
bd8004d1ae
updated docs to reflect agent routing and hand off (#443)
* updated docs to reflect agent routing and hand off

* updated prompt targets based on review

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2025-03-20 13:57:33 -07:00
Salman Paracha
6072d6ef30
updating the docs to improve usage guide for prompt_targets and function calling (#434)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2025-03-17 14:07:06 -07:00
Adil Hafeez
e40b13be05
Update arch_config and add tests for arch config file (#407)
2025-02-14 19:28:10 -08:00
Adil Hafeez
a7feb6bffb
fix llm_provider format (#385)
2025-01-24 20:35:56 -08:00
Adil Hafeez
38f7691163
add support for custom llm with ssl support (#380)
* add support for custom llm with ssl support

Add support for custom LLMs that are served over HTTPS.

* add instructions on how to add custom inference endpoint

* fix formatting

* add more details

* Apply suggestions from code review

Co-authored-by: Salman Paracha <salman.paracha@gmail.com>

* Apply suggestions from code review

* fix precommit

---------

Co-authored-by: Salman Paracha <salman.paracha@gmail.com>
2025-01-24 17:14:24 -08:00
Adil Hafeez
2c67fa3bc0
Fix llm_routing provider element (#382)
* Fix llm_routing provider element

We replaced provider with provider_interface to make it clearer to developers which provider API/backend is being used. During that upgrade we removed support for mistral in provider to encourage developers to start using provider_interface. This demo still used provider with mistral and was never updated. This change fixes it by replacing provider with provider_interface.

Signed-off-by: Adil Hafeez <adil.hafeez@gmail.com>

* fix the path

* move

* add more details

* fix

* Apply suggestions from code review

* fix

* fix

---------

Signed-off-by: Adil Hafeez <adil.hafeez@gmail.com>
2025-01-24 16:34:11 -08:00
Shuguang Chen
ba7279becb
Use intent model from archfc to pick prompt gateway (#328)
2024-12-20 13:25:01 -08:00
Salman Paracha
a0c159c9ba
updating doc versions, images and cleaning up section for prompt-guard (#320)
* updating doc versions, images and cleaning up section for prompt-guard

* updating based on feedback

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-12-01 23:02:08 -08:00
Salman Paracha
a0d87d86c9
updating docs to reflect changes in 0.1.2 like tracing via signoz and… (#271)
2024-11-15 16:55:27 -08:00
Adil Hafeez
e462e393b1
Use large github action machine to run e2e tests (#230)
2024-10-30 17:54:51 -07:00
Shuguang Chen
11fba23f1f
Update doc (#178)
* Update doc

* Update links
2024-10-10 22:30:54 -07:00
Salman Paracha
1acf43ff7a
fixed cli to use poetry as well. this way we make it easy to have the… (#160)
2024-10-09 15:53:12 -07:00
Salman Paracha
42d4a28e13
updated all demo READMEs and minor doc changes (#154)
* updated all demo READMEs and minor doc changes

* minor typo fixes

* updated main Readme

* fixed README and docs

* fixed README and docs

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-08 23:58:55 -07:00
Shuguang Chen
b30ad791f7
Fix errors and improve Doc (#143)
* Fix link issues and add icons

* Improve Doc

* fix test

* making minor modifications to Shuguang's doc changes

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-10-08 13:18:34 -07:00
Shuguang Chen
5c7567584d
Doc Update (#129)
* init update

* Update terminology.rst

* fix the branch to create an index.html, and fix pre-commit issues

* Doc update

* made several changes to the docs after Shuguang's revision

* fixing pre-commit issues

* fixed the reference file to the final prompt config file

* added google analytics

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-06 16:54:34 -07:00