plano

mirror of https://github.com/katanemo/plano.git synced 2026-05-10 16:22:42 +02:00

Author	SHA1	Message	Date
Musa	b71a555f19	fix(brightstaff): enable TLS for redis session cache (#934 ) Some checks failed CI / pre-commit (push) Has been cancelled Details CI / plano-tools-tests (push) Has been cancelled Details CI / native-smoke-test (push) Has been cancelled Details CI / docker-build (push) Has been cancelled Details CI / validate-config (push) Has been cancelled Details Publish docker image (latest) / build-arm64 (push) Has been cancelled Details Publish docker image (latest) / build-amd64 (push) Has been cancelled Details Build and Deploy Documentation / build (push) Has been cancelled Details CI / security-scan (push) Has been cancelled Details CI / test-prompt-gateway (push) Has been cancelled Details CI / test-model-alias-routing (push) Has been cancelled Details CI / test-responses-api-with-state (push) Has been cancelled Details CI / e2e-plano-tests (3.10) (push) Has been cancelled Details CI / e2e-plano-tests (3.11) (push) Has been cancelled Details CI / e2e-plano-tests (3.12) (push) Has been cancelled Details CI / e2e-plano-tests (3.13) (push) Has been cancelled Details CI / e2e-plano-tests (3.14) (push) Has been cancelled Details CI / e2e-demo-preference (push) Has been cancelled Details CI / e2e-demo-currency (push) Has been cancelled Details Publish docker image (latest) / create-manifest (push) Has been cancelled Details Turn on the redis crate's tokio-rustls-comp + tls-rustls-webpki-roots features so rediss:// URLs in routing.session_cache.url actually negotiate TLS. Previously connecting to a TLS Redis failed with "can't connect with TLS, the feature is not enabled". Uses pure-Rust rustls + bundled Mozilla CA roots, so no system OpenSSL dependency is needed in the slim runtime image. Works with managed Redis (ElastiCache, Azure Cache, Redis Cloud, Upstash, etc.) out of the box.	2026-04-30 11:41:34 -07:00
Syed A. Hashmi	dafd245332	signals: restore the pre-port flag marker emoji (🚩) (#913 ) * signals: restore the pre-port flag marker emoji #903 inadvertently replaced the legacy FLAG_MARKER (U+1F6A9, '🚩') with '[!]', which broke any downstream dashboard / alert that searches span names for the flag emoji. Restores the original marker and updates the #910 docs pass to match. - crates/brightstaff/src/signals/analyzer.rs: FLAG_MARKER back to "\\u{1F6A9}" with a comment noting the backwards-compatibility reason so it doesn't drift again. - docs/source/concepts/signals.rst and docs/source/guides/observability/ tracing.rst: swap every '[!]' reference (subheading text, example span name, tip box, dashboard query hint) back to 🚩. Verified: cargo test -p brightstaff --lib (162 passed, 1 ignored); sphinx-build clean on both files; rendered HTML shows 🚩 in all flag-marker references. Made-with: Cursor * fix: silence manual_checked_ops clippy lint (rustc 1.95) Pre-existing warning in router/stress_tests.rs that becomes an error under CI's -D warnings with rustc 1.95. Replace the manual if/else with growth.checked_div(num_iterations).unwrap_or(0) as clippy suggests. Made-with: Cursor	2026-04-24 13:54:53 -07:00
Musa	78dc4edad9	Add first-class ChatGPT subscription provider support (#881 ) * Add first-class ChatGPT subscription provider support * Address PR feedback: move uuid import to top, reuse parsed config in up() * Add ChatGPT token watchdog for seamless long-lived sessions * Address PR feedback: error on stream=false for ChatGPT, fix auth file permissions * Replace ChatGPT watchdog/restart with passthrough_auth --------- Co-authored-by: Musa Malik <musam@uw.edu>	2026-04-23 15:34:44 -07:00
Adil Hafeez	aa726b1bba	add jemalloc and /debug/memstats endpoint for OOM diagnosis (#885 ) Some checks are pending CI / pre-commit (push) Waiting to run Details CI / plano-tools-tests (push) Waiting to run Details CI / native-smoke-test (push) Waiting to run Details CI / docker-build (push) Waiting to run Details CI / validate-config (push) Waiting to run Details CI / security-scan (push) Blocked by required conditions Details CI / test-prompt-gateway (push) Blocked by required conditions Details CI / test-model-alias-routing (push) Blocked by required conditions Details CI / test-responses-api-with-state (push) Blocked by required conditions Details CI / e2e-plano-tests (3.10) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.11) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.12) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.13) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.14) (push) Blocked by required conditions Details CI / e2e-demo-preference (push) Blocked by required conditions Details CI / e2e-demo-currency (push) Blocked by required conditions Details Publish docker image (latest) / build-arm64 (push) Waiting to run Details Publish docker image (latest) / build-amd64 (push) Waiting to run Details Publish docker image (latest) / create-manifest (push) Blocked by required conditions Details Build and Deploy Documentation / build (push) Waiting to run Details	2026-04-23 13:59:12 -07:00
Syed A. Hashmi	c8079ac971	signals: feature parity with the latest Signals paper. Porting logic from python repo (#903 ) * signals: port to layered taxonomy with dual-emit OTel Made-with: Cursor * fix: silence collapsible_match clippy lint (rustc 1.95) Made-with: Cursor * test: parity harness for rust vs python signals analyzer Validates the brightstaff signals port against the katanemo/signals Python reference on lmsys/lmsys-chat-1m. Adds a signals_replay bin emitting python- compatible JSON, a pyarrow-based driver (bypasses the datasets loader pickle bug on python 3.14), a 3-tier comparator, and an on-demand workflow_dispatch CI job. Made-with: Cursor * Remove signals test from the gitops flow * style: format parity harness with black Made-with: Cursor * signals: group summary by taxonomy, factor misalignment_ratio Addresses #903 review feedback from @nehcgs: - generate_summary() now renders explicit Interaction / Execution / Environment headers so the paper taxonomy is visible at a glance, even when no signals fired in a given layer. Quality-driving callouts (high misalignment rate, looping detected, escalation requested) are appended after the layer summary as an alerts tail. - repair_ratio (legacy taxonomy name) renamed to misalignment_ratio and factored into a single InteractionSignals::misalignment_ratio() helper so assess_quality and generate_summary share one source of truth instead of recomputing the same divide twice. Two new unit tests pin the layer headers and the (sev N) severity suffix. Parity with the python reference is preserved at the Tier-A level (per-type counts + overall_quality); only the human-readable summary string diverges, which the parity comparator already classifies as Tier-C. Made-with: Cursor	2026-04-23 12:02:30 -07:00
Adil Hafeez	6701195a5d	add overrides.disable_signals to skip CPU-heavy signal analysis (#906 )	2026-04-23 11:38:29 -07:00
Adil Hafeez	22f332f62d	Add Prometheus metrics endpoint and Grafana dashboard for brightstaff (#904 ) Some checks are pending CI / pre-commit (push) Waiting to run Details CI / plano-tools-tests (push) Waiting to run Details CI / native-smoke-test (push) Waiting to run Details CI / docker-build (push) Waiting to run Details CI / validate-config (push) Waiting to run Details CI / security-scan (push) Blocked by required conditions Details CI / test-prompt-gateway (push) Blocked by required conditions Details CI / test-model-alias-routing (push) Blocked by required conditions Details CI / test-responses-api-with-state (push) Blocked by required conditions Details CI / e2e-plano-tests (3.10) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.11) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.12) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.13) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.14) (push) Blocked by required conditions Details CI / e2e-demo-preference (push) Blocked by required conditions Details CI / e2e-demo-currency (push) Blocked by required conditions Details Publish docker image (latest) / build-arm64 (push) Waiting to run Details Publish docker image (latest) / build-amd64 (push) Waiting to run Details Publish docker image (latest) / create-manifest (push) Blocked by required conditions Details Build and Deploy Documentation / build (push) Waiting to run Details	2026-04-22 11:19:10 -07:00
Adil Hafeez	ffea891dba	fix: prevent index-out-of-bounds panic in signal analyzer follow-up (#896 )	2026-04-18 16:24:02 -07:00
Adil Hafeez	95a7beaab3	fix: truncate oversized user messages in orchestrator routing prompt (#895 )	2026-04-17 21:01:30 -07:00
Adil Hafeez	0f67b2c806	planoai obs: live LLM observability TUI (#891 )	2026-04-17 14:03:47 -07:00
Adil Hafeez	90b926c2ce	use plano-orchestrator for LLM routing, remove arch-router (#886 )	2026-04-15 16:41:42 -07:00
Musa	980faef6be	Redis-backed session cache for cross-replica model affinity (#879 ) Some checks failed CI / pre-commit (push) Has been cancelled Details CI / plano-tools-tests (push) Has been cancelled Details CI / native-smoke-test (push) Has been cancelled Details CI / docker-build (push) Has been cancelled Details CI / validate-config (push) Has been cancelled Details Publish docker image (latest) / build-arm64 (push) Has been cancelled Details Publish docker image (latest) / build-amd64 (push) Has been cancelled Details Build and Deploy Documentation / build (push) Has been cancelled Details CI / security-scan (push) Has been cancelled Details CI / test-prompt-gateway (push) Has been cancelled Details CI / test-model-alias-routing (push) Has been cancelled Details CI / test-responses-api-with-state (push) Has been cancelled Details CI / e2e-plano-tests (3.10) (push) Has been cancelled Details CI / e2e-plano-tests (3.11) (push) Has been cancelled Details CI / e2e-plano-tests (3.12) (push) Has been cancelled Details CI / e2e-plano-tests (3.13) (push) Has been cancelled Details CI / e2e-plano-tests (3.14) (push) Has been cancelled Details CI / e2e-demo-preference (push) Has been cancelled Details CI / e2e-demo-currency (push) Has been cancelled Details Publish docker image (latest) / create-manifest (push) Has been cancelled Details * add pluggable session cache with Redis backend * add Redis session affinity demos (Docker Compose and Kubernetes) * address PR review feedback on session cache * document Redis session cache backend for model affinity * sync rendered config reference with session_cache addition * add tenant-scoped Redis session cache keys and remove dead log_affinity_hit - Add tenant_header to SessionCacheConfig; when set, cache keys are scoped as plano:affinity:{tenant_id}:{session_id} for multi-tenant isolation - Thread tenant_id through RouterService, routing_service, and llm handlers - Use Cow<'_, str> in session_key to avoid allocation when no tenant is set - Remove unused log_affinity_hit (logging was already inlined at call sites) * remove session_affinity_redis and session_affinity_redis_k8s demos	2026-04-13 19:30:47 -07:00
Adil Hafeez	8dedf0bec1	Model affinity for consistent model selection in agentic loops (#827 ) Some checks are pending CI / pre-commit (push) Waiting to run Details CI / plano-tools-tests (push) Waiting to run Details CI / native-smoke-test (push) Waiting to run Details CI / docker-build (push) Waiting to run Details CI / validate-config (push) Waiting to run Details CI / security-scan (push) Blocked by required conditions Details CI / test-prompt-gateway (push) Blocked by required conditions Details CI / test-model-alias-routing (push) Blocked by required conditions Details CI / test-responses-api-with-state (push) Blocked by required conditions Details CI / e2e-plano-tests (3.10) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.11) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.12) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.13) (push) Blocked by required conditions Details CI / e2e-plano-tests (3.14) (push) Blocked by required conditions Details CI / e2e-demo-preference (push) Blocked by required conditions Details CI / e2e-demo-currency (push) Blocked by required conditions Details Publish docker image (latest) / build-arm64 (push) Waiting to run Details Publish docker image (latest) / build-amd64 (push) Waiting to run Details Publish docker image (latest) / create-manifest (push) Blocked by required conditions Details Build and Deploy Documentation / build (push) Waiting to run Details	2026-04-08 17:32:02 -07:00
Adil Hafeez	7606c55b4b	support developer role in chat completions API (#867 )	2026-04-02 18:10:32 -07:00
Musa	f68c21f8df	Handle null prefer in inline routing policy (#856 ) * Handle null prefer in inline routing policy * Use serde defaulting for null selection preference * Add tests for default selection policy behavior in routing preferences	2026-03-31 17:41:25 -07:00
Musa	3dbda9741e	fix: route Perplexity OpenAI endpoints without /v1 (#854 ) * fix: route Perplexity OpenAI paths without /v1 * add tests for Perplexity provider handling in LLM module * refactor: use constant for Perplexity provider prefix in LLM module * moving const to top of file	2026-03-31 17:40:42 -07:00
Adil Hafeez	af98c11a6d	restructure model_metrics_sources to type + provider (#855 )	2026-03-30 17:12:20 -07:00
Adil Hafeez	e5751d6b13	model routing: cost/latency ranking with ranked fallback list (#849 )	2026-03-30 13:46:52 -07:00
Salman Paracha	69df124c47	the orchestrator had a bug where it was setting the wrong headers for archfc.katanemo.dev (#839 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-389.local>	2026-03-20 00:40:47 -07:00
Adil Hafeez	1ad3e0f64e	refactor brightstaff (#736 )	2026-03-19 17:58:33 -07:00
Adil Hafeez	1f23c573bf	add output filter chain (#822 )	2026-03-18 17:58:20 -07:00
Adil Hafeez	bc059aed4d	Unified overrides for custom router and orchestrator models (#820 ) * support configurable orchestrator model via orchestration config section * add self-hosting docs and demo for Plano-Orchestrator * list all Plano-Orchestrator model variants in docs * use overrides for custom routing and orchestration model * update docs * update orchestrator model name * rename arch provider to plano, use llm_routing_model and agent_orchestration_model * regenerate rendered config reference	2026-03-15 09:36:11 -07:00
Musa	6610097659	Support for Codex via Plano (#808 ) * Add Codex CLI support; xAI response improvements * Add native Plano running check and update CLI agent error handling * adding PR suggestions for transformations and code quality * message extraction logic in ResponsesAPIRequest * xAI support for Responses API by routing to native endpoint + refactor code	2026-03-10 20:54:14 -07:00
Adil Hafeez	97b7a390ef	support inline routing_policy in request body (#811 ) (#815 )	2026-03-10 12:23:18 -07:00
Adil Hafeez	028a2cd196	add routing service (#814 ) fixes https://github.com/katanemo/plano/issues/810	2026-03-09 16:32:16 -07:00
Musa	2bde21ff57	add Custom Trace Attributes to extend observability (#708 ) * add custom trace attributes * refactor: prefix custom trace attributes and update schema handlers tests configs * refactor: rename custom_attribute_prefixes to span_attribute_header_prefixes in configuration and related handlers * docs: add section on custom span attributes * refactor: update tracing configuration to use span attributes and adjust related handlers * docs: custom span attributes section to include static attributes and clarify configuration * add custom trace attributes * refactor: prefix custom trace attributes and update schema handlers tests configs * refactor: rename custom_attribute_prefixes to span_attribute_header_prefixes in configuration and related handlers * docs: add section on custom span attributes * refactor: update tracing configuration to use span attributes and adjust related handlers * docs: custom span attributes section to include static attributes and clarify configuration * refactor: remove TraceCollector usage and enhance logging with structured attributes * refactor: custom trace attribute extraction to improve clarity --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-02-25 16:27:20 -08:00
Syed A. Hashmi	54bc8e5e52	[ISSUE 706]: Standardize returned errors from Plano (#772 ) * [ISSUE 706]: Standardize returned errors from Plano * Standardized errors in chat completion	2026-02-24 14:34:33 -08:00
Adil Hafeez	baeee56f6b	Make model field optional in request types, resolve from default provider (#768 )	2026-02-18 04:43:59 -08:00
Adil Hafeez	ba651aaf71	Rename all arch references to plano (#745 ) * Rename all arch references to plano across the codebase Complete rebrand from "Arch"/"archgw" to "Plano" including: - Config files: arch_config_schema.yaml, workflow, demo configs - Environment variables: ARCH_CONFIG_* → PLANO_CONFIG_* - Python CLI: variables, functions, file paths, docker mounts - Rust crates: config paths, log messages, metadata keys - Docker/build: Dockerfile, supervisord, .dockerignore, .gitignore - Docker Compose: volume mounts and env vars across all demos/tests - GitHub workflows: job/step names - Shell scripts: log messages - Demos: Python code, READMEs, VS Code configs, Grafana dashboard - Docs: RST includes, code comments, config references - Package metadata: package.json, pyproject.toml, uv.lock External URLs (docs.archgw.com, github.com/katanemo/archgw) left as-is. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Update remaining arch references in docs - Rename RST cross-reference labels: arch_access_logging, arch_overview_tracing, arch_overview_threading → plano_* - Update label references in request_lifecycle.rst - Rename arch_config_state_storage_example.yaml → plano_config_state_storage_example.yaml - Update config YAML comments: "Arch creates/uses" → "Plano creates/uses" - Update "the Arch gateway" → "the Plano gateway" in configuration_reference.rst - Update arch_config_schema.yaml reference in provider_models.py - Rename arch_agent_router → plano_agent_router in config example Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix remaining arch references found in second pass - config/docker-compose.dev.yaml: ARCH_CONFIG_FILE → PLANO_CONFIG_FILE, arch_config.yaml → plano_config.yaml, archgw_logs → plano_logs - config/test_passthrough.yaml: container mount path - tests/e2e/docker-compose.yaml: source file path (was still arch_config.yaml) - cli/planoai/core.py: comment and log message - crates/brightstaff/src/tracing/constants.rs: doc comment - tests/{e2e,archgw}/common.py: get_arch_messages → get_plano_messages, arch_state/arch_messages variables renamed - tests/{e2e,archgw}/test_prompt_gateway.py: updated imports and usages - demos/shared/test_runner/{common,test_demos}.py: same renames - tests/e2e/test_model_alias_routing.py: docstring - .dockerignore: archgw_modelserver → plano_modelserver - demos/use_cases/claude_code_router/pretty_model_resolution.sh: container name Note: x-arch-* HTTP header values and Rust constant names intentionally preserved for backwards compatibility with existing deployments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 15:16:56 -08:00
Musa	e3bf2b7f71	Introduce brand new CLI experience with tracing and quickstart (#724 ) Release hardens tracing and routing: clearer CLI, modular internals, updated demos/docs/tests, and improved multi-agent reliability. Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>	2026-02-10 13:17:43 -08:00
Adil Hafeez	46de89590b	use standard tracing and logging in brightstaff (#721 )	2026-02-09 13:33:27 -08:00
Salman Paracha	2941392ed1	Adding support for wildcard models in the model_providers config (#696 ) * cleaning up plano cli commands * adding support for wildcard model providers * fixing compile errors * fixing bugs related to default model provider, provider hint and duplicates in the model provider list * fixed cargo fmt issues * updating tests to always include the model id * using default for the prompt_gateway path * fixed the model name, as gpt-5-mini-2025-08-07 wasn't in the config * making sure that all aliases and models match the config * fixed the config generator to allow for base_url providers LLMs to include wildcard models * re-ran the models list utility and added a shell script to run it * updating docs to mention wildcard model providers * updated provider_models.json to yaml, added that file to our docs for reference * updating the build docs to use the new root-based build --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2026-01-28 17:47:33 -08:00
Salman Paracha	cdc1d7cee2	making Messages.Content optional, and having the upstream LLM fail if the right fields aren't set (#699 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2026-01-16 16:24:03 -08:00
Adil Hafeez	626f556cc6	reduce number of info statements in pipeline processor (#698 ) Co-authored-by: Adil Hafeez <adil.hafeez10@t-mobile.com>	2026-01-16 15:38:43 -08:00
Adil Hafeez	11fb4cd633	remove unnecessary clones from code (#682 )	2026-01-08 15:11:05 -08:00
Adil Hafeez	78b2ae0cf7	pass request_id in orchestrator and routing model (#678 )	2026-01-07 12:04:10 -08:00
Salman Paracha	b4543ba56c	Introduce signals change (#655 ) * adding support for signals * reducing false positives for signals like positive interaction * adding docs. Still need to fix the messages list, but waiting on PR #621 * Improve frustration detection: normalize contractions and refine punctuation * Further refine test cases with longer messages * minor doc changes * fixing echo statement for build * fixing the messages construction and using the trait for signals * update signals docs * fixed some minor doc changes * added more tests and fixed docuemtnation. PR 100% ready * made fixes based on PR comments * Optimize latency 1. replace sliding window approach with trigram containment check 2. add code to pre-compute ngrams for patterns * removed some debug statements to make tests easier to read * PR comments to make ObservableStreamProcessor accept optonal Vec<Messagges> * fixed PR comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local> Co-authored-by: MeiyuZhong <mariazhong9612@gmail.com> Co-authored-by: nehcgs <54548843+nehcgs@users.noreply.github.com>	2026-01-07 11:20:44 -08:00
Adil Hafeez	57327ba667	ensure that request id is consistent (#677 ) * ensure that request id is consistent * remove test debug/info statements	2026-01-07 08:44:41 -08:00
Adil Hafeez	ca95ffb63d	cargo clippy (#660 )	2025-12-25 21:08:37 -08:00
Salman Paracha	e224cba3e3	Update docs to Plano (#639 )	2025-12-23 17:14:50 -08:00
Adil Hafeez	15fbb6c3af	plano orchestration using plano orchestration 4b model (#637 )	2025-12-22 18:05:49 -08:00
Adil Hafeez	2f9121407b	Use mcp tools for filter chain (#621 ) * agents framework demo * more changes * add more changes * pending changes * fix tests * fix more * rebase with main and better handle error from mcp * add trace for filters * add test for client error, server error and for mcp error * update schema validate code and rename kind => type in agent_filter * fix agent description and pre-commit * fix tests * add provider specific request parsing in agents chat * fix precommit and tests * cleanup demo * update readme * fix pre-commit * refactor tracing * fix fmt * fix: handle MessageContent enum in responses API conversion - Update request.rs to handle new MessageContent enum structure from main - MessageContent can now be Text(String) or Items(Vec<InputContent>) - Handle new InputItem variants (ItemReference, FunctionCallOutput) - Fixes compilation error after merging latest main (#632) * address pr feedback * fix span * fix build * update openai version	2025-12-17 17:30:14 -08:00
Shuguang Chen	cb82a83c7b	orchestration integration (#623 ) * orchestration integration * Convert compact json to spaced json	2025-12-17 17:20:19 -08:00
Salman Paracha	d5a273f740	enable state management for v1/responses (#631 ) * first commit with tests to enable state mamangement via memory * fixed logs to follow the conversational flow a bit better * added support for supabase * added the state_storage_v1_responses flag, and use that to store state appropriately * cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo * fixed mixed inputs from openai v1/responses api (#632) * fixed mixed inputs from openai v1/responses api * removing tracing from model-alias-rouing * handling additional input types from openairs --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local> * resolving PR comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-17 12:18:38 -08:00
Salman Paracha	a79f55f313	Improve end to end tracing (#628 ) * adding canonical tracing support via bright-staff * improved formatting for tools in the traces * removing anthropic from the currency exchange demo * using Envoy to transport traces, not calling OTEL directly * moving otel collcetor cluster outside tracing if/else * minor fixes to not write to the OTEL collector if tracing is disabled * fixed PR comments and added more trace attributes * more fixes based on PR comments * more clean up based on PR comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-11 15:21:57 -08:00
Adil Hafeez	367f48bf1e	handle agent error better (#627 )	2025-12-10 11:20:00 -08:00
Salman Paracha	a448c6e9cb	Add support for v1/responses API (#622 ) * making first commit. still need to work on streaming respones * making first commit. still need to work on streaming respones * stream buffer implementation with tests * adding grok API keys to workflow * fixed changes based on code review * adding support for bedrock models * fixed issues with translation to claude code --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-03 14:58:26 -08:00
Salman Paracha	88c2bd1851	removing model_server python module to brightstaff (function calling) (#615 ) * adding function_calling functionality via rust * fixed rendered YAML file * removed model_server from envoy.template and forwarding traffic to bright_staff * fixed bugs in function_calling.rs that were breaking tests. All good now * updating e2e test to clean up disk usage * removing Arch* models to be used as a default model if one is not specified * if the user sets arch-function base_url we should honor it * fixing demos as we needed to pin to a particular version of huggingface_hub else the chatbot ui wouldn't build * adding a constant for Arch-Function model name * fixing some edge cases with calls made to Arch-Function * fixed JSON parsing issues in function_calling.rs * fixed bug where the raw response from Arch-Function was re-encoded * removed debug from supervisord.conf * commenting out disk cleanup * adding back disk space --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-11-22 12:55:00 -08:00
Salman Paracha	9407ae6af7	Add support for Amazon Bedrock Converse and ConverseStream (#588 ) * first commit to get Bedrock Converse API working. Next commit support for streaming and binary frames * adding translation from BedrockBinaryFrameDecoder to AnthropicMessagesEvent * Claude Code works with Amazon Bedrock * added tests for openai streaming from bedrock * PR comments fixed * adding support for bedrock in docs as supported provider * cargo fmt * revertted to chatgpt models for claude code routing --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local> Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>	2025-10-22 11:31:21 -07:00
Adil Hafeez	96e0732089	add support for agents (#564 )	2025-10-14 14:01:11 -07:00

1 2

74 commits