plano

mirror of https://github.com/katanemo/plano.git synced 2026-06-14 15:15:15 +02:00

Author	SHA1	Message	Date
shivani kumar	ecf864df25	Add the system role into messages array (#967 ) * add teh system role into messages array * ci: trigger workflows * dont normalize for anthropic --------- Co-authored-by: Spherrrical <malikmusa1323@gmail.com>	2026-06-12 14:25:22 -07:00
Musa	7906e5d455	chore(models): update provider models (#965 )	2026-06-09 16:05:43 -07:00
Musa	dbe6632d5f	fix(ci): unbreak main — rustfmt warn! + proxy-wasm 0.2.5 for Rust 1.96 (#964 )	2026-06-03 14:38:33 -07:00
ucloudnb666	fb794ae7fe	feat: add Astraflow provider support (#956 ) Signed-off-by: ucloudnb666 <ucloudnb666@users.noreply.github.com>	2026-06-03 13:47:26 -07:00
Musa	f3d6ea41ad	Support Kimi Code API for Claude Code routing (#951 ) * Support Kimi Code API and Claude Code protocol compatibility Co-authored-by: Musa <musa@spherrrical.dev> * Fix black formatting in config_generator Co-authored-by: Musa <musa@spherrrical.dev> * Warn when stripping unsupported Kimi Code request fields Co-authored-by: Musa <musa@spherrrical.dev> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com>	2026-06-03 10:09:50 -07:00
Musa	5a4487fc6e	ci+fix: add update-providers workflow + non-destructive fetch_models (#914 ) * ci: add update-providers workflow Adds .github/workflows/update-providers.yml so the provider_models.yaml refresh can be triggered via workflow_dispatch (manual UI / gh CLI) or repository_dispatch (from the PlanoHelper Slack bot). The workflow: - Runs cargo run --bin fetch_models --features model-fetch with all provider API keys + AWS creds available as env from secrets. - Opens a PR via peter-evans/create-pull-request scoped to just crates/hermesllm/src/bin/provider_models.yaml. - On repository_dispatch, posts the PR link (or failure) back to Slack via the response_url in the dispatch payload. 
Includes keys for the providers fetch_models reads today (OpenAI, Anthropic, Mistral, DeepSeek, Grok, Moonshot, Dashscope/Qwen, Zhipu, Xiaomi/Mimo, Google) plus forward-compat env for OpenRouter and Vercel AI Gateway (added in #902). The workflow has no push: or schedule: trigger, so landing this is inert until something dispatches it. Required secrets are documented in apps/planohelper/README.md (in a follow-up PR). * fix(fetch_models): preserve existing providers when keys are missing Previously fetch_models rebuilt provider_models.yaml from scratch on every run, so running locally (or in CI) without e.g. ANTHROPIC_API_KEY, GOOGLE_API_KEY, or AWS Bedrock credentials would silently drop those providers' entries from the file. The user only meant to refresh what they had keys for. Now fetch_models loads the existing provider_models.yaml first and treats each provider independently: - Successful fetch -> entry replaced with fresh data ("updated") - Missing API key -> existing entry preserved ("skipped") - Failed fetch -> existing entry preserved ("failed, kept existing") - Missing AWS creds -> Amazon entry preserved instead of running `aws bedrock list-foundation-models` and erroring out If the file doesn't exist yet it starts fresh, same as before. If the file exists but can't be parsed, the binary refuses to overwrite it and exits with an error rather than silently nuking it. Other changes that come along for the ride: - HashMap -> BTreeMap for the providers map. Output YAML now has a stable, alphabetical provider order across runs (eliminates HashMap-iteration churn in PR diffs). The first PR after this lands will reorder existing entries one time. - Per-provider summary at the end (updated / skipped / failed) so the workflow logs and Slack PR body make it obvious what actually changed vs. what was left alone. - File-level usage comment updated to match the new behavior and list the additional env vars (MISTRAL_API_KEY, MIMO_API_KEY). 
No tests existed for this binary; manually verified with `env -i` (no keys at all) that all 13 existing providers are preserved with their original model counts.	2026-05-05 14:19:52 -07:00
Musa	b81eb7266c	feat(providers): add Vercel AI Gateway and OpenRouter support (#902 ) * add Vercel and OpenRouter as OpenAI-compatible LLM providers * fix(fmt): fix cargo fmt line length issues in provider id tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * style(hermesllm): fix rustfmt formatting in provider id tests * Add Vercel and OpenRouter to zero-config planoai up defaults Wires `vercel/` and `openrouter/` into the synthesized default config so `planoai up` with no user config exposes both providers out of the box (env-keyed via AI_GATEWAY_API_KEY / OPENROUTER_API_KEY, pass-through otherwise). Registers both in SUPPORTED_PROVIDERS_WITHOUT_BASE_URL so wildcard model entries validate without an explicit provider_interface. 
--------- Co-authored-by: Musa Malik <musam@uw.edu> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 15:54:39 -07:00
Musa	78dc4edad9	Add first-class ChatGPT subscription provider support (#881 ) * Add first-class ChatGPT subscription provider support * Address PR feedback: move uuid import to top, reuse parsed config in up() * Add ChatGPT token watchdog for seamless long-lived sessions * Address PR feedback: error on stream=false for ChatGPT, fix auth file permissions * Replace ChatGPT watchdog/restart with passthrough_auth --------- Co-authored-by: Musa Malik <musam@uw.edu>	2026-04-23 15:34:44 -07:00
Syed A. Hashmi	c8079ac971	signals: feature parity with the latest Signals paper. Porting logic from python repo (#903 ) * signals: port to layered taxonomy with dual-emit OTel Made-with: Cursor * fix: silence collapsible_match clippy lint (rustc 1.95) Made-with: Cursor * test: parity harness for rust vs python signals analyzer Validates the brightstaff signals port against the katanemo/signals Python reference on lmsys/lmsys-chat-1m. Adds a signals_replay bin emitting python- compatible JSON, a pyarrow-based driver (bypasses the datasets loader pickle bug on python 3.14), a 3-tier comparator, and an on-demand workflow_dispatch CI job. Made-with: Cursor * Remove signals test from the gitops flow * style: format parity harness with black Made-with: Cursor * signals: group summary by taxonomy, factor misalignment_ratio Addresses #903 review feedback from @nehcgs: - generate_summary() now renders explicit Interaction / Execution / Environment headers so the paper taxonomy is visible at a glance, even when no signals fired in a given layer. Quality-driving callouts (high misalignment rate, looping detected, escalation requested) are appended after the layer summary as an alerts tail. - repair_ratio (legacy taxonomy name) renamed to misalignment_ratio and factored into a single InteractionSignals::misalignment_ratio() helper so assess_quality and generate_summary share one source of truth instead of recomputing the same divide twice. Two new unit tests pin the layer headers and the (sev N) severity suffix. Parity with the python reference is preserved at the Tier-A level (per-type counts + overall_quality); only the human-readable summary string diverges, which the parity comparator already classifies as Tier-C. Made-with: Cursor	2026-04-23 12:02:30 -07:00
Adil Hafeez	78d8c90184	Add claude-opus-4-7 to anthropic provider models (#901 )	2026-04-18 19:10:57 -07:00
Adil Hafeez	e7464b817a	fix(anthropic-stream): avoid bare/duplicate message_stop on OpenAI upstream (#898 )	2026-04-18 15:57:34 -07:00
Adil Hafeez	0f67b2c806	planoai obs: live LLM observability TUI (#891 )	2026-04-17 14:03:47 -07:00
Adil Hafeez	711e4dd07d	Add DigitalOcean as a first-class LLM provider (#889 )	2026-04-17 12:25:34 -07:00
Musa	978b1ea722	Add first-class Xiaomi provider support (#863 ) * feat(provider): add xiaomi as first-class provider * feat(demos): add xiaomi mimo integration demo * refactor(demos): remove Xiaomi MiMo integration demo and update documentation * updating model list and adding the xiamoi models --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-389.local>	2026-04-04 09:58:36 -07:00
Adil Hafeez	7606c55b4b	support developer role in chat completions API (#867 )	2026-04-02 18:10:32 -07:00
Musa	3dbda9741e	fix: route Perplexity OpenAI endpoints without /v1 (#854 ) * fix: route Perplexity OpenAI paths without /v1 * add tests for Perplexity provider handling in LLM module * refactor: use constant for Perplexity provider prefix in LLM module * moving const to top of file	2026-03-31 17:40:42 -07:00
Salman Paracha	4bb5c6404f	adding new supported models to plano (#829 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-389.local>	2026-03-15 12:37:20 -07:00
Adil Hafeez	bc059aed4d	Unified overrides for custom router and orchestrator models (#820 ) * support configurable orchestrator model via orchestration config section * add self-hosting docs and demo for Plano-Orchestrator * list all Plano-Orchestrator model variants in docs * use overrides for custom routing and orchestration model * update docs * update orchestrator model name * rename arch provider to plano, use llm_routing_model and agent_orchestration_model * regenerate rendered config reference	2026-03-15 09:36:11 -07:00
Musa	6610097659	Support for Codex via Plano (#808 ) * Add Codex CLI support; xAI response improvements * Add native Plano running check and update CLI agent error handling * adding PR suggestions for transformations and code quality * message extraction logic in ResponsesAPIRequest * xAI support for Responses API by routing to native endpoint + refactor code	2026-03-10 20:54:14 -07:00
Adil Hafeez	baeee56f6b	Make model field optional in request types, resolve from default provider (#768 )	2026-02-18 04:43:59 -08:00
Salman Paracha	0557f7ff98	updated the models list to include models like Opus 4.6 (#753 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2026-02-13 15:08:11 -08:00
Adil Hafeez	e41aa0a617	upgrade rust to 1.93.0 and fix pre-commit (#720 )	2026-02-02 11:03:12 -08:00
Salman Paracha	2941392ed1	Adding support for wildcard models in the model_providers config (#696 ) * cleaning up plano cli commands * adding support for wildcard model providers * fixing compile errors * fixing bugs related to default model provider, provider hint and duplicates in the model provider list * fixed cargo fmt issues * updating tests to always include the model id * using default for the prompt_gateway path * fixed the model name, as gpt-5-mini-2025-08-07 wasn't in the config * making sure that all aliases and models match the config * fixed the config generator to allow for base_url providers LLMs to include wildcard models * re-ran the models list utility and added a shell script to run it * updating docs to mention wildcard model providers * updated provider_models.json to yaml, added that file to our docs for reference * updating the build docs to use the new root-based build --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2026-01-28 17:47:33 -08:00
Salman Paracha	cdc1d7cee2	making Messages.Content optional, and having the upstream LLM fail if the right fields aren't set (#699 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2026-01-16 16:24:03 -08:00
Salman Paracha	b4543ba56c	Introduce signals change (#655 ) * adding support for signals * reducing false positives for signals like positive interaction * adding docs. Still need to fix the messages list, but waiting on PR #621 * Improve frustration detection: normalize contractions and refine punctuation * Further refine test cases with longer messages * minor doc changes * fixing echo statement for build * fixing the messages construction and using the trait for signals * update signals docs * fixed some minor doc changes * added more tests and fixed docuemtnation. PR 100% ready * made fixes based on PR comments * Optimize latency 1. replace sliding window approach with trigram containment check 2. add code to pre-compute ngrams for patterns * removed some debug statements to make tests easier to read * PR comments to make ObservableStreamProcessor accept optonal Vec<Messagges> * fixed PR comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local> Co-authored-by: MeiyuZhong <mariazhong9612@gmail.com> Co-authored-by: nehcgs <54548843+nehcgs@users.noreply.github.com>	2026-01-07 11:20:44 -08:00
Adil Hafeez	ca95ffb63d	cargo clippy (#660 )	2025-12-25 21:08:37 -08:00
Salman Paracha	48bbc7cce7	fixed reasoning failures (#634 ) * fixed reasoning failures * adding debugging * made several fixes for transmission isses for SSeEvents, incomplete handling of json types by anthropic, and wrote a bunch of tests * removed debugging from supervisord.conf --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-18 11:02:59 -08:00
Adil Hafeez	2f9121407b	Use mcp tools for filter chain (#621 ) * agents framework demo * more changes * add more changes * pending changes * fix tests * fix more * rebase with main and better handle error from mcp * add trace for filters * add test for client error, server error and for mcp error * update schema validate code and rename kind => type in agent_filter * fix agent description and pre-commit * fix tests * add provider specific request parsing in agents chat * fix precommit and tests * cleanup demo * update readme * fix pre-commit * refactor tracing * fix fmt * fix: handle MessageContent enum in responses API conversion - Update request.rs to handle new MessageContent enum structure from main - MessageContent can now be Text(String) or Items(Vec<InputContent>) - Handle new InputItem variants (ItemReference, FunctionCallOutput) - Fixes compilation error after merging latest main (#632) * address pr feedback * fix span * fix build * update openai version	2025-12-17 17:30:14 -08:00
Salman Paracha	d5a273f740	enable state management for v1/responses (#631 ) * first commit with tests to enable state mamangement via memory * fixed logs to follow the conversational flow a bit better * added support for supabase * added the state_storage_v1_responses flag, and use that to store state appropriately * cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo * fixed mixed inputs from openai v1/responses api (#632) * fixed mixed inputs from openai v1/responses api * removing tracing from model-alias-rouing * handling additional input types from openairs --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local> * resolving PR comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-17 12:18:38 -08:00
Salman Paracha	33e90dd338	fixed mixed inputs from openai v1/responses api (#632 ) * fixed mixed inputs from openai v1/responses api * removing tracing from model-alias-rouing * handling additional input types from openairs --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-16 13:39:13 -08:00
Salman Paracha	a79f55f313	Improve end to end tracing (#628 ) * adding canonical tracing support via bright-staff * improved formatting for tools in the traces * removing anthropic from the currency exchange demo * using Envoy to transport traces, not calling OTEL directly * moving otel collcetor cluster outside tracing if/else * minor fixes to not write to the OTEL collector if tracing is disabled * fixed PR comments and added more trace attributes * more fixes based on PR comments * more clean up based on PR comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-11 15:21:57 -08:00
Salman Paracha	a448c6e9cb	Add support for v1/responses API (#622 ) * making first commit. still need to work on streaming respones * making first commit. still need to work on streaming respones * stream buffer implementation with tests * adding grok API keys to workflow * fixed changes based on code review * adding support for bedrock models * fixed issues with translation to claude code --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-12-03 14:58:26 -08:00
Salman Paracha	88c2bd1851	removing model_server python module to brightstaff (function calling) (#615 ) * adding function_calling functionality via rust * fixed rendered YAML file * removed model_server from envoy.template and forwarding traffic to bright_staff * fixed bugs in function_calling.rs that were breaking tests. All good now * updating e2e test to clean up disk usage * removing Arch* models to be used as a default model if one is not specified * if the user sets arch-function base_url we should honor it * fixing demos as we needed to pin to a particular version of huggingface_hub else the chatbot ui wouldn't build * adding a constant for Arch-Function model name * fixing some edge cases with calls made to Arch-Function * fixed JSON parsing issues in function_calling.rs * fixed bug where the raw response from Arch-Function was re-encoded * removed debug from supervisord.conf * commenting out disk cleanup * adding back disk space --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>	2025-11-22 12:55:00 -08:00
Salman Paracha	cdfcfb9169	support base_url path for model providers (#608 ) * adding support for base_url * updated docs * fixed tests for config generator * making fixes based on PR comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>	2025-10-29 17:08:07 -07:00
Salman Paracha	566e7b9c09	fixed bug in Bedrock translation code and dramatically improved tracing for outbound LLM traffic (#601 ) * dramatically improve LLM traces and fixed bug with Bedrock translation from claude code * addressing comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>	2025-10-24 14:07:05 -07:00
Salman Paracha	9407ae6af7	Add support for Amazon Bedrock Converse and ConverseStream (#588 ) * first commit to get Bedrock Converse API working. Next commit support for streaming and binary frames * adding translation from BedrockBinaryFrameDecoder to AnthropicMessagesEvent * Claude Code works with Amazon Bedrock * added tests for openai streaming from bedrock * PR comments fixed * adding support for bedrock in docs as supported provider * cargo fmt * revertted to chatgpt models for claude code routing --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local> Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>	2025-10-22 11:31:21 -07:00
Adil Hafeez	96e0732089	add support for agents (#564 )	2025-10-14 14:01:11 -07:00
Salman Paracha	226139e907	adding support for Qwen models and fixed issue with passing PATH vari… (#583 ) * adding support for Qwen models and fixed issue with passing PATH variable * don't need to have qwen in the model alias routing example * fixed base_url for qwen --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>	2025-10-01 21:57:58 -07:00
Salman Paracha	045a5e9751	adding support for moonshot and z-ai (#578 ) * adding support for moonshot and z-ai * Revert unwanted changes to arch_config.yaml --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>	2025-09-30 12:24:06 -07:00
Salman Paracha	f00870dccb	adding support for claude code routing (#575 ) * fixed for claude code routing. first commit * removing redundant enum tags for cache_control * making sure that claude code can run via the archgw cli * fixing broken config * adding a README.md and updated the cli to use more of our defined patterns for params * fixed config.yaml * minor fixes to make sure PR is clean. Ready to ship * adding claude-sonnet-4-5 to the config * fixes based on PR * fixed alias for README * fixed 400 error handling tests, now that we write temperature to 1.0 for GPT-5 --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-257.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>	2025-09-29 19:23:08 -07:00
Salman Paracha	03c2cf6f0d	fixed changes related to max_tokens and processing http error codes like 400 properly (#574 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-257.local>	2025-09-25 17:00:37 -07:00
Salman Paracha	fbe82351c0	Salmanap/fix docs new providers model alias (#571 ) * fixed docs and added ollama as a first-class LLM provider * matching the LLM routing section on the README.md to the docs * updated the section on preference-based routing --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>	2025-09-19 10:19:57 -07:00
Salman Paracha	8d0b468345	draft commit to add support for xAI, TogehterAI, AzureOpenAI (#570 ) * draft commit to add support for xAI, LambdaAI, TogehterAI, AzureOpenAI * fixing failing tests and updating rederend config file * Update arch_config_with_aliases.yaml * adding the AZURE_API_KEY to the GH workflow for e2e * fixing GH secerts * adding valdiating for azure_openai --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>	2025-09-18 18:36:30 -07:00
Adil Hafeez	3eb6af8829	add default implementation for common openai types (#568 )	2025-09-16 12:48:07 -07:00
Salman Paracha	4eb2b410c5	adding support for model aliases in archgw (#566 ) * adding support for model aliases in archgw * fixed PR based on feedback * removing README. Not relevant for PR --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>	2025-09-16 11:12:08 -07:00
Salman Paracha	fb0581fd39	add support for v1/messages and transformations (#558 ) * pushing draft PR * transformations are working. Now need to add some tests next * updated tests and added necessary response transformations for Anthropics' message response object * fixed bugs for integration tests * fixed doc tests * fixed serialization issues with enums on response * adding some debug logs to help * fixed issues with non-streaming responses * updated the stream_context to update response bytes * the serialized bytes length must be set in the response side * fixed the debug statement that was causing the integration tests for wasm to fail * fixing json parsing errors * intentionally removing the headers * making sure that we convert the raw bytes to the correct provider type upstream * fixing non-streaming responses to tranform correctly * /v1/messages works with transformations to and from /v1/chat/completions * updating the CLI and demos to support anthropic vs. claude * adding the anthropic key to the preference based routing tests * fixed test cases and added more structured logs * fixed integration tests and cleaned up logs * added python client tests for anthropic and openai * cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo * fixing the tests. 
python dependency order was broken * updated the openAI client to fix demos * removed the raw response debug statement * fixed the dup cloning issue and cleaned up the ProviderRequestType enum and traits * fixing logs * moved away from string literals to consts * fixed streaming from Anthropic Client to OpenAI * removed debug statement that would likely trip up integration tests * fixed integration tests for llm_gateway * cleaned up test cases and removed unnecessary crates * fixing comments from PR * fixed bug whereby we were sending an OpenAIChatCompletions request object to llm_gateway even though the request may have been AnthropicMessages --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-9.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-10.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-41.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>	2025-09-10 07:40:30 -07:00
Salman Paracha	89ab51697a	updating the implementation of /v1/chat/completions to use the generi… (#548 ) * updating the implementation of /v1/chat/completions to use the generic provider interfaces * saving changes, although we will need a small re-factor after this as well * more refactoring changes, getting close * more refactoring changes to avoid unecessary re-direction and duplication * more clean up * more refactoring * more refactoring to clean code and make stream_context.rs work * removing unecessary trait implemenations * some more clean-up * fixed bugs * fixing test cases, and making sure all references to the ChatCOmpletions* objects point to the new types * refactored changes to support enum dispatch * removed the dependency on try_streaming_from_bytes into a try_from trait implementation * updated readme based on new usage * updated code based on code review comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-2.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>	2025-08-20 12:55:29 -07:00
Salman Paracha	93ff4d7b1f	pushing new apis module for hermes (#547 )	2025-08-07 12:42:09 -07:00
Adil Hafeez	04c7e5a175	bug fix - allow image content to pass through (#539 ) fixes https://github.com/katanemo/archgw/issues/535	2025-07-25 01:22:06 -07:00
Adil Hafeez	00dc95e034	Add support for updating model preferences (#510 )	2025-07-02 14:08:19 -07:00

53 commits