plano

mirror of https://github.com/katanemo/plano.git synced 2026-04-26 01:06:25 +02:00

Author	SHA1	Message	Date
Adil Hafeez	f4d65e2469	stream access logs and improve access log format (#581 )	2025-09-30 18:46:13 -07:00
Salman Paracha	045a5e9751	adding support for moonshot and z-ai (#578 ) * adding support for moonshot and z-ai * Revert unwanted changes to arch_config.yaml --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>	2025-09-30 12:24:06 -07:00
Salman Paracha	f00870dccb	adding support for claude code routing (#575 ) * fixed for claude code routing. first commit * removing redundant enum tags for cache_control * making sure that claude code can run via the archgw cli * fixing broken config * adding a README.md and updated the cli to use more of our defined patterns for params * fixed config.yaml * minor fixes to make sure PR is clean. Ready to ship * adding claude-sonnet-4-5 to the config * fixes based on PR * fixed alias for README * fixed 400 error handling tests, now that we write temperature to 1.0 for GPT-5 --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-257.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>	2025-09-29 19:23:08 -07:00
Salman Paracha	8d0b468345	draft commit to add support for xAI, TogehterAI, AzureOpenAI (#570 ) * draft commit to add support for xAI, LambdaAI, TogehterAI, AzureOpenAI * fixing failing tests and updating rederend config file * Update arch_config_with_aliases.yaml * adding the AZURE_API_KEY to the GH workflow for e2e * fixing GH secerts * adding valdiating for azure_openai --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>	2025-09-18 18:36:30 -07:00
Salman Paracha	fb0581fd39	add support for v1/messages and transformations (#558 ) * pushing draft PR * transformations are working. Now need to add some tests next * updated tests and added necessary response transformations for Anthropics' message response object * fixed bugs for integration tests * fixed doc tests * fixed serialization issues with enums on response * adding some debug logs to help * fixed issues with non-streaming responses * updated the stream_context to update response bytes * the serialized bytes length must be set in the response side * fixed the debug statement that was causing the integration tests for wasm to fail * fixing json parsing errors * intentionally removing the headers * making sure that we convert the raw bytes to the correct provider type upstream * fixing non-streaming responses to tranform correctly * /v1/messages works with transformations to and from /v1/chat/completions * updating the CLI and demos to support anthropic vs. claude * adding the anthropic key to the preference based routing tests * fixed test cases and added more structured logs * fixed integration tests and cleaned up logs * added python client tests for anthropic and openai * cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo * fixing the tests. python dependency order was broken * updated the openAI client to fix demos * removed the raw response debug statement * fixed the dup cloning issue and cleaned up the ProviderRequestType enum and traits * fixing logs * moved away from string literals to consts * fixed streaming from Anthropic Client to OpenAI * removed debug statement that would likely trip up integration tests * fixed integration tests for llm_gateway * cleaned up test cases and removed unnecessary crates * fixing comments from PR * fixed bug whereby we were sending an OpenAIChatCompletions request object to llm_gateway even though the request may have been AnthropicMessages --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-9.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-10.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-41.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>	2025-09-10 07:40:30 -07:00
Adil Hafeez	aa9d747fa9	add support for gemini (#505 )	2025-06-11 15:15:00 -07:00
Adil Hafeez	e43d41ba32	add support for bortli compression (#502 )	2025-06-05 17:00:14 -07:00
Adil Hafeez	8d12a9a6e0	add arch provider (#494 )	2025-05-30 17:12:52 -07:00
Adil Hafeez	4899117876	add compress/decompress filter to llm listener (#489 )	2025-05-28 15:06:52 -07:00
Adil Hafeez	f5e77bbe65	add support for claude and add first class support for groq and deepseek (#479 )	2025-05-22 22:55:46 -07:00
Adil Hafeez	27c0f2fdce	Introduce brightstaff a new terminal service for llm routing (#477 )	2025-05-19 09:59:22 -07:00
Adil Hafeez	84cd1df7bf	add preliminary support for llm agents (#432 )	2025-03-19 15:21:34 -07:00
Adil Hafeez	e40b13be05	Update arch_config and add tests for arch config file (#407 )	2025-02-14 19:28:10 -08:00
Adil Hafeez	2bd61d628c	add ability to specify custom http headers in api endpoint (#386 )	2025-02-06 11:48:09 -08:00
Adil Hafeez	962727f244	Infer port from protocol if port is not specified and add ability to override hostname in clusters def (#389 )	2025-02-03 14:51:59 -08:00
Adil Hafeez	38f7691163	add support for custom llm with ssl support (#380 ) * add support for custom llm with ssl support Add support for using custom llm that are served through https protocol. * add instructions on how to add custom inference endpoint * fix formatting * add more details * Apply suggestions from code review Co-authored-by: Salman Paracha <salman.paracha@gmail.com> * Apply suggestions from code review * fix precommit --------- Co-authored-by: Salman Paracha <salman.paracha@gmail.com>	2025-01-24 17:14:24 -08:00
Adil Hafeez	07ef3149b8	add support for using custom upstream llm (#365 )	2025-01-17 18:25:55 -08:00
Shuguang Chen	ba7279becb	Use intent model from archfc to pick prompt gateway (#328 )	2024-12-20 13:25:01 -08:00
Aayush	67b8fd635e	add more granular bucket sizes for ttft (#343 ) * add more granular bucket sizes for ttft	2024-12-12 14:38:36 -08:00
Adil Hafeez	a54db1a098	update getting started guide and add llm gateway and prompt gateway samples (#330 )	2024-12-06 14:37:33 -08:00
Adil Hafeez	36489b4adc	use envoy to publish traces (#270 )	2024-11-18 17:55:39 -08:00
Adil Hafeez	d3c17c7abd	move custom tracer to llm filter (#267 )	2024-11-15 10:44:01 -08:00
Adil Hafeez	30647fd508	Add service to stream custom otel traces to otel-collector (#262 )	2024-11-12 11:09:40 -08:00
Adil Hafeez	a72bb804eb	add support for jaeger tracing (#229 )	2024-11-07 22:11:00 -06:00
José Ulises Niño Rivera	662a840ac5	Add support for streaming and fixes few issues (see description) (#202 )	2024-10-28 17:05:06 -07:00
Adil Hafeez	faf64960df	update observability and dashboards (#198 )	2024-10-18 15:07:49 -07:00
Adil Hafeez	21e7fe2cef	Split arch wasm filter code into prompt and llm gateway filters (#190 )	2024-10-17 10:16:40 -07:00
Adil Hafeez	3bd2ffe9fb	split wasm filter (#186 ) * split wasm filter * fix int and unit tests * rename public_types => common and move common code there * rename * fix int test	2024-10-16 14:20:26 -07:00
Adil Hafeez	3e9327cf36	fix bug in jinja template for tracing	2024-10-09 16:44:50 -07:00
Adil Hafeez	e81ca8d5cf	llm listener split (#155 )	2024-10-09 15:47:32 -07:00
Adil Hafeez	ede125a4f3	ensure that tracing is optional in arch_config (#149 )	2024-10-08 17:15:40 -07:00
Adil Hafeez	285aa1419b	Split listener (#141 )	2024-10-08 16:24:08 -07:00
Salman Paracha	b60ceb9168	model server build (#127 ) * first commit to have model_server not be dependent on Docker * making changes to fix the docker-compose file for archgw to set DNS_V4 and minor fixes with the build * additional fixes for model server to be separated out in the build * additional fixes for model server to be separated out in the build * fix to get model_server to be built as a separate python process. TODO: fix the embeddings logs after cli completes * fixing init to pull tempfile using the tempfile python package --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2024-10-06 18:21:43 -07:00
Adil Hafeez	2a747df7c0	don't compute embeddings for names and other fixes see description (#126 ) * serialize tools - 2 * fix int tests * fix int test * fix unit tests	2024-10-05 19:25:16 -07:00
Salman Paracha	dc57f119a0	archgw cli (#117 ) * initial commit of the insurange agent demo, with the CLI tool * committing the cli * fixed some field descriptions for generate-prompt-targets * CLI works with buil, up and down commands. Function calling example works stand-alone * fixed README to install archgw cli * fixing based on feedback * fixing based on feedback --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2024-10-03 18:21:27 -07:00
José Ulises Niño Rivera	8ea917aae5	Add the ability to use LLM Providers from the Arch config (#112 ) Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>	2024-10-03 10:57:01 -07:00
Aayush	c8d0dbec26	change default stat_prefix from ingress_http to arch (#109 ) * change default stat_prefix from ingress_http to arch * Update arch/envoy.template.yaml Co-authored-by: Adil Hafeez <adil@katanemo.com> --------- Co-authored-by: Adil Hafeez <adil@katanemo.com>	2024-10-02 18:21:33 -07:00
Adil Hafeez	1a7c1ad0a5	rename archgw_model_sever => model_server (#106 )	2024-10-01 11:24:43 -07:00
Salman Paracha	8654d3d5c5	simplify developer getting started experience (#102 ) * Fixed build. Now, we have a bare bones version of the docker-compose file with only two services, archgw and archgw-model-server. Tested using CLI * some pre-commit fixes * fixed cargo formatting issues * fixed model server conflict changes --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2024-10-01 10:02:23 -07:00
Adil Hafeez	f4395d39f9	Fold function_resolver into model_server (#103 )	2024-10-01 09:13:50 -07:00
Adil Hafeez	cc35eb0cd7	update config (#93 )	2024-09-30 17:49:05 -07:00
Adil Hafeez	ea86f73605	rename envoyfilter => arch (#91 ) * rename envoyfilter => arch * fix more files * more fixes * more renames	2024-09-27 16:41:39 -07:00

42 commits