plano

mirror of https://github.com/katanemo/plano.git synced 2026-04-25 00:36:34 +02:00

Author	SHA1	Message	Date
Salman Paracha	03c2cf6f0d	fixed changes related to max_tokens and processing http error codes like 400 properly (#574 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-257.local>	2025-09-25 17:00:37 -07:00
Salman Paracha	8d0b468345	draft commit to add support for xAI, TogetherAI, AzureOpenAI (#570 ) * draft commit to add support for xAI, LambdaAI, TogetherAI, AzureOpenAI * fixing failing tests and updating rendered config file * Update arch_config_with_aliases.yaml * adding the AZURE_API_KEY to the GH workflow for e2e * fixing GH secrets * adding validation for azure_openai --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-167.local>	2025-09-18 18:36:30 -07:00
Salman Paracha	fb0581fd39	add support for v1/messages and transformations (#558 ) * pushing draft PR * transformations are working. Now need to add some tests next * updated tests and added necessary response transformations for Anthropic's message response object * fixed bugs for integration tests * fixed doc tests * fixed serialization issues with enums on response * adding some debug logs to help * fixed issues with non-streaming responses * updated the stream_context to update response bytes * the serialized bytes length must be set in the response side * fixed the debug statement that was causing the integration tests for wasm to fail * fixing json parsing errors * intentionally removing the headers * making sure that we convert the raw bytes to the correct provider type upstream * fixing non-streaming responses to transform correctly * /v1/messages works with transformations to and from /v1/chat/completions * updating the CLI and demos to support anthropic vs. claude * adding the anthropic key to the preference based routing tests * fixed test cases and added more structured logs * fixed integration tests and cleaned up logs * added python client tests for anthropic and openai * cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo * fixing the tests. python dependency order was broken * updated the openAI client to fix demos * removed the raw response debug statement * fixed the dup cloning issue and cleaned up the ProviderRequestType enum and traits * fixing logs * moved away from string literals to consts * fixed streaming from Anthropic Client to OpenAI * removed debug statement that would likely trip up integration tests * fixed integration tests for llm_gateway * cleaned up test cases and removed unnecessary crates * fixing comments from PR * fixed bug whereby we were sending an OpenAIChatCompletions request object to llm_gateway even though the request may have been AnthropicMessages --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-9.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-10.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-41.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>	2025-09-10 07:40:30 -07:00
Salman Paracha	89ab51697a	updating the implementation of /v1/chat/completions to use the generi… (#548 ) * updating the implementation of /v1/chat/completions to use the generic provider interfaces * saving changes, although we will need a small re-factor after this as well * more refactoring changes, getting close * more refactoring changes to avoid unnecessary re-direction and duplication * more clean up * more refactoring * more refactoring to clean code and make stream_context.rs work * removing unnecessary trait implementations * some more clean-up * fixed bugs * fixing test cases, and making sure all references to the ChatCompletions* objects point to the new types * refactored changes to support enum dispatch * removed the dependency on try_streaming_from_bytes into a try_from trait implementation * updated readme based on new usage * updated code based on code review comments --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-2.local> Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>	2025-08-20 12:55:29 -07:00
Adil Hafeez	a7fddf30f9	better model names (#517 )	2025-07-11 16:42:16 -07:00
Adil Hafeez	147908ba7e	make arch-router cluster optional (#518 )	2025-07-08 00:33:40 -07:00
Adil Hafeez	00dc95e034	Add support for updating model preferences (#510 )	2025-07-02 14:08:19 -07:00
Adil Hafeez	aa9d747fa9	add support for gemini (#505 )	2025-06-11 15:15:00 -07:00
Adil Hafeez	6c53510f49	Introduce hermesllm library to handle llm message translation (#501 )	2025-06-10 12:53:27 -07:00
Adil Hafeez	218e9c540d	Add support for json based content types in Message (#480 )	2025-05-23 00:51:53 -07:00
Adil Hafeez	f5e77bbe65	add support for claude and add first class support for groq and deepseek (#479 )	2025-05-22 22:55:46 -07:00
Adil Hafeez	27c0f2fdce	Introduce brightstaff a new terminal service for llm routing (#477 )	2025-05-19 09:59:22 -07:00
Shuguang Chen	7d4b261a68	Integrate Arch-Function-Chat (#449 )	2025-04-15 14:39:12 -07:00
Salman Paracha	f31aa59fac	fixed issue with groq LLMs that require the openai in the /v1/chat/co… (#460 ) * fixed issue with groq LLMs that require the openai in the /v1/chat/completions path. My first change * updated the GH actions with keys for Groq * adding missing groq API keys * add llama-3.2-3b-preview to the model based on adding groq to the demo --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2025-04-13 14:00:16 -07:00
Adil Hafeez	de221525de	Use better logs (#452 )	2025-03-27 10:40:20 -07:00
Adil Hafeez	76ec5cda68	fix ollama demo (#450 )	2025-03-26 11:01:32 -07:00
Adil Hafeez	eb48f3d5bb	use passed in model name in chat completion request (#445 )	2025-03-21 15:56:17 -07:00
Adil Hafeez	84cd1df7bf	add preliminary support for llm agents (#432 )	2025-03-19 15:21:34 -07:00
Adil Hafeez	ed3845040e	add demo for deepseek (#426 )	2025-03-05 14:08:06 -08:00
Adil Hafeez	10cad4d0b7	add health check endpoint for llm gateway (#420 ) * add health check endpoint for llm gateway * fix rust tests	2025-03-03 13:11:57 -08:00
Adil Hafeez	e40b13be05	Update arch_config and add tests for arch config file (#407 )	2025-02-14 19:28:10 -08:00
Adil Hafeez	39266b5084	log improvements and some code refactor (#379 )	2025-01-31 10:37:53 -08:00
Adil Hafeez	6887d52750	When using ollama token count was not coming in (#375 ) When using ollama, the token count was not coming in, causing the token count and other metrics to show up as zero. This was not causing tracing to break.	2025-01-21 18:01:56 -08:00
Adil Hafeez	07ef3149b8	add support for using custom upstream llm (#365 )	2025-01-17 18:25:55 -08:00
José Ulises Niño Rivera	cd1b561192	Break apart metrics into their own module (#335 ) Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>	2024-12-09 10:46:46 -08:00
José Ulises Niño Rivera	d002b2042a	Break apart common_types mod (#334 ) Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>	2024-12-06 17:25:42 -08:00
Adil Hafeez	36489b4adc	use envoy to publish traces (#270 )	2024-11-18 17:55:39 -08:00
Adil Hafeez	097513ee60	fix start time of llm filter (#278 ) * fix start time of llm filter * fix int tests	2024-11-17 17:01:19 -08:00
Adil Hafeez	d3c17c7abd	move custom tracer to llm filter (#267 )	2024-11-15 10:44:01 -08:00
Aayush	1d229cba8f	Add in tpot (#269 ) * add in tpot and tokens per second * add in debug logs for new stats and update integration tests * update shared dashboard to include new stats	2024-11-14 15:03:08 -08:00
Aayush	5993e36f22	Update arch stats (#250 )	2024-11-12 15:03:26 -08:00
Adil Hafeez	9081eb0f7f	obfuscate auth header (#254 )	2024-11-08 15:17:39 -06:00
Adil Hafeez	a72bb804eb	add support for jaeger tracing (#229 )	2024-11-07 22:11:00 -06:00
José Ulises Niño Rivera	662a840ac5	Add support for streaming and fixes few issues (see description) (#202 )	2024-10-28 17:05:06 -07:00
Adil Hafeez	1719b7d5f8	Send back developer error correctly (#195 )	2024-10-18 13:14:18 -07:00
Adil Hafeez	c6ba28dfcc	Code refactor and some improvements - see description (#194 )	2024-10-18 12:53:44 -07:00
Adil Hafeez	21e7fe2cef	Split arch wasm filter code into prompt and llm gateway filters (#190 )	2024-10-17 10:16:40 -07:00
Adil Hafeez	3bd2ffe9fb	split wasm filter (#186 ) * split wasm filter * fix int and unit tests * rename public_types => common and move common code there * rename * fix int test	2024-10-16 14:20:26 -07:00

38 commits