Commit graph

11 commits

Author SHA1 Message Date
Adil Hafeez
e7ce00b5a7
rename cli to plano (#647) 2025-12-23 18:37:58 -08:00
Salman Paracha
88c2bd1851
removing model_server python module to brightstaff (function calling) (#615)
* adding function_calling functionality via rust

* fixed rendered YAML file

* removed model_server from envoy.template and forwarding traffic to bright_staff

* fixed bugs in function_calling.rs that were breaking tests. All good now

* updating e2e test to clean up disk usage

* removing Arch* models to be used as a default model if one is not specified

* if the user sets arch-function base_url we should honor it

* fixing demos as we needed to pin to a particular version of huggingface_hub else the chatbot ui wouldn't build

* adding a constant for Arch-Function model name

* fixing some edge cases with calls made to Arch-Function

* fixed JSON parsing issues in function_calling.rs

* fixed bug where the raw response from Arch-Function was re-encoded

* removed debug from supervisord.conf

* commenting out disk cleanup

* adding back disk space

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
2025-11-22 12:55:00 -08:00
Salman Paracha
fb0581fd39
add support for v1/messages and transformations (#558)
* pushing draft PR

* transformations are working. Now need to add some tests next

* updated tests and added necessary response transformations for Anthropics' message response object

* fixed bugs for integration tests

* fixed doc tests

* fixed serialization issues with enums on response

* adding some debug logs to help

* fixed issues with non-streaming responses

* updated the stream_context to update response bytes

* the serialized bytes length must be set in the response side

* fixed the debug statement that was causing the integration tests for wasm to fail

* fixing json parsing errors

* intentionally removing the headers

* making sure that we convert the raw bytes to the correct provider type upstream

* fixing non-streaming responses to tranform correctly

* /v1/messages works with transformations to and from /v1/chat/completions

* updating the CLI and demos to support anthropic vs. claude

* adding the anthropic key to the preference based routing tests

* fixed test cases and added more structured logs

* fixed integration tests and cleaned up logs

* added python client tests for anthropic and openai

* cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo

* fixing the tests. python dependency order was broken

* updated the openAI client to fix demos

* removed the raw response debug statement

* fixed the dup cloning issue and cleaned up the ProviderRequestType enum and traits

* fixing logs

* moved away from string literals to consts

* fixed streaming from Anthropic Client to OpenAI

* removed debug statement that would likely trip up integration tests

* fixed integration tests for llm_gateway

* cleaned up test cases and removed unnecessary crates

* fixing comments from PR

* fixed bug whereby we were sending an OpenAIChatCompletions request object to llm_gateway even though the request may have been AnthropicMessages

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-9.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-10.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-41.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>
2025-09-10 07:40:30 -07:00
Shuguang Chen
7d4b261a68
Integrate Arch-Function-Chat (#449) 2025-04-15 14:39:12 -07:00
Adil Hafeez
eb48f3d5bb
use passed in model name in chat completion request (#445) 2025-03-21 15:56:17 -07:00
Adil Hafeez
d3c17c7abd
move custom tracer to llm filter (#267) 2024-11-15 10:44:01 -08:00
Adil Hafeez
d87105882b
update rust toolchain to 1.82 (#255)
* update rust to 1.82 pin it, also update envoy to 1.32 and python to 3.13

* use python:3.12
2024-11-12 10:35:14 -08:00
Adil Hafeez
6b62662e01
update docs with weather_forecast path (#253) 2024-11-08 10:00:15 -08:00
Salman Paracha
dab7a44053
several fixes to demos (#238)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-30 18:38:18 -07:00
Salman Paracha
bb882fb59b
Updated hr_agent to be full stack: gradio + fastAPI (#235)
* commiting to remove

* fix

* updating hr_agent

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-10-30 15:05:34 -07:00
Salman Paracha
bb9a774a72
moving chatbot-ui in demos and out of root project structure (#228)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-29 12:05:29 -07:00