removing model_server python module to brightstaff (function calling) (#615)

* adding function_calling functionality via rust

* fixed rendered YAML file

* removed model_server from envoy.template and forwarding traffic to bright_staff

* fixed bugs in function_calling.rs that were breaking tests. All good now

* updating e2e test to clean up disk usage

* removing Arch* models to be used as a default model if one is not specified

* if the user sets arch-function base_url we should honor it

* fixing demos as we needed to pin to a particular version of huggingface_hub else the chatbot ui wouldn't build

* adding a constant for Arch-Function model name

* fixing some edge cases with calls made to Arch-Function

* fixed JSON parsing issues in function_calling.rs

* fixed bug where the raw response from Arch-Function was re-encoded

* removed debug from supervisord.conf

* commenting out disk cleanup

* adding back disk space

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-288.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-342.local>
Salman Paracha 2025-11-22 12:55:00 -08:00 committed by GitHub
parent 126b029345
commit 88c2bd1851
40 changed files with 2517 additions and 1356 deletions

@@ -1,4 +1,4 @@
-@model_server_endpoint = http://localhost:51000
+@model_server_endpoint = http://localhost:12000
 @archfc_endpoint = https://archfc.katanemo.dev
 ### talk to function calling endpoint

@@ -1,4 +1,4 @@
-@model_server_endpoint = http://localhost:51000
+@model_server_endpoint = http://localhost:12000
 @archfc_endpoint = https://archfc.katanemo.dev
 ### multi turn conversation with intent, except parameter gathering
@@ -54,26 +54,8 @@ Content-Type: application/json
 }
 ]
 }
-### talk to Arch-Intent directly for completion
-POST https://archfc.katanemo.dev/v1/chat/completions HTTP/1.1
-Content-Type: application/json
-{
-"model": "Arch-Intent",
-"messages": [
-{
-"role": "system",
-"content": "You are a helpful assistant.\n\nYou task is to check if there are any tools that can be used to help the last user message in conversations according to the available tools listed below.\n\n<tools>\n{\"index\": \"T0\", \"type\": \"function\", \"function\": {\"name\": \"weather_forecast\", \"parameters\": {\"type\": \"object\", \"properties\": {\"city\": {\"type\": \"str\"}, \"days\": {\"type\": \"int\"}}, \"required\": [\"city\", \"days\"]}}}\n</tools>\n\nProvide your tool assessment for ONLY THE LAST USER MESSAGE in the above conversation:\n- First line must read 'Yes' or 'No'.\n- If yes, a second line must include a comma-separated list of tool indexes.\n"
-},
-{ "role": "user", "content": "hi" }
-],
-"stream": false
-}
-### multi turn conversation with correct parameters
+### multi turn conversation with intent, except parameter gathering
 POST {{model_server_endpoint}}/function_calling HTTP/1.1
 Content-Type: application/json
@@ -125,21 +107,6 @@ Content-Type: application/json
 }
 ]
 }
-### talk to Arch-Intent directly for completion, expect No
-POST https://archfc.katanemo.dev/v1/chat/completions HTTP/1.1
-Content-Type: application/json
-{
-"model": "Arch-Intent",
-"messages": [
-{
-"role": "system",
-"content": "You are a helpful assistant.\n\nYou task is to check if there are any tools that can be used to help the last user message in conversations according to the available tools listed below.\n\n<tools>\n{\"index\": \"T0\", \"type\": \"function\", \"function\": {\"name\": \"weather_forecast\", \"parameters\": {\"type\": \"object\", \"properties\": {\"city\": {\"type\": \"str\"}, \"days\": {\"type\": \"int\"}}, \"required\": [\"city\", \"days\"]}}}\n</tools>\n\nProvide your tool assessment for ONLY THE LAST USER MESSAGE in the above conversation:\n- First line must read 'Yes' or 'No'.\n- If yes, a second line must include a comma-separated list of tool indexes.\n"
-},
-{ "role": "user", "content": "what is your name" }
-],
-"stream": false
-}
 ### multi turn conversation with correct parameters
 POST {{model_server_endpoint}}/function_calling HTTP/1.1

@@ -1,4 +1,4 @@
-@model_server_endpoint = http://localhost:51000
+@model_server_endpoint = http://localhost:12000
 @archfc_endpoint = https://archfc.katanemo.dev
 ### single turn function calling all parameters insurance agent summary