Commit graph

19 commits

Author SHA1 Message Date
Adil Hafeez
3511798fa8
Integrate Arch-Function-Calling-1.5B model (#85)
* add arch support

* add missing file

* e2e tests

* delete old files and fix response

* fmt
2024-09-25 23:30:50 -07:00
José Ulises Niño Rivera
9ea6bb0d73
Revert "Revert "Add support for multiple LLM Providers (#60)"" (#83)
* Revert "Revert "Add support for multiple LLM Providers (#60)""

This reverts commit 43d6bc80e9.

* wip

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>

* Revert "wip"

This reverts commit 7c4dde5d1f.

* fix parameter name

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>

* force use openai

---------

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-09-25 23:15:17 -07:00
Adil Hafeez
43d6bc80e9
Revert "Add support for multiple LLM Providers (#60)"
This reverts commit bd8206742a.
2024-09-25 08:15:22 -07:00
Adil Hafeez
d970b214f4
improve logging of api failure (#79)
2024-09-24 23:56:24 -07:00
José Ulises Niño Rivera
bd8206742a
Add support for multiple LLM Providers (#60)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-09-24 17:13:55 -07:00
Adil Hafeez
87900beddd
Remove OMF (#78)
* Remove OMF

* remove submodule from github workflow
2024-09-24 15:18:20 -07:00
Adil Hafeez
eff4cd9826
improve response handling (#71)
2024-09-23 22:56:35 -07:00
Co Tran
79b1c5415f
[Kan-103] add support toxic/jailbreak model (#49)
* add toxic/jailbreak model

* fix path loading model

* fix syntax

* fix bug,lint, format

* fix bug

* formatting

* add parallel + chunking

* fix bug

* working version

* fix onnnx name erorr

* device

* fix jailbreak config

* fix syntax error

* format

* add requirement + cli download for dockerfile

* add task

* add skeleton change for envoy filter for prompt guard

* fix hardware config

* fix bug

* add config changes

* add gitignore

* merge main

* integrate arch-guard with filter

* add hardware config

* nothing

* add hardware config feature

* fix requirement

* fix chat ui

* fix onnx

* fix lint

* remove non intel cpu

* remove onnx

* working version

* modify docker

* fix guard time

* add nvidia support

* remove nvidia

* add gpu

* add gpu

* add gpu support

* add gpu support for compose

* add gpu support for compose

* add gpu support for compose

* add gpu support for compose

* add gpu support for compose

* fix docker file

* fix int test

* correct gpu docker

* upgrad python 10

* fix logits to be gpu compatible

* default to cpu dockerfile

* resolve comments

* fix lint + unused parameters

* fix

* remove eetq install for cpu

* remove deploy gpu

---------

Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-09-23 12:07:31 -07:00
Adil Hafeez
4438dc7979
remove embeddings config from config (#64)
* remove embeddings config from config

* remove embedding provider
2024-09-19 17:49:42 -07:00
Adil Hafeez
2cd5ec5adf
use openai standard response in arch-fc and in gradio client (#62)
* use openai standard response in arch-fc and in gradio client

also fix code bug in usage

* fix int test
2024-09-19 12:19:14 -07:00
José Ulises Niño Rivera
9f3c845610
Add ability to stream a response (#50)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-09-17 16:12:41 -07:00
Adil Hafeez
060a0d665e
improve service names (#54)
- embedding-server => model_server
- public-types => public_types
- chatbot-ui => chatbot_ui
- function-calling => function_calling
2024-09-17 08:47:35 -07:00
Adil Hafeez
9e50957f22
Improve prompt target intent matching (#51)
2024-09-16 19:20:07 -07:00
Adil Hafeez
7b5203a2ce
Add function calling support using bolt-fc-1b (#35)
2024-09-10 14:24:46 -07:00
José Ulises Niño Rivera
dd48689aee
Add Ratelimit on request tokens (#44)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-09-04 17:28:12 -07:00
José Ulises Niño Rivera
d98517f240
Move shared types into their own crate (#41)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-09-04 15:31:05 -07:00
Adil Hafeez
33f9dd22e6
Add workflow logic for weather forecast demo (#24)
2024-07-30 16:23:23 -07:00
José Ulises Niño Rivera
7ef68eccfb
Improve error handling (#23)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-07-29 12:15:26 -07:00
José Ulises Niño Rivera
a51a467cad
Add initial integration style tests (#20)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-07-25 14:41:36 -07:00