Commit graph

628 commits

Author SHA1 Message Date
Adil Hafeez
0c572dc862 fix connect_timeout value in ref file 2024-10-01 13:36:25 -07:00
Adil Hafeez
dbb8f87787
update path for arch_config.yaml file (#107) 2024-10-01 13:28:53 -07:00
Co Tran
17a643c410
ArchFC endpoint integration (#94)
* integration

* modify docker file

* add params and fix python lint

* fix empty context and tool calls

* address comments

* revert port

* fix bug merge

* fix environment

* fix bug

* fix compose

* fix merge
2024-10-01 12:47:26 -07:00
Adil Hafeez
1a7c1ad0a5
rename archgw_model_sever => model_server (#106) 2024-10-01 11:24:43 -07:00
Salman Paracha
8654d3d5c5
simplify developer getting started experience (#102)
* Fixed build. Now, we have a bare bones version of the docker-compose file with only two services, archgw and archgw-model-server. Tested using CLI

* some pre-commit fixes

* fixed cargo formatting issues

* fixed model server conflict changes

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-01 10:02:23 -07:00
Adil Hafeez
41cdef590a
arch schema validator (#105)
* add arch schema validator

* schema validator
2024-10-01 09:22:08 -07:00
Adil Hafeez
15869825e3
add messages in params when making api calls (#104) 2024-10-01 09:14:01 -07:00
Adil Hafeez
f4395d39f9
Fold function_resolver into model_server (#103) 2024-10-01 09:13:50 -07:00
José Ulises Niño Rivera
b0ce5eca93
Rename bolt_config to arch_config (#100)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-09-30 18:47:35 -07:00
Adil Hafeez
2207021b9c
remove method type (#101) 2024-09-30 17:59:29 -07:00
José Ulises Niño Rivera
f154bc3741
Remove unnecessary envoy.yaml (#99)
* Remove unnecessary envoy.yaml

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>

* more

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>

---------

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-09-30 17:49:25 -07:00
Adil Hafeez
cc35eb0cd7
update config (#93) 2024-09-30 17:49:05 -07:00
Adil Hafeez
4182879717
add precommit check (#97)
* add precommit check

* remove check

* Revert "remove check"

This reverts commit 9987b62b9b.

* fix checks

* fix whitespace errors
2024-09-30 14:54:01 -07:00
Aayush
1e61452310
changes prometheus config target to arch so that data collection works (#98) 2024-09-30 14:35:21 -07:00
Adil Hafeez
bb746e237a
add support for 3b model (#96) 2024-09-30 09:54:58 -07:00
Adil Hafeez
4d7c07a63c update ctx size to 4k 2024-09-29 17:13:05 -07:00
Adil Hafeez
ea86f73605
rename envoyfilter => arch (#91)
* rename envoyfilter => arch

* fix more files

* more fixes

* more renames
2024-09-27 16:41:39 -07:00
Salman Paracha
7168b14ed3
Salmanap/docs v1 push (#92)
* updated model serving, updated the config references, architecture docs and added the llm_provider section

* several documentation changes to improve sections like life_of_a_request, model serving subsystem

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-09-27 15:37:49 -07:00
Adil Hafeez
8a4e11077c update arch-fc parameters 2024-09-27 13:34:10 -07:00
Adil Hafeez
75cf5e5304
Add supported parameter type, validation and tests (#88)
* Add supported parameter type and validation

* make the tools format more compliant with openai

* more updates

* fix more

* fix unit test
2024-09-27 13:33:05 -07:00
Adil Hafeez
59229b8fc9 fix envoy yaml file to use v4 dns resolver for openai 2024-09-27 00:50:27 -07:00
Adil Hafeez
774c389951
add bolt support (#90)
* add support for bolt

* improve logging

* add support for bolt-fc

* fix int tests
2024-09-26 17:47:01 -07:00
Adil Hafeez
e3a835e5d3
expose access logs from envoy (#89) 2024-09-26 16:03:48 -07:00
Salman Paracha
48a2c1800c
V1 docs push (#86)
* updated docs (again)

* updated the LLMs section, prompt processing section and the RAG section of the docs

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-09-25 23:43:34 -07:00
Adil Hafeez
3511798fa8
Integrate Arch-Function-Calling-1.5B model (#85)
* add arch support

* add missing file

* e2e tests

* delete old files and fix response

* fmt
2024-09-25 23:30:50 -07:00
José Ulises Niño Rivera
9ea6bb0d73
Revert "Revert "Add support for multiple LLM Providers (#60)"" (#83)
* Revert "Revert "Add support for multiple LLM Providers (#60)""

This reverts commit 43d6bc80e9.

* wip

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>

* Revert "wip"

This reverts commit 7c4dde5d1f.

* fix parameter name

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>

* force use openai

---------

Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-09-25 23:15:17 -07:00
José Ulises Niño Rivera
370f3bb2c5
Fix bug in PromptGuard configuration (#80)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-09-25 23:36:55 -05:00
Adil Hafeez
7d130e102a remove open-message-format 2024-09-25 13:30:40 -07:00
Sampreeth Sarma
7f0fcb372b
Added Float type to the function parameter values (#77) 2024-09-25 13:29:20 -07:00
Adil Hafeez
7505a0fc1f
Update build_docs.sh 2024-09-25 12:07:06 -07:00
Adil Hafeez
bfaabe75f4
send history to bolt fc model (#84) 2024-09-25 12:03:44 -07:00
Adil Hafeez
425a080c96
add readme and docker build (#81) 2024-09-25 10:05:59 -07:00
Adil Hafeez
43d6bc80e9 Revert "Add support for multiple LLM Providers (#60)"
This reverts commit bd8206742a.
2024-09-25 08:15:22 -07:00
Adil Hafeez
d970b214f4
improve logging of api failure (#79) 2024-09-24 23:56:24 -07:00
José Ulises Niño Rivera
bd8206742a
Add support for multiple LLM Providers (#60)
Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>
2024-09-24 17:13:55 -07:00
Adil Hafeez
87900beddd
Remove OMF (#78)
* Remove OMF

* remove submodule from github workflow
2024-09-24 15:18:20 -07:00
Adil Hafeez
685144bbd7
fix demos code (#76) 2024-09-24 14:34:22 -07:00
Salman Paracha
13dff3089d
Adil/fix salman docs (#75)
* added the first set of docs for our technical docs

* more documentation changes

* added support for prompt processing and updated life of a request

* updated docs to include getting help sections and updated life of a request

* committing local changes for getting started guide, sample applications, and full reference spec for prompt-config

* updated configuration reference, added sample app skeleton, updated favico

* fixed the configuration reference file, and made minor changes to the intent detection. commit v1 for now

* Updated docs with use cases and example code, updated what is arch, and made minor changes throughout

* fixed images and minor doc fixes

* add sphinx_book_theme

* updated README, and made some minor fixes to documentation

* fixed README.md

* fixed image width

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-09-24 13:54:17 -07:00
Adil Hafeez
2d31aeaa36 fix debug log 2024-09-24 13:45:30 -07:00
Co Tran
d5d79256b0
remove guard config json (#70)
* remove guard config json

* formatting
2024-09-24 13:33:31 -07:00
Adil Hafeez
dd8c43a392
improve cluster not configured error (#73)
* improve cluster not configured error

* dont panic

* update format

* Merge branch 'main' into adil/fix_salman_docs
2024-09-24 13:24:26 -07:00
Adil Hafeez
16a8927889
add details on how to use grafana dashboards (#72) 2024-09-24 11:51:31 -07:00
Adil Hafeez
eff4cd9826
improve response handling (#71) 2024-09-23 22:56:35 -07:00
Co Tran
79b1c5415f
[Kan-103] add support toxic/jailbreak model (#49)
* add toxic/jailbreak model

* fix path loading model

* fix syntax

* fix bug, lint, format

* fix bug

* formatting

* add parallel + chunking

* fix bug

* working version

* fix onnx name error

* device

* fix jailbreak config

* fix syntax error

* format

* add requirement + cli download for dockerfile

* add task

* add skeleton change for envoy filter for prompt guard

* fix hardware config

* fix bug

* add config changes

* add gitignore

* merge main

* integrate arch-guard with filter

* add hardware config

* nothing

* add hardware config feature

* fix requirement

* fix chat ui

* fix onnx

* fix lint

* remove non intel cpu

* remove onnx

* working version

* modify docker

* fix guard time

* add nvidia support

* remove nvidia

* add gpu

* add gpu

* add gpu support

* add gpu support for compose

* add gpu support for compose

* add gpu support for compose

* add gpu support for compose

* add gpu support for compose

* fix docker file

* fix int test

* correct gpu docker

* upgrade python 10

* fix logits to be gpu compatible

* default to cpu dockerfile

* resolve comments

* fix lint + unused parameters

* fix

* remove eetq install for cpu

* remove deploy gpu

---------

Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-09-23 12:07:31 -07:00
Salman Paracha
80c554ce1a
Docs branch - v1 of our tech docs (#69)
* added the first set of docs for our technical docs

* more documentation changes

* added support for prompt processing and updated life of a request

* updated docs to include getting help sections and updated life of a request

* committing local changes for getting started guide, sample applications, and full reference spec for prompt-config

* updated configuration reference, added sample app skeleton, updated favico

* fixed the configuration reference file, and made minor changes to the intent detection. commit v1 for now

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-09-20 17:08:42 -07:00
Adil Hafeez
233976a568 comment required param check 2024-09-20 15:49:49 -07:00
Adil Hafeez
31f26ef7ac
move demo functions out of model_server (#67)
* pending

* remove

* fix docker build
2024-09-20 14:38:10 -07:00
Adil Hafeez
ca5c9e4824 improve logging 2024-09-20 10:16:58 -07:00
Sampreeth Sarma
941869ad24
fix similarity bug (#63) 2024-09-20 09:49:28 -07:00
Adil Hafeez
97b47c2ab4
Include param default in parameters (#68)
* Include param default in parameters

* improve default param pr

* fix integration tests
2024-09-20 09:02:24 -07:00