Commit graph

59 commits

Author SHA1 Message Date
Adil Hafeez
452084423c
add PR to release 0.1.9 (#371) 2025-01-17 18:47:26 -08:00
Shuguang Chen
88a02dc478
Some fixes on model server (#362)
* Some fixes on model server

* Remove prompt_prefilling message

* Fix logging

* Fix poetry issues

* Improve logging and update the support for text truncation

* Fix tests

* Fix tests

* Fix tests

* Fix modelserver tests

* Update modelserver tests
2025-01-10 16:45:36 -08:00
Shuguang Chen
ba7279becb
Use intent model from archfc to pick prompt gateway (#328) 2024-12-20 13:25:01 -08:00
Adil Hafeez
af0e7d178b
update cli to 0.1.6 (#338) 2024-12-06 15:48:07 -08:00
Adil Hafeez
a54db1a098
update getting started guide and add llm gateway and prompt gateway samples (#330) 2024-12-06 14:37:33 -08:00
CTran
cadd3cdaf9
hallucination with log probs (#281)
* first init

* fix

* fix test

* new implementation

* fix bug

* fix bug

* fix bug

* address issue

* address issues

* address comments

* fix test

* fix

* move constants

* remove consts
2024-11-27 15:17:02 -08:00
Adil Hafeez
704b928d61
release 0.1.5 (#307) 2024-11-26 13:28:52 -08:00
Adil Hafeez
0ff3d43008
remove dependency on docker-compose when starting up archgw (#305) 2024-11-26 13:13:02 -08:00
Adil Hafeez
9c6fcdb771
use fix prompt guards (#303) 2024-11-25 17:16:35 -08:00
Adil Hafeez
9cee04ed31
release 0.1.3 (#280)
* release 0.1.3

* update ver
2024-11-17 17:12:01 -08:00
Adil Hafeez
d1dd8710a4
release 0.1.2 (#266) 2024-11-12 23:56:33 -08:00
Adil Hafeez
d87105882b
update rust toolchain to 1.82 (#255)
* update rust to 1.82 pin it, also update envoy to 1.32 and python to 3.13

* use python:3.12
2024-11-12 10:35:14 -08:00
Adil Hafeez
a72bb804eb
add support for jaeger tracing (#229) 2024-11-07 22:11:00 -06:00
CTran
fb67788be0
add prefill and test (#236)
* add prefill and test

* fix stream

* fix

* feedback

* address comments

* update

* add e2e test

* fix e2e test

* update fix

* fix

* address cmt

* address cmt
2024-11-07 11:59:29 -08:00
Adil Hafeez
8c6ad87c1c
release 0.1.0 (#239)
* set version to 0.1.0

* update readme
2024-10-30 18:56:49 -07:00
Adil Hafeez
e462e393b1
Use large github action machine to run e2e tests (#230) 2024-10-30 17:54:51 -07:00
Adil Hafeez
60299244b9
Improve Gradio UI and fix arch_state bug (#227) 2024-10-29 11:27:13 -07:00
José Ulises Niño Rivera
662a840ac5
Add support for streaming and fixes few issues (see description) (#202) 2024-10-28 17:05:06 -07:00
CTran
25dddcbfd9
fix model server stop process (#217)
* fix model server stop process

* replace

* replace

* add test

* add multiple pids test

* add check install for linux

* reformat
2024-10-24 19:21:47 -07:00
Salman Paracha
708fa15a9b
HR agent demo (#206)
* committing my hr_agent branch

* updating the HR agent config

* pushing to remote

* fix hr agent

* committing to merge with main

* updating to merge from main

* updating the demo and model-server-tests to pull from poetry

* updating the poetry.lock files

* updating based on feedback

* updated system prompt for hr_agent

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-10-23 14:32:40 -07:00
Adil Hafeez
dd1c7be706
Pass tool call and app function response back in metadata (#193) 2024-10-18 13:25:39 -07:00
Adil Hafeez
28421353fd
Update vscode workspace (#199)
- add recommended extensions
- set python interpreter path for all python projects to be venv/bin/python
- update project structure in workspace
- rename project file from gatewa -> archgw
2024-10-18 12:57:58 -07:00
Adil Hafeez
6cd05572c4
update lock file (#192)
```
Installing dependencies from lock file

pyproject.toml changed significantly since poetry.lock was last generated. Run `poetry lock [--no-update]` to fix the lock file.
Error installing model server dependencies: Command '['poetry', 'install', '--no-cache']' returned non-zero exit status 1.
```
2024-10-17 10:42:15 -07:00
CTran
8e54ac20d8
Refactor model server hardware config + add unit tests to load/request to the server (#189)
* remove mode/hardware

* add test and pre commit hook

* add pytest dependencies

* fix format

* fix lint

* fix precommit

* fix pre commit

* fix pre commit

* fix precommit

* fix precommit

* fix precommit

* fix precommit

* fix precommit

* fix precommit

* fix precommit

* fix precommit

* fix precommit

* fix precommit
2024-10-16 16:58:10 -07:00
Co Tran
b1746b38b4
concatenate history of user messages for hallucination (#177)
* concatenate history of user messages for hallucination

* add history of messages

* fix gpt to not arch

* add model prefix

* fix

* correct init of user_messages

* fmt

* fix test
2024-10-15 11:43:05 -07:00
Adil Hafeez
7d5f760884
Improve cli (#179) 2024-10-10 17:44:41 -07:00
Salman Paracha
95a0f1be5b
updated archgw cli to pull from archgw_modelserver from pypi (#169)
* updated archgw cli to pull from archgw_modelserver from pypi

* fix image name

* update rev

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-10-09 21:00:26 -07:00
Co Tran
f9e3a052fc
change nli model (#167)
* change nli model

* Fix bug in hallucination

---------

Co-authored-by: Shuguang Chen <54548843+nehcgs@users.noreply.github.com>
2024-10-09 19:10:08 -07:00
Shuguang Chen
3b7c58698f
Update model_server (#164)
* Update model server

* Delete model_server/.vscode/settings.json

* Update loader.py

* Fix errors

* Update log mode
2024-10-09 18:04:52 -07:00
Salman Paracha
1acf43ff7a
fixed cli to use poetry as well. this way we make it easy to have the… (#160) 2024-10-09 15:53:12 -07:00
Co Tran
8b5db45507
Fix gpu dependency and only leverage onnx when GPU is available (#157)
* replacing appending instead of write

* fix eetq dependency

* gpu guard required eetq

* fix bug when gpu is available

* fix for gpu device

* reverse

* fix

* replace gpu -> cuda
2024-10-09 11:42:05 -07:00
Co Tran
5c4a6bc8ff
lint + formatting with black (#158)
* lint + formatting with black

* add black as pre commit
2024-10-09 11:25:07 -07:00
Salman Paracha
b63a01fe82
Salmanap/fix network agent demo (#153)
* staging my changes to re-based from main

* adding debug statements to rust

* merged with main

* ready to push network agent

* removed the incomplete sql example

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-08 22:19:20 -07:00
Adil Hafeez
47c9c0aafc
fix lock file (#151) 2024-10-08 18:16:00 -07:00
Co Tran
e62c6e75ea
fix dependency + log info (#148) 2024-10-08 16:42:40 -07:00
Co Tran
80d2229053
Cotran/onnx conversion (#145)
* onnx replacement

* onnx conversion for nli and embedding model

* fix naming

* fix naming

* fix naming

* pin version
2024-10-08 14:37:48 -07:00
Salman Paracha
3ed50e61d2
ensure that we can call the new api.fc.archgw.com url, logging fixes … (#142)
* ensure that we can call the new api.fc.archgw.com url, logging fixes and minor cli bug fixes

* fixed a bug where model_server printed on terminal after start script stopped running

* updating the logo and fixing the website styles

* updated the branch with feedback from Co and Adil

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-08 12:40:24 -07:00
Co Tran
b1fa127704
Hallucination integration with rust (#122) 2024-10-07 18:38:55 -07:00
Co Tran
93abe553e3
formatting and monitoring change (#136) 2024-10-07 15:21:05 -07:00
Adil Hafeez
96686dc606
Serialize tool calls for Arch FC (#131)
* Serialize tool calls

* fix int tests
2024-10-07 00:03:25 -07:00
Salman Paracha
b60ceb9168
model server build (#127)
* first commit to have model_server not be dependent on Docker

* making changes to fix the docker-compose file for archgw to set DNS_V4 and minor fixes with the build

* additional fixes for model server to be separated out in the build

* additional fixes for model server to be separated out in the build

* fix to get model_server to be built as a separate python process. TODO: fix the embeddings logs after cli completes

* fixing init to pull tempfile using the tempfile python package

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-06 18:21:43 -07:00
Adil Hafeez
2a747df7c0
don't compute embeddings for names and other fixes see description (#126)
* serialize tools - 2

* fix int tests

* fix int test

* fix unit tests
2024-10-05 19:25:16 -07:00
Salman Paracha
701187474f
load_models checks for device before getting the BGE or NLI model loaded in memory. Was defaulting to CPU. And removed gunk for load_sql (#119)
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-04 13:09:35 -07:00
Co Tran
7d38ef9719
Cotran/hallucination integration (#115)
* fix fc integration

* fix integration

* remove file

* Update arch_fc.py

* create model server hallucination detection class
2024-10-04 11:05:25 -07:00
Salman Paracha
dc57f119a0
archgw cli (#117)
* initial commit of the insurance agent demo, with the CLI tool

* committing the cli

* fixed some field descriptions for generate-prompt-targets

* CLI works with build, up and down commands. Function calling example works stand-alone

* fixed README to install archgw cli

* fixing based on feedback

* fixing based on feedback

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-03 18:21:27 -07:00
Adil Hafeez
1b57a49c9d
add support for default target (#111)
* add support for default target

* add more fixes
2024-10-02 20:43:16 -07:00
Co Tran
ed50d29ccf
fix fc integration (#110)
* fix fc integration

* fix integration

* remove file

* Update arch_fc.py
2024-10-01 19:20:28 -07:00
Co Tran
17a643c410
ArchFC endpoint integration (#94)
* integration

* modify docker file

* add params and fix python lint

* fix empty context and tool calls

* address comments

* revert port

* fix bug merge

* fix environment

* fix bug

* fix compose

* fix merge
2024-10-01 12:47:26 -07:00
Salman Paracha
8654d3d5c5
simplify developer getting started experience (#102)
* Fixed build. Now, we have a bare bones version of the docker-compose file with only two services, archgw and archgw-model-server. Tested using CLI

* some pre-commit fixes

* fixed cargo formatting issues

* fixed model server conflict changes

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
2024-10-01 10:02:23 -07:00
Adil Hafeez
f4395d39f9
Fold function_resolver into model_server (#103) 2024-10-01 09:13:50 -07:00