Commit graph

89 commits

Author SHA1 Message Date
co tran
1b39ee3dd8 fix bugs when no logprob for prefill and bug in function calling loop when it always no tool call 2025-04-01 20:18:56 +00:00
co tran
a3ab6be51c modify hallucination threshold and temperature 2025-04-01 17:36:54 +00:00
Adil Hafeez
f2323f771c
update response from upstream llm to now include it in dict with "response" 2025-03-31 18:42:46 -07:00
co tran
5bd991e97b fix precommit 2025-04-01 01:28:18 +00:00
co tran
b7916ce192 clean code + remove cmts 2025-04-01 01:19:29 +00:00
co tran
6e5cb5d485 fix test 2025-04-01 01:18:15 +00:00
co tran
a61a2a1d70 remove test until more evaluation on example 2025-04-01 00:12:03 +00:00
co tran
610843d98d remove test until more evaluation on example 2025-04-01 00:03:14 +00:00
co tran
7c6ddc9396 remove test until more evaluation on example 2025-03-31 23:41:11 +00:00
Shuguang Chen
6ec4c14407 Fix prompt prefilling 2025-03-31 15:08:38 -07:00
co tran
afe7cc9e9e fix bug and test 2025-03-31 21:50:05 +00:00
co tran
cc0845bce4 fix hallucination loop 2025-03-31 21:08:47 +00:00
Shuguang Chen
f035d166c8 Fix hallucination check 2025-03-28 16:30:03 -07:00
Shuguang Chen
425f9b0dd5 Update model usage 2025-03-28 15:10:51 -07:00
Adil Hafeez
8290d1969f
use public endpoint for arch v1.1 2025-03-28 12:38:44 -07:00
CTran
a3f2b3cef9
add hallucination modification (#455)
* add hallucination modification

* disable test
2025-03-28 09:49:20 -07:00
Adil Hafeez
b31a7a569a
update rest and other parts of the code to work with arch fc 1.1 2025-03-28 03:04:21 -07:00
Shuguang Chen
8335f0c3de minor update 2025-03-27 10:26:47 -07:00
Shuguang Chen
820c0443ee disable hallucination check 2025-03-24 17:07:07 -07:00
Shuguang Chen
cf30e94415 Init update 2025-03-24 16:53:10 -07:00
Adil Hafeez
9f59943041
update code to use 0.2.4 release (#446)
* update code to use 0.2.4 release

* update lock file
2025-03-21 16:08:59 -07:00
Adil Hafeez
84cd1df7bf
add preliminary support for llm agents (#432) 2025-03-19 15:21:34 -07:00
Adil Hafeez
d8b833fe69
release 0.2.3 (#423) 2025-03-04 14:30:44 -08:00
Shuguang Chen
e77fc47225
Handle intent matching better in arch gateway (#391) 2025-03-04 12:49:13 -08:00
Adil Hafeez
1bbc5d2233
release 0.2.2 (#413) 2025-02-14 20:02:59 -08:00
CTran
e7b370cd2f
fix error in function name + new thresholds (#406)
* fix error in function name + new thresholds

* fix

* fix

* remove example

* remove example
2025-02-14 09:57:39 -08:00
Adil Hafeez
4ec03af16e
use archfc hosted on aws (#409) 2025-02-13 11:03:34 -08:00
Adil Hafeez
0ea237fbac
release 0.2.1 (#399) 2025-02-07 19:21:20 -08:00
Adil Hafeez
8de6eacfbd
spotify demo with optimized context window code change (#397) 2025-02-07 19:14:15 -08:00
Adil Hafeez
7830f4b431
release 0.2.0 (#384)
* release 0.2.0

* update versions
2025-01-24 17:31:48 -08:00
Adil Hafeez
452084423c
add PR to release 0.1.9 (#371) 2025-01-17 18:47:26 -08:00
Shuguang Chen
88a02dc478
Some fixes on model server (#362)
* Some fixes on model server

* Remove prompt_prefilling message

* Fix logging

* Fix poetry issues

* Improve logging and update the support for text truncation

* Fix tests

* Fix tests

* Fix tests

* Fix modelserver tests

* Update modelserver tests
2025-01-10 16:45:36 -08:00
Shuguang Chen
ba7279becb
Use intent model from archfc to pick prompt gateway (#328) 2024-12-20 13:25:01 -08:00
Adil Hafeez
af0e7d178b
update cli to 0.1.6 (#338) 2024-12-06 15:48:07 -08:00
Adil Hafeez
a54db1a098
update getting started guide and add llm gateway and prompt gateway samples (#330) 2024-12-06 14:37:33 -08:00
CTran
cadd3cdaf9
hallucination with log probs (#281)
* first init

* fix

* fix test

* new implementation

* fix bug

* fix bug

* fix bug

* address issue

* address issues

* address comments

* fix test

* fix

* move constants

* remove consts
2024-11-27 15:17:02 -08:00
Adil Hafeez
704b928d61
release 0.1.5 (#307) 2024-11-26 13:28:52 -08:00
Adil Hafeez
0ff3d43008
remove dependency on docker-compose when starting up archgw (#305) 2024-11-26 13:13:02 -08:00
Adil Hafeez
9c6fcdb771
use fix prompt guards (#303) 2024-11-25 17:16:35 -08:00
Adil Hafeez
9cee04ed31
release 0.1.3 (#280)
* release 0.1.3

* update ver
2024-11-17 17:12:01 -08:00
Adil Hafeez
d1dd8710a4
release 0.1.2 (#266) 2024-11-12 23:56:33 -08:00
Adil Hafeez
d87105882b
update rust toolchain to 1.82 (#255)
* update rust to 1.82 pin it, also update envoy to 1.32 and python to 3.13

* use python:3.12
2024-11-12 10:35:14 -08:00
Adil Hafeez
a72bb804eb
add support for jaeger tracing (#229) 2024-11-07 22:11:00 -06:00
CTran
fb67788be0
add prefill and test (#236)
* add prefill and test

* fix stream

* fix

* feedback

* address comments

* update

* add e2e test

* fix e2e test

* update fix

* fix

* address cmt

* address cmt
2024-11-07 11:59:29 -08:00
Adil Hafeez
8c6ad87c1c
release 0.1.0 (#239)
* set version to 0.1.0

* update readme
2024-10-30 18:56:49 -07:00
Adil Hafeez
e462e393b1
Use large github action machine to run e2e tests (#230) 2024-10-30 17:54:51 -07:00
Adil Hafeez
60299244b9
Improve Gradio UI and fix arch_state bug (#227) 2024-10-29 11:27:13 -07:00
José Ulises Niño Rivera
662a840ac5
Add support for streaming and fixes few issues (see description) (#202) 2024-10-28 17:05:06 -07:00
CTran
25dddcbfd9
fix model server stop process (#217)
* fix model server stop process

* replace

* replace

* add test

* add multiple pids test

* add check install for linux

* reformat
2024-10-24 19:21:47 -07:00
Salman Paracha
708fa15a9b
HR agent demo (#206)
* committing my hr_agent branch

* updating the HR agent config

* pushing to remote

* fix hr agent

* committing to merge with main

* updating to merge from main

* updating the demo and model-server-tests to pull from poetry

* updating the poetry.lock files

* updating based on feedback

* updated system prompt for hr_agent

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
Co-authored-by: Adil Hafeez <adil@katanemo.com>
2024-10-23 14:32:40 -07:00