plano

mirror of https://github.com/katanemo/plano.git synced 2026-06-17 15:25:17 +02:00

Author	SHA1	Message	Date
aayushwhiz	cb8e2a772b	update stats to output input_sequence_length Histogram Changes the enforce_ratelimit function to get the token count regardless of whether there is a ratelimit or not, allowing the metric to be saved. This is essentially the token count of what is sent to openai, not the tokens sent by the user, so rather than info about usage statistics, it's more relevant to price or cost. Not yet sure if this is the best way to go, but I'll use it for now.	2024-11-08 18:09:37 -08:00
aayushwhiz	8fb5c4eceb	Add in Latency and output_sequence_length added latency histogram and output sequence length histogram to the wasm metrics. Updated stream context so that when the end_stream is received, it stores the time since the request was sent as well as the total number of tokens up to that point.	2024-11-08 18:09:37 -08:00
aayushwhiz	840b6a0e3e	fix bug with checking for token count of zero Changed check to verify that token count is greater than 0, changed debug message to say tokens, and divided time by number of tokens received during that time so it is actually per token	2024-11-08 18:09:37 -08:00
aayushwhiz	bf39fecd6d	add in tpot stat setup check for first token as well as time per token after that	2024-11-08 18:09:37 -08:00
aayushwhiz	5543aa543f	add in time to first token stat changes stats to implement debug for histogram, update filter_context to open ttft to stats endpoint and update stream_context to get time between both of those.	2024-11-08 18:09:37 -08:00
Salman Paracha	4b2b371876	removing dependency on mistral keys (#256 )	2024-11-08 16:09:04 -08:00
Adil Hafeez	9081eb0f7f	obfuscate auth header (#254 )	2024-11-08 15:17:39 -06:00
Adil Hafeez	88d0f99866	add requirements to readme (#249 )	2024-11-08 10:43:18 -08:00
Adil Hafeez	6b62662e01	update docs with weather_forecast path (#253 )	2024-11-08 10:00:15 -08:00
Adil Hafeez	a72bb804eb	add support for jaeger tracing (#229 )	2024-11-07 22:11:00 -06:00
CTran	fb67788be0	add prefill and test (#236 ) * add prefill and test * fix stream * fix * feedback * address comments * update * add e2e test * fix e2e test * update fix * fix * address cmt * address cmt	2024-11-07 11:59:29 -08:00
Ikko Eltociear Ashimine	f48489f7c0	chore: update stream_context.rs (#248 ) initalize -> initialize	2024-11-05 10:18:33 -08:00
Adil Hafeez	9a6ae2efee	retry embeddings fetch (#245 )	2024-11-05 10:04:36 -08:00
Adil Hafeez	9a5c5cc3a3	add http files for llm and prompt gateway for local testing (#244 )	2024-11-04 15:53:15 -08:00
Salman Paracha	e4d5293af4	updating README logo (#242 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2024-10-31 21:58:08 -07:00
Salman Paracha	21f4e7a5e4	fixing ports in arch_config for demos (#241 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2024-10-31 14:37:04 -07:00
Salman Paracha	3bff4a597b	fix ports and update README for paths to agent/chat (#240 ) * fix ports and update README for paths to agent/chat * minor fix --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2024-10-31 09:25:24 -07:00
Adil Hafeez	5196fb28a9	update build status badges	2024-10-30 23:27:01 -07:00
Adil Hafeez	9a30afa7e1	update status badges	2024-10-30 23:24:36 -07:00
Adil Hafeez	ad70e540b6	add status badges	2024-10-30 23:24:00 -07:00
Adil Hafeez	8c6ad87c1c	release 0.1.0 (#239 ) * set version to 0.1.0 * update readme	2024-10-30 18:56:49 -07:00
Salman Paracha	dab7a44053	several fixes to demos (#238 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2024-10-30 18:38:18 -07:00
Adil Hafeez	e462e393b1	Use large github action machine to run e2e tests (#230 )	2024-10-30 17:54:51 -07:00
Salman Paracha	bb882fb59b	Updated hr_agent to be full stack: gradio + fastAPI (#235 ) * committing to remove * fix * updating hr_agent --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local> Co-authored-by: Adil Hafeez <adil@katanemo.com>	2024-10-30 15:05:34 -07:00
Salman Paracha	bb9a774a72	moving chatbot-ui in demos and out of root project structure (#228 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2024-10-29 12:05:29 -07:00
Adil Hafeez	60299244b9	Improve Gradio UI and fix arch_state bug (#227 )	2024-10-29 11:27:13 -07:00
José Ulises Niño Rivera	662a840ac5	Add support for streaming and fixes few issues (see description) (#202 )	2024-10-28 17:05:06 -07:00
Salman Paracha	29ff8da60f	fixed typos in intro to arch docs (#225 )	2024-10-26 10:41:01 -07:00
CTran	25dddcbfd9	fix model server stop process (#217 ) * fix model server stop process * replace * replace * add test * add multiple pids test * add check install for linux * reformat	2024-10-24 19:21:47 -07:00
Salman Paracha	ff6e9bd9bd	add README for hr_agent (#224 ) * add README for hr_agent * fixed sample prompt for hr_agent in README * added screenshot and updated README.md --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2024-10-24 18:21:52 -07:00
Salman Paracha	f88740582f	fixed typos in arch_config.yaml file based on issue #221 (#223 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2024-10-24 14:57:29 -07:00
Shuguang Chen	5f3aff4922	Update chatbot UI and update hallucination check (#218 ) * update chatbot UI * Update docker-compose for demos * Fix bugs * fix for metadata (#219) * fix for metadata * fix * revert * merge main --------- Co-authored-by: CTran <cotran2@utexas.edu>	2024-10-24 14:11:53 -07:00
Azib Farooq	05f0491f76	updated key name (#211 )	2024-10-23 21:02:24 -07:00
Ikko Eltociear Ashimine	87ce0b1be0	docs: update README.md (#220 ) vist -> visit	2024-10-23 20:37:26 -07:00
Salman Paracha	7a5852b401	fixing discord link and moving contributing guide to root (#215 ) Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>	2024-10-23 15:45:49 -07:00
Salman Paracha	708fa15a9b	HR agent demo (#206 ) * committing my hr_agent branch * updating the HR agent config * pushing to remote * fix hr agent * committing to merge with main * updating to merge from main * updating the demo and model-server-tests to pull from poetry * updating the poetry.lock files * updating based on feedback * updated system prompt for hr_agent --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local> Co-authored-by: Adil Hafeez <adil@katanemo.com>	2024-10-23 14:32:40 -07:00
CTran	8495f89fda	Cotran/hallucination (#208 )	2024-10-22 12:52:01 -07:00
Adil Hafeez	ea76d85b43	Improve logging (#209 ) * improve logging * fix int tests * better * fix more logs * fix more * fix int	2024-10-22 12:07:40 -07:00
Adil Hafeez	2f374df034	refactor prompt gateway (#204 )	2024-10-21 15:04:15 -07:00
Adil Hafeez	dced8a5708	Add separate util for hallucination and add tests for it (#203 )	2024-10-18 19:34:17 -07:00
Adil Hafeez	faf64960df	update observability and dashboards (#198 )	2024-10-18 15:07:49 -07:00
Adil Hafeez	f189d5703b	update .dockerignore file after filter move	2024-10-18 14:44:39 -07:00
Adil Hafeez	dd1c7be706	Pass tool call and app function response back in metadata (#193 )	2024-10-18 13:25:39 -07:00
José Ulises Niño Rivera	62a000036e	Update arch Dockerfile (#200 )	2024-10-18 16:15:19 -04:00
Adil Hafeez	1719b7d5f8	Send back developer error correctly (#195 )	2024-10-18 13:14:18 -07:00
Adil Hafeez	28421353fd	Update vscode workspace (#199 ) - add recommended extensions - set python interpreter path for all python projects to be venv/bin/python - update project structure in workspace - rename project file from gatewa -> archgw	2024-10-18 12:57:58 -07:00
Adil Hafeez	c6ba28dfcc	Code refactor and some improvements - see description (#194 )	2024-10-18 12:53:44 -07:00
José Ulises Niño Rivera	aa30353c85	Add cargo workspace to allow rust-analyzer to work correctly (#197 ) Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>	2024-10-18 15:44:52 -04:00
Salman Paracha	6fb63510b3	fix cli models and logs (#196 ) * removing unnecessary setup.py files * updated the cli for debug and access logs * ran the pre-commit locally to fix pull request * fixed bug where if archgw_process is None we didn't handle it gracefully * Apply suggestions from code review Co-authored-by: Adil Hafeez <adil@katanemo.com> * fixed changes based on PR * fixed version not found message * fixed message based on PR feedback * adding poetry lock * fixed pre-commit --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local> Co-authored-by: Adil Hafeez <adil@katanemo.com>	2024-10-18 12:09:45 -07:00
Adil Hafeez	6cd05572c4	update lock file (#192 ) ``` Installing dependencies from lock file pyproject.toml changed significantly since poetry.lock was last generated. Run `poetry lock [--no-update]` to fix the lock file. Error installing model server dependencies: Command '['poetry', 'install', '--no-cache']' returned non-zero exit status 1. ```	2024-10-17 10:42:15 -07:00

259 commits