plano

mirror of https://github.com/katanemo/plano.git synced 2026-06-17 15:25:17 +02:00

Author	SHA1	Message	Date
aayushwhiz	f4e9624c03	update integration tests to expect new stats and new request for time	2024-11-08 18:09:37 -08:00
aayushwhiz	1f9d5860b5	fix after merge	2024-11-08 18:09:37 -08:00
aayushwhiz	cb8e2a772b	update stats to output input_sequence_length Histogram. Changes the enforce_ratelimit function to get the token count regardless of whether there is a ratelimit, allowing the metric to be saved. This is essentially the token count of what is sent to OpenAI, not the tokens sent by the user, so rather than usage statistics it is more relevant to price or cost. Not yet sure if this is the best way to go, but I'll use it for now.	2024-11-08 18:09:37 -08:00
aayushwhiz	8fb5c4eceb	Add in Latency and output_sequence_length. Added latency histogram and output sequence length histogram to the wasm metrics. Updated stream context so that when the end_stream is received, it stores the time since the request was sent as well as the total number of tokens up to that point.	2024-11-08 18:09:37 -08:00
aayushwhiz	840b6a0e3e	fix bug with checking for token count of zero. Changed the check to verify that the token count is greater than 0, changed the debug message to say tokens, and divided time by the number of tokens received during that time so it is actually per token	2024-11-08 18:09:37 -08:00
aayushwhiz	bf39fecd6d	add in tpot stat. Set up check for first token as well as time per token after that	2024-11-08 18:09:37 -08:00
aayushwhiz	5543aa543f	add in time to first token stat. Changes stats to implement debug for histogram, updates filter_context to open ttft to stats endpoint, and updates stream_context to get time between both of those.	2024-11-08 18:09:37 -08:00
Adil Hafeez	9081eb0f7f	obfuscate auth header (#254)	2024-11-08 15:17:39 -06:00
Adil Hafeez	a72bb804eb	add support for jaeger tracing (#229)	2024-11-07 22:11:00 -06:00
José Ulises Niño Rivera	662a840ac5	Add support for streaming and fixes few issues (see description) (#202)	2024-10-28 17:05:06 -07:00
Adil Hafeez	1719b7d5f8	Send back developer error correctly (#195)	2024-10-18 13:14:18 -07:00
Adil Hafeez	c6ba28dfcc	Code refactor and some improvements - see description (#194)	2024-10-18 12:53:44 -07:00
José Ulises Niño Rivera	aa30353c85	Add cargo workspace to allow rust-analyzer to work correctly (#197) Signed-off-by: José Ulises Niño Rivera <junr03@users.noreply.github.com>	2024-10-18 15:44:52 -04:00
Adil Hafeez	21e7fe2cef	Split arch wasm filter code into prompt and llm gateway filters (#190)	2024-10-17 10:16:40 -07:00
Adil Hafeez	3bd2ffe9fb	split wasm filter (#186) * split wasm filter * fix int and unit tests * rename public_types => common and move common code there * rename * fix int test	2024-10-16 14:20:26 -07:00

15 commits