plano/crates/llm_gateway
aayushwhiz cb8e2a772b update stats to output input_sequence_length Histogram
Changes the enforce_ratelimit function to get the token count whether or
not there is a ratelimit, so the metric can always be saved. This is
essentially the token count of what is sent to OpenAI, which is not the
same as the tokens sent by the user, so it is more relevant to price or
cost than to usage statistics. Not yet sure if this is the best way to
go, but I'll use it for now.
2024-11-08 18:09:37 -08:00
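
The change described in the commit message above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the crate's actual code: only the names `enforce_ratelimit` and `input_sequence_length` come from the commit. The `Histogram` type, the `count_tokens` helper, the whitespace-based token count, and the function signature are all hypothetical stand-ins.

```rust
// Hypothetical sketch: everything except the names `enforce_ratelimit` and
// `input_sequence_length` is an assumption made for illustration.

/// Minimal stand-in for a metrics histogram: it only stores observations.
#[derive(Default)]
pub struct Histogram {
    samples: Vec<u64>,
}

impl Histogram {
    /// Record one observation.
    pub fn record(&mut self, value: u64) {
        self.samples.push(value);
    }

    /// Number of observations recorded so far.
    pub fn count(&self) -> usize {
        self.samples.len()
    }

    /// Sum of all recorded observations.
    pub fn sum(&self) -> u64 {
        self.samples.iter().sum()
    }
}

/// Placeholder tokenizer (assumption): counts whitespace-separated words.
/// A real gateway would use a model-specific tokenizer instead.
pub fn count_tokens(request_body: &str) -> u64 {
    request_body.split_whitespace().count() as u64
}

/// Counts tokens for every request and records them in the histogram,
/// then applies the limit only if one is configured.
pub fn enforce_ratelimit(
    request_body: &str,
    limit: Option<u64>,
    input_sequence_length: &mut Histogram,
) -> Result<u64, String> {
    // The count is taken up front, regardless of whether a limit exists,
    // so the metric is recorded on every request.
    let tokens = count_tokens(request_body);
    input_sequence_length.record(tokens);

    match limit {
        Some(max) if tokens > max => Err(format!(
            "rate limit exceeded: {} tokens > limit of {}",
            tokens, max
        )),
        _ => Ok(tokens),
    }
}
```

In this sketch the histogram receives an observation even when `limit` is `None` or when the request is rejected, which matches the commit's stated intent of saving the metric in all cases. Because the count is taken over the outgoing request body, it reflects what is sent upstream rather than what the user typed.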
src update stats to output input_sequence_length Histogram 2024-11-08 18:09:37 -08:00
tests Add support for streaming and fixes few issues (see description) (#202) 2024-10-28 17:05:06 -07:00
Cargo.lock Code refactor and some improvements - see description (#194) 2024-10-18 12:53:44 -07:00
Cargo.toml split wasm filter (#186) 2024-10-16 14:20:26 -07:00