plano/crates/llm_gateway
Adil Hafeez e5f3039924 make tiktoken token counting optional via enable_token_counting override
By default, use a cheap len/4 estimate for input token counting (used by
metrics and rate limiting). When enable_token_counting is set to true in
overrides, use tiktoken BPE for exact counts. On the default path this
removes ~80ms of per-request latency from tiktoken in the WASM filter
while keeping metrics and rate limiting functional.
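
A minimal sketch of the selection logic described above, not the actual code in src. The `Overrides` struct, the function names, and the `exact` closure (standing in for a tiktoken BPE call) are hypothetical, and the sketch assumes len means byte length with integer division:

```rust
/// Hypothetical overrides struct; only the `enable_token_counting`
/// name comes from the commit message.
#[derive(Default)]
pub struct Overrides {
    pub enable_token_counting: Option<bool>,
}

/// Cheap estimate: roughly one token per four bytes of input
/// (assumption: byte length, rounded down).
pub fn estimate_tokens(text: &str) -> usize {
    text.len() / 4
}

/// Count input tokens for metrics and rate limiting.
/// `exact` is a placeholder for an exact BPE tokenizer and is only
/// invoked when the override is explicitly set to true.
pub fn count_input_tokens<F>(text: &str, overrides: &Overrides, exact: F) -> usize
where
    F: FnOnce(&str) -> usize,
{
    if overrides.enable_token_counting.unwrap_or(false) {
        exact(text)
    } else {
        estimate_tokens(text)
    }
}
```

Passing the exact counter as a closure keeps the expensive tokenizer off the default path entirely, which is where the latency saving would come from.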

Made-with: Cursor
2026-03-22 21:53:47 -07:00
src make tiktoken token counting optional via enable_token_counting override 2026-03-22 21:53:47 -07:00
Cargo.toml Add support for Amazon Bedrock Converse and ConverseStream (#588) 2025-10-22 11:31:21 -07:00