plano

mirror of https://github.com/katanemo/plano.git synced 2026-05-27 14:17:15 +02:00

Adil Hafeez e5f3039924 make tiktoken token counting optional via enable_token_counting override	2026-03-22 21:53:47 -07:00

    By default, use cheap len/4 estimate for input token counting (metrics and ratelimit). When enable_token_counting is set to true in overrides, use tiktoken BPE for exact counts. This eliminates ~80ms of per-request latency from tiktoken in the WASM filter while keeping metrics and ratelimit functional.

    Made-with: Cursor
src	make tiktoken token counting optional via enable_token_counting override	2026-03-22 21:53:47 -07:00
Cargo.toml	Add support for Amazon Bedrock Converse and ConverseStream (#588)	2025-10-22 11:31:21 -07:00
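
The latest commit message describes two counting modes: a cheap len/4 estimate by default, and exact tiktoken BPE counts when enable_token_counting is set to true in overrides. A minimal Rust sketch of that switch might look like the following. It is not the code from src: the `Overrides` struct, the function names, and the choice of byte length with truncating division are all assumptions made for illustration, and the exact counter is passed in as a closure so the sketch does not depend on any particular tokenizer crate.

```rust
/// Hypothetical stand-in for the overrides config; only the flag named in
/// the commit message is modeled here.
#[derive(Default)]
pub struct Overrides {
    pub enable_token_counting: bool,
}

/// Cheap estimate: about one token per four bytes of input.
/// Assumption: "len" means UTF-8 byte length and the division truncates.
pub fn estimate_tokens(text: &str) -> usize {
    text.len() / 4
}

/// Returns the input token count used for metrics and rate limiting.
/// `exact` is a placeholder for a real BPE tokenizer call and is only
/// invoked when the override is enabled, so the default path never pays
/// the tokenizer's cost.
pub fn count_input_tokens<F>(text: &str, overrides: &Overrides, exact: F) -> usize
where
    F: Fn(&str) -> usize,
{
    if overrides.enable_token_counting {
        exact(text)
    } else {
        estimate_tokens(text)
    }
}
```

With the flag left at its default, `count_input_tokens("abcdefgh", &Overrides::default(), |_| 0)` takes the estimate path and returns 2, since the input is eight bytes long. The trade-off is accuracy: the estimate can drift noticeably from the true BPE count, especially for non-English text or code, which is presumably why the exact path remains available as an opt-in.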