plano

mirror of https://github.com/katanemo/plano.git synced 2026-05-27 14:17:15 +02:00

Adil Hafeez e5f3039924 make tiktoken token counting optional via enable_token_counting override	2026-03-22 21:53:47 -07:00

By default, use cheap len/4 estimate for input token counting (metrics and ratelimit). When enable_token_counting is set to true in overrides, use tiktoken BPE for exact counts. This eliminates ~80ms of per-request latency from tiktoken in the WASM filter while keeping metrics and ratelimit functional.

Made-with: Cursor
.vscode	use standard tracing and logging in brightstaff (#721)	2026-02-09 13:33:27 -08:00
brightstaff	the orchestrator had a bug where it was setting the wrong headers for archfc.katanemo.dev (#839)	2026-03-20 00:40:47 -07:00
common	make tiktoken token counting optional via enable_token_counting override	2026-03-22 21:53:47 -07:00
hermesllm	adding new supported models to plano (#829)	2026-03-15 12:37:20 -07:00
llm_gateway	make tiktoken token counting optional via enable_token_counting override	2026-03-22 21:53:47 -07:00
prompt_gateway	Rename all arch references to plano (#745)	2026-02-13 15:16:56 -08:00
build.sh	Use mcp tools for filter chain (#621)	2025-12-17 17:30:14 -08:00
Cargo.lock	[ISSUE 706]: Standardize returned errors from Plano (#772)	2026-02-24 14:34:33 -08:00
Cargo.toml	use standard tracing and logging in brightstaff (#721)	2026-02-09 13:33:27 -08:00
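
The latest commit describes choosing between a cheap len/4 estimate and exact tiktoken BPE counting based on the enable_token_counting override. A minimal sketch of that selection logic might look like the following. Only the name enable_token_counting comes from the commit message; the `Overrides` struct, the function names, the use of byte length with integer division, and the closure standing in for the tiktoken counter are all assumptions made for illustration, not the code in common or llm_gateway.

```rust
/// Hypothetical overrides struct; only the `enable_token_counting`
/// name is taken from the commit message.
#[derive(Debug, Default, Clone, Copy)]
pub struct Overrides {
    /// When false (the default), the cheap estimate is used.
    pub enable_token_counting: bool,
}

/// Cheap estimate: byte length divided by four (integer division).
/// Using bytes rather than characters is an assumption of this sketch.
pub fn estimate_tokens(text: &str) -> usize {
    text.len() / 4
}

/// Picks the counting strategy for one request body.
/// `exact` stands in for a real BPE tokenizer (for example tiktoken);
/// it is only invoked when the override is enabled, so the default
/// path never pays the tokenizer cost.
pub fn count_input_tokens<F>(text: &str, overrides: Overrides, exact: F) -> usize
where
    F: Fn(&str) -> usize,
{
    if overrides.enable_token_counting {
        exact(text)
    } else {
        estimate_tokens(text)
    }
}
```

Passing the exact counter as a closure keeps the sketch free of external crates and makes the default path easy to check: with `Overrides::default()` the closure is never called, which is the property the commit relies on to avoid the per-request tokenizer latency.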