mirror of
https://github.com/katanemo/plano.git
synced 2026-04-25 00:36:34 +02:00
Replaces the previous head-only truncation of oversized user messages with a middle-trim (head + ellipsis + tail) that preserves both the task framing (start of message) and the actual ask (end of message), a common shape for long pasted content like code dumps or specs. The unicode ellipsis also signals to the router model that content was dropped, which can improve classification accuracy on truncated prompts.

Also adds an outer guardrail: only the last `MAX_ROUTING_TURNS` (16) filtered messages are considered when building the routing prompt. This bounds prompt growth for long conversations before the token-budget loop runs, matching the approach HuggingFace chat-ui takes in its arch-router client.

Tests:
- test_huge_single_user_message_is_middle_trimmed: regression test for the 500KB user message scenario. Verifies the prompt stays bounded, head + tail markers both survive, and the ellipsis is present.
- test_turn_cap_limits_routing_history: builds a 32-turn conversation and verifies only the last 16 make it into the prompt.
- test_trim_middle_utf8_helper: unit test for the helper covering the no-op path, the 60/40 split, the too-small-for-marker fallback, and UTF-8 boundary safety for multi-byte characters.
- Updated test_conversation_trim_upto_user_message to reflect the new middle-trim behavior.

| | | |
|---|---|---|
| .vscode | ||
| brightstaff | ||
| common | ||
| hermesllm | ||
| llm_gateway | ||
| prompt_gateway | ||
| build.sh | ||
| Cargo.lock | ||
| Cargo.toml | ||
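
The middle-trim and turn cap described in the commit message could look roughly like the sketch below. It is not the code from this repository: the helper name `trim_middle_utf8` is inferred from the test name, while its signature, the byte-based budget, the `floor_boundary` / `ceil_boundary` / `last_turns` helpers, and the head-only fallback when the budget cannot fit the marker are all assumptions made for illustration. Only the 60/40 split, the unicode ellipsis, and `MAX_ROUTING_TURNS = 16` come from the message itself.

```rust
/// Marker inserted where content was dropped (3 bytes in UTF-8).
const ELLIPSIS: &str = "…";
/// From the commit message: only the last 16 filtered messages are kept.
const MAX_ROUTING_TURNS: usize = 16;

/// Largest index <= `i` that lies on a UTF-8 character boundary of `s`.
fn floor_boundary(s: &str, mut i: usize) -> usize {
    if i >= s.len() {
        return s.len();
    }
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Smallest index >= `i` that lies on a UTF-8 character boundary of `s`.
fn ceil_boundary(s: &str, mut i: usize) -> usize {
    if i >= s.len() {
        return s.len();
    }
    // `s.len()` is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i += 1;
    }
    i
}

/// Hypothetical signature: trim `s` to at most `max_bytes` bytes,
/// keeping the head and the tail and marking the gap with an ellipsis.
fn trim_middle_utf8(s: &str, max_bytes: usize) -> String {
    // No-op path: the message already fits.
    if s.len() <= max_bytes {
        return s.to_string();
    }
    // Too small for the marker: assumed fallback is head-only truncation.
    if max_bytes <= ELLIPSIS.len() {
        return s[..floor_boundary(s, max_bytes)].to_string();
    }
    let budget = max_bytes - ELLIPSIS.len();
    let head_len = budget * 6 / 10; // 60% of the budget for the head
    let tail_len = budget - head_len; // remaining 40% for the tail
    // Snap both cut points inward so no multi-byte character is split.
    let head_end = floor_boundary(s, head_len);
    let tail_start = ceil_boundary(s, s.len() - tail_len);
    format!("{}{}{}", &s[..head_end], ELLIPSIS, &s[tail_start..])
}

/// Outer guardrail: keep only the last `MAX_ROUTING_TURNS` messages.
fn last_turns<T>(messages: &[T]) -> &[T] {
    let start = messages.len().saturating_sub(MAX_ROUTING_TURNS);
    &messages[start..]
}
```

Under these assumptions, `trim_middle_utf8("aaaaaaaaaabbbbbbbbbb", 13)` returns `aaaaaa…bbbb` (6 head bytes, the 3-byte marker, 4 tail bytes). Because both cut points snap inward to character boundaries, the result can come out slightly shorter than the budget but never longer.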