plano/crates/brightstaff
Adil Hafeez 42b7927122 feat: head+tail trim with ellipsis and 16-turn cap for routing prompt
Replaces the previous head-only truncation of oversized user messages
with a middle-trim (head + ellipsis + tail). Long pasted content such as
code dumps or specs often puts the task framing at the start of the
message and the actual ask at the end, and the middle-trim preserves
both. The Unicode ellipsis also signals to the router model that content
was dropped, which can improve classification accuracy on truncated
prompts.
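
A minimal sketch of what such a middle-trim helper could look like, for
illustration only. The function name follows the test name listed under
Tests; the signature, the byte-based budget, giving the head the 60% share
of the 60/40 split, and the head-only fallback when the budget cannot fit
the marker are all assumptions, not the actual brightstaff implementation.

```rust
/// Hypothetical helper: returns at most `max_bytes` bytes of `s`, keeping the
/// head and tail and replacing the dropped middle with a Unicode ellipsis.
fn trim_middle_utf8(s: &str, max_bytes: usize) -> String {
    const MARKER: &str = "…"; // U+2026, three bytes in UTF-8

    // No-op path: the message already fits.
    if s.len() <= max_bytes {
        return s.to_string();
    }

    // Assumed fallback: the budget cannot hold the marker plus any content,
    // so fall back to head-only truncation on a char boundary.
    if max_bytes <= MARKER.len() {
        let mut end = max_bytes;
        while !s.is_char_boundary(end) {
            end -= 1;
        }
        return s[..end].to_string();
    }

    // Assumed split: 60% of the remaining budget to the head, 40% to the tail.
    let budget = max_bytes - MARKER.len();
    let mut head_end = budget * 60 / 100;
    let tail_len = budget - head_end;
    let mut tail_start = s.len() - tail_len;

    // Move both cut points onto UTF-8 char boundaries, shrinking rather than
    // growing the kept text so the result never exceeds `max_bytes`.
    while !s.is_char_boundary(head_end) {
        head_end -= 1;
    }
    while !s.is_char_boundary(tail_start) {
        tail_start += 1;
    }

    format!("{}{}{}", &s[..head_end], MARKER, &s[tail_start..])
}
```

Under these assumptions, `trim_middle_utf8("abcdefghijklmnopqrstuvwxyz", 13)`
yields `abcdef…wxyz`: six head bytes, the three-byte marker, and four tail
bytes.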

Also adds an outer guardrail: only the last `MAX_ROUTING_TURNS` (16)
filtered messages are considered when building the routing prompt. This
bounds prompt growth for long conversations before the token-budget
loop runs, matching the approach HuggingFace chat-ui takes in its
arch-router client.
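
The turn cap can be sketched roughly as follows. `MAX_ROUTING_TURNS` and
its value come from this commit message; the function name `last_turns`
and the generic slice signature are hypothetical and only illustrate the
idea of keeping the most recent messages before any token budgeting.

```rust
/// Value stated in the commit message.
const MAX_ROUTING_TURNS: usize = 16;

/// Hypothetical helper: returns the last `max_turns` items of `messages`,
/// or the whole slice when it is already short enough.
fn last_turns<T>(messages: &[T], max_turns: usize) -> &[T] {
    // saturating_sub avoids underflow when there are fewer messages than the cap.
    let start = messages.len().saturating_sub(max_turns);
    &messages[start..]
}
```

With a 32-item history, `last_turns(&history, MAX_ROUTING_TURNS)` keeps
items 16 through 31, which mirrors the scenario described for
test_turn_cap_limits_routing_history.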

Tests:
- test_huge_single_user_message_is_middle_trimmed: regression test for
  the 500KB user message scenario. Verifies the prompt stays bounded,
  head + tail markers both survive, and the ellipsis is present.
- test_turn_cap_limits_routing_history: builds a 32-turn conversation
  and verifies only the last 16 make it into the prompt.
- test_trim_middle_utf8_helper: unit test for the helper covering the
  no-op path, the 60/40 split, the too-small-for-marker fallback, and
  UTF-8 boundary safety for multi-byte characters.
- Updated test_conversation_trim_upto_user_message to reflect the new
  middle-trim behavior.
2026-04-17 19:18:30 -07:00
..
src feat: head+tail trim with ellipsis and 16-turn cap for routing prompt 2026-04-17 19:18:30 -07:00
Cargo.toml Redis-backed session cache for cross-replica model affinity (#879) 2026-04-13 19:30:47 -07:00