plano/crates/brightstaff
Adil Hafeez 321c28da37 fix: truncate oversized user messages in orchestrator routing prompt
The orchestrator trimmer had a bypass that kept the latest user message
whole even when it alone exceeded the configured token budget. This
caused brightstaff to send a ~500KB prompt to the Plano-Orchestrator
model, which rejected it with a 400 "context length exceeded" error
because the prompt overflowed the upstream 32K-token window. Brightstaff
then surfaced a confusing "missing field id" parse error instead of the
real upstream message.

Fix the bypass by trimming the overflowing user message from the end
toward the beginning until it fits in the remaining token budget. The
beginning of the message (where user intent usually lives) is preserved
and the tail is dropped. Added a UTF-8-safe byte-truncation helper and a
regression test that mirrors the production payload (a single ~500KB
user message with a small budget).
2026-04-17 18:00:02 -07:00
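
The truncation described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: the function names `truncate_to_byte_budget` and `trim_to_token_budget`, and the `bytes_per_token` estimate, are hypothetical and chosen only for illustration. The sketch keeps the beginning of a message, drops the tail, and never cuts through a multi-byte UTF-8 character.

```rust
/// Hypothetical helper (not the one from the commit): returns the longest
/// prefix of `s` that is at most `max_bytes` bytes long and ends on a
/// UTF-8 character boundary.
fn truncate_to_byte_budget(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Step back until `end` is a char boundary. Index 0 is always a
    // boundary, so this loop terminates without underflow.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

/// Hypothetical wrapper: converts a remaining token budget into a byte
/// budget using an assumed average `bytes_per_token`, then keeps the head
/// of the message and drops the tail.
fn trim_to_token_budget(msg: &str, remaining_tokens: usize, bytes_per_token: usize) -> &str {
    let max_bytes = remaining_tokens.saturating_mul(bytes_per_token);
    truncate_to_byte_budget(msg, max_bytes)
}
```

A byte-based estimate is only an approximation of a real tokenizer's count, so a caller using this approach would likely want a conservative `bytes_per_token` value to stay under the model's window.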
..
src fix: truncate oversized user messages in orchestrator routing prompt 2026-04-17 18:00:02 -07:00
Cargo.toml Redis-backed session cache for cross-replica model affinity (#879) 2026-04-13 19:30:47 -07:00