mirror of
https://github.com/katanemo/plano.git
synced 2026-05-11 16:52:41 +02:00
Replace per-chunk HTTP requests to output filters with a single bidirectional streaming connection per filter. This eliminates the 50-200+ round-trips per streaming LLM response.

Filters opt in via `streaming: true` in config. When all output filters support streaming, brightstaff opens one POST per filter with a streaming request body (`Body::wrap_stream`) and reads the streaming response. Filters that don't opt in fall back to the existing per-chunk behavior.

Updates the PII deanonymizer demo as the reference implementation with `request.stream()` + `StreamingResponse` support.

Made-with: Cursor

| Name | | |
|---|---|---|
| api | ||
| traces | ||
| configuration.rs | ||
| consts.rs | ||
| errors.rs | ||
| http.rs | ||
| lib.rs | ||
| llm_providers.rs | ||
| path.rs | ||
| pii.rs | ||
| ratelimit.rs | ||
| routing.rs | ||
| stats.rs | ||
| tokenizer.rs | ||
| tracing.rs | ||
| utils.rs | ||
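
The opt-in rule from the commit message above can be sketched in a few lines of Rust. This is a minimal illustration, not the code in this repository: `OutputFilter`, `FilterTransport`, `choose_transport`, and `request_count` are hypothetical names, and only the `streaming` flag comes from the commit message. The sketch assumes the whole filter chain falls back to per-chunk requests as soon as one filter has not opted in, and that an empty chain is treated as per-chunk (which sends no requests either way).

```rust
/// Hypothetical per-filter settings; only `streaming` is taken from the commit message.
#[derive(Debug, Clone)]
struct OutputFilter {
    name: String,
    streaming: bool,
}

/// How chunks of one streaming LLM response are delivered to the filters.
#[derive(Debug, PartialEq, Eq)]
enum FilterTransport {
    /// One long-lived POST per filter, with a streaming request body.
    Streaming,
    /// One HTTP request per chunk per filter (the pre-existing behavior).
    PerChunk,
}

/// Assumption: streaming is used only when every filter opts in.
fn choose_transport(filters: &[OutputFilter]) -> FilterTransport {
    if !filters.is_empty() && filters.iter().all(|f| f.streaming) {
        FilterTransport::Streaming
    } else {
        FilterTransport::PerChunk
    }
}

/// Number of HTTP requests issued to filters for a response of `chunks` chunks.
fn request_count(filters: &[OutputFilter], chunks: usize) -> usize {
    match choose_transport(filters) {
        FilterTransport::Streaming => filters.len(),
        FilterTransport::PerChunk => filters.len() * chunks,
    }
}

fn main() {
    let filters = vec![
        OutputFilter { name: "pii_deanonymizer".to_string(), streaming: true },
        OutputFilter { name: "legacy_filter".to_string(), streaming: false },
    ];
    for f in &filters {
        println!("{}: streaming = {}", f.name, f.streaming);
    }
    println!("transport: {:?}", choose_transport(&filters));
    println!("requests for 120 chunks: {}", request_count(&filters, 120));
}
```

Under these assumptions, the request count for a fully opted-in chain depends only on the number of filters, while a chain with any non-streaming filter still scales with the number of chunks.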