Replace RouterService/RouterModelV1 (arch-router prompt) with
OrchestratorService/OrchestratorModelV1 (plano-orchestrator prompt)
for LLM routing. This ensures the correct system prompt is used when
llm_routing_model points at a Plano-Orchestrator model.
- Extend OrchestratorService with session caching, ModelMetricsService,
top-level routing preferences, and determine_route() for LLM routing
- Delete RouterService, RouterModel trait, RouterModelV1, and
ARCH_ROUTER_V1_SYSTEM_PROMPT
- Unify defaults to Plano-Orchestrator / plano-orchestrator
- Update CLI config generator, demos, docs, and config schema
Made-with: Cursor
* add pluggable session cache with Redis backend
* add Redis session affinity demos (Docker Compose and Kubernetes)
* address PR review feedback on session cache
* document Redis session cache backend for model affinity
* sync rendered config reference with session_cache addition
* add tenant-scoped Redis session cache keys and remove dead log_affinity_hit
- Add tenant_header to SessionCacheConfig; when set, cache keys are scoped
as plano:affinity:{tenant_id}:{session_id} for multi-tenant isolation
- Thread tenant_id through RouterService, routing_service, and LLM handlers
- Use Cow<'_, str> in session_key to avoid allocation when no tenant is set
- Remove unused log_affinity_hit (logging was already inlined at call sites)
* remove session_affinity_redis and session_affinity_redis_k8s demos
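
The tenant-scoped key construction described above (the tenant_header and Cow<'_, str> bullets) can be sketched roughly as follows. This is a minimal illustration, not the code from this change: the signature of session_key, the TENANT_KEY_PREFIX constant, and the choice to return the bare session id when no tenant is set are all assumptions; only the plano:affinity:{tenant_id}:{session_id} key format and the use of Cow to skip the allocation come from the notes above.

```rust
use std::borrow::Cow;

// Assumed constant; the prefix itself is taken from the key format
// plano:affinity:{tenant_id}:{session_id} given in the commit notes.
const TENANT_KEY_PREFIX: &str = "plano:affinity";

/// Hypothetical signature. Builds the cache key for a session.
///
/// With a tenant, a new String is allocated for the scoped key.
/// Without a tenant, the session id is borrowed as-is, so no
/// allocation happens (assumption: the unscoped key is the raw id).
fn session_key<'a>(tenant_id: Option<&str>, session_id: &'a str) -> Cow<'a, str> {
    match tenant_id {
        Some(tenant) => Cow::Owned(format!("{TENANT_KEY_PREFIX}:{tenant}:{session_id}")),
        None => Cow::Borrowed(session_id),
    }
}
```

Under these assumptions, session_key(Some("acme"), "s1") yields an owned "plano:affinity:acme:s1", while session_key(None, "s1") hands back a borrowed "s1". Callers can treat both cases as &str through deref, which is the point of returning Cow instead of String.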