feat(story-3.5): add cloud-mode LLM model selection with token quota enforcement

Implement system-managed model catalog, subscription tier enforcement, atomic token quota tracking, and frontend cloud/self-hosted conditional rendering. Apply all 20 BMAD code review patches including security fixes (cross-user API key hijack), race condition mitigation (atomic SQL UPDATE), and SSE mid-stream quota error handling.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
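
The race condition mitigation named in the message (atomic SQL UPDATE) can be illustrated with a minimal sketch. The table `token_quota`, its columns, and the helper `consume_tokens` are hypothetical names chosen for illustration, and SQLite is used only to keep the example self-contained; the schema and database touched by this commit are not shown in this excerpt and may differ. The idea is to fold the limit check into the UPDATE's WHERE clause, so that checking and incrementing happen in one statement rather than in a separate read followed by a write.

```python
import sqlite3


def consume_tokens(conn: sqlite3.Connection, user_id: str, tokens: int) -> bool:
    """Charge `tokens` to the user's quota in one statement.

    Hypothetical helper: the limit check lives in the WHERE clause, so two
    concurrent requests cannot both pass a stale "remaining quota" read.
    Returns True if the row was updated, False if the quota would be exceeded.
    """
    cur = conn.execute(
        "UPDATE token_quota "
        "SET used = used + ? "
        "WHERE user_id = ? AND used + ? <= quota_limit",
        (tokens, user_id, tokens),
    )
    conn.commit()
    # rowcount is 1 when the condition held and the row changed, else 0.
    return cur.rowcount == 1


# Assumed schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE token_quota ("
    "user_id TEXT PRIMARY KEY, "
    "used INTEGER NOT NULL, "
    "quota_limit INTEGER NOT NULL)"
)
conn.execute("INSERT INTO token_quota VALUES ('alice', 0, 1000)")
conn.commit()

first = consume_tokens(conn, "alice", 600)   # fits within the limit
second = consume_tokens(conn, "alice", 600)  # would push usage to 1200
used = conn.execute(
    "SELECT used FROM token_quota WHERE user_id = 'alice'"
).fetchone()[0]
print(first, second, used)
```

In this sketch the second call is rejected and the stored usage is left unchanged, which is the behaviour a read-then-write implementation cannot guarantee under concurrent requests.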
2026-04-25 16:56:22 +02:00 · 2026-04-14 17:01:21 +07:00 · c1776b3ec8
commit c1776b3ec8
parent e7382b26de
19 changed files with 1003 additions and 34 deletions
--- a/_bmad-output/implementation-artifacts/sprint-status.yaml
+++ b/_bmad-output/implementation-artifacts/sprint-status.yaml
@@ -35,7 +35,7 @@
 # - Dev moves story to 'review', then runs code-review (fresh context, different LLM recommended)

 generated: 2026-04-13T02:50:25+07:00
-last_updated: 2026-04-13T02:50:25+07:00
+last_updated: 2026-04-14T17:00:00+07:00
 project: SurfSense
 project_key: NOKEY
 tracking_system: file-system
@@ -58,7 +58,7 @@ development_status:
  3-2-rag-engine-sse-endpoint: done
  3-3-chat-ui-sse-client: done
  3-4-split-pane-layout-interactive-citation: done
-  3-5-model-selection-via-quota: backlog
+  3-5-model-selection-via-quota: done
  epic-3-retrospective: optional
  epic-4: done
  4-1-chat-history-sync: done