feat(story-3.5): add cloud-mode LLM model selection with token quota enforcement

Implement system-managed model catalog, subscription tier enforcement,
atomic token quota tracking, and frontend cloud/self-hosted conditional
rendering. Apply all 20 BMAD code review patches including security
fixes (cross-user API key hijack), race condition mitigation (atomic SQL
UPDATE), and SSE mid-stream quota error handling.
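
The atomic SQL UPDATE mentioned above can be illustrated with a minimal sketch. It is not the code from this commit: the `token_quota` table, its columns, and the `consume_tokens` helper are hypothetical names, and `sqlite3` is used only to keep the example self-contained. The idea is that the quota check and the increment happen in one statement, so two concurrent requests cannot both pass a stale read-then-write check.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical quota row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token_quota ("
        " user_id TEXT PRIMARY KEY,"
        " tokens_used INTEGER NOT NULL,"
        " tokens_limit INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO token_quota VALUES ('u1', 0, 1000)")
    conn.commit()
    return conn


def consume_tokens(conn: sqlite3.Connection, user_id: str, tokens: int) -> bool:
    """Deduct tokens only if the quota still allows it.

    The WHERE clause performs the limit check inside the same UPDATE,
    so there is no separate SELECT whose result could go stale.
    Returns True when exactly one row was updated.
    """
    cur = conn.execute(
        "UPDATE token_quota"
        " SET tokens_used = tokens_used + ?"
        " WHERE user_id = ? AND tokens_used + ? <= tokens_limit",
        (tokens, user_id, tokens),
    )
    conn.commit()
    return cur.rowcount == 1
```

A caller would treat a `False` return as "quota exceeded" and reject the request, or end the stream with a quota error, instead of calling the model.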

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
Vonic 2026-04-14 17:01:21 +07:00
parent e7382b26de
commit c1776b3ec8
19 changed files with 1003 additions and 34 deletions

@@ -35,7 +35,7 @@
# - Dev moves story to 'review', then runs code-review (fresh context, different LLM recommended)
generated: 2026-04-13T02:50:25+07:00
-last_updated: 2026-04-13T02:50:25+07:00
+last_updated: 2026-04-14T17:00:00+07:00
project: SurfSense
project_key: NOKEY
tracking_system: file-system
@@ -58,7 +58,7 @@ development_status:
3-2-rag-engine-sse-endpoint: done
3-3-chat-ui-sse-client: done
3-4-split-pane-layout-interactive-citation: done
-3-5-model-selection-via-quota: backlog
+3-5-model-selection-via-quota: done
epic-3-retrospective: optional
epic-4: done
4-1-chat-history-sync: done