feat(story-3.5): add cloud-mode LLM model selection with token quota enforcement

Implement system-managed model catalog, subscription tier enforcement, atomic token quota tracking, and frontend cloud/self-hosted conditional rendering. Apply all 20 BMAD code review patches, including security fixes (cross-user API key hijack), race condition mitigation (atomic SQL UPDATE), and SSE mid-stream quota error handling.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
2026-04-26 09:16:22 +02:00 · 2026-04-14 17:01:21 +07:00
commit c1776b3ec8
parent e7382b26de
19 changed files with 1003 additions and 34 deletions
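
The doc comment added in this commit says the system catalogue returns backend-configured models with negative IDs. A minimal sketch of how a caller might use that convention to separate system-managed models from user-created configs follows. `ModelEntry`, `isSystemModel`, and `partitionModels` are hypothetical names for illustration; the actual schema behind `getSystemModelsResponse` is not shown in this diff, and the treatment of ID 0 is an assumption.

```typescript
// Hypothetical shape: the real response schema is not visible in this diff.
interface ModelEntry {
	id: number;
	name: string;
}

// Assumption: system-managed models carry negative IDs; anything else
// (including 0) is treated as a user-created config.
function isSystemModel(model: ModelEntry): boolean {
	return model.id < 0;
}

// Split a mixed list into system-managed and user-created entries,
// preserving the original order within each group.
function partitionModels(models: ModelEntry[]): { system: ModelEntry[]; user: ModelEntry[] } {
	const system: ModelEntry[] = [];
	const user: ModelEntry[] = [];
	for (const m of models) {
		(isSystemModel(m) ? system : user).push(m);
	}
	return { system, user };
}
```

Keying on the sign of the ID lets the frontend tell the two sources apart without an extra flag, provided user-created configs never receive negative IDs.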
--- a/surfsense_web/lib/apis/new-llm-config-api.service.ts
+++ b/surfsense_web/lib/apis/new-llm-config-api.service.ts
@@ -15,6 +15,7 @@ import {
 	getNewLLMConfigResponse,
 	getNewLLMConfigsRequest,
 	getNewLLMConfigsResponse,
+	getSystemModelsResponse,
 	type UpdateLLMPreferencesRequest,
 	type UpdateNewLLMConfigRequest,
 	updateLLMPreferencesRequest,
@@ -153,6 +154,14 @@ class NewLLMConfigApiService {
 		return baseApiService.get(`/api/v1/models`, getModelListResponse);
 	};

+	/**
+	 * Get the system-managed LLM catalogue (cloud mode only)
+	 * Returns backend-configured models from YAML with negative IDs
+	 */
+	getSystemModels = async () => {
+		return baseApiService.get(`/api/v1/models/system`, getSystemModelsResponse);
+	};
+
 	/**
 	 * Update LLM preferences for a search space
 	 */