SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-07-10 22:32:16 +02:00

Author	SHA1	Message	Date
CREDO23	db8bffab38	perf(prompt-cache): enable Azure prompt_cache_key routing hint Splits the OpenAI-family gate into per-param predicates so AZURE and AZURE_OPENAI configs now receive prompt_cache_key for backend routing affinity (Microsoft auto-caches GPT-4o+ deployments at >=1024 tokens; the key clusters same-prefix requests on the same GPU pool and raises hit rate on turn 2+). prompt_cache_retention stays opted out for Azure because litellm 1.83.14's Azure transformer would drop it silently; revisit when Azure's supported params list is updated.	2026-05-20 11:58:15 +02:00
CREDO23	c3db25302b	perf(chat): kill auto-pin preflight + speculative build, rely on reactive 429 recovery The preflight pattern probed the LLM with a 1-token ping before each cold turn (when requested_llm_config_id==0, llm_config_id<0, and the 45s healthy TTL had expired) to detect 429s before fanning out into planner/classifier/title-gen. To absorb its ~1-5s RTT cost we built the agent speculatively in parallel; on 429 we discarded the build and repinned. Three problems with that design: 1. False security. Provider rate limits are token-bucket. A 1-token ping consumes ~5 tokens; the real request consumes 10-50K. The probe can return 200 while the real call still 429s. 2. Pure overhead in the common case. On warm-agent-cache turns the probe dominates wall time: ~2.5s of TTFT pure tax for ~99% of users who never see a 429. 3. The in-stream recovery loop (catch of _is_provider_rate_limited gated by not _first_event_logged) already does the right thing reactively: mark_runtime_cooldown -> resolve_or_get_pinned_llm_config_id with exclude_config_ids={previous} -> rebuild agent -> retry the stream. Preflight was never the only safety net; it was a redundant probe in front of one. Changes: - Delete _preflight_llm, _settle_speculative_agent_build, and the _PREFLIGHT_TIMEOUT_SEC / _PREFLIGHT_MAX_TOKENS constants. - Drop the parallel agent_build_task / preflight_task plumbing in both stream_new_chat and stream_resume_chat; build the agent inline with await _build_main_agent_for_thread(...). - Drop the unused is_recently_healthy / mark_healthy imports here (still exported from auto_model_pin_service since OpenRouter catalogue refresh and a few tests reference clear_healthy). - Remove the obsolete preflight + settle-speculative tests from test_stream_new_chat_contract.py. Net: -447 LOC. ~2.5s removed from TTFT on every cold preflight-eligible turn. 429 recovery path is unchanged - same repin/rebuild/retry, just not paid in advance on the healthy path.	2026-05-20 11:03:08 +02:00
Anish Sarkar	8de7d86d56	Merge remote-tracking branch 'upstream/dev' into fix/backend-tests	2026-05-16 19:40:01 +05:30
DESKTOP-RTLN3BA\$punk	9fb9778bd0	test: enhance index batch parallel tests to include hybrid chunker Updated the test for the indexing pipeline to verify that both the standard and hybrid chunkers are called via asyncio.to_thread, ensuring non-blocking behavior. This change reflects the routing of non-code documents through the hybrid chunker, maintaining the event loop contract.	2026-05-15 18:02:04 -07:00
DESKTOP-RTLN3BA\$punk	c187b04e82	chore: linting	2026-05-15 17:33:44 -07:00
CREDO23	4980f9f1ba	Merge remote-tracking branch 'upstream/dev' into feature/multi-agent-with-task-parallelization	2026-05-15 16:44:22 +02:00
CREDO23	98b6977c68	permissions/ask: gate 'approve_always' palette entry on MCP-ness Only MCP tools have a persistence target for 'approve_always' (the connector's trusted-tools list); for native tools the decision lives only in the in-memory runtime ruleset. Reflect that in the wire palette so the FE can stay a pure renderer of allowed_decisions instead of peeking at context.mcp_connector_id to decide whether to show the 'Always Allow' button. The backend still accepts an 'approve_always' reply for any tool kind (in-memory promotion is harmless), it just doesn't advertise it when there's nowhere to persist.	2026-05-15 14:54:16 +02:00
CREDO23	c8b756ae8f	hitl/wire: rename 'always' decision-type to 'approve_always' Renames the SurfSense HITL extension decision-type from "always" to "approve_always" so it sits in the same verb-first family as "approve", "reject", and "edit". The Python constant is now SURFSENSE_DECISION_APPROVE_ALWAYS; the wire value, the permission-domain decision_type, and the FE union members all match (no wire/internal mismatch). Both the multi_agent_chat permission middleware and the legacy new_chat one accept the new wire value; the FE types.ts union is updated accordingly. The "context.always" payload key is intentionally left untouched - it's the patterns-to-promote field, semantically distinct from the decision type.	2026-05-15 14:47:32 +02:00
CREDO23	6671c91841	multi_agent_chat/permissions: persist 'always' decisions to trusted-tools list Until now an "Always Allow" reply only updated the in-memory runtime ruleset, evaporating after the session ended. Persist it to the existing connector.config['trusted_tools'] list so the next session's fetch_user_allowlist_rulesets picks it up and the user is never asked again for the same (connector, tool) pair. - TrustedToolSaver + make_trusted_tool_saver(user_id) in user_tool_allowlist: opens its own session via async_session_maker per call, logs and swallows failures (in-memory promotion is the canonical "always" path, durable persistence is opportunistic). - PermissionMiddleware._process is now pure: returns (state_update, list[_AlwaysPromotion]). aafter_model awaits the saver for each promotion; after_model discards them. Promotions are only emitted for tools whose metadata exposes mcp_connector_id, so native tools and KB FS ops are correctly skipped. - main_agent factory builds the saver once per turn and stashes it in dependencies["trusted_tool_saver"]; pack_subagent and the KB middleware stack forward it through build_permission_mw. - Renamed pm._process(state, None) call sites in two existing tests to pm.after_model(state, None) so they exercise the public hook contract instead of the now-tuple-returning private method.	2026-05-15 14:07:08 +02:00
Rohan Verma	9475036b8a	Merge pull request #1389 from CREDO23/feature/multi-agent [Feature] Fix multi-agent delegation: orchestrator-only main agent with knowledge_base specialist	2026-05-15 04:54:17 -07:00
CREDO23	a97d1548a6	multi_agent_chat/permissions: surface MCP tool metadata into ask interrupts The FE permission card needs mcp_connector_id, mcp_server, and tool_description in the interrupt context to render "Always Allow" against the right connected account. Thread the tool through the ask pipeline: - pack_subagent → build_permission_mw(tools=...) → PermissionMiddleware (tools_by_name) → request_permission_decision(tool=...) → build_permission_ask_payload(tool=...) projects card fields out of BaseTool. - mcp_tool.py: stdio path now stashes mcp_connector_id in metadata for parity with the HTTP path.	2026-05-15 11:28:06 +02:00
CREDO23	ef1152b80e	multi_agent_chat/permissions: layer user allow-list into subagent compile	2026-05-14 21:57:38 +02:00
CREDO23	d45dfbfbd6	multi_agent_chat: pack_subagent owns per-subagent PermissionMiddleware via Ruleset	2026-05-14 20:09:29 +02:00
CREDO23	0723702320	multi_agent_chat: real-graph regressions for unified HITL paths + format pass	2026-05-14 17:41:24 +02:00
CREDO23	a36b15b834	multi_agent_chat/middleware: tighten parallel-keying test with heterogeneous bundles and per-slice assertions	2026-05-14 10:11:51 +02:00
CREDO23	d69d2cc1fc	multi_agent_chat/middleware: tighten heterogeneous slice arithmetic to (2,3) bundles	2026-05-14 10:05:04 +02:00
CREDO23	668b89927b	multi_agent_chat/middleware: real-graph regression test for partial-pause parallel routing	2026-05-14 09:47:24 +02:00
CREDO23	8e10f38f32	multi_agent_chat/middleware: real-graph regression test for all-reject parallel routing	2026-05-14 09:36:03 +02:00
CREDO23	ca57b2106e	multi_agent_chat/middleware: real-graph regression test for heterogeneous parallel decisions	2026-05-14 09:26:08 +02:00
DESKTOP-RTLN3BA\$punk	3737118050	chore: evals	2026-05-13 14:02:26 -07:00
CREDO23	f2495092da	chat/stream_resume: salt thinking-step prefix with turn_id to avoid duplicate React keys	2026-05-13 21:15:51 +02:00
CREDO23	0fd87ccb7f	chat/stream_resume: key Command(resume=...) by Interrupt.id for parallel HITL	2026-05-13 20:59:57 +02:00
CREDO23	c06dd6e8ba	chat/stream_new_chat: emit one SSE frame per pending interrupt	2026-05-13 20:59:48 +02:00
CREDO23	1001f56206	multi_agent_chat/middleware: parallel task tests and full bridge coverage	2026-05-13 19:57:57 +02:00
CREDO23	6fb011c95c	multi_agent_chat/middleware: real-graph regression tests for interrupt stamping	2026-05-13 19:57:09 +02:00
CREDO23	fc2c5b6445	multi_agent_chat/middleware: per-call thread_id, tcid-keyed resume, decisions slicer	2026-05-13 19:56:51 +02:00
CREDO23	246dae40a8	Merge upstream/dev into feature/multi-agent	2026-05-12 21:23:37 +02:00
Anish Sarkar	9b926b3133	refactor: update test for index() to use chunk_text_hybrid	2026-05-13 00:22:43 +05:30
CREDO23	d843468256	multi_agent_chat/subagents: dict-keyed middleware_stack + always-on KB	2026-05-12 18:04:54 +02:00
DESKTOP-RTLN3BA\$punk	c8374e6c5b	feat: improved document, folder mentions rendering	2026-05-09 22:15:51 -07:00
Rohan Verma	28a02a9143	Merge pull request #1357 from CREDO23/feature/multi-agent [Feature] Multi-agent chat: hierarchical timeline, live subagent streaming, and inline HITL approvals	2026-05-09 16:13:04 -07:00
Rohan Verma	fa31da9937	Merge branch 'dev' into feat/e2e-testing	2026-05-09 16:10:45 -07:00
CREDO23	2ab6b1c757	Merge upstream/dev into feature/multi-agent.	2026-05-09 23:00:56 +02:00
Anish Sarkar	de6fc80dbd	chore: ran linting	2026-05-09 05:28:09 +05:30
Anish Sarkar	f7bac59a4b	test(integration): enhance Drive indexer credential resolution tests for Composio and native connectors	2026-05-09 05:26:36 +05:30
Anish Sarkar	dbf575fbd0	chore: ran linting	2026-05-09 05:16:20 +05:30
CREDO23	1761b60c16	Carry thinkingStepId on tool output and extend builder and parity tests.	2026-05-08 23:17:12 +02:00
CREDO23	007a0a30ec	Cover tool_activity_metadata for span-only, step-only, and combined cases.	2026-05-08 23:16:56 +02:00
CREDO23	f944cdacb7	Add helpers to open and close task delegation span ids.	2026-05-08 22:47:03 +02:00
CREDO23	78f4747382	refactor(chat): stream agent events via stream_output and remove parity v2 flag	2026-05-07 19:40:10 +02:00
CREDO23	7e07092f67	refactor(chat): drop alternate streaming entry path; use graph_stream	2026-05-07 19:25:20 +02:00
CREDO23	52895e37e9	build streaming contexts for chat resume and regenerate paths	2026-05-07 17:57:27 +02:00
CREDO23	a04b2e88bd	wire orchestrator streaming context path and align event relay outputs	2026-05-07 17:06:17 +02:00
CREDO23	0f40279d95	Expand orchestration gate coverage to resume and regenerate flows.	2026-05-07 16:18:29 +02:00
CREDO23	52593d88db	Reorganize streaming orchestration modules into relay and orchestration folders.	2026-05-07 16:00:15 +02:00
CREDO23	f8754a9dab	Rename streaming runtime modules for clearer SRP boundaries.	2026-05-07 15:41:33 +02:00
CREDO23	4e664652a8	Add streaming runtime helpers with behavior-focused unit tests.	2026-05-07 15:13:22 +02:00
CREDO23	8b6ffd12b8	Add parity unit tests for extracted chat streaming vs legacy.	2026-05-06 20:08:48 +02:00
CREDO23	366122da6e	Add unit tests for streaming interrupts and service propagation.	2026-05-06 20:08:48 +02:00
CREDO23	619a8362b7	Add unit tests for streaming emitters and registry wiring.	2026-05-06 20:08:48 +02:00
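
Commit c3db25302b above replaces the preflight probe with reactive recovery: on a 429 raised before the first event is emitted, cool down the current config, repin with the previous config excluded, and retry the stream. A minimal synchronous sketch of that control flow follows. It is not the SurfSense code: `RateLimited`, `Pinner`, `stream_with_recovery`, and `flaky_call` are hypothetical stand-ins for the names in the commit message (`_is_provider_rate_limited`, `mark_runtime_cooldown`, `resolve_or_get_pinned_llm_config_id`), and the real path is async and rebuilds the agent between attempts.

```python
from collections.abc import Callable, Iterator
from dataclasses import dataclass, field


class RateLimited(Exception):
    """Hypothetical stand-in for a provider 429 error."""


@dataclass
class Pinner:
    """Hypothetical config pool; not SurfSense's auto_model_pin_service."""

    config_ids: list[int]
    cooling: set[int] = field(default_factory=set)

    def mark_cooldown(self, config_id: int) -> None:
        self.cooling.add(config_id)

    def resolve(self, exclude: set[int]) -> int:
        # Pick the first config that is neither cooling down nor excluded.
        for cid in self.config_ids:
            if cid not in self.cooling and cid not in exclude:
                return cid
        raise RuntimeError("no healthy config left")


def stream_with_recovery(
    pinner: Pinner,
    call: Callable[[int], Iterator[str]],
    max_repins: int = 2,
) -> Iterator[str]:
    """Yield chunks from call(config_id), repinning on an early 429."""
    tried: set[int] = set()
    config_id = pinner.resolve(exclude=tried)
    for attempt in range(max_repins + 1):
        emitted = False
        try:
            for chunk in call(config_id):
                emitted = True
                yield chunk
            return
        except RateLimited:
            # Once output has reached the client, a retry would duplicate
            # it, so only recover while nothing has been emitted yet.
            if emitted or attempt == max_repins:
                raise
            pinner.mark_cooldown(config_id)
            tried.add(config_id)
            config_id = pinner.resolve(exclude=tried)


def flaky_call(config_id: int) -> Iterator[str]:
    """Demo provider: config 1 is rate limited, any other config streams."""
    if config_id == 1:
        raise RateLimited()
    yield f"chunk from config {config_id}"
```

With `Pinner(config_ids=[1, 2])`, iterating `stream_with_recovery(pinner, flaky_call)` hits the 429 on config 1, cools it down, and streams from config 2. No request is spent on the healthy path, which is the trade-off the commit message argues for.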

194 commits