Commit graph

5820 commits

Author SHA1 Message Date
Matt Van Horn
047871c47a
fix(citation-panel): reset expanded state when chunkId changes
CitationPanelContent uses a single instance across different citations
(RightPanel.tsx:251 renders without a key), so when the user clicks a
new citation while the panel is open, the prior expanded state leaks
into the new citation. The reset effect had an empty dependency array,
so it only fired on mount.

Add chunkId to the effect deps so the expanded state resets each time
the citation changes.
2026-05-17 22:35:51 -07:00
Rohan Verma
8fc4b98593
Merge pull request #1402 from guangyang1206/fix/extract-domain-helper-1368
Fix/extract domain helper 1368
2026-05-17 18:17:25 -07:00
Rohan Verma
ac76d50ec7
Merge pull request #1400 from guangyang1206/fix/cachekeys-order-stable-1370
Fix/cachekeys order stable 1370
2026-05-17 18:16:30 -07:00
Rohan Verma
3c27fe688a
Merge pull request #1390 from AnishSarkar22/fix/backend-tests
fix: unit and integration tests
2026-05-17 18:15:53 -07:00
Rohan Verma
a065f94048
Merge pull request #1388 from AnishSarkar22/fix/zero-cache-stale-replica-1355
fix: zero cache stale replica & improved mentioned document chip handling
2026-05-17 18:15:36 -07:00
Anish Sarkar
cb9a0f327c test: refactor Gmail indexer tests to utilize ComposioService and hybrid chunking 2026-05-16 21:26:40 +05:30
Anish Sarkar
a0f2563dc3 test: update Stripe and Google Calendar integration tests to use ComposioService 2026-05-16 21:13:17 +05:30
Anish Sarkar
cc06cff4fb feat(tests): add mock response for file ownership in composio_module 2026-05-16 20:20:04 +05:30
Anish Sarkar
8de7d86d56 Merge remote-tracking branch 'upstream/dev' into fix/backend-tests 2026-05-16 19:40:01 +05:30
Anish Sarkar
af1d2fa430 Merge remote-tracking branch 'upstream/dev' into fix/zero-cache-stale-replica-1355 2026-05-16 19:30:09 +05:30
guangyang1206
f096548a16 fix(web): extract single tryGetHostname helper (DRY, unified fallback)
Fixes #1368

Previously,  was duplicated in 4 places with 3 subtly different fallback behaviors:
1. inline-citation.tsx: returned  on error
2. markdown-text.tsx: returned  on error
3. assistant-message.tsx: returned  on error
4. citation.tsx: returned  on error

Created canonical  in  that:
- Returns
- Strips  prefix from hostname
- Returns  on invalid URL (safest contract)

Updated all 4 call sites:
- inline-citation.tsx:  (preserves original fallback)
- markdown-text.tsx:  (preserves original fallback)
- assistant-message.tsx:  (drop-in, both return )
- citation.tsx:  (drop-in, both return )

Co-authored-by: guangyang1206 <guangyang1206@users.noreply.github.com>
2026-05-16 12:15:16 +08:00
guangyang1206
3504be3413 fix(web): make cacheKeys.*.withQueryParams order-stable (sort entries)
Fixes #1370

Object.values() produces order-dependent cache keys because the order of values depends on the order of keys in the object. This causes the same logical query to produce different cache keys when the parameter object has keys in different orders.

Added stableEntries() helper that:
1. Filters out undefined values
2. Sorts entries by key name
3. Returns flat array of [key, value] pairs

This ensures cache key identity is stable regardless of parameter object key order.

Co-authored-by: guangyang1206 <guangyang1206@users.noreply.github.com>
2026-05-16 12:10:04 +08:00
Rohan Verma
1119f557df
Merge pull request #1399 from MODSetter/dev
feat: multi-agent chat architecture, streaming runtime rewrite, and full E2E test harness
2026-05-15 18:08:04 -07:00
DESKTOP-RTLN3BA\$punk
9fb9778bd0 test: enhance index batch parallel tests to include hybrid chunker
Updated the test for the indexing pipeline to verify that both the standard and hybrid chunkers are called via asyncio.to_thread, ensuring non-blocking behavior. This change reflects the routing of non-code documents through the hybrid chunker, maintaining the event loop contract.
2026-05-15 18:02:04 -07:00
DESKTOP-RTLN3BA\$punk
c187b04e82 chore: linting 2026-05-15 17:33:44 -07:00
DESKTOP-RTLN3BA\$punk
219a5977b7 fix: update URLs to use the "www" subdomain across the application
This commit modifies various metadata and canonical URLs in the SurfSense application to ensure consistency by using "https://www.surfsense.com" instead of "https://surfsense.com". Changes were made in layout files, blog posts, and SEO components to reflect this update.
2026-05-15 12:35:15 -07:00
DESKTOP-RTLN3BA\$punk
dc88ce0277 Merge branch 'dev' of https://github.com/MODSetter/SurfSense into dev 2026-05-15 11:55:40 -07:00
DESKTOP-RTLN3BA\$punk
52a64fb96c feat: added blog posts 2026-05-15 11:55:30 -07:00
Rohan Verma
953c654452
Merge pull request #1398 from mvanhorn/osc/1373-platefile-jsdoc-mod-shift-s-dev
docs(editor): align PlateEditor onSave JSDoc with Mod+Shift+S chord
2026-05-15 11:28:15 -07:00
Rohan Verma
1afecc9194
Merge pull request #1397 from voidborne-d/fix/suppress-global-error-toast-mutations-dev
fix(web): suppress global error toast on mutations that own their toast UX
2026-05-15 11:27:40 -07:00
Rohan Verma
7e484b26a4
Merge pull request #1393 from CREDO23/feature/multi-agent-with-task-parallelization
[Feature] Parallel multi-agent task delegation with parallel HITL approvals
2026-05-15 11:26:51 -07:00
Matt Van Horn
f0a51fad6f
docs(editor): align PlateEditor onSave JSDoc with Mod+Shift+S chord
Per #1373, the registered save chord is Mod+Shift+S (not Mod+S, which
collides with the browser's Save-Page-As). The JSDoc on PlateEditorProps.onSave
still claims Mod+S, which is misleading for downstream consumers of the
component. Update the JSDoc to match the actual chord and call out why.

Targeting dev per maintainer request.
2026-05-15 09:06:42 -07:00
voidborne-d
bf2b4ebeb0 fix(web): suppress global error toast on mutations that own their toast UX
Closes #1371. Retarget of #1385 onto dev per maintainer request.

surfsense_web/lib/query-client/client.ts configures a global
MutationCache.onError that shows an error toast for every failed
mutation unless meta.suppressGlobalErrorToast is set. The opt-out
hook existed in the consumer but had zero producers — every mutation
atom that already had its own onError: toast.error(...) was
double-toasting on failure.

Add meta: { suppressGlobalErrorToast: true } to the 30 mutations
across 9 atom files that own their own error toast:

- atoms/prompts/prompts-mutation.atoms.ts (4)
- atoms/invites/invites-mutation.atoms.ts (4)
- atoms/chat-comments/comments-mutation.atoms.ts (4)
- atoms/new-llm-config/new-llm-config-mutation.atoms.ts (4)
- atoms/members/members-mutation.atoms.ts (3)
- atoms/roles/roles-mutation.atoms.ts (3)
- atoms/image-gen-config/image-gen-config-mutation.atoms.ts (3)
- atoms/vision-llm-config/vision-llm-config-mutation.atoms.ts (3)
- atoms/public-chat-snapshots/public-chat-snapshots-mutation.atoms.ts (2)

Atoms intentionally left alone (no local onError, rely on global):
auth, user, search-spaces, logs, documents, connectors.

Local validation (against dev): pnpm biome check on the 9 touched
files is clean; tsc --noEmit shows no new errors in the touched files
(pre-existing errors elsewhere unrelated).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 23:43:30 +08:00
CREDO23
4980f9f1ba Merge remote-tracking branch 'upstream/dev' into feature/multi-agent-with-task-parallelization 2026-05-15 16:44:22 +02:00
CREDO23
5327f3348c connector-popup: surface trusted-tools UI in MCP edit view; consolidate disconnect
- Slot MCPTrustedTools in mcp-service-config (gated on connector.id > 0) so
  any connected MCP-backed connector exposes a revoke surface for
  approve_always grants.
- Add new mcp-trusted-tools.tsx (audit + revoke list) and
  connectorsApiService.untrustMCPTool() that backs it.
- Drop the redundant row-level Disconnect from ConnectorAccountsListView:
  Manage now leads to the edit view whose own Disconnect is the single
  source of truth. Remove the now-dead onDisconnect prop, confirm-flow
  state, and handleDisconnectFromList hook callback + return entry.
2026-05-15 16:40:16 +02:00
CREDO23
a22e0e915f schemas/new_chat: accept 'approve_always' on the resume HTTP boundary
ResumeDecision is the Pydantic gate at the /resume HTTP route. It was
the last spot still rejecting the new wire decision-type, so the FE's
'approve_always' dispatch was being 422'd before it could reach the
permission middleware that already speaks it.
2026-05-15 15:23:39 +02:00
CREDO23
1f1b6c5425 hitl/generic-approval: drop client-side MCP gate, dispatch approve_always
The 'Always Allow' button is now driven entirely by the server-supplied
allowed_decisions palette. The card no longer peeks at
context.mcp_connector_id to decide whether to render the button, and no
longer fires a separate trust-tool HTTP call on click - one
{type: 'approve_always'} dispatch is enough; the agent middleware
handles the in-memory promotion and (for MCP tools) the database save
via its trusted_tool_saver callback.

Drops the dead trustMCPTool / untrustMCPTool service helpers - they had
no remaining callers after this rework. The backing HTTP routes are
kept on the server as a programmatic surface.
2026-05-15 14:59:45 +02:00
CREDO23
98b6977c68 permissions/ask: gate 'approve_always' palette entry on MCP-ness
Only MCP tools have a persistence target for 'approve_always' (the
connector's trusted-tools list); for native tools the decision lives
only in the in-memory runtime ruleset. Reflect that in the wire palette
so the FE can stay a pure renderer of allowed_decisions instead of
peeking at context.mcp_connector_id to decide whether to show the
'Always Allow' button.

The backend still accepts an 'approve_always' reply for any tool kind
(in-memory promotion is harmless), it just doesn't advertise it when
there's nowhere to persist.
2026-05-15 14:54:16 +02:00
CREDO23
c8b756ae8f hitl/wire: rename 'always' decision-type to 'approve_always'
Renames the SurfSense HITL extension decision-type from "always" to
"approve_always" so it sits in the same verb-first family as "approve",
"reject", and "edit". The Python constant is now SURFSENSE_DECISION_APPROVE_ALWAYS;
the wire value, the permission-domain decision_type, and the FE union members
all match (no wire/internal mismatch).

Both the multi_agent_chat permission middleware and the legacy new_chat one
accept the new wire value; the FE types.ts union is updated accordingly.

The "context.always" payload key is intentionally left untouched - it's the
patterns-to-promote field, semantically distinct from the decision type.
2026-05-15 14:47:32 +02:00
CREDO23
6671c91841 multi_agent_chat/permissions: persist 'always' decisions to trusted-tools list
Until now an "Always Allow" reply only updated the in-memory runtime
ruleset, evaporating after the session ended. Persist it to the
existing connector.config['trusted_tools'] list so the next session's
fetch_user_allowlist_rulesets picks it up and the user is never asked
again for the same (connector, tool) pair.

- TrustedToolSaver + make_trusted_tool_saver(user_id) in
  user_tool_allowlist: opens its own session via async_session_maker
  per call, logs and swallows failures (in-memory promotion is the
  canonical "always" path, durable persistence is opportunistic).

- PermissionMiddleware._process is now pure: returns
  (state_update, list[_AlwaysPromotion]). aafter_model awaits the
  saver for each promotion; after_model discards them. Promotions are
  only emitted for tools whose metadata exposes mcp_connector_id, so
  native tools and KB FS ops are correctly skipped.

- main_agent factory builds the saver once per turn and stashes it in
  dependencies["trusted_tool_saver"]; pack_subagent and the KB
  middleware stack forward it through build_permission_mw.

- Renamed pm._process(state, None) call sites in two existing tests to
  pm.after_model(state, None) so they exercise the public hook
  contract instead of the now-tuple-returning private method.
2026-05-15 14:07:08 +02:00
Rohan Verma
eea2d68098
Merge pull request #1396 from guangyang1206/fix/shared-thread-timestamp-formatter-1376
Some checks failed
Build and Push Docker Images / tag_release (push) Has been cancelled
Build and Push Docker Images / build (./surfsense_backend, ./surfsense_backend/Dockerfile, backend, surfsense-backend, ubuntu-24.04-arm, linux/arm64, arm64, production) (push) Has been cancelled
Build and Push Docker Images / build (./surfsense_backend, ./surfsense_backend/Dockerfile, backend, surfsense-backend, ubuntu-latest, linux/amd64, amd64, production) (push) Has been cancelled
Build and Push Docker Images / build (./surfsense_web, ./surfsense_web/Dockerfile, web, surfsense-web, ubuntu-24.04-arm, linux/arm64, arm64, runner) (push) Has been cancelled
Build and Push Docker Images / build (./surfsense_web, ./surfsense_web/Dockerfile, web, surfsense-web, ubuntu-latest, linux/amd64, amd64, runner) (push) Has been cancelled
Build and Push Docker Images / create_manifest (backend, surfsense-backend) (push) Has been cancelled
Build and Push Docker Images / create_manifest (web, surfsense-web) (push) Has been cancelled
feat(shared): extract formatThreadTimestamp helper for chats sidebars…
2026-05-15 04:55:47 -07:00
Rohan Verma
7f66159af1
Merge pull request #1391 from guangyang1206/fix/log-mutations-invalidate-all-keys-1369
fix(web): invalidate all log cache keys on log mutations
2026-05-15 04:55:25 -07:00
Rohan Verma
9475036b8a
Merge pull request #1389 from CREDO23/feature/multi-agent
[Feature] Fix multi-agent delegation: orchestrator-only main agent with knowledge_base specialist
2026-05-15 04:54:17 -07:00
Rohan Verma
4ab9544a66
Merge pull request #1382 from mvanhorn/osc/1372-use-canonical-log-types
refactor(use-logs): use canonical log types from contracts/types/log.types
2026-05-15 04:49:21 -07:00
Rohan Verma
4db3cf7fd5
Merge pull request #1377 from AnishSarkar22/feat/e2e-testing-ci
feat: add E2E CI and harden Docker build migrations
2026-05-15 04:47:26 -07:00
CREDO23
a97d1548a6 multi_agent_chat/permissions: surface MCP tool metadata into ask interrupts
The FE permission card needs mcp_connector_id, mcp_server, and
tool_description in the interrupt context to render "Always Allow"
against the right connected account. Thread the tool through the
ask pipeline:

- pack_subagent → build_permission_mw(tools=...) → PermissionMiddleware
  (tools_by_name) → request_permission_decision(tool=...) →
  build_permission_ask_payload(tool=...) projects card fields out of
  BaseTool.

- mcp_tool.py: stdio path now stashes mcp_connector_id in metadata for
  parity with the HTTP path.
2026-05-15 11:28:06 +02:00
DESKTOP-RTLN3BA\$punk
e8aad48ddf refactor(report): enhance citations and clarify implementation details
Updated the multimodal_doc_parser_compare_n171_report.md to include detailed code citations for preprocessing costs and retry logic. Improved clarity on the implementation of the retry mechanism and its impact on failure rates. Added a new section for a code citations index to ensure reproducibility of technical claims.

This enhances the report's transparency and allows readers to trace the source of each claim back to the codebase.
2026-05-14 20:07:14 -07:00
DESKTOP-RTLN3BA\$punk
9bcd50164d feat(evals): publish multimodal_doc parser_compare benchmark + n=171 report
Adds the full parser_compare experiment for the multimodal_doc suite:
six arms compared on 30 PDFs / 171 questions from MMLongBench-Doc with
anthropic/claude-sonnet-4.5 across the board.

Source code:
- core/parsers/{azure_di,llamacloud,pdf_pages}.py: direct parser SDK
  callers (Azure Document Intelligence prebuilt-read/layout, LlamaParse
  parse_page_with_llm/parse_page_with_agent) used by the LC arms,
  bypassing the SurfSense backend so each (basic/premium) extraction
  is a clean A/B independent of backend ETL routing.
- suites/multimodal_doc/parser_compare/{ingest,runner,prompt}.py:
  six-arm benchmark (native_pdf, azure_basic_lc, azure_premium_lc,
  llamacloud_basic_lc, llamacloud_premium_lc, surfsense_agentic) with
  byte-identical prompts per question, deterministic grader, Wilson
  CIs, and the per-page preprocessing tariff cost overlay.

Reproducibility:
- pyproject.toml + uv.lock pin pypdf, azure-ai-documentintelligence,
  llama-cloud-services as new deps.
- .env.example documents the AZURE_DI_* and LLAMA_CLOUD_API_KEY env
  vars now required for parser_compare.
- 12 analysis scripts under scripts/: retry pass with exponential
  backoff, post-retry accuracy merge, McNemar / latency / per-PDF
  stats, context-overflow hypothesis test, etc. Each produces one
  number cited by the blog report.

Citation surface:
- reports/blog/multimodal_doc_parser_compare_n171_report.md: 1219-line
  technical writeup (16 sections) covering headline accuracy, per-format
  accuracy, McNemar pairwise significance, latency / token / per-PDF
  distributions, error analysis, retry experiment, post-retry final
  accuracy, cost amortization model with closed-form derivation, threats
  to validity, and reproducibility appendix.
- data/multimodal_doc/runs/2026-05-14T00-53-19Z/parser_compare/{raw,
  raw_retries,raw_post_retry}.jsonl + run_artifact.json + retry summary
  whitelisted via data/.gitignore as the verifiable numbers source.

Gitignore:
- ignore logs_*.txt + retry_run.log; structured artifacts cover the
  citation surface, debug logs are noise.
- data/.gitignore default-ignores everything, whitelists the n=171 run
  artifacts only (parser manifest left ignored to avoid leaking local
  Windows usernames in absolute paths; manifest is fully regenerable
  via 'ingest multimodal_doc parser_compare').
- reports/.gitignore now whitelists hand-curated reports/blog/.

Also retires the abandoned CRAG Task 3 implementation (download script,
streaming Task 3 ingest, CragTask3Benchmark + tests) and trims the
runner / ingest module APIs to match.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 19:54:41 -07:00
CREDO23
ef1152b80e multi_agent_chat/permissions: layer user allow-list into subagent compile 2026-05-14 21:57:38 +02:00
CREDO23
e99c06c887 user_tool_allowlist: extract trust-tool storage into reusable service 2026-05-14 21:20:30 +02:00
CREDO23
31d6b43a42 multi_agent_chat/shared: drop bucket types and helpers 2026-05-14 20:10:25 +02:00
CREDO23
014801c764 multi_agent_chat/loader: MCP tools as flat list[BaseTool] per agent 2026-05-14 20:10:11 +02:00
CREDO23
5a00df8e48 multi_agent_chat/builtins: KB+deliverables+memory+research adopt RULESET + flat load_tools() 2026-05-14 20:09:55 +02:00
CREDO23
3bb90124d2 multi_agent_chat/connectors: every route declares its own RULESET + flat load_tools() 2026-05-14 20:09:49 +02:00
CREDO23
d45dfbfbd6 multi_agent_chat: pack_subagent owns per-subagent PermissionMiddleware via Ruleset 2026-05-14 20:09:29 +02:00
CREDO23
67142e68b1 multi_agent_chat: scope MCP allow/ask permissions per subagent + drop "policy" synonym 2026-05-14 18:09:14 +02:00
CREDO23
0723702320 multi_agent_chat: real-graph regressions for unified HITL paths + format pass 2026-05-14 17:41:24 +02:00
CREDO23
adb52fb575 multi_agent_chat: KB owns its ruleset, drop interrupt_on duplication 2026-05-14 17:41:07 +02:00
CREDO23
d68280113b multi_agent_chat/connectors+builtins: adopt symmetric self_gated_tool_permission_row helper 2026-05-14 17:40:59 +02:00
CREDO23
a06aec2821 multi_agent_chat/subagents: HITL umbrella + ToolKind rename 2026-05-14 17:40:29 +02:00