Commit graph

5750 commits

Author SHA1 Message Date
Matt Van Horn
f0a51fad6f
docs(editor): align PlateEditor onSave JSDoc with Mod+Shift+S chord
Per #1373, the registered save chord is Mod+Shift+S (not Mod+S, which
collides with the browser's Save-Page-As). The JSDoc on PlateEditorProps.onSave
still claims Mod+S, which is misleading for downstream consumers of the
component. Update the JSDoc to match the actual chord and call out why.

Targeting dev per maintainer request.
2026-05-15 09:06:42 -07:00
Rohan Verma
eea2d68098
Merge pull request #1396 from guangyang1206/fix/shared-thread-timestamp-formatter-1376
feat(shared): extract formatThreadTimestamp helper for chats sidebars…
2026-05-15 04:55:47 -07:00
Rohan Verma
7f66159af1
Merge pull request #1391 from guangyang1206/fix/log-mutations-invalidate-all-keys-1369
fix(web): invalidate all log cache keys on log mutations
2026-05-15 04:55:25 -07:00
Rohan Verma
9475036b8a
Merge pull request #1389 from CREDO23/feature/multi-agent
[Feature] Fix multi-agent delegation: orchestrator-only main agent with knowledge_base specialist
2026-05-15 04:54:17 -07:00
Rohan Verma
4ab9544a66
Merge pull request #1382 from mvanhorn/osc/1372-use-canonical-log-types
refactor(use-logs): use canonical log types from contracts/types/log.types
2026-05-15 04:49:21 -07:00
Rohan Verma
4db3cf7fd5
Merge pull request #1377 from AnishSarkar22/feat/e2e-testing-ci
feat: add E2E CI and harden Docker build migrations
2026-05-15 04:47:26 -07:00
DESKTOP-RTLN3BA\$punk
e8aad48ddf refactor(report): enhance citations and clarify implementation details
Updated the multimodal_doc_parser_compare_n171_report.md to include detailed code citations for preprocessing costs and retry logic. Improved clarity on the implementation of the retry mechanism and its impact on failure rates. Added a new section for a code citations index to ensure reproducibility of technical claims.

This enhances the report's transparency and allows readers to trace the source of each claim back to the codebase.
2026-05-14 20:07:14 -07:00
DESKTOP-RTLN3BA\$punk
9bcd50164d feat(evals): publish multimodal_doc parser_compare benchmark + n=171 report
Adds the full parser_compare experiment for the multimodal_doc suite:
six arms compared on 30 PDFs / 171 questions from MMLongBench-Doc with
anthropic/claude-sonnet-4.5 across the board.

Source code:
- core/parsers/{azure_di,llamacloud,pdf_pages}.py: direct parser SDK
  callers (Azure Document Intelligence prebuilt-read/layout, LlamaParse
  parse_page_with_llm/parse_page_with_agent) used by the LC arms,
  bypassing the SurfSense backend so each (basic/premium) extraction
  is a clean A/B independent of backend ETL routing.
- suites/multimodal_doc/parser_compare/{ingest,runner,prompt}.py:
  six-arm benchmark (native_pdf, azure_basic_lc, azure_premium_lc,
  llamacloud_basic_lc, llamacloud_premium_lc, surfsense_agentic) with
  byte-identical prompts per question, deterministic grader, Wilson
  CIs, and the per-page preprocessing tariff cost overlay.
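
The Wilson CIs mentioned in the list above can be sketched as follows. This is a minimal illustration, not the repository's code: the function name `wilson_interval`, the 95% default for `z`, and the example counts are all assumptions made for this sketch.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (hypothetical helper).

    z=1.96 corresponds to a two-sided 95% interval; the level the benchmark
    actually uses is not stated in the commit message.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie in [0, n]")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point drift at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


# Illustrative counts only (not results from the report): 85 correct of 171.
low, high = wilson_interval(85, 171)
```

Unlike the normal-approximation interval, the Wilson interval stays inside [0, 1] and behaves sensibly when accuracy is near 0 or 1, which may be why it suits per-arm accuracy on a few hundred questions.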

Reproducibility:
- pyproject.toml + uv.lock pin pypdf, azure-ai-documentintelligence,
  llama-cloud-services as new deps.
- .env.example documents the AZURE_DI_* and LLAMA_CLOUD_API_KEY env
  vars now required for parser_compare.
- 12 analysis scripts under scripts/: retry pass with exponential
  backoff, post-retry accuracy merge, McNemar / latency / per-PDF
  stats, context-overflow hypothesis test, etc. Each produces one
  number cited by the blog report.
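
A retry pass with exponential backoff, as named in the scripts bullet above, might look roughly like this. It is a sketch under assumptions: `retry_with_backoff`, its parameters, the doubling schedule, and the 10% jitter are hypothetical choices, not the behavior of the repository's scripts.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, doubling the wait after each failure.

    The delay before retry k (0-based) is min(max_delay, base_delay * 2**k)
    plus up to 10% random jitter. The last failure is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2**attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; a caller would normally leave it at the default.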

Citation surface:
- reports/blog/multimodal_doc_parser_compare_n171_report.md: 1219-line
  technical writeup (16 sections) covering headline accuracy, per-format
  accuracy, McNemar pairwise significance, latency / token / per-PDF
  distributions, error analysis, retry experiment, post-retry final
  accuracy, cost amortization model with closed-form derivation, threats
  to validity, and reproducibility appendix.
- data/multimodal_doc/runs/2026-05-14T00-53-19Z/parser_compare/{raw,
  raw_retries,raw_post_retry}.jsonl + run_artifact.json + retry summary
  whitelisted via data/.gitignore as the verifiable numbers source.
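
The McNemar pairwise significance mentioned above compares two arms on the same questions using only the discordant pairs. The sketch below uses the exact (binomial) form of the test; whether the report uses the exact or the chi-square variant is not stated in the commit message, and the function names are hypothetical.

```python
from math import comb
from typing import Sequence


def discordant_counts(arm_a: Sequence[bool], arm_b: Sequence[bool]) -> tuple[int, int]:
    """Return (b, c): questions only arm A got right, and only arm B got right."""
    if len(arm_a) != len(arm_b):
        raise ValueError("arms must be graded on the same questions")
    b = sum(1 for a, other in zip(arm_a, arm_b) if a and not other)
    c = sum(1 for a, other in zip(arm_a, arm_b) if not a and other)
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant counts b and c.

    Under the null hypothesis each discordant pair is a fair coin flip,
    so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)
```

Questions that both arms answer correctly, or both answer incorrectly, do not affect the p-value; only disagreements carry information about which arm is better.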

Gitignore:
- ignore logs_*.txt + retry_run.log; structured artifacts cover the
  citation surface, debug logs are noise.
- data/.gitignore default-ignores everything, whitelists the n=171 run
  artifacts only (parser manifest left ignored to avoid leaking local
  Windows usernames in absolute paths; manifest is fully regenerable
  via 'ingest multimodal_doc parser_compare').
- reports/.gitignore now whitelists hand-curated reports/blog/.

Also retires the abandoned CRAG Task 3 implementation (download script,
streaming Task 3 ingest, CragTask3Benchmark + tests) and trims the
runner / ingest module APIs to match.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 19:54:41 -07:00
DESKTOP-RTLN3BA\$punk
3737118050 chore: evals 2026-05-13 14:02:26 -07:00
guangyang1206
b7b4443276 fix(web): invalidate all log cache keys on log mutations
Fixes #1369 — log create/update/delete mutations did not invalidate
the query keys that useLogs actually subscribes to, causing UI staleness.

Replace narrow invalidations (list, summary) with prefix-level
invalidation (["logs"]) to cover withQueryParams, list, summary
and detail in one shot.
2026-05-13 20:59:08 +08:00
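
The difference between narrow and prefix-level invalidation in the commit above can be sketched as follows. The actual fix lives in the TypeScript web client; this Python model is only an illustration of the matching rule, and the cache shape, the key tuples, and both function names are hypothetical.

```python
from typing import Any


def key_has_prefix(key: tuple, prefix: tuple) -> bool:
    """True when the query key starts with every element of the prefix."""
    return key[: len(prefix)] == prefix


def invalidate(cache: dict[tuple, Any], prefix: tuple) -> list[tuple]:
    """Drop every cached entry whose key starts with prefix; return the dropped keys."""
    stale = [key for key in cache if key_has_prefix(key, prefix)]
    for key in stale:
        del cache[key]
    return stale


# Hypothetical key shapes modeled on the names in the commit message.
cache = {
    ("logs", "list"): "...",
    ("logs", "summary"): "...",
    ("logs", "detail", 7): "...",
    ("logs", "withQueryParams", "page=1"): "...",
    ("threads", "list"): "...",
}

# A narrow invalidation misses the withQueryParams and detail entries:
#   invalidate(cache, ("logs", "list"))
# The prefix form covers every key under "logs" in one call:
dropped = invalidate(cache, ("logs",))
```

After the prefix call only the unrelated `("threads", "list")` entry remains, which is the staleness fix the commit describes.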
Anish Sarkar
883c72396c chore: add minimumReleaseAge configuration to pnpm workspace for dependency management 2026-05-13 03:38:04 +05:30
CREDO23
246dae40a8 Merge upstream/dev into feature/multi-agent 2026-05-12 21:23:37 +02:00
CREDO23
6b60d324a3 multi_agent_chat/main_agent: one specialist per task; advertise write_todos for multi-turn plans 2026-05-12 20:39:14 +02:00
Anish Sarkar
6eb900cb0f chore: update packageManager version to pnpm@10.26.0 in both desktop and web projects 2026-05-12 23:59:58 +05:30
CREDO23
379cc992f4 multi_agent_chat/subagents: expose knowledge_base as ask_knowledge_base tool for siblings 2026-05-12 20:03:59 +02:00
CREDO23
f2f62c1c05 multi_agent_chat/permissions: break circular import in interrupt subpackage 2026-05-12 18:20:07 +02:00
CREDO23
d843468256 multi_agent_chat/subagents: dict-keyed middleware_stack + always-on KB 2026-05-12 18:04:54 +02:00
CREDO23
eee861bb3d multi_agent_chat/main_agent: rewrite system prompt to hierarchical prompts/ tree 2026-05-12 15:35:48 +02:00
CREDO23
9b82f2db1d multi_agent_chat/permissions: clone PermissionMiddleware with SRP split and edit support 2026-05-12 12:58:53 +02:00
CREDO23
3f77c74daf multi_agent_chat: drop general_purpose subagent and dead permission plumbing 2026-05-12 12:00:59 +02:00
CREDO23
3fb1976886 multi_agent_chat/main_agent: route KB work through task(knowledge_base) in <tool_routing> 2026-05-12 11:01:54 +02:00
CREDO23
ea72625a81 multi_agent_chat/main_agent: strip FS toolset + FileIntent from main-agent stack (router-only) 2026-05-12 10:57:36 +02:00
CREDO23
bce21dc4ce subagents/knowledge_base: universalize KB subagent across cloud + desktop modes 2026-05-12 10:51:32 +02:00
CREDO23
3adfa37565 multi_agent_chat/filesystem: extract dedicated FS middleware package 2026-05-12 10:43:45 +02:00
Anish Sarkar
275e2c9e83 chore: fix linting 2026-05-12 04:00:04 +05:30
Anish Sarkar
4dbadbf159 chore: update .gitignore and biome.json to include additional test-related directories and files for improved E2E testing 2026-05-12 03:59:52 +05:30
Anish Sarkar
bed2041a1b chore: modify E2E test configuration by updating global LLM model IDs to negative values for improved test isolation 2026-05-12 03:30:01 +05:30
Anish Sarkar
0b9fc00663 chore: update global LLM config fixture to include both premium and free models for comprehensive E2E testing 2026-05-12 03:00:35 +05:30
Anish Sarkar
650b691a39 chore: enhance E2E tests by adding synthetic global LLM config and updating environment variables for Google OAuth 2026-05-12 02:37:39 +05:30
Anish Sarkar
315329f344 chore: update E2E tests workflow to capture logs on cancellation and add shared volume for backend services 2026-05-12 01:35:33 +05:30
DESKTOP-RTLN3BA\$punk
2402b730fa chore: untrack accidentally embedded hermes-agent repo
It was committed as a gitlink (mode 160000) in 81583ef3 despite being
listed in .gitignore, because ignore rules don't apply to already-tracked
paths. Remove it from the index and add a slash-less pattern as a guard
against the gitlink form being re-added.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 12:50:13 -07:00
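
One way to spot an embedded repository like the one described above is to scan the index for entries with mode 160000. The sketch below parses the text that `git ls-files --stage` prints (`<mode> <object> <stage>`, a tab, then the path); the helper name and the sample object IDs are placeholders, not values from this repository.

```python
GITLINK_MODE = "160000"  # index mode git uses for submodule/gitlink entries


def find_gitlinks(ls_files_stage_output: str) -> list[str]:
    """Return paths recorded as gitlinks in `git ls-files --stage` output."""
    paths = []
    for line in ls_files_stage_output.splitlines():
        if not line.strip():
            continue
        # Metadata and path are separated by a single tab.
        meta, _, path = line.partition("\t")
        mode = meta.split()[0]
        if mode == GITLINK_MODE:
            paths.append(path)
    return paths


# Placeholder object IDs; real output would come from running the command.
sample = (
    "100644 " + "a" * 40 + " 0\tREADME.md\n"
    "160000 " + "b" * 40 + " 0\thermes-agent\n"
    "100755 " + "c" * 40 + " 0\tscripts/run.sh\n"
)
gitlinks = find_gitlinks(sample)
```

Because ignore rules do not apply to paths already in the index, a check like this catches the case that `.gitignore` alone cannot.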
DESKTOP-RTLN3BA\$punk
ec957e6fae Merge commit 'd6618b8357' into dev 2026-05-11 12:35:04 -07:00
Anish Sarkar
c052fc9304 chore: add fake DoclingService for E2E tests and integrate into runtime fakes 2026-05-12 00:30:16 +05:30
CREDO23
df2afed18d subagents/knowledge_base: wire KB specialist into orchestrator (renderer/projector split, FS middleware stack, cloud-mode gating) 2026-05-11 20:43:44 +02:00
Rohan Verma
d6618b8357
Merge pull request #1384 from MODSetter/chore/hide-blog-nav
chore: hide blog from navbar until published
2026-05-11 11:20:45 -07:00
DESKTOP-RTLN3BA\$punk
b7e31f2974 chore: update .gitignore 2026-05-11 11:12:06 -07:00
DESKTOP-RTLN3BA\$punk
81583ef382 chore: hide blog until published 2026-05-11 11:08:42 -07:00
CREDO23
09fc99c435 subagents/knowledge_base: scaffold KB specialist subagent (description, system_prompt with infer-first path resolution + discover-existing-conventions principle, factory shell; not yet wired into registry) 2026-05-11 17:25:01 +02:00
CREDO23
83b51313ee multi_agent_chat/middleware: drop dormant LLMToolSelectorMiddleware from main-agent stack (gate is >30 tools; multi-agent main carries ~20) 2026-05-11 17:24:48 +02:00
Anish Sarkar
b247ff37df chore: implement test-only token mint endpoint and update E2E test authentication flow 2026-05-11 19:48:18 +05:30
CREDO23
44fcb34708 refactor(teams subagent): rewrite system_prompt with native-tool heuristic pattern; trim description to actual tool surface 2026-05-11 14:59:13 +02:00
CREDO23
f45a42e2f6 refactor(luma subagent): rewrite system_prompt with native-tool heuristic pattern; polish description with user-surface verbs 2026-05-11 14:59:06 +02:00
CREDO23
f383de04a4 refactor(discord subagent): rewrite system_prompt with native-tool heuristic pattern; trim description to actual tool surface 2026-05-11 14:58:57 +02:00
CREDO23
6ef4f5ff45 refactor(google_drive subagent): rewrite system_prompt with native-tool heuristic pattern; trim description to actual tool surface 2026-05-11 14:50:05 +02:00
CREDO23
68a3f03347 subagents/onedrive: rewrite system prompt on the native-tool shape (always-Word constraint with block-on-other-formats, KB-indexed name resolution, outcome mapping) and trim description verbing to match actual tool surface. 2026-05-11 14:44:20 +02:00
CREDO23
9d6f0d732f subagents/dropbox: rewrite system prompt on the native-tool shape (Paper-vs-Docx file-type signals, KB-indexed name resolution, outcome mapping) and trim description verbing to match actual tool surface. 2026-05-11 14:41:23 +02:00
CREDO23
ddcb5e26e5 subagents/confluence: rewrite system prompt on the native-tool shape (HTML storage-format guidance, REPLACE-semantics-with-no-read limitation, outcome mapping) and trim description verbing to match actual tool surface. 2026-05-11 14:36:42 +02:00
CREDO23
99610ea2d9 subagents/calendar: rewrite system prompt on the native-tool shape (infer-first inputs, all-day vs timed datetime semantics, search-disambiguation, outcome mapping) and trim description verbing to match actual tool surface. 2026-05-11 14:32:26 +02:00
CREDO23
2f9b06832f subagents/gmail: rewrite system prompt on the native-tool shape (infer-first inputs, irreversibility safety, outcome mapping, MCP-aligned contract) and trim description verbing to match actual tool surface. 2026-05-11 14:24:04 +02:00
Anish Sarkar
741d6e7eea chore: update pnpm workspace configuration to allow builds for specified dependencies 2026-05-11 17:02:06 +05:30