SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-05-25 19:15:18 +02:00

Author	SHA1	Message	Date
Rohan Verma	69388fc710	Merge pull request #1429 from CREDO23/fix-desktop-redirects [Fixes] Packaged desktop: connector redirect + linux launcher icon	2026-05-23 15:51:59 -07:00
Anish Sarkar	98e3950dc8	Merge remote-tracking branch 'upstream/dev' into feat/opentelemetry	2026-05-23 03:21:08 +05:30
Anish Sarkar	4c8d47617d	feat(env): add SURFSENSE_ENV variable for deployment environment and update observability resource attributes	2026-05-23 02:13:24 +05:30
Anish Sarkar	df698e0216	feat(observability): integrate OpenTelemetry collector and configuration for enhanced telemetry	2026-05-23 00:17:23 +05:30
CREDO23	d97b2830c5	fix: resolve desktop KB prompt self-contradiction on chunk_ids The citations fix (`cacb27e0`) added a "Chunk citations in your prose" section to system_prompt_desktop.md telling the KB subagent to always leave `evidence.chunk_ids` null and emit no `[citation:...]` markers in desktop mode, but left the pre-existing line declaring that `chunk_ids` apply to `<priority_documents>` hits. The two rules contradicted each other; the model picked one per turn. Strike the stale conditional clause and point at the dedicated section as the single source of truth. Matches the parallel line in system_prompt_cloud.md and the already-consistent system_prompt_readonly_desktop.md.	2026-05-22 17:24:57 +02:00
Anish Sarkar	51e4d8b489	feat(tasks): enhance Celery task telemetry with queue metadata and latency tracking	2026-05-22 18:19:38 +05:30
Anish Sarkar	7a3b278b75	feat(connectors): add retry and auth telemetry events	2026-05-22 17:50:02 +05:30
Anish Sarkar	c4abbd6e20	feat(pipeline): enrich ETL and indexing failure telemetry	2026-05-22 17:49:46 +05:30
Anish Sarkar	6e03ab044a	feat(tasks): measure Celery queue latency	2026-05-22 17:49:02 +05:30
Anish Sarkar	dc893281ba	feat(chat): add model retry and stream lifecycle events	2026-05-22 17:48:43 +05:30
Anish Sarkar	dbb652d4f8	feat(observability): add telemetry error and event helpers	2026-05-22 17:48:01 +05:30
Anish Sarkar	87a4dcfd05	feat(tasks): record indexing heartbeat metrics	2026-05-22 13:50:32 +05:30
Anish Sarkar	7c07c220fc	feat(connectors): add connector sync spans	2026-05-22 13:49:59 +05:30
Anish Sarkar	4e3a6dff46	feat(etl): instrument extraction spans and outcomes	2026-05-22 13:49:42 +05:30
Anish Sarkar	8bca29fe0d	feat(agents): track subagent invocation telemetry	2026-05-22 13:48:57 +05:30
Anish Sarkar	5a6b92c2b6	feat(chat): instrument streamed chat request telemetry	2026-05-22 13:48:19 +05:30
Anish Sarkar	f7f49de109	feat(observability): add chat subagent and ETL telemetry primitives	2026-05-22 13:47:50 +05:30
Anish Sarkar	21d9b1f218	fix(observability): sanitize outbound HTTP span URLs	2026-05-22 13:47:10 +05:30
DESKTOP-RTLN3BA\$punk	2e589091d8	feat: bumped version to 0.0.25	2026-05-21 14:44:33 -07:00
DESKTOP-RTLN3BA\$punk	cacb27e007	fix: citations in agent responses	2026-05-21 14:41:32 -07:00
Anish Sarkar	cea5605e32	feat(indexing): track indexing and connector outcomes	2026-05-21 23:03:43 +05:30
Anish Sarkar	b9d76f006d	feat(retriever): instrument knowledge base search	2026-05-21 23:03:31 +05:30
Anish Sarkar	53691f9c51	feat(agents): track permission and compaction events	2026-05-21 23:02:54 +05:30
Anish Sarkar	ea3d0a6463	feat(agents): emit metrics for model and tool calls	2026-05-21 23:02:36 +05:30
Anish Sarkar	6095b48b5f	feat(observability): add SurfSense metric helpers	2026-05-21 23:02:20 +05:30
Anish Sarkar	eb2e2b253b	feat(observability): add OpenTelemetry process bootstrap	2026-05-21 23:01:54 +05:30
Anish Sarkar	60049936e3	chore(dev): add local OpenTelemetry backend configuration	2026-05-21 23:00:56 +05:30
Anish Sarkar	2fd05d720e	chore: add OpenTelemetry dependencies and update lock file	2026-05-21 17:23:41 +05:30
CREDO23	49da7a57df	Merge remote-tracking branch 'upstream/dev' into improvement-agent-speed Resolves: surfsense_backend/app/agents/new_chat/middleware/memory_injection.py - Took both imports: upstream moved MEMORY_HARD_LIMIT/SOFT_LIMIT to app.services.memory; kept our perf-logger import for timing. Pulls in upstream changes: - Memory document feature (services/memory refactor, removal of app.agents.new_chat.memory_extraction and background extraction in stream_new_chat — agent now drives memory via update_memory tool). - BACKEND_URL env refactor across web tool-ui/editor/chat/dashboard/lib. - GitHub Actions backend test workflow + pre-commit biome bump. - Token-display polish in MessageInfoDropdown; save_memory no-update sentinel. Verified: 1723 unit tests pass, ruff clean. No semantic regression in stream_new_chat (their memory-extraction deletion and our preflight removal touch different functions).	2026-05-20 21:23:48 +02:00
Rohan Verma	55cce4ea59	Merge pull request #1414 from AnishSarkar22/feature/memory-support-document-panel feat: improve memory extraction & add document-panel memory editing	2026-05-20 12:12:27 -07:00
CREDO23	d5ee8cc4cd	Merge remote-tracking branch 'upstream/dev' into improvement-agent-speed	2026-05-20 19:22:49 +02:00
CREDO23	2be3f04df5	chore(scripts): drop one-off MCP session lifetime probe The probe answered its question (informing the cached_tools persistence design). Future MCP session-pooling work, if revived, can recreate it.	2026-05-20 19:11:00 +02:00
CREDO23	704d1bf18f	refactor(mcp): per-connector cache refresh on lifecycle events Collapse the invalidate + warmup pair into a single refresh_mcp_tools_cache_for_connector(connector_id, search_space_id) helper and scope live discovery to the one connector that changed instead of the whole search space. - new mcp_tool.discover_single_mcp_connector: load one connector, refresh OAuth if needed, force live MCP discovery so its cached_tools row is rewritten; returned wrappers are discarded since the in-process LRU is rebuilt lazily on the next user query - mcp_tools_cache.refresh_mcp_tools_cache_for_connector: synchronously evicts the per-space LRU (LRU keys cannot scope finer) and schedules the per-connector prefetch via loop.create_task - routes (OAuth callback, MCP POST, MCP PUT) collapse their two back-to-back calls into a single refresh call; DELETE handlers keep using bare invalidate_mcp_tools_cache (nothing to prefetch) No new automated tests: the new functions are I/O glue (DB + network) where mocked unit tests would test implementation rather than behavior. The existing 9 unit tests for the cached_tools data shape are unchanged.	2026-05-20 17:43:27 +02:00
CREDO23	c0aa4261ac	perf(mcp): persist list_tools discovery in connector.config.cached_tools Skip the ~1-3s MCP initialize + list_tools handshake on every cache miss by reading tool definitions from the connector row we already load. Lazy populate on first miss, self-heal on corrupt cache, zero schema migration.	2026-05-20 16:11:07 +02:00
DESKTOP-RTLN3BA\$punk	ed22da7b95	feat: bumped version to 0.0.24	2026-05-20 03:01:37 -07:00
CREDO23	db8bffab38	perf(prompt-cache): enable Azure prompt_cache_key routing hint Splits the OpenAI-family gate into per-param predicates so AZURE and AZURE_OPENAI configs now receive prompt_cache_key for backend routing affinity (Microsoft auto-caches GPT-4o+ deployments at >=1024 tokens; the key clusters same-prefix requests on the same GPU pool and raises hit rate on turn 2+). prompt_cache_retention stays opted out for Azure because litellm 1.83.14's Azure transformer would drop it silently; revisit when Azure's supported params list is updated.	2026-05-20 11:58:15 +02:00
CREDO23	71dead0406	perf(kb-planner): route internal planner calls to dedicated small/fast LLM Adds an optional planner LLM role wired through KnowledgePriorityMiddleware so KB query rewriting, date extraction, and recency classification run on a cheap model (e.g. gpt-4o-mini, Haiku, Azure nano) instead of the user's chat LLM. Operators opt in by setting is_planner: true on exactly one global config; without it, behavior is unchanged.	2026-05-20 11:42:52 +02:00
Anish Sarkar	8c9be9796a	feat: add no-update sentinel handling to save_memory function and corresponding unit tests	2026-05-20 15:03:35 +05:30
CREDO23	c3db25302b	perf(chat): kill auto-pin preflight + speculative build, rely on reactive 429 recovery The preflight pattern probed the LLM with a 1-token ping before each cold turn (when requested_llm_config_id==0, llm_config_id<0, and the 45s healthy TTL had expired) to detect 429s before fanning out into planner/classifier/title-gen. To absorb its ~1-5s RTT cost we built the agent speculatively in parallel; on 429 we discarded the build and repinned. Three problems with that design: 1. False security. Provider rate limits are token-bucket. A 1-token ping consumes ~5 tokens; the real request consumes 10-50K. The probe can return 200 while the real call still 429s. 2. Pure overhead in the common case. On warm-agent-cache turns the probe dominates wall time: ~2.5s of TTFT pure tax for ~99% of users who never see a 429. 3. The in-stream recovery loop (catch of _is_provider_rate_limited gated by not _first_event_logged) already does the right thing reactively: mark_runtime_cooldown -> resolve_or_get_pinned_llm_config_id with exclude_config_ids={previous} -> rebuild agent -> retry the stream. Preflight was never the only safety net; it was a redundant probe in front of one. Changes: - Delete _preflight_llm, _settle_speculative_agent_build, and the _PREFLIGHT_TIMEOUT_SEC / _PREFLIGHT_MAX_TOKENS constants. - Drop the parallel agent_build_task / preflight_task plumbing in both stream_new_chat and stream_resume_chat; build the agent inline with await _build_main_agent_for_thread(...). - Drop the unused is_recently_healthy / mark_healthy imports here (still exported from auto_model_pin_service since OpenRouter catalogue refresh and a few tests reference clear_healthy). - Remove the obsolete preflight + settle-speculative tests from test_stream_new_chat_contract.py. Net: -447 LOC. ~2.5s removed from TTFT on every cold preflight-eligible turn. 429 recovery path is unchanged - same repin/rebuild/retry, just not paid in advance on the healthy path.	2026-05-20 11:03:08 +02:00
Anish Sarkar	132e7b3c44	refactor: remove memory extraction functions and related components from the new chat agent	2026-05-20 14:03:28 +05:30
DESKTOP-RTLN3BA\$punk	b285293b4e	fix: docker one click setup	2026-05-20 01:25:07 -07:00
CREDO23	1791241c0c	perf(indexers): offload sync embed_text to thread across background workers Connector kb_sync_services (gmail, onedrive, google_calendar, jira), streaming indexers (discord, luma, teams) and the file-processor save path all called embed_text inside async coroutines, blocking the background worker's event loop for the duration of the embed. Wrap each call site in asyncio.to_thread so concurrent indexing tasks stop serialising on the embed.	2026-05-20 10:09:38 +02:00
CREDO23	a8de98895a	perf(revert-service): offload sync embed_texts to thread _restore_in_place_document and _reinsert_document_from_revision are async helpers invoked by the synchronous-feeling POST /api/threads/.../revert route; both ran embed_texts inline, blocking the event loop while the HTTP client waited.	2026-05-20 10:04:26 +02:00
CREDO23	a3d6fa6196	perf(document-converters): offload sync embed_text/embed_texts to thread generate_document_summary and create_document_chunks are async helpers called from the chat path and from many connector indexers. Both wrapped embed_text/embed_texts directly inside the coroutine, blocking the event loop for the full duration of the embedding call.	2026-05-20 10:03:42 +02:00
CREDO23	52d425f170	perf(kb-persistence): offload sync embed_texts to thread _create_document and _update_document run on the chat critical path when the filesystem subagent writes via the user's chat turn. Both called embed_texts synchronously inside an async coroutine, blocking the event loop for the duration of the embed.	2026-05-20 10:03:14 +02:00
CREDO23	4fa85a9a94	perf(kb-search): offload sync embed_texts to thread embed_texts holds a threading.Lock and runs a sync embedding call inside search_knowledge_base, an async coroutine on the KB priority middleware critical path. Blocking the event loop here stalls every other coroutine on the worker (SSE keepalives, concurrent chat requests, background tasks). Wrap in asyncio.to_thread so the embed runs on the default executor pool while the loop keeps serving.	2026-05-20 10:02:38 +02:00
CREDO23	32f6766cb6	fix(tokens): use canonical prompt_tokens_details path for cache fields LiteLLM normalizes every provider's cache fields onto usage.prompt_tokens_details (cached_tokens + cache_creation_tokens). The earlier fallback to usage.cache_read_input_tokens / usage.cache_creation_input_tokens was wrong: Anthropic-shaped fields only live there via a trailing setattr loop, and the canonical field name on the wrapper is cache_creation_tokens (not _input_tokens).	2026-05-20 09:55:39 +02:00
CREDO23	6090980c5e	obs(tokens): log prompt-cache read/write counts and hit ratio per LLM call	2026-05-20 09:51:44 +02:00
Anish Sarkar	a0ff86e0e8	feat: add memory document model and parsing functionality for markdown handling	2026-05-20 13:20:05 +05:30
CREDO23	0cdda14922	perf(kb subagent, desktop): cap evidence.content_excerpt to 500 chars	2026-05-20 09:43:36 +02:00
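
Several of the perf commits above (4fa85a9a94, 52d425f170, a3d6fa6196, a8de98895a, 1791241c0c) describe the same change: wrapping a synchronous embedding call in asyncio.to_thread so it no longer blocks the event loop. The sketch below illustrates that pattern only. The embed_texts body, the lock, and the timings are hypothetical stand-ins, not the SurfSense implementation; the function names are reused from the commit messages for readability.

```python
import asyncio
import threading
import time

# Hypothetical stand-in: the commit message says the real embed_texts holds a
# threading.Lock and makes a synchronous embedding call.
_embed_lock = threading.Lock()


def embed_texts(texts):
    """Blocking embedding call (simulated with a sleep)."""
    with _embed_lock:
        time.sleep(0.2)  # pretend this is model or network work
        return [[float(len(t))] for t in texts]


async def search_knowledge_base(query):
    # Run the blocking call on the default executor so the event loop
    # stays free for other coroutines while the embed is in flight.
    (vector,) = await asyncio.to_thread(embed_texts, [query])
    return vector


async def count_ticks(stop):
    """Stand-in for other work on the loop (e.g. SSE keepalives)."""
    ticks = 0
    while not stop.is_set():
        await asyncio.sleep(0.01)
        ticks += 1
    return ticks


async def demo():
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    vector = await search_knowledge_base("hello")
    stop.set()
    ticks = await ticker
    return vector, ticks


vector, ticks = asyncio.run(demo())
print(vector, ticks)
```

If embed_texts were called directly inside search_knowledge_base, the ticker coroutine would not advance at all during the 0.2 s embed; with asyncio.to_thread (Python 3.9+) it keeps running.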


2278 commits
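
Commit c3db25302b argues that a 1-token preflight ping gives false security because provider rate limits are token-bucket based. A toy model of that argument follows. The TokenBucket class and the bucket sizes are illustrative assumptions, refill over time is deliberately left out, and only the "~5 tokens" probe cost and the "10-50K" request range come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Toy rate limiter: a request passes only if enough tokens remain."""
    capacity: int
    available: int

    def try_consume(self, cost: int) -> bool:
        if cost > self.available:
            return False  # the provider would answer 429 here
        self.available -= cost
        return True


# Assumed state: the bucket is mostly drained by earlier traffic.
bucket = TokenBucket(capacity=60_000, available=8_000)

probe_ok = bucket.try_consume(5)       # 1-token ping, ~5 tokens with overhead
real_ok = bucket.try_consume(20_000)   # real turn, inside the 10-50K range

print(probe_ok, real_ok)
```

The probe succeeds while the real request is rejected, which is the case the commit describes: a healthy preflight result says little about whether the full request will fit.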