SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-06-20 21:18:13 +02:00

Author	SHA1	Message	Date
CREDO23	6d1879ffcb	continue indexing when notification creation fails	2026-06-17 15:06:05 +02:00
DESKTOP-RTLN3BA\$punk	0fe650fd8e	Merge commit '`7ce409c580`' into dev	2026-06-16 22:48:14 -07:00
DESKTOP-RTLN3BA\$punk	b9702b3245	chore: linting	2026-06-16 16:27:16 -07:00
DESKTOP-RTLN3BA\$punk	da64433439	fix(db): reap orphaned idle-in-transaction sessions on the Celery engine. The long-running ingestion/podcast/video tasks run on a separate Celery engine (NullPool), so the web engine's idle_in_transaction_session_timeout did not cover them, which is exactly where the original 11h zombie (INSERT INTO chunks) came from. Apply the same protection to the Celery engine with a generous 60-minute default so a worker that hangs/crashes mid-transaction can't hold locks on documents/chunks indefinitely, while never reaping a legitimate per-document embed window. - config + .env.example: DB_CELERY_IDLE_IN_TX_TIMEOUT_MS (default 3600000). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-16 16:26:04 -07:00
CREDO23	32a6e54ce6	Merge remote-tracking branch 'upstream/dev' into features/documents-injestion-layered-cached	2026-06-14 11:30:33 +02:00
Anish Sarkar	c7409c8995	chore: ran linting	2026-06-13 21:59:35 +05:30
Anish Sarkar	ab5423d2d2	Merge remote-tracking branch 'upstream/dev' into feat/unified-model-connections	2026-06-13 19:04:49 +05:30
Anish Sarkar	8fe9c21e76	feat(token-tracking): add model metadata registration and enhance token usage tracking	2026-06-13 03:08:35 +05:30
CREDO23	5a71769dba	fix(chunks): set position on remaining chunk insert paths document_converters, the github size-fallback chunker, revert_service restores, and the kb-persistence middleware now write explicit positions (the middleware read path also orders by position).	2026-06-12 18:53:08 +02:00
CREDO23	0fb1d3d37b	feat(etl-cache): route all file-based sources through the parse cache Every file ingestion path (Dropbox, Google Drive / Composio Drive, OneDrive, local folder, Obsidian, and the legacy upload handlers) now parses via the extract_with_cache facade instead of calling EtlPipelineService.extract directly, so identical bytes are deduplicated globally regardless of source. vision_llm is passed through, keeping the existing cacheability gate intact.	2026-06-12 14:47:25 +02:00
CREDO23	0dc2ccc003	feat(tasks): route extraction through etl cache	2026-06-12 11:23:50 +02:00
DESKTOP-RTLN3BA\$punk	c855be8ccd	fix(auto_reload): update task to use a lambda for user_id in async call	2026-06-11 16:51:18 -07:00
Anish Sarkar	8e8cf96faa	feat(error-handling): implement LLM error adaptation and classification for chat streaming - Introduced LLMErrorCategory and adapt_llm_exception to normalize LLM exceptions. - Updated llm_retryable_message and llm_permanent_message to utilize the new adaptation logic. - Enhanced classify_stream_exception to classify provider errors and return user-friendly messages. - Added tests for error classification and adaptation to ensure robustness. - Updated frontend error handling to display appropriate messages based on new classifications.	2026-06-12 05:03:14 +05:30
DESKTOP-RTLN3BA\$punk	05190da0a9	chore: linting	2026-06-11 15:31:43 -07:00
Anish Sarkar	908790e40f	Merge remote-tracking branch 'upstream/dev' into feat/unified-model-connections	2026-06-12 03:15:28 +05:30
CREDO23	41f4a58663	Merge remote-tracking branch 'upstream/dev' into improvement-podcast-graph # Conflicts: # surfsense_backend/app/tasks/celery_tasks/podcast_tasks.py	2026-06-11 23:14:49 +02:00
Anish Sarkar	3dd54230e7	fix(chat): normalize provider-safe message history	2026-06-12 02:17:37 +05:30
Anish Sarkar	5d5d574550	refactor(model-connections): move backend model connections to provider capabilities	2026-06-12 02:17:22 +05:30
Anish Sarkar	c28c4f5785	feat(chat): route models by provider capabilities	2026-06-11 18:22:23 +05:30
CREDO23	eb56acc407	refactor(podcasts): regenerate via brief gate, render brief inline in chat	2026-06-11 11:45:17 +02:00
DESKTOP-RTLN3BA\$punk	a7407502d3	feat(refactor): refactor payment system to implement unified credit wallet. - Updated environment variables and configurations for credit purchases via Stripe, replacing legacy page pack system. - Introduced auto-reload feature for credit top-ups and modified database models to track credit transactions. - Updated notification system to handle insufficient credits and auto-reload failures. - Adjusted API routes and schemas to reflect changes in credit management.	2026-06-10 16:49:03 -07:00
CREDO23	97ab7a88fd	refactor(podcasts): remove legacy podcaster agent, task, and schema	2026-06-10 21:45:04 +02:00
CREDO23	3eb7cdb2d8	refactor(podcasts): gate chat-triggered podcast on brief review	2026-06-10 21:44:50 +02:00
CREDO23	ba687813c1	fix(elasticsearch): commit failed status immediately	2026-06-10 00:10:52 +02:00
CREDO23	c26181d086	fix(airtable): commit failed status immediately	2026-06-10 00:10:52 +02:00
CREDO23	e3afe9d7c7	fix(luma): commit failed status immediately	2026-06-10 00:10:52 +02:00
CREDO23	8191118eb4	fix(bookstack): commit failed status immediately	2026-06-10 00:10:52 +02:00
CREDO23	45438249b6	fix(clickup): commit failed status immediately	2026-06-10 00:10:52 +02:00
CREDO23	f5dd8f3985	fix(github): commit failed status immediately	2026-06-10 00:10:52 +02:00
CREDO23	f085ac59e5	fix(teams): commit failed status immediately	2026-06-10 00:10:52 +02:00
CREDO23	791b0afe16	fix(discord): commit failed status immediately	2026-06-10 00:10:52 +02:00
CREDO23	be8a3bcd00	fix(slack): commit failed status immediately	2026-06-10 00:10:52 +02:00
CREDO23	c47949791b	fix(confluence): fail skipped placeholders so they don't stay pending	2026-06-10 00:10:42 +02:00
CREDO23	d70d01f331	fix(linear): fail skipped placeholders so they don't stay pending	2026-06-10 00:10:42 +02:00
CREDO23	1b0912aaa3	fix(calendar): fail skipped placeholders so they don't stay pending	2026-06-10 00:10:42 +02:00
CREDO23	b2c2fc9c2e	fix(gmail): fail skipped placeholders so they don't stay pending	2026-06-10 00:10:42 +02:00
CREDO23	90b32a8880	fix(notion): fail skipped placeholders so they don't stay pending	2026-06-10 00:10:42 +02:00
CREDO23	33300e4faa	fix(dropbox): sanitize ETL reason and retry stuck pending/processing files	2026-06-10 00:10:25 +02:00
CREDO23	464e7d4554	fix(onedrive): sanitize ETL reason and retry stuck pending/processing files	2026-06-10 00:10:25 +02:00
CREDO23	c0c5f3414e	fix(google-drive): sanitize ETL reason and retry stuck pending/processing files	2026-06-10 00:10:25 +02:00
CREDO23	e45e8389dc	fix(dropbox): mark documents failed on ETL failure	2026-06-09 23:39:25 +02:00
CREDO23	82aaaa5a9f	fix(onedrive): mark documents failed on ETL failure	2026-06-09 23:39:25 +02:00
CREDO23	6fd95f82b4	fix(google-drive): mark placeholders failed on ETL failure	2026-06-09 23:39:25 +02:00
CREDO23	cb10882dc8	feat(indexers): add mark_connector_documents_failed helper	2026-06-09 23:39:25 +02:00
DESKTOP-RTLN3BA\$punk	ce952d2ad1	chore: linting	2026-06-09 00:42:26 -07:00
DESKTOP-RTLN3BA\$punk	640ef5f15d	feat(proxy): integrate Scrapling for enhanced web scraping capabilities - Replaced Playwright with Scrapling's fetchers in the web crawling and YouTube processing modules for improved performance and flexibility. - Updated proxy configuration to support dynamic proxy selection via environment variables. - Enhanced logging to track performance metrics during web scraping operations. - Refactored related modules to utilize the new proxy utilities and streamline the scraping process.	2026-06-09 00:15:10 -07:00
CREDO23	8bdfd00a15	Merge upstream/dev	2026-06-05 19:18:12 +02:00
CREDO23	0081b627e9	refactor(agents): move kb_persistence middleware into main_agent (owner) The KB-persistence impl lived in shared/middleware/ but no subagent uses it -- consumers are the main_agent builder and the boundary event loop. Colocate with its owner using the folder-per-middleware shape; __init__ re-exports the public surface. Tests that reached module internals now alias the .middleware submodule. main_agent/middleware/kb_persistence.py -> kb_persistence/builder.py shared/middleware/kb_persistence.py -> kb_persistence/middleware.py	2026-06-05 14:11:55 +02:00
CREDO23	a7a642fedc	refactor(agents): move busy_mutex middleware into main_agent (owner) The busy-mutex impl (BusyMutexMiddleware + cancel/turn-lifecycle primitives) lived in shared/middleware/ but no subagent uses it -- consumers are the main_agent builder and the boundary (turn lifecycle). Colocate with its owner using the folder-per-middleware shape; __init__ re-exports the public surface so boundary import sites only change package path: main_agent/middleware/busy_mutex.py -> busy_mutex/builder.py shared/middleware/busy_mutex.py -> busy_mutex/middleware.py	2026-06-05 14:08:45 +02:00
CREDO23	f2a61bc0ef	refactor(agents): consolidate chat runtime infra under chat/runtime Move the lower-level runtime/infra modules out of multi_agent_chat/shared/ (they were never used by subagents, so they failed the shared-by-all-siblings rule) and unify them with the already-relocated checkpointer: agents/runtime/ -> agents/chat/runtime/ mac/shared/errors.py -> chat/runtime/errors.py mac/shared/llm_config.py -> chat/runtime/llm_config.py mac/shared/prompt_caching.py -> chat/runtime/prompt_caching.py mac/shared/mention_resolver.py -> chat/runtime/mention_resolver.py mac/shared/path_resolver.py -> chat/runtime/path_resolver.py These sit below the agent packages: the boundary + agent factory + shared middleware depend on them, and they import no agent code (acyclic).	2026-06-05 13:19:24 +02:00
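
Commit da64433439 above describes giving the Celery engine its own idle-in-transaction timeout, read from DB_CELERY_IDLE_IN_TX_TIMEOUT_MS with a default of 3600000 ms. The following is only a minimal sketch of how such a setting could be resolved and turned into driver connect arguments; it is not the repository's code. The function names are hypothetical, and the `server_settings` shape assumes an asyncpg-style driver, which the commit message does not confirm.

```python
import os

# 60 minutes in milliseconds, the default stated in the commit message.
DEFAULT_CELERY_IDLE_IN_TX_TIMEOUT_MS = 3_600_000


def celery_idle_in_tx_timeout_ms(env=None):
    """Resolve the timeout from DB_CELERY_IDLE_IN_TX_TIMEOUT_MS (hypothetical helper)."""
    env = os.environ if env is None else env
    raw = str(env.get("DB_CELERY_IDLE_IN_TX_TIMEOUT_MS", "")).strip()
    if not raw:
        return DEFAULT_CELERY_IDLE_IN_TX_TIMEOUT_MS
    value = int(raw)  # raises ValueError on non-numeric input
    if value < 0:
        raise ValueError("DB_CELERY_IDLE_IN_TX_TIMEOUT_MS must be >= 0")
    return value


def celery_connect_args(env=None):
    """Build connect args for the Celery engine (assumes asyncpg-style server_settings).

    PostgreSQL reads a bare number for idle_in_transaction_session_timeout
    as milliseconds, and 0 disables the timeout.
    """
    timeout_ms = celery_idle_in_tx_timeout_ms(env)
    return {
        "server_settings": {
            "idle_in_transaction_session_timeout": str(timeout_ms),
        }
    }
```

Under these assumptions the resulting dictionary would be passed as `connect_args` when the NullPool Celery engine is created, so every short-lived worker connection carries the timeout without changing the web engine's setting.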

739 commits