Closes #1295
The connector indexing route's `_run_indexing_with_notifications` set the
Redis heartbeat key once at the start of indexing and relied on
`on_heartbeat_callback` (only fired in Phase 2 per-document loops) to
refresh it. The GitHub connector's Phase 1 runs `gitingest` as a blocking
subprocess via `asyncio.to_thread`, so for any repo whose ingest takes
longer than the 2-minute TTL, the key expires before Phase 2 starts. The
`cleanup_stale_indexing_notifications_task` then marks the document as
failed with the misleading "Sync was interrupted unexpectedly. Please
retry." message — even though the indexing thread is still running and
gitingest's own subprocess timeout is 900 seconds.
Add a background asyncio coroutine that refreshes the Redis key every
60 seconds for the duration of the indexing call. This is the same
pattern already in use at
`app/tasks/celery_tasks/document_tasks.py:_run_heartbeat_loop`, adapted
to use the route's `get_heartbeat_redis_client()` and
`_get_heartbeat_key()` helpers.
Cancellation runs in the `finally` block BEFORE the heartbeat-key
delete so the loop cannot race and re-create the key after we have
deleted it. The new `HEARTBEAT_REFRESH_INTERVAL = 60` constant mirrors
the celery task module's value.
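A minimal sketch of the refresh-then-cancel ordering described above, not
the actual route code. `FakeRedis`, `heartbeat_loop`, `run_with_heartbeat`
and `HEARTBEAT_TTL` are hypothetical names used only for illustration; the
in-memory client stands in for the real Redis client so the sketch runs
without a server.

```python
import asyncio
import contextlib

HEARTBEAT_REFRESH_INTERVAL = 60  # seconds, the interval named in this message
HEARTBEAT_TTL = 120  # seconds, the 2-minute TTL (constant name is assumed)


class FakeRedis:
    """In-memory stand-in for the Redis client (hypothetical)."""

    def __init__(self):
        self.store = {}
        self.writes = 0

    def setex(self, key, ttl, value):
        self.store[key] = (value, ttl)
        self.writes += 1

    def delete(self, key):
        self.store.pop(key, None)


async def heartbeat_loop(client, key, interval=HEARTBEAT_REFRESH_INTERVAL):
    """Refresh the heartbeat key every `interval` seconds until cancelled."""
    while True:
        await asyncio.sleep(interval)
        client.setex(key, HEARTBEAT_TTL, "alive")


async def run_with_heartbeat(client, key, work, interval=HEARTBEAT_REFRESH_INTERVAL):
    """Run `work()` while a background task keeps the heartbeat key fresh."""
    client.setex(key, HEARTBEAT_TTL, "alive")  # initial heartbeat
    task = asyncio.create_task(heartbeat_loop(client, key, interval))
    try:
        return await work()
    finally:
        # Stop the loop first so it cannot re-create the key after the delete.
        task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await task
        client.delete(key)
```

Awaiting the cancelled task before the delete is what makes the ordering
safe: once `await task` returns, no further `setex` call can happen.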
- Implemented dynamic SameSite and Secure cookie settings based on the backend URL context.
- Enhanced cookie handling to ensure proper functionality in cross-domain scenarios.
- Introduced AI File Sorting functionality to automatically organize documents into a smart folder hierarchy based on source, date, and topic.
- Updated README.md to include the new feature.
- Enhanced homepage components with new illustrations and descriptions for AI File Sorting.
- Refactored rate limiting logic to extract real client IPs more accurately.
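The rate limiting change above does not say how the real client IP is
derived. One common approach, sketched here purely as an assumption, is to
honor `X-Forwarded-For` only when the direct peer is a trusted proxy and to
take the right-most hop that is not trusted. `real_client_ip` and
`trusted_proxies` are hypothetical names, not taken from the codebase.

```python
import ipaddress


def real_client_ip(peer_ip, forwarded_for, trusted_proxies):
    """Return the client IP, trusting X-Forwarded-For only behind known proxies.

    peer_ip: address of the direct TCP peer.
    forwarded_for: raw X-Forwarded-For header value, or None.
    trusted_proxies: set of proxy IP strings we control (assumed config).
    """
    if not forwarded_for or peer_ip not in trusted_proxies:
        # The header is client-controlled unless a trusted proxy set it.
        return peer_ip
    hops = [h.strip() for h in forwarded_for.split(",")]
    # Walk right to left: the right-most untrusted hop is the real client.
    for hop in reversed(hops):
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            return peer_ip  # malformed entry, fall back to the peer address
        if hop not in trusted_proxies:
            return hop
    return peer_ip
```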
- Introduced a `ProcessingMode` enum to differentiate between basic and premium processing modes.
- Updated `EtlRequest` to include a `processing_mode` field, defaulting to basic.
- Enhanced ETL pipeline services to utilize the selected processing mode for Azure Document Intelligence and LlamaCloud parsing.
- Modified various routes and services to handle processing mode, affecting document upload and indexing tasks.
- Improved error handling and logging to include processing mode details.
- Added tests to validate processing mode functionality and its impact on ETL operations.
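A rough sketch of the shape these changes suggest, under stated
assumptions: the enum's string values, the use of a dataclass for
`EtlRequest`, the `file_path` field and the `coerce_mode` helper are all
hypothetical. Only the `ProcessingMode` name, the `processing_mode` field
and its basic default come from the notes above.

```python
from dataclasses import dataclass
from enum import Enum


class ProcessingMode(str, Enum):
    """Basic vs. premium processing (string values are assumed)."""

    BASIC = "basic"
    PREMIUM = "premium"


@dataclass
class EtlRequest:
    file_path: str  # hypothetical field, for illustration only
    processing_mode: ProcessingMode = ProcessingMode.BASIC


def coerce_mode(raw):
    """Map a raw request value to a ProcessingMode, falling back to BASIC."""
    try:
        return ProcessingMode(raw)
    except ValueError:
        return ProcessingMode.BASIC
```

Defaulting to `BASIC` keeps existing callers that never pass a mode on the
basic path.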