The long-running ingestion/podcast/video tasks run on a separate Celery
engine (NullPool), so the web engine's idle_in_transaction_session_timeout
did not cover them — which is exactly where the original 11h zombie
(INSERT INTO chunks) came from. Apply the same protection to the Celery
engine with a generous 60-minute default so a worker that hangs/crashes
mid-transaction can't hold locks on documents/chunks indefinitely, while
never reaping a legitimate per-document embed window.
- config + .env.example: DB_CELERY_IDLE_IN_TX_TIMEOUT_MS (default 3600000).
Co-authored-by: Cursor <cursoragent@cursor.com>
document_converters, the github size-fallback chunker, revert_service
restores, and the kb-persistence middleware now write explicit positions
(the middleware read path also orders by position).
Every file ingestion path (Dropbox, Google Drive / Composio Drive, OneDrive,
local folder, Obsidian, and the legacy upload handlers) now parses via the
extract_with_cache facade instead of calling EtlPipelineService.extract
directly, so identical bytes are deduplicated globally regardless of source.
vision_llm is passed through, keeping the existing cacheability gate intact.
- Introduced LLMErrorCategory and adapt_llm_exception to normalize LLM exceptions.
- Updated llm_retryable_message and llm_permanent_message to utilize the new adaptation logic.
- Enhanced classify_stream_exception to classify provider errors and return user-friendly messages.
- Added tests for error classification and adaptation to ensure robustness.
- Updated frontend error handling to display appropriate messages based on new classifications.
- Updated environment variables and - configurations for credit purchases via Stripe, replacing legacy page pack system.
- Introduced auto-reload feature for credit top-ups and modified database models to track credit transactions.
- Updated notification system to handle insufficient credits and auto-reload failures.
- Adjusted API routes and schemas to reflect changes in credit management.
- Replaced Playwright with Scrapling's fetchers in the web crawling and YouTube processing modules for improved performance and flexibility.
- Updated proxy configuration to support dynamic proxy selection via environment variables.
- Enhanced logging to track performance metrics during web scraping operations.
- Refactored related modules to utilize the new proxy utilities and streamline the scraping process.
The KB-persistence impl lived in shared/middleware/ but no subagent uses it --
consumers are the main_agent builder and the boundary event loop. Colocate with
its owner using the folder-per-middleware shape; __init__ re-exports the public
surface. Tests that reached module internals now alias the .middleware submodule.
main_agent/middleware/kb_persistence.py -> kb_persistence/builder.py
shared/middleware/kb_persistence.py -> kb_persistence/middleware.py
The busy-mutex impl (BusyMutexMiddleware + cancel/turn-lifecycle primitives)
lived in shared/middleware/ but no subagent uses it -- consumers are the
main_agent builder and the boundary (turn lifecycle). Colocate with its owner
using the folder-per-middleware shape; __init__ re-exports the public surface so
boundary import sites only change package path:
main_agent/middleware/busy_mutex.py -> busy_mutex/builder.py
shared/middleware/busy_mutex.py -> busy_mutex/middleware.py
Move the lower-level runtime/infra modules out of multi_agent_chat/shared/
(they were never used by subagents, so they failed the shared-by-all-siblings
rule) and unify them with the already-relocated checkpointer:
agents/runtime/ -> agents/chat/runtime/
mac/shared/errors.py -> chat/runtime/errors.py
mac/shared/llm_config.py -> chat/runtime/llm_config.py
mac/shared/prompt_caching.py -> chat/runtime/prompt_caching.py
mac/shared/mention_resolver.py -> chat/runtime/mention_resolver.py
mac/shared/path_resolver.py -> chat/runtime/path_resolver.py
These sit below the agent packages: the boundary + agent factory + shared
middleware depend on them, and they import no agent code (acyclic).