Commit graph

735 commits

Author SHA1 Message Date
CREDO23
32a6e54ce6 Merge remote-tracking branch 'upstream/dev' into features/documents-injestion-layered-cached 2026-06-14 11:30:33 +02:00
Anish Sarkar
c7409c8995 chore: ran linting 2026-06-13 21:59:35 +05:30
Anish Sarkar
ab5423d2d2 Merge remote-tracking branch 'upstream/dev' into feat/unified-model-connections 2026-06-13 19:04:49 +05:30
Anish Sarkar
8fe9c21e76 feat(token-tracking): add model metadata registration and enhance token usage tracking 2026-06-13 03:08:35 +05:30
CREDO23
5a71769dba fix(chunks): set position on remaining chunk insert paths
document_converters, the github size-fallback chunker, revert_service
restores, and the kb-persistence middleware now write explicit positions
(the middleware read path also orders by position).
2026-06-12 18:53:08 +02:00
CREDO23
0fb1d3d37b feat(etl-cache): route all file-based sources through the parse cache
Every file ingestion path (Dropbox, Google Drive / Composio Drive, OneDrive,
local folder, Obsidian, and the legacy upload handlers) now parses via the
extract_with_cache facade instead of calling EtlPipelineService.extract
directly, so identical bytes are deduplicated globally regardless of source.
vision_llm is passed through, keeping the existing cacheability gate intact.
2026-06-12 14:47:25 +02:00
CREDO23
0dc2ccc003 feat(tasks): route extraction through etl cache 2026-06-12 11:23:50 +02:00
DESKTOP-RTLN3BA\$punk
c855be8ccd fix(auto_reload): update task to use a lambda for user_id in async call 2026-06-11 16:51:18 -07:00
Anish Sarkar
8e8cf96faa feat(error-handling): implement LLM error adaptation and classification for chat streaming
- Introduced LLMErrorCategory and adapt_llm_exception to normalize LLM exceptions.
- Updated llm_retryable_message and llm_permanent_message to utilize the new adaptation logic.
- Enhanced classify_stream_exception to classify provider errors and return user-friendly messages.
- Added tests for error classification and adaptation to ensure robustness.
- Updated frontend error handling to display appropriate messages based on new classifications.
2026-06-12 05:03:14 +05:30
DESKTOP-RTLN3BA\$punk
05190da0a9 chore: linting 2026-06-11 15:31:43 -07:00
Anish Sarkar
908790e40f Merge remote-tracking branch 'upstream/dev' into feat/unified-model-connections 2026-06-12 03:15:28 +05:30
CREDO23
41f4a58663 Merge remote-tracking branch 'upstream/dev' into improvement-podcast-graph
# Conflicts:
#	surfsense_backend/app/tasks/celery_tasks/podcast_tasks.py
2026-06-11 23:14:49 +02:00
Anish Sarkar
3dd54230e7 fix(chat): normalize provider-safe message history 2026-06-12 02:17:37 +05:30
Anish Sarkar
5d5d574550 refactor(model-connections): move backend model connections to provider capabilities 2026-06-12 02:17:22 +05:30
Anish Sarkar
c28c4f5785 feat(chat): route models by provider capabilities 2026-06-11 18:22:23 +05:30
CREDO23
eb56acc407 refactor(podcasts): regenerate via brief gate, render brief inline in chat 2026-06-11 11:45:17 +02:00
DESKTOP-RTLN3BA\$punk
a7407502d3 feat(refactor): refactor payment system to implement unified credit wallet.
- Updated environment variables and - configurations for credit purchases via Stripe, replacing legacy page pack system.
- Introduced auto-reload feature for credit top-ups and modified database models to track credit transactions.
- Updated notification system to handle insufficient credits and auto-reload failures.
- Adjusted API routes and schemas to reflect changes in credit management.
2026-06-10 16:49:03 -07:00
CREDO23
97ab7a88fd refactor(podcasts): remove legacy podcaster agent, task, and schema 2026-06-10 21:45:04 +02:00
CREDO23
3eb7cdb2d8 refactor(podcasts): gate chat-triggered podcast on brief review 2026-06-10 21:44:50 +02:00
CREDO23
ba687813c1 fix(elasticsearch): commit failed status immediately 2026-06-10 00:10:52 +02:00
CREDO23
c26181d086 fix(airtable): commit failed status immediately 2026-06-10 00:10:52 +02:00
CREDO23
e3afe9d7c7 fix(luma): commit failed status immediately 2026-06-10 00:10:52 +02:00
CREDO23
8191118eb4 fix(bookstack): commit failed status immediately 2026-06-10 00:10:52 +02:00
CREDO23
45438249b6 fix(clickup): commit failed status immediately 2026-06-10 00:10:52 +02:00
CREDO23
f5dd8f3985 fix(github): commit failed status immediately 2026-06-10 00:10:52 +02:00
CREDO23
f085ac59e5 fix(teams): commit failed status immediately 2026-06-10 00:10:52 +02:00
CREDO23
791b0afe16 fix(discord): commit failed status immediately 2026-06-10 00:10:52 +02:00
CREDO23
be8a3bcd00 fix(slack): commit failed status immediately 2026-06-10 00:10:52 +02:00
CREDO23
c47949791b fix(confluence): fail skipped placeholders so they don't stay pending 2026-06-10 00:10:42 +02:00
CREDO23
d70d01f331 fix(linear): fail skipped placeholders so they don't stay pending 2026-06-10 00:10:42 +02:00
CREDO23
1b0912aaa3 fix(calendar): fail skipped placeholders so they don't stay pending 2026-06-10 00:10:42 +02:00
CREDO23
b2c2fc9c2e fix(gmail): fail skipped placeholders so they don't stay pending 2026-06-10 00:10:42 +02:00
CREDO23
90b32a8880 fix(notion): fail skipped placeholders so they don't stay pending 2026-06-10 00:10:42 +02:00
CREDO23
33300e4faa fix(dropbox): sanitize ETL reason and retry stuck pending/processing files 2026-06-10 00:10:25 +02:00
CREDO23
464e7d4554 fix(onedrive): sanitize ETL reason and retry stuck pending/processing files 2026-06-10 00:10:25 +02:00
CREDO23
c0c5f3414e fix(google-drive): sanitize ETL reason and retry stuck pending/processing files 2026-06-10 00:10:25 +02:00
CREDO23
e45e8389dc fix(dropbox): mark documents failed on ETL failure 2026-06-09 23:39:25 +02:00
CREDO23
82aaaa5a9f fix(onedrive): mark documents failed on ETL failure 2026-06-09 23:39:25 +02:00
CREDO23
6fd95f82b4 fix(google-drive): mark placeholders failed on ETL failure 2026-06-09 23:39:25 +02:00
CREDO23
cb10882dc8 feat(indexers): add mark_connector_documents_failed helper 2026-06-09 23:39:25 +02:00
DESKTOP-RTLN3BA\$punk
ce952d2ad1 chore: linting 2026-06-09 00:42:26 -07:00
DESKTOP-RTLN3BA\$punk
640ef5f15d feat(proxy): integrate Scrapling for enhanced web scraping capabilities
- Replaced Playwright with Scrapling's fetchers in the web crawling and YouTube processing modules for improved performance and flexibility.
- Updated proxy configuration to support dynamic proxy selection via environment variables.
- Enhanced logging to track performance metrics during web scraping operations.
- Refactored related modules to utilize the new proxy utilities and streamline the scraping process.
2026-06-09 00:15:10 -07:00
CREDO23
8bdfd00a15 Merge upstream/dev 2026-06-05 19:18:12 +02:00
CREDO23
0081b627e9 refactor(agents): move kb_persistence middleware into main_agent (owner)
The KB-persistence impl lived in shared/middleware/ but no subagent uses it --
consumers are the main_agent builder and the boundary event loop. Colocate with
its owner using the folder-per-middleware shape; __init__ re-exports the public
surface. Tests that reached module internals now alias the .middleware submodule.

  main_agent/middleware/kb_persistence.py -> kb_persistence/builder.py
  shared/middleware/kb_persistence.py     -> kb_persistence/middleware.py
2026-06-05 14:11:55 +02:00
CREDO23
a7a642fedc refactor(agents): move busy_mutex middleware into main_agent (owner)
The busy-mutex impl (BusyMutexMiddleware + cancel/turn-lifecycle primitives)
lived in shared/middleware/ but no subagent uses it -- consumers are the
main_agent builder and the boundary (turn lifecycle). Colocate with its owner
using the folder-per-middleware shape; __init__ re-exports the public surface so
boundary import sites only change package path:

  main_agent/middleware/busy_mutex.py    -> busy_mutex/builder.py
  shared/middleware/busy_mutex.py        -> busy_mutex/middleware.py
2026-06-05 14:08:45 +02:00
CREDO23
f2a61bc0ef refactor(agents): consolidate chat runtime infra under chat/runtime
Move the lower-level runtime/infra modules out of multi_agent_chat/shared/
(they were never used by subagents, so they failed the shared-by-all-siblings
rule) and unify them with the already-relocated checkpointer:

  agents/runtime/                      -> agents/chat/runtime/
  mac/shared/errors.py                 -> chat/runtime/errors.py
  mac/shared/llm_config.py             -> chat/runtime/llm_config.py
  mac/shared/prompt_caching.py         -> chat/runtime/prompt_caching.py
  mac/shared/mention_resolver.py       -> chat/runtime/mention_resolver.py
  mac/shared/path_resolver.py          -> chat/runtime/path_resolver.py

These sit below the agent packages: the boundary + agent factory + shared
middleware depend on them, and they import no agent code (acyclic).
2026-06-05 13:19:24 +02:00
CREDO23
7d866a2279 refactor(agents): sink sandbox.py into filesystem subsystem
shared/sandbox.py was used only by the filesystem middleware/tools (and the
boundary) -- never by main_agent or subagents as shared code. Move it next to
its only agent-side consumer:

  multi_agent_chat/shared/sandbox.py
  -> multi_agent_chat/shared/middleware/filesystem/sandbox.py
2026-06-05 13:15:57 +02:00
CREDO23
24b62a63b4 refactor(agents): introduce chat/ category; dissolve top-level agents/shared
Recursive shared-folder rule: a shared/ must be shared by ALL siblings at its
level. The kernel (context, compaction, retry_after, web_search) was shared by
only 2 of the agents -- anonymous_chat + multi_agent_chat -- never by podcaster
or video_presentation. Those 2 are the "chat" category, so their shared code
belongs in that category's shared/, not the top-level one.

  app/agents/anonymous_chat/   -> app/agents/chat/anonymous_chat/
  app/agents/multi_agent_chat/ -> app/agents/chat/multi_agent_chat/
  app/agents/shared/           -> app/agents/chat/shared/   (anon<->mac kernel)

Top-level app/agents/shared/ is gone: nothing was shared across all three
categories (chat / podcaster / video_presentation).

~289 import sites rewritten (app.agents.{anonymous_chat,multi_agent_chat,shared}
-> app.agents.chat.*); all moves are git renames (history preserved).
app/agents/ now: chat/, podcaster/, video_presentation/, runtime/.
2026-06-05 12:54:02 +02:00
CREDO23
b7ea829371 refactor(agents): relocate boundary-only infra out of shared/
Neither module is imported by any sibling agent package, so neither belongs in
the cross-agent shared kernel:

- checkpointer.py -> app/agents/runtime/checkpointer.py
  LangGraph Postgres checkpoint saver. It's cross-agent *runtime infra* wired by
  the boundary (app lifespan + anonymous_chat & multi_agent_chat flows), not
  agent code. New app/agents/runtime/ layer holds boundary-wired agent infra.

- shared/system_prompt.py + shared/prompts/ -> app/prompts/
  The legacy single-agent prompt composer. The live agents don't use it
  (main_agent has its own system_prompt/ builder; anonymous_chat builds inline);
  its only consumer is new_llm_config_routes for displaying default instructions.
  Moved to the existing non-agent prompt domain:
    system_prompt.py        -> app/prompts/default_system_instructions.py
    prompts/                 -> app/prompts/system_prompt_composer/

app/agents/shared/ now contains only genuinely cross-agent code: context,
middleware/{compaction,retry_after,dedup_tool_calls}, tools/.

NOTE: get_default_system_instructions() (LLM-config UI) composes from the legacy
library, which differs from what the live agents actually run -- pre-existing
latent staleness, not changed here.
2026-06-05 12:36:44 +02:00
CREDO23
82c5dc5b02 refactor(agents): move mac-only modules out of the cross-agent shared kernel
app/agents/shared/ is a sibling of anonymous_chat/podcaster/multi_agent_chat/
video_presentation, so it should only hold code shared across 2+ of those
agents. In practice podcaster and video_presentation import nothing from it,
and anonymous_chat needs only context + compaction + retry_after + web_search.
Everything else was multi_agent_chat-only (the boundary just passes through).

Move the multi_agent_chat-only cluster into multi_agent_chat/shared/ (files
moved verbatim via git rename; ~116 import sites rewritten):

  errors, feature_flags, filesystem_selection, path_resolver, prompt_caching,
  sandbox, llm_config, mention_resolver
  middleware/busy_mutex, middleware/kb_persistence

busy_mutex/llm_config/mention_resolver are boundary-only but import the moved
modules, so they were folded in to avoid a backwards shared -> multi_agent_chat
dependency. main_agent builders now import the impls directly; the shared
middleware barrel keeps only the genuinely-shared compaction + retry_after.

Also delete the dead leftover shared/plugins and shared/skills dirs (live
copies already live under main_agent/).

Remaining in app/agents/shared/: context, system_prompt(+prompts), checkpointer,
middleware/{compaction,retry_after,dedup_tool_calls}, tools/. checkpointer and
system_prompt are boundary-only infra pending a dedicated home decision.
2026-06-05 12:30:15 +02:00