Some OpenAI-compatible image backends (e.g. Xinference) return a relative
URL like /files/image.png in data[0].url instead of an absolute one.
Browsers cannot resolve these, causing images to fail to load.
Track the provider's api_base after resolving model config via to_litellm().
When the returned URL starts with "/", extract the origin (scheme + host + port)
from api_base and prepend it to produce a full absolute URL.
No behaviour change for providers that return absolute URLs (OpenAI, Azure, etc).
Closes#1496
Presentation and citation ordering moves off Chunk.id/created_at to the
explicit position column (id kept as tiebreaker). Vector and ts_rank
ranking order_by clauses are untouched.
document_converters, the github size-fallback chunker, revert_service
restores, and the kb-persistence middleware now write explicit positions
(the middleware read path also orders by position).
- Introduced lazy knowledge base retrieval mode, allowing the main agent to fetch KB content on demand via the `search_knowledge_base` tool, improving performance by skipping expensive pre-injection processes.
- Added cross-thread caching capability, enabling reuse of compiled graphs across different user chats, reducing latency for returning users.
- Updated middleware to support new lazy loading and caching features, ensuring efficient resource utilization and improved response times.
- Enhanced logging for performance tracking during knowledge retrieval and agent interactions.
- Integrated performance logging in `OtelSpanMiddleware` to track model call durations even when OTel is disabled.
- Added detailed performance metrics in `KnowledgePriorityMiddleware` for database operations and embedding processes, improving visibility into query performance.
- Utilized `get_perf_logger` for consistent logging across middleware components.
- Replaced Playwright with Scrapling's fetchers in the web crawling and YouTube processing modules for improved performance and flexibility.
- Updated proxy configuration to support dynamic proxy selection via environment variables.
- Enhanced logging to track performance metrics during web scraping operations.
- Refactored related modules to utilize the new proxy utilities and streamline the scraping process.
- Replaced environment variable usage with a centralized configuration system in multiple modules, including `celery_app`, `agent_cache_store`, `sandbox`, `file_storage`, and `connector_service`.
- Enhanced maintainability and readability by sourcing configuration values from the `config` module instead of directly from environment variables.
- Updated relevant settings to ensure consistent access to configuration values across the application.
Recursive pass over the agents module to make docstrings and inline
comments concise and intent-oriented: drop narration that just restates
the code, condense verbose module/function docstrings, and keep only the
non-obvious "why" notes. No functional code changed.
The knowledge_base subagent imported subagent_invoke_config + EXCLUDED_STATE_KEYS
from main_agent's checkpointed_subagent_middleware -- a subagent reaching into
main-agent internals. Both symbols (plus the recursion-limit constant they need)
are a subagent-invocation contract shared by the orchestrator's task middleware
and any nested-invoking subagent. Move them to subagents/shared/invocation.py;
config.py keeps the HITL resume side-channel and constants.py keeps the
main-agent tuning knobs. All consumers (task_tool, kb tool, tests) repointed.
build_credentials/get_token_encryption are Google-OAuth helpers used by both the
Gmail and Calendar connector tools. They lived inside gmail/tools/_helpers.py,
forcing calendar -> gmail coupling. Move them to a neutral connector-level module
(connectors/google_auth.py); gmail/_helpers.py re-exports them under the legacy
private names so existing gmail tools are untouched, and calendar now imports the
shared module directly.