- Introduced lazy knowledge base retrieval mode, allowing the main agent to fetch KB content on demand via the `search_knowledge_base` tool, improving performance by skipping expensive pre-injection processes.
- Added cross-thread caching capability, enabling reuse of compiled graphs across different user chats, reducing latency for returning users.
- Updated middleware to support new lazy loading and caching features, ensuring efficient resource utilization and improved response times.
- Enhanced logging for performance tracking during knowledge retrieval and agent interactions.