Merge remote-tracking branch 'upstream/dev' into improvement-agent-speed

Resolves: surfsense_backend/app/agents/new_chat/middleware/memory_injection.py - Took both imports: upstream moved MEMORY_HARD_LIMIT/SOFT_LIMIT to app.services.memory; kept our perf-logger import for timing. Pulls in upstream changes: - Memory document feature (services/memory refactor, removal of app.agents.new_chat.memory_extraction and background extraction in stream_new_chat — agent now drives memory via update_memory tool). - BACKEND_URL env refactor across web tool-ui/editor/chat/dashboard/lib. - GitHub Actions backend test workflow + pre-commit biome bump. - Token-display polish in MessageInfoDropdown; save_memory no-update sentinel. Verified: 1723 unit tests pass, ruff clean. No semantic regression in stream_new_chat (their memory-extraction deletion and our preflight removal touch different functions).
2026-05-25 19:15:18 +02:00 · 2026-05-20 21:23:48 +02:00 · 2026-05-20 21:23:48 +02:00 · 49da7a57df
commit 49da7a57df
parent d5ee8cc4cd 883ac81ce1
79 changed files with 1992 additions and 2296 deletions
--- a/surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/memory_protocol/private.md
+++ b/surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/memory_protocol/private.md
@ -6,4 +6,10 @@ standing instructions?
 If yes, call `update_memory` **alongside** your normal response — don't
 defer it to a later turn. Skip ephemeral chat noise (one-off Q/A, greetings,
 session logistics). Stay within the budget shown in `<user_memory>`.
+
+Memory is heading-based markdown. New entries should be under `##` headings
+such as `## Facts`, `## Preferences`, or `## Instructions`, with bullets like
+`- YYYY-MM-DD: text`. If existing memory contains legacy
+`(YYYY-MM-DD) [fact|pref|instr]` markers, preserve the information but write
+new saves in the heading-based format.
 </memory_protocol>
--- a/surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/memory_protocol/team.md
+++ b/surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/memory_protocol/team.md
@ -6,4 +6,12 @@ key facts?
 If yes, call `update_memory` **alongside** your normal response — don't
 defer it to a later turn. Skip ephemeral chat noise (one-off Q/A, greetings,
 session logistics). Stay within the budget shown in `<team_memory>`.
+
+Team memory is heading-based markdown. New entries should be under `##`
+headings such as `## Product Decisions`, `## Engineering Conventions`,
+`## Project Facts`, or `## Open Questions`, with bullets like
+`- YYYY-MM-DD: text`. If existing memory contains legacy `(YYYY-MM-DD) [fact]`
+markers, preserve the information but write new saves in the heading-based
+format. Do not create personal headings such as `## Preferences` or
+`## Instructions`.
 </memory_protocol>
--- a/surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/tools/update_memory/private/description.md
+++ b/surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/tools/update_memory/private/description.md
@ -9,7 +9,9 @@
  - Skip ephemeral chat noise (one-off Q/A, greetings, session logistics).
  - Args: `updated_memory` — FULL replacement markdown (merge and curate,
    don't only append).
-  - Formatting: bullets `- (YYYY-MM-DD) [marker] text` with markers `[fact]`,
-    `[pref]`, `[instr]` (priority when trimming: `instr > pref > fact`).
-    Group bullets under short `##` headings; stay under the limit shown in
-    `<user_memory>`.
+  - Formatting: heading-based markdown with entries under `##` headings.
+    Recommended headings are `## Facts`, `## Preferences`, `## Instructions`,
+    though clearer natural headings are allowed. New bullets should look like
+    `- YYYY-MM-DD: text`; stay under the limit shown in `<user_memory>`.
+  - If existing memory uses legacy `(YYYY-MM-DD) [fact|pref|instr]` markers,
+    preserve the information but write the updated document in the new format.
--- a/surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/tools/update_memory/private/example.md
+++ b/surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/tools/update_memory/private/example.md
@ -1,28 +1,28 @@
 <example>
 <user_name>Alex</user_name>, <user_memory> is empty.
 user: "I'm a space enthusiast, explain astrophage to me"
-→ update_memory(updated_memory="## Interests & background\n- (2025-03-15) [fact] Alex is a space enthusiast\n")
+→ update_memory(updated_memory="## Facts\n- 2025-03-15: Alex is a space enthusiast\n")
 (Casual durable fact; use first name, neutral heading.)
 </example>

 <example>
 user: "Remember that I prefer concise answers over detailed explanations"
-→ update_memory(updated_memory="## Interests & background\n- (2025-03-15) [fact] Alex is a space enthusiast\n\n## Response style\n- (2025-03-15) [pref] Alex prefers concise answers over detailed explanations\n")
+→ update_memory(updated_memory="## Facts\n- 2025-03-15: Alex is a space enthusiast\n\n## Preferences\n- 2025-03-15: Alex prefers concise answers over detailed explanations\n")
 (Durable preference; merge with existing memory.)
 </example>

 <example>
 user: "I actually moved to Tokyo last month"
-→ update_memory(updated_memory="...\n\n## Personal context\n- (2025-03-15) [fact] Alex lives in Tokyo (previously London)\n...")
+→ update_memory(updated_memory="...\n\n## Facts\n- 2025-03-15: Alex lives in Tokyo (previously London)\n...")
 (Updated fact; date reflects when recorded.)
 </example>

 <example>
 user: "I'm a freelance photographer working on a nature documentary"
-→ update_memory(updated_memory="...\n\n## Current focus\n- (2025-03-15) [fact] Alex is a freelance photographer\n- (2025-03-15) [fact] Alex is working on a nature documentary\n")
+→ update_memory(updated_memory="...\n\n## Current Focus\n- 2025-03-15: Alex is a freelance photographer\n- 2025-03-15: Alex is working on a nature documentary\n")
 </example>

 <example>
 user: "Always respond in bullet points"
-→ update_memory(updated_memory="...\n\n## Response style\n- (2025-03-15) [instr] Always respond to Alex in bullet points\n")
+→ update_memory(updated_memory="...\n\n## Instructions\n- 2025-03-15: Always respond to Alex in bullet points\n")
 </example>
--- a/surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/tools/update_memory/team/description.md
+++ b/surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/tools/update_memory/team/description.md
@ -9,8 +9,14 @@
  - Skip ephemeral chat noise (one-off Q/A, greetings, session logistics).
  - Args: `updated_memory` — FULL replacement markdown (merge and curate,
    don't only append).
-  - Formatting: bullets `- (YYYY-MM-DD) [fact] text`. Team memory uses ONLY
-    the `[fact]` marker (never `[pref]` or `[instr]`). Group bullets under
-    short `##` headings (2-3 words each); stay under the limit shown in
-    `<team_memory>`. When trimming, prioritise: decisions/conventions > key
-    facts > current priorities.
+  - Formatting: heading-based markdown with entries under `##` headings.
+    Recommended headings are `## Product Decisions`,
+    `## Engineering Conventions`, `## Project Facts`, and `## Open Questions`.
+    New bullets should look like `- YYYY-MM-DD: text`; stay under the limit
+    shown in `<team_memory>`.
+  - If existing memory uses legacy `(YYYY-MM-DD) [fact]` markers, preserve the
+    information but write the updated document in the new format.
+  - Do not create personal headings such as `## Preferences`,
+    `## Instructions`, `## Personal Notes`, or `## Personal Instructions`.
+    When trimming, prioritise: decisions/conventions > key facts > current
+    priorities.
--- a/surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/tools/update_memory/team/example.md
+++ b/surfsense_backend/app/agents/multi_agent_chat/main_agent/system_prompt/prompts/tools/update_memory/team/example.md
@ -1,9 +1,9 @@
 <example>
 user: "Let's remember that we decided to do weekly standup meetings on Mondays"
-→ update_memory(updated_memory="...\n\n## Team rituals\n- (2025-03-15) [fact] Weekly standup meetings on Mondays\n...")
+→ update_memory(updated_memory="...\n\n## Product Decisions\n- 2025-03-15: Weekly standup meetings happen on Mondays\n...")
 </example>

 <example>
 user: "Our office is in downtown Seattle, 5th floor"
-→ update_memory(updated_memory="...\n\n## Workspace\n- (2025-03-15) [fact] Office location: downtown Seattle, 5th floor\n...")
+→ update_memory(updated_memory="...\n\n## Project Facts\n- 2025-03-15: Office location is downtown Seattle, 5th floor\n...")
 </example>
--- a/surfsense_backend/app/agents/multi_agent_chat/subagents/builtins/memory/system_prompt.md
+++ b/surfsense_backend/app/agents/multi_agent_chat/subagents/builtins/memory/system_prompt.md
@ -18,6 +18,10 @@ Persist durable preferences/facts/instructions with `update_memory` while avoidi
 - Do not store transient chatter.
 - Do not store secrets unless explicitly instructed.
 - If memory intent is unclear, return `status=blocked` with the missing intent signal.
+- Persisted memory is heading-based markdown. New saved bullets should look like
+  `- YYYY-MM-DD: text` under `##` headings. If existing memory has legacy
+  `(YYYY-MM-DD) [fact|pref|instr]` markers, preserve the information but write
+  the updated document in the heading-based format.
 </tool_policy>

 <out_of_scope>
@ -53,4 +57,7 @@ Rules:
 - `status=success` -> `next_step=null`, `missing_fields=null`.
 - `status=partial|blocked|error` -> `next_step` must be non-null.
 - `status=blocked` due to missing required inputs -> `missing_fields` must be non-null.
+- `evidence.memory_category` is a semantic classification for supervisor logs
+  only. It is not the persisted storage format and must not force inline
+  `[fact|preference|instruction]` markers into saved memory.
 </output_contract>
--- a/surfsense_backend/app/agents/multi_agent_chat/subagents/builtins/memory/tools/update_memory.py
+++ b/surfsense_backend/app/agents/multi_agent_chat/subagents/builtins/memory/tools/update_memory.py
@ -1,280 +1,23 @@
-"""Overwrite one markdown memory document per user or team, with size and shrink guards."""
+"""Memory update tools backed by the canonical memory service."""

 from __future__ import annotations

 import logging
-import re
-from typing import Any, Literal
+from typing import Any
 from uuid import UUID

-from langchain_core.messages import HumanMessage
 from langchain_core.tools import tool
-from sqlalchemy import select
 from sqlalchemy.ext.asyncio import AsyncSession

-from app.db import SearchSpace, User
+from app.services.memory import (
+    MEMORY_HARD_LIMIT,
+    MEMORY_SOFT_LIMIT,
+    MemoryScope,
+    save_memory,
+)

 logger = logging.getLogger(__name__)

-MEMORY_SOFT_LIMIT = 18_000
-MEMORY_HARD_LIMIT = 25_000
-
-_SECTION_HEADING_RE = re.compile(r"^##\s+(.+)$", re.MULTILINE)
-_HEADING_NORMALIZE_RE = re.compile(r"\s+")
-
-_MARKER_RE = re.compile(r"\[(fact|pref|instr)\]")
-_BULLET_FORMAT_RE = re.compile(r"^- \(\d{4}-\d{2}-\d{2}\) \[(fact|pref|instr)\] .+$")
-_PERSONAL_ONLY_MARKERS = {"pref", "instr"}
-
-
-# ---------------------------------------------------------------------------
-# Diff validation
-# ---------------------------------------------------------------------------
-
-
-def _extract_headings(memory: str) -> set[str]:
-    """Return all ``## …`` heading texts (without the ``## `` prefix)."""
-    return set(_SECTION_HEADING_RE.findall(memory))
-
-
-def _normalize_heading(heading: str) -> str:
-    """Normalize heading text for robust scope checks."""
-    return _HEADING_NORMALIZE_RE.sub(" ", heading.strip().lower())
-
-
-def _validate_memory_scope(
-    content: str, scope: Literal["user", "team"]
-) -> dict[str, Any] | None:
-    """Reject personal-only markers ([pref], [instr]) in team memory."""
-    if scope != "team":
-        return None
-
-    markers = set(_MARKER_RE.findall(content))
-    leaked = sorted(markers & _PERSONAL_ONLY_MARKERS)
-    if leaked:
-        tags = ", ".join(f"[{m}]" for m in leaked)
-        return {
-            "status": "error",
-            "message": (
-                f"Team memory cannot include personal markers: {tags}. "
-                "Use [fact] only in team memory."
-            ),
-        }
-    return None
-
-
-def _validate_bullet_format(content: str) -> list[str]:
-    """Return warnings for bullet lines that don't match the required format.
-
-    Expected: ``- (YYYY-MM-DD) [fact|pref|instr] text``
-    """
-    warnings: list[str] = []
-    for line in content.splitlines():
-        stripped = line.strip()
-        if not stripped.startswith("- "):
-            continue
-        if not _BULLET_FORMAT_RE.match(stripped):
-            short = stripped[:80] + ("..." if len(stripped) > 80 else "")
-            warnings.append(f"Malformed bullet: {short}")
-    return warnings
-
-
-def _validate_diff(old_memory: str | None, new_memory: str) -> list[str]:
-    """Return a list of warning strings about suspicious changes."""
-    if not old_memory:
-        return []
-
-    warnings: list[str] = []
-    old_headings = _extract_headings(old_memory)
-    new_headings = _extract_headings(new_memory)
-    dropped = old_headings - new_headings
-    if dropped:
-        names = ", ".join(sorted(dropped))
-        warnings.append(
-            f"Sections removed: {names}. "
-            "If unintentional, the user can restore from the settings page."
-        )
-
-    old_len = len(old_memory)
-    new_len = len(new_memory)
-    if old_len > 0 and new_len < old_len * 0.4:
-        warnings.append(
-            f"Memory shrank significantly ({old_len:,} -> {new_len:,} chars). "
-            "Possible data loss."
-        )
-    return warnings
-
-
-# ---------------------------------------------------------------------------
-# Size validation & soft warning
-# ---------------------------------------------------------------------------
-
-
-def _validate_memory_size(content: str) -> dict[str, Any] | None:
-    """Return an error/warning dict if *content* is too large, else None."""
-    length = len(content)
-    if length > MEMORY_HARD_LIMIT:
-        return {
-            "status": "error",
-            "message": (
-                f"Memory exceeds {MEMORY_HARD_LIMIT:,} character limit "
-                f"({length:,} chars). Consolidate by merging related items, "
-                "removing outdated entries, and shortening descriptions. "
-                "Then call update_memory again."
-            ),
-        }
-    return None
-
-
-def _soft_warning(content: str) -> str | None:
-    """Return a warning string if content exceeds the soft limit."""
-    length = len(content)
-    if length > MEMORY_SOFT_LIMIT:
-        return (
-            f"Memory is at {length:,}/{MEMORY_HARD_LIMIT:,} characters. "
-            "Consolidate by merging related items and removing less important "
-            "entries on your next update."
-        )
-    return None
-
-
-# ---------------------------------------------------------------------------
-# Forced rewrite when memory exceeds the hard limit
-# ---------------------------------------------------------------------------
-
-_FORCED_REWRITE_PROMPT = """\
-You are a memory curator. The following memory document exceeds the character \
-limit and must be shortened.
-
-RULES:
-1. Rewrite the document to be under {target} characters.
-2. Preserve existing ## headings. Every entry must remain under a heading. You may merge
-   or rename headings to consolidate, but keep names personal and descriptive.
-3. Priority for keeping content: [instr] > [pref] > [fact].
-4. Merge duplicate entries, remove outdated entries, shorten verbose descriptions.
-5. Every bullet MUST have format: - (YYYY-MM-DD) [fact|pref|instr] text
-6. Preserve the user's first name in entries — do not replace it with "the user".
-7. Output ONLY the consolidated markdown — no explanations, no wrapping.
-
-<memory_document>
-{content}
-</memory_document>"""
-
-
-async def _forced_rewrite(content: str, llm: Any) -> str | None:
-    """Use a focused LLM call to compress *content* under the hard limit.
-
-    Returns the rewritten string, or ``None`` if the call fails.
-    """
-    try:
-        prompt = _FORCED_REWRITE_PROMPT.format(
-            target=MEMORY_HARD_LIMIT, content=content
-        )
-        response = await llm.ainvoke(
-            [HumanMessage(content=prompt)],
-            config={"tags": ["surfsense:internal"]},
-        )
-        text = (
-            response.content
-            if isinstance(response.content, str)
-            else str(response.content)
-        )
-        return text.strip()
-    except Exception:
-        logger.exception("Forced rewrite LLM call failed")
-        return None
-
-
-# ---------------------------------------------------------------------------
-# Shared save-and-respond logic
-# ---------------------------------------------------------------------------
-
-
-async def _save_memory(
-    *,
-    updated_memory: str,
-    old_memory: str | None,
-    llm: Any | None,
-    apply_fn,
-    commit_fn,
-    rollback_fn,
-    label: str,
-    scope: Literal["user", "team"],
-) -> dict[str, Any]:
-    """Validate, optionally force-rewrite if over the hard limit, save, and
-    return a response dict.
-
-    Parameters
-    ----------
-    updated_memory : str
-        The new document the agent submitted.
-    old_memory : str | None
-        The previously persisted document (for diff checks).
-    llm : Any | None
-        LLM instance for forced rewrite (may be ``None``).
-    apply_fn : callable(str) -> None
-        Callback that sets the new memory on the ORM object.
-    commit_fn : coroutine
-        ``session.commit``.
-    rollback_fn : coroutine
-        ``session.rollback``.
-    label : str
-        Human label for log messages (e.g. "user memory", "team memory").
-    """
-    content = updated_memory
-
-    # --- forced rewrite if over the hard limit ---
-    if len(content) > MEMORY_HARD_LIMIT and llm is not None:
-        rewritten = await _forced_rewrite(content, llm)
-        if rewritten is not None and len(rewritten) < len(content):
-            content = rewritten
-
-    # --- hard-limit gate (reject if still too large after rewrite) ---
-    size_err = _validate_memory_size(content)
-    if size_err:
-        return size_err
-
-    scope_err = _validate_memory_scope(content, scope)
-    if scope_err:
-        return scope_err
-
-    # --- persist ---
-    try:
-        apply_fn(content)
-        await commit_fn()
-    except Exception as e:
-        logger.exception("Failed to update %s: %s", label, e)
-        await rollback_fn()
-        return {"status": "error", "message": f"Failed to update {label}: {e}"}
-
-    # --- build response ---
-    resp: dict[str, Any] = {
-        "status": "saved",
-        "message": f"{label.capitalize()} updated.",
-    }
-
-    if content is not updated_memory:
-        resp["notice"] = "Memory was automatically rewritten to fit within limits."
-
-    diff_warnings = _validate_diff(old_memory, content)
-    if diff_warnings:
-        resp["diff_warnings"] = diff_warnings
-
-    format_warnings = _validate_bullet_format(content)
-    if format_warnings:
-        resp["format_warnings"] = format_warnings
-
-    warning = _soft_warning(content)
-    if warning:
-        resp["warning"] = warning
-
-    return resp
-
-
-# ---------------------------------------------------------------------------
-# Tool factories
-# ---------------------------------------------------------------------------
-

 def create_update_memory_tool(
    user_id: str | UUID,
@ -287,40 +30,22 @@ def create_update_memory_tool(
    async def update_memory(updated_memory: str) -> dict[str, Any]:
        """Update the user's personal memory document.

-        Your current memory is shown in <user_memory> in the system prompt.
-        When the user shares important long-term information (preferences,
-        facts, instructions, context), rewrite the memory document to include
-        the new information.  Merge new facts with existing ones, update
-        contradictions, remove outdated entries, and keep it concise.
-
-        Args:
-            updated_memory: The FULL updated markdown document (not a diff).
+        The current memory is shown in <user_memory>. Pass the FULL updated
+        markdown document, not a diff.
        """
        try:
-            result = await db_session.execute(select(User).where(User.id == uid))
-            user = result.scalars().first()
-            if not user:
-                return {"status": "error", "message": "User not found."}
-
-            old_memory = user.memory_md
-
-            return await _save_memory(
-                updated_memory=updated_memory,
-                old_memory=old_memory,
+            result = await save_memory(
+                scope=MemoryScope.USER,
+                target_id=uid,
+                content=updated_memory,
+                session=db_session,
                llm=llm,
-                apply_fn=lambda content: setattr(user, "memory_md", content),
-                commit_fn=db_session.commit,
-                rollback_fn=db_session.rollback,
-                label="memory",
-                scope="user",
            )
+            return result.to_dict()
        except Exception as e:
            logger.exception("Failed to update user memory: %s", e)
            await db_session.rollback()
-            return {
-                "status": "error",
-                "message": f"Failed to update memory: {e}",
-            }
+            return {"status": "error", "message": f"Failed to update memory: {e}"}

    return update_memory

@ -334,36 +59,18 @@ def create_update_team_memory_tool(
    async def update_memory(updated_memory: str) -> dict[str, Any]:
        """Update the team's shared memory document for this search space.

-        Your current team memory is shown in <team_memory> in the system
-        prompt.  When the team shares important long-term information
-        (decisions, conventions, key facts, priorities), rewrite the memory
-        document to include the new information.  Merge new facts with
-        existing ones, update contradictions, remove outdated entries, and
-        keep it concise.
-
-        Args:
-            updated_memory: The FULL updated markdown document (not a diff).
+        The current team memory is shown in <team_memory>. Pass the FULL updated
+        markdown document, not a diff.
        """
        try:
-            result = await db_session.execute(
-                select(SearchSpace).where(SearchSpace.id == search_space_id)
-            )
-            space = result.scalars().first()
-            if not space:
-                return {"status": "error", "message": "Search space not found."}
-
-            old_memory = space.shared_memory_md
-
-            return await _save_memory(
-                updated_memory=updated_memory,
-                old_memory=old_memory,
+            result = await save_memory(
+                scope=MemoryScope.TEAM,
+                target_id=search_space_id,
+                content=updated_memory,
+                session=db_session,
                llm=llm,
-                apply_fn=lambda content: setattr(space, "shared_memory_md", content),
-                commit_fn=db_session.commit,
-                rollback_fn=db_session.rollback,
-                label="team memory",
-                scope="team",
            )
+            return result.to_dict()
        except Exception as e:
            logger.exception("Failed to update team memory: %s", e)
            await db_session.rollback()
@ -373,3 +80,11 @@ def create_update_team_memory_tool(
            }

    return update_memory
+
+
+__all__ = [
+    "MEMORY_HARD_LIMIT",
+    "MEMORY_SOFT_LIMIT",
+    "create_update_memory_tool",
+    "create_update_team_memory_tool",
+]
--- a/surfsense_backend/app/agents/new_chat/memory_extraction.py
+++ b/surfsense_backend/app/agents/new_chat/memory_extraction.py
@ -1,232 +0,0 @@
-"""Background memory extraction for the SurfSense agent.
-
-After each agent response, if the agent did not call ``update_memory`` during
-the turn, this module can run a lightweight LLM call to decide whether the
-latest message contains long-term information worth persisting.
-"""
-
-from __future__ import annotations
-
-import logging
-from typing import Any
-from uuid import UUID
-
-from langchain_core.messages import HumanMessage
-from sqlalchemy import select
-
-from app.agents.new_chat.tools.update_memory import _save_memory
-from app.db import SearchSpace, User, shielded_async_session
-from app.utils.content_utils import extract_text_content
-
-logger = logging.getLogger(__name__)
-
-_MEMORY_EXTRACT_PROMPT = """\
-You are a memory extraction assistant. Analyze the user's message and decide \
-if it contains any long-term information worth persisting to memory.
-
-Worth remembering: preferences, background/identity, goals, projects, \
-instructions, tools/languages they use, decisions, expertise, workplace — \
-durable facts that will matter in future conversations.
-
-NOT worth remembering: greetings, one-off factual questions, session \
-logistics, ephemeral requests, follow-up clarifications with no new personal \
-info, things that only matter for the current task.
-
-If the message contains memorizable information, output the FULL updated \
-memory document with the new facts merged into the existing content. Follow \
-these rules:
- Every entry MUST be under a ## heading. Preserve existing headings; create new ones
-  freely. Keep heading names short (2-3 words) and natural. Do NOT include the user's
-  name in headings.
- Keep entries as single bullet points. Be descriptive but concise — include relevant
-  details and context rather than just a few words.
- Every bullet MUST use format: - (YYYY-MM-DD) [fact|pref|instr] text
-  [fact] = durable facts, [pref] = preferences, [instr] = standing instructions.
- Use the user's first name (from <user_name>) in entry text, not "the user".
- If a new fact contradicts an existing entry, update the existing entry.
- Do not duplicate information that is already present.
-
-If nothing is worth remembering, output exactly: NO_UPDATE
-
-<user_name>{user_name}</user_name>
-
-<current_memory>
-{current_memory}
-</current_memory>
-
-<user_message>
-{user_message}
-</user_message>"""
-
-_TEAM_MEMORY_EXTRACT_PROMPT = """\
-You are a team-memory extraction assistant. Analyze the latest message and \
-decide if it contains durable TEAM-level information worth persisting.
-
-Decision policy:
- Prioritize recall for durable team context, while avoiding personal-only facts.
- Do NOT require explicit consensus language. A direct team-level statement can
-  be stored if it is stable and broadly useful for future team chats.
- If evidence is weak or clearly tentative, output NO_UPDATE.
-
-Worth remembering (team-level only):
- Decisions and defaults that guide future team work
- Team conventions/standards (naming, review policy, coding norms)
- Stable org/project facts (locations, ownership, constraints)
- Long-lived architecture/process facts
- Ongoing priorities that are likely relevant beyond this turn
-
-NOT worth remembering:
- Personal preferences or biography of one person
- Questions, brainstorming, tentative ideas, or speculation
- One-off requests, status updates, TODOs, logistics for this session
- Information scoped only to a single ephemeral task
-
-If the message contains memorizable team information, output the FULL updated \
-team memory document with new facts merged into existing content. Follow rules:
- Every entry MUST be under a ## heading. Preserve existing headings; create new ones
-  freely. Keep heading names short (2-3 words) and natural.
- Keep entries as single bullet points. Be descriptive but concise — include relevant
-  details and context rather than just a few words.
- Every bullet MUST use format: - (YYYY-MM-DD) [fact] text
-  Team memory uses ONLY the [fact] marker. Never use [pref] or [instr].
- If a new fact contradicts an existing entry, update the existing entry.
- Do not duplicate existing information.
- Preserve neutral team phrasing; avoid person-specific memory unless role-anchored.
-
-If nothing is worth remembering, output exactly: NO_UPDATE
-
-<current_team_memory>
-{current_memory}
-</current_team_memory>
-
-<latest_message_author>
-{author}
-</latest_message_author>
-
-<latest_message>
-{user_message}
-</latest_message>"""
-
-
-async def extract_and_save_memory(
-    *,
-    user_message: str,
-    user_id: str | None,
-    llm: Any,
-) -> None:
-    """Background task: extract memorizable info and persist it.
-
-    Designed to be fire-and-forget — catches all exceptions internally.
-    """
-    if not user_id:
-        return
-
-    try:
-        uid = UUID(user_id) if isinstance(user_id, str) else user_id
-
-        async with shielded_async_session() as session:
-            result = await session.execute(select(User).where(User.id == uid))
-            user = result.scalars().first()
-            if not user:
-                return
-
-            old_memory = user.memory_md
-            first_name = (
-                user.display_name.strip().split()[0]
-                if user.display_name and user.display_name.strip()
-                else "The user"
-            )
-            prompt = _MEMORY_EXTRACT_PROMPT.format(
-                current_memory=old_memory or "(empty)",
-                user_message=user_message,
-                user_name=first_name,
-            )
-            response = await llm.ainvoke(
-                [HumanMessage(content=prompt)],
-                config={"tags": ["surfsense:internal", "memory-extraction"]},
-            )
-            text = extract_text_content(response.content).strip()
-
-            if text == "NO_UPDATE" or not text:
-                logger.debug("Memory extraction: no update needed (user %s)", uid)
-                return
-
-            save_result = await _save_memory(
-                updated_memory=text,
-                old_memory=old_memory,
-                llm=llm,
-                apply_fn=lambda content: setattr(user, "memory_md", content),
-                commit_fn=session.commit,
-                rollback_fn=session.rollback,
-                label="memory",
-                scope="user",
-            )
-            logger.info(
-                "Background memory extraction for user %s: %s",
-                uid,
-                save_result.get("status"),
-            )
-    except Exception:
-        logger.exception("Background user memory extraction failed")
-
-
-async def extract_and_save_team_memory(
-    *,
-    user_message: str,
-    search_space_id: int | None,
-    llm: Any,
-    author_display_name: str | None = None,
-) -> None:
-    """Background task: extract team-level memory and persist it.
-
-    Runs only for shared threads. Designed to be fire-and-forget and catches
-    exceptions internally.
-    """
-    if not search_space_id:
-        return
-
-    try:
-        async with shielded_async_session() as session:
-            result = await session.execute(
-                select(SearchSpace).where(SearchSpace.id == search_space_id)
-            )
-            space = result.scalars().first()
-            if not space:
-                return
-
-            old_memory = space.shared_memory_md
-            prompt = _TEAM_MEMORY_EXTRACT_PROMPT.format(
-                current_memory=old_memory or "(empty)",
-                author=author_display_name or "Unknown team member",
-                user_message=user_message,
-            )
-            response = await llm.ainvoke(
-                [HumanMessage(content=prompt)],
-                config={"tags": ["surfsense:internal", "team-memory-extraction"]},
-            )
-            text = extract_text_content(response.content).strip()
-
-            if text == "NO_UPDATE" or not text:
-                logger.debug(
-                    "Team memory extraction: no update needed (space %s)",
-                    search_space_id,
-                )
-                return
-
-            save_result = await _save_memory(
-                updated_memory=text,
-                old_memory=old_memory,
-                llm=llm,
-                apply_fn=lambda content: setattr(space, "shared_memory_md", content),
-                commit_fn=session.commit,
-                rollback_fn=session.rollback,
-                label="team memory",
-                scope="team",
-            )
-            logger.info(
-                "Background team memory extraction for space %s: %s",
-                search_space_id,
-                save_result.get("status"),
-            )
-    except Exception:
-        logger.exception("Background team memory extraction failed")
--- a/surfsense_backend/app/agents/new_chat/middleware/memory_injection.py
+++ b/surfsense_backend/app/agents/new_chat/middleware/memory_injection.py
@ -18,8 +18,8 @@ from langgraph.runtime import Runtime
 from sqlalchemy import select
 from sqlalchemy.ext.asyncio import AsyncSession

-from app.agents.new_chat.tools.update_memory import MEMORY_HARD_LIMIT, MEMORY_SOFT_LIMIT
 from app.db import ChatVisibility, SearchSpace, User, shielded_async_session
+from app.services.memory import MEMORY_HARD_LIMIT, MEMORY_SOFT_LIMIT
 from app.utils.perf import get_perf_logger

 logger = logging.getLogger(__name__)
--- a/surfsense_backend/app/agents/new_chat/prompts/base/memory_protocol_private.md
+++ b/surfsense_backend/app/agents/new_chat/prompts/base/memory_protocol_private.md
@ -3,4 +3,10 @@ IMPORTANT — After understanding each user message, ALWAYS check: does this mes
 reveal durable facts about the user (role, interests, preferences, projects,
 background, or standing instructions)? If yes, you MUST call update_memory
 alongside your normal response — do not defer this to a later turn.
+
+Memory is stored as a heading-based markdown document. New entries should be
+under `##` headings such as `## Facts`, `## Preferences`, or `## Instructions`
+with bullets like `- YYYY-MM-DD: text`. If existing memory contains legacy
+`(YYYY-MM-DD) [fact|pref|instr]` markers, preserve the information but write
+new saves in the heading-based format.
 </memory_protocol>
--- a/surfsense_backend/app/agents/new_chat/prompts/base/memory_protocol_team.md
+++ b/surfsense_backend/app/agents/new_chat/prompts/base/memory_protocol_team.md
@ -3,4 +3,12 @@ IMPORTANT — After understanding each user message, ALWAYS check: does this mes
 reveal durable facts about the team (decisions, conventions, architecture, processes,
 or key facts)? If yes, you MUST call update_memory alongside your normal response —
 do not defer this to a later turn.
+
+Team memory is stored as a heading-based markdown document. New entries should
+be under `##` headings such as `## Product Decisions`,
+`## Engineering Conventions`, `## Project Facts`, or `## Open Questions` with
+bullets like `- YYYY-MM-DD: text`. If existing memory contains legacy
+`(YYYY-MM-DD) [fact]` markers, preserve the information but write new saves in
+the heading-based format. Do not create personal headings such as
+`## Preferences` or `## Instructions`.
 </memory_protocol>
--- a/surfsense_backend/app/agents/new_chat/prompts/examples/update_memory_private.md
+++ b/surfsense_backend/app/agents/new_chat/prompts/examples/update_memory_private.md
@ -1,16 +1,16 @@

 - <user_name>Alex</user_name>, <user_memory> is empty. User: "I'm a space enthusiast, explain astrophage to me"
-  - The user casually shared a durable fact. Use their first name in the entry, short neutral heading:
-    update_memory(updated_memory="## Interests & background\n- (2025-03-15) [fact] Alex is a space enthusiast\n")
+  - The user casually shared a durable fact:
+    update_memory(updated_memory="## Facts\n- 2025-03-15: Alex is a space enthusiast\n")
 - User: "Remember that I prefer concise answers over detailed explanations"
-  - Durable preference. Merge with existing memory, add a new heading:
-    update_memory(updated_memory="## Interests & background\n- (2025-03-15) [fact] Alex is a space enthusiast\n\n## Response style\n- (2025-03-15) [pref] Alex prefers concise answers over detailed explanations\n")
+  - Durable preference. Merge with existing memory:
+    update_memory(updated_memory="## Facts\n- 2025-03-15: Alex is a space enthusiast\n\n## Preferences\n- 2025-03-15: Alex prefers concise answers over detailed explanations\n")
 - User: "I actually moved to Tokyo last month"
  - Updated fact, date prefix reflects when recorded:
-    update_memory(updated_memory="## Interests & background\n...\n\n## Personal context\n- (2025-03-15) [fact] Alex lives in Tokyo (previously London)\n...")
+    update_memory(updated_memory="## Facts\n- 2025-03-15: Alex lives in Tokyo (previously London)\n...")
 - User: "I'm a freelance photographer working on a nature documentary"
  - Durable background info under a fitting heading:
-    update_memory(updated_memory="...\n\n## Current focus\n- (2025-03-15) [fact] Alex is a freelance photographer\n- (2025-03-15) [fact] Alex is working on a nature documentary\n")
+    update_memory(updated_memory="...\n\n## Current Focus\n- 2025-03-15: Alex is a freelance photographer\n- 2025-03-15: Alex is working on a nature documentary\n")
 - User: "Always respond in bullet points"
  - Standing instruction:
-    update_memory(updated_memory="...\n\n## Response style\n- (2025-03-15) [instr] Always respond to Alex in bullet points\n")
+    update_memory(updated_memory="...\n\n## Instructions\n- 2025-03-15: Always respond to Alex in bullet points\n")
--- a/surfsense_backend/app/agents/new_chat/prompts/examples/update_memory_team.md
+++ b/surfsense_backend/app/agents/new_chat/prompts/examples/update_memory_team.md
@ -1,7 +1,7 @@

 - User: "Let's remember that we decided to do weekly standup meetings on Mondays"
  - Durable team decision:
-    update_memory(updated_memory="- (2025-03-15) [fact] Weekly standup meetings on Mondays\n...")
+    update_memory(updated_memory="## Product Decisions\n- 2025-03-15: Weekly standup meetings happen on Mondays\n...")
 - User: "Our office is in downtown Seattle, 5th floor"
  - Durable team fact:
-    update_memory(updated_memory="- (2025-03-15) [fact] Office location: downtown Seattle, 5th floor\n...")
+    update_memory(updated_memory="## Project Facts\n- 2025-03-15: Office location is downtown Seattle, 5th floor\n...")
--- a/surfsense_backend/app/agents/new_chat/prompts/tools/update_memory_private.md
+++ b/surfsense_backend/app/agents/new_chat/prompts/tools/update_memory_private.md
@ -1,31 +1,26 @@

 - update_memory: Update your personal memory document about the user.
-  - Your current memory is already in <user_memory> in your context.  The `chars` and
-    `limit` attributes show your current usage and the maximum allowed size.
-  - This is your curated long-term memory — the distilled essence of what you know about
-    the user, not raw conversation logs.
-  - Call update_memory when:
-    * The user explicitly asks to remember or forget something
-    * The user shares durable facts or preferences that will matter in future conversations
-  - The user's first name is provided in <user_name>. Use it in memory entries
-    instead of "the user" (e.g. "{name} works at..." not "The user works at...").
-    Do not store the name itself as a separate memory entry.
-  - Do not store short-lived or ephemeral info: one-off questions, greetings,
-    session logistics, or things that only matter for the current task.
+  - Your current memory is already in <user_memory> in your context. The `chars`
+    and `limit` attributes show current usage and the maximum allowed size.
+  - This is curated long-term memory, not raw conversation logs.
+  - Call update_memory when the user explicitly asks to remember/forget
+    something or shares durable facts, preferences, or standing instructions.
+  - The user's first name is provided in <user_name>. Use it in entries instead
+    of "the user" when helpful. Do not store the name alone as a memory entry.
+  - Do not store short-lived info: one-off questions, greetings, session
+    logistics, or things that only matter for the current task.
  - Args:
-    - updated_memory: The FULL updated markdown document (not a diff).
-      Merge new facts with existing ones, update contradictions, remove outdated entries.
-      Treat every update as a curation pass — consolidate, don't just append.
-  - Every bullet MUST use this format: - (YYYY-MM-DD) [marker] text
-    Markers:
-      [fact]  — durable facts (role, background, projects, tools, expertise)
-      [pref]  — preferences (response style, languages, formats, tools)
-      [instr] — standing instructions (always/never do, response rules)
-  - Keep it concise and well under the character limit shown in <user_memory>.
-  - Every entry MUST be under a `##` heading. Keep heading names short (2-3 words) and
-    natural. Do NOT include the user's name in headings. Organize by context — e.g.
-    who they are, what they're focused on, how they prefer things. Create, split, or
-    merge headings freely as the memory grows.
-  - Each entry MUST be a single bullet point. Be descriptive but concise — include relevant
-    details and context rather than just a few words.
-  - During consolidation, prioritize keeping: [instr] > [pref] > [fact].
+    - updated_memory: The FULL updated markdown document, not a diff. Merge new
+      facts with existing ones, update contradictions, remove outdated entries,
+      and consolidate instead of only appending.
+  - Use heading-based Markdown:
+    * Every entry must be under a `##` heading.
+    * Recommended headings: `## Facts`, `## Preferences`, `## Instructions`.
+      Specific natural headings are allowed when clearer.
+    * New bullets should use `- YYYY-MM-DD: text`.
+    * Each entry should be one concise but descriptive bullet.
+  - If existing memory uses legacy `(YYYY-MM-DD) [fact|pref|instr]` markers,
+    preserve the information but write the updated document in the new
+    heading-based format.
+  - During consolidation, prioritize durable instructions and preferences before
+    generic facts.
--- a/surfsense_backend/app/agents/new_chat/prompts/tools/update_memory_team.md
+++ b/surfsense_backend/app/agents/new_chat/prompts/tools/update_memory_team.md
@ -1,26 +1,28 @@

 - update_memory: Update the team's shared memory document for this search space.
-  - Your current team memory is already in <team_memory> in your context.  The `chars`
-    and `limit` attributes show current usage and the maximum allowed size.
-  - This is the team's curated long-term memory — decisions, conventions, key facts.
-  - NEVER store personal memory in team memory (e.g. personal bio, individual
-    preferences, or user-only standing instructions).
-  - Call update_memory when:
-    * A team member explicitly asks to remember or forget something
-    * The conversation surfaces durable team decisions, conventions, or facts
-      that will matter in future conversations
-  - Do not store short-lived or ephemeral info: one-off questions, greetings,
-    session logistics, or things that only matter for the current task.
+  - Your current team memory is already in <team_memory> in your context. The
+    `chars` and `limit` attributes show current usage and the maximum allowed size.
+  - This is curated long-term team memory: decisions, conventions, architecture,
+    processes, and key shared facts.
+  - NEVER store personal memory in team memory: individual bios, personal
+    preferences, or user-only standing instructions.
+  - Call update_memory when a team member asks to remember/forget something, or
+    when the conversation surfaces durable team context that matters later.
+  - Do not store short-lived info: one-off questions, greetings, session
+    logistics, or things that only matter for the current task.
  - Args:
-    - updated_memory: The FULL updated markdown document (not a diff).
-      Merge new facts with existing ones, update contradictions, remove outdated entries.
-      Treat every update as a curation pass — consolidate, don't just append.
-  - Every bullet MUST use this format: - (YYYY-MM-DD) [fact] text
-    Team memory uses ONLY the [fact] marker. Never use [pref] or [instr] in team memory.
-  - Keep it concise and well under the character limit shown in <team_memory>.
-  - Every entry MUST be under a `##` heading. Keep heading names short (2-3 words) and
-    natural. Organize by context — e.g. what the team decided, current architecture,
-    active processes. Create, split, or merge headings freely as the memory grows.
-  - Each entry MUST be a single bullet point. Be descriptive but concise — include relevant
-    details and context rather than just a few words.
-  - During consolidation, prioritize keeping: decisions/conventions > key facts > current priorities.
+    - updated_memory: The FULL updated markdown document, not a diff. Merge new
+      facts with existing ones, update contradictions, remove outdated entries,
+      and consolidate instead of only appending.
+  - Use heading-based Markdown:
+    * Every entry must be under a `##` heading.
+    * Recommended headings: `## Product Decisions`, `## Engineering Conventions`,
+      `## Project Facts`, `## Open Questions`.
+    * New bullets should use `- YYYY-MM-DD: text`.
+    * Each entry should be one concise but descriptive bullet.
+  - If existing memory uses legacy `(YYYY-MM-DD) [fact]` markers, preserve the
+    information but write the updated document in the new heading-based format.
+  - Do not create personal headings such as `## Preferences`, `## Instructions`,
+    `## Personal Notes`, or `## Personal Instructions`.
+  - During consolidation, prioritize decisions/conventions, then key facts, then
+    current priorities.
--- a/surfsense_backend/app/agents/new_chat/tools/update_memory.py
+++ b/surfsense_backend/app/agents/new_chat/tools/update_memory.py
@ -1,369 +1,53 @@
-"""Markdown-document memory tool for the SurfSense agent.
-
-Replaces the old row-per-fact save_memory / recall_memory tools with a single
-update_memory tool that overwrites a freeform markdown TEXT column.  The LLM
-always sees the current memory in <user_memory> / <team_memory> tags injected
-by MemoryInjectionMiddleware, so it passes the FULL updated document each time.
-
-Overflow handling:
-  - Soft limit (18K chars): a warning is returned telling the agent to
-    consolidate on the next update.
-  - Hard limit (25K chars): a forced LLM-driven rewrite compresses the document.
-    If it still exceeds the limit after rewriting, the save is rejected.
-  - Diff validation: warns when entire ``##`` sections are dropped or when the
-    document shrinks by more than 60%.
-"""
+"""Memory update tools backed by the canonical memory service."""

 from __future__ import annotations

 import logging
-import re
-from typing import Any, Literal
+from typing import Any
 from uuid import UUID

-from langchain_core.messages import HumanMessage
 from langchain_core.tools import tool
-from sqlalchemy import select
 from sqlalchemy.ext.asyncio import AsyncSession

-from app.db import SearchSpace, User, async_session_maker
-from app.utils.content_utils import extract_text_content
+from app.db import async_session_maker
+from app.services.memory import MemoryScope, save_memory

 logger = logging.getLogger(__name__)

-MEMORY_SOFT_LIMIT = 18_000
-MEMORY_HARD_LIMIT = 25_000
-
-_SECTION_HEADING_RE = re.compile(r"^##\s+(.+)$", re.MULTILINE)
-_HEADING_NORMALIZE_RE = re.compile(r"\s+")
-
-_MARKER_RE = re.compile(r"\[(fact|pref|instr)\]")
-_BULLET_FORMAT_RE = re.compile(r"^- \(\d{4}-\d{2}-\d{2}\) \[(fact|pref|instr)\] .+$")
-_PERSONAL_ONLY_MARKERS = {"pref", "instr"}
-
-
-# ---------------------------------------------------------------------------
-# Diff validation
-# ---------------------------------------------------------------------------
-
-
-def _extract_headings(memory: str) -> set[str]:
-    """Return all ``## …`` heading texts (without the ``## `` prefix)."""
-    return set(_SECTION_HEADING_RE.findall(memory))
-
-
-def _normalize_heading(heading: str) -> str:
-    """Normalize heading text for robust scope checks."""
-    return _HEADING_NORMALIZE_RE.sub(" ", heading.strip().lower())
-
-
-def _validate_memory_scope(
-    content: str, scope: Literal["user", "team"]
-) -> dict[str, Any] | None:
-    """Reject personal-only markers ([pref], [instr]) in team memory."""
-    if scope != "team":
-        return None
-
-    markers = set(_MARKER_RE.findall(content))
-    leaked = sorted(markers & _PERSONAL_ONLY_MARKERS)
-    if leaked:
-        tags = ", ".join(f"[{m}]" for m in leaked)
-        return {
-            "status": "error",
-            "message": (
-                f"Team memory cannot include personal markers: {tags}. "
-                "Use [fact] only in team memory."
-            ),
-        }
-    return None
-
-
-def _validate_bullet_format(content: str) -> list[str]:
-    """Return warnings for bullet lines that don't match the required format.
-
-    Expected: ``- (YYYY-MM-DD) [fact|pref|instr] text``
-    """
-    warnings: list[str] = []
-    for line in content.splitlines():
-        stripped = line.strip()
-        if not stripped.startswith("- "):
-            continue
-        if not _BULLET_FORMAT_RE.match(stripped):
-            short = stripped[:80] + ("..." if len(stripped) > 80 else "")
-            warnings.append(f"Malformed bullet: {short}")
-    return warnings
-
-
-def _validate_diff(old_memory: str | None, new_memory: str) -> list[str]:
-    """Return a list of warning strings about suspicious changes."""
-    if not old_memory:
-        return []
-
-    warnings: list[str] = []
-    old_headings = _extract_headings(old_memory)
-    new_headings = _extract_headings(new_memory)
-    dropped = old_headings - new_headings
-    if dropped:
-        names = ", ".join(sorted(dropped))
-        warnings.append(
-            f"Sections removed: {names}. "
-            "If unintentional, the user can restore from the settings page."
-        )
-
-    old_len = len(old_memory)
-    new_len = len(new_memory)
-    if old_len > 0 and new_len < old_len * 0.4:
-        warnings.append(
-            f"Memory shrank significantly ({old_len:,} -> {new_len:,} chars). "
-            "Possible data loss."
-        )
-    return warnings
-
-
-# ---------------------------------------------------------------------------
-# Size validation & soft warning
-# ---------------------------------------------------------------------------
-
-
-def _validate_memory_size(content: str) -> dict[str, Any] | None:
-    """Return an error/warning dict if *content* is too large, else None."""
-    length = len(content)
-    if length > MEMORY_HARD_LIMIT:
-        return {
-            "status": "error",
-            "message": (
-                f"Memory exceeds {MEMORY_HARD_LIMIT:,} character limit "
-                f"({length:,} chars). Consolidate by merging related items, "
-                "removing outdated entries, and shortening descriptions. "
-                "Then call update_memory again."
-            ),
-        }
-    return None
-
-
-def _soft_warning(content: str) -> str | None:
-    """Return a warning string if content exceeds the soft limit."""
-    length = len(content)
-    if length > MEMORY_SOFT_LIMIT:
-        return (
-            f"Memory is at {length:,}/{MEMORY_HARD_LIMIT:,} characters. "
-            "Consolidate by merging related items and removing less important "
-            "entries on your next update."
-        )
-    return None
-
-
-# ---------------------------------------------------------------------------
-# Forced rewrite when memory exceeds the hard limit
-# ---------------------------------------------------------------------------
-
-_FORCED_REWRITE_PROMPT = """\
-You are a memory curator. The following memory document exceeds the character \
-limit and must be shortened.
-
-RULES:
-1. Rewrite the document to be under {target} characters.
-2. Preserve existing ## headings. Every entry must remain under a heading. You may merge
-   or rename headings to consolidate, but keep names personal and descriptive.
-3. Priority for keeping content: [instr] > [pref] > [fact].
-4. Merge duplicate entries, remove outdated entries, shorten verbose descriptions.
-5. Every bullet MUST have format: - (YYYY-MM-DD) [fact|pref|instr] text
-6. Preserve the user's first name in entries — do not replace it with "the user".
-7. Output ONLY the consolidated markdown — no explanations, no wrapping.
-
-<memory_document>
-{content}
-</memory_document>"""
-
-
-async def _forced_rewrite(content: str, llm: Any) -> str | None:
-    """Use a focused LLM call to compress *content* under the hard limit.
-
-    Returns the rewritten string, or ``None`` if the call fails.
-    """
-    try:
-        prompt = _FORCED_REWRITE_PROMPT.format(
-            target=MEMORY_HARD_LIMIT, content=content
-        )
-        response = await llm.ainvoke(
-            [HumanMessage(content=prompt)],
-            config={"tags": ["surfsense:internal"]},
-        )
-        text = extract_text_content(response.content).strip()
-        if not text:
-            logger.warning("Forced rewrite returned empty text; aborting rewrite")
-            return None
-        return text
-    except Exception:
-        logger.exception("Forced rewrite LLM call failed")
-        return None
-
-
-# ---------------------------------------------------------------------------
-# Shared save-and-respond logic
-# ---------------------------------------------------------------------------
-
-
-async def _save_memory(
-    *,
-    updated_memory: str,
-    old_memory: str | None,
-    llm: Any | None,
-    apply_fn,
-    commit_fn,
-    rollback_fn,
-    label: str,
-    scope: Literal["user", "team"],
-) -> dict[str, Any]:
-    """Validate, optionally force-rewrite if over the hard limit, save, and
-    return a response dict.
-
-    Parameters
-    ----------
-    updated_memory : str
-        The new document the agent submitted.
-    old_memory : str | None
-        The previously persisted document (for diff checks).
-    llm : Any | None
-        LLM instance for forced rewrite (may be ``None``).
-    apply_fn : callable(str) -> None
-        Callback that sets the new memory on the ORM object.
-    commit_fn : coroutine
-        ``session.commit``.
-    rollback_fn : coroutine
-        ``session.rollback``.
-    label : str
-        Human label for log messages (e.g. "user memory", "team memory").
-    """
-    if not isinstance(updated_memory, str):
-        logger.warning(
-            "Refusing non-string memory payload (type=%s)",
-            type(updated_memory).__name__,
-        )
-        return {
-            "status": "error",
-            "message": "Internal error: memory payload must be a string.",
-        }
-
-    content = updated_memory
-
-    # --- forced rewrite if over the hard limit ---
-    if len(content) > MEMORY_HARD_LIMIT and llm is not None:
-        rewritten = await _forced_rewrite(content, llm)
-        if rewritten is not None and len(rewritten) < len(content):
-            content = rewritten
-
-    # --- hard-limit gate (reject if still too large after rewrite) ---
-    size_err = _validate_memory_size(content)
-    if size_err:
-        return size_err
-
-    scope_err = _validate_memory_scope(content, scope)
-    if scope_err:
-        return scope_err
-
-    # --- persist ---
-    try:
-        apply_fn(content)
-        await commit_fn()
-    except Exception as e:
-        logger.exception("Failed to update %s: %s", label, e)
-        await rollback_fn()
-        return {"status": "error", "message": f"Failed to update {label}: {e}"}
-
-    # --- build response ---
-    resp: dict[str, Any] = {
-        "status": "saved",
-        "message": f"{label.capitalize()} updated.",
-    }
-
-    if content is not updated_memory:
-        resp["notice"] = "Memory was automatically rewritten to fit within limits."
-
-    diff_warnings = _validate_diff(old_memory, content)
-    if diff_warnings:
-        resp["diff_warnings"] = diff_warnings
-
-    format_warnings = _validate_bullet_format(content)
-    if format_warnings:
-        resp["format_warnings"] = format_warnings
-
-    warning = _soft_warning(content)
-    if warning:
-        resp["warning"] = warning
-
-    return resp
-
-
-# ---------------------------------------------------------------------------
-# Tool factories
-# ---------------------------------------------------------------------------
-

 def create_update_memory_tool(
    user_id: str | UUID,
    db_session: AsyncSession,
    llm: Any | None = None,
 ):
-    """Factory function to create the user-memory update tool.
+    """Factory for the user-memory update tool.

-    The tool acquires its own short-lived ``AsyncSession`` per call via
-    :data:`async_session_maker` so the closure is safe to share across
-    HTTP requests by the compiled-agent cache. Capturing a per-request
-    session here would surface stale/closed sessions on cache hits.
-    The session's bound ``commit``/``rollback`` methods are captured at
-    call time, after ``async with`` has bound ``db_session`` locally.
-
-    Args:
-        user_id: ID of the user whose memory document is being updated.
-        db_session: Reserved for registry compatibility. Per-call sessions
-            are opened via :data:`async_session_maker` inside the tool body.
-        llm: Optional LLM for the forced-rewrite path.
-
-    Returns:
-        Configured update_memory tool for the user-memory scope.
+    Uses a fresh short-lived session per call so compiled-agent caches never
+    retain a stale request-scoped session.
    """
-    del db_session  # per-call session — see docstring
+    del db_session
    uid = UUID(user_id) if isinstance(user_id, str) else user_id

    @tool
    async def update_memory(updated_memory: str) -> dict[str, Any]:
        """Update the user's personal memory document.

-        Your current memory is shown in <user_memory> in the system prompt.
-        When the user shares important long-term information (preferences,
-        facts, instructions, context), rewrite the memory document to include
-        the new information.  Merge new facts with existing ones, update
-        contradictions, remove outdated entries, and keep it concise.
-
-        Args:
-            updated_memory: The FULL updated markdown document (not a diff).
+        The current memory is shown in <user_memory>. Pass the FULL updated
+        markdown document, not a diff.
        """
        try:
            async with async_session_maker() as db_session:
-                result = await db_session.execute(select(User).where(User.id == uid))
-                user = result.scalars().first()
-                if not user:
-                    return {"status": "error", "message": "User not found."}
-
-                old_memory = user.memory_md
-
-                return await _save_memory(
-                    updated_memory=updated_memory,
-                    old_memory=old_memory,
+                result = await save_memory(
+                    scope=MemoryScope.USER,
+                    target_id=uid,
+                    content=updated_memory,
+                    session=db_session,
                    llm=llm,
-                    apply_fn=lambda content: setattr(user, "memory_md", content),
-                    commit_fn=db_session.commit,
-                    rollback_fn=db_session.rollback,
-                    label="memory",
-                    scope="user",
                )
+                return result.to_dict()
        except Exception as e:
            logger.exception("Failed to update user memory: %s", e)
-            return {
-                "status": "error",
-                "message": f"Failed to update memory: {e}",
-            }
+            return {"status": "error", "message": f"Failed to update memory: {e}"}

    return update_memory

@ -373,64 +57,26 @@ def create_update_team_memory_tool(
    db_session: AsyncSession,
    llm: Any | None = None,
 ):
-    """Factory function to create the team-memory update tool.
-
-    The tool acquires its own short-lived ``AsyncSession`` per call via
-    :data:`async_session_maker` so the closure is safe to share across
-    HTTP requests by the compiled-agent cache. Capturing a per-request
-    session here would surface stale/closed sessions on cache hits.
-    The session's bound ``commit``/``rollback`` methods are captured at
-    call time, after ``async with`` has bound ``db_session`` locally.
-
-    Args:
-        search_space_id: ID of the search space whose team memory is being
-            updated.
-        db_session: Reserved for registry compatibility. Per-call sessions
-            are opened via :data:`async_session_maker` inside the tool body.
-        llm: Optional LLM for the forced-rewrite path.
-
-    Returns:
-        Configured update_memory tool for the team-memory scope.
-    """
-    del db_session  # per-call session — see docstring
+    """Factory for the team-memory update tool."""
+    del db_session

    @tool
    async def update_memory(updated_memory: str) -> dict[str, Any]:
        """Update the team's shared memory document for this search space.

-        Your current team memory is shown in <team_memory> in the system
-        prompt.  When the team shares important long-term information
-        (decisions, conventions, key facts, priorities), rewrite the memory
-        document to include the new information.  Merge new facts with
-        existing ones, update contradictions, remove outdated entries, and
-        keep it concise.
-
-        Args:
-            updated_memory: The FULL updated markdown document (not a diff).
+        The current team memory is shown in <team_memory>. Pass the FULL updated
+        markdown document, not a diff.
        """
        try:
            async with async_session_maker() as db_session:
-                result = await db_session.execute(
-                    select(SearchSpace).where(SearchSpace.id == search_space_id)
-                )
-                space = result.scalars().first()
-                if not space:
-                    return {"status": "error", "message": "Search space not found."}
-
-                old_memory = space.shared_memory_md
-
-                return await _save_memory(
-                    updated_memory=updated_memory,
-                    old_memory=old_memory,
+                result = await save_memory(
+                    scope=MemoryScope.TEAM,
+                    target_id=search_space_id,
+                    content=updated_memory,
+                    session=db_session,
                    llm=llm,
-                    apply_fn=lambda content: setattr(
-                        space, "shared_memory_md", content
-                    ),
-                    commit_fn=db_session.commit,
-                    rollback_fn=db_session.rollback,
-                    label="team memory",
-                    scope="team",
                )
+                return result.to_dict()
        except Exception as e:
            logger.exception("Failed to update team memory: %s", e)
            return {
@ -439,3 +85,9 @@ def create_update_team_memory_tool(
            }

    return update_memory
+
+
+__all__ = [
+    "create_update_memory_tool",
+    "create_update_team_memory_tool",
+]
--- a/surfsense_backend/app/routes/init.py
+++ b/surfsense_backend/app/routes/init.py
@ -54,6 +54,7 @@ from .search_spaces_routes import router as search_spaces_router
 from .slack_add_connector_route import router as slack_add_connector_router
 from .stripe_routes import router as stripe_router
 from .surfsense_docs_routes import router as surfsense_docs_router
+from .team_memory_routes import router as team_memory_router
 from .teams_add_connector_route import router as teams_add_connector_router
 from .video_presentations_routes import router as video_presentations_router
 from .vision_llm_routes import router as vision_llm_router
@ -117,3 +118,4 @@ router.include_router(stripe_router)  # Stripe checkout for additional page pack
 router.include_router(youtube_router)  # YouTube playlist resolution
 router.include_router(prompts_router)
 router.include_router(memory_router)  # User personal memory (memory.md style)
+router.include_router(team_memory_router)  # Search-space team memory
--- a/surfsense_backend/app/routes/memory_routes.py
+++ b/surfsense_backend/app/routes/memory_routes.py
@ -1,75 +1,40 @@
-"""Routes for user memory management (personal memory.md)."""
+"""Routes for user memory management."""

 from __future__ import annotations

-import logging
-
 from fastapi import APIRouter, Depends, HTTPException
-from langchain_core.messages import HumanMessage
 from pydantic import BaseModel
 from sqlalchemy.ext.asyncio import AsyncSession

-from app.agents.new_chat.llm_config import (
-    create_chat_litellm_from_agent_config,
-    load_agent_llm_config_for_search_space,
-)
-from app.agents.new_chat.tools.update_memory import MEMORY_HARD_LIMIT, _save_memory
 from app.db import User, get_async_session
+from app.services.memory import (
+    MemoryRead,
+    MemoryScope,
+    memory_limits,
+    read_memory,
+    reset_memory,
+    save_memory,
+)
 from app.users import current_active_user
-from app.utils.content_utils import extract_text_content
-
-logger = logging.getLogger(__name__)

 router = APIRouter()


-class MemoryRead(BaseModel):
-    memory_md: str
-
-
 class MemoryUpdate(BaseModel):
    memory_md: str


-class MemoryEditRequest(BaseModel):
-    query: str
-    search_space_id: int
-
-
-_MEMORY_EDIT_PROMPT = """\
-You are a memory editor. The user wants to modify their memory document. \
-Apply the user's instruction to the existing memory document and output the \
-FULL updated document.
-
-RULES:
-1. If the instruction asks to add something, add it with format: \
- (YYYY-MM-DD) [fact|pref|instr] text, under an existing or new ## heading. \
-Heading names should be personal and descriptive, not generic categories.
-2. If the instruction asks to remove something, remove the matching entry.
-3. If the instruction asks to change something, update the matching entry.
-4. Preserve existing ## headings and all other entries.
-5. Every bullet must include a marker: [fact], [pref], or [instr].
-6. Use the user's first name (from <user_name>) in entries instead of "the user".
-7. Output ONLY the updated markdown — no explanations, no wrapping.
-
-<user_name>{user_name}</user_name>
-
-<current_memory>
-{current_memory}
-</current_memory>
-
-<user_instruction>
-{instruction}
-</user_instruction>"""
-
-
@router.get("/users/me/memory", response_model=MemoryRead)
 async def get_user_memory(
    user: User = Depends(current_active_user),
    session: AsyncSession = Depends(get_async_session),
 ):
-    await session.refresh(user, ["memory_md"])
-    return MemoryRead(memory_md=user.memory_md or "")
+    memory_md = await read_memory(
+        scope=MemoryScope.USER,
+        target_id=user.id,
+        session=session,
+    )
+    return MemoryRead(memory_md=memory_md, limits=memory_limits())


@router.put("/users/me/memory", response_model=MemoryRead)
@ -78,73 +43,27 @@ async def update_user_memory(
    user: User = Depends(current_active_user),
    session: AsyncSession = Depends(get_async_session),
 ):
-    if len(body.memory_md) > MEMORY_HARD_LIMIT:
-        raise HTTPException(
-            status_code=400,
-            detail=f"Memory exceeds {MEMORY_HARD_LIMIT:,} character limit ({len(body.memory_md):,} chars).",
-        )
-    user.memory_md = body.memory_md
-    session.add(user)
-    await session.commit()
-    await session.refresh(user, ["memory_md"])
-    return MemoryRead(memory_md=user.memory_md or "")
+    result = await save_memory(
+        scope=MemoryScope.USER,
+        target_id=user.id,
+        content=body.memory_md,
+        session=session,
+    )
+    if result.status == "error":
+        raise HTTPException(status_code=400, detail=result.message)
+    return MemoryRead(memory_md=result.memory_md, limits=memory_limits())


-@router.post("/users/me/memory/edit", response_model=MemoryRead)
-async def edit_user_memory(
-    body: MemoryEditRequest,
+@router.post("/users/me/memory/reset", response_model=MemoryRead)
+async def reset_user_memory(
    user: User = Depends(current_active_user),
    session: AsyncSession = Depends(get_async_session),
 ):
-    """Apply a natural language edit to the user's personal memory via LLM."""
-    agent_config = await load_agent_llm_config_for_search_space(
-        session, body.search_space_id
+    result = await reset_memory(
+        scope=MemoryScope.USER,
+        target_id=user.id,
+        session=session,
    )
-    if not agent_config:
-        raise HTTPException(status_code=500, detail="No LLM configuration available.")
-    llm = create_chat_litellm_from_agent_config(agent_config)
-    if not llm:
-        raise HTTPException(status_code=500, detail="Failed to create LLM instance.")
-
-    await session.refresh(user, ["memory_md", "display_name"])
-    current_memory = user.memory_md or ""
-    first_name = (
-        user.display_name.strip().split()[0]
-        if user.display_name and user.display_name.strip()
-        else "The user"
-    )
-
-    prompt = _MEMORY_EDIT_PROMPT.format(
-        current_memory=current_memory or "(empty)",
-        instruction=body.query,
-        user_name=first_name,
-    )
-    try:
-        response = await llm.ainvoke(
-            [HumanMessage(content=prompt)],
-            config={"tags": ["surfsense:internal", "memory-edit"]},
-        )
-        updated = extract_text_content(response.content).strip()
-    except Exception as e:
-        logger.exception("Memory edit LLM call failed: %s", e)
-        raise HTTPException(status_code=500, detail="Memory edit failed.") from e
-
-    if not updated:
-        raise HTTPException(status_code=400, detail="LLM returned empty result.")
-
-    result = await _save_memory(
-        updated_memory=updated,
-        old_memory=current_memory,
-        llm=llm,
-        apply_fn=lambda content: setattr(user, "memory_md", content),
-        commit_fn=session.commit,
-        rollback_fn=session.rollback,
-        label="memory",
-        scope="user",
-    )
-
-    if result.get("status") == "error":
-        raise HTTPException(status_code=400, detail=result["message"])
-
-    await session.refresh(user, ["memory_md"])
-    return MemoryRead(memory_md=user.memory_md or "")
+    if result.status == "error":
+        raise HTTPException(status_code=400, detail=result.message)
+    return MemoryRead(memory_md=result.memory_md, limits=memory_limits())
--- a/surfsense_backend/app/routes/search_spaces_routes.py
+++ b/surfsense_backend/app/routes/search_spaces_routes.py
@ -1,17 +1,10 @@
 import logging

 from fastapi import APIRouter, Depends, HTTPException
-from langchain_core.messages import HumanMessage
-from pydantic import BaseModel as PydanticBaseModel
 from sqlalchemy import func, update
 from sqlalchemy.ext.asyncio import AsyncSession
 from sqlalchemy.future import select

-from app.agents.new_chat.llm_config import (
-    create_chat_litellm_from_agent_config,
-    load_agent_llm_config_for_search_space,
-)
-from app.agents.new_chat.tools.update_memory import MEMORY_HARD_LIMIT, _save_memory
 from app.config import config
 from app.db import (
    ImageGenerationConfig,
@ -35,7 +28,6 @@ from app.schemas import (
    SearchSpaceWithStats,
 )
 from app.users import current_active_user
-from app.utils.content_utils import extract_text_content
 from app.utils.rbac import check_permission, check_search_space_access

 logger = logging.getLogger(__name__)
@ -43,34 +35,6 @@ logger = logging.getLogger(__name__)
 router = APIRouter()


-class _TeamMemoryEditRequest(PydanticBaseModel):
-    query: str
-
-
-_TEAM_MEMORY_EDIT_PROMPT = """\
-You are a memory editor for a team workspace. The user wants to modify the \
-team's shared memory document. Apply the user's instruction to the existing \
-memory document and output the FULL updated document.
-
-RULES:
-1. If the instruction asks to add something, add it with format: \
- (YYYY-MM-DD) [fact] text, under an existing or new ## heading. \
-Heading names should be descriptive, not generic categories.
-2. If the instruction asks to remove something, remove the matching entry.
-3. If the instruction asks to change something, update the matching entry.
-4. Preserve existing ## headings and all other entries.
-5. NEVER use [pref] or [instr] markers. Team memory uses [fact] only.
-6. Output ONLY the updated markdown — no explanations, no wrapping.
-
-<current_memory>
-{current_memory}
-</current_memory>
-
-<user_instruction>
-{instruction}
-</user_instruction>"""
-
-
 async def create_default_roles_and_membership(
    session: AsyncSession,
    search_space_id: int,
@ -294,15 +258,6 @@ async def update_search_space(

        update_data = search_space_update.model_dump(exclude_unset=True)

-        if (
-            "shared_memory_md" in update_data
-            and len(update_data["shared_memory_md"] or "") > MEMORY_HARD_LIMIT
-        ):
-            raise HTTPException(
-                status_code=400,
-                detail=f"Team memory exceeds {MEMORY_HARD_LIMIT:,} character limit.",
-            )
-
        for key, value in update_data.items():
            setattr(db_search_space, key, value)
        await session.commit()
@ -317,72 +272,6 @@ async def update_search_space(
        ) from e


-@router.post(
-    "/searchspaces/{search_space_id}/memory/edit",
-    response_model=SearchSpaceRead,
-)
-async def edit_team_memory(
-    search_space_id: int,
-    body: _TeamMemoryEditRequest,
-    session: AsyncSession = Depends(get_async_session),
-    user: User = Depends(current_active_user),
-):
-    """Apply a natural language edit to the team memory via LLM."""
-    await check_search_space_access(session, user, search_space_id)
-
-    agent_config = await load_agent_llm_config_for_search_space(
-        session, search_space_id
-    )
-    if not agent_config:
-        raise HTTPException(status_code=500, detail="No LLM configuration available.")
-    llm = create_chat_litellm_from_agent_config(agent_config)
-    if not llm:
-        raise HTTPException(status_code=500, detail="Failed to create LLM instance.")
-
-    result = await session.execute(
-        select(SearchSpace).filter(SearchSpace.id == search_space_id)
-    )
-    db_search_space = result.scalars().first()
-    if not db_search_space:
-        raise HTTPException(status_code=404, detail="Search space not found")
-
-    current_memory = db_search_space.shared_memory_md or ""
-
-    prompt = _TEAM_MEMORY_EDIT_PROMPT.format(
-        current_memory=current_memory or "(empty)",
-        instruction=body.query,
-    )
-    try:
-        response = await llm.ainvoke(
-            [HumanMessage(content=prompt)],
-            config={"tags": ["surfsense:internal", "memory-edit"]},
-        )
-        updated = extract_text_content(response.content).strip()
-    except Exception as e:
-        logger.exception("Team memory edit LLM call failed: %s", e)
-        raise HTTPException(status_code=500, detail="Team memory edit failed.") from e
-
-    if not updated:
-        raise HTTPException(status_code=400, detail="LLM returned empty result.")
-
-    save_result = await _save_memory(
-        updated_memory=updated,
-        old_memory=current_memory,
-        llm=llm,
-        apply_fn=lambda content: setattr(db_search_space, "shared_memory_md", content),
-        commit_fn=session.commit,
-        rollback_fn=session.rollback,
-        label="team memory",
-        scope="team",
-    )
-
-    if save_result.get("status") == "error":
-        raise HTTPException(status_code=400, detail=save_result["message"])
-
-    await session.refresh(db_search_space)
-    return db_search_space
-
-
@router.post("/searchspaces/{search_space_id}/ai-sort")
 async def trigger_ai_sort(
    search_space_id: int,
--- a/surfsense_backend/app/routes/team_memory_routes.py
+++ b/surfsense_backend/app/routes/team_memory_routes.py
@ -0,0 +1,76 @@
+"""Routes for search-space team memory."""
+
+from __future__ import annotations
+
+from fastapi import APIRouter, Depends, HTTPException
+from pydantic import BaseModel
+from sqlalchemy.ext.asyncio import AsyncSession
+
+from app.db import User, get_async_session
+from app.services.memory import (
+    MemoryRead,
+    MemoryScope,
+    memory_limits,
+    read_memory,
+    reset_memory,
+    save_memory,
+)
+from app.users import current_active_user
+from app.utils.rbac import check_search_space_access
+
+router = APIRouter()
+
+
+class TeamMemoryUpdate(BaseModel):
+    memory_md: str
+
+
+@router.get("/searchspaces/{search_space_id}/memory", response_model=MemoryRead)
+async def get_team_memory(
+    search_space_id: int,
+    session: AsyncSession = Depends(get_async_session),
+    user: User = Depends(current_active_user),
+):
+    await check_search_space_access(session, user, search_space_id)
+    memory_md = await read_memory(
+        scope=MemoryScope.TEAM,
+        target_id=search_space_id,
+        session=session,
+    )
+    return MemoryRead(memory_md=memory_md, limits=memory_limits())
+
+
+@router.put("/searchspaces/{search_space_id}/memory", response_model=MemoryRead)
+async def update_team_memory(
+    search_space_id: int,
+    body: TeamMemoryUpdate,
+    session: AsyncSession = Depends(get_async_session),
+    user: User = Depends(current_active_user),
+):
+    await check_search_space_access(session, user, search_space_id)
+    result = await save_memory(
+        scope=MemoryScope.TEAM,
+        target_id=search_space_id,
+        content=body.memory_md,
+        session=session,
+    )
+    if result.status == "error":
+        raise HTTPException(status_code=400, detail=result.message)
+    return MemoryRead(memory_md=result.memory_md, limits=memory_limits())
+
+
+@router.post("/searchspaces/{search_space_id}/memory/reset", response_model=MemoryRead)
+async def reset_team_memory(
+    search_space_id: int,
+    session: AsyncSession = Depends(get_async_session),
+    user: User = Depends(current_active_user),
+):
+    await check_search_space_access(session, user, search_space_id)
+    result = await reset_memory(
+        scope=MemoryScope.TEAM,
+        target_id=search_space_id,
+        session=session,
+    )
+    if result.status == "error":
+        raise HTTPException(status_code=400, detail=result.message)
+    return MemoryRead(memory_md=result.memory_md, limits=memory_limits())
--- a/surfsense_backend/app/schemas/search_space.py
+++ b/surfsense_backend/app/schemas/search_space.py
@ -21,7 +21,6 @@ class SearchSpaceUpdate(BaseModel):
    description: str | None = None
    citations_enabled: bool | None = None
    qna_custom_instructions: str | None = None
-    shared_memory_md: str | None = None
    ai_file_sort_enabled: bool | None = None


--- a/surfsense_backend/app/services/memory/init.py
+++ b/surfsense_backend/app/services/memory/init.py
@ -0,0 +1,32 @@
+"""First-class memory service for user and team markdown memory."""
+
+from .schemas import MemoryLimits, MemoryRead
+from .service import (
+    MemoryScope,
+    SaveResult,
+    memory_limits,
+    read_memory,
+    reset_memory,
+    save_memory,
+)
+from .validation import (
+    MEMORY_HARD_LIMIT,
+    MEMORY_SOFT_LIMIT,
+    validate_bullet_format,
+    validate_memory_scope,
+)
+
+__all__ = [
+    "MEMORY_HARD_LIMIT",
+    "MEMORY_SOFT_LIMIT",
+    "MemoryLimits",
+    "MemoryRead",
+    "MemoryScope",
+    "SaveResult",
+    "memory_limits",
+    "read_memory",
+    "reset_memory",
+    "save_memory",
+    "validate_bullet_format",
+    "validate_memory_scope",
+]
--- a/surfsense_backend/app/services/memory/document.py
+++ b/surfsense_backend/app/services/memory/document.py
@ -0,0 +1,200 @@
+"""Memory-specific markdown document model and canonical renderer.
+
+This intentionally parses only SurfSense memory's small markdown contract:
+``##`` sections with dated bullet items. Unknown lines are preserved so user
+edits are not lost, while legacy marker bullets are normalized on render.
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from datetime import date
+
+DEFAULT_LEGACY_SECTION = "Memory"
+LEGACY_MARKERS = frozenset({"fact", "pref", "instr"})
+
+
+@dataclass(frozen=True)
+class MemoryBullet:
+    entry_date: date
+    text: str
+
+
+@dataclass(frozen=True)
+class MemoryRawLine:
+    text: str
+
+
+MemoryLine = MemoryBullet | MemoryRawLine
+
+
+@dataclass(frozen=True)
+class MemorySection:
+    heading: str
+    lines: list[MemoryLine] = field(default_factory=list)
+    explicit_heading: bool = True
+
+
+@dataclass(frozen=True)
+class MemoryDocument:
+    sections: list[MemorySection] = field(default_factory=list)
+
+    @property
+    def has_explicit_heading(self) -> bool:
+        return any(section.explicit_heading for section in self.sections)
+
+
+def is_section_heading(line: str) -> bool:
+    return line.startswith("## ") and bool(line[3:].strip())
+
+
+def heading_text(line: str) -> str:
+    return line[3:].strip()
+
+
+def normalize_heading(heading: str) -> str:
+    chars: list[str] = []
+    previous_was_space = True
+    for char in heading.strip().lower():
+        if char.isalnum():
+            chars.append(char)
+            previous_was_space = False
+        elif not previous_was_space:
+            chars.append(" ")
+            previous_was_space = True
+    return "".join(chars).strip()
+
+
+def parse_bullet_line(line: str) -> MemoryBullet | None:
+    stripped = line.strip()
+    if not stripped.startswith("- "):
+        return None
+
+    body = stripped[2:]
+    parsed = _parse_canonical_bullet(body)
+    if parsed is not None:
+        return parsed
+    return _parse_legacy_bullet(body)
+
+
+def _parse_canonical_bullet(body: str) -> MemoryBullet | None:
+    if len(body) < 13 or body[10:12] != ": ":
+        return None
+    try:
+        entry_date = date.fromisoformat(body[:10])
+    except ValueError:
+        return None
+    text = body[12:].strip()
+    if not text:
+        return None
+    return MemoryBullet(entry_date=entry_date, text=text)
+
+
+def _parse_legacy_bullet(body: str) -> MemoryBullet | None:
+    if len(body) < 20 or not body.startswith("("):
+        return None
+    if len(body) < 14 or body[11:14] != ") [":
+        return None
+    try:
+        entry_date = date.fromisoformat(body[1:11])
+    except ValueError:
+        return None
+
+    marker_end = body.find("] ", 14)
+    if marker_end == -1:
+        return None
+    marker = body[14:marker_end]
+    if marker not in LEGACY_MARKERS:
+        return None
+
+    text = body[marker_end + 2 :].strip()
+    if not text:
+        return None
+    return MemoryBullet(entry_date=entry_date, text=text)
+
+
+def parse_memory_document(content: str | None) -> MemoryDocument:
+    if not content:
+        return MemoryDocument()
+
+    sections: list[MemorySection] = []
+    current_heading: str | None = None
+    current_explicit = True
+    current_lines: list[MemoryLine] = []
+
+    def flush_current() -> None:
+        nonlocal current_heading, current_explicit, current_lines
+        if current_heading is None:
+            return
+        sections.append(
+            MemorySection(
+                heading=current_heading,
+                lines=current_lines,
+                explicit_heading=current_explicit,
+            )
+        )
+        current_heading = None
+        current_explicit = True
+        current_lines = []
+
+    for raw_line in content.strip().splitlines():
+        line = raw_line.rstrip()
+        if is_section_heading(line):
+            flush_current()
+            current_heading = heading_text(line)
+            current_explicit = True
+            current_lines = []
+            continue
+
+        bullet = parse_bullet_line(line)
+        if current_heading is None:
+            if bullet is None:
+                continue
+            current_heading = DEFAULT_LEGACY_SECTION
+            current_explicit = False
+            current_lines = [bullet]
+            continue
+
+        current_lines.append(bullet if bullet is not None else MemoryRawLine(text=line))
+
+    flush_current()
+    return MemoryDocument(sections=sections)
+
+
+def render_memory_document(document: MemoryDocument) -> str:
+    rendered_sections: list[str] = []
+    for section in document.sections:
+        section_lines = [f"## {section.heading}"]
+        for line in section.lines:
+            if isinstance(line, MemoryBullet):
+                section_lines.append(f"- {line.entry_date.isoformat()}: {line.text}")
+            else:
+                section_lines.append(line.text)
+        rendered_sections.append("\n".join(section_lines).strip())
+    return "\n\n".join(section for section in rendered_sections if section).strip()
+
+
+def extract_headings(memory: str | None) -> set[str]:
+    document = parse_memory_document(memory)
+    return {
+        normalize_heading(section.heading)
+        for section in document.sections
+        if section.explicit_heading
+    }
+
+
+def has_explicit_heading(content: str) -> bool:
+    return parse_memory_document(content).has_explicit_heading
+
+
+def nonstandard_bullets(content: str) -> list[str]:
+    warnings: list[str] = []
+    for line in content.splitlines():
+        stripped = line.strip()
+        if not stripped.startswith("- "):
+            continue
+        if parse_bullet_line(stripped) is not None:
+            continue
+        short = stripped[:80] + ("..." if len(stripped) > 80 else "")
+        warnings.append(f"Non-standard memory bullet: {short}")
+    return warnings
--- a/surfsense_backend/app/services/memory/prompts.py
+++ b/surfsense_backend/app/services/memory/prompts.py
@ -0,0 +1,20 @@
+"""Prompts used by the memory service."""
+
+FORCED_REWRITE_PROMPT = """\
+You are a memory curator. The following memory document exceeds the character \
+limit and must be shortened.
+
+RULES:
+1. Rewrite the document to be under {target} characters.
+2. Output Markdown only. Use clear `##` headings and concise bullet points.
+3. New-format bullets should look like: `- YYYY-MM-DD: memory text`.
+4. If the input contains legacy markers like `(YYYY-MM-DD) [fact]`, preserve the
+   information but remove the inline marker in the output.
+5. Preserve durable instructions and preferences before generic facts when
+   compressing personal memory.
+6. Preserve existing headings when useful; merge duplicate headings and bullets.
+7. Output ONLY the consolidated markdown — no explanations, no wrapping.
+
+<memory_document>
+{content}
+</memory_document>"""
--- a/surfsense_backend/app/services/memory/rewrite.py
+++ b/surfsense_backend/app/services/memory/rewrite.py
@ -0,0 +1,35 @@
+"""LLM-backed memory rewrite helpers."""
+
+from __future__ import annotations
+
+import logging
+from typing import Any
+
+from langchain_core.messages import HumanMessage
+
+from app.services.memory.prompts import FORCED_REWRITE_PROMPT
+from app.services.memory.validation import MEMORY_HARD_LIMIT
+from app.utils.content_utils import extract_text_content
+
+logger = logging.getLogger(__name__)
+
+
+async def forced_rewrite(content: str, llm: Any) -> str | None:
+    """Use a focused LLM call to compress memory under the hard limit."""
+    try:
+        prompt = FORCED_REWRITE_PROMPT.format(
+            target=MEMORY_HARD_LIMIT,
+            content=content,
+        )
+        response = await llm.ainvoke(
+            [HumanMessage(content=prompt)],
+            config={"tags": ["surfsense:internal", "memory-rewrite"]},
+        )
+        text = extract_text_content(response.content).strip()
+        if not text:
+            logger.warning("Forced memory rewrite returned empty text")
+            return None
+        return text
+    except Exception:
+        logger.exception("Forced memory rewrite LLM call failed")
+        return None
--- a/surfsense_backend/app/services/memory/schemas.py
+++ b/surfsense_backend/app/services/memory/schemas.py
@ -0,0 +1,19 @@
+"""Schemas for memory API responses and structured extraction."""
+
+from __future__ import annotations
+
+from pydantic import BaseModel
+
+
+class MemoryLimits(BaseModel):
+    """Canonical memory size limits exposed to clients."""
+
+    soft: int
+    hard: int
+
+
+class MemoryRead(BaseModel):
+    """Memory document payload returned by user and team memory APIs."""
+
+    memory_md: str
+    limits: MemoryLimits
--- a/surfsense_backend/app/services/memory/service.py
+++ b/surfsense_backend/app/services/memory/service.py
@ -0,0 +1,247 @@
+"""Canonical read/write/reset/extract service for markdown memory."""
+
+from __future__ import annotations
+
+import logging
+from dataclasses import dataclass, field
+from enum import StrEnum
+from typing import Any, Literal
+from uuid import UUID
+
+from sqlalchemy import select
+from sqlalchemy.ext.asyncio import AsyncSession
+
+from app.db import SearchSpace, User
+from app.services.memory.document import parse_memory_document, render_memory_document
+from app.services.memory.rewrite import forced_rewrite
+from app.services.memory.schemas import MemoryLimits
+from app.services.memory.validation import (
+    MEMORY_HARD_LIMIT,
+    MEMORY_SOFT_LIMIT,
+    soft_limit_warning,
+    strip_preamble_to_first_heading,
+    validate_bullet_format,
+    validate_diff,
+    validate_heading_sanity,
+    validate_memory_scope,
+    validate_memory_size,
+)
+
+logger = logging.getLogger(__name__)
+
+_NO_UPDATE_SENTINELS = frozenset(
+    {
+        "NO_UPDATE",
+        "NO UPDATE",
+        "NO_CHANGE",
+        "NO CHANGE",
+    }
+)
+
+
+class MemoryScope(StrEnum):
+    USER = "user"
+    TEAM = "team"
+
+
+@dataclass(frozen=True)
+class SaveResult:
+    status: Literal["saved", "error", "no_op"]
+    message: str
+    memory_md: str = ""
+    warnings: list[str] = field(default_factory=list)
+    diff_warnings: list[str] = field(default_factory=list)
+    format_warnings: list[str] = field(default_factory=list)
+    notice: str | None = None
+
+    def to_dict(self) -> dict[str, Any]:
+        data: dict[str, Any] = {
+            "status": self.status,
+            "message": self.message,
+            "memory_md": self.memory_md,
+        }
+        if self.notice:
+            data["notice"] = self.notice
+        if self.warnings:
+            data["warnings"] = self.warnings
+            if len(self.warnings) == 1:
+                data["warning"] = self.warnings[0]
+        if self.diff_warnings:
+            data["diff_warnings"] = self.diff_warnings
+        if self.format_warnings:
+            data["format_warnings"] = self.format_warnings
+        return data
+
+
+def memory_limits() -> MemoryLimits:
+    return MemoryLimits(soft=MEMORY_SOFT_LIMIT, hard=MEMORY_HARD_LIMIT)
+
+
+def _normalize_scope(scope: MemoryScope | str) -> MemoryScope:
+    return scope if isinstance(scope, MemoryScope) else MemoryScope(scope)
+
+
+def _normalize_user_id(target_id: str | UUID) -> UUID:
+    return UUID(target_id) if isinstance(target_id, str) else target_id
+
+
+async def _load_target(
+    *,
+    scope: MemoryScope | str,
+    target_id: str | int | UUID,
+    session: AsyncSession,
+) -> User | SearchSpace | None:
+    normalized = _normalize_scope(scope)
+    if normalized is MemoryScope.USER:
+        result = await session.execute(
+            select(User).where(User.id == _normalize_user_id(target_id))  # type: ignore[arg-type]
+        )
+        return result.scalars().first()
+    result = await session.execute(
+        select(SearchSpace).where(SearchSpace.id == int(target_id))
+    )
+    return result.scalars().first()
+
+
+def _get_memory(target: User | SearchSpace, scope: MemoryScope) -> str:
+    if scope is MemoryScope.USER:
+        return getattr(target, "memory_md", None) or ""
+    return getattr(target, "shared_memory_md", None) or ""
+
+
+def _set_memory(target: User | SearchSpace, scope: MemoryScope, content: str) -> None:
+    if scope is MemoryScope.USER:
+        target.memory_md = content
+    else:
+        target.shared_memory_md = content
+
+
+async def read_memory(
+    *,
+    scope: MemoryScope | str,
+    target_id: str | int | UUID,
+    session: AsyncSession,
+) -> str:
+    normalized = _normalize_scope(scope)
+    target = await _load_target(scope=normalized, target_id=target_id, session=session)
+    if target is None:
+        return ""
+    return _get_memory(target, normalized)
+
+
+async def save_memory(
+    *,
+    scope: MemoryScope | str,
+    target_id: str | int | UUID,
+    content: str,
+    session: AsyncSession,
+    llm: Any | None = None,
+) -> SaveResult:
+    normalized = _normalize_scope(scope)
+    if not isinstance(content, str):
+        return SaveResult(
+            status="error",
+            message="Internal error: memory payload must be a string.",
+        )
+
+    target = await _load_target(scope=normalized, target_id=target_id, session=session)
+    if target is None:
+        return SaveResult(
+            status="error",
+            message="User not found."
+            if normalized is MemoryScope.USER
+            else "Search space not found.",
+        )
+
+    old_memory = _get_memory(target, normalized)
+    next_content = strip_preamble_to_first_heading(content.strip())
+    notice: str | None = None
+    warnings: list[str] = []
+
+    if next_content.upper() in _NO_UPDATE_SENTINELS:
+        return SaveResult(
+            status="no_op",
+            message="No memory update requested.",
+            memory_md=old_memory,
+        )
+
+    if len(next_content) > MEMORY_HARD_LIMIT and llm is not None:
+        rewritten = await forced_rewrite(next_content, llm)
+        if rewritten is not None and len(rewritten) < len(next_content):
+            next_content = strip_preamble_to_first_heading(rewritten)
+            notice = "Memory was automatically rewritten to fit within limits."
+
+    for validation in (
+        validate_memory_size(next_content),
+        validate_heading_sanity(next_content),
+    ):
+        if validation:
+            return SaveResult(
+                status="error",
+                message=validation["message"],
+                memory_md=old_memory,
+            )
+
+    scope_error, scope_warnings = validate_memory_scope(
+        next_content,
+        normalized.value,
+        old_memory=old_memory,
+    )
+    warnings.extend(scope_warnings)
+    if scope_error:
+        return SaveResult(
+            status="error",
+            message=scope_error["message"],
+            memory_md=old_memory,
+            warnings=warnings,
+        )
+
+    next_content = render_memory_document(parse_memory_document(next_content))
+
+    try:
+        _set_memory(target, normalized, next_content)
+        session.add(target)
+        await session.commit()
+    except Exception as e:
+        logger.exception("Failed to update %s memory: %s", normalized.value, e)
+        await session.rollback()
+        return SaveResult(
+            status="error",
+            message=f"Failed to update {normalized.value} memory: {e}",
+            memory_md=old_memory,
+        )
+
+    diff_warnings = validate_diff(old_memory, next_content)
+    format_warnings = validate_bullet_format(next_content)
+    warning = soft_limit_warning(next_content)
+    if warning:
+        warnings.append(warning)
+
+    return SaveResult(
+        status="saved",
+        message=(
+            "Memory updated."
+            if normalized is MemoryScope.USER
+            else "Team memory updated."
+        ),
+        memory_md=next_content,
+        warnings=warnings,
+        diff_warnings=diff_warnings,
+        format_warnings=format_warnings,
+        notice=notice,
+    )
+
+
+async def reset_memory(
+    *,
+    scope: MemoryScope | str,
+    target_id: str | int | UUID,
+    session: AsyncSession,
+) -> SaveResult:
+    return await save_memory(
+        scope=scope,
+        target_id=target_id,
+        content="",
+        session=session,
+        llm=None,
+    )
--- a/surfsense_backend/app/services/memory/validation.py
+++ b/surfsense_backend/app/services/memory/validation.py
@ -0,0 +1,140 @@
+"""Validation helpers for markdown-backed memory."""
+
+from __future__ import annotations
+
+from typing import Literal
+
+from app.services.memory.document import (
+    extract_headings,
+    has_explicit_heading,
+    nonstandard_bullets,
+    parse_memory_document,
+)
+
+MEMORY_SOFT_LIMIT = 18_000
+MEMORY_HARD_LIMIT = 25_000
+
+_FORBIDDEN_TEAM_HEADINGS = {
+    "preferences",
+    "instructions",
+    "personal notes",
+    "personal instructions",
+}
+
+
+def has_markdown_heading(content: str) -> bool:
+    return has_explicit_heading(content)
+
+
+def strip_preamble_to_first_heading(content: str) -> str:
+    """Drop model preamble before the first ``##`` heading, if one exists."""
+    lines = content.splitlines()
+    for index, line in enumerate(lines):
+        if line.startswith("## ") and line[3:].strip():
+            return "\n".join(lines[index:]).strip()
+    return content.strip()
+
+
+def validate_memory_size(content: str) -> dict[str, str] | None:
+    length = len(content)
+    if length > MEMORY_HARD_LIMIT:
+        return {
+            "status": "error",
+            "message": (
+                f"Memory exceeds {MEMORY_HARD_LIMIT:,} character limit "
+                f"({length:,} chars). Consolidate by merging related items, "
+                "removing outdated entries, and shortening descriptions."
+            ),
+        }
+    return None
+
+
+def validate_heading_sanity(content: str) -> dict[str, str] | None:
+    """Block long prose blobs without headings unless they are legacy bullets."""
+    stripped = content.strip()
+    if not stripped:
+        return None
+    if has_markdown_heading(stripped):
+        return None
+    if len(stripped) <= 40:
+        return None
+    if parse_memory_document(stripped).sections:
+        return None
+    return {
+        "status": "error",
+        "message": "Memory must be markdown with at least one ## heading.",
+    }
+
+
+def validate_memory_scope(
+    content: str,
+    scope: Literal["user", "team"],
+    *,
+    old_memory: str | None = None,
+) -> tuple[dict[str, str] | None, list[str]]:
+    """Reject new personal headings in team memory, grandfather existing ones."""
+    if scope != "team":
+        return None, []
+
+    old_forbidden = extract_headings(old_memory) & _FORBIDDEN_TEAM_HEADINGS
+    new_forbidden = extract_headings(content) & _FORBIDDEN_TEAM_HEADINGS
+    introduced = sorted(new_forbidden - old_forbidden)
+    grandfathered = sorted(new_forbidden & old_forbidden)
+
+    warnings: list[str] = []
+    if grandfathered:
+        warnings.append(
+            "Team memory contains legacy personal headings: "
+            + ", ".join(grandfathered)
+            + ". Please consolidate them into team-safe headings."
+        )
+    if introduced:
+        return (
+            {
+                "status": "error",
+                "message": (
+                    "Team memory cannot introduce personal headings: "
+                    + ", ".join(introduced)
+                    + ". Use team-safe headings instead."
+                ),
+            },
+            warnings,
+        )
+    return None, warnings
+
+
+def validate_bullet_format(content: str) -> list[str]:
+    return nonstandard_bullets(content)
+
+
+def validate_diff(old_memory: str | None, new_memory: str) -> list[str]:
+    if not old_memory:
+        return []
+
+    warnings: list[str] = []
+    old_headings = extract_headings(old_memory)
+    new_headings = extract_headings(new_memory)
+    dropped = old_headings - new_headings
+    if dropped:
+        names = ", ".join(sorted(dropped))
+        warnings.append(
+            f"Sections removed: {names}. If unintentional, restore them from the memory document."
+        )
+
+    old_len = len(old_memory)
+    new_len = len(new_memory)
+    if old_len > 0 and new_len < old_len * 0.4:
+        warnings.append(
+            f"Memory shrank significantly ({old_len:,} -> {new_len:,} chars). Possible data loss."
+        )
+    return warnings
+
+
+def soft_limit_warning(content: str) -> str | None:
+    length = len(content)
+    if length > MEMORY_SOFT_LIMIT:
+        return (
+            f"Memory is at {length:,}/{MEMORY_HARD_LIMIT:,} characters. "
+            "Consolidate by merging related items and removing less important entries."
+        )
+    return None
--- a/surfsense_backend/app/tasks/chat/stream_new_chat.py
+++ b/surfsense_backend/app/tasks/chat/stream_new_chat.py
@ -39,10 +39,6 @@ from app.agents.new_chat.llm_config import (
    load_agent_config,
    load_global_llm_config_by_id,
 )
-from app.agents.new_chat.memory_extraction import (
-    extract_and_save_memory,
-    extract_and_save_team_memory,
-)
 from app.agents.new_chat.mention_resolver import resolve_mentions, substitute_in_text
 from app.agents.new_chat.middleware.busy_mutex import (
    end_turn,
@ -281,7 +277,6 @@ class StreamResult:
    accumulated_text: str = ""
    is_interrupted: bool = False
    sandbox_files: list[str] = field(default_factory=list)
-    agent_called_update_memory: bool = False
    request_id: str | None = None
    turn_id: str = ""
    filesystem_mode: str = "cloud"
@ -1992,36 +1987,6 @@ async def stream_new_chat(
                },
            )

-        # Fire background memory extraction if the agent didn't handle it.
-        # Shared threads write to team memory; private threads write to user memory.
-        if not stream_result.agent_called_update_memory:
-            memory_seed = user_query.strip() or (
-                f"[{len(user_image_data_urls or [])} image(s)]"
-                if user_image_data_urls
-                else "(message)"
-            )
-            if visibility == ChatVisibility.SEARCH_SPACE:
-                task = asyncio.create_task(
-                    extract_and_save_team_memory(
-                        user_message=memory_seed,
-                        search_space_id=search_space_id,
-                        llm=llm,
-                        author_display_name=current_user_display_name,
-                    )
-                )
-                _background_tasks.add(task)
-                task.add_done_callback(_background_tasks.discard)
-            elif user_id:
-                task = asyncio.create_task(
-                    extract_and_save_memory(
-                        user_message=memory_seed,
-                        user_id=user_id,
-                        llm=llm,
-                    )
-                )
-                _background_tasks.add(task)
-                task.add_done_callback(_background_tasks.discard)
-
        # Finish the step and message
        yield streaming_service.format_data("turn-status", {"status": "idle"})
        yield streaming_service.format_finish_step()
--- a/surfsense_backend/app/tasks/chat/streaming/graph_stream/event_stream.py
+++ b/surfsense_backend/app/tasks/chat/streaming/graph_stream/event_stream.py
@ -48,4 +48,3 @@ async def stream_output(
        yield frame

    result.accumulated_text = state.accumulated_text
-    result.agent_called_update_memory = state.called_update_memory
--- a/surfsense_backend/app/tasks/chat/streaming/graph_stream/result.py
+++ b/surfsense_backend/app/tasks/chat/streaming/graph_stream/result.py
@ -11,7 +11,6 @@ class StreamingResult:
    accumulated_text: str = ""
    is_interrupted: bool = False
    sandbox_files: list[str] = field(default_factory=list)
-    agent_called_update_memory: bool = False
    request_id: str | None = None
    turn_id: str = ""
    filesystem_mode: str = "cloud"
--- a/surfsense_backend/app/tasks/chat/streaming/handlers/tool_end.py
+++ b/surfsense_backend/app/tasks/chat/streaming/handlers/tool_end.py
@ -36,9 +36,6 @@ def iter_tool_end_frames(
    raw_output = event.get("data", {}).get("output", "")
    staged_file_path = state.file_path_by_run.pop(run_id, None) if run_id else None

-    if tool_name == "update_memory":
-        state.called_update_memory = True
-
    if hasattr(raw_output, "content"):
        content = raw_output.content
        if isinstance(content, str):
--- a/surfsense_backend/app/tasks/chat/streaming/relay/state.py
+++ b/surfsense_backend/app/tasks/chat/streaming/relay/state.py
@ -32,7 +32,6 @@ class AgentEventRelayState:
    last_active_step_items: list[str] = field(default_factory=list)
    just_finished_tool: bool = False
    active_tool_depth: int = 0
-    called_update_memory: bool = False
    current_reasoning_id: str | None = None
    pending_tool_call_chunks: list[dict[str, Any]] = field(default_factory=list)
    lc_tool_call_id_by_run: dict[str, str] = field(default_factory=dict)