Add the checkpointed CitationRegistry (load/merge helpers + state field)
and a lightweight CitationStateMiddleware so subagents can register into
the same conversation registry. Resolve [n] -> [citation:<payload>] at
stream finalize from the registry, polymorphically by source type.
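
The finalize-time resolution can be sketched as below. This is a minimal illustration, not the actual implementation: the registry is modelled as a plain dict keyed by label number, the class names are hypothetical, and the compact-JSON payload encoding is an assumption. Only the [n] and [citation:<payload>] shapes and the two locator forms come from the notes in this log.

```python
import json
import re
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class KbSource:  # hypothetical name; KB locator {document_id, chunk_id}
    document_id: int
    chunk_id: int

    def payload(self) -> str:
        # Assumed encoding: compact JSON of the locator.
        data = {"document_id": self.document_id, "chunk_id": self.chunk_id}
        return json.dumps(data, separators=(",", ":"))


@dataclass(frozen=True)
class WebSource:  # hypothetical name; web locator {url}
    url: str

    def payload(self) -> str:
        return json.dumps({"url": self.url}, separators=(",", ":"))


Source = Union[KbSource, WebSource]
_LABEL = re.compile(r"\[(\d+)\]")


def resolve_citations(text: str, registry: Dict[int, Source]) -> str:
    """Replace each registered [n] with [citation:<payload>]."""

    def _swap(match: "re.Match[str]") -> str:
        source = registry.get(int(match.group(1)))
        if source is None:
            # Unknown labels are left untouched rather than guessed at.
            return match.group(0)
        # Each source type builds its own payload (the polymorphic step).
        return f"[citation:{source.payload()}]"

    return _LABEL.sub(_swap, text)
```

Because each source type owns its payload() method, adding a new source kind would not require touching the resolver itself.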
Add a shared document_render package that renders sources as
<document view="excerpt|full"> blocks with server-assigned [n] passage
labels (KB locator {document_id, chunk_id}, web locator {url}). Wire the
KB read backend (kb_postgres) and read_file to the new renderer and drop
the legacy per-document XML renderer (document_xml, retrieved_context) and
the old chunk_index / matched="true" / <chunk id> read format.
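
As a rough picture of the new read format, the sketch below renders one KB document as a <document view="..."> block with server-assigned [n] labels and returns the label-to-locator map alongside it. The Passage type, the function name, the title attribute, and the start_label parameter are all assumptions made for illustration; only the view attribute, the [n] labels, and the {document_id, chunk_id} locator are taken from the description above.

```python
from dataclasses import dataclass
from html import escape
from typing import Dict, List, Tuple


@dataclass
class Passage:  # hypothetical shape of one retrieved chunk
    chunk_id: int
    text: str


def render_document(
    document_id: int,
    title: str,
    passages: List[Passage],
    view: str = "excerpt",
    start_label: int = 1,
) -> Tuple[str, Dict[int, dict]]:
    """Return (block, labels), where labels maps n to a KB locator."""
    if view not in ("excerpt", "full"):
        raise ValueError(f"unsupported view: {view!r}")

    lines = [f'<document view="{view}" title="{escape(title)}">']
    labels: Dict[int, dict] = {}
    for n, passage in enumerate(passages, start=start_label):
        # The server, not the model, decides which number a passage gets.
        labels[n] = {"document_id": document_id, "chunk_id": passage.chunk_id}
        lines.append(f"[{n}] {escape(passage.text)}")
    lines.append("</document>")
    return "\n".join(lines), labels
```

The returned labels map is what a registry would need to store so that a later [n] in model output can be traced back to its passage.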
Mermaid diagram rendering was wired up upstream, but the mermaid package was
never declared as a dependency, which broke the dev build ("Module not found:
Can't resolve 'mermaid'"). Add it to package.json and update the lockfile.

The composer cleared the live mention atom synchronously on send (via the
editor reset), which raced the async onNew handler that read it — dropping
every @-mention (docs, folders, connectors, and the new chat references)
from the request.

handleSubmit now snapshots the chips before clearing, and onNew consumes
that snapshot (falling back to the live atom for the send-button path),
derives the payload via deriveMentionedPayload, and sends mentioned_thread_ids.

Add a Chats tab to the mention picker (excluding the current chat), carry
the "thread" kind through the inline editor's chip nodes, and render thread
chips on user messages with navigation to the referenced conversation.

Add the "thread" mention kind (makeThreadMention + stable dedup key) so a
chat can be referenced like a document. Also introduce submittedMentionsAtom
and a pure deriveMentionedPayload() helper, the building blocks for capturing
chips at submit time and mapping them to backend payload buckets.
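
The chip-to-bucket mapping can be sketched in a language-neutral way. Python is used here purely for illustration, since the real deriveMentionedPayload() is frontend code. Only the "thread" kind and the mentioned_thread_ids bucket are named in this log; the other kind and bucket names, and the "kind:id" dedup key format, are assumptions.

```python
from typing import Dict, List, Set, TypedDict


class Mention(TypedDict):
    kind: str
    id: int


# Hypothetical kind -> bucket table; only the "thread" row is from the log.
BUCKETS: Dict[str, str] = {
    "document": "mentioned_document_ids",
    "folder": "mentioned_folder_ids",
    "connector": "mentioned_connector_ids",
    "thread": "mentioned_thread_ids",
}


def dedup_key(mention: Mention) -> str:
    # Assumed key format: stable across renders because it ignores labels.
    return f"{mention['kind']}:{mention['id']}"


def derive_mentioned_payload(chips: List[Mention]) -> Dict[str, List[int]]:
    """Pure function: same chips in, same payload out, no atom reads."""
    payload: Dict[str, List[int]] = {bucket: [] for bucket in BUCKETS.values()}
    seen: Set[str] = set()
    for chip in chips:
        bucket = BUCKETS.get(chip["kind"])
        key = dedup_key(chip)
        if bucket is None or key in seen:
            continue  # unknown kind or duplicate chip
        seen.add(key)
        payload[bucket].append(chip["id"])
    return payload
```

Keeping the helper pure means it can run on a snapshot taken at submit time, independent of whatever the live state holds by the time the request is built.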
Mirror search_threads visibility in the referenced-chat resolver: a
search-space owner can now @-mention legacy threads that predate creator
tracking (null created_by_id), instead of those being silently dropped.
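
The visibility rule can be sketched as a small predicate. This is an illustration under stated assumptions, not the resolver's actual code: the ThreadRow shape and function name are hypothetical, and "visible" is assumed to mean "created by the requesting user, or a legacy thread in a search space that user owns". Anything else is rejected, in keeping with a fail-closed resolver.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class ThreadRow:  # hypothetical row shape
    id: int
    search_space_id: int
    created_by_id: Optional[int]  # None for threads predating creator tracking


def can_reference(
    thread: ThreadRow, user_id: int, owned_search_space_ids: Set[int]
) -> bool:
    """Return True only when one of the allow rules matches."""
    if thread.created_by_id == user_id:
        return True
    # Legacy threads have no creator; fall back to search-space ownership.
    if thread.created_by_id is None:
        return thread.search_space_id in owned_search_space_ids
    # Fail closed: another user's thread is never referenced.
    return False
```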
Add unit tests for role-specific turn extraction in the resolver and for
the transcript renderer: full rendering within budget, dropping oldest
turns with a marker, partial-tail fill of an overflowing turn, and
multi-chat tagging.

Pass mentioned_thread_ids from the route through the orchestrator into
input-state assembly, resolve them for the requesting user, and append
the rendered referenced-chat block to the agent's query context.

Add the referenced_chat_context slice: models for the data shapes, a
fail-closed resolver that fetches mentioned threads and their visible
turns under the same access rules as thread search, and a transcript
renderer that emits a budgeted <referenced_chat_context> block. When a
chat exceeds the per-reference character budget, the most recent turns are
kept whole, any leftover budget is filled with the tail of the next-older
turn that overflows, and truncation markers signal the cut.
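
The budgeting behaviour described above can be sketched as follows. This is a simplified illustration rather than the shipped renderer: the marker strings, the <chat thread_id="..."> wrapper, and the function names are assumptions, and the sketch counts only the characters of each "role: text" line (newlines and the omitted-turns marker are not charged to the budget).

```python
from typing import Dict, List, Tuple

Turn = Tuple[str, str]  # (role, text)
OMITTED = "[earlier turns omitted]"  # hypothetical marker text
CUT = "[...]"  # hypothetical marker for a turn whose head was cut


def fit_turns(turns: List[Turn], budget: int) -> List[str]:
    """Keep the newest turns whole, then fill leftover budget with a tail."""
    kept: List[str] = []  # collected newest-first, reversed at the end
    remaining = budget
    dropped = 0  # number of older turns left out entirely
    for index in range(len(turns) - 1, -1, -1):
        role, text = turns[index]
        line = f"{role}: {text}"
        if len(line) <= remaining:
            kept.append(line)
            remaining -= len(line)
            continue
        # This turn overflows: keep as much of its tail as still fits.
        prefix = f"{role}: {CUT}"
        room = remaining - len(prefix)
        if room > 0:
            kept.append(prefix + text[-room:])
            dropped = index  # everything older than this turn
        else:
            dropped = index + 1  # this turn and everything older
        break
    kept.reverse()
    if dropped:
        kept.insert(0, OMITTED)
    return kept


def render_referenced_chats(chats: Dict[int, List[Turn]], budget: int) -> str:
    """Render every referenced chat, each with its own character budget."""
    out = ["<referenced_chat_context>"]
    for thread_id, turns in chats.items():
        out.append(f'<chat thread_id="{thread_id}">')  # tagging is assumed
        out.extend(fit_turns(turns, budget))
        out.append("</chat>")
    out.append("</referenced_chat_context>")
    return "\n".join(out)
```

Walking from newest to oldest favours the end of a conversation, which is usually where the conclusions a follow-up question depends on are found.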