The pull-based KB design (on-demand search_knowledge_base tool + pre-injected
workspace tree) fully replaced the old eager retrieval path. Remove its last
remnants:
- Delete KnowledgePriorityMiddleware (knowledge_search.py) and its tests.
- Drop the kb_priority state field + reducer default; trim
KbContextProjectionMiddleware to project only workspace_tree_text.
- Remove the now-dead feature flags enable_kb_priority_preinjection and
enable_kb_planner_runnable across backend (flags, route schema, tests,
env examples) and frontend (settings toggle, zod schema).
- Scrub <priority_documents> and stale KnowledgePriorityMiddleware references
from prompts, docstrings, and the ADR.
No functional change: nothing wrote kb_priority and neither flag gated live
behavior after the cutover. Full backend suite green (pre-existing unrelated
failures aside).
The model frequently writes citations glued to the preceding word
(docs[17]); the (?<!\w) lookbehind (added to dodge arr[1] array indexing)
silently skipped these, leaving raw [n] that fails to render and reads
like array access. Drop the lookbehind so glued citations resolve; genuine
code/array syntax stays protected by the existing code-region carve-out and
unresolved ordinals still drop harmlessly.
Rewrite the main-agent citation contract to a single [n] channel and sync
the orphaned system_prompt_composer surface to match; drop stale
[citation:chunk_id] / <chunk_index> references from dynamic_context and
provider hints. Reuse the shared hybrid search in the deliverables report
(citations omitted for now) and delete the orphaned report KB helper.
Remove the dead eager KnowledgePriorityMiddleware wiring (knowledge_priority
+ stack) and its legacy browse test. Update ADR 0001 to reflect the cutover.
web_search now registers each result as a WEB_RESULT (locator {url}) and
renders a <web_results> block of <document view="excerpt"> [n] passages,
returning Command(update={messages, citation_registry}) like
search_knowledge_base. Collapse the duplicate research-subagent web_search
into the shared tool and teach the prompts to cite web hits with [n].
The main agent's search_knowledge_base tool runs the hybrid spine, renders
a <retrieved_context> of numbered [n] passages, and persists the registry.
KB subagent prompts teach citing [n] from <document view="full"> reads
(evidence.chunk_ids -> evidence.citations). Delete the now-unused
search->read highlighting hand-off: the kb_matched_chunk_ids state field,
its reducer default, the tool's _matched_chunk_ids writer, and the dead
KnowledgePriorityMiddleware writes.
Add the checkpointed CitationRegistry (load/merge helpers + state field)
and a lightweight CitationStateMiddleware so subagents can register into
the same conversation registry. Resolve [n] -> [citation:<payload>] at
stream finalize from the registry, polymorphically by source type.
Add a shared document_render package that renders sources as
<document view="excerpt|full"> blocks with server-assigned [n] passage
labels (KB locator {document_id, chunk_id}, web locator {url}). Wire the
KB read backend (kb_postgres) and read_file to the new renderer and drop
the legacy per-document XML renderer (document_xml, retrieved_context) and
the old chunk_index / matched="true" / <chunk id> read format.
Some OpenAI-compatible image backends (e.g. Xinference) return a relative
URL like /files/image.png in data[0].url instead of an absolute one.
Browsers cannot resolve these, causing images to fail to load.
Track the provider's api_base after resolving model config via to_litellm().
When the returned URL starts with "/", extract the origin (scheme + host + port)
from api_base and prepend it to produce a full absolute URL.
No behaviour change for providers that return absolute URLs (OpenAI, Azure, etc).
Closes#1496
Presentation and citation ordering moves off Chunk.id/created_at to the
explicit position column (id kept as tiebreaker). Vector and ts_rank
ranking order_by clauses are untouched.
document_converters, the github size-fallback chunker, revert_service
restores, and the kb-persistence middleware now write explicit positions
(the middleware read path also orders by position).