mirror of
https://github.com/samvallad33/vestige.git
synced 2026-06-30 21:59:39 +02:00
feat(connectors): add Redmine and source filters (#57)
parent
50e7f2d0fb
commit
4e893c02ff
9 changed files with 1445 additions and 108 deletions
45
CHANGELOG.md
|
|
@@ -10,11 +10,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
||||||
> Bump `version` in the workspace `Cargo.toml`, both crates, `server.json`, and
|
> Bump `version` in the workspace `Cargo.toml`, both crates, `server.json`, and
|
||||||
> `package.json` to `2.1.27` at release/tag time, and date this heading.
|
> `package.json` to `2.1.27` at release/tag time, and date this heading.
|
||||||
|
|
||||||
Roadmap [#57](https://github.com/samvallad33/vestige/issues/57), **Phase 1–3**:
|
Roadmap [#57](https://github.com/samvallad33/vestige/issues/57), **Phases 1–4
|
||||||
Vestige can now act as a durable, local, semantically-searchable retrieval layer
|
(complete)**: Vestige can now act as a durable, local, semantically-searchable
|
||||||
over an external system of record — starting with GitHub Issues — without
|
retrieval layer over an external system of record — GitHub Issues and Redmine —
|
||||||
replacing it. The external system stays canonical; Vestige **indexes, connects,
|
without replacing it. The external system stays canonical; Vestige **indexes,
|
||||||
retrieves, and cites back** to the source record.
|
connects, retrieves, and cites back** to the source record.
|
||||||
|
|
||||||
Unlike a live ticket-system MCP proxy (which holds no state and is rate-limited
|
Unlike a live ticket-system MCP proxy (which holds no state and is rate-limited
|
||||||
per query), Vestige keeps a durable embedded index: searchable **offline**,
|
per query), Vestige keeps a durable embedded index: searchable **offline**,
|
||||||
|
|
@@ -25,12 +25,22 @@ content-hash idempotent sync, and tombstoning of vanished records.
|
||||||
|
|
||||||
### Added
|
### Added
|
||||||
|
|
||||||
- **`source_sync` MCP tool** — point Vestige at a GitHub repo
|
- **`source_sync` MCP tool** — index an external system into Vestige.
|
||||||
(`{"repo": "owner/name"}`) and it indexes every issue + its comments as
|
- GitHub: `{"source": "github", "repo": "owner/name"}` indexes every issue +
|
||||||
source-aware memories. Re-running updates changed issues in place (no
|
its comments. Auth via `GITHUB_TOKEN` (public repos work tokenless at a
|
||||||
duplicates); `reconcile: true` tombstones issues no longer visible upstream.
|
lower rate limit).
|
||||||
Auth via the `GITHUB_TOKEN` (or `VESTIGE_GITHUB_TOKEN`) environment variable;
|
- Redmine: `{"source": "redmine", "project": "<id>"}` indexes a project's
|
||||||
public repos work without a token at a lower rate limit.
|
issues + journals (comments and status/assignment history). Host from
|
||||||
|
`REDMINE_URL`, auth from `REDMINE_API_KEY`.
|
||||||
|
- Re-running updates changed issues in place (no duplicates); `reconcile:
|
||||||
|
true` tombstones issues no longer visible upstream.
|
||||||
|
- **Source-aware investigation filters on `search`** (Phase 4) — filter results
|
||||||
|
by `source_system`, `source_project`, `source_id`, `source_type`,
|
||||||
|
`source_author`, a `source_updated_after`/`source_updated_before` date range,
|
||||||
|
and `source_status` (`valid` / `tombstoned` / `any`). Status, tracker, and
|
||||||
|
priority remain filterable via the existing `tag_prefix` (the connectors emit
|
||||||
|
`status:`/`tracker:`/`priority:`/`label:` tags). Applied as post-filters;
|
||||||
|
non-connector memories are excluded from a source-scoped query.
|
||||||
- **Source envelope** on every memory — structured, machine-readable provenance
|
- **Source envelope** on every memory — structured, machine-readable provenance
|
||||||
(`source_system`, `source_id`, `source_url`, `source_updated_at`,
|
(`source_system`, `source_id`, `source_url`, `source_updated_at`,
|
||||||
`content_hash`, `synced_at`, `source_project`, `source_type`, `source_author`)
|
`content_hash`, `synced_at`, `source_project`, `source_type`, `source_author`)
|
||||||
|
|
@@ -44,10 +54,15 @@ content-hash idempotent sync, and tombstoning of vanished records.
|
||||||
record is retained for audit but drops out of current retrieval).
|
record is retained for audit but drops out of current retrieval).
|
||||||
- **Connector contract** (`vestige_core::connectors`) — a small source-agnostic
|
- **Connector contract** (`vestige_core::connectors`) — a small source-agnostic
|
||||||
`Connector` trait + `run_sync` driver (cursor overlap window, incremental
|
`Connector` trait + `run_sync` driver (cursor overlap window, incremental
|
||||||
paging, optional deletion reconcile) and a GitHub Issues reference connector
|
paging, optional deletion reconcile) with two reference connectors behind the
|
||||||
behind the optional `connectors` cargo feature (on by default in the MCP
|
optional `connectors` cargo feature (on by default in the MCP server, off in
|
||||||
server, off in the core library's default features so non-connector consumers
|
the core library's default features so non-connector consumers link no HTTP
|
||||||
link no HTTP client).
|
client):
|
||||||
|
- **GitHub Issues** — `state=all`, `since` cursor, Link-header pagination,
|
||||||
|
drops PRs, host-pinned next-url.
|
||||||
|
- **Redmine** — `status_id=*` (open + closed), hex-encoded `updated_on>=`
|
||||||
|
cursor, `offset` pagination, per-issue detail fetch for journals (the list
|
||||||
|
endpoint omits them), `X-Redmine-API-Key` header auth.
|
||||||
|
|
||||||
### Database
|
### Database
|
||||||
|
|
||||||
|
|
|
||||||
|
|
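The Redmine bullet above mentions a hex-encoded `updated_on>=` cursor. As a rough illustration of why the raw `>=…` value has to be encoded exactly once, here is a minimal sketch. Both `percent_encode` and `updated_on_param` are hypothetical helpers written for this example; they are not part of Vestige or `reqwest`, and the real query serializer may differ in details such as which characters it leaves untouched.

```rust
/// Percent-encode every byte except the RFC 3986 "unreserved" set.
/// Hypothetical stand-in for whatever encoder the HTTP client applies.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for b in input.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(b as char)
            }
            // Everything else becomes %XX with uppercase hex digits.
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Build the `updated_on` query pair from a raw RFC 3339 timestamp.
/// The `>=` operator is passed raw and encoded once, here.
fn updated_on_param(since_rfc3339: &str) -> String {
    let raw = format!(">={since_rfc3339}");
    format!("updated_on={}", percent_encode(&raw))
}
```

Calling `updated_on_param("2026-06-19T00:00:00Z")` yields `updated_on=%3E%3D2026-06-19T00%3A00%3A00Z`. Feeding an already-encoded `%3E%3D` through the encoder again would produce `%253E%253D`, which is the double-encoding the connector avoids by handing the raw value to the query builder.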
@@ -490,25 +490,25 @@ mod tests {
|
||||||
fn hash_stable_across_label_order_and_changes_on_edit() {
|
fn hash_stable_across_label_order_and_changes_on_edit() {
|
||||||
let c = connector();
|
let c = connector();
|
||||||
let mut a = issue(1, "T", "body", "open");
|
let mut a = issue(1, "T", "body", "open");
|
||||||
a.labels = vec![
|
a.labels = vec![RawLabel { name: "b".into() }, RawLabel { name: "a".into() }];
|
||||||
RawLabel { name: "b".into() },
|
|
||||||
RawLabel { name: "a".into() },
|
|
||||||
];
|
|
||||||
let mut b = issue(1, "T", "body", "open");
|
let mut b = issue(1, "T", "body", "open");
|
||||||
b.labels = vec![
|
b.labels = vec![RawLabel { name: "a".into() }, RawLabel { name: "b".into() }];
|
||||||
RawLabel { name: "a".into() },
|
|
||||||
RawLabel { name: "b".into() },
|
|
||||||
];
|
|
||||||
let ha = c.normalize(&a, &[]).envelope.content_hash;
|
let ha = c.normalize(&a, &[]).envelope.content_hash;
|
||||||
let hb = c.normalize(&b, &[]).envelope.content_hash;
|
let hb = c.normalize(&b, &[]).envelope.content_hash;
|
||||||
assert_eq!(ha, hb, "label order must not change the hash");
|
assert_eq!(ha, hb, "label order must not change the hash");
|
||||||
|
|
||||||
// Editing the body must change the hash.
|
// Editing the body must change the hash.
|
||||||
let edited = c.normalize(&issue(1, "T", "EDITED", "open"), &[]).envelope.content_hash;
|
let edited = c
|
||||||
|
.normalize(&issue(1, "T", "EDITED", "open"), &[])
|
||||||
|
.envelope
|
||||||
|
.content_hash;
|
||||||
assert_ne!(ha, edited);
|
assert_ne!(ha, edited);
|
||||||
|
|
||||||
// Closing the issue changes state → changes the hash (not a no-op).
|
// Closing the issue changes state → changes the hash (not a no-op).
|
||||||
let closed = c.normalize(&issue(1, "T", "body", "closed"), &[]).envelope.content_hash;
|
let closed = c
|
||||||
|
.normalize(&issue(1, "T", "body", "closed"), &[])
|
||||||
|
.envelope
|
||||||
|
.content_hash;
|
||||||
assert_ne!(ha, closed);
|
assert_ne!(ha, closed);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
@@ -533,7 +533,10 @@ mod tests {
|
||||||
let second_pos = rec.content.find("second").unwrap();
|
let second_pos = rec.content.find("second").unwrap();
|
||||||
assert!(first_pos < second_pos, "comments must fold in id order");
|
assert!(first_pos < second_pos, "comments must fold in id order");
|
||||||
|
|
||||||
let no_comments = c.normalize(&issue(1, "T", "body", "open"), &[]).envelope.content_hash;
|
let no_comments = c
|
||||||
|
.normalize(&issue(1, "T", "body", "open"), &[])
|
||||||
|
.envelope
|
||||||
|
.content_hash;
|
||||||
assert_ne!(
|
assert_ne!(
|
||||||
rec.envelope.content_hash, no_comments,
|
rec.envelope.content_hash, no_comments,
|
||||||
"comments must contribute to the hash"
|
"comments must contribute to the hash"
|
||||||
|
|
|
||||||
|
|
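The tests above assert that label order does not affect the hash and that comments fold in id order. One way to get both properties is to canonicalize before hashing. The sketch below assumes a simplified `Comment` type and two hypothetical helpers, `canonical_labels` and `fold_comments`; the connector's actual `normalize` may be structured differently.

```rust
/// Simplified comment shape, assumed for this illustration only.
struct Comment {
    id: u64,
    body: String,
}

/// Sort label names so that ["b", "a"] and ["a", "b"] canonicalize identically.
fn canonical_labels(labels: &[&str]) -> String {
    let mut sorted: Vec<&str> = labels.to_vec();
    sorted.sort_unstable();
    sorted.join(",")
}

/// Fold comment bodies in ascending id order, whatever order the API returned them in.
fn fold_comments(comments: &[Comment]) -> String {
    let mut ordered: Vec<&Comment> = comments.iter().collect();
    ordered.sort_by_key(|c| c.id);
    ordered
        .iter()
        .map(|c| c.body.as_str())
        .collect::<Vec<_>>()
        .join("\n")
}
```

Hashing the canonical strings rather than the raw API order means a re-sync that returns the same labels or comments in a different order is treated as a no-op.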
@@ -11,9 +11,9 @@
|
||||||
//! - The [`Connector`] contract, [`NormalizedRecord`] shape, and the stable
|
//! - The [`Connector`] contract, [`NormalizedRecord`] shape, and the stable
|
||||||
//! [`content_hash`] are pure (no network) and always compiled, so the sync
|
//! [`content_hash`] are pure (no network) and always compiled, so the sync
|
||||||
//! semantics are unit-testable without hitting an API.
|
//! semantics are unit-testable without hitting an API.
|
||||||
//! - Network-backed reference connectors (e.g. [`github`]) live behind the
|
//! - Network-backed reference connectors ([`github`] and [`redmine`]) live
|
||||||
//! `connectors` cargo feature so the default local-first build links no HTTP
|
//! behind the `connectors` cargo feature so the default local-first build
|
||||||
//! client.
|
//! links no HTTP client.
|
||||||
//!
|
//!
|
||||||
//! ## Sync contract (the part that makes re-running safe)
|
//! ## Sync contract (the part that makes re-running safe)
|
||||||
//!
|
//!
|
||||||
|
|
@@ -39,6 +39,9 @@ use crate::storage::ConnectorCursor;
|
||||||
#[cfg(feature = "connectors")]
|
#[cfg(feature = "connectors")]
|
||||||
pub mod github;
|
pub mod github;
|
||||||
|
|
||||||
|
#[cfg(feature = "connectors")]
|
||||||
|
pub mod redmine;
|
||||||
|
|
||||||
/// A single external record, already normalized into the fields Vestige needs.
|
/// A single external record, already normalized into the fields Vestige needs.
|
||||||
///
|
///
|
||||||
/// The connector is responsible for flattening a possibly-rich source record
|
/// The connector is responsible for flattening a possibly-rich source record
|
||||||
|
|
@@ -329,8 +332,16 @@ mod tests {
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn content_hash_is_order_independent() {
|
fn content_hash_is_order_independent() {
|
||||||
let a = content_hash(&[("title", "Crash"), ("body", "stacktrace"), ("state", "open")]);
|
let a = content_hash(&[
|
||||||
let b = content_hash(&[("state", "open"), ("title", "Crash"), ("body", "stacktrace")]);
|
("title", "Crash"),
|
||||||
|
("body", "stacktrace"),
|
||||||
|
("state", "open"),
|
||||||
|
]);
|
||||||
|
let b = content_hash(&[
|
||||||
|
("state", "open"),
|
||||||
|
("title", "Crash"),
|
||||||
|
("body", "stacktrace"),
|
||||||
|
]);
|
||||||
assert_eq!(a, b, "reordering fields must not change the hash");
|
assert_eq!(a, b, "reordering fields must not change the hash");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
||||||
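The `content_hash_is_order_independent` test above requires that reordering the `(name, value)` pairs leaves the hash unchanged. The diff does not show how `content_hash` is implemented, so the following is only a sketch of one approach: sort the pairs, join them with separator bytes, and hash the result. `content_hash_sketch` and `fnv1a64` are hypothetical names, and FNV-1a is used purely because it is short and deterministic, not because the project uses it.

```rust
/// FNV-1a, 64-bit. Chosen for brevity here; not a claim about the real hash function.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3); // FNV prime
    }
    h
}

/// Hypothetical stand-in for `content_hash`: order-independent over its fields.
fn content_hash_sketch(fields: &[(&str, &str)]) -> String {
    // Sorting by (name, value) removes any dependence on caller order.
    let mut sorted: Vec<(&str, &str)> = fields.to_vec();
    sorted.sort();
    let mut buf = Vec::new();
    for (name, value) in sorted {
        buf.extend_from_slice(name.as_bytes());
        buf.push(0x1f); // unit separator between name and value
        buf.extend_from_slice(value.as_bytes());
        buf.push(0x1e); // record separator between fields
    }
    format!("{:016x}", fnv1a64(&buf))
}
```

The separator bytes keep adjacent fields from running together, so `("ab", "c")` and `("a", "bc")` serialize differently before hashing.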
737
crates/vestige-core/src/connectors/redmine.rs
Normal file
|
|
@@ -0,0 +1,737 @@
|
||||||
|
//! Redmine connector (#57).
|
||||||
|
//!
|
||||||
|
//! Indexes a Redmine project's issues + journals (comments and status/assignment
|
||||||
|
//! history) into source-aware Vestige memories so an investigative agent can
|
||||||
|
//! search and reason over years of ticket history **offline**, **semantically**,
|
||||||
|
//! and **cited back to the canonical issue URL**. Redmine stays the system of
|
||||||
|
//! record; Vestige indexes, connects, retrieves, and links back.
|
||||||
|
//!
|
||||||
|
//! ## Incremental sync (per the connector sync contract)
|
||||||
|
//!
|
||||||
|
//! Redmine's REST API has three traps this connector handles explicitly (all
|
||||||
|
//! confirmed against the official wiki + canonical defects):
|
||||||
|
//!
|
||||||
|
//! - **`status_id=*` is mandatory.** The list endpoint returns *open issues
|
||||||
|
//! only* by default, so without it closing an issue looks like a deletion and
|
||||||
|
//! closed issues are never synced (Defect #19088). We pass it on both the
|
||||||
|
//! incremental pull and the reconcile enumeration.
|
||||||
|
//! - **`include=journals` is silently ignored on the list endpoint.** Journals
|
||||||
|
//! come back only on the per-issue detail endpoint `GET /issues/{id}.json`
|
||||||
|
//! (Defect #35242), so each changed issue costs one extra round-trip.
|
||||||
|
//! - **Filter operators must be hex-encoded** in the compact form
|
||||||
|
//! (`updated_on=>=…` → `updated_on=%3E%3D…`). We build the query with
|
||||||
|
//! `reqwest`'s `.query(&[…])` and pass the raw `>=…` value so it is encoded
|
||||||
|
//! exactly once (no double-encoding).
|
||||||
|
//!
|
||||||
|
//! `sort=updated_on:asc` pages forward in cursor order so a mid-run interruption
|
||||||
|
//! resumes safely; the `since = cursor − overlap` window + the `content_hash`
|
||||||
|
//! no-op make the re-scan free. Redmine has no deletion feed, so deletions are
|
||||||
|
//! reconciled out-of-band via [`list_live_ids`](Connector::list_live_ids).
|
||||||
|
|
||||||
|
use chrono::{DateTime, Utc};
|
||||||
|
use serde::Deserialize;
|
||||||
|
|
||||||
|
use super::{
|
||||||
|
Connector, ConnectorError, ConnectorResult, FetchPage, NormalizedRecord, content_hash,
|
||||||
|
};
|
||||||
|
use crate::memory::SourceEnvelope;
|
||||||
|
|
||||||
|
const USER_AGENT: &str = concat!("vestige-connector/", env!("CARGO_PKG_VERSION"));
|
||||||
|
const PAGE_LIMIT: u32 = 100;
|
||||||
|
|
||||||
|
/// Configuration for a Redmine connector instance bound to one project.
|
||||||
|
#[derive(Debug, Clone)]
|
||||||
|
pub struct RedmineConfig {
|
||||||
|
/// Base URL of the Redmine instance, e.g. `https://redmine.example.com`.
|
||||||
|
pub base_url: String,
|
||||||
|
/// Project identifier to scope the sync to. May be the numeric id or the
|
||||||
|
/// project identifier slug — used as `project_id` and stored as
|
||||||
|
/// `source_project`. (Note: Redmine's `project_id` list filter wants the
|
||||||
|
/// numeric id; the slug works as the human-readable scope label.)
|
||||||
|
pub project: String,
|
||||||
|
/// API access key. Optional only if the instance allows anonymous REST.
|
||||||
|
pub api_key: Option<String>,
|
||||||
|
/// Max journals to fold into one issue memory (defense against huge threads).
|
||||||
|
pub max_journals: usize,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl RedmineConfig {
|
||||||
|
pub fn new(base_url: impl Into<String>, project: impl Into<String>) -> Self {
|
||||||
|
Self {
|
||||||
|
base_url: base_url.into(),
|
||||||
|
project: project.into(),
|
||||||
|
api_key: None,
|
||||||
|
max_journals: 100,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn with_api_key(mut self, key: Option<String>) -> Self {
|
||||||
|
self.api_key = key;
|
||||||
|
self
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Base URL with any trailing slash removed.
|
||||||
|
fn root(&self) -> String {
|
||||||
|
self.base_url.trim_end_matches('/').to_string()
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// A Redmine connector bound to one project.
|
||||||
|
pub struct RedmineConnector {
|
||||||
|
config: RedmineConfig,
|
||||||
|
scope: String,
|
||||||
|
client: reqwest::Client,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl RedmineConnector {
|
||||||
|
pub fn new(config: RedmineConfig) -> ConnectorResult<Self> {
|
||||||
|
if config.base_url.trim().is_empty() {
|
||||||
|
return Err(ConnectorError::Config("base_url is required".to_string()));
|
||||||
|
}
|
||||||
|
if config.project.trim().is_empty() {
|
||||||
|
return Err(ConnectorError::Config("project is required".to_string()));
|
||||||
|
}
|
||||||
|
if reqwest::Url::parse(&config.root()).is_err() {
|
||||||
|
return Err(ConnectorError::Config(format!(
|
||||||
|
"base_url is not a valid URL: {}",
|
||||||
|
config.base_url
|
||||||
|
)));
|
||||||
|
}
|
||||||
|
let client = reqwest::Client::builder()
|
||||||
|
.user_agent(USER_AGENT)
|
||||||
|
.build()
|
||||||
|
.map_err(|e| ConnectorError::Transport(e.to_string()))?;
|
||||||
|
let scope = config.project.clone();
|
||||||
|
Ok(Self {
|
||||||
|
config,
|
||||||
|
scope,
|
||||||
|
client,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
fn auth(&self, req: reqwest::RequestBuilder) -> reqwest::RequestBuilder {
|
||||||
|
let req = req.header("Accept", "application/json");
|
||||||
|
match &self.config.api_key {
|
||||||
|
// The key goes in the header (not the URL) so it stays out of proxy
|
||||||
|
// and access logs.
|
||||||
|
Some(k) => req.header("X-Redmine-API-Key", k),
|
||||||
|
None => req,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn classify_status(resp: &reqwest::Response) -> Option<ConnectorError> {
|
||||||
|
let status = resp.status();
|
||||||
|
if status.is_success() {
|
||||||
|
return None;
|
||||||
|
}
|
||||||
|
if status.as_u16() == 429 {
|
||||||
|
let retry = resp
|
||||||
|
.headers()
|
||||||
|
.get("retry-after")
|
||||||
|
.and_then(|v| v.to_str().ok())
|
||||||
|
.and_then(|s| s.parse::<u64>().ok())
|
||||||
|
.map(std::time::Duration::from_secs);
|
||||||
|
return Some(ConnectorError::RateLimited(retry));
|
||||||
|
}
|
||||||
|
let message = match status.as_u16() {
|
||||||
|
// A valid key against an instance with REST disabled 401/403s; make
|
||||||
|
// that distinguishable from "no results".
|
||||||
|
401 | 403 => {
|
||||||
|
"unauthorized — check REDMINE_API_KEY and that the instance has the REST API enabled (Administration → Settings → API)"
|
||||||
|
.to_string()
|
||||||
|
}
|
||||||
|
_ => status
|
||||||
|
.canonical_reason()
|
||||||
|
.unwrap_or("request failed")
|
||||||
|
.to_string(),
|
||||||
|
};
|
||||||
|
Some(ConnectorError::Source {
|
||||||
|
status: status.as_u16(),
|
||||||
|
message,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Fetch the journals + relations for one issue (the detail endpoint —
|
||||||
|
/// journals are not returned on the list endpoint).
|
||||||
|
async fn fetch_detail(&self, issue_id: u64) -> ConnectorResult<RawIssue> {
|
||||||
|
let url = format!("{}/issues/{}.json", self.config.root(), issue_id);
|
||||||
|
let resp = self
|
||||||
|
.auth(self.client.get(&url))
|
||||||
|
.query(&[("include", "journals,relations")])
|
||||||
|
.send()
|
||||||
|
.await
|
||||||
|
.map_err(|e| ConnectorError::Transport(e.to_string()))?;
|
||||||
|
if let Some(err) = Self::classify_status(&resp) {
|
||||||
|
return Err(err);
|
||||||
|
}
|
||||||
|
let wrapper: IssueWrapper = resp
|
||||||
|
.json()
|
||||||
|
.await
|
||||||
|
.map_err(|e| ConnectorError::Transport(e.to_string()))?;
|
||||||
|
Ok(wrapper.issue)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Fold a raw issue (with journals) into one normalized memory record.
|
||||||
|
fn normalize(&self, issue: &RawIssue) -> NormalizedRecord {
|
||||||
|
let status_name = issue.status.as_ref().map(|s| s.name.clone());
|
||||||
|
let tracker_name = issue.tracker.as_ref().map(|t| t.name.clone());
|
||||||
|
let author = issue.author.as_ref().map(|a| a.name.clone());
|
||||||
|
|
||||||
|
// Journals sorted by id for a stable order + stable hash. Keep notes
|
||||||
|
// and field changes so status/assignment history remains searchable.
|
||||||
|
let mut journals: Vec<&RawJournal> = issue
|
||||||
|
.journals
|
||||||
|
.iter()
|
||||||
|
.filter(|j| {
|
||||||
|
j.notes
|
||||||
|
.as_deref()
|
||||||
|
.map(|n| !n.trim().is_empty())
|
||||||
|
.unwrap_or(false)
|
||||||
|
|| !j.details.is_empty()
|
||||||
|
})
|
||||||
|
.collect();
|
||||||
|
journals.sort_by_key(|j| j.id);
|
||||||
|
journals.truncate(self.config.max_journals);
|
||||||
|
|
||||||
|
// Human-readable content.
|
||||||
|
let mut content = format!("[{}#{}] {}\n", self.scope, issue.id, issue.subject);
|
||||||
|
if let Some(s) = &status_name {
|
||||||
|
content.push_str(&format!("Status: {s}\n"));
|
||||||
|
}
|
||||||
|
if let Some(t) = &tracker_name {
|
||||||
|
content.push_str(&format!("Tracker: {t}\n"));
|
||||||
|
}
|
||||||
|
if let Some(desc) = &issue.description
|
||||||
|
&& !desc.trim().is_empty()
|
||||||
|
{
|
||||||
|
content.push('\n');
|
||||||
|
content.push_str(desc.trim());
|
||||||
|
content.push('\n');
|
||||||
|
}
|
||||||
|
for j in &journals {
|
||||||
|
let who = j.user.as_ref().map(|u| u.name.as_str()).unwrap_or("?");
|
||||||
|
let note = j.notes.as_deref().unwrap_or("").trim();
|
||||||
|
if !note.is_empty() {
|
||||||
|
content.push_str(&format!("\n- {who}: {note}"));
|
||||||
|
}
|
||||||
|
for detail in &j.details {
|
||||||
|
content.push_str(&format!(
|
||||||
|
"\n- {who} changed {}{}: {} -> {}",
|
||||||
|
detail.property.as_deref().unwrap_or("field"),
|
||||||
|
detail
|
||||||
|
.name
|
||||||
|
.as_deref()
|
||||||
|
.map(|n| format!(".{n}"))
|
||||||
|
.unwrap_or_default(),
|
||||||
|
detail.old_value.as_deref().unwrap_or(""),
|
||||||
|
detail.new_value.as_deref().unwrap_or("")
|
||||||
|
));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if !issue.relations.is_empty() {
|
||||||
|
content.push_str("\n\nRelations:");
|
||||||
|
let mut relations: Vec<&RawRelation> = issue.relations.iter().collect();
|
||||||
|
relations.sort_by_key(|r| r.id);
|
||||||
|
for relation in relations {
|
||||||
|
let related = relation.related_issue_id(issue.id);
|
||||||
|
content.push_str(&format!(
|
||||||
|
"\n- #{} ({})",
|
||||||
|
related,
|
||||||
|
relation.relation_type.as_deref().unwrap_or("relates")
|
||||||
|
));
|
||||||
|
if let Some(delay) = relation.delay {
|
||||||
|
content.push_str(&format!(", delay {delay}"));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Stable content hash — meaning only, never the cursor (`updated_on`) or
|
||||||
|
// volatile counts. Journals and relations contribute stable fields in id
|
||||||
|
// order.
|
||||||
|
let journals_blob = journals
|
||||||
|
.iter()
|
||||||
|
.map(|j| {
|
||||||
|
let details = j
|
||||||
|
.details
|
||||||
|
.iter()
|
||||||
|
.map(|d| {
|
||||||
|
format!(
|
||||||
|
"{}:{}:{}:{}",
|
||||||
|
d.property.as_deref().unwrap_or(""),
|
||||||
|
d.name.as_deref().unwrap_or(""),
|
||||||
|
d.old_value.as_deref().unwrap_or(""),
|
||||||
|
d.new_value.as_deref().unwrap_or("")
|
||||||
|
)
|
||||||
|
})
|
||||||
|
.collect::<Vec<_>>()
|
||||||
|
.join("\u{1e}");
|
||||||
|
format!(
|
||||||
|
"{}:{}:{}",
|
||||||
|
j.id,
|
||||||
|
j.notes.as_deref().unwrap_or("").trim(),
|
||||||
|
details
|
||||||
|
)
|
||||||
|
})
|
||||||
|
.collect::<Vec<_>>()
|
||||||
|
.join("\u{1f}");
|
||||||
|
let relations_blob = {
|
||||||
|
let mut relations: Vec<&RawRelation> = issue.relations.iter().collect();
|
||||||
|
relations.sort_by_key(|r| r.id);
|
||||||
|
relations
|
||||||
|
.iter()
|
||||||
|
.map(|r| {
|
||||||
|
format!(
|
||||||
|
"{}:{}:{}:{}",
|
||||||
|
r.id,
|
||||||
|
r.issue_id.unwrap_or_default(),
|
||||||
|
r.issue_to_id.unwrap_or_default(),
|
||||||
|
r.relation_type.as_deref().unwrap_or("")
|
||||||
|
)
|
||||||
|
})
|
||||||
|
.collect::<Vec<_>>()
|
||||||
|
.join("\u{1f}")
|
||||||
|
};
|
||||||
|
let id_str = issue.id.to_string();
|
||||||
|
let status_id_str = issue
|
||||||
|
.status
|
||||||
|
.as_ref()
|
||||||
|
.map(|s| s.id.to_string())
|
||||||
|
.unwrap_or_default();
|
||||||
|
let tracker_id_str = issue
|
||||||
|
.tracker
|
||||||
|
.as_ref()
|
||||||
|
.map(|t| t.id.to_string())
|
||||||
|
.unwrap_or_default();
|
||||||
|
let done_ratio_str = issue.done_ratio.unwrap_or(0).to_string();
|
||||||
|
let desc_str = issue.description.clone().unwrap_or_default();
|
||||||
|
let hash = content_hash(&[
|
||||||
|
("id", &id_str),
|
||||||
|
("subject", &issue.subject),
|
||||||
|
("description", &desc_str),
|
||||||
|
("status_id", &status_id_str),
|
||||||
|
("tracker_id", &tracker_id_str),
|
||||||
|
("done_ratio", &done_ratio_str),
|
||||||
|
("journals", &journals_blob),
|
||||||
|
("relations", &relations_blob),
|
||||||
|
]);
|
||||||
|
|
||||||
|
// Tags, lowercased — `tag_prefix` matching is case-sensitive, and
|
||||||
|
// Redmine status/tracker names are mixed-case.
|
||||||
|
let mut tags = vec!["redmine".to_string(), "issue".to_string()];
|
||||||
|
if let Some(s) = &status_name {
|
||||||
|
tags.push(format!("status:{}", s.to_lowercase()));
|
||||||
|
}
|
||||||
|
if let Some(t) = &tracker_name {
|
||||||
|
tags.push(format!("tracker:{}", t.to_lowercase()));
|
||||||
|
}
|
||||||
|
if let Some(p) = &issue.priority {
|
||||||
|
tags.push(format!("priority:{}", p.name.to_lowercase()));
|
||||||
|
}
|
||||||
|
|
||||||
|
let envelope = SourceEnvelope {
|
||||||
|
source_system: Some("redmine".to_string()),
|
||||||
|
source_id: Some(issue.id.to_string()),
|
||||||
|
source_url: Some(format!("{}/issues/{}", self.config.root(), issue.id)),
|
||||||
|
source_updated_at: issue
|
||||||
|
.updated_on
|
||||||
|
.as_deref()
|
||||||
|
.and_then(|s| DateTime::parse_from_rfc3339(s).ok())
|
||||||
|
.map(|d| d.with_timezone(&Utc)),
|
||||||
|
content_hash: Some(hash),
|
||||||
|
synced_at: Some(Utc::now()),
|
||||||
|
source_project: Some(self.scope.clone()),
|
||||||
|
source_type: Some("issue".to_string()),
|
||||||
|
source_author: author,
|
||||||
|
};
|
||||||
|
|
||||||
|
NormalizedRecord {
|
||||||
|
content,
|
||||||
|
tags,
|
||||||
|
envelope,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Connector for RedmineConnector {
|
||||||
|
fn source_system(&self) -> &str {
|
||||||
|
"redmine"
|
||||||
|
}
|
||||||
|
|
||||||
|
fn scope(&self) -> &str {
|
||||||
|
&self.scope
|
||||||
|
}
|
||||||
|
|
||||||
|
async fn fetch_updated(
|
||||||
|
&self,
|
||||||
|
since: Option<DateTime<Utc>>,
|
||||||
|
cursor: Option<String>,
|
||||||
|
) -> ConnectorResult<FetchPage> {
|
||||||
|
// The cursor carries the next offset (Redmine pages by offset, not an
|
||||||
|
// opaque url). First page = offset 0.
|
||||||
|
let offset: u32 = cursor.as_deref().and_then(|c| c.parse().ok()).unwrap_or(0);
|
||||||
|
|
||||||
|
let url = format!("{}/issues.json", self.config.root());
|
||||||
|
let limit_str = PAGE_LIMIT.to_string();
|
||||||
|
let offset_str = offset.to_string();
|
||||||
|
// Build params; reqwest percent-encodes each value exactly once, so we
|
||||||
|
// pass the RAW `>=…` operator (it becomes %3E%3D on the wire). Do not
|
||||||
|
// pre-encode here or it would be double-encoded.
|
||||||
|
let mut params: Vec<(&str, String)> = vec![
|
||||||
|
("status_id", "*".to_string()),
|
||||||
|
("sort", "updated_on:asc".to_string()),
|
||||||
|
("project_id", self.config.project.clone()),
|
||||||
|
("limit", limit_str),
|
||||||
|
("offset", offset_str),
|
||||||
|
];
|
||||||
|
if let Some(s) = since {
|
||||||
|
let since_z = s.to_rfc3339_opts(chrono::SecondsFormat::Secs, true);
|
||||||
|
params.push(("updated_on", format!(">={since_z}")));
|
||||||
|
}
|
||||||
|
|
||||||
|
let resp = self
|
||||||
|
.auth(self.client.get(&url))
|
||||||
|
.query(&params)
|
||||||
|
.send()
|
||||||
|
.await
|
||||||
|
.map_err(|e| ConnectorError::Transport(e.to_string()))?;
|
||||||
|
if let Some(err) = Self::classify_status(&resp) {
|
||||||
|
return Err(err);
|
||||||
|
}
|
||||||
|
let page: IssueListResponse = resp
|
||||||
|
.json()
|
||||||
|
.await
|
||||||
|
.map_err(|e| ConnectorError::Transport(e.to_string()))?;
|
||||||
|
|
||||||
|
// Per-issue detail fetch for journals (list endpoint omits them).
|
||||||
|
let mut records = Vec::new();
|
||||||
|
for summary in &page.issues {
|
||||||
|
let detailed = match self.fetch_detail(summary.id).await {
|
||||||
|
Ok(d) => d,
|
||||||
|
// A single issue failing detail-fetch should not abort the page;
|
||||||
|
// fall back to the list-level fields (no journals).
|
||||||
|
Err(_) => summary.clone(),
|
||||||
|
};
|
||||||
|
records.push(self.normalize(&detailed));
|
||||||
|
}
|
||||||
|
|
||||||
|
// Advance the offset cursor until we've walked total_count.
|
||||||
|
let next_offset = offset + page.issues.len() as u32;
|
||||||
|
let next_cursor = if (next_offset as u64) < page.total_count && !page.issues.is_empty() {
|
||||||
|
Some(next_offset.to_string())
|
||||||
|
} else {
|
||||||
|
None
|
||||||
|
};
|
||||||
|
|
||||||
|
Ok(FetchPage {
|
||||||
|
records,
|
||||||
|
next_cursor,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
async fn list_live_ids(&self) -> ConnectorResult<Option<Vec<String>>> {
|
||||||
|
// Enumerate all issue ids (open AND closed) for the reconcile pass.
|
||||||
|
// status_id=* is mandatory here too, or closed issues read as deleted.
|
||||||
|
let mut ids = Vec::new();
|
||||||
|
let mut offset: u32 = 0;
|
||||||
|
loop {
|
||||||
|
let url = format!("{}/issues.json", self.config.root());
|
||||||
|
let resp = self
|
||||||
|
.auth(self.client.get(&url))
|
||||||
|
.query(&[
|
||||||
|
("status_id", "*".to_string()),
|
||||||
|
("project_id", self.config.project.clone()),
|
||||||
|
("limit", PAGE_LIMIT.to_string()),
|
||||||
|
("offset", offset.to_string()),
|
||||||
|
])
|
||||||
|
.send()
|
||||||
|
.await
|
||||||
|
.map_err(|e| ConnectorError::Transport(e.to_string()))?;
|
||||||
|
if let Some(err) = Self::classify_status(&resp) {
|
||||||
|
return Err(err);
|
||||||
|
}
|
||||||
|
let page: IssueListResponse = resp
|
||||||
|
.json()
|
||||||
|
.await
|
||||||
|
.map_err(|e| ConnectorError::Transport(e.to_string()))?;
|
||||||
|
if page.issues.is_empty() {
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
for issue in &page.issues {
|
||||||
|
ids.push(issue.id.to_string());
|
||||||
|
}
|
||||||
|
offset += page.issues.len() as u32;
|
||||||
|
if (offset as u64) >= page.total_count {
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
Ok(Some(ids))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Raw Redmine API shapes (only the fields we use)
|
||||||
|
+// ---------------------------------------------------------------------------
+
+#[derive(Debug, Deserialize)]
+struct IssueListResponse {
+    #[serde(default)]
+    issues: Vec<RawIssue>,
+    #[serde(default)]
+    total_count: u64,
+}
+
+#[derive(Debug, Deserialize)]
+struct IssueWrapper {
+    issue: RawIssue,
+}
+
+#[derive(Debug, Clone, Deserialize)]
+struct RawIssue {
+    id: u64,
+    #[serde(default)]
+    subject: String,
+    #[serde(default)]
+    description: Option<String>,
+    #[serde(default)]
+    status: Option<NamedRef>,
+    #[serde(default)]
+    tracker: Option<NamedRef>,
+    #[serde(default)]
+    priority: Option<NamedRef>,
+    #[serde(default)]
+    author: Option<NamedRef>,
+    #[serde(default)]
+    done_ratio: Option<i64>,
+    #[serde(default)]
+    updated_on: Option<String>,
+    #[serde(default)]
+    journals: Vec<RawJournal>,
+    #[serde(default)]
+    relations: Vec<RawRelation>,
+}
+
+/// Redmine `{id, name}` reference (status, tracker, priority, user, …).
+#[derive(Debug, Clone, Deserialize)]
+struct NamedRef {
+    #[serde(default)]
+    id: i64,
+    #[serde(default)]
+    name: String,
+}
+
+#[derive(Debug, Clone, Deserialize)]
+struct RawJournal {
+    id: u64,
+    #[serde(default)]
+    notes: Option<String>,
+    #[serde(default)]
+    user: Option<NamedRef>,
+    #[serde(default)]
+    details: Vec<RawJournalDetail>,
+}
+
+#[derive(Debug, Clone, Deserialize)]
+struct RawJournalDetail {
+    #[serde(default)]
+    property: Option<String>,
+    #[serde(default)]
+    name: Option<String>,
+    #[serde(default)]
+    old_value: Option<String>,
+    #[serde(default)]
+    new_value: Option<String>,
+}
+
+#[derive(Debug, Clone, Deserialize)]
+struct RawRelation {
+    #[serde(default)]
+    id: u64,
+    #[serde(default)]
+    issue_id: Option<u64>,
+    #[serde(default)]
+    issue_to_id: Option<u64>,
+    #[serde(default)]
+    relation_type: Option<String>,
+    #[serde(default)]
+    delay: Option<i64>,
+}
+
+impl RawRelation {
+    fn related_issue_id(&self, current_issue_id: u64) -> u64 {
+        match (self.issue_id, self.issue_to_id) {
+            (Some(from), Some(to)) if from == current_issue_id => to,
+            (Some(from), Some(to)) if to == current_issue_id => from,
+            (_, Some(to)) => to,
+            (Some(from), _) => from,
+            _ => 0,
+        }
+    }
+}
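The endpoint selection in `related_issue_id` can be exercised in isolation. The following is a minimal sketch, assuming a pared-down `Relation` struct (a hypothetical stand-in for `RawRelation` that keeps only the two endpoint ids) and a free function instead of a method. The match arms mirror the ones in the diff above; nothing else about the connector is implied.

```rust
/// Hypothetical stand-in for `RawRelation`: only the two endpoint ids matter here.
struct Relation {
    issue_id: Option<u64>,
    issue_to_id: Option<u64>,
}

/// Return the id at the other end of the relation as seen from `current`.
/// Falls back to whichever endpoint is present, and to 0 when neither is.
fn related_issue_id(rel: &Relation, current: u64) -> u64 {
    match (rel.issue_id, rel.issue_to_id) {
        // Both ends known and we are the "from" side: the peer is "to".
        (Some(from), Some(to)) if from == current => to,
        // Both ends known and we are the "to" side: the peer is "from".
        (Some(from), Some(to)) if to == current => from,
        // Otherwise prefer the "to" end when it exists...
        (_, Some(to)) => to,
        // ...then the "from" end.
        (Some(from), _) => from,
        // No endpoint at all: 0 acts as a sentinel.
        _ => 0,
    }
}
```

Because a relation can be returned from either issue's perspective, checking both orientations before falling back keeps the rendered reference pointing at the peer issue rather than at the issue being normalized.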
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    fn issue(id: u64, subject: &str, desc: &str, status: (i64, &str)) -> RawIssue {
+        RawIssue {
+            id,
+            subject: subject.to_string(),
+            description: Some(desc.to_string()),
+            status: Some(NamedRef {
+                id: status.0,
+                name: status.1.to_string(),
+            }),
+            tracker: Some(NamedRef {
+                id: 1,
+                name: "Bug".to_string(),
+            }),
+            priority: Some(NamedRef {
+                id: 2,
+                name: "Normal".to_string(),
+            }),
+            author: Some(NamedRef {
+                id: 7,
+                name: "Jane Dev".to_string(),
+            }),
+            done_ratio: Some(0),
+            updated_on: Some("2026-06-19T00:00:00Z".to_string()),
+            journals: vec![],
+            relations: vec![],
+        }
+    }
+
+    fn connector() -> RedmineConnector {
+        RedmineConnector::new(RedmineConfig::new("https://redmine.example.com", "infra")).unwrap()
+    }
+
+    #[test]
+    fn rejects_empty_and_bad_config() {
+        assert!(RedmineConnector::new(RedmineConfig::new("", "p")).is_err());
+        assert!(RedmineConnector::new(RedmineConfig::new("https://r.example", "")).is_err());
+        assert!(RedmineConnector::new(RedmineConfig::new("not a url", "p")).is_err());
+    }
+
+    #[test]
+    fn normalize_builds_keyed_envelope_with_citation() {
+        let c = connector();
+        let rec = c.normalize(&issue(123, "Disk full", "df -h shows 100%", (1, "New")));
+        let env = &rec.envelope;
+        assert!(env.has_key());
+        assert_eq!(env.source_system.as_deref(), Some("redmine"));
+        assert_eq!(env.source_id.as_deref(), Some("123"));
+        assert_eq!(
+            env.source_url.as_deref(),
+            Some("https://redmine.example.com/issues/123")
+        );
+        assert_eq!(env.source_project.as_deref(), Some("infra"));
+        assert_eq!(env.source_author.as_deref(), Some("Jane Dev"));
+        assert!(rec.content.contains("Disk full"));
+        // Tags lowercased so the case-sensitive tag_prefix filter matches.
+        assert!(rec.tags.contains(&"status:new".to_string()));
+        assert!(rec.tags.contains(&"tracker:bug".to_string()));
+        assert!(rec.tags.contains(&"priority:normal".to_string()));
+    }
+
+    #[test]
+    fn status_change_changes_hash() {
+        let c = connector();
+        let new = c
+            .normalize(&issue(1, "T", "body", (1, "New")))
+            .envelope
+            .content_hash;
+        let closed = c
+            .normalize(&issue(1, "T", "body", (5, "Closed")))
+            .envelope
+            .content_hash;
+        assert_ne!(
+            new, closed,
+            "a status change must change the hash → re-embed"
+        );
+    }
+
+    #[test]
+    fn journals_fold_in_id_order_and_affect_hash() {
+        let c = connector();
+        let mut iss = issue(1, "T", "body", (1, "New"));
+        iss.journals = vec![
+            RawJournal {
+                id: 20,
+                notes: Some("second".to_string()),
+                user: Some(NamedRef {
+                    id: 1,
+                    name: "B".to_string(),
+                }),
+                details: vec![],
+            },
+            RawJournal {
+                id: 10,
+                notes: Some("first".to_string()),
+                user: Some(NamedRef {
+                    id: 2,
+                    name: "A".to_string(),
+                }),
+                details: vec![],
+            },
+            // Pure empty journal must be dropped, not folded.
+            RawJournal {
+                id: 30,
+                notes: None,
+                user: Some(NamedRef {
+                    id: 3,
+                    name: "C".to_string(),
+                }),
+                details: vec![],
+            },
+        ];
+        let rec = c.normalize(&iss);
+        let first = rec.content.find("first").unwrap();
+        let second = rec.content.find("second").unwrap();
+        assert!(first < second, "journals fold in id order");
+
+        let no_journals = c
+            .normalize(&issue(1, "T", "body", (1, "New")))
+            .envelope
+            .content_hash;
+        assert_ne!(
+            rec.envelope.content_hash, no_journals,
+            "journals must contribute to the hash"
+        );
+    }
+
+    #[test]
+    fn journal_details_and_relations_are_searchable_and_hashed() {
+        let c = connector();
+        let mut iss = issue(1, "T", "body", (1, "New"));
+        iss.journals = vec![RawJournal {
+            id: 1,
+            notes: None,
+            user: Some(NamedRef {
+                id: 2,
+                name: "A".to_string(),
+            }),
+            details: vec![RawJournalDetail {
+                property: Some("attr".to_string()),
+                name: Some("status_id".to_string()),
+                old_value: Some("1".to_string()),
+                new_value: Some("5".to_string()),
+            }],
+        }];
+        iss.relations = vec![RawRelation {
+            id: 9,
+            issue_id: Some(1),
+            issue_to_id: Some(42),
+            relation_type: Some("relates".to_string()),
+            delay: None,
+        }];
+
+        let rec = c.normalize(&iss);
+        assert!(rec.content.contains("changed attr.status_id: 1 -> 5"));
+        assert!(rec.content.contains("#42 (relates)"));
+
+        let no_history = c.normalize(&issue(1, "T", "body", (1, "New")));
+        assert_ne!(
+            rec.envelope.content_hash, no_history.envelope.content_hash,
+            "field-change journals and relations must affect idempotent updates"
+        );
+    }
+}
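The journal tests above pin down two behaviours: journals are folded into the record content in ascending `id` order regardless of the order the API returned them in, and a journal with neither notes nor field changes is dropped. A minimal sketch of that folding step follows. The `Journal` struct, `non_empty_notes`, and `fold_journals` are hypothetical names for illustration; the connector's actual `normalize` is not shown in this part of the diff and may render entries differently.

```rust
/// Hypothetical, simplified journal entry: an id, optional notes, and
/// already-rendered field-change lines.
struct Journal {
    id: u64,
    notes: Option<String>,
    details: Vec<String>,
}

/// Notes with surrounding whitespace removed, or `None` when nothing is left.
fn non_empty_notes(j: &Journal) -> Option<&str> {
    j.notes.as_deref().map(str::trim).filter(|n| !n.is_empty())
}

/// Fold journals into text in ascending id order, skipping entries that carry
/// neither notes nor field changes.
fn fold_journals(journals: &[Journal]) -> String {
    let mut ordered: Vec<&Journal> = journals
        .iter()
        .filter(|j| non_empty_notes(j).is_some() || !j.details.is_empty())
        .collect();
    // Sorting by id makes the output independent of API response order,
    // which keeps a content hash over the result stable across syncs.
    ordered.sort_by_key(|j| j.id);

    let mut out = String::new();
    for j in ordered {
        if let Some(notes) = non_empty_notes(j) {
            out.push_str(notes);
            out.push('\n');
        }
        for d in &j.details {
            out.push_str(d);
            out.push('\n');
        }
    }
    out
}
```

Deterministic ordering matters here because the folded text feeds the content hash: if the order varied between syncs, an unchanged issue would look modified.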
@@ -9655,7 +9655,11 @@ impl SqliteMemoryStore {

     /// Read the incremental-sync checkpoint for a `(source_system, scope)`.
     /// Returns a zeroed cursor (no high-water mark) if none has been saved yet.
-    pub fn get_connector_cursor(&self, source_system: &str, scope: &str) -> Result<ConnectorCursor> {
+    pub fn get_connector_cursor(
+        &self,
+        source_system: &str,
+        scope: &str,
+    ) -> Result<ConnectorCursor> {
         let reader = self
             .reader
             .lock()
@@ -9873,7 +9877,11 @@ mod tests {
             assert_eq!(r.outcome, SourceUpsertOutcome::Unchanged);
             assert_eq!(r.node_id, r1.node_id, "must reuse the same memory id");
         }
-        assert_eq!(node_count(&store), 1, "idempotent: still exactly one memory");
+        assert_eq!(
+            node_count(&store),
+            1,
+            "idempotent: still exactly one memory"
+        );
     }

     #[test]
@@ -9945,7 +9953,13 @@ mod tests {
         let mut c2 = cursor.clone();
         c2.records_seen = 99;
         store.save_connector_cursor(&c2).unwrap();
-        assert_eq!(store.get_connector_cursor("github", "o/r").unwrap().records_seen, 99);
+        assert_eq!(
+            store
+                .get_connector_cursor("github", "o/r")
+                .unwrap()
+                .records_seen,
+            99
+        );
     }

     #[test]
@@ -9977,9 +9991,15 @@ mod tests {
                 )
                 .unwrap()
         };
-        assert!(two.1.is_some(), "tombstoned record must have valid_until set");
+        assert!(
+            two.1.is_some(),
+            "tombstoned record must have valid_until set"
+        );
         let node = store.get_node(&two.0).unwrap().unwrap();
-        assert!(!node.is_currently_valid(), "tombstoned node is not valid now");
+        assert!(
+            !node.is_currently_valid(),
+            "tombstoned node is not valid now"
+        );
         assert_eq!(node.content, "issue 2", "content retained for audit");

         // A reappearing record un-tombstones on next upsert (clears valid_until).
@@ -9987,7 +10007,10 @@ mod tests {
             .upsert_by_source(source_input("2", "issue 2", "h2"))
             .unwrap();
         let revived = store.get_node(&two.0).unwrap().unwrap();
-        assert!(revived.is_currently_valid(), "re-synced record is valid again");
+        assert!(
+            revived.is_currently_valid(),
+            "re-synced record is valid again"
+        );
     }

     #[test]
@@ -10013,7 +10036,10 @@ mod tests {
                 .unwrap();
         }
         assert!(
-            store.superseded_node_ids().unwrap().contains(&created.node_id),
+            store
+                .superseded_node_ids()
+                .unwrap()
+                .contains(&created.node_id),
             "precondition: node is superseded"
         );

@@ -10023,7 +10049,10 @@ mod tests {
             .unwrap();
         assert_eq!(res.outcome, SourceUpsertOutcome::Updated);
         assert!(
-            !store.superseded_node_ids().unwrap().contains(&created.node_id),
+            !store
+                .superseded_node_ids()
+                .unwrap()
+                .contains(&created.node_id),
             "superseded_by must be cleared on re-sync (no bitemporal zombie)"
         );
         let node = store.get_node(&created.node_id).unwrap().unwrap();
@@ -10044,7 +10073,10 @@ mod tests {
             .unwrap();
         assert_eq!(res2.outcome, SourceUpsertOutcome::Unchanged);
         assert!(
-            !store.superseded_node_ids().unwrap().contains(&created.node_id),
+            !store
+                .superseded_node_ids()
+                .unwrap()
+                .contains(&created.node_id),
             "Unchanged branch must also clear superseded_by"
         );
     }
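The store tests in the hunks above assert three properties of source-keyed upserts: re-syncing identical content is a no-op that reuses the same record, a record that vanished upstream is tombstoned but its content is retained, and a reappearing record becomes valid again even when its hash is unchanged. The sketch below models that contract with an in-memory map. `Store`, `Record`, `upsert`, and `reconcile` are hypothetical names chosen for illustration; the real `SqliteMemoryStore::upsert_by_source` is not shown in this diff and is certainly more involved.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Outcome {
    Created,
    Updated,
    Unchanged,
}

/// Hypothetical in-memory record: content, its hash, and a tombstone flag.
struct Record {
    content: String,
    content_hash: String,
    tombstoned: bool,
}

/// Hypothetical store keyed by `(source_system, source_id)`.
#[derive(Default)]
struct Store {
    by_source: HashMap<(String, String), Record>,
}

impl Store {
    /// Insert or update by source key; identical hashes leave content untouched.
    fn upsert(&mut self, system: &str, id: &str, content: &str, hash: &str) -> Outcome {
        let key = (system.to_string(), id.to_string());
        if let Some(rec) = self.by_source.get_mut(&key) {
            // Seeing the record again revives it, whether or not it changed.
            rec.tombstoned = false;
            if rec.content_hash == hash {
                return Outcome::Unchanged;
            }
            rec.content = content.to_string();
            rec.content_hash = hash.to_string();
            return Outcome::Updated;
        }
        self.by_source.insert(
            key,
            Record {
                content: content.to_string(),
                content_hash: hash.to_string(),
                tombstoned: false,
            },
        );
        Outcome::Created
    }

    /// Tombstone every record of `system` whose id was not seen in this sync.
    /// Content is kept so the record stays available for audit.
    fn reconcile(&mut self, system: &str, seen_ids: &[&str]) {
        for ((sys, id), rec) in self.by_source.iter_mut() {
            if sys.as_str() == system && !seen_ids.iter().any(|s| *s == id.as_str()) {
                rec.tombstoned = true;
            }
        }
    }
}
```

Clearing the tombstone before comparing hashes is the detail the "Unchanged branch must also clear" assertion guards: otherwise a record that disappears and returns unmodified would stay invalid forever.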
@@ -285,7 +285,7 @@ impl McpServer {
             // ================================================================
             ToolDescription {
                 name: "source_sync".to_string(),
-                description: Some("Index an external system (GitHub Issues) into Vestige as a durable, offline, semantically-searchable index that cites back to the canonical record. Provide 'repo' as 'owner/name'. Idempotent: re-running updates changed issues without duplicating; set reconcile=true to tombstone issues removed upstream. Auth via the GITHUB_TOKEN env var (optional for public repos).".to_string()),
+                description: Some("Index an external system into Vestige as a durable, offline, semantically-searchable index that cites back to the canonical record. GitHub: source='github', repo='owner/name' (auth via GITHUB_TOKEN env). Redmine: source='redmine', project='<id>' (host via REDMINE_URL, auth via REDMINE_API_KEY env). Idempotent: re-running updates changed issues without duplicating; set reconcile=true to tombstone issues removed upstream.".to_string()),
                 input_schema: tools::source_sync::schema(),
                 ..Default::default()
             },
@@ -96,6 +96,40 @@ pub fn schema() -> Value {
                 "tag_prefix": {
                     "type": "string",
                     "description": "Optional tag-prefix filter. When set, only results carrying at least one tag whose value starts with this prefix are returned (case-sensitive). Example: tag_prefix=\"meeting:\" matches memories tagged 'meeting:standup', 'meeting:1-on-1', etc. Applied as a post-filter; combine with a larger 'limit' if you expect heavy thinning."
+                },
+                "source_system": {
+                    "type": "string",
+                    "description": "Investigation filter (#57): only memories ingested from this external system, e.g. 'github' or 'redmine'. Post-filter — non-connector memories are excluded. Combine with a larger 'limit' if thinning is heavy."
+                },
+                "source_project": {
+                    "type": "string",
+                    "description": "Investigation filter: only memories from this source project/repo, exact match (GitHub 'owner/repo', Redmine project id)."
+                },
+                "source_id": {
+                    "type": "string",
+                    "description": "Investigation filter: a specific source record id (issue number / ticket id). Pair with source_system to disambiguate across systems."
+                },
+                "source_type": {
+                    "type": "string",
+                    "description": "Investigation filter: source record type, e.g. 'issue', 'comment'."
+                },
+                "source_author": {
+                    "type": "string",
+                    "description": "Investigation filter: the source author/reporter (not assignee)."
+                },
+                "source_updated_after": {
+                    "type": "string",
+                    "description": "Investigation filter: only records whose source was updated at/after this RFC3339 timestamp (inclusive)."
+                },
+                "source_updated_before": {
+                    "type": "string",
+                    "description": "Investigation filter: only records whose source was updated at/before this RFC3339 timestamp (inclusive)."
+                },
+                "source_status": {
+                    "type": "string",
+                    "enum": ["any", "valid", "tombstoned"],
+                    "description": "Investigation filter: 'any' (default), 'valid' (currently-valid records only), or 'tombstoned' (records no longer visible upstream, kept for audit).",
+                    "default": "any"
                 }
             },
             "required": ["query"]
@@ -126,6 +160,23 @@ struct SearchArgs {
     concrete: Option<bool>,
     #[serde(alias = "tag_prefix")]
     tag_prefix: Option<String>,
+    // #57 Phase 4 — source-aware investigation filters (all post-filters).
+    #[serde(alias = "source_system")]
+    source_system: Option<String>,
+    #[serde(alias = "source_project")]
+    source_project: Option<String>,
+    #[serde(alias = "source_id")]
+    source_id: Option<String>,
+    #[serde(alias = "source_type")]
+    source_type: Option<String>,
+    #[serde(alias = "source_author")]
+    source_author: Option<String>,
+    #[serde(alias = "source_updated_after")]
+    source_updated_after: Option<String>,
+    #[serde(alias = "source_updated_before")]
+    source_updated_before: Option<String>,
+    #[serde(alias = "source_status")]
+    source_status: Option<String>,
 }

 /// Execute unified search with 7-stage cognitive pipeline.
@@ -190,15 +241,19 @@ pub async fn execute(
         }
     };

+    // #57 Phase 4 — parse the source-aware investigation filter once (shared by
+    // both the concrete and hybrid paths). Hard-errors on malformed input.
+    let source_filter = SourceFilter::from_args(&args)?;
+
     let concrete = args
         .concrete
         .unwrap_or_else(|| is_literal_query(&args.query));
     if concrete {
-        // When a tag_prefix is requested, fetch a larger pool so the
-        // post-filter has enough headroom to still return ~limit results
-        // after thinning. Cap at the same upper bound the underlying SQL
-        // path uses elsewhere (100).
-        let concrete_fetch_limit = if args.tag_prefix.is_some() {
+        // When a tag_prefix OR a source filter is requested, fetch a larger
+        // pool so the post-filter has enough headroom to still return ~limit
+        // results after thinning. Cap at the same upper bound the underlying
+        // SQL path uses elsewhere (100).
+        let concrete_fetch_limit = if args.tag_prefix.is_some() || source_filter.is_active() {
             (limit * 3).min(100)
         } else {
             limit
@@ -215,14 +270,15 @@ pub async fn execute(
         // Apply tag_prefix post-filter BEFORE strengthen-on-access so
         // results the caller did not actually receive do not get a
         // testing-effect boost.
-        let filtered_results: Vec<&vestige_core::SearchResult> = match args.tag_prefix.as_deref() {
-            Some(prefix) => results
-                .iter()
-                .filter(|r| tags_match_prefix(&r.node.tags, prefix))
-                .take(limit as usize)
-                .collect(),
-            None => results.iter().collect(),
-        };
+        let filtered_results: Vec<&vestige_core::SearchResult> = results
+            .iter()
+            .filter(|r| match args.tag_prefix.as_deref() {
+                Some(prefix) => tags_match_prefix(&r.node.tags, prefix),
+                None => true,
+            })
+            .filter(|r| node_matches_source(&r.node, &source_filter))
+            .take(limit as usize)
+            .collect();

         let ids: Vec<&str> = filtered_results
             .iter()
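The rewritten concrete path in the hunk above chains two post-filters over an over-fetched pool and only then truncates to `limit`, so results removed by a filter never consume a slot. A minimal standalone sketch of that shape follows. `tags_match_prefix` is taken from the diff; the `Hit` struct, the `post_filter` function, and the single `source_system` check standing in for the full source predicate are assumptions made for illustration.

```rust
/// Same body as the helper shown in the diff.
fn tags_match_prefix(tags: &[String], prefix: &str) -> bool {
    tags.iter().any(|t| t.starts_with(prefix))
}

/// Hypothetical, flattened search hit.
struct Hit {
    id: u32,
    tags: Vec<String>,
    source_system: Option<String>,
}

/// Apply optional tag-prefix and source-system filters, then cap at `limit`.
/// Filtering happens before `take`, so the cap counts only surviving hits.
fn post_filter<'a>(
    hits: &'a [Hit],
    tag_prefix: Option<&str>,
    source_system: Option<&str>,
    limit: usize,
) -> Vec<&'a Hit> {
    hits.iter()
        .filter(|h| match tag_prefix {
            Some(prefix) => tags_match_prefix(&h.tags, prefix),
            None => true,
        })
        .filter(|h| match source_system {
            // Exact match; hits without a source system never match.
            Some(want) => h.source_system.as_deref() == Some(want),
            None => true,
        })
        .take(limit)
        .collect()
}
```

With both filters set to `None` the chain degenerates to a plain `take(limit)`, which is why the pool is only enlarged when a post-filter is actually active.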
@@ -334,11 +390,15 @@ pub async fn execute(
         "exhaustive" => 5, // Deep overfetch for maximum recall
         _ => 3, // Balanced default
     };
-    // When a tag_prefix filter is requested, double the overfetch (capped at
-    // the same 100 ceiling) so the post-filter has enough headroom to still
-    // return ~limit results after thinning.
-    let tag_prefix_multiplier = if args.tag_prefix.is_some() { 2 } else { 1 };
-    let overfetch_limit = (limit * overfetch_multiplier * tag_prefix_multiplier).min(100); // Cap at 100 to avoid excessive DB load
+    // When a tag_prefix OR source filter is requested, double the overfetch
+    // (capped at the same 100 ceiling) so the post-filter has enough headroom
+    // to still return ~limit results after thinning.
+    let post_filter_multiplier = if args.tag_prefix.is_some() || source_filter.is_active() {
+        2
+    } else {
+        1
+    };
+    let overfetch_limit = (limit * overfetch_multiplier * post_filter_multiplier).min(100); // Cap at 100 to avoid excessive DB load

     let results = storage
         .hybrid_search_filtered(
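The pool sizing in the two paths comes down to simple arithmetic, sketched below. Only the two `match` arms visible in the hunk (`"exhaustive"` and the default) are reproduced; any other arms in the real code are elided in the diff and are not guessed at here. The function names, the `mode` parameter, and the `u32` type for `limit` are assumptions for illustration.

```rust
/// Hybrid path: base over-fetch by mode, doubled when any post-filter
/// (tag prefix or source filter) is active, capped at 100.
fn overfetch_limit(limit: u32, mode: &str, post_filter_active: bool) -> u32 {
    let overfetch_multiplier = match mode {
        "exhaustive" => 5, // deep over-fetch for maximum recall
        _ => 3,            // balanced default
    };
    let post_filter_multiplier = if post_filter_active { 2 } else { 1 };
    (limit * overfetch_multiplier * post_filter_multiplier).min(100)
}

/// Concrete path: triple the pool only when a post-filter is active,
/// with the same ceiling of 100.
fn concrete_fetch_limit(limit: u32, post_filter_active: bool) -> u32 {
    if post_filter_active {
        (limit * 3).min(100)
    } else {
        limit
    }
}
```

For `limit = 10` in the default mode this gives a pool of 30 without filters and 60 with one. Because of the cap, a highly selective filter can still return fewer than `limit` results, which is what the schema descriptions mean by combining the filter with a larger `limit`.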
@@ -375,6 +435,10 @@ pub async fn execute(
     if let Some(prefix) = args.tag_prefix.as_deref() {
         filtered_results.retain(|r| tags_match_prefix(&r.node.tags, prefix));
     }
+    // #57 Phase 4 — source-aware investigation post-filter (same precedent).
+    if source_filter.is_active() {
+        filtered_results.retain(|r| node_matches_source(&r.node, &source_filter));
+    }

     // ====================================================================
     // Dedup: merge Stage 0 keyword-priority results into Stage 1 results
@@ -387,6 +451,10 @@ pub async fn execute(
         {
             continue;
         }
+        // Respect the source filter on re-inject for the same reason.
+        if source_filter.is_active() && !node_matches_source(&kp.node, &source_filter) {
+            continue;
+        }
         if let Some(existing) = filtered_results
             .iter_mut()
             .find(|r| r.node.id == kp.node.id)
@@ -852,6 +920,156 @@ fn tags_match_prefix(tags: &[String], prefix: &str) -> bool {
     tags.iter().any(|t| t.starts_with(prefix))
 }

+/// Validity filter for source-aware search (#57 Phase 4).
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
+enum SourceStatus {
+    /// No validity constraint.
+    #[default]
+    Any,
+    /// Only currently-valid records.
+    Valid,
+    /// Only tombstoned records (no longer visible upstream, kept for audit).
+    Tombstoned,
+}
+
+/// Parsed source-aware investigation filter (#57 Phase 4).
+///
+/// All fields are optional; an all-empty filter matches every node (so search
+/// behavior is byte-for-byte unchanged when no source filter is supplied). Any
+/// source-scoped field being set excludes legacy/agent memories that have no
+/// `source_envelope`. Applied as a post-filter on the recalled nodes, mirroring
+/// the existing `tag_prefix` precedent (no SQL changes).
+#[derive(Debug, Clone, Default)]
+struct SourceFilter {
+    system: Option<String>,
+    project: Option<String>,
+    id: Option<String>,
+    source_type: Option<String>,
+    author: Option<String>,
+    updated_after: Option<chrono::DateTime<chrono::Utc>>,
+    updated_before: Option<chrono::DateTime<chrono::Utc>>,
+    status: SourceStatus,
+}
+
+impl SourceFilter {
+    /// Build from raw args, hard-erroring on malformed timestamps / status enum
+    /// (consistent with how `detail_level` / `retrieval_mode` reject bad input —
+    /// a silently-`None` bound would widen the filter and return wrong rows).
+    fn from_args(args: &SearchArgs) -> Result<Self, String> {
+        let parse_ts = |s: &Option<String>,
+                        field: &str|
+         -> Result<Option<chrono::DateTime<chrono::Utc>>, String> {
+            match s {
+                None => Ok(None),
+                Some(v) => chrono::DateTime::parse_from_rfc3339(v)
+                    .map(|dt| Some(dt.with_timezone(&chrono::Utc)))
+                    .map_err(|_| format!("Invalid {field}: '{v}' is not an RFC3339 timestamp")),
+            }
+        };
+        let status = match args.source_status.as_deref() {
+            None | Some("any") => SourceStatus::Any,
+            Some("valid") => SourceStatus::Valid,
+            Some("tombstoned") => SourceStatus::Tombstoned,
+            Some(other) => {
+                return Err(format!(
+                    "Invalid source_status '{other}'. Must be 'any', 'valid', or 'tombstoned'."
+                ));
+            }
+        };
+        Ok(Self {
+            system: args.source_system.clone(),
+            project: args.source_project.clone(),
+            id: args.source_id.clone(),
+            source_type: args.source_type.clone(),
+            author: args.source_author.clone(),
+            updated_after: parse_ts(&args.source_updated_after, "source_updated_after")?,
+            updated_before: parse_ts(&args.source_updated_before, "source_updated_before")?,
+            status,
+        })
+    }
+
+    /// True when at least one filter is set (used to size the over-fetch pool).
+    fn is_active(&self) -> bool {
+        self.system.is_some()
+            || self.project.is_some()
+            || self.id.is_some()
+            || self.source_type.is_some()
+            || self.author.is_some()
+            || self.updated_after.is_some()
+            || self.updated_before.is_some()
+            || self.status != SourceStatus::Any
+    }
+}
+
+/// Predicate: does this node satisfy the source-aware investigation filter?
+/// An all-empty filter returns `true` for every node.
+fn node_matches_source(node: &vestige_core::KnowledgeNode, filter: &SourceFilter) -> bool {
+    // Validity check operates on the NODE (valid_until lives on the node).
+    match filter.status {
+        SourceStatus::Any => {}
+        SourceStatus::Valid if !node.is_currently_valid() => return false,
+        SourceStatus::Tombstoned if node.is_currently_valid() => return false,
+        _ => {}
+    }
+
+    // Any source-scoped field requires an envelope; legacy memories are out.
+    // This includes `source_status=valid`: otherwise a source-scoped query for
+    // valid connector records would also return ordinary valid agent memories.
+    let envelope_scoped = filter.system.is_some()
+        || filter.project.is_some()
+        || filter.id.is_some()
+        || filter.source_type.is_some()
+        || filter.author.is_some()
+        || filter.updated_after.is_some()
+        || filter.updated_before.is_some()
+        || filter.status != SourceStatus::Any;
+    if !envelope_scoped {
+        return true;
+    }
+    let Some(env) = node.source_envelope.as_ref() else {
+        return false;
+    };
+
+    let exact = |want: &Option<String>, have: &Option<String>| -> bool {
+        match want {
+            None => true,
+            Some(w) => have.as_deref() == Some(w.as_str()),
+        }
+    };
+    if !exact(&filter.system, &env.source_system) {
+        return false;
+    }
+    if !exact(&filter.project, &env.source_project) {
+        return false;
+    }
+    if !exact(&filter.id, &env.source_id) {
+        return false;
+    }
+    if !exact(&filter.source_type, &env.source_type) {
+        return false;
+    }
+    if !exact(&filter.author, &env.source_author) {
+        return false;
+    }
+    // Date bounds (inclusive) on the source-updated time.
+    if filter.updated_after.is_some() || filter.updated_before.is_some() {
+        let Some(ts) = env.source_updated_at else {
+            return false;
+        };
+        if let Some(after) = filter.updated_after
+            && ts < after
+        {
+            return false;
+        }
+        if let Some(before) = filter.updated_before
+            && ts > before
+        {
+            return false;
+        }
+    }
+    true
+}
+
 /// Format a search result based on the requested detail level.
 /// Score field keys dropped when an output profile suppresses scores.
 const SCORE_FIELDS: &[&str] = &["combinedScore", "keywordScore", "semanticScore"];
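The filter semantics introduced in the hunk above can be summarised as: validate input strictly, match everything when no filter is set, and require a source envelope as soon as any filter is active. The sketch below reproduces that logic with standard-library types only. It is a simplified illustration under stated assumptions: timestamps are plain `i64` Unix seconds instead of `chrono::DateTime`, only the `system` field stands in for the five exact-match fields, and `Filter`, `Envelope`, `Node`, `parse_status`, and `matches` are hypothetical names rather than the crate's API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum Status {
    #[default]
    Any,
    Valid,
    Tombstoned,
}

#[derive(Default)]
struct Filter {
    system: Option<String>,
    updated_after: Option<i64>,  // assumption: Unix seconds, inclusive
    updated_before: Option<i64>, // assumption: Unix seconds, inclusive
    status: Status,
}

struct Envelope {
    system: Option<String>,
    updated_at: Option<i64>,
}

struct Node {
    valid: bool,
    envelope: Option<Envelope>,
}

/// Reject unknown values instead of silently falling back to `Any`,
/// which would widen the filter.
fn parse_status(raw: Option<&str>) -> Result<Status, String> {
    match raw {
        None | Some("any") => Ok(Status::Any),
        Some("valid") => Ok(Status::Valid),
        Some("tombstoned") => Ok(Status::Tombstoned),
        Some(other) => Err(format!("Invalid source_status '{other}'")),
    }
}

fn matches(node: &Node, f: &Filter) -> bool {
    // The validity check looks at the node itself.
    match f.status {
        Status::Valid if !node.valid => return false,
        Status::Tombstoned if node.valid => return false,
        _ => {}
    }
    // With no filter set, every node matches, envelope or not.
    let scoped = f.system.is_some()
        || f.updated_after.is_some()
        || f.updated_before.is_some()
        || f.status != Status::Any;
    if !scoped {
        return true;
    }
    // Any active filter excludes nodes that carry no source envelope.
    let Some(env) = node.envelope.as_ref() else {
        return false;
    };
    if let Some(want) = f.system.as_deref() {
        if env.system.as_deref() != Some(want) {
            return false;
        }
    }
    if f.updated_after.is_some() || f.updated_before.is_some() {
        // A record without a timestamp cannot satisfy a date bound.
        let Some(ts) = env.updated_at else {
            return false;
        };
        if f.updated_after.is_some_and(|after| ts < after) {
            return false;
        }
        if f.updated_before.is_some_and(|before| ts > before) {
            return false;
        }
    }
    true
}
```

Note the consequence for `status`: asking for `valid` alone already makes the filter source-scoped, so ordinary memories without an envelope are excluded even though they are currently valid.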
@@ -1880,6 +2098,167 @@ mod tests {
         assert!(!required.contains(&serde_json::json!("tag_prefix")));
     }

+    // ===================== #57 Phase 4 source filters =====================
+
+    /// Build a KnowledgeNode carrying a source envelope for filter tests.
+    fn node_with_source(
+        system: &str,
+        project: &str,
+        id: &str,
+        author: &str,
+        updated: &str,
+    ) -> vestige_core::KnowledgeNode {
+        let mut n = vestige_core::KnowledgeNode::default();
+        n.id = format!("{system}-{id}");
+        // SourceEnvelope is #[non_exhaustive]; build via Default + field set.
+        let mut env = vestige_core::SourceEnvelope::default();
+        env.source_system = Some(system.to_string());
+        env.source_id = Some(id.to_string());
+        env.source_url = Some(format!("https://x/{id}"));
+        env.source_updated_at = chrono::DateTime::parse_from_rfc3339(updated)
+            .ok()
+            .map(|d| d.with_timezone(&chrono::Utc));
+        env.content_hash = Some("h".to_string());
+        env.source_project = Some(project.to_string());
+        env.source_type = Some("issue".to_string());
+        env.source_author = Some(author.to_string());
+        n.source_envelope = Some(env);
+        n
+    }
+
+    fn filter_from(json: serde_json::Value) -> SourceFilter {
+        let mut v = json;
+        v["query"] = serde_json::json!("q");
+        let args: SearchArgs = serde_json::from_value(v).unwrap();
+        SourceFilter::from_args(&args).unwrap()
+    }
+
+    #[test]
+    fn source_filter_empty_matches_everything() {
+        let f = SourceFilter::default();
+        assert!(!f.is_active());
+        let gh = node_with_source("github", "o/r", "1", "octo", "2026-06-19T00:00:00Z");
+        let legacy = vestige_core::KnowledgeNode::default(); // no envelope
+        assert!(node_matches_source(&gh, &f));
+        assert!(node_matches_source(&legacy, &f), "no filter = unchanged");
+    }
+
+    #[test]
+    fn source_filter_exact_fields() {
+        let gh = node_with_source("github", "o/r", "57", "octo", "2026-06-19T00:00:00Z");
+        let rm = node_with_source("redmine", "infra", "57", "jane", "2026-06-19T00:00:00Z");
+
+        let by_system = filter_from(serde_json::json!({"sourceSystem": "github"}));
+        assert!(node_matches_source(&gh, &by_system));
+        assert!(!node_matches_source(&rm, &by_system));
+
+        let by_project = filter_from(serde_json::json!({"sourceProject": "infra"}));
+        assert!(node_matches_source(&rm, &by_project));
+        assert!(!node_matches_source(&gh, &by_project));
+
+        let by_author = filter_from(serde_json::json!({"sourceAuthor": "octo"}));
+        assert!(node_matches_source(&gh, &by_author));
+        assert!(!node_matches_source(&rm, &by_author));
+
+        // id + system together disambiguate across systems sharing an id.
+        let by_id_sys =
+            filter_from(serde_json::json!({"sourceSystem": "redmine", "sourceId": "57"}));
+        assert!(node_matches_source(&rm, &by_id_sys));
+        assert!(!node_matches_source(&gh, &by_id_sys));
+    }
+
+    #[test]
+    fn source_filter_excludes_legacy_memories_when_envelope_scoped() {
+        let legacy = vestige_core::KnowledgeNode::default();
+        let f = filter_from(serde_json::json!({"sourceSystem": "github"}));
+        assert!(
+            !node_matches_source(&legacy, &f),
+            "an envelope-scoped filter must exclude memories with no source"
+        );
+    }
+
+    #[test]
+    fn source_filter_date_bounds_inclusive() {
+        let n = node_with_source("github", "o/r", "1", "octo", "2026-06-15T12:00:00Z");
+        // After bound: inclusive at the exact instant, excludes earlier.
+        assert!(node_matches_source(
+            &n,
+            &filter_from(serde_json::json!({"sourceUpdatedAfter": "2026-06-15T12:00:00Z"}))
+        ));
+        assert!(!node_matches_source(
+            &n,
+            &filter_from(serde_json::json!({"sourceUpdatedAfter": "2026-06-16T00:00:00Z"}))
+        ));
+        // Before bound: inclusive, excludes later.
+        assert!(node_matches_source(
+            &n,
+            &filter_from(serde_json::json!({"sourceUpdatedBefore": "2026-06-15T12:00:00Z"}))
|
||||||
|
));
|
||||||
|
assert!(!node_matches_source(
|
||||||
|
&n,
|
||||||
|
&filter_from(serde_json::json!({"sourceUpdatedBefore": "2026-06-15T00:00:00Z"}))
|
||||||
|
));
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn source_filter_status_valid_vs_tombstoned() {
|
||||||
|
let mut live = node_with_source("github", "o/r", "1", "octo", "2026-06-19T00:00:00Z");
|
||||||
|
let mut dead = node_with_source("github", "o/r", "2", "octo", "2026-06-19T00:00:00Z");
|
||||||
|
let legacy = vestige_core::KnowledgeNode::default();
|
||||||
|
// Tombstone `dead` by setting valid_until in the past.
|
||||||
|
dead.valid_until = Some(chrono::Utc::now() - chrono::Duration::days(1));
|
||||||
|
live.valid_until = None;
|
||||||
|
|
||||||
|
let valid = filter_from(serde_json::json!({"sourceStatus": "valid"}));
|
||||||
|
assert!(node_matches_source(&live, &valid));
|
||||||
|
assert!(!node_matches_source(&dead, &valid));
|
||||||
|
assert!(
|
||||||
|
!node_matches_source(&legacy, &valid),
|
||||||
|
"source_status is source-scoped and must not include legacy memories"
|
||||||
|
);
|
||||||
|
|
||||||
|
let tomb = filter_from(serde_json::json!({"sourceStatus": "tombstoned"}));
|
||||||
|
assert!(!node_matches_source(&live, &tomb));
|
||||||
|
assert!(node_matches_source(&dead, &tomb));
|
||||||
|
assert!(!node_matches_source(&legacy, &tomb));
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn source_filter_rejects_bad_timestamp_and_status() {
|
||||||
|
let mut v = serde_json::json!({"query": "q", "sourceUpdatedAfter": "not-a-date"});
|
||||||
|
let args: SearchArgs = serde_json::from_value(v.take()).unwrap();
|
||||||
|
assert!(SourceFilter::from_args(&args).is_err());
|
||||||
|
|
||||||
|
let mut v2 = serde_json::json!({"query": "q", "sourceStatus": "bogus"});
|
||||||
|
let args2: SearchArgs = serde_json::from_value(v2.take()).unwrap();
|
||||||
|
assert!(SourceFilter::from_args(&args2).is_err());
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn test_schema_has_source_filters() {
|
||||||
|
let s = schema();
|
||||||
|
for prop in [
|
||||||
|
"source_system",
|
||||||
|
"source_project",
|
||||||
|
"source_id",
|
||||||
|
"source_type",
|
||||||
|
"source_author",
|
||||||
|
"source_updated_after",
|
||||||
|
"source_updated_before",
|
||||||
|
"source_status",
|
||||||
|
] {
|
||||||
|
assert!(
|
||||||
|
s["properties"][prop].is_object(),
|
||||||
|
"schema must expose {prop}"
|
||||||
|
);
|
||||||
|
}
|
||||||
|
// None of the source filters are required.
|
||||||
|
let required = s["required"].as_array().unwrap();
|
||||||
|
for prop in ["source_system", "source_status"] {
|
||||||
|
assert!(!required.contains(&serde_json::json!(prop)));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
/// Helper that ingests a memory with specific tags. The base
|
/// Helper that ingests a memory with specific tags. The base
|
||||||
/// `ingest_test_content` helper passes `tags: vec![]`, which is fine
|
/// `ingest_test_content` helper passes `tags: vec![]`, which is fine
|
||||||
/// for legacy tests but not for tag_prefix coverage.
|
/// for legacy tests but not for tag_prefix coverage.
|
||||||
|
|
|
||||||
|
|
@ -1,11 +1,11 @@
|
||||||
//! `source_sync` MCP tool (#57) — index an external system into Vestige.
|
//! `source_sync` MCP tool (#57) — index an external system into Vestige.
|
||||||
//!
|
//!
|
||||||
//! Turns Vestige into a durable, offline, provenance-linked retrieval layer
|
//! Turns Vestige into a durable, offline, provenance-linked retrieval layer
|
||||||
//! over a long-lived external system. The first connector is GitHub Issues:
|
//! over a long-lived external system. GitHub Issues and Redmine are the first
|
||||||
//! point it at `owner/repo` and Vestige indexes every issue + its comments as
|
//! reference connectors: Vestige indexes issues, comments/journals, and source
|
||||||
//! source-aware memories you can search semantically and cite back to the
|
//! metadata as source-aware memories you can search semantically and cite back
|
||||||
//! canonical issue URL — re-runnable idempotently (no duplicates) and able to
|
//! to the canonical issue URL — re-runnable idempotently (no duplicates) and
|
||||||
//! tombstone issues that vanish upstream.
|
//! able to tombstone issues that vanish upstream.
|
||||||
//!
|
//!
|
||||||
//! Unlike the official GitHub MCP server (a stateless live API proxy), this
|
//! Unlike the official GitHub MCP server (a stateless live API proxy), this
|
||||||
//! keeps a local index: searchable offline, embedded for semantic recall,
|
//! keeps a local index: searchable offline, embedded for semantic recall,
|
||||||
|
|
@ -13,10 +13,11 @@
|
||||||
//!
|
//!
|
||||||
//! ## Auth (security)
|
//! ## Auth (security)
|
||||||
//!
|
//!
|
||||||
//! The GitHub token is read from the `GITHUB_TOKEN` (or `VESTIGE_GITHUB_TOKEN`)
|
//! Tokens are read from environment variables (`GITHUB_TOKEN` /
|
||||||
//! environment variable, never from tool arguments, so credentials are not
|
//! `VESTIGE_GITHUB_TOKEN`, `REDMINE_API_KEY` / `VESTIGE_REDMINE_API_KEY`) and
|
||||||
//! logged in the conversation. Public repositories work without a token at a
|
//! never from tool arguments, so credentials are not logged in the conversation.
|
||||||
//! lower rate limit.
|
//! Public GitHub repositories and anonymous Redmine instances can work without a
|
||||||
|
//! token/key at lower capability.
|
||||||
|
|
||||||
use std::sync::Arc;
|
use std::sync::Arc;
|
||||||
|
|
||||||
|
|
@ -32,13 +33,17 @@ pub fn schema() -> Value {
|
||||||
"properties": {
|
"properties": {
|
||||||
"source": {
|
"source": {
|
||||||
"type": "string",
|
"type": "string",
|
||||||
"enum": ["github"],
|
"enum": ["github", "redmine"],
|
||||||
"description": "External system to sync. Currently: 'github' (GitHub Issues).",
|
"description": "External system to sync: 'github' (GitHub Issues) or 'redmine' (a Redmine project).",
|
||||||
"default": "github"
|
"default": "github"
|
||||||
},
|
},
|
||||||
"repo": {
|
"repo": {
|
||||||
"type": "string",
|
"type": "string",
|
||||||
"description": "GitHub repository as 'owner/name', e.g. 'samvallad33/vestige'."
|
"description": "GitHub only: repository as 'owner/name', e.g. 'samvallad33/vestige'."
|
||||||
|
},
|
||||||
|
"project": {
|
||||||
|
"type": "string",
|
||||||
|
"description": "Redmine only: project identifier (slug or numeric id) to sync. The Redmine host comes from the REDMINE_URL env var."
|
||||||
},
|
},
|
||||||
"reconcile": {
|
"reconcile": {
|
||||||
"type": "boolean",
|
"type": "boolean",
|
||||||
|
|
@ -47,13 +52,13 @@ pub fn schema() -> Value {
|
||||||
},
|
},
|
||||||
"max_pages": {
|
"max_pages": {
|
||||||
"type": "integer",
|
"type": "integer",
|
||||||
"description": "Max API pages to fetch this run (each page is up to 100 issues). Lets a first sync of a large repo be resumed across calls. Default 10.",
|
"description": "Max API pages to fetch this run (each page is up to 100 issues). Lets a first sync of a large project be resumed across calls. Default 10.",
|
||||||
"default": 10,
|
"default": 10,
|
||||||
"minimum": 1,
|
"minimum": 1,
|
||||||
"maximum": 1000
|
"maximum": 1000
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"required": ["repo"]
|
"required": []
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
@ -62,7 +67,10 @@ pub fn schema() -> Value {
|
||||||
struct SourceSyncArgs {
|
struct SourceSyncArgs {
|
||||||
#[serde(default = "default_source")]
|
#[serde(default = "default_source")]
|
||||||
source: String,
|
source: String,
|
||||||
repo: String,
|
#[serde(default)]
|
||||||
|
repo: Option<String>,
|
||||||
|
#[serde(default)]
|
||||||
|
project: Option<String>,
|
||||||
#[serde(default)]
|
#[serde(default)]
|
||||||
reconcile: bool,
|
reconcile: bool,
|
||||||
#[serde(default, alias = "max_pages")]
|
#[serde(default, alias = "max_pages")]
|
||||||
|
|
@ -81,35 +89,60 @@ fn github_token() -> Option<String> {
|
||||||
.filter(|s| !s.trim().is_empty())
|
.filter(|s| !s.trim().is_empty())
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Read the Redmine API key from the environment (never from tool args).
|
||||||
|
fn redmine_api_key() -> Option<String> {
|
||||||
|
std::env::var("REDMINE_API_KEY")
|
||||||
|
.or_else(|_| std::env::var("VESTIGE_REDMINE_API_KEY"))
|
||||||
|
.ok()
|
||||||
|
.filter(|s| !s.trim().is_empty())
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Read the Redmine base URL from the environment.
|
||||||
|
fn redmine_url() -> Option<String> {
|
||||||
|
std::env::var("REDMINE_URL")
|
||||||
|
.or_else(|_| std::env::var("VESTIGE_REDMINE_URL"))
|
||||||
|
.ok()
|
||||||
|
.filter(|s| !s.trim().is_empty())
|
||||||
|
}
|
||||||
|
|
||||||
pub async fn execute(storage: &Arc<Storage>, args: Option<Value>) -> Result<Value, String> {
|
pub async fn execute(storage: &Arc<Storage>, args: Option<Value>) -> Result<Value, String> {
|
||||||
let args: SourceSyncArgs = match args {
|
let args: SourceSyncArgs = match args {
|
||||||
Some(v) => serde_json::from_value(v).map_err(|e| format!("Invalid arguments: {e}"))?,
|
Some(v) => serde_json::from_value(v).map_err(|e| format!("Invalid arguments: {e}"))?,
|
||||||
None => return Err("Missing arguments".to_string()),
|
None => return Err("Missing arguments".to_string()),
|
||||||
};
|
};
|
||||||
|
|
||||||
if args.source != "github" {
|
let max_pages = args.max_pages.unwrap_or(10);
|
||||||
return Err(format!(
|
|
||||||
"Unsupported source '{}'. Currently only 'github' is supported.",
|
match args.source.as_str() {
|
||||||
args.source
|
"github" => {
|
||||||
));
|
let repo = args
|
||||||
|
.repo
|
||||||
|
.as_deref()
|
||||||
|
.ok_or_else(|| "github requires a 'repo' ('owner/name')".to_string())?;
|
||||||
|
let (owner, repo) = repo
|
||||||
|
.split_once('/')
|
||||||
|
.filter(|(o, r)| !o.is_empty() && !r.is_empty())
|
||||||
|
.ok_or_else(|| {
|
||||||
|
"repo must be in 'owner/name' form, e.g. 'samvallad33/vestige'".to_string()
|
||||||
|
})?;
|
||||||
|
execute_github(storage, owner, repo, args.reconcile, max_pages).await
|
||||||
|
}
|
||||||
|
"redmine" => {
|
||||||
|
let project = args
|
||||||
|
.project
|
||||||
|
.as_deref()
|
||||||
|
.filter(|p| !p.trim().is_empty())
|
||||||
|
.ok_or_else(|| "redmine requires a 'project' identifier".to_string())?;
|
||||||
|
let base_url = redmine_url().ok_or_else(|| {
|
||||||
|
"set the REDMINE_URL env var to the Redmine host (e.g. https://redmine.example.com)"
|
||||||
|
.to_string()
|
||||||
|
})?;
|
||||||
|
execute_redmine(storage, &base_url, project, args.reconcile, max_pages).await
|
||||||
|
}
|
||||||
|
other => Err(format!(
|
||||||
|
"Unsupported source '{other}'. Supported: 'github', 'redmine'."
|
||||||
|
)),
|
||||||
}
|
}
|
||||||
|
|
||||||
let (owner, repo) = args
|
|
||||||
.repo
|
|
||||||
.split_once('/')
|
|
||||||
.filter(|(o, r)| !o.is_empty() && !r.is_empty())
|
|
||||||
.ok_or_else(|| {
|
|
||||||
"repo must be in 'owner/name' form, e.g. 'samvallad33/vestige'".to_string()
|
|
||||||
})?;
|
|
||||||
|
|
||||||
execute_github(
|
|
||||||
storage,
|
|
||||||
owner,
|
|
||||||
repo,
|
|
||||||
args.reconcile,
|
|
||||||
args.max_pages.unwrap_or(10),
|
|
||||||
)
|
|
||||||
.await
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Connectors are feature-gated; surface a clear message when the build omits
|
/// Connectors are feature-gated; surface a clear message when the build omits
|
||||||
|
|
@ -122,11 +155,24 @@ async fn execute_github(
|
||||||
_reconcile: bool,
|
_reconcile: bool,
|
||||||
_max_pages: usize,
|
_max_pages: usize,
|
||||||
) -> Result<Value, String> {
|
) -> Result<Value, String> {
|
||||||
Err("This Vestige build was compiled without the 'connectors' feature. \
|
Err(NO_CONNECTORS_MSG.to_string())
|
||||||
Rebuild with --features connectors to enable source_sync."
|
|
||||||
.to_string())
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#[cfg(not(feature = "connectors"))]
|
||||||
|
async fn execute_redmine(
|
||||||
|
_storage: &Arc<Storage>,
|
||||||
|
_base_url: &str,
|
||||||
|
_project: &str,
|
||||||
|
_reconcile: bool,
|
||||||
|
_max_pages: usize,
|
||||||
|
) -> Result<Value, String> {
|
||||||
|
Err(NO_CONNECTORS_MSG.to_string())
|
||||||
|
}
|
||||||
|
|
||||||
|
#[cfg(not(feature = "connectors"))]
|
||||||
|
const NO_CONNECTORS_MSG: &str = "This Vestige build was compiled without the 'connectors' feature. \
|
||||||
|
Rebuild with --features connectors to enable source_sync.";
|
||||||
|
|
||||||
#[cfg(feature = "connectors")]
|
#[cfg(feature = "connectors")]
|
||||||
async fn execute_github(
|
async fn execute_github(
|
||||||
storage: &Arc<Storage>,
|
storage: &Arc<Storage>,
|
||||||
|
|
@ -185,3 +231,61 @@ async fn execute_github(
|
||||||
}
|
}
|
||||||
}))
|
}))
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#[cfg(feature = "connectors")]
|
||||||
|
async fn execute_redmine(
|
||||||
|
storage: &Arc<Storage>,
|
||||||
|
base_url: &str,
|
||||||
|
project: &str,
|
||||||
|
reconcile: bool,
|
||||||
|
max_pages: usize,
|
||||||
|
) -> Result<Value, String> {
|
||||||
|
use vestige_core::connectors::redmine::{RedmineConfig, RedmineConnector};
|
||||||
|
use vestige_core::connectors::run_sync;
|
||||||
|
|
||||||
|
let config = RedmineConfig::new(base_url, project).with_api_key(redmine_api_key());
|
||||||
|
let connector =
|
||||||
|
RedmineConnector::new(config).map_err(|e| format!("connector init failed: {e}"))?;
|
||||||
|
|
||||||
|
let report = run_sync(storage.as_ref(), &connector, reconcile, max_pages)
|
||||||
|
.await
|
||||||
|
.map_err(|e| format!("sync failed: {e}"))?;
|
||||||
|
|
||||||
|
let total = report.created + report.updated + report.unchanged;
|
||||||
|
let authed = redmine_api_key().is_some();
|
||||||
|
|
||||||
|
let summary = format!(
|
||||||
|
"Synced redmine project '{project}': {} created, {} updated, {} unchanged{} ({total} records seen{}).",
|
||||||
|
report.created,
|
||||||
|
report.updated,
|
||||||
|
report.unchanged,
|
||||||
|
if report.reconciled {
|
||||||
|
format!(", {} tombstoned", report.tombstoned)
|
||||||
|
} else {
|
||||||
|
String::new()
|
||||||
|
},
|
||||||
|
if authed { "" } else { ", anonymous" },
|
||||||
|
);
|
||||||
|
|
||||||
|
Ok(json!({
|
||||||
|
"ok": true,
|
||||||
|
"summary": summary,
|
||||||
|
"source": "redmine",
|
||||||
|
"scope": project,
|
||||||
|
"created": report.created,
|
||||||
|
"updated": report.updated,
|
||||||
|
"unchanged": report.unchanged,
|
||||||
|
"tombstoned": report.tombstoned,
|
||||||
|
"reconciled": report.reconciled,
|
||||||
|
"cursor": report.new_cursor.map(|d| d.to_rfc3339()),
|
||||||
|
"authenticated": authed,
|
||||||
|
"warnings": report.warnings,
|
||||||
|
"hint": if total == 0 && !authed {
|
||||||
|
"No records returned. Set REDMINE_API_KEY (and confirm the REST API is enabled on the instance) for private projects."
|
||||||
|
} else if report.new_cursor.is_some() && total >= 100 {
|
||||||
|
"More may remain — run source_sync again to continue from the saved cursor."
|
||||||
|
} else {
|
||||||
|
"Search these with the normal search tools; results cite the Redmine issue URL."
|
||||||
|
}
|
||||||
|
}))
|
||||||
|
}
|
||||||
|
|
|
||||||
|
|
@ -1,7 +1,7 @@
|
||||||
# External-Source Connectors
|
# External-Source Connectors
|
||||||
|
|
||||||
> Status: **v2.1.27** — GitHub Issues connector (reference). Redmine and others
|
> Status: **v2.1.27** — GitHub Issues + Redmine reference connectors, plus
|
||||||
> follow the same contract. Tracking issue:
|
> source-aware investigation filters for search. Tracking issue:
|
||||||
> [#57](https://github.com/samvallad33/vestige/issues/57).
|
> [#57](https://github.com/samvallad33/vestige/issues/57).
|
||||||
|
|
||||||
Connectors let Vestige act as a durable, local **retrieval and reasoning layer**
|
Connectors let Vestige act as a durable, local **retrieval and reasoning layer**
|
||||||
|
|
@ -59,18 +59,63 @@ you can:
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## Quick start (Redmine)
|
||||||
|
|
||||||
|
Redmine stays the system of record; Vestige indexes a project's issues +
|
||||||
|
journals (comments and status/assignment history).
|
||||||
|
|
||||||
|
1. Point Vestige at the Redmine host and key (env only, never tool args):
|
||||||
|
|
||||||
|
```sh
|
||||||
|
export REDMINE_URL=https://redmine.example.com
|
||||||
|
export REDMINE_API_KEY=xxxxxxxx # or VESTIGE_REDMINE_API_KEY
|
||||||
|
```
|
||||||
|
|
||||||
|
The instance must have the REST API enabled (Administration → Settings → API)
|
||||||
|
or every call returns 401/403 even with a valid key.
|
||||||
|
|
||||||
|
2. Run `source_sync`:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{ "source": "redmine", "project": "infra" }
|
||||||
|
```
|
||||||
|
|
||||||
|
Results cite the canonical `https://redmine.example.com/issues/<id>` URL.
|
||||||
|
|
||||||
## The `source_sync` tool
|
## The `source_sync` tool
|
||||||
|
|
||||||
| Field | Type | Default | Meaning |
|
| Field | Type | Default | Meaning |
|
||||||
|---|---|---|---|
|
|---|---|---|---|
|
||||||
| `repo` | string | — (required) | `owner/name`, e.g. `samvallad33/vestige`. |
|
| `source` | string | `github` | `github` or `redmine`. |
|
||||||
| `source` | string | `github` | External system. Currently only `github`. |
|
| `repo` | string | — | **GitHub:** `owner/name`, e.g. `samvallad33/vestige`. |
|
||||||
|
| `project` | string | — | **Redmine:** project identifier (host from `REDMINE_URL`). |
|
||||||
| `reconcile` | bool | `false` | Also tombstone local memories for issues no longer visible upstream (an extra full-enumeration pass). |
|
| `reconcile` | bool | `false` | Also tombstone local memories for issues no longer visible upstream (an extra full-enumeration pass). |
|
||||||
| `max_pages` | int | `10` | API pages to fetch this run (≤100 issues each). Lets a first sync of a large repo resume across calls. |
|
| `max_pages` | int | `10` | API pages to fetch this run (≤100 issues each). Lets a first sync of a large project resume across calls. |
|
||||||
|
|
||||||
The tool returns counts (`created` / `updated` / `unchanged` / `tombstoned`),
|
The tool returns counts (`created` / `updated` / `unchanged` / `tombstoned`),
|
||||||
the saved `cursor`, whether it ran authenticated, and a `hint` for the next step.
|
the saved `cursor`, whether it ran authenticated, and a `hint` for the next step.
|
||||||
|
|
||||||
|
## Investigation filters (Phase 4)
|
||||||
|
|
||||||
|
`search` accepts source-aware filters so an agent can scope a query to indexed
|
||||||
|
records. All are optional post-filters applied after retrieval, so raise `limit`
|
||||||
|
if you expect them to discard many results. A source-scoped query excludes non-connector memories.
|
||||||
|
|
||||||
|
| Filter | Matches |
|
||||||
|
|---|---|
|
||||||
|
| `source_system` | `github`, `redmine`, … |
|
||||||
|
| `source_project` | repo / project (exact) |
|
||||||
|
| `source_id` | a specific issue/ticket id |
|
||||||
|
| `source_type` | `issue`, `comment`, … |
|
||||||
|
| `source_author` | reporter/author (not assignee) |
|
||||||
|
| `source_updated_after` / `source_updated_before` | RFC3339 date range (inclusive) |
|
||||||
|
| `source_status` | `valid` or `tombstoned` (default: `any`) |
|
||||||
|
|
||||||
|
Status, tracker, and priority are filterable through the existing `tag_prefix`
|
||||||
|
(the connectors emit lowercase `status:`, `tracker:`, `priority:`, and GitHub
|
||||||
|
`label:` / `state:` tags) — e.g. `tag_prefix: "status:open"`. Assignee and
|
||||||
|
linked-issue graph traversal are not yet exposed (see below).
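
The filter semantics described above (an empty filter matches everything, a source-scoped filter excludes memories with no envelope, date bounds are inclusive) can be sketched roughly as follows. This is a minimal illustration, not the actual `SourceFilter` code: `Envelope`, `Filter`, `field_ok`, and `envelope_matches` are hypothetical names, timestamps are plain epoch seconds rather than RFC3339 instants, and treating a missing timestamp as failing a date bound is an assumption.

```rust
// Simplified, hypothetical stand-ins for illustration; not the vestige_core types.
#[derive(Default)]
struct Envelope {
    system: Option<String>,
    project: Option<String>,
    updated_at: Option<u64>, // epoch seconds, standing in for an RFC3339 instant
}

#[derive(Default)]
struct Filter {
    system: Option<String>,
    project: Option<String>,
    updated_after: Option<u64>,
    updated_before: Option<u64>,
}

impl Filter {
    /// True when at least one source-scoped field is set.
    fn is_active(&self) -> bool {
        self.system.is_some()
            || self.project.is_some()
            || self.updated_after.is_some()
            || self.updated_before.is_some()
    }
}

/// Exact-match helper: an unset filter field accepts anything.
fn field_ok(want: &Option<String>, have: &Option<String>) -> bool {
    match want {
        None => true,
        Some(w) => have.as_deref() == Some(w.as_str()),
    }
}

fn envelope_matches(env: Option<&Envelope>, f: &Filter) -> bool {
    if !f.is_active() {
        return true; // no filter: every memory passes, envelope or not
    }
    let env = match env {
        Some(e) => e,
        None => return false, // source-scoped query excludes non-connector memories
    };
    if !field_ok(&f.system, &env.system) || !field_ok(&f.project, &env.project) {
        return false;
    }
    // Both bounds are inclusive. Assumption: a record with no timestamp fails a set bound.
    let after_ok = f.updated_after.map_or(true, |a| env.updated_at.map_or(false, |t| t >= a));
    let before_ok = f.updated_before.map_or(true, |b| env.updated_at.map_or(false, |t| t <= b));
    after_ok && before_ok
}
```

Because the check runs after retrieval, a narrow filter can leave fewer results than `limit`, which is why a larger `limit` helps.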
|
||||||
|
|
||||||
### Idempotent, incremental sync
|
### Idempotent, incremental sync
|
||||||
|
|
||||||
Each run:
|
Each run:
|
||||||
|
|
@ -144,7 +189,18 @@ cargo build -p vestige-core --features connectors
|
||||||
Implement the `Connector` trait in `vestige_core::connectors` (fetch a window of
|
Implement the `Connector` trait in `vestige_core::connectors` (fetch a window of
|
||||||
records updated since a cursor, page forward, and optionally enumerate live ids
|
records updated since a cursor, page forward, and optionally enumerate live ids
|
||||||
for reconciliation), produce `NormalizedRecord`s with a filled
|
for reconciliation), produce `NormalizedRecord`s with a filled
|
||||||
`SourceEnvelope`, and hand them to `run_sync`. The GitHub connector
|
`SourceEnvelope`, and hand them to `run_sync`. Two reference connectors show the
|
||||||
(`crates/vestige-core/src/connectors/github.rs`) is the reference
|
shape — `crates/vestige-core/src/connectors/github.rs` (Link-header pagination,
|
||||||
implementation. The sync driver, idempotent upsert, cursor checkpointing, and
|
opaque-url cursor) and `crates/vestige-core/src/connectors/redmine.rs`
|
||||||
tombstone reconciliation are all reused for free.
|
(offset pagination, two-phase list-then-detail fetch). The sync driver,
|
||||||
|
idempotent upsert, cursor checkpointing, and tombstone reconciliation are all
|
||||||
|
reused for free.
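
As a rough illustration of offset pagination under a per-run page budget (the role `max_pages` plays), here is a minimal sketch. It does not use the real `Connector` trait, whose signature is not shown here: `PageSource`, `InMemory`, and `drain_pages` are hypothetical names, and records are reduced to bare integer ids.

```rust
/// Hypothetical page source: returns up to `limit` record ids starting at `offset`.
trait PageSource {
    fn fetch(&self, offset: usize, limit: usize) -> Vec<u32>;
}

/// In-memory stand-in for a remote API, for illustration only.
struct InMemory(Vec<u32>);

impl PageSource for InMemory {
    fn fetch(&self, offset: usize, limit: usize) -> Vec<u32> {
        self.0.iter().skip(offset).take(limit).copied().collect()
    }
}

/// Walk pages by offset until a short page signals the end or the page
/// budget is spent. Returns the collected ids and whether more may remain.
fn drain_pages(src: &dyn PageSource, page_size: usize, max_pages: usize) -> (Vec<u32>, bool) {
    let mut out = Vec::new();
    for page in 0..max_pages {
        let batch = src.fetch(page * page_size, page_size);
        let short = batch.len() < page_size;
        out.extend(batch);
        if short {
            return (out, false); // last page reached
        }
    }
    (out, true) // budget exhausted; a later run would continue from a saved cursor
}
```

With 250 records and a page size of 100, a budget of 2 pages collects 200 ids and reports that more may remain; a budget of 10 collects all 250.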
|
||||||
|
|
||||||
|
## Not yet supported
|
||||||
|
|
||||||
|
- **Assignee filter** — the envelope stores `source_author` (reporter) only; no
|
||||||
|
assignee column yet.
|
||||||
|
- **Tracker / version dedicated filter params** — reachable today via
|
||||||
|
`tag_prefix` (`tracker:`, and `version:`/`category:` when emitted).
|
||||||
|
- **Linked-issue graph traversal** — connectors import relations into the memory
|
||||||
|
body, but issue-to-issue graph edges are not yet exposed in search.
|
||||||
|
|
|
||||||