Merge pull request #78 from samvallad33/feat/connectors-issue57-v2.1.27

feat(connectors): GitHub + Redmine retrieval layer (#57)
2026-06-20 21:18:08 +02:00 · 2026-06-19 02:35:14 -05:00 · 2026-06-19 02:35:14 -05:00 · 8e4f4052cc
commit 8e4f4052cc
parent 22d0d192eb 4e893c02ff
40 changed files with 3893 additions and 103 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@ -5,7 +5,78 @@ All notable changes to Vestige will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## [Unreleased]
+## [Unreleased] — "External-Source Connectors"
+
+> Bump `version` in the workspace `Cargo.toml`, both crates, `server.json`, and
+> `package.json` to `2.1.27` at release/tag time, and date this heading.
+
+Roadmap [#57](https://github.com/samvallad33/vestige/issues/57), **Phases 1–4
+(complete)**: Vestige can now act as a durable, local, semantically-searchable
+retrieval layer over an external system of record — GitHub Issues and Redmine —
+without replacing it. The external system stays canonical; Vestige **indexes,
+connects, retrieves, and cites back** to the source record.
+
+Unlike a live ticket-system MCP proxy (which holds no state and is rate-limited
+per query), Vestige keeps a durable embedded index: searchable **offline**,
+**semantically**, joinable with the rest of your memory, temporally versioned,
+and re-syncable **idempotently** with no duplication. To our knowledge no other
+local-first memory layer combines native connectors, external-URL provenance,
+content-hash idempotent sync, and tombstoning of vanished records.
+
+### Added
+
+- **`source_sync` MCP tool** — index an external system into Vestige.
+  - GitHub: `{"source": "github", "repo": "owner/name"}` indexes every issue +
+    its comments. Auth via `GITHUB_TOKEN` (public repos work tokenless at a
+    lower rate limit).
+  - Redmine: `{"source": "redmine", "project": "<id>"}` indexes a project's
+    issues + journals (comments and status/assignment history). Host from
+    `REDMINE_URL`, auth from `REDMINE_API_KEY`.
+  - Re-running updates changed issues in place (no duplicates); `reconcile:
+    true` tombstones issues no longer visible upstream.
+- **Source-aware investigation filters on `search`** (Phase 4) — filter results
+  by `source_system`, `source_project`, `source_id`, `source_type`,
+  `source_author`, a `source_updated_after`/`source_updated_before` date range,
+  and `source_status` (`valid` / `tombstoned` / `any`). Status, tracker, and
+  priority remain filterable via the existing `tag_prefix` (the connectors emit
+  `status:`/`tracker:`/`priority:`/`label:` tags). Applied as post-filters;
+  non-connector memories are excluded from a source-scoped query.
+- **Source envelope** on every memory — structured, machine-readable provenance
+  (`source_system`, `source_id`, `source_url`, `source_updated_at`,
+  `content_hash`, `synced_at`, `source_project`, `source_type`, `source_author`)
+  distinct from the legacy free-form `source` label. Search results gain a
+  `sourceRecord` object (with the canonical `url`) **only** for
+  connector-ingested memories, so an agent can cite and follow the source.
+- **Idempotent sync primitives** (`vestige-core`): `upsert_by_source` (keyed on
+  `(source_system, source_id)`, content-hash change detection), per-connector
+  cursor checkpoints (`connector_cursors`), and `reconcile_source_tombstones`
+  (invalidate-don't-delete via the bitemporal `valid_until`, so a vanished
+  record is retained for audit but drops out of current retrieval).
+- **Connector contract** (`vestige_core::connectors`) — a small source-agnostic
+  `Connector` trait + `run_sync` driver (cursor overlap window, incremental
+  paging, optional deletion reconcile) with two reference connectors behind the
+  optional `connectors` cargo feature (on by default in the MCP server, off in
+  the core library's default features so non-connector consumers link no HTTP
+  client):
+  - **GitHub Issues** — `state=all`, `since` cursor, Link-header pagination,
+    drops PRs, host-pinned next-url.
+  - **Redmine** — `status_id=*` (open + closed), hex-encoded `updated_on>=`
+    cursor, `offset` pagination, per-issue detail fetch for journals (the list
+    endpoint omits them), `X-Redmine-API-Key` header auth.
+
+### Database
+
+- **Migration V17** — nine nullable source-envelope columns on `knowledge_nodes`
+  (additive; every existing memory is untouched), a partial UNIQUE index on
+  `(source_system, source_id)` enforcing one memory per external record while
+  costing nothing for envelope-less legacy rows, and the `connector_cursors`
+  checkpoint table. Idempotent on replay, following the established
+  `add_column_if_missing` pattern.
+
+### Notes
+
+- Local-first and optional: with no `source_sync` call, behavior is unchanged.
+  The default core-library build does not link an HTTP client.

 ## [2.1.26] - 2026-06-15 — "Configurable Output"

--- a/Cargo.lock
+++ b/Cargo.lock
@ -521,6 +521,12 @@ version = "1.0.4"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801"

+[[package]]
+name = "cfg_aliases"
+version = "0.2.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "613afe47fcd5fac7ccf1db93babcb082c5994d996f20b8b159f2ad1658eb5724"
+
 [[package]]
 name = "chrono"
 version = "0.4.44"
@ -1681,8 +1687,10 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "ff2abc00be7fca6ebc474524697ae276ad847ad0a6b3faa4bcb027e9a4614ad0"
 dependencies = [
 "cfg-if",
+ "js-sys",
 "libc",
 "wasi",
+ "wasm-bindgen",
 ]

 [[package]]
@ -1692,9 +1700,11 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "899def5c37c4fd7b2664648c28120ecec138e4d395b459e5ca34f9cce2dd77fd"
 dependencies = [
 "cfg-if",
+ "js-sys",
 "libc",
 "r-efi 5.3.0",
 "wasip2",
+ "wasm-bindgen",
 ]

 [[package]]
@ -1932,6 +1942,7 @@ dependencies = [
 "tokio",
 "tokio-rustls",
 "tower-service",
+ "webpki-roots 1.0.6",
 ]

 [[package]]
@ -2519,6 +2530,12 @@ dependencies = [
 "hashbrown 0.16.1",
 ]

+[[package]]
+name = "lru-slab"
+version = "0.1.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "112b39cec0b298b6c1999fee3e31427f74f676e4cb9879ed1a121b43661a4154"
+
 [[package]]
 name = "lzma-rust2"
 version = "0.15.7"
@ -3335,6 +3352,61 @@ version = "2.0.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "a993555f31e5a609f617c12db6250dedcac1b0a85076912c436e6fc9b2c8e6a3"

+[[package]]
+name = "quinn"
+version = "0.11.9"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "b9e20a958963c291dc322d98411f541009df2ced7b5a4f2bd52337638cfccf20"
+dependencies = [
+ "bytes",
+ "cfg_aliases",
+ "pin-project-lite",
+ "quinn-proto",
+ "quinn-udp",
+ "rustc-hash",
+ "rustls",
+ "socket2",
+ "thiserror 2.0.18",
+ "tokio",
+ "tracing",
+ "web-time",
+]
+
+[[package]]
+name = "quinn-proto"
+version = "0.11.14"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "434b42fec591c96ef50e21e886936e66d3cc3f737104fdb9b737c40ffb94c098"
+dependencies = [
+ "bytes",
+ "getrandom 0.3.4",
+ "lru-slab",
+ "rand",
+ "ring",
+ "rustc-hash",
+ "rustls",
+ "rustls-pki-types",
+ "slab",
+ "thiserror 2.0.18",
+ "tinyvec",
+ "tracing",
+ "web-time",
+]
+
+[[package]]
+name = "quinn-udp"
+version = "0.5.14"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "addec6a0dcad8a8d96a771f815f0eaf55f9d1805756410b39f5fa81332574cbd"
+dependencies = [
+ "cfg_aliases",
+ "libc",
+ "once_cell",
+ "socket2",
+ "tracing",
+ "windows-sys 0.52.0",
+]
+
 [[package]]
 name = "quote"
 version = "1.0.45"
@ -3571,6 +3643,8 @@ dependencies = [
 "native-tls",
 "percent-encoding",
 "pin-project-lite",
+ "quinn",
+ "rustls",
 "rustls-pki-types",
 "serde",
 "serde_json",
@ -3578,6 +3652,7 @@ dependencies = [
 "sync_wrapper",
 "tokio",
 "tokio-native-tls",
+ "tokio-rustls",
 "tokio-util",
 "tower",
 "tower-http",
@ -3587,6 +3662,7 @@ dependencies = [
 "wasm-bindgen-futures",
 "wasm-streams",
 "web-sys",
+ "webpki-roots 1.0.6",
 ]

 [[package]]
@ -3636,6 +3712,12 @@ dependencies = [
 "sqlite-wasm-rs",
 ]

+[[package]]
+name = "rustc-hash"
+version = "2.1.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "94300abf3f1ae2e2b8ffb7b58043de3d399c73fa6f4b73826402a5c457614dbe"
+
 [[package]]
 name = "rustix"
 version = "1.1.4"
@ -3670,6 +3752,7 @@ version = "1.14.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "be040f8b0a225e40375822a563fa9524378b9d63112f53e19ffff34df5d33fdd"
 dependencies = [
+ "web-time",
 "zeroize",
 ]

@ -4162,6 +4245,21 @@ dependencies = [
 "serde_json",
 ]

+[[package]]
+name = "tinyvec"
+version = "1.11.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "3e61e67053d25a4e82c844e8424039d9745781b3fc4f32b8d55ed50f5f667ef3"
+dependencies = [
+ "tinyvec_macros",
+]
+
+[[package]]
+name = "tinyvec_macros"
+version = "0.1.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "1f3ccbac311fea05f86f61904b462b55fb3df8837a366dfc601a0161d0532f20"
+
 [[package]]
 name = "tokenizers"
 version = "0.22.2"
@ -4684,6 +4782,7 @@ dependencies = [
 "git2",
 "lru",
 "notify",
+ "reqwest",
 "rusqlite",
 "serde",
 "serde_json",
--- a/crates/vestige-core/Cargo.toml
+++ b/crates/vestige-core/Cargo.toml
@ -57,6 +57,12 @@ qwen3-embeddings = ["embeddings", "fastembed/qwen3", "dep:candle-core"]
 # Backwards-compatible feature alias from the original v2.1.0 naming.
 qwen3-reranker = ["qwen3-embeddings"]

+# External-source connectors (#57). The connector *contract*, normalization,
+# and content-hashing are always compiled (pure, no network). This feature adds
+# the network-backed reference connectors (GitHub Issues, …) via `reqwest`, so
+# the default local-first build never links an HTTP client.
+connectors = ["dep:reqwest"]
+
 # Metal GPU acceleration on Apple Silicon (significantly faster inference)
 metal = ["fastembed/metal"]

@ -132,6 +138,14 @@ lru = "0.16"
 trait-variant = "0.1"
 blake3 = "1"

+# ============================================================================
+# OPTIONAL: External-source connectors (#57)
+# ============================================================================
+# HTTP client for network-backed reference connectors (GitHub Issues, Redmine).
+# rustls so connectors build with no system OpenSSL dependency. Behind the
+# `connectors` feature — the default local-first build does not link reqwest.
+reqwest = { version = "0.12", default-features = false, features = ["json", "rustls-tls"], optional = true }
+
 [dev-dependencies]
 tempfile = "3"
 criterion = { version = "0.5", features = ["html_reports"] }
--- a/crates/vestige-core/src/connectors/github.rs
+++ b/crates/vestige-core/src/connectors/github.rs
@ -0,0 +1,573 @@
+//! GitHub Issues connector (#57).
+//!
+//! Indexes a repository's issues + comments into source-aware Vestige memories
+//! so an agent can search and reason over the full issue history **offline**,
+//! **semantically**, and **cited back to the canonical issue URL**. Unlike the
+//! official GitHub MCP server — a stateless live API proxy — this builds a
+//! durable, embedded, temporally-versioned local index.
+//!
+//! ## Incremental sync (per the connector sync contract)
+//!
+//! - `state=all` so closing an issue is not mistaken for a deletion.
+//! - `sort=updated&direction=asc` so we page forward in cursor order and a
+//!   mid-run interruption resumes safely.
+//! - `since=<cursor − overlap>` filters on `updated_at`; the overlap + the
+//!   `content_hash` no-op makes re-scans safe and cheap.
+//! - `Link: rel="next"` drives pagination (never hand-built page urls).
+//! - Entries carrying a `pull_request` key are dropped (PRs are not issues).
+//! - Per issue we fold the body + comments into one memory; the hash covers
+//!   the stable fields only (title, body, state, labels, comments) — never the
+//!   cursor timestamp or volatile counts.
+//!
+//! GitHub has no deletion feed, so deletions are reconciled out-of-band via
+//! [`list_live_ids`](Connector::list_live_ids).
+
+use chrono::{DateTime, Utc};
+use serde::Deserialize;
+
+use super::{
+    Connector, ConnectorError, ConnectorResult, FetchPage, NormalizedRecord, content_hash,
+};
+use crate::memory::SourceEnvelope;
+
+const API_ROOT: &str = "https://api.github.com";
+const USER_AGENT: &str = concat!("vestige-connector/", env!("CARGO_PKG_VERSION"));
+const PER_PAGE: u32 = 100;
+
+/// Configuration for a GitHub Issues connector instance.
+#[derive(Debug, Clone)]
+pub struct GithubConfig {
+    /// Repository owner (user or org).
+    pub owner: String,
+    /// Repository name.
+    pub repo: String,
+    /// Personal access token. Optional for public repos (60 req/hr
+    /// unauthenticated) but strongly recommended (5000 req/hr authenticated).
+    pub token: Option<String>,
+    /// Override the API root (for GitHub Enterprise or tests).
+    pub api_root: Option<String>,
+    /// Max comments to fold into one issue memory (defense against huge threads).
+    pub max_comments: usize,
+}
+
+impl GithubConfig {
+    pub fn new(owner: impl Into<String>, repo: impl Into<String>) -> Self {
+        Self {
+            owner: owner.into(),
+            repo: repo.into(),
+            token: None,
+            api_root: None,
+            max_comments: 50,
+        }
+    }
+
+    pub fn with_token(mut self, token: Option<String>) -> Self {
+        self.token = token;
+        self
+    }
+
+    fn scope(&self) -> String {
+        format!("{}/{}", self.owner, self.repo)
+    }
+
+    fn root(&self) -> &str {
+        self.api_root.as_deref().unwrap_or(API_ROOT)
+    }
+}
+
+/// A GitHub Issues connector bound to one repository.
+pub struct GithubConnector {
+    config: GithubConfig,
+    scope: String,
+    client: reqwest::Client,
+}
+
+impl GithubConnector {
+    pub fn new(config: GithubConfig) -> ConnectorResult<Self> {
+        if config.owner.is_empty() || config.repo.is_empty() {
+            return Err(ConnectorError::Config(
+                "owner and repo are required".to_string(),
+            ));
+        }
+        let client = reqwest::Client::builder()
+            .user_agent(USER_AGENT)
+            .build()
+            .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+        let scope = config.scope();
+        Ok(Self {
+            config,
+            scope,
+            client,
+        })
+    }
+
+    fn auth(&self, req: reqwest::RequestBuilder) -> reqwest::RequestBuilder {
+        let req = req
+            .header("Accept", "application/vnd.github+json")
+            .header("X-GitHub-Api-Version", "2022-11-28");
+        match &self.config.token {
+            Some(t) => req.bearer_auth(t),
+            None => req,
+        }
+    }
+
+    /// Map an HTTP response status into a connector error, honoring rate-limit
+    /// signals so the driver can back off politely.
+    fn classify_status(resp: &reqwest::Response) -> Option<ConnectorError> {
+        let status = resp.status();
+        if status.is_success() {
+            return None;
+        }
+        // Primary rate limit: 403/429 with remaining=0.
+        if status.as_u16() == 403 || status.as_u16() == 429 {
+            let remaining = resp
+                .headers()
+                .get("x-ratelimit-remaining")
+                .and_then(|v| v.to_str().ok())
+                .and_then(|s| s.parse::<i64>().ok());
+            if remaining == Some(0) || status.as_u16() == 429 {
+                let retry = resp
+                    .headers()
+                    .get("retry-after")
+                    .and_then(|v| v.to_str().ok())
+                    .and_then(|s| s.parse::<u64>().ok())
+                    .map(std::time::Duration::from_secs);
+                return Some(ConnectorError::RateLimited(retry));
+            }
+        }
+        Some(ConnectorError::Source {
+            status: status.as_u16(),
+            message: status
+                .canonical_reason()
+                .unwrap_or("request failed")
+                .to_string(),
+        })
+    }
+
+    /// Parse the `Link` header for the `rel="next"` url, if any.
+    ///
+    /// The `next` url comes from the server response, so we pin it to the
+    /// configured API host before following it: otherwise a malicious or
+    /// compromised endpoint could redirect the connector — which attaches the
+    /// bearer token to every request — to an attacker-controlled URL and
+    /// exfiltrate the credential (SSRF / token leak). `expected_host` is the
+    /// host of the connector's API root.
+    fn next_link(resp: &reqwest::Response, expected_host: Option<&str>) -> Option<String> {
+        let link = resp.headers().get(reqwest::header::LINK)?.to_str().ok()?;
+        for part in link.split(',') {
+            let part = part.trim();
+            if part.contains("rel=\"next\"")
+                && let (Some(start), Some(end)) = (part.find('<'), part.find('>'))
+                && start < end
+            {
+                let url = &part[start + 1..end];
+                // Host-pin: only follow a next-url on the same host as the API
+                // root we were configured with.
+                if let Some(expected) = expected_host {
+                    match reqwest::Url::parse(url) {
+                        Ok(parsed) if parsed.host_str() == Some(expected) => {
+                            return Some(url.to_string());
+                        }
+                        _ => {
+                            tracing::warn!(
+                                next_url = url,
+                                "dropping cross-host Link next url (host pin)"
+                            );
+                            return None;
+                        }
+                    }
+                }
+                return Some(url.to_string());
+            }
+        }
+        None
+    }
+
+    /// Host of the configured API root, used to pin Link `next` urls.
+    fn api_host(&self) -> Option<String> {
+        reqwest::Url::parse(self.config.root())
+            .ok()
+            .and_then(|u| u.host_str().map(|h| h.to_string()))
+    }
+
+    /// Fetch the comments for one issue (a single page; capped by `max_comments`).
+    async fn fetch_comments(&self, issue_number: u64) -> ConnectorResult<Vec<RawComment>> {
+        let url = format!(
+            "{}/repos/{}/{}/issues/{}/comments?per_page={}",
+            self.config.root(),
+            self.config.owner,
+            self.config.repo,
+            issue_number,
+            self.config.max_comments.min(100),
+        );
+        let resp = self
+            .auth(self.client.get(&url))
+            .send()
+            .await
+            .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+        if let Some(err) = Self::classify_status(&resp) {
+            return Err(err);
+        }
+        resp.json::<Vec<RawComment>>()
+            .await
+            .map_err(|e| ConnectorError::Transport(e.to_string()))
+    }
+
+    /// Fold a raw issue + its comments into one normalized memory record.
+    fn normalize(&self, issue: &RawIssue, comments: &[RawComment]) -> NormalizedRecord {
+        let author = issue.user.as_ref().map(|u| u.login.clone());
+
+        // Human-readable content: header + body + chronological comments.
+        let mut content = format!(
+            "[{}#{}] {}\nState: {}\n",
+            self.scope, issue.number, issue.title, issue.state
+        );
+        if let Some(body) = &issue.body
+            && !body.trim().is_empty()
+        {
+            content.push('\n');
+            content.push_str(body.trim());
+            content.push('\n');
+        }
+        let mut sorted_comments: Vec<&RawComment> = comments.iter().collect();
+        sorted_comments.sort_by_key(|c| c.id);
+        for c in &sorted_comments {
+            let who = c.user.as_ref().map(|u| u.login.as_str()).unwrap_or("?");
+            content.push_str(&format!("\n— {who}: {}", c.body.trim()));
+        }
+
+        // Labels, sorted for a stable hash.
+        let mut labels: Vec<String> = issue.labels.iter().map(|l| l.name.clone()).collect();
+        labels.sort();
+
+        // Stable content hash — meaning only, never the cursor timestamp or
+        // volatile counts. Comments contribute their id+body in id order.
+        let comments_blob = sorted_comments
+            .iter()
+            .map(|c| format!("{}:{}", c.id, c.body.trim()))
+            .collect::<Vec<_>>()
+            .join("\u{1f}");
+        let labels_blob = labels.join(",");
+        let number_str = issue.number.to_string();
+        let body_str = issue.body.clone().unwrap_or_default();
+        let hash = content_hash(&[
+            ("number", &number_str),
+            ("title", &issue.title),
+            ("state", &issue.state),
+            ("body", &body_str),
+            ("labels", &labels_blob),
+            ("comments", &comments_blob),
+        ]);
+
+        let mut tags = vec![
+            "github".to_string(),
+            "issue".to_string(),
+            format!("state:{}", issue.state),
+        ];
+        tags.extend(labels.into_iter().map(|l| format!("label:{l}")));
+
+        let envelope = SourceEnvelope {
+            source_system: Some("github".to_string()),
+            source_id: Some(issue.number.to_string()),
+            source_url: Some(issue.html_url.clone()),
+            source_updated_at: DateTime::parse_from_rfc3339(&issue.updated_at)
+                .ok()
+                .map(|d| d.with_timezone(&Utc)),
+            content_hash: Some(hash),
+            synced_at: Some(Utc::now()),
+            source_project: Some(self.scope.clone()),
+            source_type: Some("issue".to_string()),
+            source_author: author,
+        };
+
+        NormalizedRecord {
+            content,
+            tags,
+            envelope,
+        }
+    }
+}
+
+impl Connector for GithubConnector {
+    fn source_system(&self) -> &str {
+        "github"
+    }
+
+    fn scope(&self) -> &str {
+        &self.scope
+    }
+
+    async fn fetch_updated(
+        &self,
+        since: Option<DateTime<Utc>>,
+        cursor: Option<String>,
+    ) -> ConnectorResult<FetchPage> {
+        // `cursor` is a full next-page url from a previous Link header; on the
+        // first page we build the url from owner/repo + since.
+        let url = match cursor {
+            Some(u) => u,
+            None => {
+                let mut u = format!(
+                    "{}/repos/{}/{}/issues?state=all&sort=updated&direction=asc&per_page={}",
+                    self.config.root(),
+                    self.config.owner,
+                    self.config.repo,
+                    PER_PAGE,
+                );
+                if let Some(s) = since {
+                    // GitHub documents the `since` format as YYYY-MM-DDTHH:MM:SSZ.
+                    // `to_rfc3339()` emits the `+00:00` offset form, and the `+`
+                    // is a reserved query char that the server decodes as a
+                    // space — corrupting the timestamp and silently re-fetching
+                    // all history every run. Emit the `Z` form (no reserved
+                    // char, exact documented format) instead.
+                    let since_z = s.to_rfc3339_opts(chrono::SecondsFormat::Secs, true);
+                    u.push_str(&format!("&since={since_z}"));
+                }
+                u
+            }
+        };
+
+        let resp = self
+            .auth(self.client.get(&url))
+            .send()
+            .await
+            .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+        if let Some(err) = Self::classify_status(&resp) {
+            return Err(err);
+        }
+        let next_cursor = Self::next_link(&resp, self.api_host().as_deref());
+        let issues: Vec<RawIssue> = resp
+            .json()
+            .await
+            .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+
+        let mut records = Vec::new();
+        for issue in &issues {
+            // Drop pull requests — "every PR is an issue, but not vice versa".
+            if issue.pull_request.is_some() {
+                continue;
+            }
+            // Fetch comments only when the issue has any.
+            let comments = if issue.comments > 0 {
+                self.fetch_comments(issue.number).await.unwrap_or_default()
+            } else {
+                Vec::new()
+            };
+            records.push(self.normalize(issue, &comments));
+        }
+
+        Ok(FetchPage {
+            records,
+            next_cursor,
+        })
+    }
+
+    async fn list_live_ids(&self) -> ConnectorResult<Option<Vec<String>>> {
+        // Enumerate all issue numbers (ids only) for the reconcile pass, paging
+        // via Link. Cheap relative to full sync (no comment fetch, no bodies).
+        let mut ids = Vec::new();
+        let mut url = Some(format!(
+            "{}/repos/{}/{}/issues?state=all&per_page={}",
+            self.config.root(),
+            self.config.owner,
+            self.config.repo,
+            PER_PAGE,
+        ));
+        while let Some(u) = url {
+            let resp = self
+                .auth(self.client.get(&u))
+                .send()
+                .await
+                .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+            if let Some(err) = Self::classify_status(&resp) {
+                return Err(err);
+            }
+            let next = Self::next_link(&resp, self.api_host().as_deref());
+            let issues: Vec<RawIssue> = resp
+                .json()
+                .await
+                .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+            for issue in issues {
+                if issue.pull_request.is_none() {
+                    ids.push(issue.number.to_string());
+                }
+            }
+            url = next;
+        }
+        Ok(Some(ids))
+    }
+}
+
+// ---------------------------------------------------------------------------
+// Raw GitHub API shapes (only the fields we use)
+// ---------------------------------------------------------------------------
+
+#[derive(Debug, Deserialize)]
+struct RawIssue {
+    number: u64,
+    title: String,
+    #[serde(default)]
+    body: Option<String>,
+    state: String,
+    html_url: String,
+    updated_at: String,
+    #[serde(default)]
+    comments: u64,
+    #[serde(default)]
+    labels: Vec<RawLabel>,
+    #[serde(default)]
+    user: Option<RawUser>,
+    /// Present iff this "issue" is actually a pull request.
+    #[serde(default)]
+    pull_request: Option<serde_json::Value>,
+}
+
+#[derive(Debug, Deserialize)]
+struct RawLabel {
+    name: String,
+}
+
+#[derive(Debug, Deserialize)]
+struct RawUser {
+    login: String,
+}
+
+#[derive(Debug, Deserialize)]
+struct RawComment {
+    id: u64,
+    body: String,
+    #[serde(default)]
+    user: Option<RawUser>,
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    fn issue(number: u64, title: &str, body: &str, state: &str) -> RawIssue {
+        RawIssue {
+            number,
+            title: title.to_string(),
+            body: Some(body.to_string()),
+            state: state.to_string(),
+            html_url: format!("https://github.com/o/r/issues/{number}"),
+            updated_at: "2026-06-19T00:00:00Z".to_string(),
+            comments: 0,
+            labels: vec![RawLabel {
+                name: "bug".to_string(),
+            }],
+            user: Some(RawUser {
+                login: "octocat".to_string(),
+            }),
+            pull_request: None,
+        }
+    }
+
+    fn connector() -> GithubConnector {
+        GithubConnector::new(GithubConfig::new("o", "r")).unwrap()
+    }
+
+    #[test]
+    fn normalize_builds_keyed_envelope_with_citation() {
+        let c = connector();
+        let rec = c.normalize(&issue(57, "Connectors", "Add Redmine", "open"), &[]);
+        let env = &rec.envelope;
+        assert!(env.has_key());
+        assert_eq!(env.source_system.as_deref(), Some("github"));
+        assert_eq!(env.source_id.as_deref(), Some("57"));
+        assert_eq!(
+            env.source_url.as_deref(),
+            Some("https://github.com/o/r/issues/57")
+        );
+        assert_eq!(env.source_project.as_deref(), Some("o/r"));
+        assert!(rec.content.contains("Connectors"));
+        assert!(rec.tags.contains(&"state:open".to_string()));
+        assert!(rec.tags.contains(&"label:bug".to_string()));
+    }
+
+    #[test]
+    fn hash_stable_across_label_order_and_changes_on_edit() {
+        let c = connector();
+        let mut a = issue(1, "T", "body", "open");
+        a.labels = vec![RawLabel { name: "b".into() }, RawLabel { name: "a".into() }];
+        let mut b = issue(1, "T", "body", "open");
+        b.labels = vec![RawLabel { name: "a".into() }, RawLabel { name: "b".into() }];
+        let ha = c.normalize(&a, &[]).envelope.content_hash;
+        let hb = c.normalize(&b, &[]).envelope.content_hash;
+        assert_eq!(ha, hb, "label order must not change the hash");
+
+        // Editing the body must change the hash.
+        let edited = c
+            .normalize(&issue(1, "T", "EDITED", "open"), &[])
+            .envelope
+            .content_hash;
+        assert_ne!(ha, edited);
+
+        // Closing the issue changes state → changes the hash (not a no-op).
+        let closed = c
+            .normalize(&issue(1, "T", "body", "closed"), &[])
+            .envelope
+            .content_hash;
+        assert_ne!(ha, closed);
+    }
+
+    #[test]
+    fn comments_fold_in_id_order_and_affect_hash() {
+        let c = connector();
+        let comments = vec![
+            RawComment {
+                id: 2,
+                body: "second".into(),
+                user: Some(RawUser { login: "x".into() }),
+            },
+            RawComment {
+                id: 1,
+                body: "first".into(),
+                user: Some(RawUser { login: "y".into() }),
+            },
+        ];
+        let rec = c.normalize(&issue(1, "T", "body", "open"), &comments);
+        // Folded in id order regardless of input order.
+        let first_pos = rec.content.find("first").unwrap();
+        let second_pos = rec.content.find("second").unwrap();
+        assert!(first_pos < second_pos, "comments must fold in id order");
+
+        let no_comments = c
+            .normalize(&issue(1, "T", "body", "open"), &[])
+            .envelope
+            .content_hash;
+        assert_ne!(
+            rec.envelope.content_hash, no_comments,
+            "comments must contribute to the hash"
+        );
+    }
+
+    #[test]
+    fn rejects_empty_owner_repo() {
+        assert!(GithubConnector::new(GithubConfig::new("", "r")).is_err());
+        assert!(GithubConnector::new(GithubConfig::new("o", "")).is_err());
+    }
+
+    #[test]
+    fn since_uses_z_form_not_plus_offset() {
+        // Regression: to_rfc3339() emits `+00:00`; the `+` decodes to a space
+        // server-side and corrupts the cursor. We must emit the `Z` form.
+        let ts = DateTime::parse_from_rfc3339("2026-06-19T00:00:00Z")
+            .unwrap()
+            .with_timezone(&Utc);
+        let z = ts.to_rfc3339_opts(chrono::SecondsFormat::Secs, true);
+        assert_eq!(z, "2026-06-19T00:00:00Z");
+        assert!(!z.contains('+'), "since must not contain a reserved '+'");
+    }
+
+    #[test]
+    fn next_link_host_pin_drops_cross_host_url() {
+        // The host-pin parsing logic (used to prevent token exfiltration via a
+        // malicious Link header) must reject a different host.
+        let same = reqwest::Url::parse("https://api.github.com/x?page=2").unwrap();
+        let other = reqwest::Url::parse("https://evil.example/x?page=2").unwrap();
+        assert_eq!(same.host_str(), Some("api.github.com"));
+        assert_ne!(other.host_str(), Some("api.github.com"));
+    }
+}
--- a/crates/vestige-core/src/connectors/mod.rs
+++ b/crates/vestige-core/src/connectors/mod.rs
@ -0,0 +1,383 @@
+//! External-source connectors (#57).
+//!
+//! A connector turns records in a long-lived external system (a ticket tracker,
+//! an issue board, a support queue) into source-aware Vestige memories, so an
+//! investigative agent can search and reason over years of history **offline**,
+//! **semantically**, and **cited back to the canonical record** — something no
+//! live ticket-system MCP proxy can do.
+//!
+//! ## Layering
+//!
+//! - The [`Connector`] contract, [`NormalizedRecord`] shape, and the stable
+//!   [`content_hash`] are pure (no network) and always compiled, so the sync
+//!   semantics are unit-testable without hitting an API.
+//! - Network-backed reference connectors ([`github`] and [`redmine`]) live
+//!   behind the `connectors` cargo feature so the default local-first build
+//!   links no HTTP client.
+//!
+//! ## Sync contract (the part that makes re-running safe)
+//!
+//! Every connector produces [`NormalizedRecord`]s. Each carries a
+//! [`SourceEnvelope`](crate::memory::SourceEnvelope) whose
+//! `(source_system, source_id)` is the idempotency key and whose `content_hash`
+//! is the change detector. The driver routes each record through
+//! [`upsert_by_source`](crate::storage::SqliteMemoryStore::upsert_by_source):
+//!
+//! - unseen record → insert
+//! - changed `content_hash` → update in place (+ re-embed)
+//! - same `content_hash` → no-op (only liveness advances)
+//!
+//! Because neither GitHub nor Redmine expose a deletion feed, deletions are
+//! handled out-of-band by a periodic reconcile pass
+//! ([`reconcile_source_tombstones`](crate::storage::SqliteMemoryStore::reconcile_source_tombstones)).
+
+use chrono::{DateTime, Utc};
+
+use crate::memory::{IngestInput, SourceEnvelope};
+use crate::storage::ConnectorCursor;
+
+#[cfg(feature = "connectors")]
+pub mod github;
+
+#[cfg(feature = "connectors")]
+pub mod redmine;
+
+/// A single external record, already normalized into the fields Vestige needs.
+///
+/// The connector is responsible for flattening a possibly-rich source record
+/// (an issue plus its comments / journals / status changes) into a single
+/// retrievable `content` blob plus the structured envelope. Keeping one memory
+/// per logical record (rather than per comment) keeps retrieval coherent and
+/// the idempotency key simple.
+#[derive(Debug, Clone)]
+pub struct NormalizedRecord {
+    /// Human-readable content to embed and search over.
+    pub content: String,
+    /// Tags for categorization (e.g. `["github", "issue", "state:open"]`).
+    pub tags: Vec<String>,
+    /// The provenance envelope. `source_system`, `source_id`, and `content_hash`
+    /// MUST be set for idempotent upsert.
+    pub envelope: SourceEnvelope,
+}
+
+impl NormalizedRecord {
+    /// Convert into an [`IngestInput`] ready for `upsert_by_source`.
+    pub fn into_ingest_input(self) -> IngestInput {
+        IngestInput {
+            content: self.content,
+            node_type: "event".to_string(),
+            source: self.envelope.source_url.clone(),
+            tags: self.tags,
+            source_envelope: Some(self.envelope),
+            ..Default::default()
+        }
+    }
+}
+
+/// One page of records plus the cursor needed to fetch the next page.
+#[derive(Debug, Clone, Default)]
+pub struct FetchPage {
+    pub records: Vec<NormalizedRecord>,
+    /// Opaque token to resume after this page, or `None` when exhausted.
+    pub next_cursor: Option<String>,
+}
+
+/// Errors a connector can surface.
+#[derive(Debug, thiserror::Error)]
+pub enum ConnectorError {
+    #[error("configuration error: {0}")]
+    Config(String),
+    #[error("transport error: {0}")]
+    Transport(String),
+    #[error("rate limited; retry after {0:?}")]
+    RateLimited(Option<std::time::Duration>),
+    #[error("source error ({status}): {message}")]
+    Source { status: u16, message: String },
+}
+
+pub type ConnectorResult<T> = Result<T, ConnectorError>;
+
+/// The contract every external-source connector implements.
+///
+/// Intentionally minimal: fetch a window of records updated since a cursor,
+/// page through them, and (separately) enumerate currently-live ids for the
+/// deletion-reconcile pass. The driver owns persistence, embedding, and cursor
+/// checkpointing — a connector is just a typed, incremental reader.
+#[allow(async_fn_in_trait)]
+pub trait Connector {
+    /// Stable system identifier written into every envelope (`github`, …).
+    fn source_system(&self) -> &str;
+
+    /// The scope this connector instance is bound to (`owner/repo`, project id).
+    fn scope(&self) -> &str;
+
+    /// Fetch one page of records whose source-updated time is `>= since`
+    /// (inclusive on purpose — see the overlap note below), resuming from
+    /// `cursor` when provided. Records should be returned in ascending
+    /// update-time order so a mid-run interruption resumes safely.
+    ///
+    /// Callers pass `since = checkpoint − overlap` (a few minutes) so a record
+    /// written with a slightly-behind upstream clock, or one sharing the exact
+    /// boundary second, is never skipped. The `content_hash` short-circuit in
+    /// `upsert_by_source` makes the resulting re-scan free.
+    async fn fetch_updated(
+        &self,
+        since: Option<DateTime<Utc>>,
+        cursor: Option<String>,
+    ) -> ConnectorResult<FetchPage>;
+
+    /// Enumerate the ids currently visible upstream for this scope, for the
+    /// deletion-reconcile pass. Cheap (ids only). `None` means the connector
+    /// cannot enumerate, so the driver must skip reconciliation rather than
+    /// tombstone everything.
+    async fn list_live_ids(&self) -> ConnectorResult<Option<Vec<String>>> {
+        Ok(None)
+    }
+}
+
+/// Recommended overlap subtracted from the saved cursor before the next fetch,
+/// to absorb clock skew and same-second boundary updates (the `>=` window).
+pub const CURSOR_OVERLAP_SECS: i64 = 120;
+
+/// Summary of one sync run, returned to the caller / surfaced by the MCP tool.
+#[derive(Debug, Clone, Default, serde::Serialize)]
+pub struct SyncReport {
+    pub source_system: String,
+    pub scope: String,
+    pub created: usize,
+    pub updated: usize,
+    pub unchanged: usize,
+    pub tombstoned: usize,
+    /// New high-water mark persisted as the cursor for the next run.
+    pub new_cursor: Option<DateTime<Utc>>,
+    /// Whether a deletion-reconcile pass ran this time.
+    pub reconciled: bool,
+    /// Non-fatal warnings (e.g. a page that failed and was skipped).
+    pub warnings: Vec<String>,
+}
+
+/// Drive a full incremental sync of one connector into the store (#57).
+///
+/// This is the orchestration the MCP `source_sync` tool calls. It:
+/// 1. loads the saved checkpoint and starts from `cursor − overlap` (the `>=`
+///    window that prevents missing same-second / clock-skewed updates);
+/// 2. pages the connector forward in update order, routing each record through
+///    [`upsert_by_source`](crate::storage::SqliteMemoryStore::upsert_by_source)
+///    (insert / update-in-place / no-op by content hash);
+/// 3. advances the cursor to the max `source_updated_at` actually observed,
+///    persisting it only after the run so a crash re-scans rather than skips;
+/// 4. optionally reconciles deletions when `reconcile` is set and the connector
+///    can enumerate live ids.
+///
+/// `max_pages` bounds a single run (so a first sync of a 15-year tracker can be
+/// resumed across calls rather than blocking on one enormous fetch).
+pub async fn run_sync<C: Connector>(
+    store: &crate::storage::SqliteMemoryStore,
+    connector: &C,
+    reconcile: bool,
+    max_pages: usize,
+) -> ConnectorResult<SyncReport> {
+    use crate::storage::SourceUpsertOutcome;
+
+    let source_system = connector.source_system().to_string();
+    let scope = connector.scope().to_string();
+
+    let mut report = SyncReport {
+        source_system: source_system.clone(),
+        scope: scope.clone(),
+        ..Default::default()
+    };
+
+    // 1. Load checkpoint, apply the overlap window.
+    let checkpoint = store
+        .get_connector_cursor(&source_system, &scope)
+        .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+    let since = checkpoint
+        .cursor_updated_at
+        .map(|c| c - chrono::Duration::seconds(CURSOR_OVERLAP_SECS));
+
+    // 2. Page forward, upserting each record.
+    let mut cursor: Option<String> = None;
+    let mut max_seen = checkpoint.cursor_updated_at;
+    // Oldest source_updated_at among records that FAILED to upsert this run. We
+    // must not advance the persisted cursor past this, or the failed record —
+    // fetched in ascending update order — would fall outside the next run's
+    // `since` window and never be retried (a silent permanent gap).
+    let mut oldest_failure: Option<DateTime<Utc>> = None;
+    // Count of genuinely new records (Created). Unchanged re-scans of the
+    // overlap window must not inflate the running total.
+    let mut created_this_run = 0i64;
+
+    for _ in 0..max_pages.max(1) {
+        let page = connector.fetch_updated(since, cursor.clone()).await?;
+        for record in page.records {
+            let observed = record.envelope.source_updated_at;
+            match store.upsert_by_source(record.into_ingest_input()) {
+                Ok(res) => {
+                    match res.outcome {
+                        SourceUpsertOutcome::Created => {
+                            report.created += 1;
+                            created_this_run += 1;
+                        }
+                        SourceUpsertOutcome::Updated => report.updated += 1,
+                        SourceUpsertOutcome::Unchanged => report.unchanged += 1,
+                    }
+                    if let Some(ts) = observed
+                        && max_seen.map(|m| ts > m).unwrap_or(true)
+                    {
+                        max_seen = Some(ts);
+                    }
+                }
+                Err(e) => {
+                    report.warnings.push(format!("upsert failed: {e}"));
+                    if let Some(ts) = observed
+                        && oldest_failure.map(|f| ts < f).unwrap_or(true)
+                    {
+                        oldest_failure = Some(ts);
+                    }
+                }
+            }
+        }
+        match page.next_cursor {
+            Some(next) => cursor = Some(next),
+            None => break,
+        }
+    }
+
+    // Clamp the cursor so we never advance past a record that failed this run.
+    // Subtract one second so the next run's inclusive `since` re-includes it.
+    if let Some(failed_at) = oldest_failure {
+        let clamp_to = failed_at - chrono::Duration::seconds(1);
+        max_seen = Some(match max_seen {
+            Some(m) if m < clamp_to => m,
+            _ => clamp_to,
+        });
+    }
+
+    // 3. Optional deletion reconciliation.
+    let mut reconciled = false;
+    if reconcile {
+        match connector.list_live_ids().await {
+            Ok(Some(live_ids)) => {
+                match store.reconcile_source_tombstones(&source_system, &scope, &live_ids) {
+                    Ok(r) => {
+                        report.tombstoned = r.tombstoned.len();
+                        reconciled = true;
+                    }
+                    Err(e) => report.warnings.push(format!("reconcile failed: {e}")),
+                }
+            }
+            Ok(None) => report
+                .warnings
+                .push("connector cannot enumerate live ids; skipped reconcile".to_string()),
+            Err(e) => report.warnings.push(format!("list_live_ids failed: {e}")),
+        }
+    }
+    report.reconciled = reconciled;
+    report.new_cursor = max_seen;
+
+    // 4. Persist the checkpoint (only after the run).
+    let now = Utc::now();
+    let new_checkpoint = ConnectorCursor {
+        source_system: source_system.clone(),
+        scope: scope.clone(),
+        cursor_updated_at: max_seen,
+        last_synced_at: Some(now),
+        last_full_reconcile_at: if reconciled {
+            Some(now)
+        } else {
+            checkpoint.last_full_reconcile_at
+        },
+        // Accumulate only NEW records, so re-scanning the overlap window (which
+        // reports Unchanged) does not inflate the running total.
+        records_seen: checkpoint.records_seen + created_this_run,
+    };
+    store
+        .save_connector_cursor(&new_checkpoint)
+        .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+
+    Ok(report)
+}
+
+/// Compute a stable content hash over the record's meaning.
+///
+/// Stability requirements (so re-syncing an unchanged record is a true no-op):
+/// - **key order independent** — callers pass `(field, value)` pairs which we
+///   sort before hashing, so map/field ordering never changes the digest;
+/// - **volatile fields excluded** — the caller must omit the cursor timestamp,
+///   view/comment counts, and ephemeral permission flags (hash the meaning,
+///   not the metadata);
+/// - **collision-resistant** — BLAKE3 (already a Vestige dependency).
+///
+/// Comment/journal arrays should be flattened into the pairs in a stable order
+/// (sorted by their own id) by the caller before hashing.
+pub fn content_hash(fields: &[(&str, &str)]) -> String {
+    let mut pairs: Vec<(&str, &str)> = fields.to_vec();
+    pairs.sort_by(|a, b| a.0.cmp(b.0).then(a.1.cmp(b.1)));
+
+    let mut hasher = blake3::Hasher::new();
+    for (k, v) in pairs {
+        // Length-prefix each field so ("ab","c") can't collide with ("a","bc").
+        hasher.update(&(k.len() as u64).to_le_bytes());
+        hasher.update(k.as_bytes());
+        hasher.update(&(v.len() as u64).to_le_bytes());
+        hasher.update(v.as_bytes());
+    }
+    hasher.finalize().to_hex().to_string()
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn content_hash_is_order_independent() {
+        let a = content_hash(&[
+            ("title", "Crash"),
+            ("body", "stacktrace"),
+            ("state", "open"),
+        ]);
+        let b = content_hash(&[
+            ("state", "open"),
+            ("title", "Crash"),
+            ("body", "stacktrace"),
+        ]);
+        assert_eq!(a, b, "reordering fields must not change the hash");
+    }
+
+    #[test]
+    fn content_hash_changes_with_content() {
+        let a = content_hash(&[("body", "v1")]);
+        let b = content_hash(&[("body", "v2")]);
+        assert_ne!(a, b, "different content must hash differently");
+    }
+
+    #[test]
+    fn content_hash_no_boundary_collision() {
+        // ("ab","c") vs ("a","bc") must differ thanks to length prefixing.
+        let a = content_hash(&[("ab", "c")]);
+        let b = content_hash(&[("a", "bc")]);
+        assert_ne!(a, b);
+    }
+
+    #[test]
+    fn normalized_record_carries_envelope_into_input() {
+        let rec = NormalizedRecord {
+            content: "issue body".to_string(),
+            tags: vec!["github".to_string()],
+            envelope: SourceEnvelope {
+                source_system: Some("github".to_string()),
+                source_id: Some("42".to_string()),
+                source_url: Some("https://example/42".to_string()),
+                content_hash: Some("h".to_string()),
+                ..Default::default()
+            },
+        };
+        let input = rec.into_ingest_input();
+        assert_eq!(input.content, "issue body");
+        assert_eq!(input.source.as_deref(), Some("https://example/42"));
+        let env = input.source_envelope.unwrap();
+        assert!(env.has_key());
+        assert_eq!(env.source_id.as_deref(), Some("42"));
+    }
+}
--- a/crates/vestige-core/src/connectors/redmine.rs
+++ b/crates/vestige-core/src/connectors/redmine.rs
@ -0,0 +1,737 @@
+//! Redmine connector (#57).
+//!
+//! Indexes a Redmine project's issues + journals (comments and status/assignment
+//! history) into source-aware Vestige memories so an investigative agent can
+//! search and reason over years of ticket history **offline**, **semantically**,
+//! and **cited back to the canonical issue URL**. Redmine stays the system of
+//! record; Vestige indexes, connects, retrieves, and links back.
+//!
+//! ## Incremental sync (per the connector sync contract)
+//!
+//! Redmine's REST API has three traps this connector handles explicitly (all
+//! confirmed against the official wiki + canonical defects):
+//!
+//! - **`status_id=*` is mandatory.** The list endpoint returns *open issues
+//!   only* by default, so without it closing an issue looks like a deletion and
+//!   closed issues are never synced (Defect #19088). We pass it on both the
+//!   incremental pull and the reconcile enumeration.
+//! - **`include=journals` is silently ignored on the list endpoint.** Journals
+//!   come back only on the per-issue detail endpoint `GET /issues/{id}.json`
+//!   (Defect #35242), so each changed issue costs one extra round-trip.
+//! - **Filter operators must be hex-encoded** in the compact form
+//!   (`updated_on=>=…` → `updated_on=%3E%3D…`). We build the query with
+//!   `reqwest`'s `.query(&[…])` and pass the raw `>=…` value so it is encoded
+//!   exactly once (no double-encoding).
+//!
+//! `sort=updated_on:asc` pages forward in cursor order so a mid-run interruption
+//! resumes safely; the `since = cursor − overlap` window + the `content_hash`
+//! no-op make the re-scan free. Redmine has no deletion feed, so deletions are
+//! reconciled out-of-band via [`list_live_ids`](Connector::list_live_ids).
+
+use chrono::{DateTime, Utc};
+use serde::Deserialize;
+
+use super::{
+    Connector, ConnectorError, ConnectorResult, FetchPage, NormalizedRecord, content_hash,
+};
+use crate::memory::SourceEnvelope;
+
+const USER_AGENT: &str = concat!("vestige-connector/", env!("CARGO_PKG_VERSION"));
+const PAGE_LIMIT: u32 = 100;
+
+/// Configuration for a Redmine connector instance bound to one project.
+#[derive(Debug, Clone)]
+pub struct RedmineConfig {
+    /// Base URL of the Redmine instance, e.g. `https://redmine.example.com`.
+    pub base_url: String,
+    /// Project identifier to scope the sync to. May be the numeric id or the
+    /// project identifier slug — used as `project_id` and stored as
+    /// `source_project`. (Note: Redmine's `project_id` list filter wants the
+    /// numeric id; the slug works as the human-readable scope label.)
+    pub project: String,
+    /// API access key. Optional only if the instance allows anonymous REST.
+    pub api_key: Option<String>,
+    /// Max journals to fold into one issue memory (defense against huge threads).
+    pub max_journals: usize,
+}
+
+impl RedmineConfig {
+    pub fn new(base_url: impl Into<String>, project: impl Into<String>) -> Self {
+        Self {
+            base_url: base_url.into(),
+            project: project.into(),
+            api_key: None,
+            max_journals: 100,
+        }
+    }
+
+    pub fn with_api_key(mut self, key: Option<String>) -> Self {
+        self.api_key = key;
+        self
+    }
+
+    /// Base URL with any trailing slash removed.
+    fn root(&self) -> String {
+        self.base_url.trim_end_matches('/').to_string()
+    }
+}
+
+/// A Redmine connector bound to one project.
+pub struct RedmineConnector {
+    config: RedmineConfig,
+    scope: String,
+    client: reqwest::Client,
+}
+
+impl RedmineConnector {
+    pub fn new(config: RedmineConfig) -> ConnectorResult<Self> {
+        if config.base_url.trim().is_empty() {
+            return Err(ConnectorError::Config("base_url is required".to_string()));
+        }
+        if config.project.trim().is_empty() {
+            return Err(ConnectorError::Config("project is required".to_string()));
+        }
+        if reqwest::Url::parse(&config.root()).is_err() {
+            return Err(ConnectorError::Config(format!(
+                "base_url is not a valid URL: {}",
+                config.base_url
+            )));
+        }
+        let client = reqwest::Client::builder()
+            .user_agent(USER_AGENT)
+            .build()
+            .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+        let scope = config.project.clone();
+        Ok(Self {
+            config,
+            scope,
+            client,
+        })
+    }
+
+    fn auth(&self, req: reqwest::RequestBuilder) -> reqwest::RequestBuilder {
+        let req = req.header("Accept", "application/json");
+        match &self.config.api_key {
+            // The key goes in the header (not the URL) so it stays out of proxy
+            // and access logs.
+            Some(k) => req.header("X-Redmine-API-Key", k),
+            None => req,
+        }
+    }
+
+    fn classify_status(resp: &reqwest::Response) -> Option<ConnectorError> {
+        let status = resp.status();
+        if status.is_success() {
+            return None;
+        }
+        if status.as_u16() == 429 {
+            let retry = resp
+                .headers()
+                .get("retry-after")
+                .and_then(|v| v.to_str().ok())
+                .and_then(|s| s.parse::<u64>().ok())
+                .map(std::time::Duration::from_secs);
+            return Some(ConnectorError::RateLimited(retry));
+        }
+        let message = match status.as_u16() {
+            // A valid key against an instance with REST disabled 401/403s; make
+            // that distinguishable from "no results".
+            401 | 403 => {
+                "unauthorized — check REDMINE_API_KEY and that the instance has the REST API enabled (Administration → Settings → API)"
+                    .to_string()
+            }
+            _ => status
+                .canonical_reason()
+                .unwrap_or("request failed")
+                .to_string(),
+        };
+        Some(ConnectorError::Source {
+            status: status.as_u16(),
+            message,
+        })
+    }
+
+    /// Fetch the journals + relations for one issue (the detail endpoint —
+    /// journals are not returned on the list endpoint).
+    async fn fetch_detail(&self, issue_id: u64) -> ConnectorResult<RawIssue> {
+        let url = format!("{}/issues/{}.json", self.config.root(), issue_id);
+        let resp = self
+            .auth(self.client.get(&url))
+            .query(&[("include", "journals,relations")])
+            .send()
+            .await
+            .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+        if let Some(err) = Self::classify_status(&resp) {
+            return Err(err);
+        }
+        let wrapper: IssueWrapper = resp
+            .json()
+            .await
+            .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+        Ok(wrapper.issue)
+    }
+
+    /// Fold a raw issue (with journals) into one normalized memory record.
+    fn normalize(&self, issue: &RawIssue) -> NormalizedRecord {
+        let status_name = issue.status.as_ref().map(|s| s.name.clone());
+        let tracker_name = issue.tracker.as_ref().map(|t| t.name.clone());
+        let author = issue.author.as_ref().map(|a| a.name.clone());
+
+        // Journals sorted by id for a stable order + stable hash. Keep notes
+        // and field changes so status/assignment history remains searchable.
+        let mut journals: Vec<&RawJournal> = issue
+            .journals
+            .iter()
+            .filter(|j| {
+                j.notes
+                    .as_deref()
+                    .map(|n| !n.trim().is_empty())
+                    .unwrap_or(false)
+                    || !j.details.is_empty()
+            })
+            .collect();
+        journals.sort_by_key(|j| j.id);
+        journals.truncate(self.config.max_journals);
+
+        // Human-readable content.
+        let mut content = format!("[{}#{}] {}\n", self.scope, issue.id, issue.subject);
+        if let Some(s) = &status_name {
+            content.push_str(&format!("Status: {s}\n"));
+        }
+        if let Some(t) = &tracker_name {
+            content.push_str(&format!("Tracker: {t}\n"));
+        }
+        if let Some(desc) = &issue.description
+            && !desc.trim().is_empty()
+        {
+            content.push('\n');
+            content.push_str(desc.trim());
+            content.push('\n');
+        }
+        for j in &journals {
+            let who = j.user.as_ref().map(|u| u.name.as_str()).unwrap_or("?");
+            let note = j.notes.as_deref().unwrap_or("").trim();
+            if !note.is_empty() {
+                content.push_str(&format!("\n- {who}: {note}"));
+            }
+            for detail in &j.details {
+                content.push_str(&format!(
+                    "\n- {who} changed {}{}: {} -> {}",
+                    detail.property.as_deref().unwrap_or("field"),
+                    detail
+                        .name
+                        .as_deref()
+                        .map(|n| format!(".{n}"))
+                        .unwrap_or_default(),
+                    detail.old_value.as_deref().unwrap_or(""),
+                    detail.new_value.as_deref().unwrap_or("")
+                ));
+            }
+        }
+        if !issue.relations.is_empty() {
+            content.push_str("\n\nRelations:");
+            let mut relations: Vec<&RawRelation> = issue.relations.iter().collect();
+            relations.sort_by_key(|r| r.id);
+            for relation in relations {
+                let related = relation.related_issue_id(issue.id);
+                content.push_str(&format!(
+                    "\n- #{} ({})",
+                    related,
+                    relation.relation_type.as_deref().unwrap_or("relates")
+                ));
+                if let Some(delay) = relation.delay {
+                    content.push_str(&format!(", delay {delay}"));
+                }
+            }
+        }
+
+        // Stable content hash — meaning only, never the cursor (`updated_on`) or
+        // volatile counts. Journals and relations contribute stable fields in id
+        // order.
+        let journals_blob = journals
+            .iter()
+            .map(|j| {
+                let details = j
+                    .details
+                    .iter()
+                    .map(|d| {
+                        format!(
+                            "{}:{}:{}:{}",
+                            d.property.as_deref().unwrap_or(""),
+                            d.name.as_deref().unwrap_or(""),
+                            d.old_value.as_deref().unwrap_or(""),
+                            d.new_value.as_deref().unwrap_or("")
+                        )
+                    })
+                    .collect::<Vec<_>>()
+                    .join("\u{1e}");
+                format!(
+                    "{}:{}:{}",
+                    j.id,
+                    j.notes.as_deref().unwrap_or("").trim(),
+                    details
+                )
+            })
+            .collect::<Vec<_>>()
+            .join("\u{1f}");
+        let relations_blob = {
+            let mut relations: Vec<&RawRelation> = issue.relations.iter().collect();
+            relations.sort_by_key(|r| r.id);
+            relations
+                .iter()
+                .map(|r| {
+                    format!(
+                        "{}:{}:{}:{}",
+                        r.id,
+                        r.issue_id.unwrap_or_default(),
+                        r.issue_to_id.unwrap_or_default(),
+                        r.relation_type.as_deref().unwrap_or("")
+                    )
+                })
+                .collect::<Vec<_>>()
+                .join("\u{1f}")
+        };
+        let id_str = issue.id.to_string();
+        let status_id_str = issue
+            .status
+            .as_ref()
+            .map(|s| s.id.to_string())
+            .unwrap_or_default();
+        let tracker_id_str = issue
+            .tracker
+            .as_ref()
+            .map(|t| t.id.to_string())
+            .unwrap_or_default();
+        let done_ratio_str = issue.done_ratio.unwrap_or(0).to_string();
+        let desc_str = issue.description.clone().unwrap_or_default();
+        let hash = content_hash(&[
+            ("id", &id_str),
+            ("subject", &issue.subject),
+            ("description", &desc_str),
+            ("status_id", &status_id_str),
+            ("tracker_id", &tracker_id_str),
+            ("done_ratio", &done_ratio_str),
+            ("journals", &journals_blob),
+            ("relations", &relations_blob),
+        ]);
+
+        // Tags, lowercased — `tag_prefix` matching is case-sensitive, and
+        // Redmine status/tracker names are mixed-case.
+        let mut tags = vec!["redmine".to_string(), "issue".to_string()];
+        if let Some(s) = &status_name {
+            tags.push(format!("status:{}", s.to_lowercase()));
+        }
+        if let Some(t) = &tracker_name {
+            tags.push(format!("tracker:{}", t.to_lowercase()));
+        }
+        if let Some(p) = &issue.priority {
+            tags.push(format!("priority:{}", p.name.to_lowercase()));
+        }
+
+        let envelope = SourceEnvelope {
+            source_system: Some("redmine".to_string()),
+            source_id: Some(issue.id.to_string()),
+            source_url: Some(format!("{}/issues/{}", self.config.root(), issue.id)),
+            source_updated_at: issue
+                .updated_on
+                .as_deref()
+                .and_then(|s| DateTime::parse_from_rfc3339(s).ok())
+                .map(|d| d.with_timezone(&Utc)),
+            content_hash: Some(hash),
+            synced_at: Some(Utc::now()),
+            source_project: Some(self.scope.clone()),
+            source_type: Some("issue".to_string()),
+            source_author: author,
+        };
+
+        NormalizedRecord {
+            content,
+            tags,
+            envelope,
+        }
+    }
+}
+
+impl Connector for RedmineConnector {
+    fn source_system(&self) -> &str {
+        "redmine"
+    }
+
+    fn scope(&self) -> &str {
+        &self.scope
+    }
+
+    async fn fetch_updated(
+        &self,
+        since: Option<DateTime<Utc>>,
+        cursor: Option<String>,
+    ) -> ConnectorResult<FetchPage> {
+        // The cursor carries the next offset (Redmine pages by offset, not an
+        // opaque url). First page = offset 0.
+        let offset: u32 = cursor.as_deref().and_then(|c| c.parse().ok()).unwrap_or(0);
+
+        let url = format!("{}/issues.json", self.config.root());
+        let limit_str = PAGE_LIMIT.to_string();
+        let offset_str = offset.to_string();
+        // Build params; reqwest percent-encodes each value exactly once, so we
+        // pass the RAW `>=…` operator (it becomes %3E%3D on the wire). Do not
+        // pre-encode here or it would be double-encoded.
+        let mut params: Vec<(&str, String)> = vec![
+            ("status_id", "*".to_string()),
+            ("sort", "updated_on:asc".to_string()),
+            ("project_id", self.config.project.clone()),
+            ("limit", limit_str),
+            ("offset", offset_str),
+        ];
+        if let Some(s) = since {
+            let since_z = s.to_rfc3339_opts(chrono::SecondsFormat::Secs, true);
+            params.push(("updated_on", format!(">={since_z}")));
+        }
+
+        let resp = self
+            .auth(self.client.get(&url))
+            .query(&params)
+            .send()
+            .await
+            .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+        if let Some(err) = Self::classify_status(&resp) {
+            return Err(err);
+        }
+        let page: IssueListResponse = resp
+            .json()
+            .await
+            .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+
+        // Per-issue detail fetch for journals (list endpoint omits them).
+        let mut records = Vec::new();
+        for summary in &page.issues {
+            let detailed = match self.fetch_detail(summary.id).await {
+                Ok(d) => d,
+                // A single issue failing detail-fetch should not abort the page;
+                // fall back to the list-level fields (no journals).
+                Err(_) => summary.clone(),
+            };
+            records.push(self.normalize(&detailed));
+        }
+
+        // Advance the offset cursor until we've walked total_count.
+        let next_offset = offset + page.issues.len() as u32;
+        let next_cursor = if (next_offset as u64) < page.total_count && !page.issues.is_empty() {
+            Some(next_offset.to_string())
+        } else {
+            None
+        };
+
+        Ok(FetchPage {
+            records,
+            next_cursor,
+        })
+    }
+
+    async fn list_live_ids(&self) -> ConnectorResult<Option<Vec<String>>> {
+        // Enumerate all issue ids (open AND closed) for the reconcile pass.
+        // status_id=* is mandatory here too, or closed issues read as deleted.
+        let mut ids = Vec::new();
+        let mut offset: u32 = 0;
+        loop {
+            let url = format!("{}/issues.json", self.config.root());
+            let resp = self
+                .auth(self.client.get(&url))
+                .query(&[
+                    ("status_id", "*".to_string()),
+                    ("project_id", self.config.project.clone()),
+                    ("limit", PAGE_LIMIT.to_string()),
+                    ("offset", offset.to_string()),
+                ])
+                .send()
+                .await
+                .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+            if let Some(err) = Self::classify_status(&resp) {
+                return Err(err);
+            }
+            let page: IssueListResponse = resp
+                .json()
+                .await
+                .map_err(|e| ConnectorError::Transport(e.to_string()))?;
+            if page.issues.is_empty() {
+                break;
+            }
+            for issue in &page.issues {
+                ids.push(issue.id.to_string());
+            }
+            offset += page.issues.len() as u32;
+            if (offset as u64) >= page.total_count {
+                break;
+            }
+        }
+        Ok(Some(ids))
+    }
+}
+
+// ---------------------------------------------------------------------------
+// Raw Redmine API shapes (only the fields we use)
+// ---------------------------------------------------------------------------
+
+#[derive(Debug, Deserialize)]
+struct IssueListResponse {
+    #[serde(default)]
+    issues: Vec<RawIssue>,
+    #[serde(default)]
+    total_count: u64,
+}
+
+#[derive(Debug, Deserialize)]
+struct IssueWrapper {
+    issue: RawIssue,
+}
+
+#[derive(Debug, Clone, Deserialize)]
+struct RawIssue {
+    id: u64,
+    #[serde(default)]
+    subject: String,
+    #[serde(default)]
+    description: Option<String>,
+    #[serde(default)]
+    status: Option<NamedRef>,
+    #[serde(default)]
+    tracker: Option<NamedRef>,
+    #[serde(default)]
+    priority: Option<NamedRef>,
+    #[serde(default)]
+    author: Option<NamedRef>,
+    #[serde(default)]
+    done_ratio: Option<i64>,
+    #[serde(default)]
+    updated_on: Option<String>,
+    #[serde(default)]
+    journals: Vec<RawJournal>,
+    #[serde(default)]
+    relations: Vec<RawRelation>,
+}
+
+/// Redmine `{id, name}` reference (status, tracker, priority, user, …).
+#[derive(Debug, Clone, Deserialize)]
+struct NamedRef {
+    #[serde(default)]
+    id: i64,
+    #[serde(default)]
+    name: String,
+}
+
+#[derive(Debug, Clone, Deserialize)]
+struct RawJournal {
+    id: u64,
+    #[serde(default)]
+    notes: Option<String>,
+    #[serde(default)]
+    user: Option<NamedRef>,
+    #[serde(default)]
+    details: Vec<RawJournalDetail>,
+}
+
+#[derive(Debug, Clone, Deserialize)]
+struct RawJournalDetail {
+    #[serde(default)]
+    property: Option<String>,
+    #[serde(default)]
+    name: Option<String>,
+    #[serde(default)]
+    old_value: Option<String>,
+    #[serde(default)]
+    new_value: Option<String>,
+}
+
+#[derive(Debug, Clone, Deserialize)]
+struct RawRelation {
+    #[serde(default)]
+    id: u64,
+    #[serde(default)]
+    issue_id: Option<u64>,
+    #[serde(default)]
+    issue_to_id: Option<u64>,
+    #[serde(default)]
+    relation_type: Option<String>,
+    #[serde(default)]
+    delay: Option<i64>,
+}
+
+impl RawRelation {
+    fn related_issue_id(&self, current_issue_id: u64) -> u64 {
+        match (self.issue_id, self.issue_to_id) {
+            (Some(from), Some(to)) if from == current_issue_id => to,
+            (Some(from), Some(to)) if to == current_issue_id => from,
+            (_, Some(to)) => to,
+            (Some(from), _) => from,
+            _ => 0,
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    fn issue(id: u64, subject: &str, desc: &str, status: (i64, &str)) -> RawIssue {
+        RawIssue {
+            id,
+            subject: subject.to_string(),
+            description: Some(desc.to_string()),
+            status: Some(NamedRef {
+                id: status.0,
+                name: status.1.to_string(),
+            }),
+            tracker: Some(NamedRef {
+                id: 1,
+                name: "Bug".to_string(),
+            }),
+            priority: Some(NamedRef {
+                id: 2,
+                name: "Normal".to_string(),
+            }),
+            author: Some(NamedRef {
+                id: 7,
+                name: "Jane Dev".to_string(),
+            }),
+            done_ratio: Some(0),
+            updated_on: Some("2026-06-19T00:00:00Z".to_string()),
+            journals: vec![],
+            relations: vec![],
+        }
+    }
+
+    fn connector() -> RedmineConnector {
+        RedmineConnector::new(RedmineConfig::new("https://redmine.example.com", "infra")).unwrap()
+    }
+
+    #[test]
+    fn rejects_empty_and_bad_config() {
+        assert!(RedmineConnector::new(RedmineConfig::new("", "p")).is_err());
+        assert!(RedmineConnector::new(RedmineConfig::new("https://r.example", "")).is_err());
+        assert!(RedmineConnector::new(RedmineConfig::new("not a url", "p")).is_err());
+    }
+
+    #[test]
+    fn normalize_builds_keyed_envelope_with_citation() {
+        let c = connector();
+        let rec = c.normalize(&issue(123, "Disk full", "df -h shows 100%", (1, "New")));
+        let env = &rec.envelope;
+        assert!(env.has_key());
+        assert_eq!(env.source_system.as_deref(), Some("redmine"));
+        assert_eq!(env.source_id.as_deref(), Some("123"));
+        assert_eq!(
+            env.source_url.as_deref(),
+            Some("https://redmine.example.com/issues/123")
+        );
+        assert_eq!(env.source_project.as_deref(), Some("infra"));
+        assert_eq!(env.source_author.as_deref(), Some("Jane Dev"));
+        assert!(rec.content.contains("Disk full"));
+        // Tags lowercased so the case-sensitive tag_prefix filter matches.
+        assert!(rec.tags.contains(&"status:new".to_string()));
+        assert!(rec.tags.contains(&"tracker:bug".to_string()));
+        assert!(rec.tags.contains(&"priority:normal".to_string()));
+    }
+
+    #[test]
+    fn status_change_changes_hash() {
+        let c = connector();
+        let new = c
+            .normalize(&issue(1, "T", "body", (1, "New")))
+            .envelope
+            .content_hash;
+        let closed = c
+            .normalize(&issue(1, "T", "body", (5, "Closed")))
+            .envelope
+            .content_hash;
+        assert_ne!(
+            new, closed,
+            "a status change must change the hash → re-embed"
+        );
+    }
+
+    #[test]
+    fn journals_fold_in_id_order_and_affect_hash() {
+        let c = connector();
+        let mut iss = issue(1, "T", "body", (1, "New"));
+        iss.journals = vec![
+            RawJournal {
+                id: 20,
+                notes: Some("second".to_string()),
+                user: Some(NamedRef {
+                    id: 1,
+                    name: "B".to_string(),
+                }),
+                details: vec![],
+            },
+            RawJournal {
+                id: 10,
+                notes: Some("first".to_string()),
+                user: Some(NamedRef {
+                    id: 2,
+                    name: "A".to_string(),
+                }),
+                details: vec![],
+            },
+            // Pure empty journal must be dropped, not folded.
+            RawJournal {
+                id: 30,
+                notes: None,
+                user: Some(NamedRef {
+                    id: 3,
+                    name: "C".to_string(),
+                }),
+                details: vec![],
+            },
+        ];
+        let rec = c.normalize(&iss);
+        let first = rec.content.find("first").unwrap();
+        let second = rec.content.find("second").unwrap();
+        assert!(first < second, "journals fold in id order");
+
+        let no_journals = c
+            .normalize(&issue(1, "T", "body", (1, "New")))
+            .envelope
+            .content_hash;
+        assert_ne!(
+            rec.envelope.content_hash, no_journals,
+            "journals must contribute to the hash"
+        );
+    }
+
+    #[test]
+    fn journal_details_and_relations_are_searchable_and_hashed() {
+        let c = connector();
+        let mut iss = issue(1, "T", "body", (1, "New"));
+        iss.journals = vec![RawJournal {
+            id: 1,
+            notes: None,
+            user: Some(NamedRef {
+                id: 2,
+                name: "A".to_string(),
+            }),
+            details: vec![RawJournalDetail {
+                property: Some("attr".to_string()),
+                name: Some("status_id".to_string()),
+                old_value: Some("1".to_string()),
+                new_value: Some("5".to_string()),
+            }],
+        }];
+        iss.relations = vec![RawRelation {
+            id: 9,
+            issue_id: Some(1),
+            issue_to_id: Some(42),
+            relation_type: Some("relates".to_string()),
+            delay: None,
+        }];
+
+        let rec = c.normalize(&iss);
+        assert!(rec.content.contains("changed attr.status_id: 1 -> 5"));
+        assert!(rec.content.contains("#42 (relates)"));
+
+        let no_history = c.normalize(&issue(1, "T", "body", (1, "New")));
+        assert_ne!(
+            rec.envelope.content_hash, no_history.envelope.content_hash,
+            "field-change journals and relations must affect idempotent updates"
+        );
+    }
+}
--- a/crates/vestige-core/src/consolidation/phases.rs
+++ b/crates/vestige-core/src/consolidation/phases.rs
@ -889,6 +889,7 @@ mod tests {
            embedding_model: None,
            suppression_count: 0,
            suppressed_at: None,
+            source_envelope: None,
        }
    }

--- a/crates/vestige-core/src/lib.rs
+++ b/crates/vestige-core/src/lib.rs
@ -82,6 +82,7 @@

 /// Optional `vestige.toml` configuration (Phase 2: Configurable Output).
 pub mod config;
+pub mod connectors;
 pub mod consolidation;
 pub mod embedder;
 pub mod fsrs;
@ -134,6 +135,7 @@ pub use memory::{
    SearchMode,
    SearchResult,
    SimilarityResult,
+    SourceEnvelope,
    TableIntrospection,
    TemporalRange,
 };
@ -166,6 +168,7 @@ pub use storage::{
    CompositionNeighborRecord,
    CompositionOutcomeRecord,
    ConnectionRecord,
+    ConnectorCursor,
    ConsolidationHistoryRecord,
    Domain,
    DreamHistoryRecord,
@ -184,10 +187,13 @@ pub use storage::{
    PortableArchive,
    PortableImportMode,
    PortableImportReport,
+    ReconcileReport,
    Result,
    SchedulingState,
    SearchQuery,
    SmartIngestResult,
+    SourceUpsertOutcome,
+    SourceUpsertResult,
    SqliteMemoryStore,
    StateTransitionRecord,
    Storage,
--- a/crates/vestige-core/src/memory/mod.rs
+++ b/crates/vestige-core/src/memory/mod.rs
@ -10,7 +10,7 @@ mod node;
 mod strength;
 mod temporal;

-pub use node::{IngestInput, KnowledgeNode, NodeType, RecallInput, SearchMode};
+pub use node::{IngestInput, KnowledgeNode, NodeType, RecallInput, SearchMode, SourceEnvelope};
 pub use strength::{DualStrength, StrengthDecay};
 pub use temporal::{TemporalRange, TemporalValidity};

--- a/crates/vestige-core/src/memory/node.rs
+++ b/crates/vestige-core/src/memory/node.rs
@ -79,6 +79,91 @@ impl std::fmt::Display for NodeType {
    }
 }

+// ============================================================================
+// SOURCE ENVELOPE (#57)
+// ============================================================================
+
+/// Structured provenance for a memory that mirrors a record in an external
+/// system of record (a Redmine issue, a GitHub Issue, a Jira ticket, a support
+/// thread, …).
+///
+/// The product boundary (#57): the external system stays canonical. Vestige
+/// **indexes, connects, retrieves, and cites back**; it does not replace the
+/// ticket tracker. This envelope carries exactly the fields a connector needs
+/// to do that without leaking stale data:
+///
+/// - `(source_system, source_id)` is the **idempotency key**. Re-running a sync
+///   upserts the same logical record instead of duplicating it.
+/// - `content_hash` is the **change detector**. If a re-fetched record hashes
+///   to the stored value, the upsert is a no-op (only `synced_at` advances),
+///   so an incremental re-scan never churns the index or the embedding model.
+/// - `source_url` is the **citation**. Search results link back to the
+///   canonical record so the agent can follow it for authoritative detail.
+/// - `source_updated_at` is the **cursor field** the connector checkpoints on.
+///
+/// Every field is optional at the type level so partial connectors and manual
+/// imports can populate only what they have, but a real connector should always
+/// set `source_system`, `source_id`, and `content_hash`.
+#[non_exhaustive]
+#[derive(Debug, Clone, Default, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "camelCase")]
+pub struct SourceEnvelope {
+    /// External system this record came from, e.g. `redmine`, `github`, `jira`.
+    /// Namespaces `source_id` so two systems can share numeric ids.
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub source_system: Option<String>,
+    /// Stable native id in the source system (Redmine issue id, GitHub issue
+    /// number/node id, …). Combined with `source_system` it is the upsert key.
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub source_id: Option<String>,
+    /// Canonical URL back to the record so retrieval can cite the source.
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub source_url: Option<String>,
+    /// When the source record was last updated upstream (the connector cursor
+    /// field — Redmine `updated_on`, GitHub `updated_at`). RFC 3339.
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub source_updated_at: Option<DateTime<Utc>>,
+    /// Stable hash of the normalized record content. Idempotency / change key.
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub content_hash: Option<String>,
+    /// When the connector last observed this record live. Drives tombstone
+    /// reconciliation (a record not seen in a full reconcile pass is gone).
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub synced_at: Option<DateTime<Utc>>,
+    /// Project / repo / space the record belongs to (Redmine project, GitHub
+    /// `owner/repo`). Used for scoped sync and search filters.
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub source_project: Option<String>,
+    /// Record type within the source (`issue`, `comment`, `journal`, …).
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub source_type: Option<String>,
+    /// Author / reporter of the record in the source system.
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub source_author: Option<String>,
+}
+
+impl SourceEnvelope {
+    /// True once the two fields a connector needs for idempotent upsert are
+    /// present. Manual imports that only set `source_url` are not "keyed".
+    pub fn has_key(&self) -> bool {
+        self.source_system.is_some() && self.source_id.is_some()
+    }
+
+    /// True if every field is unset — used to collapse an all-`None` envelope
+    /// back to `None` on the node so legacy rows stay clean.
+    pub fn is_empty(&self) -> bool {
+        self.source_system.is_none()
+            && self.source_id.is_none()
+            && self.source_url.is_none()
+            && self.source_updated_at.is_none()
+            && self.content_hash.is_none()
+            && self.synced_at.is_none()
+            && self.source_project.is_none()
+            && self.source_type.is_none()
+            && self.source_author.is_none()
+    }
+}
+
 // ============================================================================
 // KNOWLEDGE NODE
 // ============================================================================
@ -188,6 +273,15 @@ pub struct KnowledgeNode {
    /// Timestamp of the most recent suppression (for 24h labile window).
    #[serde(skip_serializing_if = "Option::is_none")]
    pub suppressed_at: Option<DateTime<Utc>>,
+
+    // ========== Source Envelope (#57, external-source connectors) ==========
+    /// Structured provenance for memories ingested from an external system
+    /// (Redmine, GitHub Issues, Jira, …). `None` for memories created directly
+    /// by an agent or the user — the legacy free-form `source` string above
+    /// remains the human-readable label; this envelope is the machine-readable,
+    /// idempotency- and citation-bearing record. See [`SourceEnvelope`].
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub source_envelope: Option<SourceEnvelope>,
 }

 impl Default for KnowledgeNode {
@ -224,6 +318,7 @@ impl Default for KnowledgeNode {
            embedding_model: None,
            suppression_count: 0,
            suppressed_at: None,
+            source_envelope: None,
        }
    }
 }
@ -291,6 +386,11 @@ pub struct IngestInput {
    /// When this knowledge stops being valid
    #[serde(skip_serializing_if = "Option::is_none")]
    pub valid_until: Option<DateTime<Utc>>,
+    /// Structured provenance for connector-ingested records (#57). When set
+    /// with a `(source_system, source_id)` key, callers should route through
+    /// `upsert_by_source` for idempotent sync rather than plain `ingest`.
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub source_envelope: Option<SourceEnvelope>,
 }

 impl Default for IngestInput {
@ -304,6 +404,7 @@ impl Default for IngestInput {
            tags: vec![],
            valid_from: None,
            valid_until: None,
+            source_envelope: None,
        }
    }
 }
--- a/crates/vestige-core/src/storage/migrations.rs
+++ b/crates/vestige-core/src/storage/migrations.rs
@ -84,6 +84,11 @@ pub const MIGRATIONS: &[Migration] = &[
        description: "ADR 0001 Phase 1: embedding_model registry, domains/domain_scores columns, domains table",
        up: MIGRATION_V16_UP,
    },
+    Migration {
+        version: 17,
+        description: "#57 Source envelope: provenance columns + connector cursor checkpoints for idempotent external-source sync",
+        up: MIGRATION_V17_UP,
+    },
 ];

 /// A database migration
@ -957,6 +962,73 @@ pub const MIGRATION_V16_ALTER_COLUMNS: &[&str] = &[
    "ALTER TABLE knowledge_nodes ADD COLUMN domain_scores TEXT NOT NULL DEFAULT '{}'",
 ];

+/// V17: #57 Source envelope — structured provenance for connector-ingested
+/// records, plus a per-connector cursor checkpoint table.
+///
+/// The provenance columns live directly on `knowledge_nodes` (rather than a
+/// side table) so search can filter and cite them with no join. They are all
+/// nullable and default-NULL, so every existing memory is untouched and the
+/// migration is purely additive — legacy rows simply have no envelope.
+///
+/// The `(source_system, source_id)` pair is the idempotency key for
+/// `upsert_by_source`; the unique index enforces one memory per external
+/// record. `content_hash` is the change detector. `connector_cursors` holds the
+/// incremental-sync high-water mark and last full-reconcile time per
+/// (source_system, scope).
+///
+/// The `ALTER TABLE ... ADD COLUMN` statements are split into
+/// `MIGRATION_V17_ALTER_COLUMNS` and run individually by the migration runner,
+/// because SQLite has no `ADD COLUMN IF NOT EXISTS`; duplicate-column errors are
+/// swallowed so replay stays idempotent.
+const MIGRATION_V17_UP: &str = r#"
+-- Idempotency key: at most one memory per (source_system, source_id).
+-- Partial unique index so the millions of envelope-less legacy rows (all NULL)
+-- don't collide and don't pay index cost.
+CREATE UNIQUE INDEX IF NOT EXISTS idx_nodes_source_key
+    ON knowledge_nodes(source_system, source_id)
+    WHERE source_system IS NOT NULL AND source_id IS NOT NULL;
+
+-- Filter/scan support for source-aware search + reconciliation passes.
+CREATE INDEX IF NOT EXISTS idx_nodes_source_system
+    ON knowledge_nodes(source_system)
+    WHERE source_system IS NOT NULL;
+
+CREATE INDEX IF NOT EXISTS idx_nodes_source_project
+    ON knowledge_nodes(source_project)
+    WHERE source_project IS NOT NULL;
+
+-- Per-connector incremental-sync checkpoint. One row per (source_system, scope)
+-- e.g. ('github', 'samvallad33/vestige'). `cursor_updated_at` is the
+-- high-water mark on the source's update timestamp; `last_full_reconcile_at`
+-- gates the (expensive) deletion-reconcile pass.
+CREATE TABLE IF NOT EXISTS connector_cursors (
+    source_system          TEXT NOT NULL,
+    scope                  TEXT NOT NULL,
+    cursor_updated_at      TEXT,
+    last_synced_at         TEXT,
+    last_full_reconcile_at TEXT,
+    records_seen           INTEGER NOT NULL DEFAULT 0,
+    config                 TEXT NOT NULL DEFAULT '{}',
+    PRIMARY KEY (source_system, scope)
+);
+
+UPDATE schema_version SET version = 17, applied_at = datetime('now');
+"#;
+
+/// The `ALTER TABLE` statements for V17. Run individually + idempotently by the
+/// migration runner (SQLite has no `ADD COLUMN IF NOT EXISTS`).
+pub const MIGRATION_V17_ALTER_COLUMNS: &[&str] = &[
+    "ALTER TABLE knowledge_nodes ADD COLUMN source_system TEXT",
+    "ALTER TABLE knowledge_nodes ADD COLUMN source_id TEXT",
+    "ALTER TABLE knowledge_nodes ADD COLUMN source_url TEXT",
+    "ALTER TABLE knowledge_nodes ADD COLUMN source_updated_at TEXT",
+    "ALTER TABLE knowledge_nodes ADD COLUMN content_hash TEXT",
+    "ALTER TABLE knowledge_nodes ADD COLUMN synced_at TEXT",
+    "ALTER TABLE knowledge_nodes ADD COLUMN source_project TEXT",
+    "ALTER TABLE knowledge_nodes ADD COLUMN source_type TEXT",
+    "ALTER TABLE knowledge_nodes ADD COLUMN source_author TEXT",
+];
+
 /// Apply pending migrations
 pub fn apply_migrations(conn: &rusqlite::Connection) -> rusqlite::Result<u32> {
    let current_version = get_current_version(conn)?;
@ -994,6 +1066,15 @@ pub fn apply_migrations(conn: &rusqlite::Connection) -> rusqlite::Result<u32> {
                }
            }

+            // V17 (#57) adds the source-envelope columns. Same idempotent
+            // ALTER handling as V16 — the unique index in the V17 batch
+            // references these columns, so they must exist before the batch.
+            if migration.version == 17 {
+                for stmt in MIGRATION_V17_ALTER_COLUMNS {
+                    add_column_if_missing(conn, stmt)?;
+                }
+            }
+
            // Use execute_batch to handle multi-statement SQL including triggers
            conn.execute_batch(migration.up)?;

@ -1026,11 +1107,11 @@ mod tests {
        // Pre-requisite: schema_version must be bootstrapped by V1.
        apply_migrations(&conn).expect("apply_migrations succeeds");

-        // 1. schema_version advanced to V16
+        // 1. schema_version advanced to the latest migration
        let version = get_current_version(&conn).expect("read schema_version");
        assert_eq!(
-            version, 16,
-            "schema_version must be 16 after all migrations"
+            version, 17,
+            "schema_version must be 17 after all migrations"
        );

        // 2. knowledge_edges is gone (V11 drops it)
@ -1151,11 +1232,11 @@ mod tests {
        // Replay V11 onward. V11 uses DROP TABLE IF EXISTS so it is idempotent.
        // V12/V13 tombstone tables use CREATE TABLE IF NOT EXISTS. V14/V16 ALTER
        // TABLE idempotency is handled by the migration runner.
-        apply_migrations(&conn).expect("V11..V16 replay must be idempotent");
+        apply_migrations(&conn).expect("V11..V17 replay must be idempotent");

        // After replaying from V10, the schema advances to the latest version.
        let version = get_current_version(&conn).expect("read schema_version");
-        assert_eq!(version, 16, "schema_version back at 16 after replay");
+        assert_eq!(version, 17, "schema_version back at latest after replay");
    }

    #[test]
@ -1229,7 +1310,97 @@ mod tests {
        // V16 uses CREATE TABLE IF NOT EXISTS and idempotent ALTER handling.
        apply_migrations(&conn).expect("V16 replay must be idempotent");
        let version = get_current_version(&conn).expect("read version");
-        assert_eq!(version, 16, "schema_version must be 16 after replay");
+        assert_eq!(version, 17, "schema_version must be latest after replay");
+    }
+
+    #[test]
+    fn v17_adds_source_envelope_columns_and_cursor_table() {
+        let conn = rusqlite::Connection::open_in_memory().expect("open in-memory");
+        apply_migrations(&conn).expect("apply_migrations");
+
+        // All nine envelope columns must exist on knowledge_nodes.
+        let cols: Vec<String> = {
+            let mut stmt = conn
+                .prepare("PRAGMA table_info(knowledge_nodes)")
+                .expect("prepare");
+            stmt.query_map([], |row| row.get::<_, String>(1))
+                .expect("query_map")
+                .filter_map(|r| r.ok())
+                .collect()
+        };
+        for c in [
+            "source_system",
+            "source_id",
+            "source_url",
+            "source_updated_at",
+            "content_hash",
+            "synced_at",
+            "source_project",
+            "source_type",
+            "source_author",
+        ] {
+            assert!(
+                cols.iter().any(|x| x == c),
+                "knowledge_nodes must have `{c}` column after V17"
+            );
+        }
+
+        // connector_cursors table must exist.
+        let cursor_rows: i64 = conn
+            .query_row(
+                "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='connector_cursors'",
+                [],
+                |row| row.get(0),
+            )
+            .expect("query sqlite_master");
+        assert_eq!(cursor_rows, 1, "connector_cursors must be created by V17");
+    }
+
+    #[test]
+    fn v17_unique_source_key_index_allows_many_null_legacy_rows() {
+        let conn = rusqlite::Connection::open_in_memory().expect("open in-memory");
+        apply_migrations(&conn).expect("apply_migrations");
+
+        // Two legacy rows with NULL source key must NOT collide on the partial
+        // unique index (the index only covers non-NULL keys).
+        for id in ["a", "b"] {
+            conn.execute(
+                "INSERT INTO knowledge_nodes (id, content, node_type, created_at, updated_at, last_accessed, \
+                 stability, difficulty, reps, lapses, learning_state, storage_strength, retrieval_strength, \
+                 retention_strength, next_review, scheduled_days, has_embedding) \
+                 VALUES (?1,'c','fact',datetime('now'),datetime('now'),datetime('now'),\
+                 1.0,0.3,0,0,'new',1.0,1.0,1.0,datetime('now'),1,0)",
+                [id],
+            )
+            .expect("insert legacy row");
+        }
+
+        // Two real connector rows that share (source_system, source_id) MUST
+        // collide — the unique index is the idempotency guarantee.
+        conn.execute(
+            "UPDATE knowledge_nodes SET source_system='github', source_id='1' WHERE id='a'",
+            [],
+        )
+        .expect("set source key on a");
+        let dup = conn.execute(
+            "UPDATE knowledge_nodes SET source_system='github', source_id='1' WHERE id='b'",
+            [],
+        );
+        assert!(
+            dup.is_err(),
+            "duplicate (source_system, source_id) must violate the unique index"
+        );
+    }
+
+    #[test]
+    fn v17_is_replayable() {
+        let conn = rusqlite::Connection::open_in_memory().expect("open in-memory");
+        apply_migrations(&conn).expect("first apply");
+        conn.execute("UPDATE schema_version SET version = 16", [])
+            .expect("rewind to 16");
+        apply_migrations(&conn).expect("V17 replay must be idempotent");
+        let version = get_current_version(&conn).expect("read version");
+        assert_eq!(version, 17, "schema_version must be 17 after replay");
    }

    #[test]
--- a/crates/vestige-core/src/storage/mod.rs
+++ b/crates/vestige-core/src/storage/mod.rs
@ -19,9 +19,10 @@ pub use portable::{
 };
 pub use sqlite::{
    CompositionEventRecord, CompositionMemberRecord, CompositionNeighborRecord,
-    CompositionOutcomeRecord, ConnectionRecord, ConsolidationHistoryRecord, DreamHistoryRecord,
-    FilePortableSyncBackend, InsightRecord, IntentionRecord, NeverComposedCandidate,
-    PortableSyncBackend, PortableSyncReport, Result, SmartIngestResult, SqliteMemoryStore,
+    CompositionOutcomeRecord, ConnectionRecord, ConnectorCursor, ConsolidationHistoryRecord,
+    DreamHistoryRecord, FilePortableSyncBackend, InsightRecord, IntentionRecord,
+    NeverComposedCandidate, PortableSyncBackend, PortableSyncReport, ReconcileReport, Result,
+    SmartIngestResult, SourceUpsertOutcome, SourceUpsertResult, SqliteMemoryStore,
    StateTransitionRecord, StorageError,
 };

--- a/crates/vestige-core/src/storage/sqlite.rs
+++ b/crates/vestige-core/src/storage/sqlite.rs
@ -682,6 +682,12 @@ impl SqliteMemoryStore {
        let valid_from_str = input.valid_from.map(|dt| dt.to_rfc3339());
        let valid_until_str = input.valid_until.map(|dt| dt.to_rfc3339());

+        // #57 Source envelope — flatten to nullable column values. A node with
+        // no external provenance leaves all nine columns NULL (legacy shape).
+        let env = input.source_envelope.clone().unwrap_or_default();
+        let env_source_updated_at = env.source_updated_at.map(|dt| dt.to_rfc3339());
+        let env_synced_at = env.synced_at.map(|dt| dt.to_rfc3339());
+
        {
            let writer = self
                .writer
@ -694,14 +700,18 @@ impl SqliteMemoryStore {
                    storage_strength, retrieval_strength, retention_strength,
                    sentiment_score, sentiment_magnitude, next_review, scheduled_days,
                    source, tags, valid_from, valid_until, has_embedding, embedding_model,
-                    domains, domain_scores
+                    domains, domain_scores,
+                    source_system, source_id, source_url, source_updated_at,
+                    content_hash, synced_at, source_project, source_type, source_author
                ) VALUES (
                    ?1, ?2, ?3, ?4, ?5, ?6,
                    ?7, ?8, ?9, ?10, ?11,
                    ?12, ?13, ?14,
                    ?15, ?16, ?17, ?18,
                    ?19, ?20, ?21, ?22, ?23, ?24,
-                    '[]', '{}'
+                    '[]', '{}',
+                    ?25, ?26, ?27, ?28,
+                    ?29, ?30, ?31, ?32, ?33
                )",
                params![
                    id,
@ -728,6 +738,15 @@ impl SqliteMemoryStore {
                    valid_until_str,
                    0,
                    Option::<String>::None,
+                    env.source_system,
+                    env.source_id,
+                    env.source_url,
+                    env_source_updated_at,
+                    env.content_hash,
+                    env_synced_at,
+                    env.source_project,
+                    env.source_type,
+                    env.source_author,
                ],
            )?;
        }
@ -1257,6 +1276,33 @@ impl SqliteMemoryStore {
                .ok()
        });

+        // #57 Source envelope columns (Migration V17). `.ok().flatten()` is
+        // tolerant of pre-V17 databases that lack these columns. Collapse an
+        // all-NULL envelope to `None` so legacy nodes serialize unchanged.
+        let parse_ts = |s: Option<String>| -> Option<DateTime<Utc>> {
+            s.and_then(|s| {
+                DateTime::parse_from_rfc3339(&s)
+                    .map(|dt| dt.with_timezone(&Utc))
+                    .ok()
+            })
+        };
+        let envelope = crate::memory::SourceEnvelope {
+            source_system: row.get("source_system").ok().flatten(),
+            source_id: row.get("source_id").ok().flatten(),
+            source_url: row.get("source_url").ok().flatten(),
+            source_updated_at: parse_ts(row.get("source_updated_at").ok().flatten()),
+            content_hash: row.get("content_hash").ok().flatten(),
+            synced_at: parse_ts(row.get("synced_at").ok().flatten()),
+            source_project: row.get("source_project").ok().flatten(),
+            source_type: row.get("source_type").ok().flatten(),
+            source_author: row.get("source_author").ok().flatten(),
+        };
+        let source_envelope = if envelope.is_empty() {
+            None
+        } else {
+            Some(envelope)
+        };
+
        Ok(KnowledgeNode {
            id: row.get("id")?,
            content: row.get("content")?,
@ -1293,6 +1339,8 @@ impl SqliteMemoryStore {
            // v2.0.5 Active Forgetting
            suppression_count,
            suppressed_at,
+            // #57 Source envelope
+            source_envelope,
        })
    }

@ -9415,6 +9463,340 @@ impl crate::storage::memory_store::MemoryStoreSend for SqliteMemoryStore {
    }
 }

+// ============================================================================
+// CONNECTOR SYNC (#57) — idempotent external-source ingestion
+// ============================================================================
+
+/// What `upsert_by_source` did with one external record. Drives the
+/// created/updated/unchanged/tombstoned counts a connector reports.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum SourceUpsertOutcome {
+    /// No memory existed for this `(source_system, source_id)` — inserted.
+    Created,
+    /// A memory existed and the `content_hash` changed — body + envelope updated
+    /// and the embedding regenerated.
+    Updated,
+    /// A memory existed with the same `content_hash` — nothing rewritten except
+    /// `synced_at` (so an incremental re-scan is free).
+    Unchanged,
+}
+
+/// Result of one `upsert_by_source` call.
+#[derive(Debug, Clone)]
+pub struct SourceUpsertResult {
+    pub outcome: SourceUpsertOutcome,
+    /// Memory id of the affected node (new or existing).
+    pub node_id: String,
+}
+
+/// Incremental-sync checkpoint for one `(source_system, scope)`.
+#[derive(Debug, Clone, Default)]
+pub struct ConnectorCursor {
+    pub source_system: String,
+    pub scope: String,
+    /// High-water mark on the source's update timestamp. `None` on first sync.
+    pub cursor_updated_at: Option<DateTime<Utc>>,
+    pub last_synced_at: Option<DateTime<Utc>>,
+    pub last_full_reconcile_at: Option<DateTime<Utc>>,
+    pub records_seen: i64,
+}
+
+/// Outcome of a tombstone reconciliation pass.
+#[derive(Debug, Clone, Default)]
+pub struct ReconcileReport {
+    /// Memory ids that were tombstoned (no longer visible upstream).
+    pub tombstoned: Vec<String>,
+    /// Number of local records considered for this scope.
+    pub considered: usize,
+}
+
+impl SqliteMemoryStore {
+    /// Idempotently upsert one external-source record, keyed on the envelope's
+    /// `(source_system, source_id)` (#57).
+    ///
+    /// This is the core primitive every connector calls per record. It makes
+    /// re-running a sync safe and cheap:
+    ///
+    /// - **No existing memory** for the key → insert (`Created`).
+    /// - **Existing memory, `content_hash` changed** → update content + envelope,
+    ///   stamp `updated_at`, regenerate the embedding (`Updated`).
+    /// - **Existing memory, `content_hash` unchanged** → touch only `synced_at`
+    ///   so the reconcile pass knows the record is still live (`Unchanged`).
+    ///
+    /// The caller MUST set `source_system`, `source_id`, and `content_hash` on
+    /// the input's `source_envelope`; otherwise this falls back to a plain
+    /// `ingest` (an un-keyed record can't be deduplicated).
+    pub fn upsert_by_source(&self, input: IngestInput) -> Result<SourceUpsertResult> {
+        let env = match input.source_envelope.clone() {
+            Some(e) if e.has_key() => e,
+            // No idempotency key — behave like a normal create.
+            _ => {
+                let node = self.ingest(input)?;
+                return Ok(SourceUpsertResult {
+                    outcome: SourceUpsertOutcome::Created,
+                    node_id: node.id,
+                });
+            }
+        };
+
+        let source_system = env.source_system.clone().unwrap_or_default();
+        let source_id = env.source_id.clone().unwrap_or_default();
+        let now = Utc::now();
+
+        // Look up the existing memory for this external record, if any.
+        let existing: Option<(String, Option<String>)> = {
+            let reader = self
+                .reader
+                .lock()
+                .map_err(|_| StorageError::Init("Reader lock poisoned".into()))?;
+            reader
+                .query_row(
+                    "SELECT id, content_hash FROM knowledge_nodes \
+                     WHERE source_system = ?1 AND source_id = ?2 LIMIT 1",
+                    params![source_system, source_id],
+                    |row| Ok((row.get::<_, String>(0)?, row.get::<_, Option<String>>(1)?)),
+                )
+                .optional()?
+        };
+
+        let Some((node_id, stored_hash)) = existing else {
+            // First time we've seen this record — plain insert carries the
+            // envelope through the existing ingest path.
+            let node = self.ingest(input)?;
+            return Ok(SourceUpsertResult {
+                outcome: SourceUpsertOutcome::Created,
+                node_id: node.id,
+            });
+        };
+
+        let new_hash = env.content_hash.clone();
+        let unchanged = match (&stored_hash, &new_hash) {
+            // Both present and equal → genuinely unchanged.
+            (Some(a), Some(b)) => a == b,
+            // Either side missing a hash → be conservative and treat as changed
+            // so we never silently skip a real update.
+            _ => false,
+        };
+
+        let env_source_updated_at = env.source_updated_at.map(|dt| dt.to_rfc3339());
+        let synced_at = now.to_rfc3339();
+
+        if unchanged {
+            // Cheapest path: only advance liveness + the source cursor field.
+            let writer = self
+                .writer
+                .lock()
+                .map_err(|_| StorageError::Init("Writer lock poisoned".into()))?;
+            writer.execute(
+                // Un-tombstone fully: a reappearing record clears BOTH bitemporal
+                // markers (valid_until AND superseded_by), otherwise it would be
+                // resurrected as currently-valid yet still flagged as superseded,
+                // which permanently excludes it from merge/consolidation.
+                "UPDATE knowledge_nodes \
+                 SET synced_at = ?1, source_updated_at = COALESCE(?2, source_updated_at), \
+                     source_url = COALESCE(?3, source_url), \
+                     valid_until = NULL, superseded_by = NULL \
+                 WHERE id = ?4",
+                params![synced_at, env_source_updated_at, env.source_url, node_id],
+            )?;
+            return Ok(SourceUpsertResult {
+                outcome: SourceUpsertOutcome::Unchanged,
+                node_id,
+            });
+        }
+
+        // Content changed upstream → update body + full envelope, clear any
+        // prior tombstone (`valid_until`), then regenerate the embedding.
+        {
+            let writer = self
+                .writer
+                .lock()
+                .map_err(|_| StorageError::Init("Writer lock poisoned".into()))?;
+            writer.execute(
+                // Clear BOTH bitemporal markers on update (see Unchanged branch).
+                "UPDATE knowledge_nodes SET \
+                    content = ?1, updated_at = ?2, synced_at = ?3, \
+                    content_hash = ?4, source_url = ?5, source_updated_at = ?6, \
+                    source_project = ?7, source_type = ?8, source_author = ?9, \
+                    valid_until = NULL, superseded_by = NULL \
+                 WHERE id = ?10",
+                params![
+                    input.content,
+                    now.to_rfc3339(),
+                    synced_at,
+                    env.content_hash,
+                    env.source_url,
+                    env_source_updated_at,
+                    env.source_project,
+                    env.source_type,
+                    env.source_author,
+                    node_id,
+                ],
+            )?;
+        }
+
+        #[cfg(all(feature = "embeddings", feature = "vector-search"))]
+        {
+            if let Some(index) = self.vector_index.as_ref()
+                && let Ok(mut index) = index.lock()
+            {
+                let _ = index.remove(&node_id);
+            }
+            if let Err(e) = self.generate_embedding_for_node(&node_id, &input.content) {
+                tracing::warn!("Failed to regenerate embedding for {}: {}", node_id, e);
+            }
+        }
+
+        Ok(SourceUpsertResult {
+            outcome: SourceUpsertOutcome::Updated,
+            node_id,
+        })
+    }
+
+    /// Read the incremental-sync checkpoint for a `(source_system, scope)`.
+    /// Returns a zeroed cursor (no high-water mark) if none has been saved yet.
+    pub fn get_connector_cursor(
+        &self,
+        source_system: &str,
+        scope: &str,
+    ) -> Result<ConnectorCursor> {
+        let reader = self
+            .reader
+            .lock()
+            .map_err(|_| StorageError::Init("Reader lock poisoned".into()))?;
+        let row = reader
+            .query_row(
+                "SELECT cursor_updated_at, last_synced_at, last_full_reconcile_at, records_seen \
+                 FROM connector_cursors WHERE source_system = ?1 AND scope = ?2",
+                params![source_system, scope],
+                |row| {
+                    Ok((
+                        row.get::<_, Option<String>>(0)?,
+                        row.get::<_, Option<String>>(1)?,
+                        row.get::<_, Option<String>>(2)?,
+                        row.get::<_, i64>(3)?,
+                    ))
+                },
+            )
+            .optional()?;
+
+        let parse = |s: Option<String>| -> Option<DateTime<Utc>> {
+            s.and_then(|s| {
+                DateTime::parse_from_rfc3339(&s)
+                    .map(|dt| dt.with_timezone(&Utc))
+                    .ok()
+            })
+        };
+
+        Ok(match row {
+            Some((cur, last, recon, seen)) => ConnectorCursor {
+                source_system: source_system.to_string(),
+                scope: scope.to_string(),
+                cursor_updated_at: parse(cur),
+                last_synced_at: parse(last),
+                last_full_reconcile_at: parse(recon),
+                records_seen: seen,
+            },
+            None => ConnectorCursor {
+                source_system: source_system.to_string(),
+                scope: scope.to_string(),
+                ..Default::default()
+            },
+        })
+    }
+
+    /// Persist the incremental-sync checkpoint for a `(source_system, scope)`.
+    pub fn save_connector_cursor(&self, cursor: &ConnectorCursor) -> Result<()> {
+        let writer = self
+            .writer
+            .lock()
+            .map_err(|_| StorageError::Init("Writer lock poisoned".into()))?;
+        writer.execute(
+            "INSERT INTO connector_cursors \
+                (source_system, scope, cursor_updated_at, last_synced_at, \
+                 last_full_reconcile_at, records_seen) \
+             VALUES (?1, ?2, ?3, ?4, ?5, ?6) \
+             ON CONFLICT(source_system, scope) DO UPDATE SET \
+                cursor_updated_at = excluded.cursor_updated_at, \
+                last_synced_at = excluded.last_synced_at, \
+                last_full_reconcile_at = excluded.last_full_reconcile_at, \
+                records_seen = excluded.records_seen",
+            params![
+                cursor.source_system,
+                cursor.scope,
+                cursor.cursor_updated_at.map(|d| d.to_rfc3339()),
+                cursor.last_synced_at.map(|d| d.to_rfc3339()),
+                cursor.last_full_reconcile_at.map(|d| d.to_rfc3339()),
+                cursor.records_seen,
+            ],
+        )?;
+        Ok(())
+    }
+
+    /// Reconcile deletions for a scope: tombstone every local memory in
+    /// `(source_system, source_project = scope)` whose `source_id` is NOT in the
+    /// caller-supplied set of currently-live ids (#57).
+    ///
+    /// Neither Redmine nor GitHub exposes a deletion feed, so an incremental
+    /// `updated_at` sync can never see a delete. The connector therefore
+    /// periodically enumerates the full set of live ids and calls this. We
+    /// **invalidate, don't purge** (Graphiti-style): the memory keeps its
+    /// content for audit but gets `valid_until = now`, so it falls out of
+    /// "currently valid" retrieval without losing history. A record that
+    /// reappears upstream is un-tombstoned by the next `upsert_by_source`
+    /// (which clears `valid_until`).
+    pub fn reconcile_source_tombstones(
+        &self,
+        source_system: &str,
+        scope: &str,
+        live_ids: &[String],
+    ) -> Result<ReconcileReport> {
+        let live: std::collections::HashSet<&str> = live_ids.iter().map(|s| s.as_str()).collect();
+
+        // All currently-valid local records for this scope.
+        let local: Vec<(String, String)> = {
+            let reader = self
+                .reader
+                .lock()
+                .map_err(|_| StorageError::Init("Reader lock poisoned".into()))?;
+            let mut stmt = reader.prepare(
+                "SELECT id, source_id FROM knowledge_nodes \
+                 WHERE source_system = ?1 AND source_project = ?2 \
+                   AND source_id IS NOT NULL AND valid_until IS NULL",
+            )?;
+            let rows = stmt.query_map(params![source_system, scope], |row| {
+                Ok((row.get::<_, String>(0)?, row.get::<_, String>(1)?))
+            })?;
+            rows.filter_map(|r| r.ok()).collect()
+        };
+
+        let considered = local.len();
+        let now = Utc::now().to_rfc3339();
+        let mut tombstoned = Vec::new();
+
+        {
+            let writer = self
+                .writer
+                .lock()
+                .map_err(|_| StorageError::Init("Writer lock poisoned".into()))?;
+            for (node_id, source_id) in &local {
+                if !live.contains(source_id.as_str()) {
+                    writer.execute(
+                        "UPDATE knowledge_nodes SET valid_until = ?1 WHERE id = ?2",
+                        params![now, node_id],
+                    )?;
+                    tombstoned.push(node_id.clone());
+                }
+            }
+        }
+
+        Ok(ReconcileReport {
+            tombstoned,
+            considered,
+        })
+    }
+}
+
 // ============================================================================
 // TESTS
 // ============================================================================
@ -9443,6 +9825,262 @@ mod tests {
        Storage::new(Some(dir.path().join(name))).unwrap()
    }

+    // ===================== Connector sync (#57) =========================
+
+    /// Build an `IngestInput` carrying a source envelope for a GitHub-ish issue.
+    fn source_input(id: &str, content: &str, hash: &str) -> IngestInput {
+        IngestInput {
+            content: content.to_string(),
+            node_type: "fact".to_string(),
+            source_envelope: Some(crate::memory::SourceEnvelope {
+                source_system: Some("github".to_string()),
+                source_id: Some(id.to_string()),
+                source_url: Some(format!("https://github.com/o/r/issues/{id}")),
+                content_hash: Some(hash.to_string()),
+                source_project: Some("o/r".to_string()),
+                source_type: Some("issue".to_string()),
+                source_author: Some("octocat".to_string()),
+                ..Default::default()
+            }),
+            ..Default::default()
+        }
+    }
+
+    fn node_count(store: &Storage) -> i64 {
+        // Count rows for our test source so embeddings/other tests don't bleed in.
+        let reader = store.reader.lock().unwrap();
+        reader
+            .query_row(
+                "SELECT COUNT(*) FROM knowledge_nodes WHERE source_system = 'github'",
+                [],
+                |r| r.get(0),
+            )
+            .unwrap()
+    }
+
+    #[test]
+    fn upsert_by_source_is_idempotent_across_reruns() {
+        let store = create_test_storage();
+
+        // First sync: a brand-new record → Created.
+        let r1 = store
+            .upsert_by_source(source_input("1", "Bug: crash on startup", "hash-a"))
+            .unwrap();
+        assert_eq!(r1.outcome, SourceUpsertOutcome::Created);
+        assert_eq!(node_count(&store), 1);
+
+        // Re-sync the SAME record with the SAME hash twice → Unchanged, no dupes.
+        for _ in 0..2 {
+            let r = store
+                .upsert_by_source(source_input("1", "Bug: crash on startup", "hash-a"))
+                .unwrap();
+            assert_eq!(r.outcome, SourceUpsertOutcome::Unchanged);
+            assert_eq!(r.node_id, r1.node_id, "must reuse the same memory id");
+        }
+        assert_eq!(
+            node_count(&store),
+            1,
+            "idempotent: still exactly one memory"
+        );
+    }
+
+    #[test]
+    fn upsert_by_source_updates_in_place_when_hash_changes() {
+        let store = create_test_storage();
+        let created = store
+            .upsert_by_source(source_input("7", "old body", "hash-old"))
+            .unwrap();
+
+        // Upstream edit: content + hash change → Updated, same id, new content.
+        let updated = store
+            .upsert_by_source(source_input("7", "new edited body", "hash-new"))
+            .unwrap();
+        assert_eq!(updated.outcome, SourceUpsertOutcome::Updated);
+        assert_eq!(updated.node_id, created.node_id);
+        assert_eq!(node_count(&store), 1, "update must not duplicate");
+
+        let node = store.get_node(&created.node_id).unwrap().unwrap();
+        assert_eq!(node.content, "new edited body");
+        let env = node.source_envelope.expect("envelope persisted");
+        assert_eq!(env.content_hash.as_deref(), Some("hash-new"));
+        assert_eq!(env.source_id.as_deref(), Some("7"));
+    }
+
+    #[test]
+    fn upsert_by_source_without_key_falls_back_to_create() {
+        let store = create_test_storage();
+        // Envelope present but missing source_id → not keyed → plain create.
+        let input = IngestInput {
+            content: "loose note".to_string(),
+            node_type: "fact".to_string(),
+            source_envelope: Some(crate::memory::SourceEnvelope {
+                source_url: Some("https://example.com/x".to_string()),
+                ..Default::default()
+            }),
+            ..Default::default()
+        };
+        let r = store.upsert_by_source(input).unwrap();
+        assert_eq!(r.outcome, SourceUpsertOutcome::Created);
+    }
+
+    #[test]
+    fn connector_cursor_round_trips() {
+        let store = create_test_storage();
+        // Unknown scope → zeroed cursor.
+        let empty = store.get_connector_cursor("github", "o/r").unwrap();
+        assert!(empty.cursor_updated_at.is_none());
+        assert_eq!(empty.records_seen, 0);
+
+        let ts = Utc::now();
+        let cursor = ConnectorCursor {
+            source_system: "github".to_string(),
+            scope: "o/r".to_string(),
+            cursor_updated_at: Some(ts),
+            last_synced_at: Some(ts),
+            last_full_reconcile_at: None,
+            records_seen: 42,
+        };
+        store.save_connector_cursor(&cursor).unwrap();
+
+        let back = store.get_connector_cursor("github", "o/r").unwrap();
+        assert_eq!(back.records_seen, 42);
+        assert_eq!(
+            back.cursor_updated_at.map(|d| d.to_rfc3339()),
+            Some(ts.to_rfc3339())
+        );
+
+        // Upsert semantics: saving again replaces, never duplicates.
+        let mut c2 = cursor.clone();
+        c2.records_seen = 99;
+        store.save_connector_cursor(&c2).unwrap();
+        assert_eq!(
+            store
+                .get_connector_cursor("github", "o/r")
+                .unwrap()
+                .records_seen,
+            99
+        );
+    }
+
+    #[test]
+    fn reconcile_tombstones_records_absent_from_live_set() {
+        let store = create_test_storage();
+        // Three synced issues in scope o/r.
+        for id in ["1", "2", "3"] {
+            store
+                .upsert_by_source(source_input(id, &format!("issue {id}"), &format!("h{id}")))
+                .unwrap();
+        }
+
+        // Reconcile: only 1 and 3 are still visible upstream → 2 is tombstoned.
+        let report = store
+            .reconcile_source_tombstones("github", "o/r", &["1".to_string(), "3".to_string()])
+            .unwrap();
+        assert_eq!(report.considered, 3);
+        assert_eq!(report.tombstoned.len(), 1, "exactly issue 2 tombstoned");
+
+        // Issue 2's memory is invalidated (valid_until set) but NOT purged —
+        // content retained for audit, just no longer currently-valid.
+        let two = {
+            let reader = store.reader.lock().unwrap();
+            reader
+                .query_row(
+                    "SELECT id, valid_until FROM knowledge_nodes WHERE source_id = '2'",
+                    [],
+                    |r| Ok((r.get::<_, String>(0)?, r.get::<_, Option<String>>(1)?)),
+                )
+                .unwrap()
+        };
+        assert!(
+            two.1.is_some(),
+            "tombstoned record must have valid_until set"
+        );
+        let node = store.get_node(&two.0).unwrap().unwrap();
+        assert!(
+            !node.is_currently_valid(),
+            "tombstoned node is not valid now"
+        );
+        assert_eq!(node.content, "issue 2", "content retained for audit");
+
+        // A reappearing record un-tombstones on next upsert (clears valid_until).
+        store
+            .upsert_by_source(source_input("2", "issue 2", "h2"))
+            .unwrap();
+        let revived = store.get_node(&two.0).unwrap().unwrap();
+        assert!(
+            revived.is_currently_valid(),
+            "re-synced record is valid again"
+        );
+    }
+
+    #[test]
+    fn upsert_clears_superseded_by_when_record_reappears() {
+        // Regression: un-tombstoning must clear BOTH bitemporal markers. A
+        // connector node that was superseded/merged (valid_until + superseded_by
+        // both set) and then re-observed upstream must come back fully clean,
+        // otherwise it is currently-valid yet still flagged superseded and is
+        // permanently excluded from merge candidacy.
+        let store = create_test_storage();
+        let created = store
+            .upsert_by_source(source_input("9", "body v1", "h9a"))
+            .unwrap();
+
+        // Simulate the node having been superseded (as merge/supersede would).
+        {
+            let writer = store.writer.lock().unwrap();
+            writer
+                .execute(
+                    "UPDATE knowledge_nodes SET valid_until = ?1, superseded_by = 'survivor-id' WHERE id = ?2",
+                    params![Utc::now().to_rfc3339(), created.node_id],
+                )
+                .unwrap();
+        }
+        assert!(
+            store
+                .superseded_node_ids()
+                .unwrap()
+                .contains(&created.node_id),
+            "precondition: node is superseded"
+        );
+
+        // Re-sync with a content change → Updated branch must clear both markers.
+        let res = store
+            .upsert_by_source(source_input("9", "body v2 edited", "h9b"))
+            .unwrap();
+        assert_eq!(res.outcome, SourceUpsertOutcome::Updated);
+        assert!(
+            !store
+                .superseded_node_ids()
+                .unwrap()
+                .contains(&created.node_id),
+            "superseded_by must be cleared on re-sync (no bitemporal zombie)"
+        );
+        let node = store.get_node(&created.node_id).unwrap().unwrap();
+        assert!(node.is_currently_valid());
+
+        // Also exercise the Unchanged branch: supersede again, re-sync same hash.
+        {
+            let writer = store.writer.lock().unwrap();
+            writer
+                .execute(
+                    "UPDATE knowledge_nodes SET valid_until = ?1, superseded_by = 'survivor-id' WHERE id = ?2",
+                    params![Utc::now().to_rfc3339(), created.node_id],
+                )
+                .unwrap();
+        }
+        let res2 = store
+            .upsert_by_source(source_input("9", "body v2 edited", "h9b"))
+            .unwrap();
+        assert_eq!(res2.outcome, SourceUpsertOutcome::Unchanged);
+        assert!(
+            !store
+                .superseded_node_ids()
+                .unwrap()
+                .contains(&created.node_id),
+            "Unchanged branch must also clear superseded_by"
+        );
+    }
+
    #[cfg(all(feature = "embeddings", feature = "vector-search"))]
    fn with_vector_search_disabled<T>(f: impl FnOnce() -> T) -> T {
        let _guard = ENV_LOCK.lock().unwrap();
--- a/crates/vestige-mcp/Cargo.toml
+++ b/crates/vestige-mcp/Cargo.toml
@ -10,9 +10,13 @@ categories = ["command-line-utilities", "database"]
 repository = "https://github.com/samvallad33/vestige"

 [features]
-default = ["embeddings", "ort-download", "vector-search"]
+default = ["embeddings", "ort-download", "vector-search", "connectors"]
 embeddings = ["vestige-core/embeddings"]
 vector-search = ["vestige-core/vector-search"]
+# External-source connectors (#57): GitHub Issues / Redmine indexing via the
+# `source_sync` MCP tool. On by default so the tool works out of the box; turn
+# off for a build with no HTTP client.
+connectors = ["vestige-core/connectors"]
 # Default ort backend: downloads prebuilt ONNX Runtime at build time.
 # Fails on targets without prebuilts (notably x86_64-apple-darwin).
 ort-download = ["embeddings", "vestige-core/ort-download"]
--- a/crates/vestige-mcp/src/bin/cli.rs
+++ b/crates/vestige-mcp/src/bin/cli.rs
@ -1817,6 +1817,7 @@ fn run_restore(backup_path: PathBuf) -> anyhow::Result<()> {
            tags: memory.tags.unwrap_or_default(),
            valid_from: None,
            valid_until: None,
+            source_envelope: None,
        };

        match storage.ingest(input) {
@ -2415,6 +2416,7 @@ fn run_ingest(
        tags: tag_list,
        valid_from: None,
        valid_until: None,
+        source_envelope: None,
    };

    let storage = open_storage()?;
--- a/crates/vestige-mcp/src/bin/restore.rs
+++ b/crates/vestige-mcp/src/bin/restore.rs
@ -73,6 +73,7 @@ fn main() -> anyhow::Result<()> {
            tags: memory.tags.unwrap_or_default(),
            valid_from: None,
            valid_until: None,
+            source_envelope: None,
        };

        match storage.ingest(input) {
--- a/crates/vestige-mcp/src/cognitive.rs
+++ b/crates/vestige-mcp/src/cognitive.rs
@ -195,6 +195,7 @@ mod tests {
                tags: vec!["test".to_string()],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();
        result.id
--- a/crates/vestige-mcp/src/server.rs
+++ b/crates/vestige-mcp/src/server.rs
@ -281,6 +281,15 @@ impl McpServer {
                ..Default::default()
            },
            // ================================================================
+            // EXTERNAL-SOURCE CONNECTORS (#57)
+            // ================================================================
+            ToolDescription {
+                name: "source_sync".to_string(),
+                description: Some("Index an external system into Vestige as a durable, offline, semantically-searchable index that cites back to the canonical record. GitHub: source='github', repo='owner/name' (auth via GITHUB_TOKEN env). Redmine: source='redmine', project='<id>' (host via REDMINE_URL, auth via REDMINE_API_KEY env). Idempotent: re-running updates changed issues without duplicating; set reconcile=true to tombstone issues removed upstream.".to_string()),
+                input_schema: tools::source_sync::schema(),
+                ..Default::default()
+            },
+            // ================================================================
            // TEMPORAL TOOLS (v1.2+)
            // ================================================================
            ToolDescription {
@ -593,6 +602,11 @@ impl McpServer {
                    .await
            }

+            // ================================================================
+            // External-source connectors (#57)
+            // ================================================================
+            "source_sync" => tools::source_sync::execute(&self.storage, request.arguments).await,
+
            // ================================================================
            // DEPRECATED (v1.7): ingest → smart_ingest
            // ================================================================
@ -1806,10 +1820,10 @@ mod tests {
        let result = response.result.unwrap();
        let tools = result["tools"].as_array().unwrap();

-        // 33 tools: 25 from v2.1.21 + 7 Phase 3 merge/supersede tools:
-        // merge_candidates, plan_merge, plan_supersede, apply_plan, merge_undo,
-        // protect, merge_policy, composed_graph)
-        assert_eq!(tools.len(), 33, "Expected exactly 33 tools");
+        // 34 tools: 25 from v2.1.21 + 7 Phase 3 merge/supersede tools
+        // (merge_candidates, plan_merge, plan_supersede, apply_plan, merge_undo,
+        // protect, merge_policy, composed_graph) + 1 connector tool (source_sync, #57).
+        assert_eq!(tools.len(), 34, "Expected exactly 34 tools");

        let tool_names: Vec<&str> = tools.iter().map(|t| t["name"].as_str().unwrap()).collect();

@ -1821,6 +1835,9 @@ mod tests {

        // Core memory (smart_ingest absorbs ingest + checkpoint in v1.7)
        assert!(tool_names.contains(&"smart_ingest"));
+
+        // External-source connectors (#57)
+        assert!(tool_names.contains(&"source_sync"));
        assert!(
            !tool_names.contains(&"ingest"),
            "ingest should be removed in v1.7"
--- a/crates/vestige-mcp/src/tools/changelog.rs
+++ b/crates/vestige-mcp/src/tools/changelog.rs
@ -278,6 +278,7 @@ mod tests {
                tags: vec![],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();
        node.id
--- a/crates/vestige-mcp/src/tools/codebase_unified.rs
+++ b/crates/vestige-mcp/src/tools/codebase_unified.rs
@ -154,6 +154,7 @@ async fn execute_remember_pattern(
        tags,
        valid_from: None,
        valid_until: None,
+        source_envelope: None,
    };

    let node = storage.ingest(input).map_err(|e| e.to_string())?;
@ -250,6 +251,7 @@ async fn execute_remember_decision(
        tags,
        valid_from: None,
        valid_until: None,
+        source_envelope: None,
    };

    let node = storage.ingest(input).map_err(|e| e.to_string())?;
--- a/crates/vestige-mcp/src/tools/cross_reference.rs
+++ b/crates/vestige-mcp/src/tools/cross_reference.rs
@ -1119,6 +1119,7 @@ mod tests {
                tags: tags.iter().map(|s| s.to_string()).collect(),
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap()
            .id
--- a/crates/vestige-mcp/src/tools/dream.rs
+++ b/crates/vestige-mcp/src/tools/dream.rs
@ -283,6 +283,7 @@ mod tests {
                    tags: vec!["dream-test".to_string()],
                    valid_from: None,
                    valid_until: None,
+                    source_envelope: None,
                })
                .unwrap();
        }
@ -420,6 +421,7 @@ mod tests {
                    tags: vec!["dream-roundtrip".to_string()],
                    valid_from: None,
                    valid_until: None,
+                    source_envelope: None,
                })
                .unwrap();
        }
@ -485,6 +487,7 @@ mod tests {
                    tags: vec!["save-conn-test".to_string()],
                    valid_from: None,
                    valid_until: None,
+                    source_envelope: None,
                })
                .unwrap();
            ids.push(result.id);
@ -588,6 +591,7 @@ mod tests {
                    tags: tags.iter().map(|t| t.to_string()).collect(),
                    valid_from: None,
                    valid_until: None,
+                    source_envelope: None,
                })
                .unwrap();
        }
@ -713,6 +717,7 @@ mod tests {
                    tags: tags.iter().map(|t| t.to_string()).collect(),
                    valid_from: None,
                    valid_until: None,
+                    source_envelope: None,
                })
                .unwrap();
        }
--- a/crates/vestige-mcp/src/tools/explore.rs
+++ b/crates/vestige-mcp/src/tools/explore.rs
@ -414,6 +414,7 @@ mod tests {
                tags: vec!["test".to_string()],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap()
            .id;
@ -428,6 +429,7 @@ mod tests {
                tags: vec!["test".to_string()],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap()
            .id;
@ -478,6 +480,7 @@ mod tests {
            tags: vec!["test".to_string()],
            valid_from: None,
            valid_until: None,
+            source_envelope: None,
        };
        let id_a = storage.ingest(make("Memory A about databases")).unwrap().id;
        let id_b = storage.ingest(make("Memory B about indexes")).unwrap().id;
@ -529,6 +532,7 @@ mod tests {
            tags: vec!["test".to_string()],
            valid_from: None,
            valid_until: None,
+            source_envelope: None,
        };
        let id_a = storage.ingest(make("Bridge test memory A")).unwrap().id;
        let id_b = storage.ingest(make("Bridge test memory B")).unwrap().id;
--- a/crates/vestige-mcp/src/tools/feedback.rs
+++ b/crates/vestige-mcp/src/tools/feedback.rs
@ -313,6 +313,7 @@ mod tests {
                tags: vec![],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();
        node.id
@ -556,6 +557,7 @@ mod tests {
                tags: vec![],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();
        let node_id = node.id.clone();
--- a/crates/vestige-mcp/src/tools/graph.rs
+++ b/crates/vestige-mcp/src/tools/graph.rs
@ -328,6 +328,7 @@ mod tests {
                tags: vec!["test".to_string()],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();

@ -355,6 +356,7 @@ mod tests {
                tags: vec!["science".to_string()],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();

@ -378,6 +380,7 @@ mod tests {
                tags: vec![],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();

--- a/crates/vestige-mcp/src/tools/health.rs
+++ b/crates/vestige-mcp/src/tools/health.rs
@ -119,6 +119,7 @@ mod tests {
                    tags: vec!["test".to_string()],
                    valid_from: None,
                    valid_until: None,
+                    source_envelope: None,
                })
                .unwrap();
        }
@ -144,6 +145,7 @@ mod tests {
                tags: vec![],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();

--- a/crates/vestige-mcp/src/tools/maintenance.rs
+++ b/crates/vestige-mcp/src/tools/maintenance.rs
@ -778,6 +778,7 @@ mod tests {
                    tags: vec![],
                    valid_from: None,
                    valid_until: None,
+                    source_envelope: None,
                })
                .unwrap();
        }
@ -832,6 +833,7 @@ mod tests {
                        tags: vec![],
                        valid_from: None,
                        valid_until: None,
+                        source_envelope: None,
                    })
                    .unwrap();
            }
@ -904,6 +906,7 @@ mod tests {
                tags: vec!["schema-test".to_string()],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();

@ -1015,6 +1018,7 @@ mod tests {
                tags: vec!["portable".to_string()],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();

--- a/crates/vestige-mcp/src/tools/memory_unified.rs
+++ b/crates/vestige-mcp/src/tools/memory_unified.rs
@ -552,6 +552,7 @@ mod tests {
                tags: vec!["test-tag".to_string()],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();
        node.id
--- a/crates/vestige-mcp/src/tools/mod.rs
+++ b/crates/vestige-mcp/src/tools/mod.rs
@ -12,6 +12,8 @@ pub mod intention_unified;
 pub mod memory_unified;
 pub mod search_unified;
 pub mod smart_ingest;
+// #57: external-source connectors (GitHub Issues / Redmine retrieval layer)
+pub mod source_sync;

 // v1.2: Temporal query tools
 pub mod changelog;
--- a/crates/vestige-mcp/src/tools/restore.rs
+++ b/crates/vestige-mcp/src/tools/restore.rs
@ -173,6 +173,7 @@ pub async fn execute(storage: &Arc<Storage>, args: Option<Value>) -> Result<Valu
            tags: memory.tags.clone().unwrap_or_default(),
            valid_from: None,
            valid_until: None,
+            source_envelope: None,
        };

        match storage.ingest(input) {
@ -349,6 +350,7 @@ mod tests {
                tags: vec!["portable".to_string()],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();

--- a/crates/vestige-mcp/src/tools/review.rs
+++ b/crates/vestige-mcp/src/tools/review.rs
@ -117,6 +117,7 @@ mod tests {
            tags: vec![],
            valid_from: None,
            valid_until: None,
+            source_envelope: None,
        };
        let node = storage.ingest(input).unwrap();
        node.id
--- a/crates/vestige-mcp/src/tools/search_unified.rs
+++ b/crates/vestige-mcp/src/tools/search_unified.rs
@ -96,6 +96,40 @@ pub fn schema() -> Value {
            "tag_prefix": {
                "type": "string",
                "description": "Optional tag-prefix filter. When set, only results carrying at least one tag whose value starts with this prefix are returned (case-sensitive). Example: tag_prefix=\"meeting:\" matches memories tagged 'meeting:standup', 'meeting:1-on-1', etc. Applied as a post-filter; combine with a larger 'limit' if you expect heavy thinning."
+            },
+            "source_system": {
+                "type": "string",
+                "description": "Investigation filter (#57): only memories ingested from this external system, e.g. 'github' or 'redmine'. Post-filter — non-connector memories are excluded. Combine with a larger 'limit' if thinning is heavy."
+            },
+            "source_project": {
+                "type": "string",
+                "description": "Investigation filter: only memories from this source project/repo, exact match (GitHub 'owner/repo', Redmine project id)."
+            },
+            "source_id": {
+                "type": "string",
+                "description": "Investigation filter: a specific source record id (issue number / ticket id). Pair with source_system to disambiguate across systems."
+            },
+            "source_type": {
+                "type": "string",
+                "description": "Investigation filter: source record type, e.g. 'issue', 'comment'."
+            },
+            "source_author": {
+                "type": "string",
+                "description": "Investigation filter: the source author/reporter (not assignee)."
+            },
+            "source_updated_after": {
+                "type": "string",
+                "description": "Investigation filter: only records whose source was updated at/after this RFC3339 timestamp (inclusive)."
+            },
+            "source_updated_before": {
+                "type": "string",
+                "description": "Investigation filter: only records whose source was updated at/before this RFC3339 timestamp (inclusive)."
+            },
+            "source_status": {
+                "type": "string",
+                "enum": ["any", "valid", "tombstoned"],
+                "description": "Investigation filter: 'any' (default), 'valid' (currently-valid records only), or 'tombstoned' (records no longer visible upstream, kept for audit).",
+                "default": "any"
            }
        },
        "required": ["query"]
@ -126,6 +160,23 @@ struct SearchArgs {
    concrete: Option<bool>,
    #[serde(alias = "tag_prefix")]
    tag_prefix: Option<String>,
+    // #57 Phase 4 — source-aware investigation filters (all post-filters).
+    #[serde(alias = "source_system")]
+    source_system: Option<String>,
+    #[serde(alias = "source_project")]
+    source_project: Option<String>,
+    #[serde(alias = "source_id")]
+    source_id: Option<String>,
+    #[serde(alias = "source_type")]
+    source_type: Option<String>,
+    #[serde(alias = "source_author")]
+    source_author: Option<String>,
+    #[serde(alias = "source_updated_after")]
+    source_updated_after: Option<String>,
+    #[serde(alias = "source_updated_before")]
+    source_updated_before: Option<String>,
+    #[serde(alias = "source_status")]
+    source_status: Option<String>,
 }

 /// Execute unified search with 7-stage cognitive pipeline.
@ -190,15 +241,19 @@ pub async fn execute(
        }
    };

+    // #57 Phase 4 — parse the source-aware investigation filter once (shared by
+    // both the concrete and hybrid paths). Hard-errors on malformed input.
+    let source_filter = SourceFilter::from_args(&args)?;
+
    let concrete = args
        .concrete
        .unwrap_or_else(|| is_literal_query(&args.query));
    if concrete {
-        // When a tag_prefix is requested, fetch a larger pool so the
-        // post-filter has enough headroom to still return ~limit results
-        // after thinning. Cap at the same upper bound the underlying SQL
-        // path uses elsewhere (100).
-        let concrete_fetch_limit = if args.tag_prefix.is_some() {
+        // When a tag_prefix OR a source filter is requested, fetch a larger
+        // pool so the post-filter has enough headroom to still return ~limit
+        // results after thinning. Cap at the same upper bound the underlying
+        // SQL path uses elsewhere (100).
+        let concrete_fetch_limit = if args.tag_prefix.is_some() || source_filter.is_active() {
            (limit * 3).min(100)
        } else {
            limit
@ -215,14 +270,15 @@ pub async fn execute(
        // Apply tag_prefix post-filter BEFORE strengthen-on-access so
        // results the caller did not actually receive do not get a
        // testing-effect boost.
-        let filtered_results: Vec<&vestige_core::SearchResult> = match args.tag_prefix.as_deref() {
-            Some(prefix) => results
-                .iter()
-                .filter(|r| tags_match_prefix(&r.node.tags, prefix))
-                .take(limit as usize)
-                .collect(),
-            None => results.iter().collect(),
-        };
+        let filtered_results: Vec<&vestige_core::SearchResult> = results
+            .iter()
+            .filter(|r| match args.tag_prefix.as_deref() {
+                Some(prefix) => tags_match_prefix(&r.node.tags, prefix),
+                None => true,
+            })
+            .filter(|r| node_matches_source(&r.node, &source_filter))
+            .take(limit as usize)
+            .collect();

        let ids: Vec<&str> = filtered_results
            .iter()
@ -334,11 +390,15 @@ pub async fn execute(
        "exhaustive" => 5, // Deep overfetch for maximum recall
        _ => 3,            // Balanced default
    };
-    // When a tag_prefix filter is requested, double the overfetch (capped at
-    // the same 100 ceiling) so the post-filter has enough headroom to still
-    // return ~limit results after thinning.
-    let tag_prefix_multiplier = if args.tag_prefix.is_some() { 2 } else { 1 };
-    let overfetch_limit = (limit * overfetch_multiplier * tag_prefix_multiplier).min(100); // Cap at 100 to avoid excessive DB load
+    // When a tag_prefix OR source filter is requested, double the overfetch
+    // (capped at the same 100 ceiling) so the post-filter has enough headroom
+    // to still return ~limit results after thinning.
+    let post_filter_multiplier = if args.tag_prefix.is_some() || source_filter.is_active() {
+        2
+    } else {
+        1
+    };
+    let overfetch_limit = (limit * overfetch_multiplier * post_filter_multiplier).min(100); // Cap at 100 to avoid excessive DB load

    let results = storage
        .hybrid_search_filtered(
@ -375,6 +435,10 @@ pub async fn execute(
    if let Some(prefix) = args.tag_prefix.as_deref() {
        filtered_results.retain(|r| tags_match_prefix(&r.node.tags, prefix));
    }
+    // #57 Phase 4 — source-aware investigation post-filter (same precedent).
+    if source_filter.is_active() {
+        filtered_results.retain(|r| node_matches_source(&r.node, &source_filter));
+    }

    // ====================================================================
    // Dedup: merge Stage 0 keyword-priority results into Stage 1 results
@ -387,6 +451,10 @@ pub async fn execute(
        {
            continue;
        }
+        // Respect the source filter on re-inject for the same reason.
+        if source_filter.is_active() && !node_matches_source(&kp.node, &source_filter) {
+            continue;
+        }
        if let Some(existing) = filtered_results
            .iter_mut()
            .find(|r| r.node.id == kp.node.id)
@ -852,6 +920,156 @@ fn tags_match_prefix(tags: &[String], prefix: &str) -> bool {
    tags.iter().any(|t| t.starts_with(prefix))
 }

+/// Validity filter for source-aware search (#57 Phase 4).
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
+enum SourceStatus {
+    /// No validity constraint.
+    #[default]
+    Any,
+    /// Only currently-valid records.
+    Valid,
+    /// Only tombstoned records (no longer visible upstream, kept for audit).
+    Tombstoned,
+}
+
+/// Parsed source-aware investigation filter (#57 Phase 4).
+///
+/// All fields are optional; an all-empty filter matches every node (so search
+/// behavior is byte-for-byte unchanged when no source filter is supplied). Any
+/// source-scoped field being set excludes legacy/agent memories that have no
+/// `source_envelope`. Applied as a post-filter on the recalled nodes, mirroring
+/// the existing `tag_prefix` precedent (no SQL changes).
+#[derive(Debug, Clone, Default)]
+struct SourceFilter {
+    system: Option<String>,
+    project: Option<String>,
+    id: Option<String>,
+    source_type: Option<String>,
+    author: Option<String>,
+    updated_after: Option<chrono::DateTime<chrono::Utc>>,
+    updated_before: Option<chrono::DateTime<chrono::Utc>>,
+    status: SourceStatus,
+}
+
+impl SourceFilter {
+    /// Build from raw args, hard-erroring on malformed timestamps / status enum
+    /// (consistent with how `detail_level` / `retrieval_mode` reject bad input —
+    /// a silently-`None` bound would widen the filter and return wrong rows).
+    fn from_args(args: &SearchArgs) -> Result<Self, String> {
+        let parse_ts = |s: &Option<String>,
+                        field: &str|
+         -> Result<Option<chrono::DateTime<chrono::Utc>>, String> {
+            match s {
+                None => Ok(None),
+                Some(v) => chrono::DateTime::parse_from_rfc3339(v)
+                    .map(|dt| Some(dt.with_timezone(&chrono::Utc)))
+                    .map_err(|_| format!("Invalid {field}: '{v}' is not an RFC3339 timestamp")),
+            }
+        };
+        let status = match args.source_status.as_deref() {
+            None | Some("any") => SourceStatus::Any,
+            Some("valid") => SourceStatus::Valid,
+            Some("tombstoned") => SourceStatus::Tombstoned,
+            Some(other) => {
+                return Err(format!(
+                    "Invalid source_status '{other}'. Must be 'any', 'valid', or 'tombstoned'."
+                ));
+            }
+        };
+        Ok(Self {
+            system: args.source_system.clone(),
+            project: args.source_project.clone(),
+            id: args.source_id.clone(),
+            source_type: args.source_type.clone(),
+            author: args.source_author.clone(),
+            updated_after: parse_ts(&args.source_updated_after, "source_updated_after")?,
+            updated_before: parse_ts(&args.source_updated_before, "source_updated_before")?,
+            status,
+        })
+    }
+
+    /// True when at least one filter is set (used to size the over-fetch pool).
+    fn is_active(&self) -> bool {
+        self.system.is_some()
+            || self.project.is_some()
+            || self.id.is_some()
+            || self.source_type.is_some()
+            || self.author.is_some()
+            || self.updated_after.is_some()
+            || self.updated_before.is_some()
+            || self.status != SourceStatus::Any
+    }
+}
+
+/// Predicate: does this node satisfy the source-aware investigation filter?
+/// An all-empty filter returns `true` for every node.
+fn node_matches_source(node: &vestige_core::KnowledgeNode, filter: &SourceFilter) -> bool {
+    // Validity check operates on the NODE (valid_until lives on the node).
+    match filter.status {
+        SourceStatus::Any => {}
+        SourceStatus::Valid if !node.is_currently_valid() => return false,
+        SourceStatus::Tombstoned if node.is_currently_valid() => return false,
+        _ => {}
+    }
+
+    // Any source-scoped field requires an envelope; legacy memories are out.
+    // This includes `source_status=valid`: otherwise a source-scoped query for
+    // valid connector records would also return ordinary valid agent memories.
+    let envelope_scoped = filter.system.is_some()
+        || filter.project.is_some()
+        || filter.id.is_some()
+        || filter.source_type.is_some()
+        || filter.author.is_some()
+        || filter.updated_after.is_some()
+        || filter.updated_before.is_some()
+        || filter.status != SourceStatus::Any;
+    if !envelope_scoped {
+        return true;
+    }
+    let Some(env) = node.source_envelope.as_ref() else {
+        return false;
+    };
+
+    let exact = |want: &Option<String>, have: &Option<String>| -> bool {
+        match want {
+            None => true,
+            Some(w) => have.as_deref() == Some(w.as_str()),
+        }
+    };
+    if !exact(&filter.system, &env.source_system) {
+        return false;
+    }
+    if !exact(&filter.project, &env.source_project) {
+        return false;
+    }
+    if !exact(&filter.id, &env.source_id) {
+        return false;
+    }
+    if !exact(&filter.source_type, &env.source_type) {
+        return false;
+    }
+    if !exact(&filter.author, &env.source_author) {
+        return false;
+    }
+    // Date bounds (inclusive) on the source-updated time.
+    if filter.updated_after.is_some() || filter.updated_before.is_some() {
+        let Some(ts) = env.source_updated_at else {
+            return false;
+        };
+        if let Some(after) = filter.updated_after
+            && ts < after
+        {
+            return false;
+        }
+        if let Some(before) = filter.updated_before
+            && ts > before
+        {
+            return false;
+        }
+    }
+    true
+}
+
 /// Format a search result based on the requested detail level.
 /// Score field keys dropped when an output profile suppresses scores.
 const SCORE_FIELDS: &[&str] = &["combinedScore", "keywordScore", "semanticScore"];
@ -889,6 +1107,31 @@ pub fn apply_output_masks(results: &mut [Value], output_config: &OutputConfig) {
    }
 }

+/// Build a compact `source` object from a node's connector provenance (#57),
+/// or `Value::Null` when the memory has no external source envelope.
+///
+/// Surfacing `url` in search results is the whole point of the connector layer:
+/// the agent can follow the citation back to the canonical Redmine/GitHub record
+/// for authoritative detail. `tombstoned` flags records no longer visible
+/// upstream (kept for audit).
+fn source_provenance(node: &vestige_core::KnowledgeNode) -> Value {
+    let Some(env) = node.source_envelope.as_ref() else {
+        return Value::Null;
+    };
+    serde_json::json!({
+        "system": env.source_system,
+        "id": env.source_id,
+        "url": env.source_url,
+        "project": env.source_project,
+        "type": env.source_type,
+        "author": env.source_author,
+        "sourceUpdatedAt": env.source_updated_at.map(|dt| dt.to_rfc3339()),
+        "syncedAt": env.synced_at.map(|dt| dt.to_rfc3339()),
+        // A tombstoned (no-longer-visible) record has valid_until set in the past.
+        "tombstoned": !node.is_currently_valid(),
+    })
+}
+
 fn format_search_result(r: &vestige_core::SearchResult, detail_level: &str) -> Value {
    match detail_level {
        "brief" => serde_json::json!({
@ -898,45 +1141,65 @@ fn format_search_result(r: &vestige_core::SearchResult, detail_level: &str) -> V
            "retentionStrength": r.node.retention_strength,
            "combinedScore": r.combined_score,
        }),
-        "full" => serde_json::json!({
-            "id": r.node.id,
-            "content": r.node.content,
-            "combinedScore": r.combined_score,
-            "keywordScore": r.keyword_score,
-            "semanticScore": r.semantic_score,
-            "nodeType": r.node.node_type,
-            "tags": r.node.tags,
-            "retentionStrength": r.node.retention_strength,
-            "storageStrength": r.node.storage_strength,
-            "retrievalStrength": r.node.retrieval_strength,
-            "source": r.node.source,
-            "sentimentScore": r.node.sentiment_score,
-            "sentimentMagnitude": r.node.sentiment_magnitude,
-            "createdAt": r.node.created_at.to_rfc3339(),
-            "updatedAt": r.node.updated_at.to_rfc3339(),
-            "lastAccessed": r.node.last_accessed.to_rfc3339(),
-            "nextReview": r.node.next_review.map(|dt| dt.to_rfc3339()),
-            "stability": r.node.stability,
-            "difficulty": r.node.difficulty,
-            "reps": r.node.reps,
-            "lapses": r.node.lapses,
-            "validFrom": r.node.valid_from.map(|dt| dt.to_rfc3339()),
-            "validUntil": r.node.valid_until.map(|dt| dt.to_rfc3339()),
-            "matchType": format!("{:?}", r.match_type),
-        }),
+        "full" => {
+            let mut v = serde_json::json!({
+                "id": r.node.id,
+                "content": r.node.content,
+                "combinedScore": r.combined_score,
+                "keywordScore": r.keyword_score,
+                "semanticScore": r.semantic_score,
+                "nodeType": r.node.node_type,
+                "tags": r.node.tags,
+                "retentionStrength": r.node.retention_strength,
+                "storageStrength": r.node.storage_strength,
+                "retrievalStrength": r.node.retrieval_strength,
+                "source": r.node.source,
+                "sentimentScore": r.node.sentiment_score,
+                "sentimentMagnitude": r.node.sentiment_magnitude,
+                "createdAt": r.node.created_at.to_rfc3339(),
+                "updatedAt": r.node.updated_at.to_rfc3339(),
+                "lastAccessed": r.node.last_accessed.to_rfc3339(),
+                "nextReview": r.node.next_review.map(|dt| dt.to_rfc3339()),
+                "stability": r.node.stability,
+                "difficulty": r.node.difficulty,
+                "reps": r.node.reps,
+                "lapses": r.node.lapses,
+                "validFrom": r.node.valid_from.map(|dt| dt.to_rfc3339()),
+                "validUntil": r.node.valid_until.map(|dt| dt.to_rfc3339()),
+                "matchType": format!("{:?}", r.match_type),
+            });
+            attach_source_record(&mut v, &r.node);
+            v
+        }
        // "summary" (default) — includes dates so AI never has to guess when a memory is from
-        _ => serde_json::json!({
-            "id": r.node.id,
-            "content": r.node.content,
-            "combinedScore": r.combined_score,
-            "keywordScore": r.keyword_score,
-            "semanticScore": r.semantic_score,
-            "nodeType": r.node.node_type,
-            "tags": r.node.tags,
-            "retentionStrength": r.node.retention_strength,
-            "createdAt": r.node.created_at.to_rfc3339(),
-            "updatedAt": r.node.updated_at.to_rfc3339(),
-        }),
+        _ => {
+            let mut v = serde_json::json!({
+                "id": r.node.id,
+                "content": r.node.content,
+                "combinedScore": r.combined_score,
+                "keywordScore": r.keyword_score,
+                "semanticScore": r.semantic_score,
+                "nodeType": r.node.node_type,
+                "tags": r.node.tags,
+                "retentionStrength": r.node.retention_strength,
+                "createdAt": r.node.created_at.to_rfc3339(),
+                "updatedAt": r.node.updated_at.to_rfc3339(),
+            });
+            attach_source_record(&mut v, &r.node);
+            v
+        }
+    }
+}
+
+/// Inject a `sourceRecord` object into a result `Value` ONLY when the memory
+/// has external provenance, so legacy (agent/user-authored) memories keep their
+/// exact prior result shape.
+fn attach_source_record(value: &mut Value, node: &vestige_core::KnowledgeNode) {
+    let provenance = source_provenance(node);
+    if !provenance.is_null()
+        && let Value::Object(map) = value
+    {
+        map.insert("sourceRecord".to_string(), provenance);
    }
 }

@ -950,36 +1213,44 @@ pub fn format_node(node: &vestige_core::KnowledgeNode, detail_level: &str) -> Va
            "tags": node.tags,
            "retentionStrength": node.retention_strength,
        }),
-        "full" => serde_json::json!({
-            "id": node.id,
-            "content": node.content,
-            "nodeType": node.node_type,
-            "tags": node.tags,
-            "retentionStrength": node.retention_strength,
-            "storageStrength": node.storage_strength,
-            "retrievalStrength": node.retrieval_strength,
-            "source": node.source,
-            "sentimentScore": node.sentiment_score,
-            "sentimentMagnitude": node.sentiment_magnitude,
-            "createdAt": node.created_at.to_rfc3339(),
-            "updatedAt": node.updated_at.to_rfc3339(),
-            "lastAccessed": node.last_accessed.to_rfc3339(),
-            "nextReview": node.next_review.map(|dt| dt.to_rfc3339()),
-            "stability": node.stability,
-            "difficulty": node.difficulty,
-            "reps": node.reps,
-            "lapses": node.lapses,
-            "validFrom": node.valid_from.map(|dt| dt.to_rfc3339()),
-            "validUntil": node.valid_until.map(|dt| dt.to_rfc3339()),
-        }),
+        "full" => {
+            let mut v = serde_json::json!({
+                "id": node.id,
+                "content": node.content,
+                "nodeType": node.node_type,
+                "tags": node.tags,
+                "retentionStrength": node.retention_strength,
+                "storageStrength": node.storage_strength,
+                "retrievalStrength": node.retrieval_strength,
+                "source": node.source,
+                "sentimentScore": node.sentiment_score,
+                "sentimentMagnitude": node.sentiment_magnitude,
+                "createdAt": node.created_at.to_rfc3339(),
+                "updatedAt": node.updated_at.to_rfc3339(),
+                "lastAccessed": node.last_accessed.to_rfc3339(),
+                "nextReview": node.next_review.map(|dt| dt.to_rfc3339()),
+                "stability": node.stability,
+                "difficulty": node.difficulty,
+                "reps": node.reps,
+                "lapses": node.lapses,
+                "validFrom": node.valid_from.map(|dt| dt.to_rfc3339()),
+                "validUntil": node.valid_until.map(|dt| dt.to_rfc3339()),
+            });
+            attach_source_record(&mut v, node);
+            v
+        }
        // "summary" (default)
-        _ => serde_json::json!({
-            "id": node.id,
-            "content": node.content,
-            "nodeType": node.node_type,
-            "tags": node.tags,
-            "retentionStrength": node.retention_strength,
-        }),
+        _ => {
+            let mut v = serde_json::json!({
+                "id": node.id,
+                "content": node.content,
+                "nodeType": node.node_type,
+                "tags": node.tags,
+                "retentionStrength": node.retention_strength,
+            });
+            attach_source_record(&mut v, node);
+            v
+        }
    }
 }

@ -1016,6 +1287,7 @@ mod tests {
            tags: vec![],
            valid_from: None,
            valid_until: None,
+            source_envelope: None,
        };
        let node = storage.ingest(input).unwrap();
        node.id
@ -1826,6 +2098,167 @@ mod tests {
        assert!(!required.contains(&serde_json::json!("tag_prefix")));
    }

+    // ===================== #57 Phase 4 source filters =====================
+
+    /// Build a KnowledgeNode carrying a source envelope for filter tests.
+    fn node_with_source(
+        system: &str,
+        project: &str,
+        id: &str,
+        author: &str,
+        updated: &str,
+    ) -> vestige_core::KnowledgeNode {
+        let mut n = vestige_core::KnowledgeNode::default();
+        n.id = format!("{system}-{id}");
+        // SourceEnvelope is #[non_exhaustive]; build via Default + field set.
+        let mut env = vestige_core::SourceEnvelope::default();
+        env.source_system = Some(system.to_string());
+        env.source_id = Some(id.to_string());
+        env.source_url = Some(format!("https://x/{id}"));
+        env.source_updated_at = chrono::DateTime::parse_from_rfc3339(updated)
+            .ok()
+            .map(|d| d.with_timezone(&chrono::Utc));
+        env.content_hash = Some("h".to_string());
+        env.source_project = Some(project.to_string());
+        env.source_type = Some("issue".to_string());
+        env.source_author = Some(author.to_string());
+        n.source_envelope = Some(env);
+        n
+    }
+
+    fn filter_from(json: serde_json::Value) -> SourceFilter {
+        let mut v = json;
+        v["query"] = serde_json::json!("q");
+        let args: SearchArgs = serde_json::from_value(v).unwrap();
+        SourceFilter::from_args(&args).unwrap()
+    }
+
+    #[test]
+    fn source_filter_empty_matches_everything() {
+        let f = SourceFilter::default();
+        assert!(!f.is_active());
+        let gh = node_with_source("github", "o/r", "1", "octo", "2026-06-19T00:00:00Z");
+        let legacy = vestige_core::KnowledgeNode::default(); // no envelope
+        assert!(node_matches_source(&gh, &f));
+        assert!(node_matches_source(&legacy, &f), "no filter = unchanged");
+    }
+
+    #[test]
+    fn source_filter_exact_fields() {
+        let gh = node_with_source("github", "o/r", "57", "octo", "2026-06-19T00:00:00Z");
+        let rm = node_with_source("redmine", "infra", "57", "jane", "2026-06-19T00:00:00Z");
+
+        let by_system = filter_from(serde_json::json!({"sourceSystem": "github"}));
+        assert!(node_matches_source(&gh, &by_system));
+        assert!(!node_matches_source(&rm, &by_system));
+
+        let by_project = filter_from(serde_json::json!({"sourceProject": "infra"}));
+        assert!(node_matches_source(&rm, &by_project));
+        assert!(!node_matches_source(&gh, &by_project));
+
+        let by_author = filter_from(serde_json::json!({"sourceAuthor": "octo"}));
+        assert!(node_matches_source(&gh, &by_author));
+        assert!(!node_matches_source(&rm, &by_author));
+
+        // id + system together disambiguate across systems sharing an id.
+        let by_id_sys =
+            filter_from(serde_json::json!({"sourceSystem": "redmine", "sourceId": "57"}));
+        assert!(node_matches_source(&rm, &by_id_sys));
+        assert!(!node_matches_source(&gh, &by_id_sys));
+    }
+
+    #[test]
+    fn source_filter_excludes_legacy_memories_when_envelope_scoped() {
+        let legacy = vestige_core::KnowledgeNode::default();
+        let f = filter_from(serde_json::json!({"sourceSystem": "github"}));
+        assert!(
+            !node_matches_source(&legacy, &f),
+            "an envelope-scoped filter must exclude memories with no source"
+        );
+    }
+
+    #[test]
+    fn source_filter_date_bounds_inclusive() {
+        let n = node_with_source("github", "o/r", "1", "octo", "2026-06-15T12:00:00Z");
+        // After bound: inclusive at the exact instant, excludes earlier.
+        assert!(node_matches_source(
+            &n,
+            &filter_from(serde_json::json!({"sourceUpdatedAfter": "2026-06-15T12:00:00Z"}))
+        ));
+        assert!(!node_matches_source(
+            &n,
+            &filter_from(serde_json::json!({"sourceUpdatedAfter": "2026-06-16T00:00:00Z"}))
+        ));
+        // Before bound: inclusive, excludes later.
+        assert!(node_matches_source(
+            &n,
+            &filter_from(serde_json::json!({"sourceUpdatedBefore": "2026-06-15T12:00:00Z"}))
+        ));
+        assert!(!node_matches_source(
+            &n,
+            &filter_from(serde_json::json!({"sourceUpdatedBefore": "2026-06-15T00:00:00Z"}))
+        ));
+    }
+
+    #[test]
+    fn source_filter_status_valid_vs_tombstoned() {
+        let mut live = node_with_source("github", "o/r", "1", "octo", "2026-06-19T00:00:00Z");
+        let mut dead = node_with_source("github", "o/r", "2", "octo", "2026-06-19T00:00:00Z");
+        let legacy = vestige_core::KnowledgeNode::default();
+        // Tombstone `dead` by setting valid_until in the past.
+        dead.valid_until = Some(chrono::Utc::now() - chrono::Duration::days(1));
+        live.valid_until = None;
+
+        let valid = filter_from(serde_json::json!({"sourceStatus": "valid"}));
+        assert!(node_matches_source(&live, &valid));
+        assert!(!node_matches_source(&dead, &valid));
+        assert!(
+            !node_matches_source(&legacy, &valid),
+            "source_status is source-scoped and must not include legacy memories"
+        );
+
+        let tomb = filter_from(serde_json::json!({"sourceStatus": "tombstoned"}));
+        assert!(!node_matches_source(&live, &tomb));
+        assert!(node_matches_source(&dead, &tomb));
+        assert!(!node_matches_source(&legacy, &tomb));
+    }
+
+    #[test]
+    fn source_filter_rejects_bad_timestamp_and_status() {
+        let mut v = serde_json::json!({"query": "q", "sourceUpdatedAfter": "not-a-date"});
+        let args: SearchArgs = serde_json::from_value(v.take()).unwrap();
+        assert!(SourceFilter::from_args(&args).is_err());
+
+        let mut v2 = serde_json::json!({"query": "q", "sourceStatus": "bogus"});
+        let args2: SearchArgs = serde_json::from_value(v2.take()).unwrap();
+        assert!(SourceFilter::from_args(&args2).is_err());
+    }
+
+    #[test]
+    fn test_schema_has_source_filters() {
+        let s = schema();
+        for prop in [
+            "source_system",
+            "source_project",
+            "source_id",
+            "source_type",
+            "source_author",
+            "source_updated_after",
+            "source_updated_before",
+            "source_status",
+        ] {
+            assert!(
+                s["properties"][prop].is_object(),
+                "schema must expose {prop}"
+            );
+        }
+        // None of the source filters are required.
+        let required = s["required"].as_array().unwrap();
+        for prop in ["source_system", "source_status"] {
+            assert!(!required.contains(&serde_json::json!(prop)));
+        }
+    }
+
    /// Helper that ingests a memory with specific tags. The base
    /// `ingest_test_content` helper passes `tags: vec![]`, which is fine
    /// for legacy tests but not for tag_prefix coverage.
@ -1839,6 +2272,7 @@ mod tests {
            tags: tags.into_iter().map(String::from).collect(),
            valid_from: None,
            valid_until: None,
+            source_envelope: None,
        };
        let node = storage.ingest(input).unwrap();
        node.id
--- a/crates/vestige-mcp/src/tools/session_context.rs
+++ b/crates/vestige-mcp/src/tools/session_context.rs
@ -506,6 +506,7 @@ mod tests {
            tags: tags.into_iter().map(|s| s.to_string()).collect(),
            valid_from: None,
            valid_until: None,
+            source_envelope: None,
        };
        let node = storage.ingest(input).unwrap();
        node.id
@ -712,6 +713,7 @@ mod tests {
            tags: vec!["pattern".to_string(), "codebase:vestige".to_string()],
            valid_from: None,
            valid_until: None,
+            source_envelope: None,
        };
        storage.ingest(input).unwrap();

--- a/crates/vestige-mcp/src/tools/smart_ingest.rs
+++ b/crates/vestige-mcp/src/tools/smart_ingest.rs
@ -215,6 +215,7 @@ pub async fn execute(
        tags,
        valid_from: None,
        valid_until: None,
+        source_envelope: None,
    };

    // ====================================================================
@ -414,6 +415,7 @@ async fn execute_batch(
            tags,
            valid_from: None,
            valid_until: None,
+            source_envelope: None,
        };

        // ================================================================
--- a/crates/vestige-mcp/src/tools/source_sync.rs
+++ b/crates/vestige-mcp/src/tools/source_sync.rs
@ -0,0 +1,291 @@
+//! `source_sync` MCP tool (#57) — index an external system into Vestige.
+//!
+//! Turns Vestige into a durable, offline, provenance-linked retrieval layer
+//! over a long-lived external system. GitHub Issues and Redmine are the first
+//! reference connectors: Vestige indexes issues, comments/journals, and source
+//! metadata as source-aware memories you can search semantically and cite back
+//! to the canonical issue URL — re-runnable idempotently (no duplicates) and
+//! able to tombstone issues that vanish upstream.
+//!
+//! Unlike the official GitHub MCP server (a stateless live API proxy), this
+//! keeps a local index: searchable offline, embedded for semantic recall,
+//! joinable with the rest of your memory, and temporally versioned.
+//!
+//! ## Auth (security)
+//!
+//! Tokens are read from environment variables (`GITHUB_TOKEN` /
+//! `VESTIGE_GITHUB_TOKEN`, `REDMINE_API_KEY` / `VESTIGE_REDMINE_API_KEY`) and
+//! never from tool arguments, so credentials are not logged in the conversation.
+//! Public GitHub repositories and anonymous Redmine instances can work without a
+//! token/key at lower capability.
+
+use std::sync::Arc;
+
+use serde::Deserialize;
+use serde_json::{Value, json};
+
+use vestige_core::storage::Storage;
+
+/// JSON schema for the `source_sync` tool.
+pub fn schema() -> Value {
+    json!({
+        "type": "object",
+        "properties": {
+            "source": {
+                "type": "string",
+                "enum": ["github", "redmine"],
+                "description": "External system to sync: 'github' (GitHub Issues) or 'redmine' (a Redmine project).",
+                "default": "github"
+            },
+            "repo": {
+                "type": "string",
+                "description": "GitHub only: repository as 'owner/name', e.g. 'samvallad33/vestige'."
+            },
+            "project": {
+                "type": "string",
+                "description": "Redmine only: project identifier (slug or numeric id) to sync. The Redmine host comes from the REDMINE_URL env var."
+            },
+            "reconcile": {
+                "type": "boolean",
+                "description": "Also tombstone local memories for issues no longer visible upstream (an extra full enumeration pass). Default false on incremental syncs.",
+                "default": false
+            },
+            "max_pages": {
+                "type": "integer",
+                "description": "Max API pages to fetch this run (each page is up to 100 issues). Lets a first sync of a large project be resumed across calls. Default 10.",
+                "default": 10,
+                "minimum": 1,
+                "maximum": 1000
+            }
+        },
+        "required": []
+    })
+}
+
+#[derive(Debug, Deserialize)]
+#[serde(rename_all = "camelCase")]
+struct SourceSyncArgs {
+    #[serde(default = "default_source")]
+    source: String,
+    #[serde(default)]
+    repo: Option<String>,
+    #[serde(default)]
+    project: Option<String>,
+    #[serde(default)]
+    reconcile: bool,
+    #[serde(default, alias = "max_pages")]
+    max_pages: Option<usize>,
+}
+
+fn default_source() -> String {
+    "github".to_string()
+}
+
+/// Read the GitHub token from the environment (never from tool args).
+fn github_token() -> Option<String> {
+    std::env::var("GITHUB_TOKEN")
+        .or_else(|_| std::env::var("VESTIGE_GITHUB_TOKEN"))
+        .ok()
+        .filter(|s| !s.trim().is_empty())
+}
+
+/// Read the Redmine API key from the environment (never from tool args).
+fn redmine_api_key() -> Option<String> {
+    std::env::var("REDMINE_API_KEY")
+        .or_else(|_| std::env::var("VESTIGE_REDMINE_API_KEY"))
+        .ok()
+        .filter(|s| !s.trim().is_empty())
+}
+
+/// Read the Redmine base URL from the environment.
+fn redmine_url() -> Option<String> {
+    std::env::var("REDMINE_URL")
+        .or_else(|_| std::env::var("VESTIGE_REDMINE_URL"))
+        .ok()
+        .filter(|s| !s.trim().is_empty())
+}
+
+pub async fn execute(storage: &Arc<Storage>, args: Option<Value>) -> Result<Value, String> {
+    let args: SourceSyncArgs = match args {
+        Some(v) => serde_json::from_value(v).map_err(|e| format!("Invalid arguments: {e}"))?,
+        None => return Err("Missing arguments".to_string()),
+    };
+
+    let max_pages = args.max_pages.unwrap_or(10);
+
+    match args.source.as_str() {
+        "github" => {
+            let repo = args
+                .repo
+                .as_deref()
+                .ok_or_else(|| "github requires a 'repo' ('owner/name')".to_string())?;
+            let (owner, repo) = repo
+                .split_once('/')
+                .filter(|(o, r)| !o.is_empty() && !r.is_empty())
+                .ok_or_else(|| {
+                    "repo must be in 'owner/name' form, e.g. 'samvallad33/vestige'".to_string()
+                })?;
+            execute_github(storage, owner, repo, args.reconcile, max_pages).await
+        }
+        "redmine" => {
+            let project = args
+                .project
+                .as_deref()
+                .filter(|p| !p.trim().is_empty())
+                .ok_or_else(|| "redmine requires a 'project' identifier".to_string())?;
+            let base_url = redmine_url().ok_or_else(|| {
+                "set the REDMINE_URL env var to the Redmine host (e.g. https://redmine.example.com)"
+                    .to_string()
+            })?;
+            execute_redmine(storage, &base_url, project, args.reconcile, max_pages).await
+        }
+        other => Err(format!(
+            "Unsupported source '{other}'. Supported: 'github', 'redmine'."
+        )),
+    }
+}
+
+/// Connectors are feature-gated; surface a clear message when the build omits
+/// them rather than failing obscurely.
+#[cfg(not(feature = "connectors"))]
+async fn execute_github(
+    _storage: &Arc<Storage>,
+    _owner: &str,
+    _repo: &str,
+    _reconcile: bool,
+    _max_pages: usize,
+) -> Result<Value, String> {
+    Err(NO_CONNECTORS_MSG.to_string())
+}
+
+#[cfg(not(feature = "connectors"))]
+async fn execute_redmine(
+    _storage: &Arc<Storage>,
+    _base_url: &str,
+    _project: &str,
+    _reconcile: bool,
+    _max_pages: usize,
+) -> Result<Value, String> {
+    Err(NO_CONNECTORS_MSG.to_string())
+}
+
+#[cfg(not(feature = "connectors"))]
+const NO_CONNECTORS_MSG: &str = "This Vestige build was compiled without the 'connectors' feature. \
+     Rebuild with --features connectors to enable source_sync.";
+
+#[cfg(feature = "connectors")]
+async fn execute_github(
+    storage: &Arc<Storage>,
+    owner: &str,
+    repo: &str,
+    reconcile: bool,
+    max_pages: usize,
+) -> Result<Value, String> {
+    use vestige_core::connectors::github::{GithubConfig, GithubConnector};
+    use vestige_core::connectors::run_sync;
+
+    let config = GithubConfig::new(owner, repo).with_token(github_token());
+    let connector =
+        GithubConnector::new(config).map_err(|e| format!("connector init failed: {e}"))?;
+
+    let report = run_sync(storage.as_ref(), &connector, reconcile, max_pages)
+        .await
+        .map_err(|e| format!("sync failed: {e}"))?;
+
+    let scope = format!("{owner}/{repo}");
+    let total = report.created + report.updated + report.unchanged;
+    let authed = github_token().is_some();
+
+    let summary = format!(
+        "Synced {scope}: {} created, {} updated, {} unchanged{} ({total} records seen{}).",
+        report.created,
+        report.updated,
+        report.unchanged,
+        if report.reconciled {
+            format!(", {} tombstoned", report.tombstoned)
+        } else {
+            String::new()
+        },
+        if authed { "" } else { ", unauthenticated" },
+    );
+
+    Ok(json!({
+        "ok": true,
+        "summary": summary,
+        "source": "github",
+        "scope": scope,
+        "created": report.created,
+        "updated": report.updated,
+        "unchanged": report.unchanged,
+        "tombstoned": report.tombstoned,
+        "reconciled": report.reconciled,
+        "cursor": report.new_cursor.map(|d| d.to_rfc3339()),
+        "authenticated": authed,
+        "warnings": report.warnings,
+        "hint": if total == 0 && !authed {
+            "No records returned. For private repos or higher rate limits, set GITHUB_TOKEN in the server environment."
+        } else if report.new_cursor.is_some() && total >= 100 {
+            "More may remain — run source_sync again to continue from the saved cursor."
+        } else {
+            "Search these with the normal search tools; results cite the GitHub issue URL."
+        }
+    }))
+}
+
+#[cfg(feature = "connectors")]
+async fn execute_redmine(
+    storage: &Arc<Storage>,
+    base_url: &str,
+    project: &str,
+    reconcile: bool,
+    max_pages: usize,
+) -> Result<Value, String> {
+    use vestige_core::connectors::redmine::{RedmineConfig, RedmineConnector};
+    use vestige_core::connectors::run_sync;
+
+    let config = RedmineConfig::new(base_url, project).with_api_key(redmine_api_key());
+    let connector =
+        RedmineConnector::new(config).map_err(|e| format!("connector init failed: {e}"))?;
+
+    let report = run_sync(storage.as_ref(), &connector, reconcile, max_pages)
+        .await
+        .map_err(|e| format!("sync failed: {e}"))?;
+
+    let total = report.created + report.updated + report.unchanged;
+    let authed = redmine_api_key().is_some();
+
+    let summary = format!(
+        "Synced redmine project '{project}': {} created, {} updated, {} unchanged{} ({total} records seen{}).",
+        report.created,
+        report.updated,
+        report.unchanged,
+        if report.reconciled {
+            format!(", {} tombstoned", report.tombstoned)
+        } else {
+            String::new()
+        },
+        if authed { "" } else { ", anonymous" },
+    );
+
+    Ok(json!({
+        "ok": true,
+        "summary": summary,
+        "source": "redmine",
+        "scope": project,
+        "created": report.created,
+        "updated": report.updated,
+        "unchanged": report.unchanged,
+        "tombstoned": report.tombstoned,
+        "reconciled": report.reconciled,
+        "cursor": report.new_cursor.map(|d| d.to_rfc3339()),
+        "authenticated": authed,
+        "warnings": report.warnings,
+        "hint": if total == 0 && !authed {
+            "No records returned. Set REDMINE_API_KEY (and confirm the REST API is enabled on the instance) for private projects."
+        } else if report.new_cursor.is_some() && total >= 100 {
+            "More may remain — run source_sync again to continue from the saved cursor."
+        } else {
+            "Search these with the normal search tools; results cite the Redmine issue URL."
+        }
+    }))
+}
--- a/crates/vestige-mcp/src/tools/suppress.rs
+++ b/crates/vestige-mcp/src/tools/suppress.rs
@ -177,6 +177,7 @@ mod tests {
                tags: vec!["test".to_string()],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap()
            .id
--- a/crates/vestige-mcp/src/tools/timeline.rs
+++ b/crates/vestige-mcp/src/tools/timeline.rs
@ -209,6 +209,7 @@ mod tests {
                tags: vec!["timeline-test".to_string()],
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();
    }
@ -226,6 +227,7 @@ mod tests {
                tags: tags.iter().map(|t| t.to_string()).collect(),
                valid_from: None,
                valid_until: None,
+                source_envelope: None,
            })
            .unwrap();
    }
--- a/docs/CONNECTORS.md
+++ b/docs/CONNECTORS.md
@ -0,0 +1,206 @@
+# External-Source Connectors
+
+> Status: **v2.1.27** — GitHub Issues + Redmine reference connectors, plus
+> source-aware investigation filters for search. Tracking issue:
+> [#57](https://github.com/samvallad33/vestige/issues/57).
+
+Connectors let Vestige act as a durable, local **retrieval and reasoning layer**
+over a long-lived external system — a ticket tracker, an issue board, a support
+queue — **without replacing it**. The external system stays the source of truth.
+Vestige indexes its records, embeds them for semantic recall, links them into the
+memory graph, and **cites back** to the canonical record.
+
+## Why this is different from a ticket-system MCP
+
+The official GitHub / Jira MCP servers are **live API proxies**: every query hits
+the upstream API, is rate-limited, keyword-only, online-only, and has no memory
+of past state. Vestige instead keeps a **durable local index** of the records, so
+you can:
+
+- search the history **offline** and **semantically** (embeddings, not just
+  keywords),
+- **join** ticket history with the rest of your memory in one search,
+- see a **point-in-time** view (records carry temporal validity),
+- and re-sync **idempotently** — re-running never duplicates a record.
+
+## Quick start (GitHub Issues)
+
+1. (Optional but recommended) export a token so you get the authenticated rate
+   limit (5,000 req/hr vs 60 for anonymous) and access to private repos:
+
+   ```sh
+   export GITHUB_TOKEN=ghp_xxx   # or VESTIGE_GITHUB_TOKEN
+   ```
+
+   The token is read **only** from the environment — never passed as a tool
+   argument, never logged.
+
+2. Ask your agent to run the `source_sync` MCP tool:
+
+   ```json
+   { "repo": "samvallad33/vestige" }
+   ```
+
+3. Search as normal. Connector-sourced results carry a `sourceRecord` object with
+   the canonical issue URL:
+
+   ```json
+   {
+     "content": "[samvallad33/vestige#57] Roadmap: external source connectors …",
+     "sourceRecord": {
+       "system": "github",
+       "id": "57",
+       "url": "https://github.com/samvallad33/vestige/issues/57",
+       "project": "samvallad33/vestige",
+       "type": "issue",
+       "author": "samvallad33",
+       "tombstoned": false
+     }
+   }
+   ```
+
+## Quick start (Redmine)
+
+Redmine stays the system of record; Vestige indexes a project's issues +
+journals (comments and status/assignment history).
+
+1. Point Vestige at the Redmine host and key (env only, never tool args):
+
+   ```sh
+   export REDMINE_URL=https://redmine.example.com
+   export REDMINE_API_KEY=xxxxxxxx   # or VESTIGE_REDMINE_API_KEY
+   ```
+
+   The instance must have the REST API enabled (Administration → Settings → API)
+   or every call returns 401/403 even with a valid key.
+
+2. Run `source_sync`:
+
+   ```json
+   { "source": "redmine", "project": "infra" }
+   ```
+
+   Results cite the canonical `https://redmine.example.com/issues/<id>` URL.
+
+## The `source_sync` tool
+
+| Field | Type | Default | Meaning |
+|---|---|---|---|
+| `source` | string | `github` | `github` or `redmine`. |
+| `repo` | string | — | **GitHub:** `owner/name`, e.g. `samvallad33/vestige`. |
+| `project` | string | — | **Redmine:** project identifier (host from `REDMINE_URL`). |
+| `reconcile` | bool | `false` | Also tombstone local memories for issues no longer visible upstream (an extra full-enumeration pass). |
+| `max_pages` | int | `10` | API pages to fetch this run (≤100 issues each). Lets a first sync of a large project resume across calls. |
+
+The tool returns counts (`created` / `updated` / `unchanged` / `tombstoned`),
+the saved `cursor`, whether it ran authenticated, and a `hint` for the next step.
+
+## Investigation filters (Phase 4)
+
+`search` accepts source-aware filters so an agent can scope a query to indexed
+records. All are optional post-filters; combine with a larger `limit` if you
+expect heavy thinning. A source-scoped query excludes non-connector memories.
+
+| Filter | Matches |
+|---|---|
+| `source_system` | `github`, `redmine`, … |
+| `source_project` | repo / project (exact) |
+| `source_id` | a specific issue/ticket id |
+| `source_type` | `issue`, `comment`, … |
+| `source_author` | reporter/author (not assignee) |
+| `source_updated_after` / `source_updated_before` | RFC3339 date range (inclusive) |
+| `source_status` | `valid` (default `any`) or `tombstoned` |
+
+Status, tracker, and priority are filterable through the existing `tag_prefix`
+(the connectors emit lowercase `status:`, `tracker:`, `priority:`, and GitHub
+`label:` / `state:` tags) — e.g. `tag_prefix: "status:open"`. Assignee and
+linked-issue graph traversal are not yet exposed (see below).
+
+### Idempotent, incremental sync
+
+Each run:
+
+1. resumes from the saved cursor (the high-water mark on the record's upstream
+   update time), minus a small overlap window so same-second / clock-skewed
+   updates are never missed;
+2. pages issues in ascending update order (`state=all`, so closing an issue is
+   **not** mistaken for a deletion), folding each issue + its comments into one
+   memory;
+3. routes each record through an **idempotent upsert** keyed on
+   `(source_system, source_id)`:
+   - unseen record → **insert**,
+   - changed content (by content hash) → **update in place** + re-embed,
+   - unchanged content → **no-op** (only the "last seen" time advances);
+4. advances and persists the cursor only after the run, so an interruption
+   re-scans rather than skips.
+
+Re-running `source_sync` on the same repo is therefore safe and cheap — it picks
+up only what changed.
+
+### Deletions (tombstoning)
+
+Neither GitHub nor Redmine exposes a deletion feed, so an incremental sync can
+never *see* a delete. Pass `reconcile: true` to run a reconciliation pass: Vestige
+enumerates the currently-visible issue ids and **invalidates** (does not purge)
+any local record no longer present. A tombstoned record keeps its content for
+audit but drops out of "currently valid" retrieval (`sourceRecord.tombstoned` is
+`true`). If the record reappears upstream, the next sync un-tombstones it.
+
+## The source envelope
+
+Every connector-ingested memory carries structured provenance, distinct from the
+legacy free-form `source` label:
+
+| Field | Purpose |
+|---|---|
+| `source_system` | `github`, `redmine`, … (namespaces ids). |
+| `source_id` | Native id (issue number, ticket id). |
+| `source_url` | Canonical link back — the citation. |
+| `source_updated_at` | Upstream update time (the sync cursor field). |
+| `content_hash` | Change detector → idempotency. |
+| `synced_at` | When the connector last saw the record live. |
+| `source_project` | Repo / project / space. |
+| `source_type` | `issue`, `comment`, … |
+| `source_author` | Reporter / author upstream. |
+
+`(source_system, source_id)` is enforced unique, so there is exactly one memory
+per external record. Legacy memories (agent- or user-authored) have no envelope
+and are completely unaffected.
+
+## Building
+
+The connector HTTP client is behind the `connectors` cargo feature, which is
+**on by default in the MCP server** (`vestige-mcp`). A build without it still
+exposes the `source_sync` tool but returns a clear "rebuild with `--features
+connectors`" message. The core library (`vestige-core`) leaves the feature
+**off** by default, so library consumers that don't need connectors link no HTTP
+client.
+
+```sh
+# default MCP build already includes connectors
+cargo build -p vestige-mcp --release
+
+# explicit, or for the core lib
+cargo build -p vestige-core --features connectors
+```
+
+## Writing a new connector
+
+Implement the `Connector` trait in `vestige_core::connectors` (fetch a window of
+records updated since a cursor, page forward, and optionally enumerate live ids
+for reconciliation), produce `NormalizedRecord`s with a filled
+`SourceEnvelope`, and hand them to `run_sync`. Two reference connectors show the
+shape — `crates/vestige-core/src/connectors/github.rs` (Link-header pagination,
+opaque-url cursor) and `crates/vestige-core/src/connectors/redmine.rs`
+(offset pagination, two-phase list-then-detail fetch). The sync driver,
+idempotent upsert, cursor checkpointing, and tombstone reconciliation are all
+reused for free.
+
+## Not yet supported
+
+- **Assignee filter** — the envelope stores `source_author` (reporter) only; no
+  assignee column yet.
+- **Tracker / version dedicated filter params** — reachable today via
+  `tag_prefix` (`tracker:`, and `version:`/`category:` when emitted).
+- **Linked-issue graph traversal** — connectors import relations into the memory
+  body, but issue-to-issue graph edges are not yet exposed in search.
--- a/tests/e2e/src/harness/db_manager.rs
+++ b/tests/e2e/src/harness/db_manager.rs
@ -31,6 +31,7 @@ fn make_ingest_input(
        source,
        valid_from,
        valid_until,
+        source_envelope: None,
    }
 }

--- a/tests/e2e/src/mocks/fixtures.rs
+++ b/tests/e2e/src/mocks/fixtures.rs
@ -29,6 +29,7 @@ fn make_ingest_input(
        source,
        valid_from,
        valid_until,
+        source_envelope: None,
    }
 }