mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-10 08:05:14 +02:00
feat(ingest): default local ingest to isolated diffs (#128)
* docs: add isolated-diff ingestion design
* Refine isolated-diff ingestion design after adversarial review iteration 1
* Refine isolated-diff ingestion design after adversarial review iteration 2
* Refine isolated-diff ingestion design after adversarial review iteration 3
* feat: persist ingest trace events
* feat: add isolated ingest patch helpers
* feat: validate wiki body semantic references
* feat: add final ingest artifact gates
* feat: execute ingest work units in child worktrees
* feat: integrate isolated work unit patches
* feat: route selected ingest sources through isolated diffs
* test: cover isolated diff ingestion regressions
* feat: add isolated diff ingestion v1 core
* docs: document ingest trace inspection
* docs: add isolated diff ingestion v1 core plan
* fix(ingest): tighten final artifact gates
* fix(ingest): gate isolated final integration tree
* fix(ingest): persist postmortem failure traces
* fix(ingest): trace policy conflicts and cleanup child worktrees
* test(ingest): verify isolated diff postmortem coverage
* docs: add isolated diff ingestion gates and trace closure plan
* fix(ingest): gate provenance before isolated diff squash
* docs: add isolated diff ingestion provenance gate closure plan
* fix(ingest): gate final wiki references
* fix(ingest): enforce SL target connection scope
* fix(ingest): trace isolated SL target policy gates
* test(ingest): cover isolated diff reference and target gates
* chore(ingest): verify isolated diff gate closure
* docs: add isolated diff ingestion reference and target gate closure plan
* fix(ingest): gate global wiki references
* docs: add isolated diff ingestion global wiki reference gate closure plan
* fix(ingest): validate scan sources and wiki refs
* test(ingest): cover isolated diff textual conflict resolver
* test(ingest): cover isolated diff resolver integration
* feat(ingest): repair isolated diff textual conflicts
* feat(ingest): report isolated diff resolver outcomes
* test(ingest): verify isolated diff textual conflict repair
* test(ingest): align textual conflict failure coverage
* docs: add isolated diff textual conflict resolver plan
* test(ingest): cover isolated diff gate repair
* feat(ingest): add isolated diff gate repair agent
* feat(ingest): repair isolated diff semantic gate failures
* feat(ingest): wire isolated diff gate repair
* test(ingest): verify isolated diff final gate repair
* chore(ingest): verify isolated diff gate repair
* docs: add isolated diff gate repair plan
* Improve ingest progress updates
* feat(ingest): route direct-write connectors through isolated diffs
* test(ingest): cover non-metabase isolated diff routing
* feat(ingest): project metricflow semantic models before work units
* test(ingest): verify metricflow isolated projection path
* chore(ingest): verify isolated diff connector migration
* docs: add isolated diff connector migration plan
* feat(ingest): make isolated diff routing the private default
* feat(ingest): promote isolated diff to default runner path
* feat(ingest): default local ingest to isolated diffs
* chore(ingest): remove isolated diff allowlist references
* fix(ingest): preserve transient evidence for isolated work units
* docs: add isolated diff default promotion plan
* refactor(ingest): remove shared worktree WorkUnit path
* docs(ingest): align WorkUnit prompts with isolated diffs
* test(ingest): drop unused runner import
* docs: add isolated diff shared worktree removal plan
* docs: add isolated diff gate repair classification plan
* fix: restrict claude-code mcp servers
* docs: align ingest trace guidance with public CLI

---------

Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
This commit is contained in:
parent
d1c84e5564
commit
e64da5a85d
66 changed files with 22346 additions and 514 deletions
2938
docs/superpowers/plans/2026-05-17-isolated-diff-ingestion-v1-core.md
Normal file
File diff suppressed because it is too large
Load diff
@@ -0,0 +1,493 @@
# Isolated Diff Ingestion V1 Global Wiki Reference Gate Closure Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use
> superpowers:subagent-driven-development (recommended) or
> superpowers:executing-plans to implement this plan task-by-task. Steps use
> checkbox (`- [ ]`) syntax for tracking.

**Goal:** Reject final trees where an isolated-diff run changes semantic-layer
sources or deletes wiki pages and leaves pre-existing wiki pages with stale
body, `sl_refs`, frontmatter `refs`, or inline `[[page-key]]` references.

**Architecture:** Keep `artifact-gates.ts` validation-only. The runner expands
the final wiki gate scope before the existing final artifact gate: changed pages
are always validated, and all global wiki pages are validated when the run
changes any semantic-layer source or removes any wiki page. The final-gate trace
records the expanded scope and why it was expanded.

**Tech Stack:** TypeScript, Vitest, pnpm workspace commands, existing
`IngestBundleRunner`, `KnowledgeWikiService`, and isolated-diff test fixtures.

---

## Audit Summary

The implemented isolated-diff plans cover the core v1 flow: child worktrees,
binary no-rename patch proposals, `git apply --3way --index`, policy rejection,
final gates after reconciliation and repair, pre-squash provenance raw-path
validation, target-connection enforcement, failed reports, and persistent JSONL
traces.

One v1-blocking correctness gap remains. Final wiki gates currently validate
only wiki pages changed by the run. They do not validate unchanged pages that
become invalid because the run changes a semantic-layer source or deletes a
referenced wiki page. Two concrete failures can therefore squash into main:

- A pre-existing wiki page body contains
  `` `mart_account_segments.total_contract_arr_cents` `` while the run updates
  `semantic-layer/warehouse/mart_account_segments.yaml` to define only
  `total_contract_arr`.
- A pre-existing wiki page has `refs: [source-page]` or `[[source-page]]` while
  the run deletes `wiki/global/source-page.md`.

This plan does not expand connector rollout, promote isolated diffs to the
default, add interactive resolution, add semantic auto-merge, remove the old
path, expand transitive semantic-layer dependencies, or move provenance into
files.

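The first failure mode in the audit can be illustrated with a small, self-contained sketch. The helper name `findStaleBodyRefs` and the `SourceFields` shape are hypothetical and are not taken from the project's wiki body reference validator; the sketch only assumes that body references appear as inline code of the form `source.field`.

```typescript
// Hypothetical helper (not the project's validator): report `source.field`
// inline-code tokens whose field is no longer defined on a known source.
type SourceFields = Map<string, Set<string>>;

function findStaleBodyRefs(body: string, sources: SourceFields): string[] {
  const stale: string[] = [];
  const inlineCode = /`([A-Za-z_]\w*)\.([A-Za-z_]\w*)`/g;
  let match: RegExpExecArray | null;
  while ((match = inlineCode.exec(body)) !== null) {
    const source = match[1];
    const field = match[2];
    const fields = sources.get(source);
    // Only flag references to known sources; other prefixes may be ordinary code.
    if (fields !== undefined && !fields.has(field)) {
      stale.push(`${source}.${field}`);
    }
  }
  return stale;
}

// Final tree after the run: the measure was renamed to total_contract_arr.
const sourcesAfterRun: SourceFields = new Map([
  ['mart_account_segments', new Set(['account_id', 'total_contract_arr'])],
]);
const unchangedBody = 'Existing ARR uses `mart_account_segments.total_contract_arr_cents`.';
const staleRefs = findStaleBodyRefs(unchangedBody, sourcesAfterRun);
```

The unchanged page is the one that goes stale here, which is why validating only changed pages cannot catch it.
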
## File Structure

- Modify `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`.
  Adds two failing end-to-end regressions for unchanged wiki pages made stale by
  semantic-layer changes and wiki-page deletion.
- Modify `packages/context/src/ingest/ingest-bundle.runner.ts`.
  Adds a final wiki gate scope helper, expands validation to all global wiki
  pages when final state changes can invalidate unchanged references, and records
  scope details in the final-gate trace and failed report.

---

### Task 1: Add failing unchanged wiki regressions

**Files:**
- Modify: `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`

- [ ] **Step 1: Add the stale existing wiki body regression**

Insert this test inside `describe('IngestBundleRunner isolated diff path', ...)`
after the existing Metabase stale-measure regression:

```ts
it('rejects unchanged wiki body refs made stale by isolated semantic-layer changes', async () => {
  const runtime = await makeRealGitRuntime();
  try {
    // Seed main with a semantic-layer source and a wiki page whose body cites its measure.
    await mkdir(join(runtime.configDir, 'semantic-layer/warehouse'), { recursive: true });
    await mkdir(join(runtime.configDir, 'wiki/global'), { recursive: true });
    await writeFile(
      join(runtime.configDir, 'semantic-layer/warehouse/mart_account_segments.yaml'),
      'name: mart_account_segments\ngrain: [account_id]\ncolumns: [{name: account_id, type: string}]\njoins: []\nmeasures:\n  - name: total_contract_arr_cents\n    expr: sum(contract_arr)\n',
    );
    await writeFile(
      join(runtime.configDir, 'wiki/global/account-segments.md'),
      '---\nsummary: Account segments\nusage_mode: auto\n---\n\nExisting ARR uses `mart_account_segments.total_contract_arr_cents`.\n',
    );
    await runtime.git.commitFiles(
      ['semantic-layer/warehouse/mart_account_segments.yaml', 'wiki/global/account-segments.md'],
      'seed existing wiki body ref',
      'KTX Test',
      'system@ktx.local',
    );
    const preRunHead = await runtime.git.revParseHead();

    const { deps, adapter } = makeDeps(runtime);
    adapter.chunk.mockResolvedValue({
      workUnits: [{ unitKey: 'source-only', rawFiles: ['cards/source.json'], peerFileIndex: [], dependencyPaths: [] }],
    });

    let currentSession: any = null;
    deps.toolsetFactory.createIngestWuToolset = vi.fn((toolSession: any) => {
      currentSession = toolSession;
      return { toRuntimeTools: vi.fn(() => ({})) };
    });
    // The work unit renames the measure but never touches the wiki page.
    deps.agentRunner.runLoop = vi.fn(async () => {
      const root = rootOfConfig(currentSession.configService, runtime.configDir);
      await writeFile(
        join(root, 'semantic-layer/warehouse/mart_account_segments.yaml'),
        'name: mart_account_segments\ngrain: [account_id]\ncolumns: [{name: account_id, type: string}]\njoins: []\nmeasures:\n  - name: total_contract_arr\n    expr: sum(contract_arr)\n',
      );
      addTouchedSlSource(currentSession.touchedSlSources, 'warehouse', 'mart_account_segments');
      currentSession.actions.push({
        target: 'sl',
        type: 'updated',
        key: 'mart_account_segments',
        detail: 'Rename ARR measure',
        targetConnectionId: 'warehouse',
        rawPaths: ['cards/source.json'],
      });
      await currentSession.gitService.commitFiles(
        ['semantic-layer/warehouse/mart_account_segments.yaml'],
        'wu source rename',
        'KTX Test',
        'system@ktx.local',
      );
      return { stopReason: 'natural' };
    }) as never;

    const runner = new IngestBundleRunner(deps);
    await mockStageRawFiles(runner, runtime, [['cards/source.json', 'h1']]);

    await expect(
      runner.run({
        jobId: 'job-existing-body-stale',
        connectionId: 'warehouse',
        sourceKey: 'metabase',
        trigger: 'upload',
        bundleRef: { kind: 'upload', uploadId: 'upload' },
      }),
    ).rejects.toThrow(/total_contract_arr_cents/);

    // Main must be untouched, and the trace must show a gate failure instead of a squash.
    expect(await runtime.git.revParseHead()).toBe(preRunHead);
    const trace = await readFile(join(runtime.configDir, '.ktx/ingest-traces/job-existing-body-stale/trace.jsonl'), 'utf-8');
    expect(trace).toContain('final_artifact_gates_failed');
    expect(trace).toContain('account-segments');
    expect(trace).toContain('semantic_layer_changed');
    expect(trace).toContain('ingest_failed');
    expect(trace).toContain('failure_report_created');
    expect(trace).not.toContain('squash_finished');
  } finally {
    await rm(runtime.homeDir, { recursive: true, force: true });
  }
});
```

- [ ] **Step 2: Add the stale existing wiki page-reference regression**

Insert this test near the existing final wiki reference regression:

```ts
it('rejects unchanged inbound wiki refs broken by an isolated wiki deletion', async () => {
  const runtime = await makeRealGitRuntime();
  try {
    // Seed main with a target page and a second page that references it twice.
    await mkdir(join(runtime.configDir, 'wiki/global'), { recursive: true });
    await writeFile(
      join(runtime.configDir, 'wiki/global/source-page.md'),
      '---\nsummary: Source page\nusage_mode: auto\n---\n\nSource page\n',
    );
    await writeFile(
      join(runtime.configDir, 'wiki/global/account-segments.md'),
      '---\nsummary: Account segments\nusage_mode: auto\nrefs:\n - source-page\n---\n\nSee [[source-page]].\n',
    );
    await runtime.git.commitFiles(
      ['wiki/global/source-page.md', 'wiki/global/account-segments.md'],
      'seed inbound wiki refs',
      'KTX Test',
      'system@ktx.local',
    );
    const preRunHead = await runtime.git.revParseHead();

    const { deps, adapter } = makeDeps(runtime);
    adapter.chunk.mockResolvedValue({
      workUnits: [{ unitKey: 'delete-target-page', rawFiles: ['pages/delete.json'], peerFileIndex: [], dependencyPaths: [] }],
    });

    let currentSession: any = null;
    deps.toolsetFactory.createIngestWuToolset = vi.fn((toolSession: any) => {
      currentSession = toolSession;
      return { toRuntimeTools: vi.fn(() => ({})) };
    });
    // The work unit deletes the referenced page and leaves the referencing page unchanged.
    deps.agentRunner.runLoop = vi.fn(async () => {
      const root = rootOfConfig(currentSession.configService, runtime.configDir);
      await rm(join(root, 'wiki/global/source-page.md'), { force: true });
      currentSession.actions.push({
        target: 'wiki',
        type: 'removed',
        key: 'source-page',
        detail: 'Delete referenced page',
        rawPaths: ['pages/delete.json'],
      });
      await currentSession.gitService.commitFiles(
        ['wiki/global/source-page.md'],
        'wu delete target page',
        'KTX Test',
        'system@ktx.local',
      );
      return { stopReason: 'natural' };
    }) as never;

    const runner = new IngestBundleRunner(deps);
    await mockStageRawFiles(runner, runtime, [['pages/delete.json', 'h1']]);

    await expect(
      runner.run({
        jobId: 'job-existing-wiki-ref-stale',
        connectionId: 'warehouse',
        sourceKey: 'metabase',
        trigger: 'upload',
        bundleRef: { kind: 'upload', uploadId: 'upload' },
      }),
    ).rejects.toThrow(/wiki references target missing page\(s\): account-segments -> source-page/);

    // Main must be untouched, and the trace must show a gate failure instead of a squash.
    expect(await runtime.git.revParseHead()).toBe(preRunHead);
    const trace = await readFile(join(runtime.configDir, '.ktx/ingest-traces/job-existing-wiki-ref-stale/trace.jsonl'), 'utf-8');
    expect(trace).toContain('final_artifact_gates_failed');
    expect(trace).toContain('account-segments -> source-page');
    expect(trace).toContain('wiki_page_removed');
    expect(trace).toContain('ingest_failed');
    expect(trace).toContain('failure_report_created');
    expect(trace).not.toContain('squash_finished');
  } finally {
    await rm(runtime.homeDir, { recursive: true, force: true });
  }
});
```

- [ ] **Step 3: Run the focused regressions and verify they fail**

Run:

```bash
pnpm --filter @ktx/context exec vitest run src/ingest/ingest-bundle.runner.isolated-diff.test.ts -t "unchanged wiki body refs|unchanged inbound wiki refs"
```

Expected: FAIL. The stale body test currently squashes successfully because the
unchanged `account-segments` page is not in `finalChangedWikiPageKeys`. The
inbound wiki ref test currently squashes successfully because the deleted
`source-page` is validated as a missing changed page and skipped, while the
unchanged page that references it is never validated.

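The inbound-reference failure can be sketched as follows. This is a minimal illustration under stated assumptions: `WikiPage` and `findMissingWikiTargets` are hypothetical names, and only the `page -> target` message shape is borrowed from the regression above. It shows that the broken reference is found only when the referencing page is part of the validated set.

```typescript
// Hypothetical sketch (not the project's gate): report references whose
// target page no longer exists among the pages in the final tree.
interface WikiPage {
  key: string;
  frontmatterRefs: string[];
  body: string;
}

function findMissingWikiTargets(pagesToValidate: WikiPage[], existingKeys: Set<string>): string[] {
  const missing: string[] = [];
  for (const page of pagesToValidate) {
    const targets = page.frontmatterRefs.slice();
    const inlineRef = /\[\[([^\]]+)\]\]/g;
    let match: RegExpExecArray | null;
    while ((match = inlineRef.exec(page.body)) !== null) {
      targets.push(match[1]);
    }
    for (const target of targets) {
      const entry = `${page.key} -> ${target}`;
      // Deduplicate: the same target may appear in frontmatter and inline.
      if (!existingKeys.has(target) && missing.indexOf(entry) === -1) {
        missing.push(entry);
      }
    }
  }
  return missing;
}

// Final tree after the run: source-page was deleted, account-segments is unchanged.
const accountSegments: WikiPage = {
  key: 'account-segments',
  frontmatterRefs: ['source-page'],
  body: 'See [[source-page]].',
};
const finalKeys = new Set(['account-segments']);

// Changed-pages-only scope: nothing left to validate, so nothing is reported.
const narrowScope = findMissingWikiTargets([], finalKeys);
// Global scope: the unchanged referencing page is validated and the gap surfaces.
const globalScope = findMissingWikiTargets([accountSegments], finalKeys);
```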
---

### Task 2: Expand the final wiki validation scope

**Files:**
- Modify: `packages/context/src/ingest/ingest-bundle.runner.ts`

- [ ] **Step 1: Add final wiki gate scope helpers**

Add these private methods after `uniqueTouchedSlSources()`:

```ts
private removedWikiPageKeysFromActions(actions: MemoryAction[]): string[] {
  return this.uniqueWikiPageKeys(
    actions.filter((action) => action.target === 'wiki' && action.type === 'removed').map((action) => action.key),
  );
}

private async wikiPageKeysForFinalGates(input: {
  wikiService: ReturnType<KnowledgeWikiService['forWorktree']>;
  changedWikiPageKeys: string[];
  touchedSlSources: TouchedSlSource[];
  actions: MemoryAction[];
}): Promise<{
  pageKeys: string[];
  trace: {
    global: boolean;
    reasons: string[];
    changedWikiPageKeys: string[];
    removedWikiPageKeys: string[];
    pageKeysValidated: string[];
  };
}> {
  const changedWikiPageKeys = this.uniqueWikiPageKeys(input.changedWikiPageKeys);
  const removedWikiPageKeys = this.removedWikiPageKeysFromActions(input.actions);
  // Each reason marks a change that can invalidate pages the run did not touch.
  const reasons: string[] = [];
  if (input.touchedSlSources.length > 0) {
    reasons.push('semantic_layer_changed');
  }
  if (removedWikiPageKeys.length > 0) {
    reasons.push('wiki_page_removed');
  }

  // Changed pages are always validated; any reason widens the scope to all global pages.
  let pageKeys = changedWikiPageKeys;
  if (reasons.length > 0) {
    pageKeys = this.uniqueWikiPageKeys([
      ...changedWikiPageKeys,
      ...(await input.wikiService.listPageKeys('GLOBAL', null)),
    ]);
  }

  return {
    pageKeys,
    trace: {
      global: reasons.length > 0,
      reasons,
      changedWikiPageKeys,
      removedWikiPageKeys,
      pageKeysValidated: pageKeys,
    },
  };
}
```

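To make the scope rule easy to check by hand, here is a standalone worked example. It is a simplified sketch, not the runner method: `finalWikiGateScope`, `ScopeInput`, and the precomputed `allGlobalPageKeys` list are hypothetical stand-ins for the service calls used in the helper.

```typescript
// Hypothetical pure-function restatement of the scope rule for illustration.
type ScopeReason = 'semantic_layer_changed' | 'wiki_page_removed';

interface ScopeInput {
  changedWikiPageKeys: string[];
  removedWikiPageKeys: string[];
  touchedSlSourceCount: number;
  allGlobalPageKeys: string[]; // stands in for the wiki service page listing
}

function finalWikiGateScope(input: ScopeInput): { global: boolean; reasons: ScopeReason[]; pageKeys: string[] } {
  const reasons: ScopeReason[] = [];
  if (input.touchedSlSourceCount > 0) reasons.push('semantic_layer_changed');
  if (input.removedWikiPageKeys.length > 0) reasons.push('wiki_page_removed');

  const global = reasons.length > 0;
  const candidates = global
    ? input.changedWikiPageKeys.concat(input.allGlobalPageKeys)
    : input.changedWikiPageKeys;
  // Deduplicate while keeping first-seen order.
  const pageKeys = candidates.filter((key, index) => candidates.indexOf(key) === index);
  return { global, reasons, pageKeys };
}

// A wiki-only edit keeps the narrow scope.
const wikiOnly = finalWikiGateScope({
  changedWikiPageKeys: ['pricing'],
  removedWikiPageKeys: [],
  touchedSlSourceCount: 0,
  allGlobalPageKeys: ['pricing', 'account-segments'],
});

// A semantic-layer change with no wiki edits still validates every global page.
const slChange = finalWikiGateScope({
  changedWikiPageKeys: [],
  removedWikiPageKeys: [],
  touchedSlSourceCount: 1,
  allGlobalPageKeys: ['pricing', 'account-segments'],
});
```

The second case corresponds to the stale-body regression: `account-segments` enters the validated set even though the run never edited it.
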
- [ ] **Step 2: Use the expanded scope before final gates**

In `runInner()`, replace the current `finalChangedWikiPageKeys` and
`finalTouchedSlSources` block with this code:

```ts
const baseFinalChangedWikiPageKeys = this.uniqueWikiPageKeys([
  ...(isolatedDiffEnabled ? projectionChangedWikiPageKeys : []),
  ...workUnitOutcomes
    .flatMap((outcome) => outcome.patchTouchedPaths ?? [])
    .flatMap((path) => this.wikiPageKeysFromPaths([path])),
  ...this.wikiPageKeysFromActions(reconcileActions),
  ...postReconciliationPaths.flatMap((path) => this.wikiPageKeysFromPaths([path])),
  ...wikiSlRefRepairResult.repairs.filter((repair) => repair.scope === 'GLOBAL').map((repair) => repair.pageKey),
]);
const finalTouchedSlSources = this.uniqueTouchedSlSources([
  ...(isolatedDiffEnabled ? projectionTouchedSources : []),
  ...workUnitOutcomes.flatMap((outcome) => outcome.touchedSlSources),
  ...this.touchedSlSourcesFromActions(reconcileActions, job.connectionId),
  ...this.touchedSlSourcesFromPaths(postReconciliationPaths),
  ...(postProcessorOutcome?.touchedSources ?? []),
]);
const finalWikiGateScope = await this.wikiPageKeysForFinalGates({
  wikiService: this.deps.wikiService.forWorktree(sessionWorktree.workdir),
  changedWikiPageKeys: baseFinalChangedWikiPageKeys,
  touchedSlSources: finalTouchedSlSources,
  actions: [...stageIndex.workUnits.flatMap((wu) => wu.actions), ...reconcileActions],
});
const finalChangedWikiPageKeys = finalWikiGateScope.pageKeys;
```

This keeps the existing variable name used by `validateFinalIngestArtifacts()`,
but the value now means "wiki page keys to validate in final gates."

- [ ] **Step 3: Add scope details to final-gate trace data**

In the `finalArtifactGateTraceData` object, add the
`wikiReferenceGateScope` field:

```ts
const finalArtifactGateTraceData = {
  changedWikiPageKeys: finalChangedWikiPageKeys,
  wikiReferenceGateScope: finalWikiGateScope.trace,
  touchedSlSources: finalTouchedSlSources,
  projectionTouchedPaths,
  workUnitPatchTouchedPaths: workUnitOutcomes.flatMap((outcome) => outcome.patchTouchedPaths ?? []),
  preReconciliationSha,
  postReconciliationSha,
  postReconciliationPaths,
  reconciliationActionCount: reconcileActions.length,
  wikiSlRefRepairCount: wikiSlRefRepairResult.repairs.length,
};
```

The failure report already stores `activeFailureDetails`, so this trace data
also becomes persistent failed-report context when final gates fail.

- [ ] **Step 4: Run the focused regressions and verify they pass**

Run:

```bash
pnpm --filter @ktx/context exec vitest run src/ingest/ingest-bundle.runner.isolated-diff.test.ts -t "unchanged wiki body refs|unchanged inbound wiki refs"
```

Expected: PASS. Both traces include `final_artifact_gates_failed` and
`failure_report_created`, omit `squash_finished`, and record
`wikiReferenceGateScope` with the reason `semantic_layer_changed` (stale body
test) or `wiki_page_removed` (inbound wiki ref test).

---

### Task 3: Verification and commit

**Files:**
- Verify: `packages/context/src/ingest/ingest-bundle.runner.ts`
- Verify: `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`

- [ ] **Step 1: Run the isolated-diff focused suite**

Run:

```bash
pnpm --filter @ktx/context exec vitest run \
  src/ingest/ingest-bundle.runner.isolated-diff.test.ts \
  src/ingest/artifact-gates.test.ts \
  src/ingest/wiki-body-refs.test.ts \
  src/ingest/semantic-layer-target-policy.test.ts \
  src/ingest/isolated-diff/git-patch.test.ts \
  src/ingest/isolated-diff/patch-integrator.test.ts \
  src/ingest/isolated-diff/work-unit-executor.test.ts \
  src/core/git.service.patch.test.ts
```

Expected: PASS.

- [ ] **Step 2: Type-check the context package**

Run:

```bash
pnpm --filter @ktx/context run type-check
```

Expected: PASS.

- [ ] **Step 3: Run dead-code analysis**

Run:

```bash
pnpm run dead-code
```

Expected: PASS, or only pre-existing findings unrelated to
`packages/context/src/ingest/ingest-bundle.runner.ts` and
`packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`.
Investigate any new finding before committing.

- [ ] **Step 4: Verify trace acceptance criteria**

Open the traces produced by the two new failing-run tests and confirm these
events and fields exist:

```text
job-existing-body-stale:
- final_artifact_gates_started
- final_artifact_gates_failed
- ingest_failed
- failure_report_created
- no squash_finished
- wikiReferenceGateScope.global is true
- wikiReferenceGateScope.reasons includes semantic_layer_changed
- wikiReferenceGateScope.pageKeysValidated includes account-segments
- error.message includes total_contract_arr_cents

job-existing-wiki-ref-stale:
- final_artifact_gates_started
- final_artifact_gates_failed
- ingest_failed
- failure_report_created
- no squash_finished
- wikiReferenceGateScope.global is true
- wikiReferenceGateScope.reasons includes wiki_page_removed
- wikiReferenceGateScope.removedWikiPageKeys includes source-page
- error.message includes account-segments -> source-page
```

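The event-presence part of this checklist could also be checked mechanically. The sketch below is only illustrative: it assumes each JSONL line is an object with an `event` string field, which this plan does not confirm, and `parseTrace` and `checkTraceEvents` are hypothetical names. Adjust the field name to the real trace schema before relying on it.

```typescript
// Hypothetical trace checker. ASSUMPTION: each JSONL line has an `event` field.
interface TraceEvent {
  event: string;
  data?: Record<string, unknown>;
}

function parseTrace(jsonl: string): TraceEvent[] {
  return jsonl
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as TraceEvent);
}

function checkTraceEvents(events: TraceEvent[], required: string[], forbidden: string[]): string[] {
  const seen = events.map((entry) => entry.event);
  const problems: string[] = [];
  for (const name of required) {
    if (seen.indexOf(name) === -1) problems.push(`missing ${name}`);
  }
  for (const name of forbidden) {
    if (seen.indexOf(name) !== -1) problems.push(`unexpected ${name}`);
  }
  return problems;
}

// Invented sample lines shaped like the acceptance criteria, not real trace output.
const sampleTrace = [
  '{"event":"final_artifact_gates_started"}',
  '{"event":"final_artifact_gates_failed","data":{"wikiReferenceGateScope":{"global":true}}}',
  '{"event":"ingest_failed"}',
  '{"event":"failure_report_created"}',
].join('\n');

const requiredEvents = [
  'final_artifact_gates_started',
  'final_artifact_gates_failed',
  'ingest_failed',
  'failure_report_created',
];
const traceProblems = checkTraceEvents(parseTrace(sampleTrace), requiredEvents, ['squash_finished']);
```

An empty `traceProblems` array means the required events are present and `squash_finished` is absent; the field-level criteria still need a manual look or schema-specific checks.
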
- [ ] **Step 5: Commit**

Run:

```bash
git add packages/context/src/ingest/ingest-bundle.runner.ts \
  packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts
git commit -m "fix(ingest): gate global wiki references"
```

Expected: one commit containing only the runner and isolated-diff runner test
changes.

---

## Self-Review

Spec coverage:
- Final global wiki body reference validation now covers unchanged wiki pages
  when a run changes semantic-layer sources.
- Final global wiki page reference validation now covers unchanged inbound
  references when a run deletes wiki pages.
- The plan keeps resolver behavior fail-fast and stops before squash.
- Persistent trace and failed-report acceptance criteria are explicit and tied
  to the concrete failure modes.

Non-blocking gaps unchanged:
- Broader connector rollout.
- Isolated-diff default promotion.
- Old shared-worktree path removal.
- Interactive conflict resolution.
- Semantic auto-merge.
- Transitive semantic-layer dependency expansion.
- Provenance-as-files.

@@ -0,0 +1,494 @@
# Isolated Diff Ingestion V1 Provenance Gate Closure Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Ensure invalid provenance raw paths are rejected before isolated-diff
ingestion squashes any integration worktree changes into the main project
worktree.

**Architecture:** Keep provenance insertion after squash, but derive and
validate the planned provenance rows immediately after final artifact gates and
before the squash stage. This makes provenance validation part of the final
pre-main safety boundary while preserving the existing report and database
write shape.

**Tech Stack:** TypeScript ESM/NodeNext, Vitest, existing
`IngestBundleRunner`, `validateProvenanceRawPaths`, ingest reports, and
persistent ingest traces.

---

## Audit Summary

The implemented isolated-diff path now covers the core v1 safety surface:
child worktrees, binary no-rename patches, `git apply --3way --index`, patch
policy rejection, final wiki and semantic-layer gates after reconciliation and
post-processing, failure reports, and persistent JSONL traces. The focused
isolated-diff test suite passes:

```bash
pnpm --filter @ktx/context exec vitest run \
  src/ingest/ingest-trace.test.ts \
  src/ingest/wiki-body-refs.test.ts \
  src/ingest/artifact-gates.test.ts \
  src/ingest/isolated-diff/git-patch.test.ts \
  src/ingest/isolated-diff/work-unit-executor.test.ts \
  src/ingest/isolated-diff/patch-integrator.test.ts \
  src/ingest/ingest-bundle.runner.isolated-diff.test.ts
```

Current result: `7 passed` (test files), `28 passed` (tests).

One v1-blocking gap remains. `validateProvenanceRawPaths()` is called in
`packages/context/src/ingest/ingest-bundle.runner.ts` after
`squashMergeIntoMain()`. A work unit or reconciliation action can emit an
otherwise valid wiki or semantic-layer artifact whose `rawPaths` contain a path
outside the current raw snapshot and eviction set. Today the run fails during
provenance recording, but only after the invalidly attributed artifacts have
already reached the main project worktree. That violates the spec requirement
that final global gates run before any changes reach main.

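The ordering problem can be shown with a short sketch. The names here (`PlannedProvenanceRow`, `assertRawPathsInSnapshot`, `runPreSquashBoundary`) are hypothetical and do not reflect the real signature of `validateProvenanceRawPaths()`, which this plan does not show; only the error message shape is taken from the regression in Task 1.

```typescript
// Hypothetical sketch of validating planned provenance rows before squash.
interface PlannedProvenanceRow {
  artifactKey: string;
  rawPath: string;
}

function assertRawPathsInSnapshot(
  rows: PlannedProvenanceRow[],
  snapshotPaths: Set<string>,
  evictedPaths: Set<string>,
): void {
  for (const row of rows) {
    // A raw path is acceptable if it is in the current snapshot or the eviction set.
    if (!snapshotPaths.has(row.rawPath) && !evictedPaths.has(row.rawPath)) {
      throw new Error(`provenance row references raw path outside this snapshot: ${row.rawPath}`);
    }
  }
}

function runPreSquashBoundary(
  rows: PlannedProvenanceRow[],
  snapshotPaths: Set<string>,
  evictedPaths: Set<string>,
  squash: () => void,
): void {
  // Validation runs first, so a throw here leaves main untouched.
  assertRawPathsInSnapshot(rows, snapshotPaths, evictedPaths);
  squash();
}

const plannedRows: PlannedProvenanceRow[] = [
  { artifactKey: 'mart_account_segments', rawPath: 'cards/source.json' },
  { artifactKey: 'account-segments', rawPath: 'cards/missing.json' },
];

let squashed = false;
let failureMessage = '';
try {
  runPreSquashBoundary(plannedRows, new Set(['cards/source.json']), new Set<string>(), () => {
    squashed = true;
  });
} catch (error) {
  failureMessage = error instanceof Error ? error.message : String(error);
}
```

With validation ahead of the squash callback, the invalid `cards/missing.json` row stops the run while `squashed` is still `false`, which is the behavior the regression asserts through an unchanged HEAD.
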
Observability for the already-implemented phases is sufficient for postmortem
reconstruction: traces include input snapshots, routing, child worktree
creation and cleanup, patch collection and application, conflict
classification, reconciliation, final gates, failure reports, and run outcome.
This plan adds only the missing provenance validation failure trace because it
corresponds to a concrete pre-main failure mode, not cosmetic trace expansion.

Non-blocking gaps that remain after this plan:

- Migrating Notion, LookML, Looker, dbt, MetricFlow, and historic-SQL direct
  durable writes to the isolated path.
- Promoting isolated diffs as the default for all connectors.
- Removing the old shared-worktree WorkUnit execution path.
- Interactive, CLI, or agent-driven conflict resolution.
- Auto-merging semantic conflicts that cannot be proven correct.
- Transitive SQL-projection dependency expansion beyond direct declared joins.
- Moving provenance rows to worktree files.
- Adding failure reports for failures that happen before an ingest run row
  exists. The trace file is still written at the deterministic job path.

## File Structure

- Modify `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`.
  Add a regression proving invalid provenance raw paths fail before squash,
  leave main unchanged, skip SQLite provenance insertion, and emit a
  postmortem-grade trace event.
- Modify `packages/context/src/ingest/ingest-bundle.runner.ts`.
  Extract provenance row construction into private helpers, run provenance
  raw-path validation before squash, trace validation success and failure, and
  reuse the prevalidated rows for insertion and reports after squash.

---

### Task 1: Add the pre-squash provenance regression

**Files:**
- Modify: `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`

- [ ] **Step 1: Write the failing runner test**

Append this test inside the existing
`describe('IngestBundleRunner isolated diff path', ...)` block in
`packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`:

```ts
it('rejects invalid provenance raw paths before squash reaches main', async () => {
  const runtime = await makeRealGitRuntime();
  try {
    const { deps, adapter } = makeDeps(runtime);
    adapter.chunk.mockResolvedValue({
      workUnits: [{ unitKey: 'card-valid-artifacts', rawFiles: ['cards/source.json'], peerFileIndex: [], dependencyPaths: [] }],
    });

    let currentSession: any = null;
    deps.toolsetFactory.createIngestWuToolset = vi.fn((toolSession: any) => {
      currentSession = toolSession;
      return { toRuntimeTools: vi.fn(() => ({})) };
    });
    // The work unit writes valid artifacts but attributes the wiki page to a raw path
    // that is not part of the staged snapshot.
    deps.agentRunner.runLoop = vi.fn(async () => {
      const root = rootOfConfig(currentSession.configService, runtime.configDir);
      await mkdir(join(root, 'semantic-layer/warehouse'), { recursive: true });
      await mkdir(join(root, 'wiki/global'), { recursive: true });
      await writeFile(
        join(root, 'semantic-layer/warehouse/mart_account_segments.yaml'),
        'name: mart_account_segments\ngrain: [account_id]\ncolumns: [{name: account_id, type: string}]\njoins: []\nmeasures:\n  - name: total_contract_arr\n    expr: sum(contract_arr)\n',
      );
      await writeFile(
        join(root, 'wiki/global/account-segments.md'),
        '---\nsummary: Account segments\nusage_mode: auto\nsl_refs:\n - mart_account_segments\n---\n\nARR is `mart_account_segments.total_contract_arr`.\n',
      );
      addTouchedSlSource(currentSession.touchedSlSources, 'warehouse', 'mart_account_segments');
      currentSession.actions.push({
        target: 'sl',
        type: 'created',
        key: 'mart_account_segments',
        detail: 'Valid source',
        targetConnectionId: 'warehouse',
        rawPaths: ['cards/source.json'],
      });
      currentSession.actions.push({
        target: 'wiki',
        type: 'created',
        key: 'account-segments',
        detail: 'Valid wiki with invalid provenance raw path',
        rawPaths: ['cards/missing.json'],
      });
      await currentSession.gitService.commitFiles(
        ['semantic-layer/warehouse/mart_account_segments.yaml', 'wiki/global/account-segments.md'],
        'valid artifacts with invalid provenance',
        'KTX Test',
        'system@ktx.local',
      );
      return { stopReason: 'natural' };
    }) as never;

    const runner = new IngestBundleRunner(deps);
    await mockStageRawFiles(runner, runtime, [['cards/source.json', 'h1']]);
    const preRunHead = await runtime.git.revParseHead();

    await expect(
      runner.run({
        jobId: 'job-invalid-provenance',
        connectionId: 'warehouse',
        sourceKey: 'metabase',
        trigger: 'upload',
        bundleRef: { kind: 'upload', uploadId: 'upload' },
      }),
    ).rejects.toThrow(/provenance row references raw path outside this snapshot: cards\/missing\.json/);

    // Main and the provenance store must be untouched; the trace must stop before squash.
    expect(await runtime.git.revParseHead()).toBe(preRunHead);
    expect(deps.provenance.insertMany).not.toHaveBeenCalled();
    const trace = await readFile(join(runtime.configDir, '.ktx/ingest-traces/job-invalid-provenance/trace.jsonl'), 'utf-8');
    expect(trace).toContain('final_artifact_gates_finished');
    expect(trace).toContain('provenance_rows_validation_failed');
    expect(trace).toContain('cards/missing.json');
    expect(trace).toContain('ingest_failed');
    expect(trace).not.toContain('squash_finished');
  } finally {
    await rm(runtime.homeDir, { recursive: true, force: true });
  }
});
```

- [ ] **Step 2: Run the failing regression**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context exec vitest run src/ingest/ingest-bundle.runner.isolated-diff.test.ts -t "invalid provenance raw paths"
|
||||
```
|
||||
|
||||
Expected: FAIL because the current runner validates provenance after
`squashMergeIntoMain()`, so `runtime.git.revParseHead()` changes and the trace
does not contain `provenance_rows_validation_failed`.
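
To make the expected failure concrete, here is a minimal sketch of the kind of check the regression exercises. It is not the repository's `validateProvenanceRawPaths()`: the reduced row shape, the name `checkRawPaths`, and the rule that a row may reference only current or deleted raw paths are assumptions inferred from the test's error message and from the call shown in Task 2.

```ts
// Hypothetical, reduced row shape; the real IngestProvenanceInsert has more fields.
interface RawPathRow {
  rawPath: string;
}

// Assumed rule: every row must reference a raw path that is either part of the
// current snapshot or recorded as deleted in this run.
function checkRawPaths(input: {
  rows: RawPathRow[];
  currentRawPaths: Set<string>;
  deletedRawPaths: Set<string>;
}): void {
  for (const row of input.rows) {
    const known =
      input.currentRawPaths.has(row.rawPath) || input.deletedRawPaths.has(row.rawPath);
    if (!known) {
      throw new Error(
        `provenance row references raw path outside this snapshot: ${row.rawPath}`,
      );
    }
  }
}
```

Under these assumptions, a row for `cards/missing.json` throws as long as only `cards/source.json` was staged. Running such a check before the squash is what would leave main HEAD untouched.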
|
||||
|
||||
### Task 2: Move provenance validation into the pre-squash gate boundary
|
||||
|
||||
**Files:**
|
||||
- Modify: `packages/context/src/ingest/ingest-bundle.runner.ts`
|
||||
|
||||
- [ ] **Step 1: Import the provenance report and insert types**
|
||||
|
||||
In `packages/context/src/ingest/ingest-bundle.runner.ts`, update the imports.
|
||||
|
||||
Replace this import block:
|
||||
|
||||
```ts
|
||||
import type {
|
||||
ContextEvidenceIndexSummary,
|
||||
IngestBundleRunnerDeps,
|
||||
IngestProvenanceRow,
|
||||
IngestRunsPort,
|
||||
IngestSessionWorktree,
|
||||
PageTriageRunResult,
|
||||
} from './ports.js';
|
||||
```
|
||||
|
||||
With:
|
||||
|
||||
```ts
|
||||
import type {
|
||||
ContextEvidenceIndexSummary,
|
||||
IngestBundleRunnerDeps,
|
||||
IngestProvenanceInsert,
|
||||
IngestProvenanceRow,
|
||||
IngestRunsPort,
|
||||
IngestSessionWorktree,
|
||||
PageTriageRunResult,
|
||||
} from './ports.js';
|
||||
```
|
||||
|
||||
Replace this import block:
|
||||
|
||||
```ts
|
||||
import {
|
||||
buildStageIndexFromReportBody,
|
||||
postProcessorSavedMemoryCounts,
|
||||
type IngestReportPostProcessorOutcome,
|
||||
type IngestReportSnapshot,
|
||||
} from './reports.js';
|
||||
```
|
||||
|
||||
With:
|
||||
|
||||
```ts
|
||||
import {
|
||||
buildStageIndexFromReportBody,
|
||||
postProcessorSavedMemoryCounts,
|
||||
type IngestReportPostProcessorOutcome,
|
||||
type IngestReportProvenanceDetail,
|
||||
type IngestReportSnapshot,
|
||||
} from './reports.js';
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add provenance row helpers**
|
||||
|
||||
Add these private methods after `private errorMessage(error: unknown): string`
in `packages/context/src/ingest/ingest-bundle.runner.ts`:
|
||||
|
||||
```ts
|
||||
private buildProvenanceRows(input: {
|
||||
job: IngestBundleJob;
|
||||
syncId: string;
|
||||
currentHashes: Map<string, string>;
|
||||
stageIndex: StageIndex;
|
||||
reconcileActions: MemoryAction[];
|
||||
eviction?: EvictionUnit;
|
||||
}): IngestProvenanceInsert[] {
|
||||
const provenanceRows: IngestProvenanceInsert[] = [];
|
||||
const actionToType = (action: MemoryAction): IngestProvenanceInsert['actionType'] => {
|
||||
if (action.target === 'wiki') {
|
||||
return 'wiki_written';
|
||||
}
|
||||
return action.type === 'created' ? 'source_created' : 'measure_added';
|
||||
};
|
||||
const producedPaths = new Set<string>();
|
||||
const pushActionProvenance = (rawPath: string, action: MemoryAction): void => {
|
||||
const hash = input.currentHashes.get(rawPath) ?? '';
|
||||
provenanceRows.push({
|
||||
connectionId: input.job.connectionId,
|
||||
sourceKey: input.job.sourceKey,
|
||||
syncId: input.syncId,
|
||||
rawPath,
|
||||
rawContentHash: hash,
|
||||
artifactKind: action.target,
|
||||
artifactKey: action.key,
|
||||
targetConnectionId: action.target === 'sl' ? actionTargetConnectionId(action, input.job.connectionId) : null,
|
||||
artifactContentHash: null,
|
||||
actionType: actionToType(action),
|
||||
});
|
||||
producedPaths.add(rawPath);
|
||||
};
|
||||
|
||||
for (const wu of input.stageIndex.workUnits) {
|
||||
for (const action of wu.actions) {
|
||||
for (const rawPath of rawPathsForAction(action, wu.rawFiles)) {
|
||||
pushActionProvenance(rawPath, action);
|
||||
}
|
||||
}
|
||||
}
|
||||
for (const action of input.reconcileActions) {
|
||||
for (const rawPath of action.rawPaths ?? []) {
|
||||
pushActionProvenance(rawPath, action);
|
||||
}
|
||||
}
|
||||
for (const resolution of input.stageIndex.artifactResolutions ?? []) {
|
||||
const hash = input.currentHashes.get(resolution.rawPath) ?? '';
|
||||
provenanceRows.push({
|
||||
connectionId: input.job.connectionId,
|
||||
sourceKey: input.job.sourceKey,
|
||||
syncId: input.syncId,
|
||||
rawPath: resolution.rawPath,
|
||||
rawContentHash: hash,
|
||||
artifactKind: resolution.artifactKind,
|
||||
artifactKey: resolution.artifactKey,
|
||||
targetConnectionId: null,
|
||||
artifactContentHash: null,
|
||||
actionType: resolution.actionType,
|
||||
});
|
||||
producedPaths.add(resolution.rawPath);
|
||||
}
|
||||
for (const [rawPath, hash] of input.currentHashes) {
|
||||
if (producedPaths.has(rawPath)) {
|
||||
continue;
|
||||
}
|
||||
provenanceRows.push({
|
||||
connectionId: input.job.connectionId,
|
||||
sourceKey: input.job.sourceKey,
|
||||
syncId: input.syncId,
|
||||
rawPath,
|
||||
rawContentHash: hash,
|
||||
artifactKind: null,
|
||||
artifactKey: null,
|
||||
targetConnectionId: null,
|
||||
artifactContentHash: null,
|
||||
actionType: 'skipped',
|
||||
});
|
||||
}
|
||||
|
||||
return provenanceRows;
|
||||
}
|
||||
|
||||
private toReportProvenanceRows(rows: IngestProvenanceInsert[]): IngestReportProvenanceDetail[] {
|
||||
return rows.map(({ rawPath, artifactKind, artifactKey, actionType, targetConnectionId }) => ({
|
||||
rawPath,
|
||||
artifactKind,
|
||||
artifactKey,
|
||||
targetConnectionId: targetConnectionId ?? null,
|
||||
actionType,
|
||||
}));
|
||||
}
|
||||
```
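
The `skipped` fallback at the end of `buildProvenanceRows` is easier to see in isolation. The sketch below uses a reduced, hypothetical row type (`MiniRow`) and helper name (`withSkippedFallback`); neither exists in the repository, and only the control flow is meant to mirror the method above.

```ts
// Hypothetical reduced row; the real insert type also carries connection,
// sync, and artifact fields.
interface MiniRow {
  rawPath: string;
  rawContentHash: string;
  actionType: string;
}

// Append a `skipped` row for every staged raw file that no action referenced.
function withSkippedFallback(
  actionRows: MiniRow[],
  currentHashes: Map<string, string>,
): MiniRow[] {
  const producedPaths = new Set(actionRows.map((row) => row.rawPath));
  const rows = [...actionRows];
  currentHashes.forEach((hash, rawPath) => {
    if (!producedPaths.has(rawPath)) {
      rows.push({ rawPath, rawContentHash: hash, actionType: 'skipped' });
    }
  });
  return rows;
}
```

The point of the fallback, as the comment in the block being deleted in Step 4 puts it, is that raw files which produced nothing still get a row, so the next DiffSet still sees them.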
|
||||
|
||||
- [ ] **Step 3: Validate planned provenance rows before squash**
|
||||
|
||||
In `packages/context/src/ingest/ingest-bundle.runner.ts`, find the code that
sets `activePhase = 'final_gates';` and runs `traceTimed(...,
'final_artifact_gates', ...)`. Immediately after that `await traceTimed(...)`
block and before the `// Stage 6 — squash commit` comment, insert:
|
||||
|
||||
```ts
activePhase = 'provenance_validation';
const provenanceRows = this.buildProvenanceRows({
  job,
  syncId,
  currentHashes,
  stageIndex,
  reconcileActions,
  eviction,
});
await traceTimed(
  runTrace,
  'provenance',
  'provenance_rows_validation',
  {
    rowCount: provenanceRows.length,
    currentRawPathCount: currentHashes.size,
    deletedRawPathCount: eviction?.deletedRawPaths.length ?? 0,
  },
  async () => {
    validateProvenanceRawPaths({
      rows: provenanceRows,
      currentRawPaths: new Set(currentHashes.keys()),
      deletedRawPaths: new Set(eviction?.deletedRawPaths ?? []),
    });
  },
);
const reportProvenanceRows = this.toReportProvenanceRows(provenanceRows);
```
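
The regression asserts `final_artifact_gates_finished` and `provenance_rows_validation_failed`, which suggests the timed wrapper appends a suffix to the event name depending on the outcome. The following is a rough sketch of that pattern under that assumption. `timedSketch` and `TraceSink` are hypothetical names, and the real `traceTimed()` may record different fields.

```ts
// Hypothetical sink; the real run trace is an object with more structure.
type TraceSink = (phase: string, event: string, data: Record<string, unknown>) => void;

// Sketch of a timed wrapper: emits `<name>_finished` on success and
// `<name>_failed` on error, then rethrows so the run still fails.
async function timedSketch<T>(
  sink: TraceSink,
  phase: string,
  name: string,
  data: Record<string, unknown>,
  fn: () => Promise<T>,
): Promise<T> {
  const startedAt = Date.now();
  try {
    const result = await fn();
    sink(phase, `${name}_finished`, { ...data, durationMs: Date.now() - startedAt });
    return result;
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    sink(phase, `${name}_failed`, { ...data, durationMs: Date.now() - startedAt, error: message });
    throw error;
  }
}
```

If the wrapper behaves this way, a throwing validator yields a `provenance_rows_validation_failed` event whose payload can carry the offending raw path, which is what the trace assertions in Task 1 look for.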
|
||||
|
||||
- [ ] **Step 4: Replace the post-squash provenance construction block**
|
||||
|
||||
In `packages/context/src/ingest/ingest-bundle.runner.ts`, in the
`activePhase = 'provenance';` section after squash, delete the current block
that starts with:
|
||||
|
||||
```ts
|
||||
// Provenance rows: per-artifact when the WU emitted actions, plus a `skipped`
|
||||
// fallback for raw files that produced nothing so the next DiffSet still sees
|
||||
// them.
|
||||
const provenanceRows: Parameters<typeof this.deps.provenance.insertMany>[0] = [];
|
||||
```
|
||||
|
||||
And ends with:
|
||||
|
||||
```ts
|
||||
await runTrace.event('debug', 'provenance', 'provenance_rows_validated', {
|
||||
rowCount: provenanceRows.length,
|
||||
});
|
||||
```
|
||||
|
||||
Do not delete the existing call to `await this.deps.provenance.insertMany(provenanceRows);`.
Immediately after that insertion call, add:
|
||||
|
||||
```ts
|
||||
await runTrace.event('debug', 'provenance', 'provenance_rows_inserted', {
|
||||
rowCount: provenanceRows.length,
|
||||
});
|
||||
```
|
||||
|
||||
Then delete the later `const reportProvenanceRows = provenanceRows.map(...)`
block because `reportProvenanceRows` is now created before squash from the
prevalidated rows.
|
||||
|
||||
- [ ] **Step 5: Run the provenance regression**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context exec vitest run src/ingest/ingest-bundle.runner.isolated-diff.test.ts -t "invalid provenance raw paths"
|
||||
```
|
||||
|
||||
Expected: PASS. The trace contains `provenance_rows_validation_failed`, main
HEAD remains unchanged, and `provenance.insertMany` is not called.
|
||||
|
||||
- [ ] **Step 6: Run the focused isolated-diff suite**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
pnpm --filter @ktx/context exec vitest run \
  src/ingest/ingest-trace.test.ts \
  src/ingest/wiki-body-refs.test.ts \
  src/ingest/artifact-gates.test.ts \
  src/ingest/isolated-diff/git-patch.test.ts \
  src/ingest/isolated-diff/work-unit-executor.test.ts \
  src/ingest/isolated-diff/patch-integrator.test.ts \
  src/ingest/ingest-bundle.runner.isolated-diff.test.ts
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
### Task 3: Type-check, dead-code check, and commit
|
||||
|
||||
**Files:**
|
||||
- Verify: `packages/context/src/ingest/ingest-bundle.runner.ts`
|
||||
- Verify: `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`
|
||||
|
||||
- [ ] **Step 1: Run the context package type-check**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context run type-check
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 2: Run the workspace dead-code check**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm run dead-code
|
||||
```
|
||||
|
||||
Expected: PASS, or only existing unrelated Knip/Biome findings. Investigate
any new findings in the two modified files before continuing.
|
||||
|
||||
- [ ] **Step 3: Commit the provenance gate closure**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
git add packages/context/src/ingest/ingest-bundle.runner.ts \
  packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts
git commit -m "fix(ingest): gate provenance before isolated diff squash"
```
|
||||
|
||||
Expected: one commit containing only the runner and isolated-diff runner test
changes.
|
||||
|
||||
## Self-Review
|
||||
|
||||
Spec coverage: this plan closes the remaining violation of the design's final
global gate invariant by proving invalid provenance raw paths fail before
squash and by moving provenance validation into the pre-main gate boundary.

Placeholder scan: no placeholder steps remain. Every implementation step names
the exact files, code, commands, and expected results.

Type consistency: the plan uses existing `IngestProvenanceInsert`,
`IngestReportProvenanceDetail`, `MemoryAction`, `EvictionUnit`, `StageIndex`,
`rawPathsForAction()`, and `validateProvenanceRawPaths()` names.
|
||||
File diff suppressed because it is too large
|
|
@ -0,0 +1,754 @@
|
|||
# Isolated Diff Ingestion V1 Default Promotion Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use
> superpowers:subagent-driven-development (recommended) or
> superpowers:executing-plans to implement this plan task-by-task. Steps use
> checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Promote isolated-diff WorkUnit execution to the default ingest runner
path while keeping the old shared-worktree branch reachable by an explicit
private fallback setting for the final cleanup rollout.
|
||||
|
||||
**Architecture:** The runner stops asking whether a source is on an
isolated-diff allowlist. Instead, non-override bundle ingests use isolated
diffs unless the private settings object lists the source in
`sharedWorktreeSourceKeys`. The local runtime defaults that fallback list to
empty, and tests keep the old path covered with an explicit legacy source
setting so rollout step 11 can delete it safely.
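
The routing rule described above can be sketched as a small pure function. This is an illustration, not runner code: `RoutingInput` and `isIsolatedDiffRun` are hypothetical names, and `hasOverrideReport` stands in for whatever the runner uses to detect an override run.

```ts
// Hypothetical input shape for illustration only.
interface RoutingInput {
  sourceKey: string;
  hasOverrideReport: boolean;
  sharedWorktreeSourceKeys?: string[];
}

// Isolated diffs are the default: a run leaves that path only when it is an
// override run or its source is explicitly listed as a shared-worktree fallback.
function isIsolatedDiffRun(input: RoutingInput): boolean {
  const fallbackKeys = input.sharedWorktreeSourceKeys ?? [];
  return !input.hasOverrideReport && !fallbackKeys.includes(input.sourceKey);
}

console.log(isIsolatedDiffRun({ sourceKey: 'notion', hasOverrideReport: false })); // true
console.log(
  isIsolatedDiffRun({
    sourceKey: 'legacy-source',
    hasOverrideReport: false,
    sharedWorktreeSourceKeys: ['legacy-source'],
  }),
); // false
```

The inversion matters for new connectors: with an allowlist, an unlisted source silently took the old path, whereas with a fallback list an unlisted source takes the isolated path.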
|
||||
|
||||
**Tech Stack:** TypeScript ESM/NodeNext, Vitest, pnpm workspace commands,
existing `IngestBundleRunner`, `IngestSettingsPort`, local ingest runtime, and
isolated-diff runner tests.
|
||||
|
||||
---
|
||||
|
||||
## Audit summary
|
||||
|
||||
This audit read the original spec at
`docs/superpowers/specs/2026-05-17-isolated-diff-ingestion-design.md`, all
plans matching
`docs/superpowers/plans/2026-05-17-isolated-diff-ingestion-*.md` and
`docs/superpowers/plans/2026-05-18-isolated-diff-ingestion-*.md`, and the
current ingest runner code under `packages/context/src/ingest/`.
|
||||
|
||||
Implemented v1 rollout coverage:
|
||||
|
||||
- Rollout steps 1 and 2 are implemented by the core plan: child worktrees,
  binary no-rename patch proposals, and `git apply --3way --index`
  integration exist.
- Rollout step 3 is implemented by the textual conflict resolver plan:
  `textual-conflict-resolver.ts` is wired through `patch-integrator.ts`.
- Rollout steps 4, 5, and 6 are implemented by the gates, provenance,
  reference, global wiki, and gate-repair plans: final gates, persistent traces,
  failure reports, provenance validation, target policy, and repair counters
  exist.
- Rollout step 7 is implemented by the core and follow-up plans: Metabase has
  isolated-diff stale-reference regression coverage.
- Rollout step 8 is implemented by
  `2026-05-18-isolated-diff-ingestion-v1-connector-migration.md` and the
  follow-up commits: Notion, LookML, Looker, dbt, and MetricFlow route through
  isolated child worktrees, and MetricFlow projection runs before WorkUnits.
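
The first bullet names two git operations. As a hedged illustration, the argument lists might look like the sketch below. The only flag spelling taken from this plan is `--3way --index`; mapping "binary no-rename patch proposals" to `git diff --binary --no-renames` is an assumption, and `diffArgs` and `applyArgs` are hypothetical helpers rather than functions from `git-patch.ts`.

```ts
// Assumed arguments for producing a patch proposal from a child worktree:
// `--binary` keeps binary files applicable, `--no-renames` turns off rename
// detection so a moved file appears as a delete plus an add.
function diffArgs(baseRef: string, headRef: string): string[] {
  return ['diff', '--binary', '--no-renames', baseRef, headRef];
}

// Arguments for integrating a proposal: `--3way` falls back to a three-way
// merge when the patch does not apply cleanly, `--index` updates the index
// together with the working tree.
function applyArgs(patchPath: string): string[] {
  return ['apply', '--3way', '--index', patchPath];
}

console.log(['git', ...applyArgs('work-unit.patch')].join(' '));
// prints "git apply --3way --index work-unit.patch"
```

A three-way apply can leave conflict markers in the tree, which is presumably where the textual conflict resolver from rollout step 3 takes over.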
|
||||
|
||||
Current v1-blocking gaps:
|
||||
|
||||
- Rollout step 10 is not complete. `IngestBundleRunner.isIsolatedDiffEnabled()`
  still checks `settings.isolatedDiffSourceKeys`, and
  `local-bundle-runtime.ts` still installs the internal allowlist returned by
  `defaultIsolatedDiffSourceKeys()`.
- Rollout step 11 remains blocked until step 10 lands. The old
  shared-worktree WorkUnit branch is still present and must stay reachable in
  this plan for final cleanup validation.
|
||||
|
||||
Non-blocking gaps:
|
||||
|
||||
- Rollout step 9 deterministic semantic merge helpers remain intentionally
  deferred until v1 resolver metrics show frequent mechanical repairs.
- Transitive SQL-projection dependency expansion remains outside v1; current
  gates cover direct declared join neighbors.
- Moving provenance into worktree files remains outside v1; the implemented
  source of truth is the ingest provenance store and report body.
- Public connector knobs such as `executionMode`, `planningStrategy`, and
  `conflictPolicy` remain non-goals and must not be added.
- Richer resolver context, such as full transcript excerpts for every
  overlapping patch, can be evaluated after the default path has production
  traces.
|
||||
|
||||
## File structure
|
||||
|
||||
- Modify `packages/context/src/ingest/isolated-diff/source-routing.ts`.
  Replace the isolated-diff direct-write allowlist with an empty default
  shared-worktree fallback list.
- Modify `packages/context/src/ingest/isolated-diff/source-routing.test.ts`.
  Lock the fallback list semantics and remove direct-write allowlist
  assertions.
- Modify `packages/context/src/ingest/ports.ts`.
  Replace `isolatedDiffSourceKeys?: string[]` with
  `sharedWorktreeSourceKeys?: string[]` on the private runner settings port.
- Modify `packages/context/src/ingest/ingest-bundle.runner.ts`.
  Make isolated diff the default for non-override runs and route to the old
  shared branch only when `sharedWorktreeSourceKeys` contains the source.
- Modify `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`.
  Prove an unlisted source uses isolated diffs by default and prove an
  explicit fallback source can still reach the shared-worktree branch.
- Modify `packages/context/src/ingest/local-bundle-runtime.ts`.
  Install the new empty fallback list instead of the old isolated-diff
  allowlist.
- Modify `packages/context/src/ingest/local-bundle-runtime.test.ts`.
  Assert local runtime settings do not expose `isolatedDiffSourceKeys` and do
  default `sharedWorktreeSourceKeys` to `[]`.
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Replace source routing semantics
|
||||
|
||||
**Files:**
|
||||
- Modify: `packages/context/src/ingest/isolated-diff/source-routing.test.ts`
|
||||
- Modify: `packages/context/src/ingest/isolated-diff/source-routing.ts`
|
||||
- Modify: `packages/context/src/ingest/ports.ts`
|
||||
|
||||
- [ ] **Step 1: Write the failing source-routing tests**
|
||||
|
||||
Replace `packages/context/src/ingest/isolated-diff/source-routing.test.ts` with:
|
||||
|
||||
```ts
import { describe, expect, it } from 'vitest';
import { defaultSharedWorktreeSourceKeys, isSharedWorktreeFallbackSourceKey } from './source-routing.js';

describe('isolated-diff source routing', () => {
  it('defaults every non-override source to isolated diffs', () => {
    expect(defaultSharedWorktreeSourceKeys()).toEqual([]);
  });

  it('returns a mutable copy for runtime settings', () => {
    const keys = defaultSharedWorktreeSourceKeys();
    keys.push('legacy-source');

    expect(defaultSharedWorktreeSourceKeys()).toEqual([]);
  });

  it('recognizes only explicitly configured shared-worktree fallback sources', () => {
    expect(isSharedWorktreeFallbackSourceKey('notion', [])).toBe(false);
    expect(isSharedWorktreeFallbackSourceKey('metricflow', [])).toBe(false);
    expect(isSharedWorktreeFallbackSourceKey('legacy-source', ['legacy-source'])).toBe(true);
    expect(isSharedWorktreeFallbackSourceKey('other-source', ['legacy-source'])).toBe(false);
  });
});
```
|
||||
|
||||
- [ ] **Step 2: Run the source-routing tests to verify they fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context exec vitest run src/ingest/isolated-diff/source-routing.test.ts
|
||||
```
|
||||
|
||||
Expected: FAIL because `defaultSharedWorktreeSourceKeys()` and
`isSharedWorktreeFallbackSourceKey()` are not exported yet.
|
||||
|
||||
- [ ] **Step 3: Rewrite the routing helper**
|
||||
|
||||
Replace `packages/context/src/ingest/isolated-diff/source-routing.ts` with:
|
||||
|
||||
```ts
const DEFAULT_SHARED_WORKTREE_SOURCE_KEYS: readonly string[] = [];

export function defaultSharedWorktreeSourceKeys(): string[] {
  return [...DEFAULT_SHARED_WORKTREE_SOURCE_KEYS];
}

export function isSharedWorktreeFallbackSourceKey(
  sourceKey: string,
  sharedWorktreeSourceKeys: readonly string[] = DEFAULT_SHARED_WORKTREE_SOURCE_KEYS,
): boolean {
  return sharedWorktreeSourceKeys.includes(sourceKey);
}
```
|
||||
|
||||
- [ ] **Step 4: Rename the private settings field**
|
||||
|
||||
In `packages/context/src/ingest/ports.ts`, replace the
`IngestSettingsPort` interface with:
|
||||
|
||||
```ts
export interface IngestSettingsPort {
  memoryIngestionModel: string;
  probeRowCount: number;
  workUnitMaxConcurrency?: number;
  workUnitStepBudget?: number;
  workUnitFailureMode?: 'abort' | 'continue';
  sharedWorktreeSourceKeys?: string[];
  ingestTraceLevel?: IngestTraceLevel;
}
```
|
||||
|
||||
- [ ] **Step 5: Run the source-routing tests again**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context exec vitest run src/ingest/isolated-diff/source-routing.test.ts
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Commit routing semantics**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
git add packages/context/src/ingest/isolated-diff/source-routing.ts \
  packages/context/src/ingest/isolated-diff/source-routing.test.ts \
  packages/context/src/ingest/ports.ts
git commit -m "feat(ingest): make isolated diff routing the private default"
```
|
||||
|
||||
### Task 2: Promote the runner default
|
||||
|
||||
**Files:**
|
||||
- Modify: `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`
|
||||
- Modify: `packages/context/src/ingest/ingest-bundle.runner.ts`
|
||||
|
||||
- [ ] **Step 1: Update the isolated runner test imports and harness**
|
||||
|
||||
In `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`,
replace the source-routing import with:
|
||||
|
||||
```ts
|
||||
import { defaultSharedWorktreeSourceKeys } from './isolated-diff/source-routing.js';
|
||||
```
|
||||
|
||||
Then change the `makeDeps()` signature and `settings` block to:
|
||||
|
||||
```ts
|
||||
function makeDeps(
|
||||
runtime: Awaited<ReturnType<typeof makeRealGitRuntime>>,
|
||||
sourceKey = 'metabase',
|
||||
settings: Partial<IngestBundleRunnerDeps['settings']> = {},
|
||||
) {
|
||||
```
|
||||
|
||||
```ts
|
||||
settings: {
|
||||
memoryIngestionModel: 'test',
|
||||
probeRowCount: 1,
|
||||
sharedWorktreeSourceKeys: defaultSharedWorktreeSourceKeys(),
|
||||
ingestTraceLevel: 'trace',
|
||||
...settings,
|
||||
},
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add the default-promotion regression tests**
|
||||
|
||||
Insert these tests inside
`describe('IngestBundleRunner isolated diff path', ...)`, before the existing
non-Metabase routing matrix:
|
||||
|
||||
```ts
|
||||
it('routes an unlisted direct-writing source through isolated diffs by default', async () => {
|
||||
const runtime = await makeRealGitRuntime();
|
||||
try {
|
||||
const sourceKey = 'custom-direct-source';
|
||||
const { deps, adapter } = makeDeps(runtime, sourceKey);
|
||||
adapter.chunk.mockResolvedValue({
|
||||
workUnits: [
|
||||
{
|
||||
unitKey: 'custom-wiki',
|
||||
rawFiles: ['custom/page.json'],
|
||||
peerFileIndex: [],
|
||||
dependencyPaths: [],
|
||||
},
|
||||
],
|
||||
});
|
||||
|
||||
let currentSession: any = null;
|
||||
deps.toolsetFactory.createIngestWuToolset = vi.fn((toolSession: any) => {
|
||||
currentSession = toolSession;
|
||||
return { toRuntimeTools: vi.fn(() => ({})) };
|
||||
});
|
||||
deps.agentRunner.runLoop = vi.fn(async (params: any) => {
|
||||
if (params.telemetryTags.operationName !== 'ingest-bundle-wu') {
|
||||
return { stopReason: 'natural' };
|
||||
}
|
||||
const root = rootOfConfig(currentSession.configService, runtime.configDir);
|
||||
await mkdir(join(root, 'wiki/global'), { recursive: true });
|
||||
await writeFile(
|
||||
join(root, 'wiki/global/custom-isolated.md'),
|
||||
'---\nsummary: Custom isolated write\nusage_mode: auto\n---\n\nCustom isolated write.\n',
|
||||
'utf-8',
|
||||
);
|
||||
currentSession.actions.push({
|
||||
target: 'wiki',
|
||||
type: 'created',
|
||||
key: 'custom-isolated',
|
||||
detail: 'Custom isolated write',
|
||||
rawPaths: ['custom/page.json'],
|
||||
});
|
||||
await currentSession.gitService.commitFiles(
|
||||
['wiki/global/custom-isolated.md'],
|
||||
'custom wiki',
|
||||
'KTX Test',
|
||||
'system@ktx.local',
|
||||
);
|
||||
return { stopReason: 'natural' };
|
||||
}) as never;
|
||||
|
||||
const runner = new IngestBundleRunner(deps);
|
||||
await mockStageRawFiles(runner, runtime, [['custom/page.json', 'h1']], sourceKey);
|
||||
|
||||
await expect(
|
||||
runner.run({
|
||||
jobId: 'job-custom-default',
|
||||
connectionId: 'warehouse',
|
||||
sourceKey,
|
||||
trigger: 'upload',
|
||||
bundleRef: { kind: 'upload', uploadId: 'upload' },
|
||||
}),
|
||||
).resolves.toMatchObject({
|
||||
jobId: 'job-custom-default',
|
||||
failedWorkUnits: [],
|
||||
workUnitCount: 1,
|
||||
});
|
||||
|
||||
const trace = await readFile(
|
||||
join(runtime.configDir, '.ktx/ingest-traces/job-custom-default/trace.jsonl'),
|
||||
'utf-8',
|
||||
);
|
||||
expect(trace).toContain('isolated_diff_enabled');
|
||||
expect(trace).toContain('work_unit_child_created');
|
||||
expect(trace).not.toContain('shared_worktree_path_enabled');
|
||||
|
||||
const reportCreate = vi.mocked(deps.reports.create).mock.calls.at(-1)?.[0];
|
||||
const reportBody = reportCreate?.body as { isolatedDiff?: unknown } | undefined;
|
||||
expect(reportBody?.isolatedDiff).toMatchObject({
|
||||
enabled: true,
|
||||
acceptedPatches: 1,
|
||||
});
|
||||
} finally {
|
||||
await rm(runtime.homeDir, { recursive: true, force: true });
|
||||
}
|
||||
});
|
||||
|
||||
it('keeps the shared-worktree path reachable through explicit private fallback settings', async () => {
|
||||
const runtime = await makeRealGitRuntime();
|
||||
try {
|
||||
const sourceKey = 'legacy-source';
|
||||
const { deps, adapter } = makeDeps(runtime, sourceKey, {
|
||||
sharedWorktreeSourceKeys: ['legacy-source'],
|
||||
});
|
||||
adapter.chunk.mockResolvedValue({
|
||||
workUnits: [
|
||||
{
|
||||
unitKey: 'legacy-wiki',
|
||||
rawFiles: ['legacy/page.json'],
|
||||
peerFileIndex: [],
|
||||
dependencyPaths: [],
|
||||
},
|
||||
],
|
||||
});
|
||||
|
||||
let currentSession: any = null;
|
||||
deps.toolsetFactory.createIngestWuToolset = vi.fn((toolSession: any) => {
|
||||
currentSession = toolSession;
|
||||
return { toRuntimeTools: vi.fn(() => ({})) };
|
||||
});
|
||||
deps.agentRunner.runLoop = vi.fn(async (params: any) => {
|
||||
if (params.telemetryTags.operationName !== 'ingest-bundle-wu') {
|
||||
return { stopReason: 'natural' };
|
||||
}
|
||||
const root = rootOfConfig(currentSession.configService, runtime.configDir);
|
||||
await mkdir(join(root, 'wiki/global'), { recursive: true });
|
||||
await writeFile(
|
||||
join(root, 'wiki/global/legacy-shared.md'),
|
||||
'---\nsummary: Legacy shared write\nusage_mode: auto\n---\n\nLegacy shared write.\n',
|
||||
'utf-8',
|
||||
);
|
||||
currentSession.actions.push({
|
||||
target: 'wiki',
|
||||
type: 'created',
|
||||
key: 'legacy-shared',
|
||||
detail: 'Legacy shared write',
|
||||
rawPaths: ['legacy/page.json'],
|
||||
});
|
||||
await currentSession.gitService.commitFiles(
|
||||
['wiki/global/legacy-shared.md'],
|
||||
'legacy wiki',
|
||||
'KTX Test',
|
||||
'system@ktx.local',
|
||||
);
|
||||
return { stopReason: 'natural' };
|
||||
}) as never;
|
||||
|
||||
const runner = new IngestBundleRunner(deps);
|
||||
await mockStageRawFiles(runner, runtime, [['legacy/page.json', 'h1']], sourceKey);
|
||||
|
||||
await expect(
|
||||
runner.run({
|
||||
jobId: 'job-legacy-shared',
|
||||
connectionId: 'warehouse',
|
||||
sourceKey,
|
||||
trigger: 'upload',
|
||||
bundleRef: { kind: 'upload', uploadId: 'upload' },
|
||||
}),
|
||||
).resolves.toMatchObject({
|
||||
jobId: 'job-legacy-shared',
|
||||
failedWorkUnits: [],
|
||||
workUnitCount: 1,
|
||||
});
|
||||
|
||||
const trace = await readFile(
|
||||
join(runtime.configDir, '.ktx/ingest-traces/job-legacy-shared/trace.jsonl'),
|
||||
'utf-8',
|
||||
);
|
||||
expect(trace).toContain('shared_worktree_path_enabled');
|
||||
expect(trace).not.toContain('work_unit_child_created');
|
||||
|
||||
const reportCreate = vi.mocked(deps.reports.create).mock.calls.at(-1)?.[0];
|
||||
const reportBody = reportCreate?.body as { isolatedDiff?: unknown } | undefined;
|
||||
expect(reportBody?.isolatedDiff).toMatchObject({
|
||||
enabled: false,
|
||||
});
|
||||
} finally {
|
||||
await rm(runtime.homeDir, { recursive: true, force: true });
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Run the new runner tests to verify the default test fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context exec vitest run src/ingest/ingest-bundle.runner.isolated-diff.test.ts -t "unlisted direct-writing source|shared-worktree path reachable"
|
||||
```
|
||||
|
||||
Expected: FAIL. The unlisted source still enters the old shared-worktree path
because the runner checks `isolatedDiffSourceKeys`.
|
||||
|
||||
- [ ] **Step 4: Change the runner routing decision**
|
||||
|
||||
In `packages/context/src/ingest/ingest-bundle.runner.ts`, replace
`isIsolatedDiffEnabled()` with:
|
||||
|
||||
```ts
private isSharedWorktreeFallbackEnabled(sourceKey: string): boolean {
  return (this.deps.settings.sharedWorktreeSourceKeys ?? []).includes(sourceKey);
}
```
|
||||
|
||||
Then replace the isolated-diff routing line with:
|
||||
|
||||
```ts
|
||||
const isolatedDiffEnabled = !overrideReport && !this.isSharedWorktreeFallbackEnabled(job.sourceKey);
|
||||
```
|
||||
|
||||
Finally, replace the shared-path trace event with:
|
||||
|
||||
```ts
await runTrace.event('info', 'routing', 'shared_worktree_path_enabled', {
  sourceKey: job.sourceKey,
  reason: 'explicit_private_fallback',
});
```
|
||||
|
||||
- [ ] **Step 5: Run the new runner tests again**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context exec vitest run src/ingest/ingest-bundle.runner.isolated-diff.test.ts -t "unlisted direct-writing source|shared-worktree path reachable"
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Commit runner default promotion**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
git add packages/context/src/ingest/ingest-bundle.runner.ts \
  packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts
git commit -m "feat(ingest): promote isolated diff to default runner path"
```
|
||||
|
||||
### Task 3: Update local runtime defaults
|
||||
|
||||
**Files:**
|
||||
- Modify: `packages/context/src/ingest/local-bundle-runtime.test.ts`
|
||||
- Modify: `packages/context/src/ingest/local-bundle-runtime.ts`
|
||||
|
||||
- [ ] **Step 1: Update the local runtime settings test type**
|
||||
|
||||
In `packages/context/src/ingest/local-bundle-runtime.test.ts`, replace
`RuntimeWithSettingsDeps` with:
|
||||
|
||||
```ts
type RuntimeWithSettingsDeps = {
  deps: {
    settings: {
      sharedWorktreeSourceKeys?: string[];
      isolatedDiffSourceKeys?: string[];
    };
  };
};
```
|
||||
|
||||
- [ ] **Step 2: Replace the local runtime settings assertion**
|
||||
|
||||
Replace the test named
`enables isolated-diff routing for direct durable-write connectors` with:
|
||||
|
||||
```ts
it('defaults local bundle ingest to isolated diffs without an allowlist', () => {
  const runtime = createLocalBundleIngestRuntime({
    project,
    adapters: [new FakeSourceAdapter()],
    agentRunner: testAgentRunner(),
  });

  const settings = (runtime.runner as unknown as RuntimeWithSettingsDeps).deps.settings;

  expect(settings.sharedWorktreeSourceKeys).toEqual([]);
  expect('isolatedDiffSourceKeys' in settings).toBe(false);
});
```
|
||||
|
||||
- [ ] **Step 3: Run the local runtime settings test to verify it fails**

Run:

```bash
pnpm --filter @ktx/context exec vitest run src/ingest/local-bundle-runtime.test.ts -t "defaults local bundle ingest"
```

Expected: FAIL because `local-bundle-runtime.ts` still sets
`isolatedDiffSourceKeys`.

- [ ] **Step 4: Update local runtime imports and settings**

In `packages/context/src/ingest/local-bundle-runtime.ts`, replace the
source-routing import with:

```ts
import { defaultSharedWorktreeSourceKeys } from './isolated-diff/source-routing.js';
```

Then replace the settings field:

```ts
isolatedDiffSourceKeys: defaultIsolatedDiffSourceKeys(),
```

with:

```ts
sharedWorktreeSourceKeys: defaultSharedWorktreeSourceKeys(),
```

- [ ] **Step 5: Run the local runtime settings test again**

Run:

```bash
pnpm --filter @ktx/context exec vitest run src/ingest/local-bundle-runtime.test.ts -t "defaults local bundle ingest"
```

Expected: PASS.

- [ ] **Step 6: Commit local runtime defaults**

Run:

```bash
git add packages/context/src/ingest/local-bundle-runtime.ts \
  packages/context/src/ingest/local-bundle-runtime.test.ts
git commit -m "feat(ingest): default local ingest to isolated diffs"
```

### Task 4: Remove stale allowlist references

**Files:**
- Verify: `packages/context/src/ingest/isolated-diff/source-routing.ts`
- Verify: `packages/context/src/ingest/local-bundle-runtime.ts`
- Verify: `packages/context/src/ingest/ingest-bundle.runner.ts`
- Verify: `packages/context/src/ingest/ports.ts`
- Verify: `packages/context/src/ingest/**/*.test.ts`

- [ ] **Step 1: Search for old allowlist names**

Run:

```bash
rg -n "isolatedDiffSourceKeys|defaultIsolatedDiffSourceKeys|ISOLATED_DIFF_DIRECT_WRITE_SOURCE_KEYS|isIsolatedDiffDirectWriteSourceKey" packages/context/src
```

Expected: no matches.
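
If this search runs inside a script rather than by hand, the exit status matters: `rg` exits with 0 when it finds a match, 1 when it finds none, and 2 on error, so the expected "no matches" result appears as a non-zero exit. A minimal sketch of that interpretation follows; `interpretRgExit` and `cleanupComplete` are hypothetical helper names, not part of the repository.

```typescript
// Hypothetical helpers for reading ripgrep's exit status in a verification script.
type RgOutcome = 'matches_found' | 'no_matches' | 'error';

function interpretRgExit(status: number | null): RgOutcome {
  if (status === 0) return 'matches_found'; // at least one line matched
  if (status === 1) return 'no_matches'; // search ran cleanly, nothing matched
  return 'error'; // status 2, or the process did not exit normally
}

// The stale-name search passes only when rg ran and found nothing.
function cleanupComplete(status: number | null): boolean {
  return interpretRgExit(status) === 'no_matches';
}
```

Treating every non-zero status as success would hide a genuine `rg` failure, which is why the sketch keeps status 1 separate from other codes.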

- [ ] **Step 2: Search for the new fallback setting**

Run:

```bash
rg -n "sharedWorktreeSourceKeys|defaultSharedWorktreeSourceKeys|isSharedWorktreeFallbackSourceKey" packages/context/src
```

Expected: matches only in these files:

```text
packages/context/src/ingest/ports.ts
packages/context/src/ingest/ingest-bundle.runner.ts
packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts
packages/context/src/ingest/isolated-diff/source-routing.ts
packages/context/src/ingest/isolated-diff/source-routing.test.ts
packages/context/src/ingest/local-bundle-runtime.ts
packages/context/src/ingest/local-bundle-runtime.test.ts
```
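
The "matches only in these files" check can also be done mechanically. The sketch below is an illustration, not repository code: `ALLOWED_FILES` and `unexpectedMatchFiles` are hypothetical names, and it assumes the `rg -n` output was piped or captured so that each record is a single `path:line:text` line.

```typescript
// Files that Step 2 allows to mention the fallback names.
const ALLOWED_FILES: Set<string> = new Set([
  'packages/context/src/ingest/ports.ts',
  'packages/context/src/ingest/ingest-bundle.runner.ts',
  'packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts',
  'packages/context/src/ingest/isolated-diff/source-routing.ts',
  'packages/context/src/ingest/isolated-diff/source-routing.test.ts',
  'packages/context/src/ingest/local-bundle-runtime.ts',
  'packages/context/src/ingest/local-bundle-runtime.test.ts',
]);

// Returns the files in captured `rg -n` output that are not on the allowlist.
// Assumes one `path:line:text` record per line.
function unexpectedMatchFiles(rgOutput: string): string[] {
  const unexpected = new Set<string>();
  for (const line of rgOutput.split('\n')) {
    const separator = line.indexOf(':');
    if (separator <= 0) continue; // skip blank or malformed lines
    const file = line.slice(0, separator);
    if (!ALLOWED_FILES.has(file)) unexpected.add(file);
  }
  return Array.from(unexpected).sort();
}
```

An empty result means every match sits in an expected file; any returned path needs the cleanup described in Step 4.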

- [ ] **Step 3: Run a focused no-allowlist regression suite**

Run:

```bash
pnpm --filter @ktx/context exec vitest run \
  src/ingest/isolated-diff/source-routing.test.ts \
  src/ingest/local-bundle-runtime.test.ts \
  src/ingest/ingest-bundle.runner.isolated-diff.test.ts \
  -t "source routing|defaults local bundle ingest|unlisted direct-writing source|shared-worktree path reachable|routes notion|routes lookml|routes looker|routes dbt|routes metricflow"
```

Expected: PASS.

- [ ] **Step 4: Commit stale-reference cleanup if needed**

If Step 1 or Step 2 required any edits, run:

```bash
git add packages/context/src/ingest
git commit -m "chore(ingest): remove isolated diff allowlist references"
```

If no files changed, record that no cleanup commit was needed in the execution
notes for this task.

### Task 5: Final verification

**Files:**
- Verify: `packages/context/src/ingest/isolated-diff/source-routing.ts`
- Verify: `packages/context/src/ingest/isolated-diff/source-routing.test.ts`
- Verify: `packages/context/src/ingest/ingest-bundle.runner.ts`
- Verify: `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`
- Verify: `packages/context/src/ingest/local-bundle-runtime.ts`
- Verify: `packages/context/src/ingest/local-bundle-runtime.test.ts`
- Verify: `packages/context/src/ingest/ports.ts`
- Verify: `docs/superpowers/plans/2026-05-18-isolated-diff-ingestion-v1-default-promotion.md`

- [ ] **Step 1: Run the full isolated-diff focused suite**

Run:

```bash
pnpm --filter @ktx/context exec vitest run \
  src/ingest/ingest-trace.test.ts \
  src/ingest/wiki-body-refs.test.ts \
  src/ingest/artifact-gates.test.ts \
  src/ingest/semantic-layer-target-policy.test.ts \
  src/ingest/isolated-diff/source-routing.test.ts \
  src/ingest/isolated-diff/git-patch.test.ts \
  src/ingest/isolated-diff/work-unit-executor.test.ts \
  src/ingest/isolated-diff/patch-integrator.test.ts \
  src/ingest/isolated-diff/textual-conflict-resolver.test.ts \
  src/ingest/final-gate-repair.test.ts \
  src/ingest/ingest-bundle.runner.isolated-diff.test.ts \
  src/ingest/report-snapshot.test.ts \
  src/ingest/local-bundle-runtime.test.ts
```

Expected: PASS.

- [ ] **Step 2: Run the MetricFlow local ingest regression**

Run:

```bash
pnpm --filter @ktx/context exec vitest run src/ingest/local-bundle-ingest.test.ts -t "runs full MetricFlow local ingest"
```

Expected: PASS. The report body includes `isolatedDiff.enabled: true`,
`acceptedPatches: 0`, and a string `projectionSha`.

- [ ] **Step 3: Run package type-check**

Run:

```bash
pnpm --filter @ktx/context run type-check
```

Expected: PASS.

- [ ] **Step 4: Run package tests**

Run:

```bash
pnpm --filter @ktx/context run test
```

Expected: PASS.

- [ ] **Step 5: Run TypeScript dead-code checks**

Run:

```bash
pnpm run dead-code
```

Expected: PASS, or only pre-existing findings unrelated to the files changed
by this plan. Investigate any finding that names `source-routing.ts`,
`ports.ts`, `local-bundle-runtime.ts`, or `ingest-bundle.runner.ts`.

- [ ] **Step 6: Decide whether docs-site needs an update**

No `docs-site/content/docs/` change is expected for this plan because the
change is an internal runner rollout switch and does not add or remove public
CLI commands, flags, config fields, connector setup steps, or user-facing
documentation concepts.

- [ ] **Step 7: Commit final verification notes**

Run:

```bash
git status --short
git add docs/superpowers/plans/2026-05-18-isolated-diff-ingestion-v1-default-promotion.md
git commit -m "docs: add isolated diff default promotion plan"
```

Only include the plan file in this commit if all implementation commits have
already captured their code changes.

## Completion criteria

This plan is complete when:

- `packages/context/src/ingest/ports.ts` has
`sharedWorktreeSourceKeys?: string[]` and no `isolatedDiffSourceKeys` field.
- `IngestBundleRunner` uses isolated diffs for every non-override source unless
`sharedWorktreeSourceKeys` explicitly contains that source.
- The trace for a default-routed source contains `isolated_diff_enabled` and
not `shared_worktree_path_enabled`.
- The trace for an explicitly fallback-routed source contains
`shared_worktree_path_enabled` and not `work_unit_child_created`.
- Local runtime settings default `sharedWorktreeSourceKeys` to `[]`.
- No production or test code under `packages/context/src` references the old
isolated-diff allowlist names.
- The focused isolated-diff suite, MetricFlow local ingest regression,
`@ktx/context` type-check, `@ktx/context` tests, and dead-code checks pass.
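
The two trace criteria above amount to substring checks on the trace file. A minimal sketch of that classification, assuming the trace is available as a single string; `TraceRoute` and `classifyTrace` are hypothetical names, and only the three event names come from this plan.

```typescript
// Hypothetical classifier for the routing evidence in an ingest trace.
type TraceRoute = 'isolated' | 'shared_fallback' | 'unknown';

function classifyTrace(traceText: string): TraceRoute {
  const hasIsolated = traceText.includes('isolated_diff_enabled');
  const hasShared = traceText.includes('shared_worktree_path_enabled');
  const hasChild = traceText.includes('work_unit_child_created');

  // Default-routed source: isolated event present, shared-path event absent.
  if (hasIsolated && !hasShared) return 'isolated';
  // Fallback-routed source: shared-path event present, no child worktree created.
  if (hasShared && !hasChild) return 'shared_fallback';
  // Anything else does not satisfy either completion criterion.
  return 'unknown';
}
```

A trace that contains both routing events is classified as `unknown` on purpose, because neither criterion allows that combination.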

## Next rollout step

After this plan is implemented and verified, the only remaining v1-blocking
rollout item from the spec is step 11: remove the old shared-worktree WorkUnit
execution path and delete the private `sharedWorktreeSourceKeys` fallback
setting.
@ -0,0 +1,980 @@
# Isolated Diff Ingestion V1 Shared Worktree Removal Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use
> superpowers:subagent-driven-development (recommended) or
> superpowers:executing-plans to implement this plan task-by-task. Steps use
> checkbox (`- [ ]`) syntax for tracking.

**Goal:** Remove the old shared-worktree WorkUnit execution path so every
non-override bundle ingest uses isolated WorkUnit diffs.

**Architecture:** Keep `IngestBundleRunner` with one non-override execution
path: raw snapshot, optional deterministic projection, child WorkUnit
worktrees, patch integration, reconciliation, final gates, provenance
validation, and squash. Delete the private fallback routing setting and all
legacy tests, traces, and agent instructions that existed only for shared
WorkUnit state.
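
The single non-override path named in the architecture note can be pictured as an ordered stage list. This is a sketch for orientation only: the stage identifiers and the `plannedStages` helper are hypothetical labels derived from the sentence above, not names taken from the runner.

```typescript
// Hypothetical stage labels, in the order the architecture note lists them.
const NON_OVERRIDE_STAGES = [
  'raw_snapshot',
  'deterministic_projection', // optional, only for sources that project
  'child_work_unit_worktrees',
  'patch_integration',
  'reconciliation',
  'final_gates',
  'provenance_validation',
  'squash',
] as const;

type Stage = (typeof NON_OVERRIDE_STAGES)[number];

// Returns the stages a run would pass through, skipping the optional projection.
function plannedStages(hasProjection: boolean): Stage[] {
  return NON_OVERRIDE_STAGES.filter(
    (stage) => hasProjection || stage !== 'deterministic_projection',
  );
}
```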

**Tech Stack:** TypeScript, Vitest, pnpm, KTX ingest runner, Git worktrees.

---

## Audit summary

This audit read the original design in
`docs/superpowers/specs/2026-05-17-isolated-diff-ingestion-design.md`, every
implemented plan matching
`docs/superpowers/plans/2026-05-17-isolated-diff-ingestion-*.md` and
`docs/superpowers/plans/2026-05-18-isolated-diff-ingestion-*.md`, and the
current implementation under `packages/context/src/ingest/`,
`packages/context/prompts/`, and `packages/context/skills/`.

Implemented v1 rollout coverage:

- Rollout steps 1 and 2 exist in code: isolated child worktrees, binary
no-rename patch collection, and `git apply --3way --index` patch integration.
- Rollout step 3 exists in code:
`packages/context/src/ingest/isolated-diff/textual-conflict-resolver.ts` is
wired through the patch integrator and runner.
- Rollout steps 4, 5, and 6 exist in code: final wiki and semantic-layer gates,
provenance validation before squash, target policy checks, bounded gate
repair, failed reports, and trace counters.
- Rollout step 7 exists in code: the Metabase stale body-reference regression
is covered in `ingest-bundle.runner.isolated-diff.test.ts`.
- Rollout step 8 is committed: Notion, LookML, Looker, dbt, and MetricFlow
route through isolated child worktrees, and MetricFlow projection runs before
WorkUnits.
- Rollout step 10 is committed: non-override ingests default to isolated diffs,
and the old branch is reachable only through the private
`sharedWorktreeSourceKeys` fallback setting.

## Remaining gaps

The remaining v1-blocking gaps are all part of rollout step 11:

- `packages/context/src/ingest/ports.ts` still exposes the private
`sharedWorktreeSourceKeys?: string[]` setting.
- `packages/context/src/ingest/isolated-diff/source-routing.ts` and its test
exist only to support the fallback setting.
- `packages/context/src/ingest/local-bundle-runtime.ts` still installs
`sharedWorktreeSourceKeys: []`.
- `packages/context/src/ingest/ingest-bundle.runner.ts` still checks
`isSharedWorktreeFallbackEnabled()` and contains the
`shared_worktree_path_enabled` branch that runs WorkUnits against the mutable
integration worktree.
- `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`
still has a regression proving the shared-worktree fallback is reachable.
- `packages/context/src/ingest/ingest-bundle.runner.test.ts` keeps broad runner
tests on the legacy path through `sharedWorktreeSourceKeys`; those tests must
either use the isolated mock harness or move coverage into the real-git
isolated suite.
- `packages/context/prompts/memory_agent_bundle_ingest_work_unit.md` and
`packages/context/skills/ingest_triage/SKILL.md` still tell WorkUnit agents
that prior WorkUnit writes in the same job are visible in the current working
branch. That instruction is false after isolated diffs and must be removed
with the shared path.

Non-blocking gaps after this plan:

- Rollout step 9 deterministic semantic merge helpers remain intentionally
deferred until resolver metrics show frequent mechanical repairs.
- Semantic-layer dependency expansion remains direct declared joins only; the
spec explicitly defers transitive SQL-projection closure.
- Provenance remains in the ingest provenance store and report body; moving it
to worktree files is a separate schema migration.
- Resolver context can later include richer transcript excerpts and explicit
overlap summaries for every previously applied patch.
- Failures before an ingest run row exists still have deterministic trace files
but no stored ingest report.

## File structure

- Modify `packages/context/src/ingest/ports.ts`. Remove the private fallback
setting from `IngestSettingsPort`.
- Modify `packages/context/src/ingest/local-bundle-runtime.ts`. Stop importing
and installing default shared-worktree fallback settings.
- Delete `packages/context/src/ingest/isolated-diff/source-routing.ts`. This
helper has no responsibility once fallback routing is removed.
- Delete `packages/context/src/ingest/isolated-diff/source-routing.test.ts`.
Its assertions exist only for the fallback helper.
- Modify `packages/context/src/ingest/ingest-bundle.runner.ts`. Delete
`isSharedWorktreeFallbackEnabled()`, the old shared-worktree WorkUnit branch,
and helper methods that only served that branch.
- Modify `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`.
Remove fallback reachability coverage and add a stale-setting regression that
proves a runtime object cannot opt out of isolated diffs.
- Modify `packages/context/src/ingest/ingest-bundle.runner.test.ts`. Remove
the fallback setting from the broad test harness and make its mocked Git
session support no-op isolated patch collection.
- Modify `packages/context/src/ingest/local-bundle-runtime.test.ts`. Assert
local runtime settings do not contain the fallback key.
- Modify `packages/context/prompts/memory_agent_bundle_ingest_work_unit.md`.
Replace shared-branch WorkUnit visibility instructions with isolated-diff
instructions.
- Modify `packages/context/skills/ingest_triage/SKILL.md`. Remove Stage 3
prior-WorkUnit visibility language and keep cross-WorkUnit sweep guidance in
Stage 4 reconciliation.

---

### Task 1: Add removal-contract regressions

**Files:**
- Modify: `packages/context/src/ingest/local-bundle-runtime.test.ts`
- Modify: `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`

- [ ] **Step 1: Update the local runtime settings type**

In `packages/context/src/ingest/local-bundle-runtime.test.ts`, replace
`RuntimeWithSettingsDeps` with:

```ts
type RuntimeWithSettingsDeps = {
  deps: {
    settings: Record<string, unknown>;
  };
};
```

- [ ] **Step 2: Replace the local runtime fallback-setting assertion**

In `packages/context/src/ingest/local-bundle-runtime.test.ts`, replace the test
named `defaults local bundle ingest to isolated diffs without an allowlist` with:

```ts
it('defaults local bundle ingest to isolated diffs without a shared-worktree fallback setting', () => {
  const runtime = createLocalBundleIngestRuntime({
    project,
    adapters: [new FakeSourceAdapter()],
    agentRunner: testAgentRunner(),
  });

  const settings = (runtime.runner as unknown as RuntimeWithSettingsDeps).deps.settings;

  expect(settings).not.toHaveProperty('sharedWorktreeSourceKeys');
  expect(Object.keys(settings).sort()).toEqual([
    'ingestTraceLevel',
    'memoryIngestionModel',
    'probeRowCount',
    'workUnitFailureMode',
    'workUnitMaxConcurrency',
    'workUnitStepBudget',
  ]);
});
```
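
The sorted-keys assertion in this test is an exact-match check: any extra key, including a stale `sharedWorktreeSourceKeys`, makes it fail. The same idea as a standalone helper, offered only as a sketch; `EXPECTED_SETTING_KEYS` and `unexpectedKeys` are hypothetical names, and the key list is copied from the test above.

```typescript
// Key list copied from the assertion above; the helper name is hypothetical.
const EXPECTED_SETTING_KEYS: readonly string[] = [
  'ingestTraceLevel',
  'memoryIngestionModel',
  'probeRowCount',
  'workUnitFailureMode',
  'workUnitMaxConcurrency',
  'workUnitStepBudget',
];

// Returns the keys present on `settings` that the allowlist does not name.
function unexpectedKeys(
  settings: Record<string, unknown>,
  allowed: readonly string[] = EXPECTED_SETTING_KEYS,
): string[] {
  const allowedSet = new Set(allowed);
  return Object.keys(settings)
    .filter((key) => !allowedSet.has(key))
    .sort();
}
```

Typing the settings as `Record<string, unknown>` is what lets a check like this inspect keys that the production interface no longer declares.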

- [ ] **Step 3: Remove the source-routing import from the isolated runner test**

In `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`,
delete this import:

```ts
import { defaultSharedWorktreeSourceKeys } from './isolated-diff/source-routing.js';
```

Then remove the `sharedWorktreeSourceKeys` line from the `settings` object in
`makeDeps()`:

```ts
settings: {
  memoryIngestionModel: 'test',
  probeRowCount: 1,
  ingestTraceLevel: 'trace',
  ...settings,
},
```

- [ ] **Step 4: Replace the shared fallback reachability test**

In `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`,
replace the test named
`keeps the shared-worktree path reachable through explicit private fallback settings`
with this stale-setting regression:

```ts
it('does not support shared-worktree fallback settings', async () => {
  const runtime = await makeRealGitRuntime();
  try {
    const sourceKey = 'legacy-source';
    const staleSettings = {
      sharedWorktreeSourceKeys: ['legacy-source'],
    } as Partial<IngestBundleRunnerDeps['settings']> & Record<string, unknown>;
    const { deps, adapter } = makeDeps(runtime, sourceKey, staleSettings);
    adapter.chunk.mockResolvedValue({
      workUnits: [
        {
          unitKey: 'legacy-wiki',
          rawFiles: ['legacy/page.json'],
          peerFileIndex: [],
          dependencyPaths: [],
        },
      ],
    });

    let currentSession: any = null;
    deps.toolsetFactory.createIngestWuToolset = vi.fn((toolSession: any) => {
      currentSession = toolSession;
      return { toRuntimeTools: vi.fn(() => ({})) };
    });
    deps.agentRunner.runLoop = vi.fn(async (params: any) => {
      if (params.telemetryTags.operationName !== 'ingest-bundle-wu') {
        return { stopReason: 'natural' };
      }
      const root = rootOfConfig(currentSession.configService, runtime.configDir);
      await mkdir(join(root, 'wiki/global'), { recursive: true });
      await writeFile(
        join(root, 'wiki/global/legacy-isolated.md'),
        '---\nsummary: Legacy isolated write\nusage_mode: auto\n---\n\nLegacy isolated write.\n',
        'utf-8',
      );
      currentSession.actions.push({
        target: 'wiki',
        type: 'created',
        key: 'legacy-isolated',
        detail: 'Legacy isolated write',
        rawPaths: ['legacy/page.json'],
      });
      await currentSession.gitService.commitFiles(
        ['wiki/global/legacy-isolated.md'],
        'legacy isolated wiki',
        'KTX Test',
        'system@ktx.local',
      );
      return { stopReason: 'natural' };
    }) as never;

    const runner = new IngestBundleRunner(deps);
    await mockStageRawFiles(runner, runtime, [['legacy/page.json', 'h1']], sourceKey);

    await expect(
      runner.run({
        jobId: 'job-legacy-isolated',
        connectionId: 'warehouse',
        sourceKey,
        trigger: 'upload',
        bundleRef: { kind: 'upload', uploadId: 'upload' },
      }),
    ).resolves.toMatchObject({
      jobId: 'job-legacy-isolated',
      failedWorkUnits: [],
      workUnitCount: 1,
    });

    const trace = await readFile(
      join(runtime.configDir, '.ktx/ingest-traces/job-legacy-isolated/trace.jsonl'),
      'utf-8',
    );
    expect(trace).toContain('isolated_diff_enabled');
    expect(trace).toContain('work_unit_child_created');
    expect(trace).not.toContain('shared_worktree_path_enabled');

    const reportCreate = vi.mocked(deps.reports.create).mock.calls.at(-1)?.[0];
    const reportBody = reportCreate?.body as { isolatedDiff?: unknown } | undefined;
    expect(reportBody?.isolatedDiff).toMatchObject({
      enabled: true,
      acceptedPatches: 1,
    });
  } finally {
    await rm(runtime.homeDir, { recursive: true, force: true });
  }
});
```

- [ ] **Step 5: Run the removal regressions and confirm they fail**

Run:

```bash
pnpm --filter @ktx/context exec vitest run \
  src/ingest/local-bundle-runtime.test.ts \
  src/ingest/ingest-bundle.runner.isolated-diff.test.ts \
  -t "shared-worktree fallback|stale|defaults local bundle ingest|unlisted direct-writing source"
```

Expected: FAIL. The local runtime still exposes `sharedWorktreeSourceKeys`, and
the stale-setting runner test still reaches `shared_worktree_path_enabled`.

---

### Task 2: Remove the fallback setting and routing module

**Files:**
- Modify: `packages/context/src/ingest/ports.ts`
- Modify: `packages/context/src/ingest/local-bundle-runtime.ts`
- Delete: `packages/context/src/ingest/isolated-diff/source-routing.ts`
- Delete: `packages/context/src/ingest/isolated-diff/source-routing.test.ts`

- [ ] **Step 1: Remove the fallback setting from the runner settings port**

In `packages/context/src/ingest/ports.ts`, replace `IngestSettingsPort` with:

```ts
export interface IngestSettingsPort {
  memoryIngestionModel: string;
  probeRowCount: number;
  workUnitMaxConcurrency?: number;
  workUnitStepBudget?: number;
  workUnitFailureMode?: 'abort' | 'continue';
  ingestTraceLevel?: IngestTraceLevel;
}
```

- [ ] **Step 2: Remove the local runtime source-routing import**

In `packages/context/src/ingest/local-bundle-runtime.ts`, delete this import:

```ts
import { defaultSharedWorktreeSourceKeys } from './isolated-diff/source-routing.js';
```

- [ ] **Step 3: Remove the local runtime fallback setting**

In `packages/context/src/ingest/local-bundle-runtime.ts`, replace the settings
object with:

```ts
settings: {
  memoryIngestionModel: options.project.config.llm.models.default ?? 'local-ingest-model',
  probeRowCount: 0,
  workUnitMaxConcurrency: options.project.config.ingest.workUnits.maxConcurrency,
  workUnitStepBudget: options.project.config.ingest.workUnits.stepBudget,
  workUnitFailureMode: options.project.config.ingest.workUnits.failureMode,
  ingestTraceLevel: ingestTraceLevelFromEnv(),
},
```

- [ ] **Step 4: Delete the fallback routing helper files**

Delete:

```bash
git rm packages/context/src/ingest/isolated-diff/source-routing.ts
git rm packages/context/src/ingest/isolated-diff/source-routing.test.ts
```

- [ ] **Step 5: Confirm no fallback helper imports remain**

Run:

```bash
rg -n "defaultSharedWorktreeSourceKeys|isSharedWorktreeFallbackSourceKey|source-routing" packages/context/src
```

Expected: no matches. `rg` exits with status 1 when it finds nothing, so a
failing exit status here means the cleanup is complete.

---

### Task 3: Delete the shared-worktree runner branch

**Files:**
- Modify: `packages/context/src/ingest/ingest-bundle.runner.ts`

- [ ] **Step 1: Remove helper methods used only by the shared branch**

In `packages/context/src/ingest/ingest-bundle.runner.ts`, delete these private
methods:

```ts
private buildFailedWorkUnitOutcome(wu: WorkUnit, error: unknown): WorkUnitOutcome {
  return {
    unitKey: wu.unitKey,
    status: 'failed',
    reason: error instanceof Error ? error.message : String(error),
    preSha: '',
    postSha: '',
    actions: [],
    touchedSlSources: [],
    slDisallowed: wu.slDisallowed,
    slDisallowedReason: wu.slDisallowedReason,
  };
}

private formatWorkUnitFailure(outcome: WorkUnitOutcome): string {
  return `WorkUnit ${outcome.unitKey} failed: ${outcome.reason ?? 'unknown failure'}`;
}

private isSharedWorktreeFallbackEnabled(sourceKey: string): boolean {
  return (this.deps.settings.sharedWorktreeSourceKeys ?? []).includes(sourceKey);
}
```

- [ ] **Step 2: Make non-override isolated routing unconditional**

In `packages/context/src/ingest/ingest-bundle.runner.ts`, replace:

```ts
const isolatedDiffEnabled = !overrideReport && !this.isSharedWorktreeFallbackEnabled(job.sourceKey);
```

with:

```ts
const isolatedDiffEnabled = !overrideReport;
```

Then replace:

```ts
if (!overrideReport && isolatedDiffEnabled) {
```

with:

```ts
if (!overrideReport) {
```

- [ ] **Step 3: Delete the old shared-worktree branch**

In `packages/context/src/ingest/ingest-bundle.runner.ts`, delete the whole
branch that starts with:

```ts
} else if (!overrideReport) {
  await runTrace.event('info', 'routing', 'shared_worktree_path_enabled', {
    sourceKey: job.sourceKey,
    reason: 'explicit_private_fallback',
  });
```

and ends with:

```ts
  latestReportWorkUnits = this.toReportWorkUnits(stageIndex);
}
```

After the deletion, the surrounding code must read:

```ts
  }

}
const carryForwardResult =
  contextReport && this.deps.contextCandidateCarryforward
    ? await this.deps.contextCandidateCarryforward.carryForward({
        runId: runRow.id,
        connectionId: job.connectionId,
        sourceKey: job.sourceKey,
      })
    : null;
```

- [ ] **Step 4: Confirm the branch trace event is gone**

Run:

```bash
rg -n "shared_worktree_path_enabled|explicit_private_fallback|isSharedWorktreeFallbackEnabled|sharedWorktreeSourceKeys" packages/context/src/ingest/ingest-bundle.runner.ts
```

Expected: no matches (`rg` exits with status 1).

---

### Task 4: Update runner tests for isolated-only execution

**Files:**
- Modify: `packages/context/src/ingest/ingest-bundle.runner.test.ts`
- Modify: `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`

- [ ] **Step 1: Remove the fallback setting from the broad runner test harness**

In `packages/context/src/ingest/ingest-bundle.runner.test.ts`, replace the
`settings` block in `buildRunner()` with:

```ts
settings: {
  probeRowCount: 1,
  memoryIngestionModel: 'test-model',
},
```

- [ ] **Step 2: Add no-op isolated patch support to the broad mock Git**

In `packages/context/src/ingest/ingest-bundle.runner.test.ts`, replace the
`scopedGit` object in `makeDeps()` with:

```ts
const scopedGit = {
  revParseHead: vi.fn().mockResolvedValue('h'),
  commitFiles: vi.fn().mockResolvedValue({ created: true, commitHash: 'h' }),
  commitStaged: vi.fn().mockResolvedValue({ created: false, commitHash: 'h' }),
  resetHardTo: vi.fn(),
  assertWorktreeClean: vi.fn().mockResolvedValue(undefined),
  writeBinaryNoRenamePatch: vi.fn(async (_base: string, _head: string, patchPath: string) => {
    await writeFile(patchPath, '', 'utf-8');
  }),
  applyPatchFile3WayIndex: vi.fn(),
  diffNameStatus: vi.fn().mockResolvedValue([]),
};
```
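
The mock's `writeBinaryNoRenamePatch` writes an empty file, which is what makes patch collection a no-op in the broad harness. A minimal sketch of that idea outside Vitest, using only Node's `fs` module; `writeNoopPatch`, `isNoopPatch`, and `demoNoopPatch` are hypothetical names, and treating an empty patch as "nothing to integrate" is an assumption about how a consumer could read it, not a statement about the real integrator.

```typescript
import { mkdtempSync, readFileSync, rmSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Mirrors the mock: "collecting" a patch just writes an empty file.
function writeNoopPatch(patchPath: string): void {
  writeFileSync(patchPath, '', 'utf-8');
}

// Assumption: a consumer may treat a zero-length patch as nothing to apply.
function isNoopPatch(patchPath: string): boolean {
  return readFileSync(patchPath, 'utf-8').length === 0;
}

// Round trip in a throwaway temp directory, cleaned up afterwards.
function demoNoopPatch(): boolean {
  const dir = mkdtempSync(join(tmpdir(), 'noop-patch-'));
  try {
    const patchPath = join(dir, 'work-unit.patch');
    writeNoopPatch(patchPath);
    return isNoopPatch(patchPath);
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}
```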

- [ ] **Step 3: Update the custom sequencer test Git mock**

In the test named
`refuses to squash-merge when the session worktree has an in-progress sequencer op`,
replace the `sessionGit` object with:

```ts
const sessionGit = {
  revParseHead: vi.fn().mockResolvedValue('h'),
  commitFiles: vi.fn().mockResolvedValue({ created: true, commitHash: 'h' }),
  commitStaged: vi.fn().mockResolvedValue({ created: false, commitHash: 'h' }),
  resetHardTo: vi.fn(),
  assertWorktreeClean: vi.fn().mockRejectedValue(assertError),
  writeBinaryNoRenamePatch: vi.fn(async (_base: string, _head: string, patchPath: string) => {
    await writeFile(patchPath, '', 'utf-8');
  }),
  applyPatchFile3WayIndex: vi.fn(),
  diffNameStatus: vi.fn().mockResolvedValue([]),
};
```

- [ ] **Step 4: Move the failed-WorkUnit integration regression to the isolated suite**

In `packages/context/src/ingest/ingest-bundle.runner.test.ts`, delete the test
named `squash-merges only successful WUs into main when one WU fails sl_validate`.

In `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`,
add this test near the other real-git isolated runner regressions:

```ts
it('does not integrate failed isolated WorkUnit patches', async () => {
  const runtime = await makeRealGitRuntime();
  try {
    const { deps, adapter } = makeDeps(runtime, 'fake');
    adapter.chunk.mockResolvedValue({
      workUnits: [
        { unitKey: 'wu-good', rawFiles: ['good.raw'], peerFileIndex: [], dependencyPaths: [] },
        { unitKey: 'wu-bad', rawFiles: ['bad.raw'], peerFileIndex: [], dependencyPaths: [] },
      ],
    });
    deps.diffSetService.compute = vi.fn().mockResolvedValue({
      added: ['good.raw', 'bad.raw'],
      modified: [],
      deleted: [],
      unchanged: [],
    });
    deps.slValidator.validateSingleSource = vi.fn(
      async (_validationDeps: unknown, _connectionId: string, sourceName: string) => ({
        errors: sourceName === 'bad' ? [{ message: 'bad source rejected' }] : [],
        warnings: [],
      }),
    ) as never;

    let currentSession: any = null;
    deps.toolsetFactory.createIngestWuToolset = vi.fn((toolSession: any) => {
      currentSession = toolSession;
      return { toRuntimeTools: vi.fn(() => ({})) };
    });
    deps.agentRunner.runLoop = vi.fn(async (params: any) => {
      if (params.telemetryTags.operationName !== 'ingest-bundle-wu') {
        return { stopReason: 'natural' };
      }
      const unitKey = params.telemetryTags.unitKey;
      const root = rootOfConfig(currentSession.configService, runtime.configDir);
      await mkdir(join(root, 'semantic-layer/warehouse'), { recursive: true });
      if (unitKey === 'wu-good') {
        await writeFile(join(root, 'semantic-layer/warehouse/good.yaml'), 'name: good\n', 'utf-8');
        addTouchedSlSource(currentSession.touchedSlSources, 'warehouse', 'good');
        currentSession.actions.push({
          target: 'sl',
          type: 'created',
          key: 'good',
          detail: 'good source',
          targetConnectionId: 'warehouse',
          rawPaths: ['good.raw'],
        });
        await currentSession.gitService.commitFiles(
          ['semantic-layer/warehouse/good.yaml'],
          'test: add good source',
          'KTX Test',
          'system@ktx.local',
        );
      }
      if (unitKey === 'wu-bad') {
        await writeFile(join(root, 'semantic-layer/warehouse/bad.yaml'), 'name: bad\n', 'utf-8');
        addTouchedSlSource(currentSession.touchedSlSources, 'warehouse', 'bad');
        currentSession.actions.push({
          target: 'sl',
          type: 'created',
          key: 'bad',
          detail: 'bad source',
          targetConnectionId: 'warehouse',
          rawPaths: ['bad.raw'],
        });
        await currentSession.gitService.commitFiles(
          ['semantic-layer/warehouse/bad.yaml'],
          'test: add bad source',
          'KTX Test',
          'system@ktx.local',
        );
      }
      return { stopReason: 'natural' };
    }) as never;

    const runner = new IngestBundleRunner(deps);
    await mockStageRawFiles(
      runner,
runtime,
|
||||
[
|
||||
['good.raw', 'good-hash'],
|
||||
['bad.raw', 'bad-hash'],
|
||||
],
|
||||
'fake',
|
||||
);
|
||||
|
||||
const result = await runner.run({
|
||||
jobId: 'job-failed-wu-isolated',
|
||||
connectionId: 'warehouse',
|
||||
sourceKey: 'fake',
|
||||
trigger: 'upload',
|
||||
bundleRef: { kind: 'upload', uploadId: 'upload' },
|
||||
});
|
||||
|
||||
expect(result.failedWorkUnits).toEqual(['wu-bad']);
|
||||
await expect(readFile(join(runtime.configDir, 'semantic-layer/warehouse/good.yaml'), 'utf-8')).resolves.toContain(
|
||||
'good',
|
||||
);
|
||||
await expect(readFile(join(runtime.configDir, 'semantic-layer/warehouse/bad.yaml'), 'utf-8')).rejects.toThrow();
|
||||
|
||||
const reportCreate = vi.mocked(deps.reports.create).mock.calls.at(-1)?.[0];
|
||||
const reportBody = reportCreate?.body as { isolatedDiff?: { acceptedPatches?: number }; failedWorkUnits?: string[] };
|
||||
expect(reportBody.failedWorkUnits).toEqual(['wu-bad']);
|
||||
expect(reportBody.isolatedDiff).toMatchObject({ enabled: true, acceptedPatches: 1 });
|
||||
|
||||
const trace = await readFile(
|
||||
join(runtime.configDir, '.ktx/ingest-traces/job-failed-wu-isolated/trace.jsonl'),
|
||||
'utf-8',
|
||||
);
|
||||
expect(trace).toContain('work_unit_failed_before_patch');
|
||||
expect(trace).toContain('patch_accepted');
|
||||
expect(trace).not.toContain('shared_worktree_path_enabled');
|
||||
} finally {
|
||||
await rm(runtime.homeDir, { recursive: true, force: true });
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Run the updated focused runner tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context exec vitest run \
|
||||
src/ingest/ingest-bundle.runner.isolated-diff.test.ts \
|
||||
src/ingest/local-bundle-runtime.test.ts \
|
||||
-t "does not support shared-worktree|does not integrate failed isolated|defaults local bundle ingest|unlisted direct-writing source"
|
||||
```
|
||||
|
||||
Expected: PASS. The traces contain `isolated_diff_enabled`, child worktree
|
||||
events, and no `shared_worktree_path_enabled`.
|
||||
|
||||
- [ ] **Step 6: Run the broad runner suite**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context exec vitest run src/ingest/ingest-bundle.runner.test.ts
|
||||
```
|
||||
|
||||
Expected: PASS. Broad runner coverage no longer depends on
|
||||
`sharedWorktreeSourceKeys`.
|
||||
|
||||
- [ ] **Step 7: Commit the runner removal**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
git add \
|
||||
packages/context/src/ingest/ports.ts \
|
||||
packages/context/src/ingest/local-bundle-runtime.ts \
|
||||
packages/context/src/ingest/local-bundle-runtime.test.ts \
|
||||
packages/context/src/ingest/ingest-bundle.runner.ts \
|
||||
packages/context/src/ingest/ingest-bundle.runner.test.ts \
|
||||
packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts \
|
||||
packages/context/src/ingest/isolated-diff/source-routing.ts \
|
||||
packages/context/src/ingest/isolated-diff/source-routing.test.ts
|
||||
git commit -m "refactor(ingest): remove shared worktree WorkUnit path"
|
||||
```
|
||||
|
||||
Expected: commit succeeds. The deleted routing files are included as deletions.
|
||||
|
||||
---
|
||||
|
||||
### Task 5: Remove shared-branch agent instructions
|
||||
|
||||
**Files:**
|
||||
- Modify: `packages/context/prompts/memory_agent_bundle_ingest_work_unit.md`
|
||||
- Modify: `packages/context/skills/ingest_triage/SKILL.md`
|
||||
- Test: `packages/context/src/ingest/ingest-prompts.test.ts`
|
||||
- Test: `packages/context/src/ingest/ingest-runtime-assets.test.ts`
|
||||
|
||||
- [ ] **Step 1: Update the WorkUnit role text**
|
||||
|
||||
In `packages/context/prompts/memory_agent_bundle_ingest_work_unit.md`, replace
|
||||
the `<role>` block with:
|
||||
|
||||
```md
|
||||
<role>
|
||||
You are processing ONE WorkUnit of a multi-file ingest bundle. The WorkUnit
|
||||
gives you a slice of raw source files (LookML views, dbt/MetricFlow YAMLs,
|
||||
Metabase card JSONs, Notion pages, or similar) and you must translate that
|
||||
slice into KTX semantic-layer sources and/or knowledge wiki pages, in one pass.
|
||||
You run in an isolated WorkUnit worktree. Deterministic projection output,
|
||||
existing project memory, and listed dependency paths are visible; sibling
|
||||
WorkUnit edits from this same job are not visible until the runner integrates
|
||||
accepted patches.
|
||||
</role>
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Update the WorkUnit workflow text**
|
||||
|
||||
In the same prompt, replace workflow steps 2 and 4 with:
|
||||
|
||||
```md
|
||||
2. Load the per-source review skill first (for example `lookml_ingest`,
|
||||
`metricflow_ingest`, or `dbt_ingest`), then `sl_capture` and
|
||||
`wiki_capture`, and `ingest_triage` last. The triage skill tells you how to
|
||||
react when existing project memory, deterministic projection output, or
|
||||
prior provenance overlaps with what this WorkUnit is about to write.
|
||||
4. For each raw file: call `read_raw_file` (or `read_raw_span` for slicing large
|
||||
files) to load content. Before writing a new SL source or wiki page, call
|
||||
`discover_data` for each candidate source, table, metric, or topic name to
|
||||
find existing wiki pages, SL sources, deterministic projection output, prior
|
||||
sync artifacts, and raw warehouse matches; apply `ingest_triage` when you hit
|
||||
one, and apply any matching canonical pin before deciding whether to edit,
|
||||
rename, or skip.
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Update the WorkUnit do-not rule**
|
||||
|
||||
In the same prompt, replace:
|
||||
|
||||
```md
|
||||
- Do not silently accept a name collision with a prior WU's write when the formula differs. Trigger `ingest_triage`.
|
||||
```
|
||||
|
||||
with:
|
||||
|
||||
```md
|
||||
- Do not silently accept a name collision with visible existing memory,
|
||||
deterministic projection output, or prior provenance when the formula differs.
|
||||
Trigger `ingest_triage`.
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Update ingest triage caller guidance**
|
||||
|
||||
In `packages/context/skills/ingest_triage/SKILL.md`, replace:
|
||||
|
||||
```md
|
||||
This skill is loaded in two contexts:
|
||||
- By a Stage 3 WorkUnit agent when `sl_discover` reveals that a prior WU (or a prior sync) already wrote something that overlaps with what the current WU is about to write.
|
||||
- By the Stage 4 reconciliation agent for cross-WU sweeps and for eviction decisions.
|
||||
```
|
||||
|
||||
with:
|
||||
|
||||
```md
|
||||
This skill is loaded in two contexts:
|
||||
- By a Stage 3 WorkUnit agent when `sl_discover`, deterministic projection
|
||||
output, existing project memory, or prior provenance overlaps with what the
|
||||
current WorkUnit is about to write.
|
||||
- By the Stage 4 reconciliation agent for cross-WorkUnit sweeps, accepted patch
|
||||
overlap, and eviction decisions.
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Update same-ingest wording in ingest triage**
|
||||
|
||||
In `packages/context/skills/ingest_triage/SKILL.md`, replace:
|
||||
|
||||
```md
|
||||
4. **If there's no prior-sync row (both are from THIS job), check for same-ingest contradictions:**
|
||||
```
|
||||
|
||||
with:
|
||||
|
||||
```md
|
||||
4. **If reconciliation sees accepted patches from this same job with no
|
||||
prior-sync row, check for same-ingest contradictions:**
|
||||
```
|
||||
|
||||
- [ ] **Step 6: Search for stale shared-state prompt language**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
rg -n "prior WU|prior-WU|Prior WorkUnits|same job may have already written|visible on the working branch|shared_worktree_path_enabled|shared-worktree path reachable" packages/context/prompts packages/context/skills packages/context/src/ingest
|
||||
```
|
||||
|
||||
Expected: no matches. `rg` exits with status 1 when it finds nothing, so a non-zero exit is the passing result for this step.
|
||||
|
||||
- [ ] **Step 7: Run prompt asset tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context exec vitest run \
|
||||
src/ingest/ingest-prompts.test.ts \
|
||||
src/ingest/ingest-runtime-assets.test.ts
|
||||
```
|
||||
|
||||
Expected: PASS. Prompt assets still load from packaged KTX assets.
|
||||
|
||||
- [ ] **Step 8: Commit the prompt cleanup**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
git add \
|
||||
packages/context/prompts/memory_agent_bundle_ingest_work_unit.md \
|
||||
packages/context/skills/ingest_triage/SKILL.md
|
||||
git commit -m "docs(ingest): align WorkUnit prompts with isolated diffs"
|
||||
```
|
||||
|
||||
Expected: commit succeeds.
|
||||
|
||||
---
|
||||
|
||||
### Task 6: Final verification
|
||||
|
||||
**Files:**
|
||||
- Verify: `packages/context/src/ingest/ingest-bundle.runner.ts`
|
||||
- Verify: `packages/context/src/ingest/ports.ts`
|
||||
- Verify: `packages/context/src/ingest/local-bundle-runtime.ts`
|
||||
- Verify: `packages/context/src/ingest/ingest-bundle.runner.test.ts`
|
||||
- Verify: `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`
|
||||
- Verify: `packages/context/prompts/memory_agent_bundle_ingest_work_unit.md`
|
||||
- Verify: `packages/context/skills/ingest_triage/SKILL.md`
|
||||
|
||||
- [ ] **Step 1: Run the isolated-diff focused suite**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context exec vitest run \
|
||||
src/ingest/ingest-trace.test.ts \
|
||||
src/ingest/wiki-body-refs.test.ts \
|
||||
src/ingest/artifact-gates.test.ts \
|
||||
src/ingest/semantic-layer-target-policy.test.ts \
|
||||
src/ingest/isolated-diff/git-patch.test.ts \
|
||||
src/ingest/isolated-diff/work-unit-executor.test.ts \
|
||||
src/ingest/isolated-diff/patch-integrator.test.ts \
|
||||
src/ingest/isolated-diff/textual-conflict-resolver.test.ts \
|
||||
src/ingest/final-gate-repair.test.ts \
|
||||
src/ingest/report-snapshot.test.ts \
|
||||
src/ingest/ingest-bundle.runner.isolated-diff.test.ts
|
||||
```
|
||||
|
||||
Expected: PASS. The output includes the isolated-diff runner tests and no
|
||||
`source-routing.test.ts`.
|
||||
|
||||
- [ ] **Step 2: Run the full context test suite**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context run test
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 3: Run context type-check**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm --filter @ktx/context run type-check
|
||||
```
|
||||
|
||||
Expected: PASS. There are no `sharedWorktreeSourceKeys` type errors because the
|
||||
setting no longer exists.
|
||||
|
||||
- [ ] **Step 4: Run dead-code checks**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
pnpm run dead-code
|
||||
```
|
||||
|
||||
Expected: PASS. Knip does not report deleted source-routing exports, and Biome
|
||||
does not report stale imports.
|
||||
|
||||
- [ ] **Step 5: Search for removed legacy path names**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
rg -n "sharedWorktreeSourceKeys|defaultSharedWorktreeSourceKeys|isSharedWorktreeFallbackSourceKey|shared_worktree_path_enabled|explicit_private_fallback|source-routing" packages docs/superpowers/plans/2026-05-18-isolated-diff-ingestion-v1-shared-worktree-removal.md
|
||||
```
|
||||
|
||||
Expected: matches only in this plan file. There must be no matches under
|
||||
`packages/`.
|
||||
|
||||
- [ ] **Step 6: Confirm docs-site does not need an update**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
rg -n "sharedWorktree|isolatedDiffSourceKeys|sharedWorktreeSourceKeys|executionMode|planningStrategy|conflictPolicy" docs-site README.md packages/*/README.md
|
||||
```
|
||||
|
||||
Expected: either no matches or matches unrelated to a public user-facing knob.
|
||||
This change removes an internal runner fallback and does not add, remove, or
|
||||
rename public CLI behavior, configuration, or docs-site content.
|
||||
|
||||
- [ ] **Step 7: Commit final verification notes if files changed**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
git status --short
|
||||
```
|
||||
|
||||
Expected: clean after the two implementation commits. If this command reports
|
||||
new changes, stop and inspect them before finishing; final verification should
|
||||
not create extra source changes.
|
||||
|
||||
## Self-review
|
||||
|
||||
Spec coverage:
|
||||
|
||||
- Rollout step 11 is covered by Tasks 1 through 4: the private fallback setting,
|
||||
helper module, old runner branch, trace event, and fallback tests are deleted.
|
||||
- The isolated-diff WorkUnit flow remains covered by existing real-git tests and
|
||||
the new failed-WorkUnit regression in Task 4.
|
||||
- Agent-facing instructions are aligned with the spec's worktree invariant in
|
||||
Task 5: sibling WorkUnit edits are not visible inside a child worktree.
|
||||
- Override ingestion remains outside the WorkUnit execution branch and still
|
||||
uses prior report materialization plus serial reconciliation.
|
||||
|
||||
Placeholder scan:
|
||||
|
||||
- This plan contains exact file paths, test names, replacement snippets,
|
||||
commands, and expected results.
|
||||
- There are no deferred implementation markers or unspecified edge-case
|
||||
instructions.
|
||||
|
||||
Type consistency:
|
||||
|
||||
- `IngestSettingsPort` no longer includes `sharedWorktreeSourceKeys`.
|
||||
- `isolatedDiffEnabled` remains the runner's internal summary flag and is
|
||||
equivalent to `!overrideReport`.
|
||||
- The removed trace event is `shared_worktree_path_enabled`; retained isolated
|
||||
events include `isolated_diff_enabled`, `work_unit_child_created`, and
|
||||
`work_unit_patch_collected`.
|
||||
|
||||
Execution handoff:
|
||||
|
||||
Plan complete and saved to
|
||||
`docs/superpowers/plans/2026-05-18-isolated-diff-ingestion-v1-shared-worktree-removal.md`.
|
||||
|
||||
Two execution options:
|
||||
|
||||
1. **Subagent-Driven (recommended)** - Dispatch a fresh subagent per task,
|
||||
review between tasks, and keep iteration fast.
|
||||
2. **Inline Execution** - Execute tasks in this session using
|
||||
`superpowers:executing-plans`, with batch execution and checkpoints.
|
||||
|
|
@ -0,0 +1,612 @@
|
|||
# Isolated-diff ingestion design
|
||||
|
||||
**Date:** 2026-05-17
|
||||
**Author:** Andrey Avtomonov
|
||||
**Status:** Design - pending implementation plan
|
||||
|
||||
## Background
|
||||
|
||||
KTX ingests third-party context sources into durable project memory: raw source
|
||||
snapshots, wiki pages, semantic-layer sources, evidence documents, candidates,
|
||||
and fallback records. The current bundle runner stages raw source data in one
|
||||
ingestion session worktree, then runs work units against that same mutable
|
||||
worktree.
|
||||
|
||||
A Metabase ingestion run exposed the failure mode this design addresses. One
|
||||
work unit inferred and wrote the semantic-layer measure
|
||||
`mart_account_segments.total_contract_arr_cents`, a later work unit overwrote
|
||||
the same source with `total_contract_arr`, and the generated wiki page kept
|
||||
referencing the stale non-existent measure. The local per-work-unit checks did
|
||||
not catch the final cross-artifact inconsistency because durable writes were
|
||||
accepted into shared state before final integration.
|
||||
|
||||
The fix is not a Metabase-only validation patch. The same class of risk exists
|
||||
any time LLM-authored work units mutate durable wiki or semantic-layer files:
|
||||
Metabase cards, Notion pages and clusters, dbt YAML, MetricFlow YAML, Looker
|
||||
dashboards and explores, and LookML models and views can all produce overlapping
|
||||
or contested memory artifacts. KTX needs one ingestion execution model that
|
||||
isolates agent-authored changes, integrates them deliberately, and validates
|
||||
the final project state globally.
|
||||
|
||||
## Goals
|
||||
|
||||
This design creates one opinionated ingestion algorithm for all context sources.
|
||||
Connector-specific code stays responsible for source-shaped work: fetching raw
|
||||
data, normalizing raw files, planning work units, and optionally projecting
|
||||
deterministic facts. The shared runner owns execution correctness.
|
||||
|
||||
The design has these goals:
|
||||
|
||||
- Run all agent-authored durable writes in isolated per-work-unit worktrees.
|
||||
- Treat each work unit's git diff as its proposal artifact.
|
||||
- Integrate accepted diffs through a shared artifact-aware merge path.
|
||||
- Resolve expected cross-work-unit overlap with bounded agent repair before
|
||||
failing the run.
|
||||
- Run final global semantic gates before any changes reach the main project
|
||||
worktree.
|
||||
- Keep connector variance minimal and source-shaped, not pipeline-shaped.
|
||||
- Avoid proposal manifests, typed candidates, and extra reporting entities for
|
||||
the first implementation.
|
||||
- Preserve deterministic projections for source systems with authoritative
|
||||
structured metadata.
|
||||
|
||||
## Non-goals
|
||||
|
||||
This design does not change the wiki frontmatter schema, wiki page file layout,
|
||||
the semantic-layer YAML format, or the raw source snapshot layouts. It does add
|
||||
a narrow author-facing inline-code grammar for explicit wiki body references to
|
||||
semantic-layer entities and raw tables, because body text is part of the
|
||||
stale-reference failure class. It also does not remove source adapters' current
|
||||
fetch and chunk logic in one large rewrite.
|
||||
|
||||
This design does not introduce public connector knobs such as
|
||||
`executionMode`, `planningStrategy`, or `conflictPolicy`. The core runner
|
||||
becomes more opinionated instead.
|
||||
|
||||
This design does not require all connectors to stop using candidates. Candidate
|
||||
storage remains valid for flows that intentionally defer wiki curation. The
|
||||
isolation model applies when a work unit writes durable project files.
|
||||
|
||||
## Locked design direction
|
||||
|
||||
The ingestion runner uses one flow for every source that can produce durable
|
||||
changes.
|
||||
|
||||
```text
|
||||
fetch raw
|
||||
-> optional deterministic project
|
||||
-> adapter plans WorkUnit[]
|
||||
-> isolated WU diffs
|
||||
-> artifact-aware integration
|
||||
-> global semantic gates
|
||||
-> squash
|
||||
```
|
||||
|
||||
The important invariant is that the core runner does not know why a work unit
|
||||
exists. A dbt adapter may plan by model, Notion may plan by page or cluster,
|
||||
MetricFlow may plan by graph component, and Looker may plan by dashboard or
|
||||
explore. Those differences describe the source system. They are not ingestion
|
||||
execution modes.
|
||||
|
||||
## Architecture
|
||||
|
||||
The design splits ingestion into two layers with explicit responsibility
|
||||
boundaries.
|
||||
|
||||
### Source adapter layer
|
||||
|
||||
The adapter owns source semantics. It fetches raw evidence, normalizes that
|
||||
evidence into staged files, and plans work units from the staged snapshot and
|
||||
diff scope.
|
||||
|
||||
The adapter may also provide deterministic projectors. A projector is code that
|
||||
converts authoritative source facts into KTX artifacts without an agent. Good
|
||||
examples are live database schema introspection and straightforward MetricFlow
|
||||
semantic-model import.
|
||||
|
||||
The isolation-relevant adapter surface remains small:
|
||||
|
||||
```ts
|
||||
interface SourceAdapter {
|
||||
source: string;
|
||||
skillNames: string[];
|
||||
|
||||
fetch?(pullConfig: unknown, stagedDir: string, ctx: FetchContext): Promise<void>;
|
||||
chunk(stagedDir: string, diffSet?: DiffSet): Promise<ChunkResult>;
|
||||
|
||||
project?(ctx: DeterministicProjectionContext): Promise<ProjectionResult>;
|
||||
resolveSlTargets?(ctx: SlTargetResolutionContext): Promise<string[]>;
|
||||
}
|
||||
```
|
||||
|
||||
This is the subset the isolated-diff runner needs to understand source-shaped
|
||||
planning and deterministic projection. It is not a proposal to delete existing
|
||||
`SourceAdapter` fields. Existing lifecycle and source-support fields such as
|
||||
`detect`, `readFetchReport`, `listTargetConnectionIds`, `clusterWorkUnits`,
|
||||
`describeScope`, `onPullSucceeded`, `evidenceIndexing`, `triageSupported`,
|
||||
`getTriageSignals`, and `reconcileSkillNames` stay part of the adapter contract
|
||||
until a separate cleanup intentionally removes them with migration impact
|
||||
called out.
|
||||
|
||||
`chunk()` returns ordinary `WorkUnit[]`. The runner does not need a
|
||||
`planningStrategy` enum because the source adapter can plan by any domain shape
|
||||
that makes sense.
|
||||
|
||||
### Ingestion execution layer
|
||||
|
||||
The runner owns correctness, isolation, and integration. After `WorkUnit[]`
|
||||
exists, all connectors follow the same execution path.
|
||||
|
||||
The runner is responsible for:
|
||||
|
||||
- creating the ingestion integration worktree from the project base commit;
|
||||
- committing deterministic projection in the integration worktree before child
|
||||
worktree creation;
|
||||
- creating one child worktree per work unit from the post-projection ingestion
|
||||
base commit;
|
||||
- scoping tools to the work unit's raw files and allowed target connections;
|
||||
- running the agent loop inside the work unit worktree;
|
||||
- validating touched artifacts before accepting the work unit diff;
|
||||
- collecting the work unit git diff;
|
||||
- applying accepted diffs into the integration worktree;
|
||||
- resolving textual and artifact-level conflicts;
|
||||
- running final global gates; and
|
||||
- squashing the integration worktree back to the project main worktree.
|
||||
|
||||
## Worktree model
|
||||
|
||||
The design uses three levels of git state.
|
||||
|
||||
```text
|
||||
project main worktree
|
||||
ingest integration worktree
|
||||
per-work-unit worktree(s)
|
||||
```
|
||||
|
||||
The project main worktree is the durable KTX project state. The ingestion
|
||||
integration worktree stages raw snapshots, deterministic projections, accepted
|
||||
work-unit diffs, reconciliation changes, and final gate repairs before one
|
||||
squash merge back to main.
|
||||
|
||||
Deterministic projection runs first in the integration worktree, after the raw
|
||||
snapshot is staged and before any per-work-unit worktree is created. The runner
|
||||
commits those projector changes as a single projection commit. The integration
|
||||
worktree's post-projection HEAD is the ingestion base commit referenced in this
|
||||
design. If the adapter has no projector, the raw-snapshot commit is the
|
||||
ingestion base commit.
|
||||
|
||||
Each per-work-unit worktree starts from the same ingestion base commit. A work
|
||||
unit never observes another concurrent work unit's transient edits. This makes
|
||||
the work unit diff a clean proposal against a stable base. Work units observe
|
||||
deterministic projection outputs, including through `dependencyPaths` context,
|
||||
and do not re-derive authoritative projected facts.
|
||||
|
||||
The integration worktree and each per-work-unit worktree must share one Git
|
||||
object database, created through `git worktree add` from the same repository.
|
||||
This is required so `git apply --3way` can resolve the base blobs recorded in
|
||||
each work-unit patch during integration.
|
||||
|
||||
The runner creates and runs child worktrees under the existing
|
||||
`workUnitMaxConcurrency` setting. A run may have many planned work units, but no
|
||||
more than that bound may be active or left on disk at once. The default remains
|
||||
serial execution. Child worktrees must be cleaned up after the diff, transcript,
|
||||
and outcome metadata are persisted, including failure paths. Adapters with
|
||||
large fan-out, such as Notion, may use `clusterWorkUnits` before execution to
|
||||
keep work-unit count tractable, but clustering remains source-shaped planning
|
||||
rather than a separate execution mode.
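
The concurrency bound described above can be illustrated with a small worker-lane pattern. This is a minimal sketch, not the KTX runner: `runBounded`, `Outcome`, and the `maxConcurrency` parameter are hypothetical names standing in for whatever the runner does with `workUnitMaxConcurrency`, and worktree creation and cleanup are assumed to happen inside `task`.

```ts
// Hypothetical helper, not KTX code: run `task` over `items` with at most
// `maxConcurrency` tasks in flight. A value of 1 gives serial execution.
type Outcome<R> = { ok: true; value: R } | { ok: false; error: unknown };

async function runBounded<T, R>(
  items: readonly T[],
  maxConcurrency: number,
  task: (item: T, index: number) => Promise<R>,
): Promise<Outcome<R>[]> {
  // Anything below 1 (or NaN) falls back to serial execution.
  const limit = maxConcurrency >= 1 ? Math.floor(maxConcurrency) : 1;
  const outcomes: Outcome<R>[] = [];
  let next = 0;

  // Each lane claims the next unclaimed index until the queue is empty.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      try {
        outcomes[index] = { ok: true, value: await task(items[index] as T, index) };
      } catch (error) {
        // A failed item is recorded and does not stop the other lanes.
        outcomes[index] = { ok: false, error };
      }
    }
  }

  const lanes: Promise<void>[] = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) lanes.push(lane());
  await Promise.all(lanes);
  return outcomes;
}
```

Because every lane finishes one item before claiming the next, the number of live child worktrees never exceeds the limit as long as `task` cleans up its own worktree before returning or throwing.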
|
||||
|
||||
## Work-unit lifecycle
|
||||
|
||||
Each work unit follows a fixed lifecycle.
|
||||
|
||||
1. Create a child worktree at the ingestion base commit.
|
||||
2. Build a scoped tool session for the child worktree.
|
||||
3. Run the source skill and agent loop.
|
||||
4. Run work-unit-local gates against touched artifacts.
|
||||
5. If gates pass, record `git diff --binary --no-renames` from base to child HEAD.
|
||||
6. If gates fail, mark the work unit failed and discard the child worktree.
|
||||
7. Clean up the child worktree after the diff and transcript are persisted.
|
||||
|
||||
The work unit outcome stores the existing operational metadata KTX already
|
||||
records: unit key, status, actions, touched semantic-layer sources, failure
|
||||
reason, raw files, and transcript path. It does not add a proposal manifest.
|
||||
The diff is the proposal.
|
||||
|
||||
For `slDisallowed` work units, isolation is defense in depth. The scoped
|
||||
work-unit tools must withhold semantic-layer write and edit tools, and the
|
||||
integration layer must reject any otherwise accepted diff from that work unit
|
||||
that touches `semantic-layer/**`. This catches buggy or bypassed tool behavior
|
||||
before an invalid LookML connection-mismatch write can reach the integration
|
||||
worktree.
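
The integration-side half of that defense can be sketched as a path check over the files a patch touches. This is an illustrative assumption, not the KTX implementation: `findDisallowedSlPaths` is a hypothetical name, and the list of touched paths is assumed to come from something like the diff's name-status output.

```ts
// Hypothetical guard: for an `slDisallowed` work unit, return every touched
// path that falls under semantic-layer/. An empty result means the patch
// passes this particular check.
function findDisallowedSlPaths(touchedPaths: readonly string[], slDisallowed: boolean): string[] {
  if (!slDisallowed) return [];
  return touchedPaths
    // Normalize separators and a leading "./" so the prefix test is reliable.
    .map((p) => p.replace(/\\/g, '/').replace(/^\.\//, ''))
    .filter((p) => p === 'semantic-layer' || p.startsWith('semantic-layer/'));
}
```

A non-empty result would cause the patch to be rejected even if the work unit's own gates passed.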
|
||||
|
||||
### Diff proposal contract
|
||||
|
||||
The proposal artifact is a Git patch with binary-safe content, not the existing
|
||||
hash-based raw-source `DiffSet`.
|
||||
|
||||
The first implementation must use one pinned patch contract:
|
||||
|
||||
- collect `git diff --binary --no-renames <base>..HEAD`;
|
||||
- disable rename and copy detection so renames are represented as delete plus
|
||||
create in version one;
|
||||
- preserve mode changes from the patch metadata, but reject unexpected
|
||||
executable-mode or binary changes under known text artifact roots such as
|
||||
`wiki/**` and `semantic-layer/**`;
|
||||
- apply each accepted patch to the integration worktree with
|
||||
`git apply --3way --index`;
|
||||
- do not use `git apply --reject`, because partial hunk application is not an
|
||||
accepted integration state; and
|
||||
- if patch application fails, leaves conflicts, or touches a path disallowed for
|
||||
that work unit, roll back the integration worktree to its pre-apply HEAD and
|
||||
classify the outcome as a textual conflict.
|
||||
|
||||
Delete-versus-edit, recreate-versus-edit, and delete-versus-create races are
|
||||
therefore textual conflicts when Git cannot apply the patch cleanly. If Git
|
||||
applies the patch but known artifact validators reject the resulting tree, the
|
||||
outcome is a semantic conflict.
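
The contract above can be summarized as two pinned git invocations plus a classification rule. The sketch below is an assumption-laden illustration: the function names, the `ApplyObservation` shape, and the outcome labels are hypothetical, and only the git flags themselves come from the contract.

```ts
// Hypothetical outcome labels for one patch application.
type PatchOutcome = 'accepted' | 'textual_conflict' | 'semantic_conflict';

// What the runner is assumed to observe after trying to apply one patch.
interface ApplyObservation {
  applySucceeded: boolean; // `git apply --3way --index` exited with status 0
  conflictedPaths: string[]; // unmerged paths left behind by the apply
  disallowedPaths: string[]; // touched paths this work unit may not write
  validatorErrors: string[]; // artifact validator errors for the resulting tree
}

// Arguments for collecting the proposal patch with rename detection disabled.
function collectPatchArgs(baseCommit: string): string[] {
  return ['diff', '--binary', '--no-renames', `${baseCommit}..HEAD`];
}

// Arguments for applying a patch; `--reject` is deliberately never used.
function applyPatchArgs(patchPath: string): string[] {
  return ['apply', '--3way', '--index', patchPath];
}

function classifyPatchOutcome(obs: ApplyObservation): PatchOutcome {
  // Failed apply, leftover conflicts, or a disallowed path: textual conflict.
  if (!obs.applySucceeded || obs.conflictedPaths.length > 0 || obs.disallowedPaths.length > 0) {
    return 'textual_conflict';
  }
  // Clean apply but validators reject the tree: semantic conflict.
  if (obs.validatorErrors.length > 0) return 'semantic_conflict';
  return 'accepted';
}
```

In this reading, the rollback to the pre-apply HEAD happens whenever the classification is `textual_conflict`, before any resolver runs.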
|
||||
|
||||
## Integration lifecycle
|
||||
|
||||
The integration worktree applies accepted work-unit diffs after local gates
|
||||
pass. The runner applies diffs in a deterministic order, using the original
|
||||
work-unit index unless a future implementation introduces explicit dependency
|
||||
ordering.
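
A deterministic order of this kind might look like the following sketch. `AcceptedPatch` and `orderPatchesForIntegration` are hypothetical names, and the tie-break on `unitKey` is an added assumption so that the result stays stable even if two entries ever share an index.

```ts
// Hypothetical record for a patch that passed work-unit-local gates.
interface AcceptedPatch {
  unitKey: string;
  workUnitIndex: number; // position of the work unit in the adapter's plan
  patchPath: string;
}

// Return a new array ordered by original work-unit index. Completion order
// of concurrent work units therefore never influences integration order.
function orderPatchesForIntegration(patches: readonly AcceptedPatch[]): AcceptedPatch[] {
  return [...patches].sort((a, b) => {
    if (a.workUnitIndex !== b.workUnitIndex) return a.workUnitIndex - b.workUnitIndex;
    // Assumed tie-break: plain code-unit comparison of unit keys.
    return a.unitKey < b.unitKey ? -1 : a.unitKey > b.unitKey ? 1 : 0;
  });
}
```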
|
||||
|
||||
Integration has three outcome classes:
|
||||
|
||||
- Clean patch application: the diff applies without conflict.
|
||||
- Textual conflict: git cannot apply the patch cleanly.
|
||||
- Semantic conflict: the patch applies textually but creates an invalid or
|
||||
inconsistent artifact.
|
||||
|
||||
Textual conflicts are resolved before semantic gates run when a bounded
|
||||
resolver agent can produce a valid result. Overlapping work-unit writes are
|
||||
normal, especially for Metabase cards that target the same semantic-layer marts
|
||||
from different collections. The runner must treat overlap as an integration
|
||||
case, not as a reason to fail immediately.
|
||||
|
||||
Version one is agent-first. If `git apply --3way --index` leaves conflicts,
|
||||
the runner starts a resolver agent in the integration worktree. The resolver
|
||||
receives only the failed patch, already-applied patches, conflicted files,
|
||||
relevant work-unit transcripts, raw evidence paths, and the final-gate rules.
|
||||
The resolver must preserve all non-conflicting accepted content, resolve
|
||||
duplicate or competing artifact entries from evidence, and edit only files
|
||||
touched by the failed patch or already-applied overlapping patches.
|
||||
|
||||
The runner then reruns artifact gates for the changed files and continues with
|
||||
the remaining patches if validation passes. Resolver attempts are capped to
|
||||
avoid an unbounded repair loop. A run fails only after the bounded resolver
|
||||
attempts cannot produce a valid integration tree.
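
The capped resolver loop can be sketched as follows. Everything here is illustrative: `ResolverHooks`, `resolveWithBoundedAttempts`, and the result shape are hypothetical, the resolver agent and the artifact gates are injected as callbacks, and the cap value is left to the caller because this design does not fix one for textual conflicts.

```ts
// Hypothetical hooks supplied by the runner.
interface ResolverHooks {
  // Runs one resolver-agent pass in the integration worktree.
  runResolver(attempt: number): Promise<void>;
  // Reruns artifact gates for the changed files; an empty list means valid.
  runArtifactGates(): Promise<string[]>;
}

interface ResolverResult {
  resolved: boolean;
  attempts: number;
  lastErrors: string[];
}

async function resolveWithBoundedAttempts(
  hooks: ResolverHooks,
  maxAttempts: number,
): Promise<ResolverResult> {
  let attempts = 0;
  let lastErrors: string[] = [];
  while (attempts < maxAttempts) {
    attempts++;
    await hooks.runResolver(attempts);
    lastErrors = await hooks.runArtifactGates();
    // Stop as soon as the gates accept the repaired tree.
    if (lastErrors.length === 0) return { resolved: true, attempts, lastErrors };
  }
  // Cap reached: the caller fails the run and preserves the worktree.
  return { resolved: false, attempts, lastErrors };
}
```

When `resolved` is false, the caller is expected to stop integrating further patches and report `lastErrors` alongside the preserved integration worktree.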
|
||||
|
||||
Deterministic semantic merge is a later optimization, not a version-one
|
||||
requirement. After measuring resolver latency, cost, and failure modes, KTX can
|
||||
add merge helpers for common semantic-layer YAML cases, such as additive
|
||||
`measures`, `segments`, `columns`, `joins`, and `descriptions` updates keyed by
|
||||
their stable logical identifiers. Those helpers can replace agent calls for
|
||||
mechanical merges once the measured v1 behavior justifies the added complexity.
|
||||
|
||||
The integration worktree is preserved on failure with conflict markers or
|
||||
resolver edits, work-unit patches, transcripts, trace events, and the failure
|
||||
report. The runner never squashes a failed or partially repaired integration
|
||||
tree back to the project main worktree.
|
||||
|
||||
### Gate repair stage
|
||||
|
||||
The gate repair stage handles cases where patches apply cleanly but the
|
||||
combined tree fails final semantic or wiki gates. This is distinct from textual
|
||||
conflict resolution: the tree is textually valid, but the artifacts violate KTX
|
||||
contracts.
|
||||
|
||||
After each patch integration and after reconciliation, the runner runs final
|
||||
artifact gates for the affected scope. If gates fail, the runner classifies the
|
||||
errors before deciding whether to repair or fail.
|
||||
|
||||
Repairable gate errors include:
|
||||
|
||||
- stale wiki body references to renamed semantic-layer entities;
|
||||
- invalid `sl_refs` entries that point to entities instead of sources;
|
||||
- inline prose that accidentally uses explicit SL reference syntax;
|
||||
- duplicate measures, segments, or joins with equivalent definitions;
|
||||
- missing or stale wiki references created by accepted patches; and
|
||||
- join or source references that can be corrected from the composed manifest
|
||||
and work-unit evidence.
|
||||
|
||||
High-risk gate errors fail without automatic repair unless a later
|
||||
implementation adds a stronger evidence contract:
|
||||
|
||||
- two work units define the same measure with different business meaning;
|
||||
- a required warehouse table or column does not exist;
|
||||
- a SQL source fails execution and no obvious localized rewrite exists; or
|
||||
- the repair would require choosing between conflicting facts without evidence.
|
||||
|
||||
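The two lists above amount to a classification table. The following sketch shows one way to encode it; the `GateErrorKind` names are invented labels for the bullets, not identifiers from KTX, and the rule that any high-risk error fails the whole stage is this sketch's reading of "fail without automatic repair".

```typescript
// Hypothetical labels for the error classes listed above.
type GateErrorKind =
  | "stale_wiki_body_ref"
  | "sl_ref_points_to_entity"
  | "accidental_sl_syntax"
  | "duplicate_equivalent_definition"
  | "missing_or_stale_wiki_ref"
  | "correctable_join_or_source_ref"
  | "conflicting_measure_meaning"
  | "missing_warehouse_object"
  | "sql_source_execution_failure"
  | "conflicting_facts_without_evidence";

type RepairClass = "repairable" | "high_risk";

// A Record keeps the table exhaustive: adding a kind without a class is a
// compile-time error.
const GATE_ERROR_CLASS: Record<GateErrorKind, RepairClass> = {
  stale_wiki_body_ref: "repairable",
  sl_ref_points_to_entity: "repairable",
  accidental_sl_syntax: "repairable",
  duplicate_equivalent_definition: "repairable",
  missing_or_stale_wiki_ref: "repairable",
  correctable_join_or_source_ref: "repairable",
  conflicting_measure_meaning: "high_risk",
  missing_warehouse_object: "high_risk",
  sql_source_execution_failure: "high_risk",
  conflicting_facts_without_evidence: "high_risk",
};

// Assumption: a single high-risk error fails the stage even if the other
// errors are repairable.
function decideGateAction(kinds: GateErrorKind[]): "pass" | "repair" | "fail" {
  if (kinds.length === 0) return "pass";
  const anyHighRisk = kinds.some((k) => GATE_ERROR_CLASS[k] === "high_risk");
  return anyHighRisk ? "fail" : "repair";
}
```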
For repairable errors, the runner starts a gate repair agent with the exact
|
||||
gate errors, changed files, relevant work-unit transcripts, raw evidence paths,
|
||||
and final-gate rules. The agent may edit only the files involved in the gate
|
||||
failure. The runner reruns gates after each repair attempt and caps attempts to
|
||||
one or two passes per integration stage. If the tree still fails, the run stops
|
||||
with the final gate report and preserved integration worktree.
|
||||
|
||||
### Reconciliation in the new flow
|
||||
|
||||
Reconciliation remains a shared runner stage, but it runs as a serial
|
||||
integration-stage pass instead of a parallel work unit.
|
||||
|
||||
The runner applies all accepted work-unit diffs to the integration worktree,
|
||||
resolves textual conflicts that can be resolved, and then runs reconciliation in
|
||||
that integration worktree before final global gates and before squash.
|
||||
Reconciliation must see the integrated state because its job is to resolve
|
||||
cross-work-unit duplicates, evictions, fallbacks, and source-specific
|
||||
reconcile guidance.
|
||||
|
||||
Reconciliation runs exactly once per integration pass, serially against the
|
||||
integration worktree, after all accepted work-unit diffs have been applied and
|
||||
after textual conflicts are resolved. It never runs inside a child worktree and
|
||||
never overlaps with work-unit execution. This is the safety carve-out from the
|
||||
isolation goal: concurrent agent writes are the failure mode being avoided, and
|
||||
reconciliation is non-concurrent by construction.
|
||||
|
||||
Reconciliation is not allowed to mutate project main directly. Its changes are
|
||||
captured as a reconciliation diff against the pre-reconciliation integration
|
||||
HEAD and recorded in the existing stage/report metadata. Reconciliation gates
|
||||
validate the artifacts touched by the reconciliation diff plus any wiki page or
|
||||
semantic-layer source referenced by changed frontmatter or body references,
|
||||
using the same artifact-class validators as work-unit gates. Reconciliation may
|
||||
write only to target connections authorized by the adapter for the ingest run,
|
||||
but it is not subject to any single work unit's `slDisallowed` scope. The final
|
||||
global gates validate the combined tree after reconciliation. If reconciliation
|
||||
introduces an invalid wiki or semantic-layer reference, touches an unauthorized
|
||||
target, or records an unresolvable artifact conflict, the runner sends
|
||||
repairable failures through the gate repair stage and stops before squash only
|
||||
when bounded repair cannot produce a valid tree.
|
||||
|
||||
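The target-authorization rule for reconciliation can be illustrated with a short check. This is a sketch under assumptions: `SlWrite` and `unauthorizedReconciliationWrites` are hypothetical, and each semantic-layer write is assumed to carry the connection it targets.

```typescript
// Hypothetical record of one semantic-layer write made by reconciliation.
interface SlWrite {
  connectionId: string;
  path: string;
}

// Reconciliation is checked against the adapter-level authorization for the
// whole ingest run. No per-work-unit `slDisallowed` flag is consulted here,
// because reconciliation is not scoped to a single work unit.
function unauthorizedReconciliationWrites(
  writes: SlWrite[],
  authorizedConnectionIds: string[],
): SlWrite[] {
  const allowed = new Set(authorizedConnectionIds);
  return writes.filter((w) => !allowed.has(w.connectionId));
}
```

A non-empty result corresponds to the "touches an unauthorized target" failure named above.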
## Artifact-aware integration
|
||||
|
||||
KTX durable artifacts are structured enough that git-only merge is not a strong
|
||||
correctness boundary. Artifact-aware integration must parse and validate known
|
||||
file classes after diffs are applied.
|
||||
|
||||
The first implementation must cover these worktree file classes:
|
||||
|
||||
- semantic-layer source YAML;
|
||||
- wiki markdown frontmatter; and
|
||||
- wiki body references to semantic-layer sources, measures, dimensions, and raw
|
||||
warehouse tables.
|
||||
|
||||
Unmapped fallback records are not worktree files in version one. They remain
|
||||
typed stage-index and report records emitted by `emit_unmapped_fallback`; the
|
||||
integration layer validates their raw paths and structured reason codes as
|
||||
report metadata, not as mergeable artifacts.
|
||||
|
||||
Provenance also stays out of the worktree in version one. The source of truth is
|
||||
the ingest provenance store and report body. Before inserting provenance rows,
|
||||
the global gate derives the planned rows from accepted work-unit actions,
|
||||
reconciliation actions, artifact-resolution records, and skipped raw files, then
|
||||
checks those rows against the integrated worktree and staged raw hashes. Moving
|
||||
provenance to on-disk files would be a separate schema migration, not part of
|
||||
this design.
|
||||
|
||||
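The provenance check described above can be sketched as a pure function over planned rows. The row shape and the names `PlannedProvenanceRow` and `checkProvenanceRows` are assumptions for illustration; the actual provenance schema is not shown in this design.

```typescript
// Hypothetical planned row, derived before anything is inserted.
interface PlannedProvenanceRow {
  rawPath: string; // staged raw file the row refers to
  rawHash: string; // hash recorded when the row was planned
  artifactPath: string | null; // null for skipped raw files
}

// Checks planned rows against the staged raw hashes and the files present in
// the integrated worktree. Returns human-readable errors; an empty list means
// the rows may be inserted.
function checkProvenanceRows(
  rows: PlannedProvenanceRow[],
  stagedRawHashes: Map<string, string>,
  worktreeFiles: Set<string>,
): string[] {
  const errors: string[] = [];
  for (const row of rows) {
    const staged = stagedRawHashes.get(row.rawPath);
    if (staged === undefined) {
      errors.push(`unknown raw path: ${row.rawPath}`);
    } else if (staged !== row.rawHash) {
      errors.push(`raw hash mismatch: ${row.rawPath}`);
    }
    if (row.artifactPath !== null && !worktreeFiles.has(row.artifactPath)) {
      errors.push(`missing artifact: ${row.artifactPath}`);
    }
  }
  return errors;
}
```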
Artifact-resolution records are the existing merged or subsumed reconciliation
|
||||
outputs emitted through `emit_artifact_resolution` as
|
||||
`ArtifactResolutionRecord` stage-index records. They are in-memory stage
|
||||
records, not worktree files, and they feed the provenance gate.
|
||||
|
||||
Artifact-aware integration starts with validation plus bounded agent repair.
|
||||
It does not need semantic-layer YAML merge helpers in version one. If two diffs
|
||||
contest the same source YAML or wiki page and bounded agent repair cannot prove
|
||||
correctness, the runner must stop rather than silently accepting stale
|
||||
references. Deterministic semantic merge helpers can be added after v1 metrics
|
||||
show which conflicts are frequent, mechanical, and worth optimizing.
|
||||
|
||||
## Global semantic gates
|
||||
|
||||
Final gates run after every accepted diff, deterministic projection, and
|
||||
reconciliation change has landed in the integration worktree. These gates are
|
||||
global because a failure can emerge only after independently valid diffs
|
||||
combine.
|
||||
|
||||
The final gates must include:
|
||||
|
||||
- semantic-layer validation for touched and dependency sources;
|
||||
- wiki `wiki_refs` validation;
|
||||
- wiki frontmatter `sl_refs` validation, including source-level and
|
||||
measure-level references;
|
||||
- wiki body validation for explicit semantic-layer source, measure, dimension,
|
||||
and table references; and
|
||||
- provenance validation for raw paths referenced by new or changed artifacts
|
||||
before those rows are inserted into SQLite.
|
||||
|
||||
For semantic-layer validation, touched sources are sources changed by accepted
|
||||
work-unit diffs, deterministic projection, or reconciliation. Dependency sources
|
||||
are their direct declared-join neighbors in the composed semantic-layer graph,
|
||||
including sources they join to and sources that join to them. Version one runs
|
||||
the existing whole-connection structural checks and source-scoped checks with
|
||||
the touched-and-dependency source set; it does not expand dependency scope to a
|
||||
transitive SQL-projection closure.
|
||||
|
||||
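The touched-and-dependency source set can be computed in one pass over the declared joins. This sketch assumes a flat list of joins with hypothetical `from` and `to` fields; it deliberately stops at direct neighbors, matching the version-one scope.

```typescript
// Hypothetical declared join between two semantic-layer sources.
interface DeclaredJoin {
  from: string;
  to: string;
}

// Returns the touched sources plus their direct declared-join neighbors in
// both directions. Neighbors of neighbors are not followed.
function gateSourceSet(touched: string[], joins: DeclaredJoin[]): string[] {
  const touchedSet = new Set(touched);
  const result = new Set(touched);
  for (const join of joins) {
    if (touchedSet.has(join.from)) result.add(join.to); // sources they join to
    if (touchedSet.has(join.to)) result.add(join.from); // sources that join to them
  }
  return Array.from(result).sort();
}
```

For example, with `orders` touched and joins `orders → customers`, `payments → orders`, and `customers → regions`, the set is `customers`, `orders`, `payments`; `regions` is excluded because it is two hops away.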
The wiki body gate needs a narrow grammar so ordinary prose does not become a
|
||||
semantic-layer reference. In version one, an explicit body reference is one of
|
||||
these Markdown forms outside fenced code blocks:
|
||||
|
||||
- an inline code token in the form `source.entity`, where both parts are plain
|
||||
identifier tokens, `source` matches a visible semantic-layer source, and
|
||||
`entity` matches one of that source's measures, dimensions, or segments;
|
||||
- an inline code token in the form `connectionId/source.entity`, where
|
||||
`source.entity` follows the same plain-identifier rule and validates against
|
||||
that specific target connection;
|
||||
- an inline code token in the form `source:source_name`, which validates a
|
||||
source-level semantic-layer reference; or
|
||||
- an inline code token in the form `table:qualified_table_name`, which validates
|
||||
a raw warehouse table reference against the visible raw table/catalog sources.
|
||||
|
||||
The parser ignores unformatted prose, fenced SQL examples, wildcard patterns
|
||||
such as `mart_nrr_quarterly.*_arr_cents`, inline SQL predicates such as
|
||||
`users.is_internal = false`, and unprefixed single-token inline code. Two-part
|
||||
inline code that does not name a visible semantic-layer source is not treated
|
||||
as an SL entity reference; use the `table:` prefix for raw warehouse table
|
||||
references.
|
||||
|
||||
The `total_contract_arr_cents` incident is the regression case for this gate:
|
||||
the integrated tree must fail if a wiki page references
|
||||
`mart_account_segments.total_contract_arr_cents` as an inline-code body token
|
||||
while the final semantic-layer source defines only `total_contract_arr`.
|
||||
|
||||
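The body-reference grammar and the regression case can be sketched as a small parser. This is an illustration under assumptions, not the KTX parser: the identifier and `connectionId` character sets are guesses, fences are detected line by line, and connection-qualified references are validated against a single shared index for brevity.

```typescript
type BodyRef =
  | { kind: "entity"; connectionId: string | null; source: string; entity: string }
  | { kind: "source"; source: string }
  | { kind: "table"; table: string };

// Assumed shape of a "plain identifier token".
const IDENT = "[A-Za-z_][A-Za-z0-9_]*";
const ENTITY_RE = new RegExp(`^(?:([A-Za-z0-9_-]+)/)?(${IDENT})\\.(${IDENT})$`);
const SOURCE_RE = new RegExp(`^source:(${IDENT})$`);
const TABLE_RE = new RegExp(`^table:(${IDENT}(?:\\.${IDENT})*)$`);
const FENCE_RE = new RegExp("^\\s*(`{3}|~{3})");

// Returns the contents of inline code spans that sit outside fenced blocks.
function inlineCodeTokens(markdown: string): string[] {
  const tokens: string[] = [];
  let inFence = false;
  for (const line of markdown.split("\n")) {
    if (FENCE_RE.test(line)) {
      inFence = !inFence;
      continue;
    }
    if (inFence) continue;
    const spanRe = /`([^`]+)`/g;
    let m: RegExpExecArray | null;
    while ((m = spanRe.exec(line)) !== null) tokens.push(m[1]);
  }
  return tokens;
}

// Classifies one inline code token, or returns null when it is not an
// explicit reference (wildcards, SQL predicates, single tokens).
function parseBodyRef(token: string, visibleSources: Set<string>): BodyRef | null {
  let m = SOURCE_RE.exec(token);
  if (m) return { kind: "source", source: m[1] };
  m = TABLE_RE.exec(token);
  if (m) return { kind: "table", table: m[1] };
  m = ENTITY_RE.exec(token);
  if (!m) return null;
  const connectionId: string | null = m[1] ? m[1] : null;
  // Unprefixed two-part tokens count only when they name a visible source.
  if (connectionId === null && !visibleSources.has(m[2])) return null;
  return { kind: "entity", connectionId, source: m[2], entity: m[3] };
}

// Entities (measures, dimensions, segments) keyed by visible source name.
type SourceIndex = Map<string, Set<string>>;

// Returns entity tokens whose entity is not defined on the named source.
function staleEntityRefs(markdown: string, index: SourceIndex): string[] {
  const visible = new Set(Array.from(index.keys()));
  const stale: string[] = [];
  for (const token of inlineCodeTokens(markdown)) {
    const ref = parseBodyRef(token, visible);
    if (ref === null || ref.kind !== "entity") continue;
    const entities = index.get(ref.source);
    if (!entities || !entities.has(ref.entity)) stale.push(token);
  }
  return stale;
}
```

With an index where `mart_account_segments` defines only `total_contract_arr`, a page containing the inline token `mart_account_segments.total_contract_arr_cents` yields one stale reference, while `users.is_internal = false` and `mart_nrr_quarterly.*_arr_cents` are ignored.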
## Deterministic projection
|
||||
|
||||
Some connectors have authoritative structured inputs that do not need an LLM to
|
||||
write KTX artifacts. Those connectors can provide deterministic projectors that
|
||||
run in the integration worktree.
|
||||
|
||||
Projection is different from work-unit execution:
|
||||
|
||||
- projectors are code, not agents;
|
||||
- projectors run against the integration worktree;
|
||||
- projectors produce ordinary durable file changes; and
|
||||
- projector outputs still pass final global gates.
|
||||
|
||||
The runner infers hybrid behavior from the adapter. If an adapter has both
|
||||
projectors and work units, it is hybrid. If it has only projectors, it is
|
||||
deterministic. If it has only work units, it uses isolated diffs. No public
|
||||
`executionMode` knob is needed.
|
||||
|
||||
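The inference rule above is small enough to write out directly. In this sketch `AdapterShape` and `inferIngestMode` are hypothetical names, and the empty case returns `null` because the design does not say what an adapter with neither projectors nor work units should do.

```typescript
type IngestMode = "hybrid" | "deterministic" | "isolated-diffs";

// Hypothetical summary of what an adapter provides for one ingest run.
interface AdapterShape {
  projectorCount: number;
  workUnitCount: number;
}

// Derives the execution mode from what the adapter provides, so no public
// `executionMode` setting is required.
function inferIngestMode(adapter: AdapterShape): IngestMode | null {
  const hasProjectors = adapter.projectorCount > 0;
  const hasWorkUnits = adapter.workUnitCount > 0;
  if (hasProjectors && hasWorkUnits) return "hybrid";
  if (hasProjectors) return "deterministic";
  if (hasWorkUnits) return "isolated-diffs";
  return null; // nothing to run; this case is not covered by the design
}
```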
## Connector migration notes
|
||||
|
||||
Each connector keeps its source-shaped planning logic. The migration changes
|
||||
where durable writes happen and how they are integrated.
|
||||
|
||||
### Metabase
|
||||
|
||||
Metabase must move first because it produced the observed stale-measure wiki
|
||||
reference. Collection and card chunking can remain adapter-specific, but direct
|
||||
wiki and semantic-layer writes must happen in per-work-unit worktrees.
|
||||
|
||||
The regression test must reproduce two work units that touch
|
||||
`mart_account_segments`: one writes a wiki reference to an inferred measure and
|
||||
another leaves the final source with a different measure name. The final global
|
||||
gate must reject the integrated tree.
|
||||
|
||||
### dbt
|
||||
|
||||
dbt uses source-shaped planning by model or schema file. Deterministic
|
||||
projection is appropriate for straightforward model, source, column, and
|
||||
description facts when dbt artifacts are authoritative. Agent work units remain
|
||||
useful for business wiki synthesis, ambiguous relationship interpretation, and
|
||||
enrichment that is not directly represented in dbt YAML.
|
||||
|
||||
### MetricFlow
|
||||
|
||||
MetricFlow uses source-shaped planning by graph component. Existing
|
||||
deterministic semantic-model import code becomes a projector in the ingestion
|
||||
flow. Agent work units handle unsupported constructs, cross-model explanations,
|
||||
and wiki synthesis.
|
||||
|
||||
### Looker
|
||||
|
||||
Looker already defers some dashboard and look knowledge through candidates.
|
||||
That can continue. Any direct semantic-layer writes from explores or query
|
||||
translation must run through isolated work-unit diffs.
|
||||
|
||||
Looker-specific API and file-adapter collisions remain connector domain logic,
|
||||
but final correctness still belongs to the shared integration gates.
|
||||
|
||||
### LookML
|
||||
|
||||
LookML already has useful source-shaped ownership rules: models, views, orphan
|
||||
views, dashboards, and connection-mismatch guards. Those rules stay in the
|
||||
adapter. Direct semantic-layer writes move into isolated work-unit diffs.
|
||||
|
||||
Connection-mismatch work units can keep their existing write restrictions. The
|
||||
runner enforces those restrictions through scoped tools and target connection
|
||||
resolution.
|
||||
|
||||
### Notion
|
||||
|
||||
Notion pages and clusters can create overlapping durable wiki knowledge and can
|
||||
write semantic-layer overlays after warehouse verification. Notion therefore
|
||||
uses the same isolated-diff execution model for direct durable writes.
|
||||
|
||||
Large Notion workspaces still need source-shaped clustering to control context
|
||||
size and cost. Clustering remains adapter logic; correctness comes from isolated
|
||||
diffs and final global gates.
|
||||
|
||||
## Minimal connector variance
|
||||
|
||||
New connectors must not choose from a menu of ingestion architectures. They
|
||||
must provide the small amount of source-specific behavior the shared runner
|
||||
needs.
|
||||
|
||||
Every connector answers these questions:
|
||||
|
||||
- How does KTX fetch or receive raw evidence?
|
||||
- How does KTX normalize that evidence into staged files?
|
||||
- How does KTX split the staged evidence into `WorkUnit[]`?
|
||||
- Are any source facts authoritative enough for deterministic projection?
|
||||
- Which target semantic-layer connections can the connector write to?
|
||||
|
||||
Everything else is shared runner behavior.
|
||||
|
||||
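The five questions map naturally onto a narrow interface. The sketch below is illustrative only: `ConnectorAnswers` is not the real `SourceAdapter`, every field name is invented, and the methods are synchronous for brevity.

```typescript
interface RawEvidence {
  id: string;
  body: string;
}
interface StagedFile {
  path: string;
  content: string;
}
interface WorkUnit {
  unitKey: string;
  stagedPaths: string[];
}
// A projector is plain code that maps staged files to durable file contents.
type Projector = (staged: StagedFile[]) => Map<string, string>;

// One member per question in the list above. Hypothetical shape.
interface ConnectorAnswers {
  fetchRawEvidence(): RawEvidence[];
  normalize(evidence: RawEvidence[]): StagedFile[];
  planWorkUnits(staged: StagedFile[]): WorkUnit[];
  projectors: Projector[];
  targetConnectionIds: string[];
}

// A fake connector that stages one file per card and plans one unit per file.
const fakeConnector: ConnectorAnswers = {
  fetchRawEvidence: () => [
    { id: "card-1", body: "{}" },
    { id: "card-2", body: "{}" },
  ],
  normalize: (evidence) =>
    evidence.map((e) => ({ path: `raw/${e.id}.json`, content: e.body })),
  planWorkUnits: (staged) =>
    staged.map((f, i) => ({ unitKey: `unit-${i}`, stagedPaths: [f.path] })),
  projectors: [],
  targetConnectionIds: ["warehouse"],
};

// The shared runner only composes the connector's answers; everything after
// planning (worktrees, integration, gates) is runner behavior.
function planIngest(connector: ConnectorAnswers): WorkUnit[] {
  return connector.planWorkUnits(connector.normalize(connector.fetchRawEvidence()));
}
```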
## Regression tests
|
||||
|
||||
The implementation plan must start with narrow tests that prove the new
|
||||
execution model prevents the known failure class.
|
||||
|
||||
The first test creates a fake or Metabase-like adapter with two work units
|
||||
starting from the same base:
|
||||
|
||||
1. Work unit A writes a wiki page that references
|
||||
`mart_account_segments.total_contract_arr_cents` as an inline-code body
|
||||
token.
|
||||
2. Work unit B writes or overwrites the final semantic-layer source with only
|
||||
`total_contract_arr`.
|
||||
3. Both work units pass their local gates in isolation.
|
||||
4. Integration applies both diffs.
|
||||
5. The final global gate fails the run before squash.
|
||||
|
||||
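The five steps above can be simulated with in-memory trees. This is a simplified sketch, not the planned test: `Tree`, `UnitDiff`, `applyDiff`, and `staleBodyRefs` are hypothetical, the gate handles only unprefixed inline `source.entity` tokens, and the base tree is assumed to already define `total_contract_arr_cents` so that work unit A passes its local gate.

```typescript
interface Tree {
  sources: Record<string, string[]>; // source name -> measure names
  wiki: Record<string, string>; // page path -> markdown body
}
type UnitDiff = Partial<Tree>;

// Applies a work-unit diff as whole-file writes on top of a tree.
function applyDiff(tree: Tree, diff: UnitDiff): Tree {
  return {
    sources: { ...tree.sources, ...diff.sources },
    wiki: { ...tree.wiki, ...diff.wiki },
  };
}

// Simplified body gate: an inline `source.entity` token that names a known
// source must name one of that source's measures.
function staleBodyRefs(tree: Tree): string[] {
  const stale: string[] = [];
  for (const page of Object.keys(tree.wiki)) {
    const re = /`([A-Za-z_]\w*)\.([A-Za-z_]\w*)`/g;
    let m: RegExpExecArray | null;
    while ((m = re.exec(tree.wiki[page])) !== null) {
      const measures = tree.sources[m[1]];
      if (measures !== undefined && measures.indexOf(m[2]) < 0) {
        stale.push(`${page}: ${m[1]}.${m[2]}`);
      }
    }
  }
  return stale;
}

// Assumed base state shared by both work units.
const baseTree: Tree = {
  sources: { mart_account_segments: ["total_contract_arr_cents"] },
  wiki: {},
};
// Work unit A: writes a wiki page with the inline-code body token.
const unitA: UnitDiff = {
  wiki: { "wiki/arr.md": "ARR is `mart_account_segments.total_contract_arr_cents`." },
};
// Work unit B: overwrites the source with only `total_contract_arr`.
const unitB: UnitDiff = {
  sources: { mart_account_segments: ["total_contract_arr"] },
};

const localA = staleBodyRefs(applyDiff(baseTree, unitA)); // local gate for A
const localB = staleBodyRefs(applyDiff(baseTree, unitB)); // local gate for B
const integrated = staleBodyRefs(applyDiff(applyDiff(baseTree, unitA), unitB));
const squashAllowed = integrated.length === 0;
```

Both local gates return no errors, while the integrated tree reports the stale reference and `squashAllowed` is `false`, which is the behavior the first regression test is meant to pin down.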
Additional tests cover:
|
||||
|
||||
- two work units editing different wiki pages without conflict;
|
||||
- two work units editing the same semantic-layer overlay with additive changes,
|
||||
where the resolver agent preserves both changes and gates the repaired file;
|
||||
- two work units editing the same semantic-layer overlay with incompatible
|
||||
definitions, where the resolver agent receives the conflict context and the
|
||||
run fails only after bounded repair attempts cannot prove a result;
|
||||
- a textual conflict in a wiki page where the resolver agent preserves
|
||||
non-conflicting accepted content and gates the repaired page before squash;
|
||||
- a cleanly merged tree that fails final gates, where the gate repair agent
|
||||
fixes a stale wiki reference and the run continues;
|
||||
- an unrepairable final-gate failure, such as a missing warehouse column, where
|
||||
the runner stops with a preserved integration worktree and report;
|
||||
- a hybrid adapter case where deterministic projector outputs are visible in a
|
||||
child worktree before work-unit wiki synthesis, and the final global gate
|
||||
catches any stale reference to a non-existent projected semantic-layer entity;
|
||||
- Notion-style direct wiki writes with invalid `sl_refs`; and
|
||||
- LookML-style `slDisallowed` work units where write tools are unavailable and
|
||||
integration rejects any diff that still touches `semantic-layer/**`.
|
||||
|
||||
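The last case above, rejecting any `slDisallowed` diff that touches `semantic-layer/**`, can be sketched as a path check at integration time. The function names are hypothetical, and the normalization is deliberately minimal: it handles backslashes and a leading `./` but does not resolve `..` segments.

```typescript
const SL_PREFIX = "semantic-layer/";

// True when a repo-relative path falls under semantic-layer/.
function touchesSemanticLayer(path: string): boolean {
  const normalized = path.replace(/\\/g, "/").replace(/^\.\//, "");
  return normalized.startsWith(SL_PREFIX);
}

// Paths in a work-unit diff that violate its `slDisallowed` scope. Work units
// without the restriction never produce violations.
function slDisallowedViolations(diffPaths: string[], slDisallowed: boolean): string[] {
  return slDisallowed ? diffPaths.filter((p) => touchesSemanticLayer(p)) : [];
}
```

This check is a second line of defense: even if a write tool were mistakenly exposed to the work unit, the diff would still be rejected before it reaches the integration worktree.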
## Rollout
|
||||
|
||||
The rollout must be incremental because the current runner is shared by all
|
||||
adapters.
|
||||
|
||||
The rollout switch is runner-owned. During migration it may be a private
|
||||
per-source allowlist, or an internal `IngestSettingsPort` map keyed by
|
||||
`sourceKey`, but it must not become a `SourceAdapter` field or public connector
|
||||
configuration knob.
|
||||
|
||||
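A runner-owned switch of this kind can be sketched as a lookup with a default. The shape below is an assumption: `RolloutSettings` and `executionPathFor` are hypothetical names and do not describe the actual `IngestSettingsPort` interface.

```typescript
type ExecutionPath = "shared-worktree" | "isolated-diffs";

// Hypothetical internal runner settings; not part of any adapter or public
// connector configuration.
interface RolloutSettings {
  defaultPath: ExecutionPath;
  overrides: Map<string, ExecutionPath>; // keyed by sourceKey
}

// Resolves the execution path for one source, falling back to the default.
function executionPathFor(settings: RolloutSettings, sourceKey: string): ExecutionPath {
  const override = settings.overrides.get(sourceKey);
  return override === undefined ? settings.defaultPath : override;
}

// During migration only allowlisted sources use the new path.
const duringMigration: RolloutSettings = {
  defaultPath: "shared-worktree",
  overrides: new Map<string, ExecutionPath>([["metabase", "isolated-diffs"]]),
};
```

Promoting the new path to the default then becomes a one-line change to `defaultPath`, and removing the old path removes the switch entirely.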
1. Add the per-work-unit worktree executor behind that internal runner setting.
|
||||
2. Add diff collection and deterministic integration in the existing runner.
|
||||
3. Add bounded resolver-agent handling for textual conflicts.
|
||||
4. Add final global wiki and semantic-layer reference gates, including the wiki
|
||||
body reference parser defined above.
|
||||
5. Add bounded gate-repair-agent handling for repairable final-gate failures.
|
||||
6. Instrument resolver latency, attempts, repaired files, and failure classes.
|
||||
7. Migrate Metabase to the new execution path first.
|
||||
8. Migrate Notion, LookML, Looker, dbt, and MetricFlow.
|
||||
9. Add deterministic semantic merge helpers only after v1 metrics show which
|
||||
agent repairs are frequent and mechanical enough to justify optimization.
|
||||
10. Promote the new path to the default after the Metabase regression test and
|
||||
at least one non-Metabase connector pass.
|
||||
11. Remove the old shared-worktree work-unit execution path.
|
||||
|
||||
The rollout is complete when every connector that permits agent-authored durable
|
||||
writes uses isolated diffs and all integrations pass the same final global
|
||||
gates.
|
||||