Mirror of https://github.com/Kaelio/ktx.git
Synced 2026-06-19 08:28:06 +02:00

Move docs changes to specs repo

Parent: 2403f58eff
Commit: 573dfc20f0
8 changed files with 4 additions and 2793 deletions

@@ -1,320 +0,0 @@

# Adapter-Owned Finalization V1 Closure Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Close the remaining adapter-owned finalization v1 verification gaps so the finalization contract is publicly typed and the historic-SQL local acceptance path passes through `SourceAdapter.finalize()`.

**Architecture:** The production runner already owns finalization execution, commits, target policy, final gates, reports, traces, and provenance. This plan keeps production behavior intact, exports the finalization adapter types through the ingest barrel, and updates the local historic-SQL acceptance fixture to model the real adapter-owned finalization path instead of the removed post-processor path.

**Tech Stack:** TypeScript ESM/NodeNext, Vitest, pnpm workspace commands, existing `SourceAdapter`, `projectHistoricSqlEvidence()`, and package export coverage.

---

## Audit summary

The audit compared
`docs/superpowers/specs/2026-05-18-adapter-owned-ingest-finalization-design.md`
against the implemented source, plan, and targeted tests.

Implemented v1 coverage:

- `SourceAdapter.finalize()` exists with typed context and result objects in
  `packages/context/src/ingest/types.ts`.
- `IngestBundleRunnerDeps.postProcessors`, `IngestBundlePostProcessorPort`,
  `HistoricSqlProjectionPostProcessor`, `post_processor` trace phases, and
  `postProcessor` report fields are absent from production source.
- The runner invokes finalization after reconciliation and before
  `wiki_sl_ref_repair`, target-policy checks, final artifact gates,
  provenance validation, and squash.
- The runner derives finalization touched paths from the integration-worktree
  diff, resolves semantic-layer scope including `_schema/*.yaml`, cross-checks
  adapter declarations, commits finalization, records reports/traces, rejects
  path overlap, and partitions finalization actions for provenance exclusions.
- Override replay passes explicit `overrideReplay` metadata, omits
  `parseArtifacts`, and leaves current-run `workUnitOutcomes` empty.
- Historic SQL implements adapter-owned `finalize()` and uses
  `projectHistoricSqlEvidence()` for aggregate projection maintenance.

V1-blocking gaps:

- `packages/context/src/ingest/index.ts` exports `SourceAdapter` and projection
  types, but not `DeterministicFinalizationContext`,
  `FinalizationOverrideReplay`, or `FinalizationResult`. The adapter contract is
  less usable from the public ingest barrel than the spec requires.
- The targeted verification command currently fails because
  `HistoricSqlEvidenceTestAdapter` in
  `packages/context/src/ingest/local-bundle-ingest.test.ts` lacks
  `finalize()`, so `result.report.body.finalization` is `undefined` in the
  local historic-SQL projection acceptance test.

Non-blocking gaps:

- Older historical plan documents still mention post-processors. They are
  archived implementation history and do not affect runtime behavior.
- The runner has helper-level declaration mismatch coverage, but no dedicated
  local-bundle integration test for a finalization declaration mismatch. The
  implementation path exists; adding a higher-level regression test can be a
  later hardening pass.
- Finalization wiki page deletion could use a future global wiki-reference gate
  regression. Historic-SQL v1 finalization updates or archives pages in place,
  so this is not required for the current v1 acceptance path.

## File structure

- Modify `packages/context/src/ingest/index.ts`.
  Re-export the typed finalization adapter contract next to the existing
  projection contract.
- Modify `packages/context/src/package-exports.test.ts`.
  Add compile-time coverage proving finalization adapter types are exported
  from the ingest barrel.
- Modify `packages/context/src/ingest/local-bundle-ingest.test.ts`.
  Make the historic-SQL local acceptance test adapter implement
  `finalize()` by delegating to `projectHistoricSqlEvidence()`, and rename the
  stale test label from post-processor to finalization.

---

### Task 1: Export finalization adapter contract types

**Files:**
- Modify: `packages/context/src/package-exports.test.ts`
- Modify: `packages/context/src/ingest/index.ts`

- [ ] **Step 1: Write failing type export coverage**

In `packages/context/src/package-exports.test.ts`, add this import after the
existing Vitest import:

```ts
import type {
  DeterministicFinalizationContext,
  FinalizationOverrideReplay,
  FinalizationResult,
} from './ingest/index.js';
```

Then add this constant after `scanTypeExportCoverage`:

```ts
const ingestFinalizationTypeExportCoverage: Partial<{
  context: DeterministicFinalizationContext;
  overrideReplay: FinalizationOverrideReplay;
  result: FinalizationResult;
}> = {};
```

Inside the existing package export test, place this assertion immediately after
`expect(scanTypeExportCoverage).toEqual({});`:

```ts
expect(ingestFinalizationTypeExportCoverage).toEqual({});
```

- [ ] **Step 2: Run type-check to verify the coverage fails**

Run:

```bash
pnpm --filter @ktx/context run type-check
```

Expected: FAIL with TypeScript errors like:

```text
Module '"./ingest/index.js"' has no exported member 'DeterministicFinalizationContext'.
Module '"./ingest/index.js"' has no exported member 'FinalizationOverrideReplay'.
Module '"./ingest/index.js"' has no exported member 'FinalizationResult'.
```

- [ ] **Step 3: Export the finalization types**

In `packages/context/src/ingest/index.ts`, update the existing export block
from `./types.js` so the final lines read:

```ts
  WorkUnit,
  DeterministicProjectionContext,
  ProjectionResult,
  DeterministicFinalizationContext,
  FinalizationOverrideReplay,
  FinalizationResult,
} from './types.js';
```

- [ ] **Step 4: Run type-check and package export coverage**

Run:

```bash
pnpm --filter @ktx/context run type-check
pnpm --filter @ktx/context exec vitest run src/package-exports.test.ts
```

Expected: both commands PASS.

- [ ] **Step 5: Commit the type export closure**

Run:

```bash
git add packages/context/src/ingest/index.ts packages/context/src/package-exports.test.ts
git commit -m "feat(ingest): export finalization adapter contract types"
```

### Task 2: Repair the local historic-SQL finalization acceptance fixture

**Files:**
- Modify: `packages/context/src/ingest/local-bundle-ingest.test.ts`

- [ ] **Step 1: Import the projection helper and finalization types**

In `packages/context/src/ingest/local-bundle-ingest.test.ts`, add this import
after the fake adapter import:

```ts
import { projectHistoricSqlEvidence } from './adapters/historic-sql/projection.js';
```

Replace the existing type import from `./types.js` with:

```ts
import type {
  ChunkResult,
  DeterministicFinalizationContext,
  DiffSet,
  FinalizationResult,
  SourceAdapter,
} from './types.js';
```

- [ ] **Step 2: Add adapter-owned finalization to the test adapter**

In `HistoricSqlEvidenceTestAdapter`, add this method after `chunk()`:

```ts
async finalize(ctx: DeterministicFinalizationContext): Promise<FinalizationResult> {
  // Delegate to the production projection helper so the fixture exercises
  // the same adapter-owned path as the real historic-SQL adapter.
  const projection = await projectHistoricSqlEvidence({
    workdir: ctx.workdir,
    connectionId: ctx.connectionId,
    syncId: ctx.syncId,
    runId: ctx.runId,
    overrideReplay: ctx.overrideReplay,
  });

  return {
    result: projection,
    warnings: projection.warnings,
    errors: [],
    touchedSources: projection.touchedSources,
    changedWikiPageKeys: projection.changedWikiPageKeys,
    actions: projection.actions,
  };
}
```

- [ ] **Step 3: Rename the stale test label**

Change the test name:

```ts
it('runs historic-SQL evidence projection through the local bundle post-processor', async () => {
```

to:

```ts
it('runs historic-SQL evidence projection through local bundle finalization', async () => {
```

- [ ] **Step 4: Run the previously failing focused test**

Run:

```bash
pnpm --filter @ktx/context exec vitest run src/ingest/local-bundle-ingest.test.ts -t "historic-SQL evidence projection"
```

Expected: PASS, and the assertion at
`packages/context/src/ingest/local-bundle-ingest.test.ts:551` receives a
`result.report.body.finalization` object with `status: "success"`.

- [ ] **Step 5: Commit the local acceptance fixture**

Run:

```bash
git add packages/context/src/ingest/local-bundle-ingest.test.ts
git commit -m "test(ingest): exercise historic sql finalization locally"
```

### Task 3: Run final verification

**Files:**
- Verify: `packages/context/src/ingest/finalization-scope.test.ts`
- Verify: `packages/context/src/ingest/ingest-bundle.runner.test.ts`
- Verify: `packages/context/src/ingest/ingest-bundle.runner.isolated-diff.test.ts`
- Verify: `packages/context/src/ingest/adapters/historic-sql/projection.test.ts`
- Verify: `packages/context/src/ingest/local-bundle-ingest.test.ts`
- Verify: `packages/context/src/ingest/adapters/historic-sql/local-ingest-acceptance.test.ts`
- Verify: workspace TypeScript and dead-code checks

- [ ] **Step 1: Run the adapter-owned finalization targeted suite**

Run:

```bash
pnpm --filter @ktx/context exec vitest run src/ingest/finalization-scope.test.ts src/ingest/ingest-bundle.runner.test.ts src/ingest/ingest-bundle.runner.isolated-diff.test.ts src/ingest/adapters/historic-sql/projection.test.ts src/ingest/local-bundle-ingest.test.ts src/ingest/adapters/historic-sql/local-ingest-acceptance.test.ts
```

Expected: PASS with all six test files passing.

- [ ] **Step 2: Run TypeScript validation**

Run:

```bash
pnpm --filter @ktx/context run type-check
```

Expected: PASS.

- [ ] **Step 3: Run dead-code validation**

Run:

```bash
pnpm run dead-code
```

Expected: PASS.

- [ ] **Step 4: Inspect final status**

Run:

```bash
git status --short
```

Expected: the worktree is clean after the two commits, or any remaining
entries are intended changes only.

## Docs impact

No `docs-site/content/docs/` update is required. The remaining v1 work is an
adapter contract type export and test acceptance closure; it does not change
CLI behavior, user configuration, setup flow, connector behavior, or public
documentation examples.

## Self-review

- Spec coverage: The plan covers the remaining adapter API usability gap and
  the failing historic-SQL local finalization acceptance path. The main
  runner, reports, traces, provenance, override replay, and historic-SQL
  production finalization behavior already exist.
- Placeholder scan: The plan contains no placeholder tasks or unspecified
  implementation steps.
- Type consistency: `DeterministicFinalizationContext`,
  `FinalizationOverrideReplay`, and `FinalizationResult` match the existing
  names in `packages/context/src/ingest/types.ts`; the test adapter delegates
  to the existing `projectHistoricSqlEvidence()` result shape.

File diff suppressed because it is too large

@@ -1,443 +0,0 @@

# Adapter-owned ingest finalization design

**Date:** 2026-05-18
**Author:** Andrey Avtomonov
**Status:** Design - pending implementation plan

## Background

The isolated-diff ingestion migration made KTX's shared bundle runner
responsible for one durable execution model: stage raw source data, run
source-planned work units in isolated child worktrees, integrate their diffs,
reconcile, run final gates, and squash the accepted integration tree back into
the project worktree.

That direction is correct, but the current code still has a runner-level
post-processing extension point. `IngestBundleRunnerDeps.postProcessors` maps a
source key to an arbitrary `IngestBundlePostProcessorPort`, and local runtime
wires `historic-sql` to `HistoricSqlProjectionPostProcessor`. That path can
write durable semantic-layer and wiki artifacts after work-unit integration and
reconciliation, outside the source adapter contract.

Historic SQL exposed why the extra path exists. Its table and pattern work units
emit typed evidence, then a deterministic projection step merges the evidence
into `_schema` usage and historic-SQL wiki pages. Some of that work is local to
one work unit, but other behavior is whole-run maintenance: marking stale table
usage, reusing existing pattern pages, and archiving old pattern pages. Those
aggregate decisions do not fit cleanly inside independent per-work-unit writes.

The design goal is to preserve legitimate adapter-owned deterministic
maintenance without keeping a generic runner-level escape hatch.

## Goals

This design tightens the isolated-diff architecture around a stable boundary:
the generic runner owns execution mechanics, and adapters own source semantics.

The design has these goals:

- Remove runner-level `postProcessors` as an alternate durable-write pipeline.
- Add a first-class `SourceAdapter.finalize?()` hook for deterministic
  post-work-unit source maintenance.
- Keep `finalize?()` constrained, observable, and subject to the same final
  validation gates as work-unit and reconciliation changes.
- Preserve historic-SQL aggregate projection behavior without treating it as a
  hidden fallback ingestion path.
- Keep public execution knobs out of the adapter API.

## Non-goals

This design does not rework source-specific chunking, fetch formats, wiki page
frontmatter, semantic-layer YAML, or raw source layouts. It does not replace
agent-authored work units with deterministic projectors. It also does not add a
public `executionMode`, `planningStrategy`, `conflictPolicy`, or source-key
allowlist.

Override ingest remains a special correction operation that reuses a prior raw
snapshot and forces reconciliation. It should be documented and tested as
override replay, not as a fallback pipeline. This design does not require
override ingest to run source work units.

## Locked design direction

The shared ingestion runner keeps one ordered pipeline for sources that can
write durable project artifacts.

```text
fetch raw
-> adapter plans WorkUnit[]
-> optional adapter project
-> isolated WU diffs
-> artifact-aware integration
-> reconciliation
-> optional adapter finalize
-> runner wiki-SL-ref repair
-> final target policy and artifact gates
-> squash
```

The exact implementation may continue to call `chunk()` before `project()` so a
projector can consume `parseArtifacts`. The architectural invariant is that
`project()` runs in the integration worktree before child worktrees start, while
`finalize()` runs in the integration worktree after accepted work-unit and
reconciliation changes are present.

Adapters decide what source-specific work belongs in `project()`, work units,
or `finalize()`. The runner decides when those phases run, captures their git
effects, enforces target scope, runs gates, writes traces and reports, and
squashes the final tree.

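The ordering above can be sketched as a small phase table. This is an illustration under stated assumptions, not the runner's implementation: `PHASES`, `AdapterHooks`, and `plannedPhases` are hypothetical names, and only the `finalization` and `wiki_sl_ref_repair` phase names come from this document.

```typescript
// Phase names are illustrative; only `finalization` and `wiki_sl_ref_repair`
// are taken from the design text.
const PHASES = [
  'fetch_raw',
  'plan_work_units',
  'project',
  'work_unit_diffs',
  'integration',
  'reconciliation',
  'finalization',
  'wiki_sl_ref_repair',
  'final_gates',
  'squash',
] as const;

type Phase = (typeof PHASES)[number];

// Only the optional adapter hooks matter for ordering in this sketch.
interface AdapterHooks {
  project?: () => Promise<void>;
  finalize?: () => Promise<void>;
}

// Returns the phases that would execute, in order, for a given adapter.
// Optional adapter phases are skipped when the hook is not implemented.
function plannedPhases(adapter: AdapterHooks): Phase[] {
  return PHASES.filter((phase) => {
    if (phase === 'project') return adapter.project !== undefined;
    if (phase === 'finalization') return adapter.finalize !== undefined;
    return true; // runner-owned phases always run
  });
}

const finalizeOnly = plannedPhases({ finalize: async () => {} });
console.log(finalizeOnly.join(' -> '));
```

The point of keeping the order in one table is that an adapter can opt in to a phase but cannot move it.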
## Adapter API

The source adapter contract should make deterministic source phases explicit.

```ts
interface SourceAdapter {
  readonly source: string;
  readonly skillNames: string[];
  readonly reconcileSkillNames?: string[];
  readonly evidenceIndexing?: 'documents';
  readonly triageSupported?: boolean;

  getTriageSignals?(stagedDir: string, externalId: string): Promise<TriageSignals>;
  detect(stagedDir: string): Promise<boolean>;
  fetch?(pullConfig: unknown, stagedDir: string, ctx: FetchContext): Promise<void>;
  readFetchReport?(stagedDir: string): Promise<SourceFetchReport | null>;
  listTargetConnectionIds?(stagedDir: string): Promise<string[]>;
  chunk(stagedDir: string, diffSet?: DiffSet): Promise<ChunkResult>;
  clusterWorkUnits?(ctx: ClusterWorkUnitsContext): Promise<WorkUnit[]>;
  project?(ctx: DeterministicProjectionContext): Promise<ProjectionResult>;
  finalize?(ctx: DeterministicFinalizationContext): Promise<FinalizationResult>;
  describeScope?(stagedDir: string): Promise<ScopeDescriptor>;
  onPullSucceeded?(ctx: PullSucceededContext): Promise<void>;
}
```

`finalize?()` is not a compatibility wrapper for old post-processors. It is a
source-adapter method with a fixed location in the runner lifecycle.

```ts
interface DeterministicFinalizationContext {
  connectionId: string;
  sourceKey: string;
  syncId: string;
  jobId: string;
  runId: string;
  stagedDir: string;
  workdir: string;
  parseArtifacts?: unknown;
  stageIndex: StageIndex;
  workUnitOutcomes: WorkUnitOutcome[];
  reconciliationActions: MemoryAction[];
  overrideReplay?: FinalizationOverrideReplay;
}

interface FinalizationResult {
  warnings: string[];
  errors: string[];
  touchedSources: TouchedSlSource[];
  changedWikiPageKeys: string[];
  actions?: MemoryAction[];
  result?: unknown;
}

interface FinalizationOverrideReplay {
  priorJobId: string;
  priorRunId: string;
  priorSyncId: string;
  evictionRawPaths: string[];
}
```

The implementation plan can adjust exact type names to match the existing
module layout, but the contract must preserve these semantics:

- `finalize?()` is deterministic TypeScript code, not an agent loop.
- It runs only in the ingestion integration worktree.
- It may write ordinary durable project files.
- It must report the semantic-layer sources and wiki page keys it believes it
  touched so the runner can verify that declaration against the worktree diff.
- Outside override replay, `stageIndex` is the canonical runner index for
  accepted work-unit actions, touched sources, evictions, reconciliation records,
  and artifact resolutions visible to the current run.
- In override replay, `stageIndex` is a prior-run replay index for work-unit
  facts. It may contain prior-run work-unit actions, touched sources, and
  artifact records, and adapters must not treat those entries as current-run
  evidence. The runner must not replay prior-report `evictionsApplied` as
  current-run eviction evidence. If override reconciliation records eviction
  decisions, those records are fresh current-run `stageIndex.evictionsApplied`
  entries.
- `workUnitOutcomes` contains only work units executed in the current run. It
  is empty when override replay skips source work units.
- `reconciliationActions` contains only accepted reconciliation writes emitted
  through the reconciliation tool session in the current run. These actions have
  already mutated the integration worktree.
- `overrideReplay` being present is the canonical signal that source work units
  did not produce current-run evidence unless another context field explicitly
  carries fresh current-run deterministic input.
- `overrideReplay.evictionRawPaths` contains the deleted raw paths loaded from
  the prior report's `evictionInputs` for the reused raw snapshot. It is the
  only override-replay raw-path allowlist for removed-from-snapshot provenance.
  It is not, by itself, proof that a particular durable artifact is stale or was
  observed by current-run work units.
- `actions` in `FinalizationResult` are descriptive records for finalization
  writes that the adapter already performed. The runner must not re-apply them.
  When finalization actions are intended to create provenance rows, they must
  carry defensible `rawPaths`: current-snapshot paths from the current raw
  snapshot, removed-from-snapshot paths from current-run
  `stageIndex.evictionsApplied`, or removed-from-snapshot paths from
  `overrideReplay.evictionRawPaths` when override replay is present.
  Finalization actions without defensible raw-path attribution are still
  reported, but the runner must exclude them from provenance and surface that
  exclusion explicitly.
- It cannot mutate the main project worktree directly.
- The finalization context must not pass a root-scoped service that can bypass
  the integration worktree. `workdir` is the durable write boundary. If a future
  helper is added to the context, the contract must name it as worktree-scoped
  and state whether it is read-only or allowed to write.

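The raw-path rule for finalization actions can be sketched as a partition step. This is a minimal sketch, not the runner's code: `ActionLike`, `DefensibleRawPaths`, and `partitionForProvenance` are hypothetical names, the real `MemoryAction` type carries more fields, and requiring every raw path of an action to be defensible is an assumption the text does not spell out.

```typescript
// Hypothetical, simplified stand-in for a finalization action.
interface ActionLike {
  artifactKey: string;
  rawPaths?: string[];
}

// The three raw-path origins the contract accepts for provenance.
interface DefensibleRawPaths {
  currentSnapshot: ReadonlySet<string>;     // paths in the current raw snapshot
  currentRunEvictions: ReadonlySet<string>; // from current-run stageIndex.evictionsApplied
  overrideEvictions?: ReadonlySet<string>;  // overrideReplay.evictionRawPaths, override replay only
}

function isDefensible(rawPath: string, allowed: DefensibleRawPaths): boolean {
  return (
    allowed.currentSnapshot.has(rawPath) ||
    allowed.currentRunEvictions.has(rawPath) ||
    (allowed.overrideEvictions?.has(rawPath) ?? false)
  );
}

// Splits actions into those that may create provenance rows and those that
// are only reported. Assumption: every raw path on an action must be defensible.
function partitionForProvenance(
  actions: readonly ActionLike[],
  allowed: DefensibleRawPaths,
): { included: ActionLike[]; excluded: ActionLike[] } {
  const included: ActionLike[] = [];
  const excluded: ActionLike[] = [];
  for (const action of actions) {
    const rawPaths = action.rawPaths ?? [];
    const ok = rawPaths.length > 0 && rawPaths.every((p) => isDefensible(p, allowed));
    (ok ? included : excluded).push(action);
  }
  return { included, excluded };
}
```

Excluded actions stay in the returned structure so a caller can surface the exclusion in the report instead of silently dropping it.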
The existing adapter API fields unrelated to deterministic projection and
finalization remain part of the contract. Adding `finalize?()` must not remove
triage or evidence-indexing support.

## Override replay

Override ingest remains a replay of a prior raw snapshot with forced
reconciliation. It does not execute source work units or call `adapter.chunk()`
in this design, so finalization must not silently assume fresh work-unit
evidence exists.

The runner should still enter the finalization phase for adapters that
implement `finalize?()`, but it must pass explicit override metadata. In that
mode, `workUnitOutcomes` is empty, `parseArtifacts` is absent,
`overrideReplay.evictionRawPaths` is populated from the prior report's
`evictionInputs`, `stageIndex` comes from the prior report with prior
`evictionsApplied` excluded, and `reconciliationActions` contains only new
override reconciliation actions.

If a future implementation intentionally re-parses the materialized override
raw snapshot, it must expose that fact through an explicit override-safe context
field instead of relying on `parseArtifacts` alone. `parseArtifacts` by itself
is never current work-unit evidence in override replay and never authorizes
historic-SQL whole-run cleanup.

Adapters must treat missing current-run deterministic inputs as a no-op, not as
negative evidence. For historic SQL, override replay must not mark tables stale,
mark pattern pages stale, or archive pattern pages from an empty current-run
evidence directory. Whole-run cleanup can run only when `overrideReplay` is
absent and current-run work-unit evidence exists, or when a future explicit
override-safe context field names equivalent facts. Any override-safe
finalization must be derived from the materialized raw snapshot or explicit
prior-report data. In particular, prior-run
`stageIndex.workUnits[*].actions`, prior-run touched sources, and prior-run
artifact records are not proof that the current override run observed or failed
to observe those artifacts.

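The cleanup rule in this section reduces to a small guard. The sketch below is illustrative only: `CleanupGateInput`, `currentRunEvidenceCount`, and `decideWholeRunCleanup` are hypothetical names, and it deliberately omits the "future explicit override-safe context field" case because no such field exists in the contract yet.

```typescript
// Mirrors FinalizationOverrideReplay from the contract above.
interface OverrideReplayLike {
  priorJobId: string;
  priorRunId: string;
  priorSyncId: string;
  evictionRawPaths: string[];
}

// Hypothetical input: the count stands in for however an adapter detects
// current-run work-unit evidence.
interface CleanupGateInput {
  overrideReplay?: OverrideReplayLike;
  currentRunEvidenceCount: number;
}

type CleanupDecision =
  | { run: true }
  | { run: false; reason: 'override_replay' | 'no_current_run_evidence' };

// Whole-run cleanup (stale tables, stale or archived pattern pages) runs only
// when this is not an override replay and current-run evidence exists.
function decideWholeRunCleanup(input: CleanupGateInput): CleanupDecision {
  if (input.overrideReplay !== undefined) {
    return { run: false, reason: 'override_replay' };
  }
  if (input.currentRunEvidenceCount === 0) {
    // Missing evidence is a no-op, not negative evidence.
    return { run: false, reason: 'no_current_run_evidence' };
  }
  return { run: true };
}
```

Returning a reason rather than a bare boolean lets the adapter record why cleanup was skipped in its warnings or result payload.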
## Runner responsibilities

The runner owns all reusable mechanics around `finalize?()`.

After reconciliation completes, the runner calls `adapter.finalize?()` if it
exists. The runner captures the pre-finalization commit, derives the
finalization changed paths from the integration-worktree git diff, commits those
changes, records the commit SHA and touched paths in the run trace/report,
includes finalization actions in saved-memory counts, and runs wiki-SL-ref
repair before final target-policy and artifact gates.

The integration-worktree diff is the source of truth for finalization touched
paths, changed wiki page keys, and semantic-layer paths. The adapter's
`touchedSources` and `changedWikiPageKeys` declaration is a verification input,
not the downstream authority. The runner must derive the final repair and gate
scope from the diff, cross-check the adapter declaration against that diff, and
fail the run on under-reporting or over-reporting that would make wiki-SL-ref
repair, target-policy checks, final gates, reports, traces, or provenance use a
different artifact set from the actual finalization commit.

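The cross-check between the adapter declaration and the diff-derived scope can be sketched as a set comparison. This is a simplified illustration: `compareDeclaration` and `assertDeclarationMatchesDiff` are hypothetical helpers, and artifacts are reduced to string keys (for example a wiki page key, or `connectionId/sourceName` for a semantic-layer source).

```typescript
interface DeclarationMismatch {
  underReported: string[]; // present in the diff, missing from the declaration
  overReported: string[];  // declared by the adapter, absent from the diff
}

// Compares adapter-declared keys with runner-derived keys. The derived set is
// the authority; the declaration is only a verification input.
function compareDeclaration(
  declared: readonly string[],
  derived: readonly string[],
): DeclarationMismatch {
  const declaredSet = new Set(declared);
  const derivedSet = new Set(derived);
  return {
    underReported: Array.from(derivedSet).filter((k) => !declaredSet.has(k)).sort(),
    overReported: Array.from(declaredSet).filter((k) => !derivedSet.has(k)).sort(),
  };
}

// Fails the run on any mismatch instead of silently trusting either side.
function assertDeclarationMatchesDiff(
  declared: readonly string[],
  derived: readonly string[],
): void {
  const { underReported, overReported } = compareDeclaration(declared, derived);
  if (underReported.length > 0 || overReported.length > 0) {
    throw new Error(
      `finalization declaration mismatch: under-reported=[${underReported.join(', ')}] ` +
        `over-reported=[${overReported.join(', ')}]`,
    );
  }
}
```

Treating both directions as failures matches the requirement that repair, gates, reports, and provenance all see the same artifact set as the finalization commit.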
The runner-derived semantic-layer scope must include logical
`TouchedSlSource` tuples, not only file paths. Standalone semantic-layer files
under `semantic-layer/<connectionId>/<sourceName>.yaml` can map structurally to
`{ connectionId, sourceName }`. Aggregate semantic-layer files, including
`semantic-layer/<connectionId>/_schema/*.yaml`, must be resolved by comparing
the pre-finalization and post-finalization materialized semantic-layer sources
with the worktree-scoped semantic-layer parser/loader. Wiki page keys continue
to map structurally from `wiki/global/<pageKey>.md`. If the runner cannot
resolve a changed semantic-layer path to logical touched sources with its own
resolver, the run must fail; it must not fall back to the adapter declaration as
the downstream scope.

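The structural part of that mapping can be sketched with path patterns. This is a sketch under assumptions: `classifyChangedPath` and `ResolvedPath` are hypothetical names, the patterns assume forward-slash paths relative to the worktree root, and page keys are assumed to be allowed to contain `/`. Aggregate `_schema` files are only flagged here; resolving them to logical sources still needs the before/after loader comparison described above.

```typescript
type ResolvedPath =
  | { kind: 'sl_source'; connectionId: string; sourceName: string }
  | { kind: 'sl_aggregate'; connectionId: string } // needs loader-based resolution
  | { kind: 'wiki_page'; pageKey: string }
  | { kind: 'other' };

// Classifies one changed path from the finalization diff.
function classifyChangedPath(path: string): ResolvedPath {
  const pageKey = /^wiki\/global\/(.+)\.md$/.exec(path)?.[1];
  if (pageKey !== undefined) return { kind: 'wiki_page', pageKey };

  // Aggregate files are checked before standalone files on purpose.
  const aggregateConn = /^semantic-layer\/([^/]+)\/_schema\/[^/]+\.yaml$/.exec(path)?.[1];
  if (aggregateConn !== undefined) return { kind: 'sl_aggregate', connectionId: aggregateConn };

  const standalone = /^semantic-layer\/([^/]+)\/([^/]+)\.yaml$/.exec(path);
  const connectionId = standalone?.[1];
  const sourceName = standalone?.[2];
  if (connectionId !== undefined && sourceName !== undefined) {
    return { kind: 'sl_source', connectionId, sourceName };
  }
  return { kind: 'other' };
}

console.log(classifyChangedPath('semantic-layer/warehouse/orders.yaml'));
```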
`wiki_sl_ref_repair` remains a runner mechanic, not an adapter method. It runs
after finalization and before final gates, and it uses the normal target
connection set plus the runner-derived finalization touched sources to decide
which semantic-layer references are visible. Its writes are part of the same
integration worktree diff as finalization/reconciliation, so target-policy
checks, final artifact gates, reports, traces, and squash behavior cover those
writes before changes reach the main project worktree.

The runner must treat finalization like deterministic projection and
reconciliation, not like a free-form source-key plug-in. It must enforce the
same target-connection policy used for work-unit and reconciliation changes.
If finalization writes an unauthorized semantic-layer target, modifies artifacts
outside the authorized target set, references a missing semantic-layer entity, or
returns errors, the run fails before changes reach the main project worktree.

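Two of those failure conditions (unauthorized targets and adapter-returned errors) can be sketched as a single check. This is a partial illustration with hypothetical names (`TouchedSourceLike`, `finalizationFailures`); it does not model missing-entity references or non-semantic-layer artifacts.

```typescript
// Simplified stand-in for TouchedSlSource.
interface TouchedSourceLike {
  connectionId: string;
  sourceName: string;
}

// Collects reasons the run must fail after finalization. An empty result
// means these particular checks passed.
function finalizationFailures(
  derivedTouchedSources: readonly TouchedSourceLike[],
  authorizedConnectionIds: ReadonlySet<string>,
  adapterErrors: readonly string[],
): string[] {
  const failures = adapterErrors.map((message) => `adapter error: ${message}`);
  for (const source of derivedTouchedSources) {
    if (!authorizedConnectionIds.has(source.connectionId)) {
      failures.push(`unauthorized target: ${source.connectionId}/${source.sourceName}`);
    }
  }
  return failures;
}
```

The touched sources passed in are the runner-derived ones, so an adapter cannot pass the policy check by under-declaring what it wrote.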
The runner should expose one trace phase named `finalization`. It should not
keep a `post_processor` stage, `IngestBundlePostProcessorPort`,
`deps.postProcessors`, or report fields that imply a parallel post-processor
pipeline.

## Adapter application

Each adapter continues to use the same generic runner mechanics, while keeping
source-specific choices inside the adapter.

- `metabase` fetches cards and dashboards, computes scope, plans
  card/dashboard work units, and usually does not need `project()` or
  `finalize()`.
- `notion` fetches pages, extracts triage signals, clusters page work units,
  and usually does not need deterministic finalization.
- `dbt` fetches the repository, parses dbt project metadata, plans model work
  units, and may later add `project()` if dbt YAML import becomes deterministic.
- `lookml` fetches LookML, produces validation artifacts, plans model and
  explore work units, and may later add `project()` for deterministic LookML to
  semantic-layer import.
- `looker` fetches runtime bundles, fetch reports, target connections, and
  triage signals. It continues to rely on work-unit diffs and shared gates.
- `metricflow` is the current strong `project()` example. It imports
  authoritative semantic models before child worktrees start, then lets any
  work units observe those projected files.
- `live-database` can remain work-unit based, but database schema introspection
  is a good future `project()` candidate because the schema is authoritative
  structured metadata.
- `historic-sql` should move current post-processor behavior into the adapter.
  Local table-usage and pattern-page writes may move into work-unit tools where
  they are genuinely per-unit. Whole-run maintenance such as stale table usage,
  pattern-page reuse, and stale/archive page decisions belongs in
  `HistoricSqlSourceAdapter.finalize()`.
- `fake` remains a test adapter and does not need deterministic phases.

## Historic-SQL migration
|
||||
|
||||
Historic SQL should stop using evidence-only tool output plus runner-level
|
||||
post-processing as its durable projection path.
|
||||
|
||||
The preferred migration is:
|
||||
|
||||
1. Keep historic-SQL work units responsible for source-shaped analysis.
|
||||
2. Use source-specific tools for per-unit durable writes when the output is
|
||||
local to that unit, such as a table's usage metadata or one pattern page.
|
||||
3. Move whole-run deterministic cleanup into
|
||||
`HistoricSqlSourceAdapter.finalize()`.
|
||||
4. Delete `HistoricSqlProjectionPostProcessor`, `IngestBundlePostProcessorPort`,
|
||||
`deps.postProcessors`, and `post_processor` memory-flow/report stages.
|
||||
|
||||
If the implementation keeps typed evidence as an internal handoff between
|
||||
historic-SQL work units and `finalize()`, that evidence must be framed as
|
||||
source-specific input to the adapter's deterministic finalization, not as a
|
||||
generic runner post-processing mechanism. The evidence files must not become a
|
||||
public compatibility surface.
|
||||
|
||||
Historic-SQL finalization must distinguish "no current-run evidence exists"
|
||||
from "the current snapshot proves this artifact is stale." Whole-run cleanup
|
||||
such as stale table usage, pattern-page staleness, and archive decisions can
|
||||
run only when finalization has current-run historic-SQL evidence or an explicit
|
||||
override-safe source of equivalent facts.
|
||||
|
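The distinction above can be sketched as a small decision function. This is a minimal illustration, not the adapter's actual code: `EvidenceState`, `CleanupDecision`, and `decideTableUsageCleanup` are hypothetical names, and the real finalization context may carry evidence in a different shape.

```typescript
// Hypothetical shapes for illustration only.
type EvidenceState =
  | { kind: "none" } // no current-run evidence exists
  | { kind: "current"; seenTables: ReadonlySet<string> };

type CleanupDecision =
  | { action: "skip"; reason: string }
  | { action: "keep"; table: string }
  | { action: "mark_stale"; table: string };

// Whole-run cleanup runs only when current-run evidence is present.
// Missing evidence never proves staleness, so it yields a single skip.
function decideTableUsageCleanup(
  existingTables: readonly string[],
  evidence: EvidenceState,
): CleanupDecision[] {
  if (evidence.kind === "none") {
    return [{ action: "skip", reason: "no current-run historic-SQL evidence" }];
  }
  return existingTables.map((table): CleanupDecision =>
    evidence.seenTables.has(table)
      ? { action: "keep", table }
      : { action: "mark_stale", table },
  );
}
```

An override-safe source of equivalent facts, if one is added, could be modeled as another `EvidenceState` variant rather than being folded into `"current"`.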
||||
## Reports and observability
|
||||
|
||||
Reports should describe first-class pipeline phases, not historical extension
|
||||
points. The isolated-diff summary should include finalization metadata when the
|
||||
adapter implements `finalize?()`: whether it ran, finalization commit SHA,
|
||||
touched paths, touched semantic-layer sources, changed wiki page keys,
|
||||
warnings, descriptive finalization actions, and source-specific result payload.
|
||||
|
||||
Saved-memory counts should come from work-unit, reconciliation, and
|
||||
finalization memory actions plus touched artifact reporting. Finalization
|
||||
actions are reporting/provenance records for writes that already happened in
|
||||
the integration worktree; they are not a second write channel. There should be
|
||||
no special `postProcessorSavedMemoryCounts` or `postProcessor` report body.
|
||||
Memory-flow phases should use `finalization` instead of `post_processor`.
|
||||
|
||||
The runner owns provenance for finalization. Adapters return touched artifacts
|
||||
and optional descriptive actions, but they do not call the provenance port.
|
||||
When finalization actions include valid `rawPaths`, the runner folds them into
|
||||
the normal provenance plan using the current `sourceKey`, `syncId`, raw content
|
||||
hashes, artifact kind, artifact key, target connection, and action type. The
|
||||
finalization phase and commit SHA belong in trace/report metadata; they should
|
||||
not be fabricated inside adapter-written files.
|
||||
|
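One way to picture the runner-side provenance step is a partition of finalization actions into those with attributable raw paths and those that must be reported as excluded. The sketch below assumes a hypothetical `FinalizationAction` shape (only `rawPaths` is named in this document) and treats a raw path as valid when it appears in a set of attributable paths supplied by the runner; the real validity rule may differ.

```typescript
// Hypothetical action shape; only `rawPaths` comes from the text above.
interface FinalizationAction {
  actionType: string;
  artifactKey: string;
  rawPaths?: readonly string[];
}

interface ExcludedAction {
  action: FinalizationAction;
  reason: string;
}

// Splits actions into those the runner can fold into the provenance plan
// and those it must name in the report as excluded.
function partitionForProvenance(
  actions: readonly FinalizationAction[],
  attributableRawPaths: ReadonlySet<string>,
): { included: FinalizationAction[]; excluded: ExcludedAction[] } {
  const included: FinalizationAction[] = [];
  const excluded: ExcludedAction[] = [];
  for (const action of actions) {
    const rawPaths = action.rawPaths ?? [];
    if (rawPaths.length === 0) {
      excluded.push({ action, reason: "no raw paths declared" });
      continue;
    }
    const unknown = rawPaths.filter((p) => !attributableRawPaths.has(p));
    if (unknown.length > 0) {
      excluded.push({
        action,
        reason: `unattributable raw paths: ${unknown.join(", ")}`,
      });
      continue;
    }
    included.push(action);
  }
  return { included, excluded };
}
```

Keeping the excluded list explicit, with a reason per action, matches the requirement that exclusions are reported rather than silently dropped.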
||||
Finalization reports must show both the adapter-declared touched artifacts and
|
||||
the runner-derived touched artifacts from the finalization git diff. When those
|
||||
sets differ, the report and trace must include the mismatch and the run must
|
||||
fail before wiki-SL-ref repair or final gates rely on the wrong scope. When a
|
||||
finalization action is excluded from provenance because no defensible raw path
|
||||
exists, the report must name the action and reason instead of silently dropping
|
||||
it.
|
||||
|
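The cross-check between adapter-declared and runner-derived touched paths amounts to a symmetric set difference. A minimal sketch, assuming both sides are available as plain path lists (`compareTouchedPaths` and `TouchedPathMismatch` are hypothetical names):

```typescript
interface TouchedPathMismatch {
  declaredOnly: string[]; // declared by the adapter, absent from the git diff
  derivedOnly: string[]; // present in the git diff, not declared by the adapter
}

// Returns null when both sets agree, otherwise the two one-sided differences.
function compareTouchedPaths(
  declared: readonly string[],
  derived: readonly string[],
): TouchedPathMismatch | null {
  const declaredSet = new Set(declared);
  const derivedSet = new Set(derived);
  const declaredOnly = Array.from(declaredSet)
    .filter((p) => !derivedSet.has(p))
    .sort();
  const derivedOnly = Array.from(derivedSet)
    .filter((p) => !declaredSet.has(p))
    .sort();
  return declaredOnly.length === 0 && derivedOnly.length === 0
    ? null
    : { declaredOnly, derivedOnly };
}
```

A non-null result would be written to both the report and the trace, and the run would fail before wiki-SL-ref repair or final gates consume the touched scope.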
||||
Traces must make finalization useful for postmortems. At minimum, record
|
||||
`finalization_started`, `finalization_committed`, `finalization_skipped`, and
|
||||
`finalization_failed` events with source key, touched paths, warnings, and
|
||||
error summaries.
|
||||
|
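The four trace events could be modeled as a discriminated union so that each event carries only the fields that make sense for it. The event names come from this document; the field names (`sourceKey`, `commitSha`, `touchedPaths`, `warnings`, `reason`, `errorSummary`) and the `describeEvent` helper are assumptions for illustration.

```typescript
// Hypothetical event payloads keyed by the event names listed above.
type FinalizationTraceEvent =
  | { event: "finalization_started"; sourceKey: string }
  | {
      event: "finalization_committed";
      sourceKey: string;
      commitSha: string;
      touchedPaths: string[];
      warnings: string[];
    }
  | { event: "finalization_skipped"; sourceKey: string; reason: string }
  | {
      event: "finalization_failed";
      sourceKey: string;
      touchedPaths: string[];
      warnings: string[];
      errorSummary: string;
    };

// Produces a one-line summary suitable for a postmortem log.
function describeEvent(e: FinalizationTraceEvent): string {
  switch (e.event) {
    case "finalization_started":
      return `${e.sourceKey}: finalization started`;
    case "finalization_committed":
      return `${e.sourceKey}: committed ${e.commitSha} (${e.touchedPaths.length} paths, ${e.warnings.length} warnings)`;
    case "finalization_skipped":
      return `${e.sourceKey}: skipped (${e.reason})`;
    case "finalization_failed":
      return `${e.sourceKey}: failed (${e.errorSummary})`;
  }
}
```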
||||
## Failure handling
|
||||
|
||||
Finalization failures are ingestion failures. If `finalize?()` returns errors,
|
||||
throws, writes unauthorized targets, or causes final gates to fail, the runner
|
||||
marks the run failed and leaves the main project worktree unchanged.
|
||||
|
||||
Finalization should run after reconciliation because it may need to inspect the
|
||||
accepted work-unit and reconciliation result. Final gates should run after
|
||||
finalization because finalization writes durable project artifacts.
|
||||
|
||||
Finalization must not be used to repair arbitrary integration conflicts or
|
||||
rerun agent work. Conflict repair remains part of artifact-aware integration and
|
||||
reconciliation.
|
||||
|
||||
Finalization must also preserve reconciliation and accepted work-unit writes
|
||||
from the same run. The runner must remember the paths changed before
|
||||
finalization and fail if `finalize?()` modifies the same path after
|
||||
reconciliation. If a source needs deterministic maintenance for an artifact
|
||||
created or edited by a work unit in the same run, that behavior belongs in the
|
||||
source-specific work-unit tool or in a later run, not in post-reconciliation
|
||||
finalization.
|
||||
|
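The same-run overlap rule can be sketched as an intersection check between the paths changed before finalization and the paths changed by it. The function names and the plain path-list inputs below are assumptions, not the runner's real API.

```typescript
// Returns the sorted list of paths changed both before and during finalization.
function findOverlappingPaths(
  changedBeforeFinalization: readonly string[],
  changedByFinalization: readonly string[],
): string[] {
  const earlier = new Set(changedBeforeFinalization);
  return Array.from(new Set(changedByFinalization))
    .filter((p) => earlier.has(p))
    .sort();
}

// Throws when finalization rewrote a path that an accepted work unit or
// reconciliation had already changed in the same run.
function assertNoFinalizationOverlap(
  changedBeforeFinalization: readonly string[],
  changedByFinalization: readonly string[],
): void {
  const overlaps = findOverlappingPaths(
    changedBeforeFinalization,
    changedByFinalization,
  );
  if (overlaps.length > 0) {
    throw new Error(
      `finalize() modified paths already changed in this run: ${overlaps.join(", ")}`,
    );
  }
}
```

In this sketch the check is purely path-based, so it would reject even a byte-identical rewrite of an earlier path; whether the real runner compares content as well is not specified here.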
||||
## Acceptance criteria
|
||||
|
||||
The implementation is complete when these conditions are true:
|
||||
|
||||
- No production runtime wiring references `deps.postProcessors`.
|
||||
- `IngestBundlePostProcessorPort` and `HistoricSqlProjectionPostProcessor` are
|
||||
removed from source exports and package export tests.
|
||||
- `SourceAdapter.finalize?()` exists with typed context and result objects.
|
||||
- The runner invokes `finalize?()` after reconciliation and before final gates.
|
||||
- Finalization changes are committed in the integration worktree and included
|
||||
in target-policy checks, final gates, reports, traces, and provenance inputs.
|
||||
- Override replay passes explicit override metadata to finalization, including
|
||||
`overrideReplay.evictionRawPaths`; leaves `workUnitOutcomes` empty when work
|
||||
units are skipped; omits `parseArtifacts` unless a future explicit
|
||||
override-safe input is added; and proves historic-SQL finalization does not
|
||||
use prior-run `stageIndex` records as current-run evidence or mark artifacts
|
||||
stale or archived merely because current-run evidence is missing.
|
||||
- Finalization provenance uses current raw paths, current-run
|
||||
`stageIndex.evictionsApplied`, or `overrideReplay.evictionRawPaths`, and
|
||||
actions without defensible raw-path attribution are reported as excluded from
|
||||
provenance.
|
||||
- The runner derives finalization touched paths, wiki page keys, and
|
||||
semantic-layer scope from the integration-worktree git diff, resolves
|
||||
aggregate semantic-layer files such as `_schema/*.yaml` to logical touched
|
||||
sources with the runner's own semantic-layer parser/loader, cross-checks the
|
||||
adapter's touched-artifact declaration, and fails on mismatches or
|
||||
unresolvable changed semantic-layer paths.
|
||||
- The runner fails when finalization modifies a path already changed by accepted
|
||||
work-unit or reconciliation writes in the same run.
|
||||
- `wiki_sl_ref_repair` remains a runner-owned step after finalization and
|
||||
before final gates, consumes runner-derived finalization touched sources, and
|
||||
has its writes covered by target-policy checks and final gates.
|
||||
- Finalization `actions` are not re-applied by the runner; they are included
|
||||
only in reporting, saved-memory counts, and provenance planning when their
|
||||
raw-path attribution is valid.
|
||||
- Historic SQL uses adapter-owned finalization for whole-run projection
|
||||
maintenance.
|
||||
- Tests cover a successful finalization, a finalization failure, unauthorized
|
||||
finalization target rejection, override replay finalization behavior,
|
||||
wiki-SL-ref repair placement, and historic-SQL projection behavior without
|
||||
runner-level post-processors.
|
||||