mirror of https://github.com/Kaelio/ktx.git, synced 2026-06-07 07:55:13 +02:00
Refine adapter-owned ingest finalization design after adversarial review iteration 3
parent a4a74c3822
commit e8d461fcb8
1 changed file with 68 additions and 27 deletions
@@ -145,6 +145,7 @@ interface FinalizationOverrideReplay {
   priorJobId: string;
   priorRunId: string;
   priorSyncId: string;
+  evictionRawPaths: string[];
 }
 ```
 
@@ -159,9 +160,13 @@ module layout, but the contract must preserve these semantics:
 - Outside override replay, `stageIndex` is the canonical runner index for
   accepted work-unit actions, touched sources, evictions, reconciliation records,
   and artifact resolutions visible to the current run.
-- In override replay, `stageIndex` is a prior-run replay index. It may contain
-  prior-run work-unit actions, touched sources, and artifact records, and
-  adapters must not treat those entries as current-run evidence.
+- In override replay, `stageIndex` is a prior-run replay index for work-unit
+  facts. It may contain prior-run work-unit actions, touched sources, and
+  artifact records, and adapters must not treat those entries as current-run
+  evidence. The runner must not replay prior-report `evictionsApplied` as
+  current-run eviction evidence. If override reconciliation records eviction
+  decisions, those records are fresh current-run `stageIndex.evictionsApplied`
+  entries.
 - `workUnitOutcomes` contains only work units executed in the current run. It
   is empty when override replay skips source work units.
 - `reconciliationActions` contains only accepted reconciliation writes emitted
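
The `stageIndex` rule in the hunk above could be sketched roughly as follows. This is only an illustration under stated assumptions: `StageIndexSketch`, `buildOverrideStageIndex`, and the flat string shapes are hypothetical; only `stageIndex`, `workUnits[*].actions`, and `evictionsApplied` are names taken from the design text.

```typescript
// Hypothetical, simplified shape of a stage index. The real structure is not
// shown in this diff; only the field names come from the design text.
interface StageIndexSketch {
  workUnits: { id: string; actions: string[] }[];
  evictionsApplied: string[]; // assumed here to be a flat list of raw paths
}

// Build the index handed to finalization during override replay.
// Prior-run work-unit facts are carried over for replay purposes only;
// prior-report evictions are dropped, and only eviction decisions recorded
// by the current override reconciliation are kept.
function buildOverrideStageIndex(
  priorReportIndex: StageIndexSketch,
  freshOverrideEvictions: string[],
): StageIndexSketch {
  return {
    workUnits: priorReportIndex.workUnits,
    evictionsApplied: freshOverrideEvictions.slice(),
  };
}
```

The point of the sketch is that prior `evictionsApplied` never leaks into the current run: an override run that records no eviction decisions ends up with an empty list, regardless of what the prior report contained.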
@@ -170,14 +175,21 @@ module layout, but the contract must preserve these semantics:
 - `overrideReplay` being present is the canonical signal that source work units
   did not produce current-run evidence unless another context field explicitly
   carries fresh current-run deterministic input.
+- `overrideReplay.evictionRawPaths` contains the deleted raw paths loaded from
+  the prior report's `evictionInputs` for the reused raw snapshot. It is the
+  only override-replay raw-path allowlist for removed-from-snapshot provenance.
+  It is not, by itself, proof that a particular durable artifact is stale or was
+  observed by current-run work units.
 - `actions` in `FinalizationResult` are descriptive records for finalization
   writes that the adapter already performed. The runner must not re-apply them.
   When finalization actions are intended to create provenance rows, they must
   carry defensible `rawPaths`: current-snapshot paths from the current raw
-  snapshot, or removed-from-snapshot paths derived from `stageIndex.evictionsApplied`
-  or the equivalent override-replay eviction metadata. Finalization actions
-  without defensible raw-path attribution are still reported, but the runner must
-  exclude them from provenance and surface that exclusion explicitly.
+  snapshot, removed-from-snapshot paths from current-run
+  `stageIndex.evictionsApplied`, or removed-from-snapshot paths from
+  `overrideReplay.evictionRawPaths` when override replay is present.
+  Finalization actions without defensible raw-path attribution are still
+  reported, but the runner must exclude them from provenance and surface that
+  exclusion explicitly.
 - It cannot mutate the main project worktree directly.
 - The finalization context must not pass a root-scoped service that can bypass
   the integration worktree. `workdir` is the durable write boundary. If a future
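
One way to read the raw-path attribution rule above is as an allowlist check. The sketch below is illustrative only: `FinalizationAction`, `ProvenanceInputs`, `partitionForProvenance`, and the `target` field are hypothetical names, and it assumes that "defensible" means every claimed raw path appears in one of the three allowed sources and that at least one path is claimed.

```typescript
// Hypothetical shapes; only `rawPaths` and `evictionRawPaths` are names from the design text.
interface FinalizationAction {
  target: string;     // hypothetical: artifact path the adapter already wrote
  rawPaths: string[]; // raw paths claimed as provenance for that write
}

interface ProvenanceInputs {
  currentSnapshotPaths: string[];   // paths present in the current raw snapshot
  currentRunEvictedPaths: string[]; // assumed flattened from current-run stageIndex.evictionsApplied
  overrideReplay?: { evictionRawPaths: string[] };
}

interface ProvenanceSplit {
  included: FinalizationAction[]; // eligible for provenance rows
  excluded: FinalizationAction[]; // still reported, but surfaced as excluded
}

function partitionForProvenance(
  actions: FinalizationAction[],
  inputs: ProvenanceInputs,
): ProvenanceSplit {
  // The override allowlist only contributes when override replay is present.
  const overridePaths = inputs.overrideReplay ? inputs.overrideReplay.evictionRawPaths : [];
  const allowed = inputs.currentSnapshotPaths
    .concat(inputs.currentRunEvictedPaths)
    .concat(overridePaths);

  const split: ProvenanceSplit = { included: [], excluded: [] };
  for (const action of actions) {
    const defensible =
      action.rawPaths.length > 0 &&
      action.rawPaths.every((p) => allowed.indexOf(p) !== -1);
    (defensible ? split.included : split.excluded).push(action);
  }
  return split;
}
```

Returning both lists, rather than silently filtering, mirrors the requirement that excluded actions are still reported and that the exclusion is explicit.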
@@ -191,24 +203,35 @@ triage or evidence-indexing support.
 ## Override replay
 
 Override ingest remains a replay of a prior raw snapshot with forced
-reconciliation. It does not execute source work units, so finalization must not
-silently assume fresh work-unit evidence exists.
+reconciliation. It does not execute source work units or call `adapter.chunk()`
+in this design, so finalization must not silently assume fresh work-unit
+evidence exists.
 
 The runner should still enter the finalization phase for adapters that
 implement `finalize?()`, but it must pass explicit override metadata. In that
-mode, `workUnitOutcomes` is empty, `parseArtifacts` is absent unless the runner
-created fresh parse artifacts in the current run, `stageIndex` comes from the
-prior report, and `reconciliationActions` contains only new override
-reconciliation actions.
+mode, `workUnitOutcomes` is empty, `parseArtifacts` is absent,
+`overrideReplay.evictionRawPaths` is populated from the prior report's
+`evictionInputs`, `stageIndex` comes from the prior report with prior
+`evictionsApplied` excluded, and `reconciliationActions` contains only new
+override reconciliation actions.
+
+If a future implementation intentionally re-parses the materialized override
+raw snapshot, it must expose that fact through an explicit override-safe context
+field instead of relying on `parseArtifacts` alone. `parseArtifacts` by itself
+is never current work-unit evidence in override replay and never authorizes
+historic-SQL whole-run cleanup.
 
 Adapters must treat missing current-run deterministic inputs as a no-op, not as
 negative evidence. For historic SQL, override replay must not mark tables stale,
 mark pattern pages stale, or archive pattern pages from an empty current-run
-evidence directory. Any override-safe finalization must be derived from the
-materialized raw snapshot or explicit prior-report data. In particular,
-prior-run `stageIndex.workUnits[*].actions`, prior-run touched sources, and
-prior-run artifact records are not proof that the current override run observed
-or failed to observe those artifacts.
+evidence directory. Whole-run cleanup can run only when `overrideReplay` is
+absent and current-run work-unit evidence exists, or when a future explicit
+override-safe context field names equivalent facts. Any override-safe
+finalization must be derived from the materialized raw snapshot or explicit
+prior-report data. In particular, prior-run
+`stageIndex.workUnits[*].actions`, prior-run touched sources, and prior-run
+artifact records are not proof that the current override run observed or failed
+to observe those artifacts.
 
 ## Runner responsibilities
 
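
The whole-run cleanup condition in the hunk above can be expressed as a small guard. This is a sketch, not the adapter's actual code: `FinalizationContextSketch` and `canRunWholeRunCleanup` are hypothetical, and it assumes "current-run work-unit evidence exists" can be approximated by a non-empty `workUnitOutcomes`. The future override-safe context field mentioned in the text is not modeled because the design does not name it yet.

```typescript
// Hypothetical, reduced view of the finalization context. Field names come
// from the design text; the element types are placeholders.
interface FinalizationContextSketch {
  overrideReplay?: {
    priorJobId: string;
    priorRunId: string;
    priorSyncId: string;
    evictionRawPaths: string[];
  };
  workUnitOutcomes: unknown[];
  parseArtifacts?: unknown;
}

// Whole-run cleanup (marking tables or pattern pages stale, archiving pages)
// is allowed only outside override replay and only with current-run evidence.
// `parseArtifacts` is deliberately not consulted: by itself it never
// authorizes cleanup.
function canRunWholeRunCleanup(ctx: FinalizationContextSketch): boolean {
  if (ctx.overrideReplay !== undefined) {
    return false; // missing current-run inputs are a no-op, not negative evidence
  }
  return ctx.workUnitOutcomes.length > 0;
}
```

An adapter following this shape would skip cleanup entirely when the guard returns `false`, rather than treating an empty evidence directory as a signal that every artifact is stale.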
@@ -230,6 +253,18 @@ fail the run on under-reporting or over-reporting that would make wiki-SL-ref
 repair, target-policy checks, final gates, reports, traces, or provenance use a
 different artifact set from the actual finalization commit.
 
+The runner-derived semantic-layer scope must include logical
+`TouchedSlSource` tuples, not only file paths. Standalone semantic-layer files
+under `semantic-layer/<connectionId>/<sourceName>.yaml` can map structurally to
+`{ connectionId, sourceName }`. Aggregate semantic-layer files, including
+`semantic-layer/<connectionId>/_schema/*.yaml`, must be resolved by comparing
+the pre-finalization and post-finalization materialized semantic-layer sources
+with the worktree-scoped semantic-layer parser/loader. Wiki page keys continue
+to map structurally from `wiki/global/<pageKey>.md`. If the runner cannot
+resolve a changed semantic-layer path to logical touched sources with its own
+resolver, the run must fail; it must not fall back to the adapter declaration as
+the downstream scope.
+
 `wiki_sl_ref_repair` remains a runner mechanic, not an adapter method. It runs
 after finalization and before final gates, and it uses the normal target
 connection set plus the runner-derived finalization touched sources to decide
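
The structural part of the path mapping described above might look like the sketch below. It is an assumption-laden illustration: `ChangedPathScope` and `classifyChangedPath` are hypothetical names, page keys are assumed to be everything between `wiki/global/` and `.md`, and the aggregate case only flags the path for the parser/loader comparison, which is not implemented here.

```typescript
// Hypothetical classification of a path taken from the integration-worktree diff.
type ChangedPathScope =
  | { kind: "sl-source"; connectionId: string; sourceName: string } // maps structurally
  | { kind: "sl-aggregate"; connectionId: string } // needs pre/post parser comparison
  | { kind: "sl-unresolved" } // runner must fail the run
  | { kind: "wiki-page"; pageKey: string }
  | { kind: "other" };

function classifyChangedPath(path: string): ChangedPathScope {
  // Aggregate files must be checked first so `_schema` is never read as a source name.
  const aggregate = /^semantic-layer\/([^/]+)\/_schema\/[^/]+\.yaml$/.exec(path);
  if (aggregate) {
    return { kind: "sl-aggregate", connectionId: aggregate[1] };
  }
  const standalone = /^semantic-layer\/([^/]+)\/([^/]+)\.yaml$/.exec(path);
  if (standalone) {
    return { kind: "sl-source", connectionId: standalone[1], sourceName: standalone[2] };
  }
  if (path.indexOf("semantic-layer/") === 0) {
    // A changed semantic-layer path with no known mapping is never
    // delegated to the adapter's own declaration.
    return { kind: "sl-unresolved" };
  }
  const wiki = /^wiki\/global\/(.+)\.md$/.exec(path);
  if (wiki) {
    return { kind: "wiki-page", pageKey: wiki[1] };
  }
  return { kind: "other" };
}
```

Keeping `sl-aggregate` and `sl-unresolved` as distinct results makes the two follow-up behaviors explicit: the first triggers the worktree-scoped comparison, the second fails the run.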
@@ -376,16 +411,22 @@ The implementation is complete when these conditions are true:
 - The runner invokes `finalize?()` after reconciliation and before final gates.
 - Finalization changes are committed in the integration worktree and included
   in target-policy checks, final gates, reports, traces, and provenance inputs.
-- Override replay passes explicit override metadata to finalization, leaves
-  `workUnitOutcomes` empty when work units are skipped, and proves historic-SQL
-  finalization does not use prior-run `stageIndex` records as current-run
-  evidence or stale/archive artifacts from missing current-run evidence.
-- Finalization provenance uses current raw paths or eviction-derived raw paths,
-  and actions without defensible raw-path attribution are reported as excluded
-  from provenance.
+- Override replay passes explicit override metadata to finalization, including
+  `overrideReplay.evictionRawPaths`; leaves `workUnitOutcomes` empty when work
+  units are skipped; omits `parseArtifacts` unless a future explicit
+  override-safe input is added; and proves historic-SQL finalization does not
+  use prior-run `stageIndex` records as current-run evidence or stale/archive
+  artifacts from missing current-run evidence.
+- Finalization provenance uses current raw paths, current-run
+  `stageIndex.evictionsApplied`, or `overrideReplay.evictionRawPaths`, and
+  actions without defensible raw-path attribution are reported as excluded from
+  provenance.
 - The runner derives finalization touched paths, wiki page keys, and
-  semantic-layer scope from the integration-worktree git diff, cross-checks the
-  adapter's touched-artifact declaration, and fails on mismatches.
+  semantic-layer scope from the integration-worktree git diff, resolves
+  aggregate semantic-layer files such as `_schema/*.yaml` to logical touched
+  sources with the runner's own semantic-layer parser/loader, cross-checks the
+  adapter's touched-artifact declaration, and fails on mismatches or
+  unresolvable changed semantic-layer paths.
 - The runner fails when finalization modifies a path already changed by accepted
   work-unit or reconciliation writes in the same run.
 - `wiki_sl_ref_repair` remains a runner-owned step after finalization and
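
The cross-check between the runner-derived touched paths and the adapter's declaration could be sketched as a two-sided set difference. `ScopeMismatch`, `crossCheckTouchedPaths`, and `assertTouchedPathsMatch` are hypothetical names, and the sketch compares plain path strings only; the logical `TouchedSlSource` comparison is out of scope here.

```typescript
// Hypothetical result of comparing what the git diff shows against what the
// adapter declared it touched during finalization.
interface ScopeMismatch {
  underReported: string[]; // changed in the worktree diff, missing from the declaration
  overReported: string[];  // declared by the adapter, absent from the worktree diff
}

function crossCheckTouchedPaths(
  runnerDerived: string[],
  adapterDeclared: string[],
): ScopeMismatch {
  return {
    underReported: runnerDerived.filter((p) => adapterDeclared.indexOf(p) === -1),
    overReported: adapterDeclared.filter((p) => runnerDerived.indexOf(p) === -1),
  };
}

// Fail the run on any mismatch; the runner-derived set stays authoritative
// and is never replaced by the adapter declaration.
function assertTouchedPathsMatch(runnerDerived: string[], adapterDeclared: string[]): void {
  const mismatch = crossCheckTouchedPaths(runnerDerived, adapterDeclared);
  if (mismatch.underReported.length > 0 || mismatch.overReported.length > 0) {
    throw new Error(
      "finalization touched-artifact mismatch: under-reported=" +
        JSON.stringify(mismatch.underReported) +
        " over-reported=" +
        JSON.stringify(mismatch.overReported),
    );
  }
}
```

Reporting both directions in the error keeps under-reporting and over-reporting distinguishable in traces, which is the distinction the design text draws.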