mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-07 07:55:13 +02:00
fix: align KTX agent tools and repair handling (#73)
parent ed690ef60c
commit 28b5e2a83e
19 changed files with 113 additions and 45 deletions
|
|
@ -12,7 +12,7 @@ Parsimonious. Stage 3 WUs already loaded `ingest_triage` and handled conflicts t
|
|||
3. If the system prompt includes `<canonical_pins>`, apply those pins before flagging a same-name or near-duplicate conflict. A pinned `canonicalArtifactKey` keeps the contested name when it is present in the Stage Index; competing variants keep or receive disambiguated names.
|
||||
4. Sweep both exact-key conflicts and near-duplicate writes. Compare WUs that wrote overlapping SL source names, overlapping wiki keys, the same `tables:` or `sl_refs:` action details, or obviously equivalent topic titles under different wiki keys. Call `stage_diff` to see the actual difference, and use `wiki_read`/`sl_read_source` when two different keys appear to describe the same table, metric, or source-of-truth mapping. If they're the same content, leave one canonical artifact and record the duplicate as subsumed. If they differ per `ingest_triage` rules, apply the correct resolution (rename + capture; election of canonical; silent replace for expression-only re-ingest change; or pinned canonical), then call `emit_conflict_resolution` with the artifact key and decision.
|
||||
5. For any `wiki_write`, `wiki_remove`, `sl_write_source`, or `sl_edit_source` call you make during reconciliation, include `rawPaths` with only the raw paths that directly caused that reconciliation action.
|
||||
6. Call `eviction_list()` for deleted raw paths. For each listed artifact, remove it (`sl_delete`, `wiki_remove`) and include the evicted raw path in `rawPaths`. Then call `emit_eviction_decision` with `action: "removed"` for every removed artifact.
|
||||
6. Call `eviction_list()` for deleted raw paths. For each listed artifact, remove it (`sl_write_source`/`sl_edit_source` with `delete: true` for SL sources, `wiki_remove` for wiki pages) and include the evicted raw path in `rawPaths`. Then call `emit_eviction_decision` with `action: "removed"` for every removed artifact.
|
||||
7. If the Stage 4 sweep discovers a raw file whose only honest outcome is standalone SQL, wiki-only capture, or a human flag, call `emit_unmapped_fallback` with the raw path, reason, and fallback kind.
|
||||
8. Use `read_raw_span` to zoom into specific raw files when you need to resolve what two contested measures or wiki pages actually describe.
|
||||
9. Exit when you've processed every item.
|
||||
|
|
|
|||
|
|
@ -7,7 +7,7 @@ callers: [memory_agent]
|
|||
# Ingest Triage — conflict classification and resolution
|
||||
|
||||
This skill is loaded in two contexts:
|
||||
- By a Stage 3 WorkUnit agent when `sl_discover` or an `sl_discover` reveals that a prior WU (or a prior sync) already wrote something that overlaps with what the current WU is about to write.
|
||||
- By a Stage 3 WorkUnit agent when `sl_discover` reveals that a prior WU (or a prior sync) already wrote something that overlaps with what the current WU is about to write.
|
||||
- By the Stage 4 reconciliation agent for cross-WU sweeps and for eviction decisions.
|
||||
|
||||
Apply the rules below before every write that could collide with an existing artifact.
|
||||
|
|
@ -32,7 +32,7 @@ Apply the rules below before every write that could collide with an existing art
|
|||
| Definitional contradiction | Same name, substantively different formulas (different aggregation, different filters, different columns) | **Rename + capture**: disambiguate ALL variants with suffix derived from the domain (`churn_risk_engagement_based`, `churn_risk_billing_based`) and write a unified wiki page listing every variant with provenance. The contested name does NOT land in the SL. **Always flag.** |
|
||||
|
||||
5. **Eviction (Stage 4 only)**: for each entry in `eviction_list()`:
|
||||
- Remove the artifact (`sl_delete` for SL sources, `wiki_remove` for wiki pages).
|
||||
- Remove the artifact (`sl_write_source` or `sl_edit_source` with `delete: true` for SL sources, `wiki_remove` for wiki pages).
|
||||
- Record the removal with `emit_eviction_decision` and `action: "removed"`.
|
||||
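The eviction rule in the hunk above (remove each listed artifact, then record the removal) can be sketched as a plain loop. This is only an illustration under stated assumptions: the `EvictedArtifact` shape and the `EvictionTools` callbacks are hypothetical stand-ins, since the real `eviction_list` output and tool signatures are not shown in this diff.

```typescript
// Hypothetical shape; the actual eviction_list payload is not shown in this diff.
interface EvictedArtifact {
  kind: 'sl_source' | 'wiki_page';
  key: string;
  rawPath: string;
}

// Hypothetical callbacks standing in for the SL delete, wiki_remove,
// and emit_eviction_decision tool calls.
interface EvictionTools {
  removeSlSource(key: string, rawPaths: string[]): void;
  removeWikiPage(key: string, rawPaths: string[]): void;
  emitEvictionDecision(decision: { key: string; action: 'removed' }): void;
}

function evictAll(artifacts: EvictedArtifact[], tools: EvictionTools): number {
  let removed = 0;
  for (const artifact of artifacts) {
    // Only the deleted raw path that caused this removal is passed along.
    const rawPaths = [artifact.rawPath];
    if (artifact.kind === 'sl_source') {
      tools.removeSlSource(artifact.key, rawPaths);
    } else {
      tools.removeWikiPage(artifact.key, rawPaths);
    }
    // Every removal is followed by a recorded decision.
    tools.emitEvictionDecision({ key: artifact.key, action: 'removed' });
    removed += 1;
  }
  return removed;
}

// Recording fake so the call order can be inspected.
const evictionLog: string[] = [];
const fakeTools: EvictionTools = {
  removeSlSource: (key) => { evictionLog.push(`sl:${key}`); },
  removeWikiPage: (key) => { evictionLog.push(`wiki:${key}`); },
  emitEvictionDecision: (decision) => { evictionLog.push(`emit:${decision.key}`); },
};

const removedCount = evictAll(
  [
    { kind: 'sl_source', key: 'orders', rawPath: 'models/orders.yml' },
    { kind: 'wiki_page', key: 'refund-rate', rawPath: 'notes/refunds.md' },
  ],
  fakeTools,
);
```

Pairing each removal with its decision inside the same loop iteration makes it harder to remove an artifact without reporting it.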
|
||||
## Why same-ingest vs re-ingest differs
|
||||
|
|
|
|||
|
|
@ -84,7 +84,7 @@ SL source, `tables:` frontmatter, `sl_refs`, or `emit_unmapped_fallback`:
|
|||
|
||||
**Required flow before writing any overlay or standalone**:
|
||||
|
||||
1. Call `sl_discover(<tableName>)` for each base table you're about to touch. That returns the real columns.
|
||||
1. Call `sl_discover({ query: "<tableName>" })` for each base table you're about to touch. That returns the real columns.
|
||||
2. If the table isn't in the manifest, use the warehouse `connectionName`
|
||||
returned by `discover_data` or the target connection chosen from
|
||||
`sl_discover`, then call a dialect-appropriate SQL probe with that
|
||||
|
|
|
|||
|
|
@ -20,7 +20,7 @@ Each WorkUnit is either a single Notion page/span or a topical cluster of relate
|
|||
4. Use `context_evidence_search`, `context_evidence_read`, and `context_evidence_neighbors` to pull supporting chunks when indexed evidence is relevant. Pass `chunkId` and `documentId` values verbatim as returned by the evidence tools.
|
||||
5. Write durable business knowledge with `wiki_write`. Aim for a small number of high-quality pages per WorkUnit or cluster. Include `rawPaths` with the exact Notion raw files that support each page.
|
||||
6. When the Notion content defines a reusable dataset, metric, segment, join rule, source-of-truth mapping, or table with explicit columns, load `sl_capture`, discover existing sources first with `sl_discover` or `sl_read_source`, then use `sl_write_source` or `sl_edit_source` only for a confirmed mapped non-Notion target source. Include `rawPaths` with the exact Notion raw files that support the SL action. If no mapped target exists, call `emit_unmapped_fallback` and keep the content wiki-only.
|
||||
7. For every deleted raw path in the Eviction Set, call `eviction_list`, decide retention, then `context_eviction_decision_write`. Do this even when no wiki write is needed.
|
||||
7. For every deleted raw path in the Eviction Set, call `eviction_list`, decide retention, then `emit_eviction_decision`. Do this even when no wiki write is needed.
|
||||
|
||||
## What To Capture
|
||||
|
||||
|
|
@ -99,6 +99,6 @@ SL source, `tables:` frontmatter, `sl_refs`, or `emit_unmapped_fallback`:
|
|||
|
||||
## Tools
|
||||
|
||||
Allowed: `read_raw_file`, `read_raw_span`, `wiki_search`, `wiki_read`, `wiki_write`, `discover_data`, `entity_details`, `sql_execution`, `sl_discover`, `sl_read_source`, `sl_write_source`, `sl_edit_source`, `sl_validate`, `context_evidence_search`, `context_evidence_read`, `context_evidence_neighbors`, `emit_unmapped_fallback`, `eviction_list`, `context_eviction_decision_write`.
|
||||
Allowed: `read_raw_file`, `read_raw_span`, `wiki_search`, `wiki_read`, `wiki_write`, `discover_data`, `entity_details`, `sql_execution`, `sl_discover`, `sl_read_source`, `sl_write_source`, `sl_edit_source`, `sl_validate`, `context_evidence_search`, `context_evidence_read`, `context_evidence_neighbors`, `emit_unmapped_fallback`, `eviction_list`, `emit_eviction_decision`.
|
||||
|
||||
Not allowed: `context_candidate_write`, `context_candidate_mark`.
|
||||
|
|
|
|||
|
|
@ -1,6 +1,6 @@
|
|||
---
|
||||
name: sl
|
||||
description: KTX's semantic layer — a structured catalog of sources (tables/views), measures, joins, and segments expressed as YAML. Covers the schema and how to query it via `semantic_query`. Use when the task involves querying pre-defined metrics (ARR, churn, retention, LTV, MAU) or reading SL source YAML to understand the catalog. Capture is handled by the `sl_capture` skill (memory-agent only).
|
||||
description: KTX's semantic layer — a structured catalog of sources (tables/views), measures, joins, and segments expressed as YAML. Covers the schema and how to query it via `sl_query`. Use when the task involves querying pre-defined metrics (ARR, churn, retention, LTV, MAU) or reading SL source YAML to understand the catalog. Capture is handled by the `sl_capture` skill (memory-agent only).
|
||||
---
|
||||
|
||||
# Semantic Layer
|
||||
|
|
@ -9,7 +9,7 @@ KTX's semantic layer (SL) is a structured catalog. Each **source** represents a
|
|||
|
||||
This skill covers two parts:
|
||||
- **Part 1** — Schema reference (what an SL source looks like).
|
||||
- **Part 2** — Querying via `semantic_query`.
|
||||
- **Part 2** — Querying via `sl_query`.
|
||||
|
||||
Capture (when and how to add new patterns to the SL) is a separate concern handled by the memory-agent — see the `sl_capture` skill if you are running in capture mode. The research agent **reads** and **queries** the SL via the tools described here; it does not write to it.
|
||||
|
||||
|
|
@ -162,7 +162,7 @@ segments:
|
|||
description: Orders that were paid and not refunded
|
||||
```
|
||||
|
||||
Named, reusable boolean predicates scoped to one source. Reference by bare name in a measure's `segments: []`, or by dotted form `source.segment_name` in a `semantic_query`. Segments are predicates only — they are NOT selectable as dimensions. If you need to group by the predicate, add a `columns[]` entry instead.
|
||||
Named, reusable boolean predicates scoped to one source. Reference by bare name in a measure's `segments: []`, or by dotted form `source.segment_name` in an `sl_query`. Segments are predicates only — they are NOT selectable as dimensions. If you need to group by the predicate, add a `columns[]` entry instead.
|
||||
|
||||
### Cross-references with the wiki
|
||||
|
||||
|
|
@ -170,11 +170,11 @@ The reverse edge (wiki pages that cite this source) is derived automatically fro
|
|||
|
||||
---
|
||||
|
||||
## Part 2 — Querying via `semantic_query`
|
||||
## Part 2 — Querying via `sl_query`
|
||||
|
||||
The `semantic_query` tool generates correct SQL from a structured query. It handles joins, fan-out prevention, aggregation correctness, and filter classification automatically. Prefer it over writing raw SQL whenever the SL has the relevant sources.
|
||||
The `sl_query` tool generates correct SQL from a structured query. It handles joins, fan-out prevention, aggregation correctness, and filter classification automatically. Prefer it over writing raw SQL whenever the SL has the relevant sources.
|
||||
|
||||
### When to prefer semantic_query over raw SQL
|
||||
### When to prefer sl_query over raw SQL
|
||||
|
||||
- A pre-defined measure already exists (`source.measure_name` appears in the catalog).
|
||||
- The question combines fields from multiple sources — the engine resolves the join path automatically.
|
||||
|
|
@ -189,15 +189,12 @@ Use raw SQL (`sql_execution`) only when:
|
|||
```json
|
||||
{
|
||||
"connectionId": "uuid-of-the-connection",
|
||||
"reasoning": "Brief note on what this query analyzes",
|
||||
"query": {
|
||||
"measures": ["orders.total_revenue", "sum(orders.amount)"],
|
||||
"dimensions": ["customers.segment", { "field": "orders.created_at", "granularity": "month" }],
|
||||
"filters": ["orders.status != 'cancelled'", "orders.total_revenue > 10000"],
|
||||
"segments": ["orders.paid_non_refunded"],
|
||||
"order_by": [{ "field": "orders.created_at", "direction": "desc" }],
|
||||
"limit": 1000
|
||||
}
|
||||
"measures": ["orders.total_revenue", "sum(orders.amount)"],
|
||||
"dimensions": ["customers.segment", { "field": "orders.created_at", "granularity": "month" }],
|
||||
"filters": ["orders.status != 'cancelled'", "orders.total_revenue > 10000"],
|
||||
"segments": ["orders.paid_non_refunded"],
|
||||
"order_by": [{ "field": "orders.created_at", "direction": "desc" }],
|
||||
"limit": 1000
|
||||
}
|
||||
```
|
||||
|
||||
|
|
|
|||
|
|
@ -63,7 +63,7 @@ Preferred:
|
|||
- name: total_revenue
|
||||
expr: sum(amount)
|
||||
```
|
||||
Callers filter `region = 'US'` at `semantic_query` time.
|
||||
Callers filter `region = 'US'` at query time.
|
||||
|
||||
**Bake constants in only when the filter has named business meaning that won't change** (`enterprise_arr` for a contractually defined tier), cannot be expressed via the source's dimensions, or comes from a regulated/fixed list.
|
||||
|
||||
|
|
@ -100,7 +100,7 @@ measures:
|
|||
|
||||
**Extract repeated filter bundles into named segments.** If the same predicate appears on multiple measures of the same source, lift it to a `segments[]` entry and have each measure reference it. One edit updates every measure that depends on it.
|
||||
|
||||
**Never write a standalone file on a manifest-backed name.** If `sl_discover({ tableName })` finds an existing schema for that name, you MUST write an overlay (`name:` + `measures:`/`segments:`/`descriptions:` only — no `sql:`, `table:`, `grain:`, `columns:`, `joins:`). A standalone with `sql:` or `table:` on a manifest-backed name clobbers the inherited columns and joins; `sl_write_source` and `sl_validate` both reject this shape with a clear fix hint. Always run `sl_discover` before your first write on any existing name.
|
||||
**Never write a standalone file on a manifest-backed name.** If `sl_discover({ query: "<table-or-source-name>" })` finds an existing schema for that name, you MUST write an overlay (`name:` + `measures:`/`segments:`/`descriptions:` only — no `sql:`, `table:`, `grain:`, `columns:`, `joins:`). A standalone with `sql:` or `table:` on a manifest-backed name clobbers the inherited columns and joins; `sl_write_source` and `sl_validate` both reject this shape with a clear fix hint. Always run `sl_discover` before your first write on any existing name.
|
||||
|
||||
**Prefer overlay decomposition over standalone SQL sources.** Before reaching for `source_type: sql`, check whether the metric decomposes into measures on existing overlays (including cross-source derived measures). Use `source_type: sql` only when:
|
||||
- The metric requires per-user/per-entity derivation that cannot be expressed as a single `expr` (e.g., `EXISTS` over a time-windowed subset), OR
|
||||
|
|
@ -209,10 +209,10 @@ SL source, `tables:` frontmatter, `sl_refs`, or `emit_unmapped_fallback`:
|
|||
## Tool sequence
|
||||
|
||||
1. `sl_discover` — see what source files exist.
|
||||
2. `sl_discover({ tableName })` — **REQUIRED before the first write on any name**. Shows columns/joins/grain from the manifest. If the call returns a schema, you MUST write an overlay, not a standalone. Skipping this is the #1 cause of accidentally shadowing the manifest.
|
||||
3. `sl_read_source({ sourceName })` — read the raw YAML before editing.
|
||||
4. For modifications: `sl_edit_source({ sourceName, old_string, new_string })` with exact-string replacements. `old_string` must match exactly and be unique in the file.
|
||||
5. For new sources or full rewrites: `sl_write_source({ sourceName, content })` with the full YAML content.
|
||||
2. `sl_discover({ query: "<table-or-source-name>" })` — **REQUIRED before the first write on any name**. Shows columns/joins/grain from the manifest. If the call returns a schema, you MUST write an overlay, not a standalone. Skipping this is the #1 cause of accidentally shadowing the manifest.
|
||||
3. `sl_read_source({ connectionId, sourceName })` — read the raw YAML before editing.
|
||||
4. For modifications: `sl_edit_source({ connectionId, sourceName, yaml_edits: [{ oldText, newText, reason }] })` with exact-string replacements. `oldText` must match exactly and be unique in the file.
|
||||
5. For new sources or full rewrites: `sl_write_source({ connectionId, sourceName, source })` with the full structured source definition.
|
||||
6. For join discovery: use `sql_execution({connectionName: "warehouse", sql: "SELECT count(*) FROM public.orders o JOIN public.customers c ON c.id = o.customer_id LIMIT 20"})` with the target warehouse connection name and dialect-correct table names to verify the join key exists in both tables and assess cardinality before declaring the join.
|
||||
7. Cross-reference knowledge: author the edge once on the **wiki** side via `sl_refs: [source_name]` in the page's front-matter. The reverse edge (wiki pages that cite an SL source) is derived automatically by the reconciler — do not add a `knowledge_refs:` field to SL YAMLs.
|
||||
8. `sl_validate` — run after writing or editing to surface schema issues, duplicate measure names, and cross-source validation errors. Read-only; the writes are already committed (the squash-at-end flow will collapse them into one commit).
|
||||
|
|
@ -235,13 +235,21 @@ Existing index: `orders [measures=0, joins=0] — candidate for enrichment`.
|
|||
```
|
||||
sl_discover()
|
||||
→ orders.yaml does not exist yet
|
||||
sl_discover({ tableName: "orders" })
|
||||
sl_discover({ query: "orders" })
|
||||
→ see grain, columns, no current overlay
|
||||
sl_write_source({
|
||||
connectionId: "warehouse",
|
||||
sourceName: "orders",
|
||||
content: "name: orders\nmeasures:\n - name: avg_order_value\n expr: avg(amount)\n description: Mean order transaction amount — filter by product_category at query time\n"
|
||||
source: {
|
||||
name: "orders",
|
||||
measures: [{
|
||||
name: "avg_order_value",
|
||||
expr: "avg(amount)",
|
||||
description: "Mean order transaction amount - filter by product_category at query time"
|
||||
}]
|
||||
}
|
||||
})
|
||||
sl_validate()
|
||||
sl_validate({ connectionId: "warehouse" })
|
||||
→ clean
|
||||
```
|
||||
|
||||
|
|
@ -258,16 +266,17 @@ Current user: "Wait, by 'active' I mean users who have placed an order in the la
|
|||
The existing `users.active_count` measure is wrong by the new definition.
|
||||
|
||||
```
|
||||
sl_read_source({ sourceName: "users" })
|
||||
sl_read_source({ connectionId: "warehouse", sourceName: "users" })
|
||||
→ see the wrong measure
|
||||
sl_edit_source({
|
||||
connectionId: "warehouse",
|
||||
sourceName: "users",
|
||||
yaml_edits: [{
|
||||
oldText: " - name: active_count\n expr: \"count(*)\"\n filter: \"last_login_at > now() - interval '30 days'\"\n description: Users who logged in within the last 30 days",
|
||||
newText: " - name: active_count\n expr: \"count(distinct case when last_order_at > now() - interval '30 days' then user_id end)\"\n description: Users with at least one order in the last 30 days"
|
||||
}]
|
||||
})
|
||||
sl_validate()
|
||||
sl_validate({ connectionId: "warehouse" })
|
||||
```
|
||||
|
||||
If you only added a new measure, the old incorrect `active_count` would stay and future queries would keep answering the wrong question.
|
||||
|
|
@ -277,7 +286,7 @@ If you only added a new measure, the old incorrect `active_count` would stay and
|
|||
Prior turn: user asked to correlate LTV with protocol count; assistant joined `fct_orders` with `fct_mau_multiprotocol` on `admin_user_id` in raw SQL.
|
||||
|
||||
```
|
||||
sl_read_source({ sourceName: "fct_orders" })
|
||||
sl_read_source({ connectionId: "warehouse", sourceName: "fct_orders" })
|
||||
→ no joins section yet
|
||||
sql_execution({
|
||||
connectionName: "warehouse",
|
||||
|
|
@ -285,13 +294,14 @@ sql_execution({
|
|||
})
|
||||
→ confirms cardinality (many orders per MAU row = many_to_one)
|
||||
sl_edit_source({
|
||||
connectionId: "warehouse",
|
||||
sourceName: "fct_orders",
|
||||
yaml_edits: [{
|
||||
oldText: "measures:",
|
||||
newText: "joins:\n - to: fct_mau_multiprotocol\n on: admin_user_id = fct_mau_multiprotocol.admin_user_id\n relationship: many_to_one\nmeasures:"
|
||||
}]
|
||||
})
|
||||
sl_validate()
|
||||
sl_validate({ connectionId: "warehouse" })
|
||||
```
|
||||
|
||||
Always verify joins with `sql_execution` before adding them.
|
||||
|
|
|
|||
|
|
@ -31,7 +31,7 @@ Do NOT capture:
|
|||
- Temporary instructions scoped to the current chat.
|
||||
- Ad-hoc formatting preferences.
|
||||
- Information already present in the semantic layer (column names, join paths, measure formulas — those belong in SL).
|
||||
- **Query results, snapshots, or time-bounded benchmark tables.** Numbers go stale; pasting "Oct 2025: 25%, Nov 2025: 19.9%, …" creates misinformation as soon as new data lands. Reference the SL source by name (`sl_refs`) and let future queries pull live data — the wiki captures the *rule* (definition, exclusion, segmentation), the SL source captures the *measure*, and `semantic_query` captures the *current values*.
|
||||
- **Query results, snapshots, or time-bounded benchmark tables.** Numbers go stale; pasting "Oct 2025: 25%, Nov 2025: 19.9%, …" creates misinformation as soon as new data lands. Reference the SL source by name (`sl_refs`) and let future query tools pull live data — the wiki captures the *rule* (definition, exclusion, segmentation), the SL source captures the *measure*, and query execution captures the *current values*.
|
||||
- **Interpretive narrative tied to a specific snapshot** ("M1 retention degraded sharply from Dec 2025"). The observation is anchored to data that will move; the actionable convention (e.g., "always exclude in-progress cohorts") may be worth capturing on its own, but the snapshot-specific commentary is not.
|
||||
|
||||
If nothing is worth capturing, respond without calling any tool.
|
||||
|
|
@ -136,7 +136,7 @@ wiki_search({ query: "refund revenue paid orders" })
|
|||
→ returns `revenue-definition` (related — defines paid-orders filter)
|
||||
sl_discover({ query: "refund rate" })
|
||||
→ returns fct_orders (score 0.08), fct_gaap_revenue (0.06)
|
||||
sl_read_source({ sourceName: "fct_orders" })
|
||||
sl_read_source({ connectionId: "warehouse", sourceName: "fct_orders" })
|
||||
→ confirms amount_refunded_dollars and transaction_amount_dollars exist
|
||||
wiki_write({
|
||||
key: "refund-rate-definition",
|
||||
|
|
|
|||
|
|
@ -40,6 +40,8 @@ describe('AgentRunnerService.runLoop', () => {
|
|||
|
||||
it('passes systemPrompt, userPrompt, tools, and step budget through to generateText', async () => {
|
||||
(generateText as any).mockResolvedValue({ text: 'ok', toolCalls: [], steps: [] });
|
||||
const repairHandler = vi.fn();
|
||||
llmProvider.repairToolCallHandler.mockReturnValueOnce(repairHandler);
|
||||
const tools = { noop: { description: 'noop', inputSchema: {}, execute: vi.fn() } };
|
||||
await runner.runLoop({
|
||||
modelRole: 'candidateExtraction',
|
||||
|
|
@ -59,7 +61,9 @@ describe('AgentRunnerService.runLoop', () => {
|
|||
expect(call.tools).toEqual(tools);
|
||||
expect(call.stopWhen).toBe(17);
|
||||
expect(call.temperature).toBe(0);
|
||||
expect(call.experimental_repairToolCall).toBe(repairHandler);
|
||||
expect(llmProvider.getModel).toHaveBeenCalledWith('candidateExtraction');
|
||||
expect(llmProvider.repairToolCallHandler).toHaveBeenCalledWith({ source: 'ktx-agent-runner' });
|
||||
});
|
||||
|
||||
it('returns stopReason=natural when the loop completes without error', async () => {
|
||||
|
|
|
|||
|
|
@ -73,6 +73,9 @@ export class AgentRunnerService {
|
|||
temperature: 0,
|
||||
stopWhen: stepCountIs(params.stepBudget),
|
||||
experimental_telemetry: this.deps.telemetry?.createTelemetry(params.telemetryTags),
|
||||
experimental_repairToolCall: this.deps.llmProvider.repairToolCallHandler({
|
||||
source: params.telemetryTags.operationName ?? 'ktx-agent-runner',
|
||||
}),
|
||||
messages: built.messages,
|
||||
tools: built.tools as Record<string, Tool>,
|
||||
onStepFinish: async () => {
|
||||
|
|
|
|||
|
|
@ -695,7 +695,8 @@ describe('IngestBundleRunner — Stages 1 → 7', () => {
|
|||
await params.toolSet.emit_unmapped_fallback.execute(
|
||||
{
|
||||
rawPath: 'a.yml',
|
||||
reason: 'semantic_not_representable',
|
||||
reason: 'parse_error',
|
||||
clarification: 'semantic_not_representable',
|
||||
fallback: 'flagged',
|
||||
},
|
||||
{ toolCallId: 'fallback-1', messages: [] },
|
||||
|
|
@ -954,6 +955,7 @@ describe('IngestBundleRunner — Stages 1 → 7', () => {
|
|||
{
|
||||
rawPath: 'a.yml',
|
||||
reason: 'conversion_metric_unsupported',
|
||||
detail: expect.stringContaining('conversion metric'),
|
||||
fallback: 'flagged',
|
||||
},
|
||||
],
|
||||
|
|
@ -1006,7 +1008,8 @@ describe('IngestBundleRunner — Stages 1 → 7', () => {
|
|||
await params.toolSet.emit_unmapped_fallback.execute(
|
||||
{
|
||||
rawPath: 'cards/untranslated.json',
|
||||
reason: 'metabase_sql_untranslated',
|
||||
reason: 'parse_error',
|
||||
clarification: 'metabase_sql_untranslated',
|
||||
fallback: 'flagged',
|
||||
},
|
||||
{ toolCallId: 'fallback-1', messages: [] },
|
||||
|
|
@ -1053,7 +1056,8 @@ describe('IngestBundleRunner — Stages 1 → 7', () => {
|
|||
unmappedFallbacks: [
|
||||
{
|
||||
rawPath: 'cards/untranslated.json',
|
||||
reason: 'metabase_sql_untranslated',
|
||||
reason: 'parse_error',
|
||||
detail: expect.stringContaining('metabase_sql_untranslated'),
|
||||
fallback: 'flagged',
|
||||
},
|
||||
],
|
||||
|
|
|
|||
|
|
@ -37,7 +37,9 @@ export type UnmappedFallbackReason =
|
|||
| 'multiple_table_references'
|
||||
| 'unsupported_dialect'
|
||||
| 'parse_error'
|
||||
| 'missing_target_table';
|
||||
| 'missing_target_table'
|
||||
| 'cumulative_metric_unsupported'
|
||||
| 'conversion_metric_unsupported';
|
||||
|
||||
export interface UnmappedFallbackRecord {
|
||||
rawPath: string;
|
||||
|
|
|
|||
|
|
@ -182,6 +182,30 @@ describe('reconciliation emit tools', () => {
|
|||
]);
|
||||
});
|
||||
|
||||
it('records MetricFlow-specific unsupported fallback reasons', async () => {
|
||||
const stageIndex = makeStageIndex();
|
||||
const tool = createEmitUnmappedFallbackTool({
|
||||
stageIndex,
|
||||
allowedPaths: new Set(['metrics/conversion.yml']),
|
||||
});
|
||||
|
||||
const output = await executeTool(tool, {
|
||||
rawPath: 'metrics/conversion.yml',
|
||||
reason: 'conversion_metric_unsupported',
|
||||
fallback: 'flagged',
|
||||
});
|
||||
|
||||
expect(output).toContain('conversion metric');
|
||||
expect(stageIndex.unmappedFallbacks).toEqual([
|
||||
{
|
||||
rawPath: 'metrics/conversion.yml',
|
||||
reason: 'conversion_metric_unsupported',
|
||||
detail: expect.stringContaining('conversion metric'),
|
||||
fallback: 'flagged',
|
||||
},
|
||||
]);
|
||||
});
|
||||
|
||||
it('rejects unmapped fallback decisions for raw paths outside the allowed set', async () => {
|
||||
const stageIndex = makeStageIndex();
|
||||
const tool = createEmitUnmappedFallbackTool({
|
||||
|
|
|
|||
|
|
@ -17,6 +17,8 @@ const unmappedFallbackReasonSchema = z.enum([
|
|||
'unsupported_dialect',
|
||||
'parse_error',
|
||||
'missing_target_table',
|
||||
'cumulative_metric_unsupported',
|
||||
'conversion_metric_unsupported',
|
||||
]);
|
||||
|
||||
function sameUnmappedFallback(left: UnmappedFallbackRecord, right: UnmappedFallbackRecord): boolean {
|
||||
|
|
@ -47,6 +49,10 @@ function canonicalDetail(reason: UnmappedFallbackReason, tableRef: string | unde
|
|||
return `${tableClause} uses a SQL dialect that is not yet supported.`;
|
||||
case 'parse_error':
|
||||
return `${tableClause} could not be parsed.`;
|
||||
case 'cumulative_metric_unsupported':
|
||||
return `${tableClause} is a cumulative metric, which is not yet supported as a first-class semantic-layer primitive.`;
|
||||
case 'conversion_metric_unsupported':
|
||||
return `${tableClause} is a conversion metric, which is not yet supported as a first-class semantic-layer primitive.`;
|
||||
}
|
||||
}
|
||||
|
||||
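The hunk above adds two `case` arms to a `switch` over the reason union. One common way to keep such a switch in step with the union is a `never` check in the `default` arm. The sketch below is a minimal illustration, not the repository's code: the `Reason` union is a trimmed-down hypothetical, and `detailFor` and `subject` are assumed names.

```typescript
// Hypothetical, trimmed-down union; the real UnmappedFallbackReason has more members.
type Reason =
  | 'parse_error'
  | 'cumulative_metric_unsupported'
  | 'conversion_metric_unsupported';

function detailFor(reason: Reason, subject: string): string {
  switch (reason) {
    case 'parse_error':
      return `${subject} could not be parsed.`;
    case 'cumulative_metric_unsupported':
      return `${subject} is a cumulative metric, which is not yet supported as a first-class semantic-layer primitive.`;
    case 'conversion_metric_unsupported':
      return `${subject} is a conversion metric, which is not yet supported as a first-class semantic-layer primitive.`;
    default: {
      // If a member is added to Reason without a matching case,
      // this assignment stops compiling.
      const unreachable: never = reason;
      throw new Error(`Unhandled reason: ${String(unreachable)}`);
    }
  }
}

const conversionDetail = detailFor('conversion_metric_unsupported', 'metrics/conversion.yml');
```

With this shape, extending the union and forgetting the message text is reported by the compiler rather than discovered at run time.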
|
|
|
|||
|
|
@ -51,6 +51,6 @@ describe('eviction_list tool', () => {
|
|||
deletedRawPaths: [],
|
||||
});
|
||||
|
||||
expect(tool.description).toContain('context_eviction_decision_write');
|
||||
expect(tool.description).toContain('emit_eviction_decision');
|
||||
});
|
||||
});
|
||||
|
|
|
|||
|
|
@ -12,7 +12,7 @@ export interface EvictionListDeps {
|
|||
export function createEvictionListTool(deps: EvictionListDeps) {
|
||||
return tool({
|
||||
description:
|
||||
'List every artifact that the most recent completed sync produced from a now-deleted raw file. Remove each listed artifact and record the decision with context_eviction_decision_write so the ingest report lists every deleted-source decision.',
|
||||
'List every artifact that the most recent completed sync produced from a now-deleted raw file. Remove each listed artifact and record the decision with emit_eviction_decision so the ingest report lists every deleted-source decision.',
|
||||
inputSchema: z.object({}),
|
||||
execute: async () => {
|
||||
if (deps.deletedRawPaths.length === 0) {
|
||||
|
|
|
|||
|
|
@ -28,7 +28,7 @@ const WRITE_TOOL_NAMES = new Set([
|
|||
]);
|
||||
|
||||
export const VERIFICATION_LEDGER_PROMPT = `<pre_write_verification>
|
||||
Before any write-capable tool call (wiki_write, wiki_remove, sl_write_source, sl_edit_source, emit_unmapped_fallback), call record_verification_ledger.
|
||||
Before any durable wiki, semantic-layer, or unmapped-fallback write (wiki_write, wiki_remove, sl_write_source, sl_edit_source, emit_unmapped_fallback), call record_verification_ledger.
|
||||
The ledger is a model-authored checkpoint, not a deterministic parser gate. Summarize the verification protocol from the loaded skill, list identifiers verified with discover_data/entity_details/sql_execution, and list anything intentionally left unverified. If the write contains no warehouse identifiers, say that explicitly.
|
||||
If a write tool returns verification_ledger_required, complete the ledger and retry the write.
|
||||
</pre_write_verification>`;
|
||||
|
|
|
|||
|
|
@ -4,6 +4,10 @@ import { generateText, Output, type FlexibleSchema, type ToolSet } from 'ai';
|
|||
type GenerateTextInput = Parameters<typeof generateText>[0];
|
||||
type GenerateTextFn = (input: GenerateTextInput) => Promise<{ text?: string; output?: unknown }>;
|
||||
|
||||
function hasTools(tools: ToolSet): boolean {
|
||||
return Object.keys(tools).length > 0;
|
||||
}
|
||||
|
||||
interface GenerateKtxTextInput {
|
||||
llmProvider: KtxLlmProvider;
|
||||
role: KtxModelRole;
|
||||
|
|
@ -30,6 +34,13 @@ export async function generateKtxText(input: GenerateKtxTextInput): Promise<stri
|
|||
temperature: input.temperature ?? 0,
|
||||
messages: built.messages,
|
||||
tools: built.tools as ToolSet,
|
||||
...(hasTools(built.tools as ToolSet)
|
||||
? {
|
||||
experimental_repairToolCall: input.llmProvider.repairToolCallHandler({
|
||||
source: `ktx-${input.role}`,
|
||||
}),
|
||||
}
|
||||
: {}),
|
||||
});
|
||||
if (typeof result.text !== 'string') {
|
||||
throw new Error('KTX LLM text generation returned no text');
|
||||
|
|
@ -52,6 +63,13 @@ export async function generateKtxObject<TOutput, TSchema>(
|
|||
temperature: input.temperature ?? 0,
|
||||
messages: built.messages,
|
||||
tools: built.tools as ToolSet,
|
||||
...(hasTools(built.tools as ToolSet)
|
||||
? {
|
||||
experimental_repairToolCall: input.llmProvider.repairToolCallHandler({
|
||||
source: `ktx-${input.role}`,
|
||||
}),
|
||||
}
|
||||
: {}),
|
||||
output: Output.object({
|
||||
schema: input.schema as FlexibleSchema<TOutput>,
|
||||
}),
|
||||
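The two hunks above attach `experimental_repairToolCall` only when the built tool set is non-empty, using a conditional object spread. A minimal standalone sketch of that pattern follows. `ToolSet`, `RepairHandler`, `buildToolOptions`, and the `summarize` role are simplified, hypothetical stand-ins and not the types from the `ai` package or `KtxLlmProvider`.

```typescript
// Hypothetical stand-ins for the real ToolSet and repair handler types.
type ToolSet = Record<string, unknown>;
type RepairHandler = (failure: { toolName: string }) => Promise<null>;

function hasTools(tools: ToolSet): boolean {
  return Object.keys(tools).length > 0;
}

// Builds only the tool-related slice of the options object.
function buildToolOptions(
  tools: ToolSet,
  makeRepairHandler: (opts: { source: string }) => RepairHandler,
  role: string,
): { tools: ToolSet; experimental_repairToolCall?: RepairHandler } {
  return {
    tools,
    // Spreading {} adds no keys, so the repair key is absent
    // (not merely undefined) when there are no tools.
    ...(hasTools(tools)
      ? { experimental_repairToolCall: makeRepairHandler({ source: `ktx-${role}` }) }
      : {}),
  };
}

// Records which sources a handler was requested for.
const requestedSources: string[] = [];
const makeHandler = (opts: { source: string }): RepairHandler => {
  requestedSources.push(opts.source);
  return async () => null;
};

const withoutTools = buildToolOptions({}, makeHandler, 'summarize');
const withTools = buildToolOptions({ noop: {} }, makeHandler, 'summarize');
```

Because only one branch of the ternary is evaluated, the handler factory is never called for tool-less requests.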
|
|
|
|||
|
|
@ -53,7 +53,7 @@ export class SlDiscoverTool extends BaseSemanticLayerTool<typeof slDiscoverInput
|
|||
return `<purpose>
|
||||
Discover available semantic layer sources, columns, measures, and joins.
|
||||
When called without a connectionId, discovers sources across ALL data sources — grouped by data source name and ID.
|
||||
Use this to understand what data is available before writing a semantic_query.
|
||||
Use this to understand what data is available before querying through the semantic layer.
|
||||
</purpose>
|
||||
|
||||
<when_to_use>
|
||||
|
|
|
|||
|
|
@ -36,7 +36,7 @@ Use this when you need to understand how a source is built — e.g., before edit
|
|||
|
||||
<when_not_to_use>
|
||||
- To discover what sources/measures/dimensions are available for querying — use sl_discover instead
|
||||
- To query data — use semantic_query or create_widget with slQuery
|
||||
- To query data — use the semantic-layer query surface (\`sl_query\` in MCP)
|
||||
</when_not_to_use>`;
|
||||
}
|
||||
|
||||
|
|
|
|||