mirror of https://github.com/Kaelio/ktx.git, synced 2026-07-04 10:52:13 +02:00
feat: query_policy semantic-layer-only restricts agents to predefined semantic-layer measures (#334)
* feat(sl): add predefined_measures_only guard to semantic query planning

SemanticQuery gains a predefined_measures_only flag; the planner rejects
any measure resolved with Provenance.COMPOSED (runtime aggregate
expressions and query-time derivations) while predefined measures,
predefined derived chains, dimensions, filters, and segments pass.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(config): add per-connection query_policy to warehouse connections

query_policy: semantic-layer-only | read-only-sql (default) on the
warehouse connection schema, plus a policy module with the raw-SQL guard,
federated member restriction lookup, and the project-level predicate used
to gate sql_execution registration.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(cli): enforce query_policy on raw SQL through one shared executor

ktx sql and the MCP sql_execution tool now share executeProjectRawSql
(resolve, policy check, read-only validation, execute), collapsing their
duplicated validate-then-execute paths. Restricted connections are
rejected before validation; federated raw SQL is rejected when any member
is restricted. sql_execution is not registered when every SQL connection
is restricted, and connection_list marks restricted connections so agents
route to sl_query. executeProjectReadOnlySql stays generic for
ktx-internal SQL (scan, ingest, SL-generated).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(sl): compile queries with predefined_measures_only from query_policy

compileLocalSlQuery injects the flag from the connection's query_policy,
never from caller input, covering both ktx sl query and the MCP sl_query
tool through the daemon compile path.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs: document query_policy semantic-layer-only

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(sl): close semantic-layer-only bypasses via filters and federated hint

The predefined_measures_only guard only inspected query.measures, so a
composed aggregate written into `filters` slipped through
_classify_filters into a HAVING clause untouched, letting a restricted
agent evaluate arbitrary aggregates (e.g. threshold-probing
`sum(x) BETWEEN a AND b`). Reject filter clauses that compose an
aggregate function; a HAVING that compares a predefined measure by name
(`orders.revenue > 100`) still works.

Also make the federated sl_query error policy-aware: when a member is
restricted, raw federated SQL is disabled too, so stop directing the
agent to `ktx sql -c _ktx_federated` / sql_execution (a guaranteed
failure) and point to per-connection semantic-layer queries instead.

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
parent 66768fe009
commit a651b82e2f
21 changed files with 887 additions and 68 deletions
@@ -139,6 +139,11 @@ connections are declared. It surfaces in `ktx connection` and in the agent's
connection list so the id is discoverable. Querying a single member database
directly with its own connection id (`ktx sql -c pg_books ...`) is unchanged.

If any member connection sets
[`query_policy: semantic-layer-only`](/docs/configuration/ktx-yaml#query-policy),
raw SQL against `_ktx_federated` is rejected as a whole: a federated query can
touch any member's tables, so one restricted member restricts the federation.

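The federation rule above can be sketched as a small check. This is an illustrative Python sketch under stated assumptions, not ktx's implementation: the function names and the mapping from connection id to policy are hypothetical, and only the policy values and the `_ktx_federated` id come from the documentation.

```python
from typing import Mapping, Optional

FEDERATED_ID = "_ktx_federated"            # federated connection id from the docs
SEMANTIC_LAYER_ONLY = "semantic-layer-only"
READ_ONLY_SQL = "read-only-sql"            # documented default when unset


def restricted_members(policies: Mapping[str, Optional[str]]) -> list:
    """Return ids of member connections whose policy is semantic-layer-only."""
    return sorted(
        conn_id
        for conn_id, policy in policies.items()
        if (policy or READ_ONLY_SQL) == SEMANTIC_LAYER_ONLY
    )


def check_federated_raw_sql(policies: Mapping[str, Optional[str]]) -> None:
    """Raise if raw SQL against the federated connection must be rejected."""
    blocked = restricted_members(policies)
    if blocked:
        # Hypothetical error text; the real message is not shown in the source.
        raise PermissionError(
            f"raw SQL against {FEDERATED_ID} is disabled: "
            f"restricted member(s): {', '.join(blocked)}"
        )
```

With members `pg_books` (policy unset) and `warehouse` (restricted), the check raises for the federation as a whole, while a direct query against `pg_books` would still be judged only by that member's own policy.
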
## Federated queries are read-only

DuckDB attaches every member database with read-only access. Federated queries
@@ -217,6 +217,41 @@ connections:
observed in-scope query history. The block uses `mode: exclude` and remains
hand-editable.

### Query policy

Set `query_policy: semantic-layer-only` on a warehouse connection to stop
agents from authoring SQL against it. The default, `read-only-sql`, allows
parser-validated read-only SQL through `ktx sql` and the `sql_execution` MCP
tool alongside semantic-layer queries.

```yaml
connections:
  warehouse:
    driver: snowflake
    query_policy: semantic-layer-only
```

With `semantic-layer-only`:

- `ktx sql` and the `sql_execution` MCP tool reject the connection with a
  clear error. When every SQL connection in the project is restricted, the
  `sql_execution` tool is not registered at all.
- Raw SQL against the federated connection (`_ktx_federated`) is rejected
  when any member connection is restricted.
- Semantic-layer queries (`ktx sl query`, the `sl_query` tool) accept only
  measures predefined in the semantic-layer sources. Composed aggregate
  expressions such as `sum(orders.amount)` are rejected wherever they appear,
  including inside `filters` (a `HAVING`-style clause may only compare a
  predefined measure by name, e.g. `orders.revenue > 100`). Grouping by
  declared dimensions, filtering on columns, and segments remain available.
- `connection_list` marks the connection as restricted so agents route to
  `sl_query` instead of burning a failed call.

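The measure and filter rule in the list above can be illustrated with a deliberately simplified check. This is a sketch, not the planner's logic: the commit message describes the real guard as rejecting measures resolved with `Provenance.COMPOSED`, whereas this stand-in uses a set lookup and a regular expression. The function names and the list of aggregate functions are assumptions.

```python
import re
from typing import Iterable, List, Set

# Assumed, non-exhaustive list of aggregate function names.
_AGGREGATE_CALL = re.compile(r"\b(?:sum|avg|min|max|count)\s*\(", re.IGNORECASE)


def composes_aggregate(expression: str) -> bool:
    """True if the expression calls an aggregate function directly."""
    return _AGGREGATE_CALL.search(expression) is not None


def policy_violations(
    measures: Iterable[str], filters: Iterable[str], predefined: Set[str]
) -> List[str]:
    """Collect reasons a semantic-layer query breaks semantic-layer-only."""
    problems = []
    for measure in measures:
        # Only measures declared in the semantic-layer sources may be used.
        if measure not in predefined:
            problems.append(f"measure is not predefined: {measure}")
    for clause in filters:
        # Comparing a predefined measure by name passes; composing a new
        # aggregate inside the filter does not.
        if composes_aggregate(clause):
            problems.append(f"filter composes an aggregate: {clause}")
    return problems
```
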
The policy governs agent-facing query authorship, not data access: **ktx**'s
own scan, ingest, and semantic-layer-generated SQL still run, and context
tools such as `entity_details` and `dictionary_search` still expose schema
metadata and sampled values.

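The authorship distinction above, together with the tool registration rule from the list, might be modelled as follows. This is a minimal sketch under stated assumptions: the `Origin` enum and both function names are hypothetical and do not mirror ktx's policy module.

```python
from enum import Enum
from typing import Iterable, Optional

SEMANTIC_LAYER_ONLY = "semantic-layer-only"


class Origin(Enum):
    AGENT = "agent"        # SQL authored through `ktx sql` or `sql_execution`
    INTERNAL = "internal"  # ktx's own scan, ingest, or semantic-layer SQL


def raw_sql_allowed(policy: Optional[str], origin: Origin) -> bool:
    """Apply the policy to agent-authored SQL only.

    An unset policy is treated as the documented default, read-only-sql.
    """
    if origin is Origin.INTERNAL:
        return True
    return policy != SEMANTIC_LAYER_ONLY


def should_register_sql_execution(policies: Iterable[Optional[str]]) -> bool:
    """Expose the tool only if some SQL connection still accepts agent SQL."""
    return any(raw_sql_allowed(policy, Origin.AGENT) for policy in policies)
```

In this sketch a project with no SQL connections also leaves the tool unregistered, since no connection accepts agent SQL; whether ktx treats that case the same way is not stated in the source.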
### Metabase

```yaml