refactor(sl): split overlay columns from column_overrides and enforce TS/Python wire contract

Overlay sources now have two distinct collections: `columns:` for computed
columns (requiring `expr` + `type`) and `column_overrides:` for metadata
patches to inherited manifest columns. Composing or loading an overlay that
mixes the two — or references an unknown column — fails with a typed error.
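
A minimal sketch of the split described above, in Python for illustration. It is not the code from this commit: `Overlay`, `validate_overlay`, and the error class names are hypothetical. It also assumes that "mixing" means a computed column missing `expr`/`type`, or an override that carries `expr`.

```python
from dataclasses import dataclass, field


class OverlayError(ValueError):
    """Base class for overlay validation failures (hypothetical name)."""


class UnknownColumnError(OverlayError):
    """An override targets a column the manifest does not define."""


class MixedCollectionError(OverlayError):
    """An entry sits in the wrong collection for its shape."""


@dataclass
class Overlay:
    # Computed columns: each entry must carry both `expr` and `type`.
    columns: dict = field(default_factory=dict)
    # Metadata patches for columns inherited from the manifest.
    column_overrides: dict = field(default_factory=dict)


def validate_overlay(overlay: Overlay, manifest_columns: set) -> None:
    """Raise a typed error if the overlay breaks the assumed rules."""
    for name, spec in overlay.columns.items():
        missing = {"expr", "type"} - set(spec)
        if missing:
            raise MixedCollectionError(
                f"computed column {name!r} is missing {sorted(missing)}"
            )
    for name, patch in overlay.column_overrides.items():
        if name not in manifest_columns:
            raise UnknownColumnError(f"override targets unknown column {name!r}")
        if "expr" in patch:
            raise MixedCollectionError(
                f"override {name!r} carries 'expr'; move it to 'columns'"
            )
```

Separate exception classes let callers tell an unknown-column failure from a misplaced entry without parsing the message.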

Introduce `ResolvedSemanticLayerSource` / `resolvedSourceSchema` /
`toResolvedWire` as the strict shape sent to the Python engine, and add a
schema contract test that diffs Zod against the Pydantic JSON schema dumped
by `python -m semantic_layer dump-schema`. `SourceDefinition` is now
`extra="forbid"` on the Python side.
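
The idea behind the contract test can be sketched as follows. This is an illustration, not the test added in this commit. It assumes both sides have already been rendered as JSON Schema dictionaries; how the Zod side is converted is not shown in this message. The helper names `field_contract` and `diff_contracts` are hypothetical.

```python
def field_contract(schema: dict) -> dict:
    """Map each top-level property name to whether it is required."""
    required = set(schema.get("required", []))
    return {name: name in required for name in schema.get("properties", {})}


def diff_contracts(ts_schema: dict, py_schema: dict) -> list:
    """List human-readable mismatches between the two schemas."""
    ts = field_contract(ts_schema)
    py = field_contract(py_schema)
    problems = []
    for name in sorted(set(ts) | set(py)):
        if name not in py:
            problems.append(f"{name}: only in TS schema")
        elif name not in ts:
            problems.append(f"{name}: only in Python schema")
        elif ts[name] != py[name]:
            problems.append(f"{name}: required flag differs")
    return problems
```

An empty result means the two schemas agree on field names and required flags. A field that exists on only one side is the kind of drift `extra="forbid"` would otherwise surface at request time.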

`loadAllSources` surfaces per-file load errors instead of swallowing them,
so validation/query paths can report manifest shard parse failures.
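
The change in error handling can be illustrated with a small sketch. It is not the real `loadAllSources`, which lives on the TypeScript side. Here `load_all_sources` and `LoadResult` are hypothetical, and files are modelled as an in-memory mapping of path to JSON text to keep the example self-contained.

```python
import json
from dataclasses import dataclass, field


@dataclass
class LoadResult:
    # Successfully parsed sources, keyed by file path.
    sources: dict = field(default_factory=dict)
    # Parse failures, keyed by file path, kept instead of being dropped.
    errors: dict = field(default_factory=dict)


def load_all_sources(files: dict) -> LoadResult:
    """Parse every file, recording failures per file rather than hiding them."""
    result = LoadResult()
    for path, text in files.items():
        try:
            result.sources[path] = json.loads(text)
        except json.JSONDecodeError as exc:
            result.errors[path] = str(exc)
    return result
```

Callers on the validation or query path can then report `result.errors` alongside whatever loaded cleanly.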
Andrey Avtomonov 2026-05-15 00:36:52 +02:00
parent 3e12a9fef4
commit f561bfa850
42 changed files with 847 additions and 193 deletions


@@ -3,7 +3,7 @@ from __future__ import annotations
 from enum import Enum
 from typing import Any, Literal

-from pydantic import BaseModel, Field, model_validator
+from pydantic import BaseModel, ConfigDict, Field, model_validator


 # ── Source Definition Models ──────────────────────────────────────────
@@ -105,6 +105,8 @@ class DefaultTimeDimensionDbt(BaseModel):


 class SourceDefinition(BaseModel):
+    model_config = ConfigDict(extra="forbid")
+
     name: str
     description: str | None = None
     descriptions: dict[str, str] | None = None
@@ -123,6 +125,8 @@ class SourceDefinition(BaseModel):
     def validate_source(self) -> SourceDefinition:
         if self.description is None:
             self.description = _resolve_description_map(self.descriptions)
+        if not self.table and not self.sql:
+            raise ValueError("resolved source must have 'table' or 'sql'")
         if self.table and self.sql:
             raise ValueError("'table' and 'sql' are mutually exclusive")
         if not self.grain: