* Improve schema setup and Notion ingest UX * Handle Postgres network scan failures * WIP: save local changes before main merge * Refine setup prompt choices * Tighten ingest reconciliation guidance * Commit setup config updates * Canonicalize unmapped fallback details * Count reconciliation actions in reports * Harden semantic layer source validation * Return wiki content after edits * Validate SL sources against manifests * Validate wiki refs before writes * Simplify CLI next steps * Clarify agent setup summary * Surface dbt target SL sources * Recover SL write fallbacks * Preserve failed context build metadata * Track raw paths for ingest actions * test(cli): update seeded demo expectations * fix(ingest): scope fallback recovery checks * fix(sl): tighten source validation guards * fix(wiki): ignore empty embedding vectors * Improve Notion ingest UX * Enforce flat wiki keys * test(context): update wiki key assertion --------- Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
| name | description | callers |
|---|---|---|
| notion_synthesize | Synthesize durable KTX wiki pages and semantic-layer sources from staged Notion pages, databases, data-source rows, and clustered Notion evidence. Load when a WorkUnit contains Notion raw files or Notion evidence chunks. | |
Notion Cluster Synthesis
Use this skill when a WorkUnit contains staged Notion content from pages/**, databases/**, data-sources/**, or clustered Notion evidence.
Role
Each WorkUnit is either a single Notion page/span or a topical cluster of related Notion pages, pre-grouped by embedding similarity. Read the assigned raw files, then write a small set of durable wiki entries and, when applicable, semantic-layer sources that synthesize the WorkUnit's knowledge. Write final memory directly; do not write candidates.
Required Workflow
- Read the WorkUnit notes and `rawFiles` list. Page content lives in `page.md`; `metadata.json` holds title, path, object type, data-source ids, last edited metadata, and properties.
- For each assigned page, call `read_raw_file`, or `read_raw_span` for oversized pages when the notes specify a span.
- Search `wiki_search` for existing pages that overlap the WorkUnit topics. Prefer updating an existing page over creating a duplicate.
- Use `context_evidence_search`, `context_evidence_read`, and `context_evidence_neighbors` to pull supporting chunks when indexed evidence is relevant. Pass `chunkId` and `documentId` values verbatim as returned by the evidence tools.
- Write durable business knowledge with `wiki_write`. Aim for a small number of high-quality pages per WorkUnit or cluster. Include `rawPaths` with the exact Notion raw files that support each page.
- When the Notion content defines a reusable dataset, metric, segment, join rule, source-of-truth mapping, or table with explicit columns, load `sl_capture`, discover existing sources first with `sl_discover` or `sl_read_source`, then use `sl_write_source` or `sl_edit_source` only for a confirmed mapped non-Notion target source. Include `rawPaths` with the exact Notion raw files that support the SL action. If no mapped target exists, call `emit_unmapped_fallback` and keep the content wiki-only.
- For every deleted raw path in the Eviction Set, call `eviction_list`, decide retention, then `context_eviction_decision_write`. Do this even when no wiki write is needed.
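The read step of the workflow can be sketched as a small planner that picks `read_raw_file` or `read_raw_span` per assigned file. This is a minimal sketch under stated assumptions: the `RawFile` shape, the argument names (`path`, `start`, `end`), and the example paths are hypothetical and are not the tools' documented signatures.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class RawFile:
    # Hypothetical shape; the real WorkUnit schema is not defined in this skill.
    path: str
    span: Optional[Tuple[int, int]] = None  # set only when the notes assign a span


def plan_reads(raw_files: List[RawFile]) -> List[Dict]:
    """Return one planned tool call per assigned raw file, in input order."""
    calls = []
    for raw in raw_files:
        if raw.span is not None:
            # Oversized page: read only the assigned span.
            start, end = raw.span
            calls.append({
                "tool": "read_raw_span",
                "args": {"path": raw.path, "start": start, "end": end},
            })
        else:
            calls.append({"tool": "read_raw_file", "args": {"path": raw.path}})
    return calls


if __name__ == "__main__":
    # Illustrative paths only.
    plan = plan_reads([
        RawFile("pages/finance/page.md"),
        RawFile("pages/handbook/page.md", span=(0, 4000)),
    ])
    for call in plan:
        print(call["tool"], call["args"])
```

Planning the reads up front keeps the span rule explicit: a file with an assigned span is never read whole.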
What To Capture
Capture durable, reusable company knowledge:
- metric definitions, KPI formulas, named business concepts, and reusable filters
- workflows, policies, ownership rules, approval conventions, and source-of-truth mappings
- data-source row pages that describe tables, columns, semantic models, dashboards, or business entities
- cross-system aliases connecting Notion terms to warehouse, dbt, Looker, Metabase, or MetricFlow names
- caveats, conflicts, supersession notes, and customer/product assumptions affecting future analysis
Skip noisy or transient content:
- meeting notes with no reusable rule
- task lists, project status updates, and time-bounded snapshots
- duplicate docs with no new fact
- database metadata pages when row pages contain the actual business content
- transient announcements and long page summaries
Quality
Prefer fewer, stronger entries. Every wiki entry must cite at least one Notion page or row using its path and last edited date when available. When evidence conflicts, write a conflict note inside the wiki page rather than choosing silently.
If a clustered WorkUnit includes several related pages, synthesize the shared rule or concept instead of writing one thin page per source. For oversized page spans, read only the assigned span unless the WorkUnit explicitly asks for neighboring context.
Search existing wiki pages for the same tables: or sl_refs: frontmatter and for source-of-truth aliases before creating a new page. If an existing page already documents the same warehouse object or business concept, update it instead of creating a differently named duplicate.
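The duplicate check on `tables:` and `sl_refs:` frontmatter could look roughly like the following. It is a sketch, not the wiki tool's behavior: the page dictionaries (with `key` and `frontmatter` fields) and the case-insensitive comparison are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def find_overlapping_pages(
    existing_pages: Iterable[Dict],
    tables: Iterable[str],
    sl_refs: Iterable[str],
) -> List[str]:
    """Return keys of existing pages that share a table or SL ref with a draft page."""
    wanted_tables = {t.lower() for t in tables}
    wanted_refs = {r.lower() for r in sl_refs}
    matches = []
    for page in existing_pages:
        front = page.get("frontmatter", {})
        page_tables = {t.lower() for t in front.get("tables", [])}
        page_refs = {r.lower() for r in front.get("sl_refs", [])}
        # Any shared warehouse object or SL source marks the page as an update target.
        if (page_tables & wanted_tables) or (page_refs & wanted_refs):
            matches.append(page["key"])
    return matches


if __name__ == "__main__":
    pages = [
        {"key": "customer-definition",
         "frontmatter": {"tables": ["orbit_analytics.customer"]}},
        {"key": "refund-policy", "frontmatter": {}},
    ]
    print(find_overlapping_pages(pages, ["ORBIT_ANALYTICS.CUSTOMER"], []))
```

A non-empty result means the existing page should be updated instead of writing a differently named duplicate.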
Citation Style
## Revenue Recognition
- Booked revenue excludes refunds and test accounts.
- Source: Notion - Company Handbook / Finance / Revenue Recognition, last edited 2026-04-12.
- Conflict note: An older Sales Ops page uses gross revenue before refunds; treat the Finance Handbook as current unless Finance says otherwise.
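The `Source:` line in the example above can be produced mechanically from a page's path and last edited date. A minimal sketch, assuming the path segments and date have already been read from `metadata.json`; the function name and parameters are hypothetical.

```python
from typing import Optional, Sequence


def format_citation(path_parts: Sequence[str], last_edited: Optional[str] = None) -> str:
    """Build a '- Source: ...' bullet in the citation style shown above."""
    if not path_parts:
        raise ValueError("a citation needs at least one path segment")
    line = "- Source: Notion - " + " / ".join(path_parts)
    if last_edited:
        # The last edited date is included only when available.
        line += f", last edited {last_edited}"
    return line + "."


if __name__ == "__main__":
    print(format_citation(
        ["Company Handbook", "Finance", "Revenue Recognition"], "2026-04-12"
    ))
```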
Semantic-Layer Rules
- Load `sl_capture` before writing or editing SL sources.
- Discover existing sources first with `sl_discover`; read existing source YAML before editing.
- Prefer overlays on manifest-backed sources over standalone SQL.
- If Notion describes a dashboard or metric but does not define executable logic, write a wiki page and attach `sl_refs` only after confirming the referenced source exists.
- Notion `dataSourceCount` counts Notion databases/data sources only. It does not prove that a warehouse/dbt table has or lacks a mapped semantic-layer source.
- Do not create SL sources under the Notion connection just because a page mentions a warehouse, dbt, Looker, or Metabase object. Use the mapped warehouse/source connection after discovery, or emit an unmapped fallback and write wiki-only.
- Distinguish fallback reasons precisely: if a non-Notion warehouse/dbt connection exists but `sl_discover` cannot find the named table/source, use `no_physical_table`; reserve `no_connection_mapping` for cases where there is no plausible non-Notion target connection at all.
- If `sl_discover` resolves the table/source, do not call `emit_unmapped_fallback` for that table. Use the resolved source for `sl_refs`, overlay edits, or wiki-only documentation.
- When calling `emit_unmapped_fallback`, pass the table or source identifier as `tableRef` (e.g. `tableRef: "orbit_analytics.customer"`). The tool generates the canonical detail string from the reason code and `tableRef`. Use the optional `clarification` field only to add context that does not contradict the reason. Do not restate the reason in `clarification`.
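The fallback rules above reduce to a small decision function. The sketch below is illustrative only: the boolean inputs and the `reason` argument name are assumptions, while `tableRef`, `clarification`, and the two reason codes come from this skill.

```python
from typing import Dict, Optional


def choose_fallback_reason(
    has_non_notion_connection: bool, discover_resolved: bool
) -> Optional[str]:
    """Return the fallback reason code, or None when no fallback should be emitted."""
    if discover_resolved:
        # sl_discover found the table/source: use it, do not emit a fallback.
        return None
    if has_non_notion_connection:
        # A plausible target connection exists, but the named table was not found.
        return "no_physical_table"
    return "no_connection_mapping"


def build_fallback_args(
    reason: str, table_ref: str, clarification: Optional[str] = None
) -> Dict[str, str]:
    """Assemble arguments for emit_unmapped_fallback ('reason' key is an assumption)."""
    args = {"reason": reason, "tableRef": table_ref}
    if clarification:
        # Optional context only; it must not restate or contradict the reason.
        args["clarification"] = clarification
    return args


if __name__ == "__main__":
    reason = choose_fallback_reason(has_non_notion_connection=True, discover_resolved=False)
    if reason is not None:
        print(build_fallback_args(reason, "orbit_analytics.customer"))
```

Checking `discover_resolved` first keeps a resolved source from ever producing a fallback.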
Tools
Allowed: read_raw_file, read_raw_span, wiki_search, wiki_read, wiki_write, sl_discover, sl_read_source, sl_write_source, sl_edit_source, sl_validate, context_evidence_search, context_evidence_read, context_evidence_neighbors, emit_unmapped_fallback, eviction_list, context_eviction_decision_write.
Not allowed: context_candidate_write, context_candidate_mark.
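A caller that wants to enforce these lists before dispatching a tool call might use a guard like the one below. This is a hypothetical sketch: the skill does not describe how the lists are enforced, and `check_tool` is an illustrative name.

```python
ALLOWED_TOOLS = frozenset({
    "read_raw_file", "read_raw_span",
    "wiki_search", "wiki_read", "wiki_write",
    "sl_discover", "sl_read_source", "sl_write_source", "sl_edit_source", "sl_validate",
    "context_evidence_search", "context_evidence_read", "context_evidence_neighbors",
    "emit_unmapped_fallback", "eviction_list", "context_eviction_decision_write",
})

FORBIDDEN_TOOLS = frozenset({"context_candidate_write", "context_candidate_mark"})


def check_tool(name: str) -> None:
    """Raise PermissionError unless the tool is on the allowed list for this skill."""
    if name in FORBIDDEN_TOOLS:
        raise PermissionError(f"{name} is not allowed: write final memory, not candidates")
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"{name} is not on the allowed list for this skill")


if __name__ == "__main__":
    check_tool("wiki_write")  # passes silently
    try:
        check_tool("context_candidate_write")
    except PermissionError as err:
        print(err)
```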