From 730712b73fbcb7587840ff0fc2ad3c08216d6dd5 Mon Sep 17 00:00:00 2001 From: Andrew Altshuler Date: Wed, 13 May 2026 17:26:06 +0300 Subject: [PATCH] codeowners: yml source of truth + generator + drift CI (#88) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * codeowners: generator + drift CI + initial roles Source-of-truth approach to CODEOWNERS: yml is hand-edited, CODEOWNERS is generated and CI-enforced. Every role change is a reviewable PR with a permanent in-repo audit trail. No GitHub UI clicks, no shadow state. Initial roles: engineering @aaltshuler owns crates/** + default (.github/, scripts/, Cargo.*, openapi.json, everything else not docs) docs @aaltshuler @ragnorc owns docs/**, README.md, AGENTS.md, CLAUDE.md, SECURITY.md Per GitHub semantics, multiple owners on a CODEOWNERS line means "any one satisfies the review" — for docs, either named member can approve. Strict "N distinct approvers" would need a CI workaround (not wired today; tracked for future hardening). Components: - .github/codeowners-roles.yml — source of truth. Edit this. - .github/scripts/render-codeowners.py — generator (PyYAML; ~100 LoC). - .github/CODEOWNERS — generated. CI rejects hand-edits. - .github/workflows/codeowners.yml — two checks: * drift: re-render and assert CODEOWNERS matches. * noedit: reject PRs that edit CODEOWNERS without editing the yml. - docs/codeowners.md — explains the source-of-truth pattern, how to change roles, how to add new roles. - AGENTS.md topic-index row. What's NOT in this PR: - Branch protection on main (separate PR; needs `gh api` call against the org). - Required-reviewer enforcement (depends on branch protection landing). - Required CI status checks (depends on branch protection landing). - Scheduled rotation (the schedule: block in the yml + a weekly workflow). Today's roles are stable; rotation isn't needed yet. - Linear-as-source-of-truth integration (Approach 4 from the design discussion; deferred). Verified: - Generator output is deterministic (idempotent re-runs). - scripts/check-agents-md.sh OK (28 links, 28 docs). Co-Authored-By: Claude Opus 4.7 (1M context) * codeowners: fix catch-all ordering (Devin review #88) Devin caught a real bug: GitHub CODEOWNERS uses "last match wins" semantics, but the generator emitted the catch-all `*` AFTER specific patterns. Net effect: `*` won for every file, silently nullifying the docs role and never routing reviews to @ragnorc. Fix is one-line — emit the default `*` line before iterating the specific paths. Also: - Added a regression assertion in the generator: after rendering, the first non-comment line must start with `*` if a default is configured. Generator exits non-zero otherwise. Catches the same class of mistake in any future refactor. - Rewrote the yml header comment, which incorrectly stated "keep more-specific paths after broader patterns" (correct for GitHub semantics but the generator was doing the opposite — so the comment read as a description of behavior when it was actually a contradicted intention). Verified by re-rendering: `*` is now line 12, `crates/**` is line 14, `docs/**` is line 15, etc. README.md matches both `*` and `README.md`; `README.md` is later → wins → @aaltshuler + @ragnorc both assigned. Co-Authored-By: Claude Opus 4.7 (1M context) --------- Co-authored-by: Claude Opus 4.7 (1M context) --- .github/CODEOWNERS | 18 ++++ .github/codeowners-roles.yml | 57 ++++++++++++ .github/scripts/render-codeowners.py | 134 +++++++++++++++++++++++++++ .github/workflows/codeowners.yml | 66 +++++++++++++ AGENTS.md | 1 + docs/codeowners.md | 37 ++++++++ 6 files changed, 313 insertions(+) create mode 100644 .github/CODEOWNERS create mode 100644 .github/codeowners-roles.yml create mode 100755 .github/scripts/render-codeowners.py create mode 100644 .github/workflows/codeowners.yml create mode 100644 docs/codeowners.md diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS new file mode 100644 index 0000000..34dd9c6 --- /dev/null +++ b/.github/CODEOWNERS @@ -0,0 +1,18 @@ +# AUTOGENERATED from .github/codeowners-roles.yml. Do not edit by hand. +# +# To change role membership or path assignments: +# 1. Edit .github/codeowners-roles.yml +# 2. Run `python3 .github/scripts/render-codeowners.py` +# 3. Commit both files together +# +# CI fails if this file drifts from its source, and rejects PRs that +# edit this file directly without also editing the yml. + +* @aaltshuler + +crates/** @aaltshuler +docs/** @aaltshuler @ragnorc +README.md @aaltshuler @ragnorc +AGENTS.md @aaltshuler @ragnorc +CLAUDE.md @aaltshuler @ragnorc +SECURITY.md @aaltshuler @ragnorc diff --git a/.github/codeowners-roles.yml b/.github/codeowners-roles.yml new file mode 100644 index 0000000..9fdc8e5 --- /dev/null +++ b/.github/codeowners-roles.yml @@ -0,0 +1,57 @@ +# Source of truth for .github/CODEOWNERS. +# +# How to change role membership or path assignments: +# 1. Edit this file. +# 2. Run `python3 .github/scripts/render-codeowners.py` to regenerate +# .github/CODEOWNERS. +# 3. Commit both files in the same PR. +# +# CI fails on drift between this source and the generated CODEOWNERS +# (see .github/workflows/codeowners.yml). CI also rejects direct edits +# to .github/CODEOWNERS that don't accompany a change here. +# +# Why a generator instead of editing CODEOWNERS directly? +# The yml is the audit trail: `git log .github/codeowners-roles.yml` +# shows every role change with a reviewable diff and a merge commit. +# The rendered CODEOWNERS is what GitHub reads at PR time. + +roles: + engineering: + description: > + All production code under crates/**. Engine, CLI, server, + compiler. Single owner; review must come from this person. + members: + - aaltshuler + + docs: + description: > + Documentation under docs/**, plus repo-level docs (README.md, + AGENTS.md, CLAUDE.md symlink, SECURITY.md). Either named member + can approve; both are listed so reviews can route to whoever is + available. + members: + - aaltshuler + - ragnorc + +# Path → role mapping. GitHub CODEOWNERS uses "last match wins" +# semantics — when multiple patterns match a file, only the last +# matching pattern's owners apply. The generator handles this by +# emitting `default` as the first `*` line and the specific patterns +# below afterward, so specific paths override the catch-all. +# +# Within this list, order matters only between overlapping specific +# patterns (the later one wins). Today nothing overlaps; future +# additions should keep more-specific patterns later. +paths: + "crates/**": [engineering] + "docs/**": [docs] + "README.md": [docs] + "AGENTS.md": [docs] + "CLAUDE.md": [docs] + "SECURITY.md": [docs] + +# Catch-all for paths not explicitly mapped (.github/, scripts/, +# Cargo.toml, Cargo.lock, openapi.json, LICENSE, etc.). Defaults to +# engineering — every change to repo infrastructure needs the +# engineering owner's review. +default: [engineering] diff --git a/.github/scripts/render-codeowners.py b/.github/scripts/render-codeowners.py new file mode 100755 index 0000000..f243d0c --- /dev/null +++ b/.github/scripts/render-codeowners.py @@ -0,0 +1,134 @@ +#!/usr/bin/env python3 +"""Render .github/CODEOWNERS from .github/codeowners-roles.yml. + +The yml is the source of truth — editing CODEOWNERS directly is +rejected by CI (see .github/workflows/codeowners.yml). This script +expands the role-based yml into the flat path→owners format GitHub +expects. + +Usage: + python3 .github/scripts/render-codeowners.py + +Exits non-zero on: + - Missing PyYAML. + - Unknown role referenced in `paths` or `default`. + - Role with no members (a role must always resolve to at least + one owner; otherwise CODEOWNERS would assign nobody and GitHub + would silently fall back to "no required reviewer", which + defeats the purpose). +""" + +from __future__ import annotations + +import sys +from pathlib import Path + +try: + import yaml +except ImportError: + sys.exit( + "error: PyYAML is required. Install with `pip install pyyaml` " + "or `python3 -m pip install pyyaml`." + ) + +REPO_ROOT = Path(__file__).resolve().parents[2] +SOURCE = REPO_ROOT / ".github" / "codeowners-roles.yml" +OUTPUT = REPO_ROOT / ".github" / "CODEOWNERS" + +BANNER = """\ +# AUTOGENERATED from .github/codeowners-roles.yml. Do not edit by hand. +# +# To change role membership or path assignments: +# 1. Edit .github/codeowners-roles.yml +# 2. Run `python3 .github/scripts/render-codeowners.py` +# 3. Commit both files together +# +# CI fails if this file drifts from its source, and rejects PRs that +# edit this file directly without also editing the yml. +""" + + +def resolve(role_name: str, roles: dict) -> list[str]: + role = roles.get(role_name) + if role is None: + sys.exit( + f"error: unknown role '{role_name}'. " + f"Known roles: {sorted(roles.keys())}" + ) + members = role.get("members") or [] + if not members: + sys.exit( + f"error: role '{role_name}' has no members. " + f"A role must resolve to at least one owner." + ) + return members + + +def owners_for(role_names: list[str], roles: dict) -> list[str]: + """Return @-prefixed GitHub handles, deduped, preserving order.""" + seen: list[str] = [] + for role_name in role_names: + for member in resolve(role_name, roles): + handle = f"@{member}" + if handle not in seen: + seen.append(handle) + return seen + + +def main() -> int: + if not SOURCE.exists(): + sys.exit(f"error: source file not found: {SOURCE}") + spec = yaml.safe_load(SOURCE.read_text()) + + roles = spec.get("roles") or {} + if not roles: + sys.exit("error: codeowners-roles.yml declares no roles") + + paths = spec.get("paths") or {} + if not paths: + sys.exit("error: codeowners-roles.yml declares no paths") + + lines: list[str] = [BANNER] + + # Pad the path column for alignment. Width is the longest pattern + # plus a small margin. + width = max(len(p) for p in paths) + 2 + + # GitHub CODEOWNERS uses "last match wins" semantics. Emit the + # default catch-all `*` FIRST so specific patterns below override + # it for the paths they cover. If we emitted `*` last, every file + # would resolve to the default owners regardless of more-specific + # rules — which would silently nullify any role distinction. + if "default" in spec: + default_owners = owners_for(spec["default"], roles) + lines.append(f"{'*':<{width}} {' '.join(default_owners)}") + lines.append("") + + for pattern, role_names in paths.items(): + owners = owners_for(role_names, roles) + lines.append(f"{pattern:<{width}} {' '.join(owners)}") + + lines.append("") # trailing newline so the file ends cleanly + + rendered = "\n".join(lines) + + # Regression check: the catch-all `*` line (if any) must precede + # every specific-path line. Failure here means the generator is + # silently nullifying specific rules. + if "default" in spec: + non_comment = [ln for ln in rendered.splitlines() if ln and not ln.startswith("#")] + first_pattern = non_comment[0].split()[0] if non_comment else None + if first_pattern != "*": + sys.exit( + f"error: generator invariant violated — first emitted pattern is " + f"{first_pattern!r}, expected '*'. CODEOWNERS uses last-match-wins; " + f"the catch-all must come first." + ) + + OUTPUT.write_text(rendered) + print(f"wrote {OUTPUT.relative_to(REPO_ROOT)}") + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/.github/workflows/codeowners.yml b/.github/workflows/codeowners.yml new file mode 100644 index 0000000..19d5835 --- /dev/null +++ b/.github/workflows/codeowners.yml @@ -0,0 +1,66 @@ +name: CODEOWNERS + +on: + pull_request: + paths: + - '.github/codeowners-roles.yml' + - '.github/CODEOWNERS' + - '.github/scripts/render-codeowners.py' + - '.github/workflows/codeowners.yml' + workflow_dispatch: + +# Read-only; we never push from this workflow. +permissions: + contents: read + +jobs: + drift: + name: CODEOWNERS matches source + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v5.0.1 + + - name: Set up Python + uses: actions/setup-python@v5.4.0 + with: + python-version: '3.13' + + - name: Install PyYAML + run: pip install pyyaml + + - name: Re-render CODEOWNERS + run: python3 .github/scripts/render-codeowners.py + + - name: Reject drift + run: | + if ! git diff --quiet .github/CODEOWNERS; then + echo "::error::.github/CODEOWNERS is out of sync with .github/codeowners-roles.yml." + echo "::error::Run \`python3 .github/scripts/render-codeowners.py\` locally and commit the result." + echo "--- diff ---" + git --no-pager diff .github/CODEOWNERS + exit 1 + fi + echo "CODEOWNERS is in sync with its source." + + noedit: + name: CODEOWNERS not hand-edited + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v5.0.1 + with: + # Need history so we can diff against the PR base. + fetch-depth: 0 + + - name: Reject hand-edits to generated file + run: | + base="origin/${{ github.base_ref }}" + git fetch origin "${{ github.base_ref }}" --quiet + changed=$(git diff --name-only "$base" HEAD) + edited_generated=$(echo "$changed" | grep -E '^\.github/CODEOWNERS$' || true) + edited_source=$(echo "$changed" | grep -E '^\.github/codeowners-roles\.yml$' || true) + if [ -n "$edited_generated" ] && [ -z "$edited_source" ]; then + echo "::error::This PR edits .github/CODEOWNERS but not its source .github/codeowners-roles.yml." + echo "::error::Edit the yml and regenerate via \`python3 .github/scripts/render-codeowners.py\`." + exit 1 + fi + echo "CODEOWNERS edits accompany source edits (or no CODEOWNERS edits in this PR)." diff --git a/AGENTS.md b/AGENTS.md index 043de89..c6bf097 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -93,6 +93,7 @@ Full diagram and concurrency model: [docs/architecture.md](docs/architecture.md) | Install (binary / Homebrew / source / channels) | [docs/install.md](docs/install.md) | | Deployment (binary / container / RustFS bootstrap / auth / build variants) | [docs/deployment.md](docs/deployment.md) | | CI / release workflows | [docs/ci.md](docs/ci.md) | +| Code ownership (CODEOWNERS source of truth, roles, regeneration) | [docs/codeowners.md](docs/codeowners.md) | | Constants & tunables cheat sheet | [docs/constants.md](docs/constants.md) | | Per-version release notes | [docs/releases/](docs/releases/) | diff --git a/docs/codeowners.md b/docs/codeowners.md new file mode 100644 index 0000000..ad388ea --- /dev/null +++ b/docs/codeowners.md @@ -0,0 +1,37 @@ +# Code ownership + +`.github/CODEOWNERS` is **generated** — not hand-edited. The source of truth is `.github/codeowners-roles.yml`, expanded by `.github/scripts/render-codeowners.py`. CI rejects drift between the two and rejects direct edits to `CODEOWNERS` that don't accompany a yml change. + +This setup gives every role change a reviewable PR and a permanent in-repo audit trail (`git log .github/codeowners-roles.yml`). + +## Current roles + +| Role | Members | Scope | +|---|---|---| +| `engineering` | `@aaltshuler` | All code under `crates/**`, repo infrastructure, default for unmapped paths | +| `docs` | `@aaltshuler`, `@ragnorc` | `docs/**`, README.md, AGENTS.md, CLAUDE.md, SECURITY.md | + +GitHub treats multiple owners in a CODEOWNERS line as **"any one of them satisfies the review requirement"**. For docs, either named member can approve. To require N distinct approvers on a specific path, layer a CI check on top (not currently configured). + +## How to change role membership or path mappings + +1. Edit `.github/codeowners-roles.yml`. +2. Run `python3 .github/scripts/render-codeowners.py` (requires PyYAML; `pip install pyyaml`). +3. Commit both files in the same PR. + +CI fails the PR if: +- `CODEOWNERS` was edited without a corresponding yml change, or +- The yml was changed but the rendered `CODEOWNERS` doesn't match. + +## How to add a new role + +1. Add a new entry to `roles:` in the yml with a `description` and `members` list. +2. Reference the role from `paths:` (or `default:`). +3. Regenerate + commit as above. + +## Why a generator, not direct CODEOWNERS edits? + +- **Audit trail**: `git log .github/codeowners-roles.yml` is the canonical record of every role change. The rendered `CODEOWNERS` is a derived artifact. +- **Roles are first-class**: paths reference roles, not raw handles. Renaming a person or rotating a role updates one place, not every path. +- **Future extension**: scheduled rotation (weekly on-call, quarterly leads) plugs into the same yml without changing the path mappings. Not enabled today. +- **Consistency with the product**: omnigraph itself enforces auditable Cedar policy. The repo's code-owner policy follows the same "policy as reviewed code" pattern.