codeowners: yml source of truth + generator + drift CI (#88)

* codeowners: generator + drift CI + initial roles

Source-of-truth approach to CODEOWNERS: yml is hand-edited, CODEOWNERS
is generated and CI-enforced. Every role change is a reviewable PR
with a permanent in-repo audit trail. No GitHub UI clicks, no shadow
state.

Initial roles:

  engineering  @aaltshuler            owns crates/** + default (.github/,
                                       scripts/, Cargo.*, openapi.json,
                                       everything else not docs)

  docs         @aaltshuler @ragnorc   owns docs/**, README.md, AGENTS.md,
                                       CLAUDE.md, SECURITY.md

Per GitHub semantics, multiple owners on a CODEOWNERS line means "any
one satisfies the review" — for docs, either named member can approve.
Strict "N distinct approvers" would need a CI workaround (not wired
today; tracked for future hardening).

Components:

- .github/codeowners-roles.yml — source of truth. Edit this.
- .github/scripts/render-codeowners.py — generator (PyYAML; ~100 LoC).
- .github/CODEOWNERS — generated. CI rejects hand-edits.
- .github/workflows/codeowners.yml — two checks:
  * drift: re-render and assert CODEOWNERS matches.
  * noedit: reject PRs that edit CODEOWNERS without editing the yml.
- docs/codeowners.md — explains the source-of-truth pattern, how to
  change roles, how to add new roles.
- AGENTS.md topic-index row.

What's NOT in this PR:

- Branch protection on main (separate PR; needs `gh api` call against
  the org).
- Required-reviewer enforcement (depends on branch protection landing).
- Required CI status checks (depends on branch protection landing).
- Scheduled rotation (the schedule: block in the yml + a weekly
  workflow). Today's roles are stable; rotation isn't needed yet.
- Linear-as-source-of-truth integration (Approach 4 from the design
  discussion; deferred).

Verified:
- Generator output is deterministic (idempotent re-runs).
- scripts/check-agents-md.sh OK (28 links, 28 docs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* codeowners: fix catch-all ordering (Devin review #88)

Devin caught a real bug: GitHub CODEOWNERS uses "last match wins"
semantics, but the generator emitted the catch-all `*` AFTER specific
patterns. Net effect: `*` won for every file, silently nullifying the
docs role and never routing reviews to @ragnorc.

Fix is one-line — emit the default `*` line before iterating the
specific paths. Also:

- Added a regression assertion in the generator: after rendering, the
  first non-comment line must start with `*` if a default is
  configured. Generator exits non-zero otherwise. Catches the same
  class of mistake in any future refactor.
- Rewrote the yml header comment, which incorrectly stated "keep
  more-specific paths after broader patterns" (correct for GitHub
  semantics but the generator was doing the opposite — so the comment
  read as a description of behavior when it was actually a contradicted
  intention).

Verified by re-rendering: `*` is now line 12, `crates/**` is line 14,
`docs/**` is line 15, etc. README.md matches both `*` and `README.md`;
`README.md` is later → wins → @aaltshuler + @ragnorc both assigned.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Andrew Altshuler 2026-05-13 17:26:06 +03:00 committed by GitHub
parent c142dafdf3
commit 730712b73f
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
6 changed files with 313 additions and 0 deletions

18
.github/CODEOWNERS vendored Normal file
View file

@ -0,0 +1,18 @@
# AUTOGENERATED from .github/codeowners-roles.yml. Do not edit by hand.
#
# To change role membership or path assignments:
# 1. Edit .github/codeowners-roles.yml
# 2. Run `python3 .github/scripts/render-codeowners.py`
# 3. Commit both files together
#
# CI fails if this file drifts from its source, and rejects PRs that
# edit this file directly without also editing the yml.
* @aaltshuler
crates/** @aaltshuler
docs/** @aaltshuler @ragnorc
README.md @aaltshuler @ragnorc
AGENTS.md @aaltshuler @ragnorc
CLAUDE.md @aaltshuler @ragnorc
SECURITY.md @aaltshuler @ragnorc

57
.github/codeowners-roles.yml vendored Normal file
View file

@ -0,0 +1,57 @@
# Source of truth for .github/CODEOWNERS.
#
# How to change role membership or path assignments:
# 1. Edit this file.
# 2. Run `python3 .github/scripts/render-codeowners.py` to regenerate
# .github/CODEOWNERS.
# 3. Commit both files in the same PR.
#
# CI fails on drift between this source and the generated CODEOWNERS
# (see .github/workflows/codeowners.yml). CI also rejects direct edits
# to .github/CODEOWNERS that don't accompany a change here.
#
# Why a generator instead of editing CODEOWNERS directly?
# The yml is the audit trail: `git log .github/codeowners-roles.yml`
# shows every role change with a reviewable diff and a merge commit.
# The rendered CODEOWNERS is what GitHub reads at PR time.
roles:
engineering:
description: >
All production code under crates/**. Engine, CLI, server,
compiler. Single owner; review must come from this person.
members:
- aaltshuler
docs:
description: >
Documentation under docs/**, plus repo-level docs (README.md,
AGENTS.md, CLAUDE.md symlink, SECURITY.md). Either named member
can approve; both are listed so reviews can route to whoever is
available.
members:
- aaltshuler
- ragnorc
# Path → role mapping. GitHub CODEOWNERS uses "last match wins"
# semantics — when multiple patterns match a file, only the last
# matching pattern's owners apply. The generator handles this by
# emitting `default` as the first `*` line and the specific patterns
# below afterward, so specific paths override the catch-all.
#
# Within this list, order matters only between overlapping specific
# patterns (the later one wins). Today nothing overlaps; future
# additions should keep more-specific patterns later.
paths:
"crates/**": [engineering]
"docs/**": [docs]
"README.md": [docs]
"AGENTS.md": [docs]
"CLAUDE.md": [docs]
"SECURITY.md": [docs]
# Catch-all for paths not explicitly mapped (.github/, scripts/,
# Cargo.toml, Cargo.lock, openapi.json, LICENSE, etc.). Defaults to
# engineering — every change to repo infrastructure needs the
# engineering owner's review.
default: [engineering]

134
.github/scripts/render-codeowners.py vendored Executable file
View file

@ -0,0 +1,134 @@
#!/usr/bin/env python3
"""Render .github/CODEOWNERS from .github/codeowners-roles.yml.
The yml is the source of truth editing CODEOWNERS directly is
rejected by CI (see .github/workflows/codeowners.yml). This script
expands the role-based yml into the flat pathowners format GitHub
expects.
Usage:
python3 .github/scripts/render-codeowners.py
Exits non-zero on:
- Missing PyYAML.
- Unknown role referenced in `paths` or `default`.
- Role with no members (a role must always resolve to at least
one owner; otherwise CODEOWNERS would assign nobody and GitHub
would silently fall back to "no required reviewer", which
defeats the purpose).
"""
from __future__ import annotations
import sys
from pathlib import Path
try:
import yaml
except ImportError:
sys.exit(
"error: PyYAML is required. Install with `pip install pyyaml` "
"or `python3 -m pip install pyyaml`."
)
REPO_ROOT = Path(__file__).resolve().parents[2]
SOURCE = REPO_ROOT / ".github" / "codeowners-roles.yml"
OUTPUT = REPO_ROOT / ".github" / "CODEOWNERS"
BANNER = """\
# AUTOGENERATED from .github/codeowners-roles.yml. Do not edit by hand.
#
# To change role membership or path assignments:
# 1. Edit .github/codeowners-roles.yml
# 2. Run `python3 .github/scripts/render-codeowners.py`
# 3. Commit both files together
#
# CI fails if this file drifts from its source, and rejects PRs that
# edit this file directly without also editing the yml.
"""
def resolve(role_name: str, roles: dict) -> list[str]:
role = roles.get(role_name)
if role is None:
sys.exit(
f"error: unknown role '{role_name}'. "
f"Known roles: {sorted(roles.keys())}"
)
members = role.get("members") or []
if not members:
sys.exit(
f"error: role '{role_name}' has no members. "
f"A role must resolve to at least one owner."
)
return members
def owners_for(role_names: list[str], roles: dict) -> list[str]:
"""Return @-prefixed GitHub handles, deduped, preserving order."""
seen: list[str] = []
for role_name in role_names:
for member in resolve(role_name, roles):
handle = f"@{member}"
if handle not in seen:
seen.append(handle)
return seen
def main() -> int:
if not SOURCE.exists():
sys.exit(f"error: source file not found: {SOURCE}")
spec = yaml.safe_load(SOURCE.read_text())
roles = spec.get("roles") or {}
if not roles:
sys.exit("error: codeowners-roles.yml declares no roles")
paths = spec.get("paths") or {}
if not paths:
sys.exit("error: codeowners-roles.yml declares no paths")
lines: list[str] = [BANNER]
# Pad the path column for alignment. Width is the longest pattern
# plus a small margin.
width = max(len(p) for p in paths) + 2
# GitHub CODEOWNERS uses "last match wins" semantics. Emit the
# default catch-all `*` FIRST so specific patterns below override
# it for the paths they cover. If we emitted `*` last, every file
# would resolve to the default owners regardless of more-specific
# rules — which would silently nullify any role distinction.
if "default" in spec:
default_owners = owners_for(spec["default"], roles)
lines.append(f"{'*':<{width}} {' '.join(default_owners)}")
lines.append("")
for pattern, role_names in paths.items():
owners = owners_for(role_names, roles)
lines.append(f"{pattern:<{width}} {' '.join(owners)}")
lines.append("") # trailing newline so the file ends cleanly
rendered = "\n".join(lines)
# Regression check: the catch-all `*` line (if any) must precede
# every specific-path line. Failure here means the generator is
# silently nullifying specific rules.
if "default" in spec:
non_comment = [ln for ln in rendered.splitlines() if ln and not ln.startswith("#")]
first_pattern = non_comment[0].split()[0] if non_comment else None
if first_pattern != "*":
sys.exit(
f"error: generator invariant violated — first emitted pattern is "
f"{first_pattern!r}, expected '*'. CODEOWNERS uses last-match-wins; "
f"the catch-all must come first."
)
OUTPUT.write_text(rendered)
print(f"wrote {OUTPUT.relative_to(REPO_ROOT)}")
return 0
if __name__ == "__main__":
sys.exit(main())

66
.github/workflows/codeowners.yml vendored Normal file
View file

@ -0,0 +1,66 @@
name: CODEOWNERS
on:
pull_request:
paths:
- '.github/codeowners-roles.yml'
- '.github/CODEOWNERS'
- '.github/scripts/render-codeowners.py'
- '.github/workflows/codeowners.yml'
workflow_dispatch:
# Read-only; we never push from this workflow.
permissions:
contents: read
jobs:
drift:
name: CODEOWNERS matches source
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5.0.1
- name: Set up Python
uses: actions/setup-python@v5.4.0
with:
python-version: '3.13'
- name: Install PyYAML
run: pip install pyyaml
- name: Re-render CODEOWNERS
run: python3 .github/scripts/render-codeowners.py
- name: Reject drift
run: |
if ! git diff --quiet .github/CODEOWNERS; then
echo "::error::.github/CODEOWNERS is out of sync with .github/codeowners-roles.yml."
echo "::error::Run \`python3 .github/scripts/render-codeowners.py\` locally and commit the result."
echo "--- diff ---"
git --no-pager diff .github/CODEOWNERS
exit 1
fi
echo "CODEOWNERS is in sync with its source."
noedit:
name: CODEOWNERS not hand-edited
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5.0.1
with:
# Need history so we can diff against the PR base.
fetch-depth: 0
- name: Reject hand-edits to generated file
run: |
base="origin/${{ github.base_ref }}"
git fetch origin "${{ github.base_ref }}" --quiet
changed=$(git diff --name-only "$base" HEAD)
edited_generated=$(echo "$changed" | grep -E '^\.github/CODEOWNERS$' || true)
edited_source=$(echo "$changed" | grep -E '^\.github/codeowners-roles\.yml$' || true)
if [ -n "$edited_generated" ] && [ -z "$edited_source" ]; then
echo "::error::This PR edits .github/CODEOWNERS but not its source .github/codeowners-roles.yml."
echo "::error::Edit the yml and regenerate via \`python3 .github/scripts/render-codeowners.py\`."
exit 1
fi
echo "CODEOWNERS edits accompany source edits (or no CODEOWNERS edits in this PR)."

View file

@ -93,6 +93,7 @@ Full diagram and concurrency model: [docs/architecture.md](docs/architecture.md)
| Install (binary / Homebrew / source / channels) | [docs/install.md](docs/install.md) |
| Deployment (binary / container / RustFS bootstrap / auth / build variants) | [docs/deployment.md](docs/deployment.md) |
| CI / release workflows | [docs/ci.md](docs/ci.md) |
| Code ownership (CODEOWNERS source of truth, roles, regeneration) | [docs/codeowners.md](docs/codeowners.md) |
| Constants & tunables cheat sheet | [docs/constants.md](docs/constants.md) |
| Per-version release notes | [docs/releases/](docs/releases/) |

37
docs/codeowners.md Normal file
View file

@ -0,0 +1,37 @@
# Code ownership
`.github/CODEOWNERS` is **generated** — not hand-edited. The source of truth is `.github/codeowners-roles.yml`, expanded by `.github/scripts/render-codeowners.py`. CI rejects drift between the two and rejects direct edits to `CODEOWNERS` that don't accompany a yml change.
This setup gives every role change a reviewable PR and a permanent in-repo audit trail (`git log .github/codeowners-roles.yml`).
## Current roles
| Role | Members | Scope |
|---|---|---|
| `engineering` | `@aaltshuler` | All code under `crates/**`, repo infrastructure, default for unmapped paths |
| `docs` | `@aaltshuler`, `@ragnorc` | `docs/**`, README.md, AGENTS.md, CLAUDE.md, SECURITY.md |
GitHub treats multiple owners in a CODEOWNERS line as **"any one of them satisfies the review requirement"**. For docs, either named member can approve. To require N distinct approvers on a specific path, layer a CI check on top (not currently configured).
## How to change role membership or path mappings
1. Edit `.github/codeowners-roles.yml`.
2. Run `python3 .github/scripts/render-codeowners.py` (requires PyYAML; `pip install pyyaml`).
3. Commit both files in the same PR.
CI fails the PR if:
- `CODEOWNERS` was edited without a corresponding yml change, or
- The yml was changed but the rendered `CODEOWNERS` doesn't match.
## How to add a new role
1. Add a new entry to `roles:` in the yml with a `description` and `members` list.
2. Reference the role from `paths:` (or `default:`).
3. Regenerate + commit as above.
## Why a generator, not direct CODEOWNERS edits?
- **Audit trail**: `git log .github/codeowners-roles.yml` is the canonical record of every role change. The rendered `CODEOWNERS` is a derived artifact.
- **Roles are first-class**: paths reference roles, not raw handles. Renaming a person or rotating a role updates one place, not every path.
- **Future extension**: scheduled rotation (weekly on-call, quarterly leads) plugs into the same yml without changing the path mappings. Not enabled today.
- **Consistency with the product**: omnigraph itself enforces auditable Cedar policy. The repo's code-owner policy follows the same "policy as reviewed code" pattern.