omnigraph/.github/scripts/render-codeowners.py
Andrew Altshuler 730712b73f
codeowners: yml source of truth + generator + drift CI (#88)
* codeowners: generator + drift CI + initial roles

Source-of-truth approach to CODEOWNERS: yml is hand-edited, CODEOWNERS
is generated and CI-enforced. Every role change is a reviewable PR
with a permanent in-repo audit trail. No GitHub UI clicks, no shadow
state.

Initial roles:

  engineering  @aaltshuler            owns crates/** + default (.github/,
                                       scripts/, Cargo.*, openapi.json,
                                       everything else not docs)

  docs         @aaltshuler @ragnorc   owns docs/**, README.md, AGENTS.md,
                                       CLAUDE.md, SECURITY.md

Per GitHub semantics, multiple owners on a CODEOWNERS line means "any
one satisfies the review" — for docs, either named member can approve.
Strict "N distinct approvers" would need a CI workaround (not wired
today; tracked for future hardening).

Components:

- .github/codeowners-roles.yml — source of truth. Edit this.
- .github/scripts/render-codeowners.py — generator (PyYAML; ~100 LoC).
- .github/CODEOWNERS — generated. CI rejects hand-edits.
- .github/workflows/codeowners.yml — two checks:
  * drift: re-render and assert CODEOWNERS matches.
  * noedit: reject PRs that edit CODEOWNERS without editing the yml.
- docs/codeowners.md — explains the source-of-truth pattern, how to
  change roles, how to add new roles.
- AGENTS.md topic-index row.
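
The two workflow checks listed above can be sketched in Python. This is only an illustration of the logic, not the contents of `.github/workflows/codeowners.yml`; the helper names `drift_diff` and `edits_codeowners_only` are hypothetical, and the list of changed files is assumed to come from the CI environment.

```python
import difflib

# Paths taken from the component list; the helpers below are hypothetical.
CODEOWNERS = ".github/CODEOWNERS"
ROLES_YML = ".github/codeowners-roles.yml"


def drift_diff(committed: str, rendered: str) -> list[str]:
    """Return unified-diff lines; an empty list means no drift."""
    return list(
        difflib.unified_diff(
            committed.splitlines(),
            rendered.splitlines(),
            fromfile="committed",
            tofile="rendered",
            lineterm="",
        )
    )


def edits_codeowners_only(changed_files: list[str]) -> bool:
    """True when a change set touches CODEOWNERS but not the yml."""
    changed = set(changed_files)
    return CODEOWNERS in changed and ROLES_YML not in changed
```

A drift job would fail when `drift_diff` returns any lines, and a noedit job would fail when `edits_codeowners_only` returns `True`.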

What's NOT in this PR:

- Branch protection on main (separate PR; needs `gh api` call against
  the org).
- Required-reviewer enforcement (depends on branch protection landing).
- Required CI status checks (depends on branch protection landing).
- Scheduled rotation (the schedule: block in the yml + a weekly
  workflow). Today's roles are stable; rotation isn't needed yet.
- Linear-as-source-of-truth integration (Approach 4 from the design
  discussion; deferred).

Verified:
- Generator output is deterministic (idempotent re-runs).
- scripts/check-agents-md.sh OK (28 links, 28 docs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* codeowners: fix catch-all ordering (Devin review #88)

Devin caught a real bug: GitHub CODEOWNERS uses "last match wins"
semantics, but the generator emitted the catch-all `*` AFTER specific
patterns. Net effect: `*` won for every file, silently nullifying the
docs role and never routing reviews to @ragnorc.

Fix is one-line — emit the default `*` line before iterating the
specific paths. Also:

- Added a regression assertion in the generator: after rendering, the
  first non-comment line must start with `*` if a default is
  configured. Generator exits non-zero otherwise. Catches the same
  class of mistake in any future refactor.
- Rewrote the yml header comment. It said "keep more-specific paths
  after broader patterns", which is correct for GitHub semantics, but
  the generator was doing the opposite, so the comment read as a
  description of behavior when it actually stated an intention the
  code contradicted.

Verified by re-rendering: `*` is now line 12, `crates/**` is line 14,
`docs/**` is line 15, etc. README.md matches both `*` and `README.md`;
`README.md` is later → wins → @aaltshuler + @ragnorc both assigned.
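
The ordering bug can be reproduced with a small last-match-wins resolver. This is a simplified sketch: `fnmatchcase` stands in for GitHub's gitignore-style matcher and only approximates it, `resolve_owners` is a hypothetical helper, and the rule lists are shortened versions of the roles described in this commit.

```python
from fnmatch import fnmatchcase


def resolve_owners(path: str, rules: list[tuple[str, list[str]]]) -> list[str]:
    """Return the owners of the LAST rule whose pattern matches `path`."""
    owners: list[str] = []
    for pattern, rule_owners in rules:
        # fnmatchcase is only an approximation of CODEOWNERS pattern matching.
        if fnmatchcase(path, pattern):
            owners = rule_owners
    return owners


engineering = ["@aaltshuler"]
docs = ["@aaltshuler", "@ragnorc"]

# Order before the fix: catch-all emitted last, so it wins for every file.
broken = [("crates/**", engineering), ("README.md", docs), ("*", engineering)]
# Order after the fix: catch-all emitted first, specific rules override it.
fixed = [("*", engineering), ("crates/**", engineering), ("README.md", docs)]

print(resolve_owners("README.md", broken))  # ['@aaltshuler']
print(resolve_owners("README.md", fixed))   # ['@aaltshuler', '@ragnorc']
```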

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:26:06 +03:00

134 lines
4.4 KiB
Python
Executable file

#!/usr/bin/env python3
"""Render .github/CODEOWNERS from .github/codeowners-roles.yml.

The yml is the source of truth; editing CODEOWNERS directly is
rejected by CI (see .github/workflows/codeowners.yml). This script
expands the role-based yml into the flat path→owners format GitHub
expects.

Usage:
    python3 .github/scripts/render-codeowners.py

Exits non-zero on:
  - Missing PyYAML.
  - Unknown role referenced in `paths` or `default`.
  - Role with no members (a role must always resolve to at least
    one owner; otherwise CODEOWNERS would assign nobody and GitHub
    would silently fall back to "no required reviewer", which
    defeats the purpose).
"""
from __future__ import annotations

import sys
from pathlib import Path

try:
    import yaml
except ImportError:
    sys.exit(
        "error: PyYAML is required. Install with `pip install pyyaml` "
        "or `python3 -m pip install pyyaml`."
    )

REPO_ROOT = Path(__file__).resolve().parents[2]
SOURCE = REPO_ROOT / ".github" / "codeowners-roles.yml"
OUTPUT = REPO_ROOT / ".github" / "CODEOWNERS"

BANNER = """\
# AUTOGENERATED from .github/codeowners-roles.yml. Do not edit by hand.
#
# To change role membership or path assignments:
# 1. Edit .github/codeowners-roles.yml
# 2. Run `python3 .github/scripts/render-codeowners.py`
# 3. Commit both files together
#
# CI fails if this file drifts from its source, and rejects PRs that
# edit this file directly without also editing the yml.
"""


def resolve(role_name: str, roles: dict) -> list[str]:
    role = roles.get(role_name)
    if role is None:
        sys.exit(
            f"error: unknown role '{role_name}'. "
            f"Known roles: {sorted(roles.keys())}"
        )
    members = role.get("members") or []
    if not members:
        sys.exit(
            f"error: role '{role_name}' has no members. "
            f"A role must resolve to at least one owner."
        )
    return members


def owners_for(role_names: list[str], roles: dict) -> list[str]:
    """Return @-prefixed GitHub handles, deduped, preserving order."""
    seen: list[str] = []
    for role_name in role_names:
        for member in resolve(role_name, roles):
            handle = f"@{member}"
            if handle not in seen:
                seen.append(handle)
    return seen


def main() -> int:
    if not SOURCE.exists():
        sys.exit(f"error: source file not found: {SOURCE}")

    # An empty yml file loads as None; treat it as an empty mapping so the
    # checks below report a clear error instead of raising AttributeError.
    spec = yaml.safe_load(SOURCE.read_text(encoding="utf-8")) or {}
    roles = spec.get("roles") or {}
    if not roles:
        sys.exit("error: codeowners-roles.yml declares no roles")
    paths = spec.get("paths") or {}
    if not paths:
        sys.exit("error: codeowners-roles.yml declares no paths")

    lines: list[str] = [BANNER]

    # Pad the path column for alignment. Width is the longest pattern
    # plus a small margin.
    width = max(len(p) for p in paths) + 2

    # GitHub CODEOWNERS uses "last match wins" semantics. Emit the
    # default catch-all `*` FIRST so specific patterns below override
    # it for the paths they cover. If we emitted `*` last, every file
    # would resolve to the default owners regardless of more-specific
    # rules, which would silently nullify any role distinction.
    if "default" in spec:
        default_owners = owners_for(spec["default"], roles)
        lines.append(f"{'*':<{width}} {' '.join(default_owners)}")
        lines.append("")

    for pattern, role_names in paths.items():
        owners = owners_for(role_names, roles)
        lines.append(f"{pattern:<{width}} {' '.join(owners)}")

    lines.append("")  # trailing newline so the file ends cleanly
    rendered = "\n".join(lines)

    # Regression check: the catch-all `*` line (if any) must precede
    # every specific-path line. Failure here means the generator is
    # silently nullifying specific rules.
    if "default" in spec:
        non_comment = [
            ln for ln in rendered.splitlines() if ln and not ln.startswith("#")
        ]
        first_pattern = non_comment[0].split()[0] if non_comment else None
        if first_pattern != "*":
            sys.exit(
                f"error: generator invariant violated: first emitted pattern is "
                f"{first_pattern!r}, expected '*'. CODEOWNERS uses last-match-wins; "
                f"the catch-all must come first."
            )

    OUTPUT.write_text(rendered, encoding="utf-8")
    print(f"wrote {OUTPUT.relative_to(REPO_ROOT)}")
    return 0


if __name__ == "__main__":
    sys.exit(main())
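
The input shape the script expects can be inferred from `main()`: a `roles` mapping whose entries carry a `members` list, a `paths` mapping from pattern to role names, and an optional `default` list of role names. The sketch below renders the rule lines from an in-memory spec of that shape. The spec values echo the roles named in the commit message and are an assumption, not the actual contents of `codeowners-roles.yml`; `render_rules` is a hypothetical condensed helper that skips the banner and the error handling.

```python
# Hypothetical spec, shaped the way main() reads the parsed yml.
spec = {
    "roles": {
        "engineering": {"members": ["aaltshuler"]},
        "docs": {"members": ["aaltshuler", "ragnorc"]},
    },
    "default": ["engineering"],
    "paths": {
        "crates/**": ["engineering"],
        "docs/**": ["docs"],
    },
}


def render_rules(spec: dict) -> list[str]:
    """Render the rule lines (no banner), with the catch-all first."""

    def handles(role_names: list[str]) -> list[str]:
        # Expand roles to @-handles, deduped, preserving order.
        seen: list[str] = []
        for name in role_names:
            for member in spec["roles"][name]["members"]:
                handle = f"@{member}"
                if handle not in seen:
                    seen.append(handle)
        return seen

    width = max(len(p) for p in spec["paths"]) + 2
    lines: list[str] = []
    if "default" in spec:
        default_owners = " ".join(handles(spec["default"]))
        lines.append(f"{'*':<{width}} {default_owners}")
    for pattern, role_names in spec["paths"].items():
        owners = " ".join(handles(role_names))
        lines.append(f"{pattern:<{width}} {owners}")
    return lines


for line in render_rules(spec):
    print(line)
```

The `*` line comes out first, followed by `crates/**` and `docs/**` in the order the spec lists them, which is the ordering the regression check in `main()` enforces.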