feat: IAM service, gateway auth middleware, capability model, and CLIs (#849)

Replaces the legacy GATEWAY_SECRET shared-token gate with an IAM-backed
identity and authorisation model. The gateway no longer has an
"allow-all" or "no auth" mode; every request is authenticated via the
IAM service, authorised against a capability model that encodes both
the operation and the workspace it targets, and rejected with a
deliberately-uninformative 401 / 403 on any failure.

IAM service (trustgraph-flow/trustgraph/iam, trustgraph-base/schema/iam)
------------------------------------------------------------------------
* New backend service (iam-svc) owning users, workspaces, API keys,
  passwords and JWT signing keys in Cassandra. Reached over the
  standard pub/sub request/response pattern; gateway is the only
  caller.
* Operations: bootstrap, resolve-api-key, login,
  get-signing-key-public, rotate-signing-key,
  create/list/get/update/disable/delete/enable-user,
  change-password, reset-password,
  create/list/get/update/disable-workspace,
  create/list/revoke-api-key.
* Ed25519 JWT signing (alg=EdDSA). Key rotation writes a new kid and
  retires the previous one; validation is grace-period friendly.
* Passwords: PBKDF2-HMAC-SHA-256, 600k iterations, per-user salt.
* API keys: 128-bit random, SHA-256 hashed. Plaintext returned once.
* Bootstrap is explicit: --bootstrap-mode {token,bootstrap} is a
  required startup argument with no permissive default. Masked
  "auth failure" errors hide whether a refused bootstrap request was
  due to mode, state, or authorisation.

Gateway authentication (trustgraph-flow/trustgraph/gateway/auth.py)
-------------------------------------------------------------------
* IamAuth replaces the legacy Authenticator. Distinguishes JWTs
  (three-segment dotted) from API keys by shape; verifies JWTs
  locally using the cached IAM public key; resolves API keys via IAM
  with a short-TTL hash-keyed cache. Every failure path surfaces the
  same 401 body ("auth failure") so callers cannot enumerate
  credential state.
* Public key is fetched at gateway startup with a bounded retry loop;
  traffic does not begin flowing until auth has started.

Capability model (trustgraph-flow/trustgraph/gateway/capabilities.py)
---------------------------------------------------------------------
* Roles have two dimensions: a capability set and a workspace scope.
  OSS ships reader / writer / admin; the first two are
  workspace-assigned, admin is cross-workspace ("*"). No
  "cross-workspace" pseudo-capability - workspace permission is a
  property of the role.
* check(identity, capability, target_workspace=None) is the single
  authorisation test: some role must grant the capability *and* be
  active in the target workspace.
* enforce_workspace validates a request-body workspace against the
  caller's role scopes and injects the resolved value.
  Cross-workspace admin is permitted by role scope, not by a bypass.
* Gateway endpoints declare a required capability explicitly - no
  permissive default. Construction fails fast if omitted. Enterprise
  editions can replace the role table without changing the wire
  protocol.

WebSocket first-frame auth (dispatch/mux.py, endpoint/socket.py)
----------------------------------------------------------------
* /api/v1/socket handshake unconditionally accepts; authentication
  runs on the first WebSocket frame ({"type":"auth","token":"..."})
  with {"type":"auth-ok","workspace":"..."} / {"type":"auth-failed"}.
  The socket stays open on failure so the client can
  re-authenticate - browsers treat a handshake-time 401 as terminal,
  breaking reconnection.
* Mux.receive rejects every non-auth frame before auth succeeds,
  enforces the caller's workspace (envelope + inner payload) using
  the role-scope resolver, and supports mid-session re-auth.
* Flow import/export streaming endpoints keep the legacy ?token=
  handshake (URL-scoped short-lived transfers; no re-auth need).

Auth surface
------------
* POST /api/v1/auth/login - public, returns a JWT.
* POST /api/v1/auth/bootstrap - public; forwards to IAM's bootstrap
  op which itself enforces mode + tables-empty.
* POST /api/v1/auth/change-password - any authenticated user.
* POST /api/v1/iam - admin-only generic forwarder for the rest of
  the IAM API (per-op REST endpoints to follow in a later change).

Removed / breaking
------------------
* GATEWAY_SECRET / --api-token / default_api_token and the legacy
  Authenticator.permitted contract. The gateway cannot run without
  IAM.
* ?token= on /api/v1/socket.
* DispatcherManager and Mux both raise on auth=None - no silent
  downgrade path.

CLI tools (trustgraph-cli)
--------------------------
tg-bootstrap-iam, tg-login, tg-create-user, tg-list-users,
tg-disable-user, tg-enable-user, tg-delete-user, tg-change-password,
tg-reset-password, tg-create-api-key, tg-list-api-keys,
tg-revoke-api-key, tg-create-workspace, tg-list-workspaces.
Passwords read via getpass; tokens / one-time secrets written to
stdout with operator context on stderr so shell composition works
cleanly. AsyncSocketClient / SocketClient updated to the first-frame
auth protocol.

Specifications
--------------
* docs/tech-specs/iam.md updated with the error policy, workspace
  resolver extension point, and OSS role-scope model.
* docs/tech-specs/iam-protocol.md (new) - transport, dataclasses,
  operation table, error taxonomy, bootstrap modes.
* docs/tech-specs/capabilities.md (new) - capability vocabulary, OSS
  role bundles, agent-as-composition note, enforcement-boundary
  policy, enterprise extensibility.

Tests
-----
* test_auth.py (rewritten) - IamAuth + JWT round-trip with real
  Ed25519 keypairs + API-key cache behaviour.
* test_capabilities.py (new) - role table sanity, check across
  role x workspace combinations, enforce_workspace paths,
  unknown-cap / unknown-role fail-closed.
* Every endpoint test construction now names its capability
  explicitly (no permissive defaults relied upon). New tests pin the
  fail-closed invariants: DispatcherManager / Mux refuse auth=None;
  i18n path-traversal defense is exercised.
* test_socket_graceful_shutdown rewritten against IamAuth.
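
The two-dimensional role model described under "Capability model" can be illustrated with a small sketch. Only the reader / writer / admin role names, the "*" cross-workspace scope, and the check(identity, capability, target_workspace=None) signature come from the commit message. The role table layout (ROLES), the capability names (graph:read, graph:write, iam:admin), the simplified Identity, and the choice to default target_workspace to the caller's own workspace are assumptions made for illustration, not the contents of capabilities.py.

```python
from dataclasses import dataclass, field

# Hypothetical role table; the capability names are illustrative only.
# scope "assigned": the role is active in the identity's own workspace.
# scope "*": the role is active in every workspace.
ROLES = {
    "reader": {"capabilities": {"graph:read"}, "scope": "assigned"},
    "writer": {
        "capabilities": {"graph:read", "graph:write"},
        "scope": "assigned",
    },
    "admin": {
        "capabilities": {"graph:read", "graph:write", "iam:admin"},
        "scope": "*",
    },
}


@dataclass
class Identity:
    # Simplified stand-in for the gateway's Identity.
    user_id: str
    workspace: str
    roles: list = field(default_factory=list)


def check(identity, capability, target_workspace=None):
    """Return True if some role grants the capability AND is active
    in the target workspace. Unknown roles and unknown capabilities
    grant nothing, so the test fails closed."""
    if target_workspace is None:
        # Assumption: no explicit target means the caller's own workspace.
        target_workspace = identity.workspace
    for name in identity.roles:
        role = ROLES.get(name)
        if role is None:
            continue  # unknown role: contributes nothing
        if capability not in role["capabilities"]:
            continue
        if role["scope"] == "*" or target_workspace == identity.workspace:
            return True
    return False
```

Under these assumptions a reader in workspace "default" passes check(..., "graph:read") for "default" but not for any other workspace, while an admin passes for every workspace through role scope rather than through a separate bypass.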
2026-04-26 17:06:22 +02:00 · 2026-04-24 17:29:10 +01:00 · 67b2fc448f
commit 67b2fc448f
parent ae9936c9cc
61 changed files with 6474 additions and 792 deletions
--- a/tests/unit/test_gateway/test_auth.py
+++ b/tests/unit/test_gateway/test_auth.py
@@ -1,69 +1,312 @@
 """
-Tests for Gateway Authentication
+Tests for gateway/auth.py — IamAuth, JWT verification, API key
+resolution cache.
+
+JWTs are signed with real Ed25519 keypairs generated per-test, so
+the crypto path is exercised end-to-end without mocks.  API-key
+resolution is tested against a stubbed IamClient since the real
+one requires pub/sub.
 """

+import base64
+import json
+import time
+from unittest.mock import AsyncMock, Mock, patch
+
 import pytest
+from aiohttp import web
+from cryptography.hazmat.primitives import serialization
+from cryptography.hazmat.primitives.asymmetric import ed25519

-from trustgraph.gateway.auth import Authenticator
+from trustgraph.gateway.auth import (
+    IamAuth, Identity,
+    _b64url_decode, _verify_jwt_eddsa,
+    API_KEY_CACHE_TTL,
+)


-class TestAuthenticator:
-    """Test cases for Authenticator class"""
+# -- helpers ---------------------------------------------------------------

-    def test_authenticator_initialization_with_token(self):
-        """Test Authenticator initialization with valid token"""
-        auth = Authenticator(token="test-token-123")
-        
-        assert auth.token == "test-token-123"
-        assert auth.allow_all is False

-    def test_authenticator_initialization_with_allow_all(self):
-        """Test Authenticator initialization with allow_all=True"""
-        auth = Authenticator(allow_all=True)
-        
-        assert auth.token is None
-        assert auth.allow_all is True
+def _b64url(data: bytes) -> str:
+    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

-    def test_authenticator_initialization_without_token_raises_error(self):
-        """Test Authenticator initialization without token raises RuntimeError"""
-        with pytest.raises(RuntimeError, match="Need a token"):
-            Authenticator()

-    def test_authenticator_initialization_with_empty_token_raises_error(self):
-        """Test Authenticator initialization with empty token raises RuntimeError"""
-        with pytest.raises(RuntimeError, match="Need a token"):
-            Authenticator(token="")
+def make_keypair():
+    priv = ed25519.Ed25519PrivateKey.generate()
+    public_pem = priv.public_key().public_bytes(
+        encoding=serialization.Encoding.PEM,
+        format=serialization.PublicFormat.SubjectPublicKeyInfo,
+    ).decode("ascii")
+    return priv, public_pem

-    def test_permitted_with_allow_all_returns_true(self):
-        """Test permitted method returns True when allow_all is enabled"""
-        auth = Authenticator(allow_all=True)
-        
-        # Should return True regardless of token or roles
-        assert auth.permitted("any-token", []) is True
-        assert auth.permitted("different-token", ["admin"]) is True
-        assert auth.permitted(None, ["user"]) is True

-    def test_permitted_with_matching_token_returns_true(self):
-        """Test permitted method returns True with matching token"""
-        auth = Authenticator(token="secret-token")
-        
-        # Should return True when tokens match
-        assert auth.permitted("secret-token", []) is True
-        assert auth.permitted("secret-token", ["admin", "user"]) is True
+def sign_jwt(priv, claims, alg="EdDSA"):
+    header = {"alg": alg, "typ": "JWT", "kid": "kid-test"}
+    h = _b64url(json.dumps(header, separators=(",", ":"), sort_keys=True).encode())
+    p = _b64url(json.dumps(claims, separators=(",", ":"), sort_keys=True).encode())
+    signing_input = f"{h}.{p}".encode("ascii")
+    if alg == "EdDSA":
+        sig = priv.sign(signing_input)
+    else:
+        raise ValueError(f"test helper doesn't sign {alg}")
+    return f"{h}.{p}.{_b64url(sig)}"

-    def test_permitted_with_non_matching_token_returns_false(self):
-        """Test permitted method returns False with non-matching token"""
-        auth = Authenticator(token="secret-token")
-        
-        # Should return False when tokens don't match
-        assert auth.permitted("wrong-token", []) is False
-        assert auth.permitted("different-token", ["admin"]) is False
-        assert auth.permitted(None, ["user"]) is False

-    def test_permitted_with_token_and_allow_all_returns_true(self):
-        """Test permitted method with both token and allow_all set"""
-        auth = Authenticator(token="test-token", allow_all=True)
-        
-        # allow_all should take precedence
-        assert auth.permitted("any-token", []) is True
-        assert auth.permitted("wrong-token", ["admin"]) is True
+def make_request(auth_header):
+    """Minimal stand-in for an aiohttp request — IamAuth only reads
+    ``request.headers["Authorization"]``."""
+    req = Mock()
+    req.headers = {}
+    if auth_header is not None:
+        req.headers["Authorization"] = auth_header
+    return req
+
+
+# -- pure helpers ----------------------------------------------------------
+
+
+class TestB64UrlDecode:
+
+    def test_round_trip_without_padding(self):
+        data = b"hello"
+        encoded = _b64url(data)
+        assert _b64url_decode(encoded) == data
+
+    def test_handles_various_lengths(self):
+        for s in (b"a", b"ab", b"abc", b"abcd", b"abcde"):
+            assert _b64url_decode(_b64url(s)) == s
+
+
+# -- JWT verification -----------------------------------------------------
+
+
+class TestVerifyJwtEddsa:
+
+    def test_valid_jwt_passes(self):
+        priv, pub = make_keypair()
+        claims = {
+            "sub": "user-1", "workspace": "default",
+            "roles": ["reader"],
+            "iat": int(time.time()),
+            "exp": int(time.time()) + 60,
+        }
+        token = sign_jwt(priv, claims)
+        got = _verify_jwt_eddsa(token, pub)
+        assert got["sub"] == "user-1"
+        assert got["workspace"] == "default"
+
+    def test_expired_jwt_rejected(self):
+        priv, pub = make_keypair()
+        claims = {
+            "sub": "user-1", "workspace": "default", "roles": [],
+            "iat": int(time.time()) - 3600,
+            "exp": int(time.time()) - 1,
+        }
+        token = sign_jwt(priv, claims)
+        with pytest.raises(ValueError, match="expired"):
+            _verify_jwt_eddsa(token, pub)
+
+    def test_bad_signature_rejected(self):
+        priv_a, _ = make_keypair()
+        _, pub_b = make_keypair()
+        claims = {
+            "sub": "user-1", "workspace": "default", "roles": [],
+            "iat": int(time.time()),
+            "exp": int(time.time()) + 60,
+        }
+        token = sign_jwt(priv_a, claims)
+        # pub_b never signed this token.
+        with pytest.raises(Exception):
+            _verify_jwt_eddsa(token, pub_b)
+
+    def test_malformed_jwt_rejected(self):
+        _, pub = make_keypair()
+        with pytest.raises(ValueError, match="malformed"):
+            _verify_jwt_eddsa("not-a-jwt", pub)
+
+    def test_unsupported_algorithm_rejected(self):
+        priv, pub = make_keypair()
+        # Manually build an "alg":"HS256" header — no signer needed
+        # since we expect it to bail before verifying.
+        header = {"alg": "HS256", "typ": "JWT", "kid": "x"}
+        payload = {
+            "sub": "user-1", "workspace": "default", "roles": [],
+            "iat": int(time.time()), "exp": int(time.time()) + 60,
+        }
+        h = _b64url(json.dumps(header, separators=(",", ":")).encode())
+        p = _b64url(json.dumps(payload, separators=(",", ":")).encode())
+        sig = _b64url(b"not-a-real-sig")
+        token = f"{h}.{p}.{sig}"
+        with pytest.raises(ValueError, match="unsupported alg"):
+            _verify_jwt_eddsa(token, pub)
+
+
+# -- Identity --------------------------------------------------------------
+
+
+class TestIdentity:
+
+    def test_fields(self):
+        i = Identity(
+            user_id="u", workspace="w", roles=["reader"], source="api-key",
+        )
+        assert i.user_id == "u"
+        assert i.workspace == "w"
+        assert i.roles == ["reader"]
+        assert i.source == "api-key"
+
+
+# -- IamAuth.authenticate --------------------------------------------------
+
+
+class TestIamAuthDispatch:
+    """``authenticate()`` chooses between the JWT and API-key paths
+    by shape of the bearer."""
+
+    @pytest.mark.asyncio
+    async def test_no_authorization_header_raises_401(self):
+        auth = IamAuth(backend=Mock())
+        with pytest.raises(web.HTTPUnauthorized):
+            await auth.authenticate(make_request(None))
+
+    @pytest.mark.asyncio
+    async def test_non_bearer_header_raises_401(self):
+        auth = IamAuth(backend=Mock())
+        with pytest.raises(web.HTTPUnauthorized):
+            await auth.authenticate(make_request("Basic whatever"))
+
+    @pytest.mark.asyncio
+    async def test_empty_bearer_raises_401(self):
+        auth = IamAuth(backend=Mock())
+        with pytest.raises(web.HTTPUnauthorized):
+            await auth.authenticate(make_request("Bearer "))
+
+    @pytest.mark.asyncio
+    async def test_unknown_format_raises_401(self):
+        # Not tg_... and not dotted-JWT shape.
+        auth = IamAuth(backend=Mock())
+        with pytest.raises(web.HTTPUnauthorized):
+            await auth.authenticate(make_request("Bearer garbage"))
+
+    @pytest.mark.asyncio
+    async def test_valid_jwt_resolves_to_identity(self):
+        priv, pub = make_keypair()
+        claims = {
+            "sub": "user-1", "workspace": "default",
+            "roles": ["writer"],
+            "iat": int(time.time()),
+            "exp": int(time.time()) + 60,
+        }
+        token = sign_jwt(priv, claims)
+
+        auth = IamAuth(backend=Mock())
+        auth._signing_public_pem = pub
+
+        ident = await auth.authenticate(
+            make_request(f"Bearer {token}")
+        )
+        assert ident.user_id == "user-1"
+        assert ident.workspace == "default"
+        assert ident.roles == ["writer"]
+        assert ident.source == "jwt"
+
+    @pytest.mark.asyncio
+    async def test_jwt_without_public_key_fails(self):
+        # If the gateway hasn't fetched IAM's public key yet, JWTs
+        # must not validate — even ones that would otherwise pass.
+        priv, _ = make_keypair()
+        claims = {
+            "sub": "user-1", "workspace": "default", "roles": [],
+            "iat": int(time.time()), "exp": int(time.time()) + 60,
+        }
+        token = sign_jwt(priv, claims)
+        auth = IamAuth(backend=Mock())
+        # _signing_public_pem defaults to None
+        with pytest.raises(web.HTTPUnauthorized):
+            await auth.authenticate(make_request(f"Bearer {token}"))
+
+    @pytest.mark.asyncio
+    async def test_api_key_path(self):
+        auth = IamAuth(backend=Mock())
+
+        async def fake_resolve(api_key):
+            assert api_key == "tg_testkey"
+            return ("user-xyz", "default", ["admin"])
+
+        async def fake_with_client(op):
+            return await op(Mock(resolve_api_key=fake_resolve))
+
+        with patch.object(auth, "_with_client", side_effect=fake_with_client):
+            ident = await auth.authenticate(
+                make_request("Bearer tg_testkey")
+            )
+        assert ident.user_id == "user-xyz"
+        assert ident.workspace == "default"
+        assert ident.roles == ["admin"]
+        assert ident.source == "api-key"
+
+    @pytest.mark.asyncio
+    async def test_api_key_rejection_masked_as_401(self):
+        auth = IamAuth(backend=Mock())
+
+        async def fake_with_client(op):
+            raise RuntimeError("auth-failed: unknown api key")
+
+        with patch.object(auth, "_with_client", side_effect=fake_with_client):
+            with pytest.raises(web.HTTPUnauthorized):
+                await auth.authenticate(
+                    make_request("Bearer tg_bogus")
+                )
+
+
+# -- API key cache ---------------------------------------------------------
+
+
+class TestApiKeyCache:
+
+    @pytest.mark.asyncio
+    async def test_cache_hit_skips_iam(self):
+        auth = IamAuth(backend=Mock())
+        calls = {"n": 0}
+
+        async def fake_with_client(op):
+            calls["n"] += 1
+            return await op(Mock(
+                resolve_api_key=AsyncMock(
+                    return_value=("u", "default", ["reader"]),
+                )
+            ))
+
+        with patch.object(auth, "_with_client", side_effect=fake_with_client):
+            await auth.authenticate(make_request("Bearer tg_k1"))
+            await auth.authenticate(make_request("Bearer tg_k1"))
+            await auth.authenticate(make_request("Bearer tg_k1"))
+
+        # Only the first lookup reaches IAM; the rest are cache hits.
+        assert calls["n"] == 1
+
+    @pytest.mark.asyncio
+    async def test_different_keys_are_separately_cached(self):
+        auth = IamAuth(backend=Mock())
+        seen = []
+
+        async def fake_with_client(op):
+            async def resolve(plaintext):
+                seen.append(plaintext)
+                return ("u-" + plaintext, "default", ["reader"])
+            return await op(Mock(resolve_api_key=resolve))
+
+        with patch.object(auth, "_with_client", side_effect=fake_with_client):
+            a = await auth.authenticate(make_request("Bearer tg_a"))
+            b = await auth.authenticate(make_request("Bearer tg_b"))
+
+        assert a.user_id == "u-tg_a"
+        assert b.user_id == "u-tg_b"
+        assert seen == ["tg_a", "tg_b"]
+
+    @pytest.mark.asyncio
+    async def test_cache_has_ttl_constant_set(self):
+        # Not a behaviour test — just ensures we don't accidentally
+        # set TTL to 0 (which would defeat the cache) or to a week.
+        assert 10 <= API_KEY_CACHE_TTL <= 3600
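
The cache behaviour pinned by TestApiKeyCache (one IAM lookup per key within the TTL, separate entries per key) could be implemented along the following lines. This is a sketch under stated assumptions, not the code in gateway/auth.py: the class name ApiKeyCache, the resolve callback, and the injectable clock are hypothetical, and the 60-second TTL is an assumed value chosen only because it falls inside the 10..3600 range the last test accepts.

```python
import hashlib
import time

API_KEY_CACHE_TTL = 60  # assumed value, in seconds


class ApiKeyCache:
    """Short-TTL cache keyed by the SHA-256 digest of the API key, so
    the plaintext key is never used as a dictionary key."""

    def __init__(self, resolve, ttl=API_KEY_CACHE_TTL, clock=time.monotonic):
        self._resolve = resolve  # async callable: plaintext -> identity tuple
        self._ttl = ttl
        self._clock = clock      # injectable so tests can control time
        self._entries = {}       # digest -> (expires_at, value)

    async def get(self, plaintext):
        digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
        now = self._clock()
        entry = self._entries.get(digest)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no IAM round trip
        # Miss or expired entry: ask IAM. If resolve raises, nothing is
        # stored, so a rejected key is looked up again next time.
        value = await self._resolve(plaintext)
        self._entries[digest] = (now + self._ttl, value)
        return value
```

Passing the clock in as a parameter is a design choice made for this sketch: it lets a test advance time past the TTL without sleeping.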