The gateway no longer holds any policy state — capability sets, role
definitions, workspace scope rules. Per the IAM contract it asks the
regime "may this identity perform this capability on this resource?"
per request. That moves the OSS role-based regime entirely into
iam-svc, which can be replaced (SSO, ABAC, ReBAC) without changing
the gateway, the wire protocol, or backend services.
Contract:
- authenticate(credential) -> Identity (handle, workspace,
principal_id, source). No roles, claims, or policy state surface
to the gateway.
- authorise(identity, capability, resource, parameters) -> (allow,
ttl). Decisions are cached individually; the regime-supplied TTL is
clamped to an upper bound, and regime errors fail closed.
- authorise_many available as a fan-out variant.
Operation registry drives every authorisation decision:
- /api/v1/iam -> IamEndpoint, looks up bare op name (create-user,
list-workspaces, ...).
- /api/v1/{kind} -> RegistryRoutedVariableEndpoint, <kind>:<op>
(config:get, flow:list-blueprints, librarian:add-document, ...).
- /api/v1/flow/{flow}/service/{kind} -> flow-service:<kind>.
- /api/v1/flow/{flow}/{import,export}/{kind} ->
flow-{import,export}:<kind>.
- WS Mux per-frame -> flow-service:<kind>; closes a gap where
authenticated users could hit any service kind.
85 operations registered across the surface.
JWT carries identity only — sub + workspace. The roles claim is gone;
the gateway never reads policy state from a credential.
The three coarse *_KIND_CAPABILITY maps are removed. The registry is
the only source of truth for the capability + resource shape of an
operation. Tests migrated to the new Identity shape and to
authorise()-mocked auth doubles.
Specs updated: docs/tech-specs/iam-contract.md (Identity surface,
caching, registry-naming conventions), iam.md (JWT shape, gateway
flow, role section reframed as OSS-regime detail), iam-protocol.md
(positioned as one implementation of the contract).
| layout | title | parent |
|---|---|---|
| default | IAM Service Protocol Technical Specification | Tech Specs |
IAM Service Protocol Technical Specification
Overview
This document specifies the wire protocol of the open-source IAM
regime — one implementation of the abstract IAM contract defined
in iam-contract.md. Other regimes (OIDC / SSO,
ABAC, ReBAC, external policy engines) implement the same contract
with different transports, data models, and policy semantics; the
gateway is unaware of which regime it's wired against.
The OSS regime is a backend processor (iam-svc) reached over the
standard request/response pub/sub pattern. It owns users,
workspaces, API keys, login credentials, and JWT signing keys, all
backed by Cassandra. The API gateway is its only caller.
This document defines:
- the `IamRequest` and `IamResponse` dataclasses on the bus,
- the operation set the OSS regime implements,
- per-operation input and output fields,
- the error taxonomy,
- the bootstrap modes,
- the initial HTTP forwarding endpoint used while the protocol is being exercised.
The mapping from this regime onto the abstract contract is direct:
| Contract operation | OSS regime operation |
|---|---|
| `authenticate(credential)` | `resolve-api-key` (for API keys); local JWT validation against `get-signing-key-public` (for JWTs) |
| `authorise(identity, capability, resource, parameters)` | Role-table lookup against the OSS role bundles defined in capabilities.md, gated by workspace scope. Workspace can come from the resource address (workspace- and flow-level resources) or from a parameter (system-level resources whose parameters reference a workspace, e.g. `create-user` with workspace association W). |
| `authorise_many` | Loop over `authorise` |
| Identity / credential / workspace management | create-user, create-api-key, etc. as listed below. These are operations on system-level resources (the user / workspace / credential registries); workspace, where it appears in the body, is a parameter. |
Architectural context — roles, capabilities, workspace as resource
scope, enforcement boundary — lives in iam.md and
capabilities.md. The contract abstraction
lives in iam-contract.md.
Transport
- Request topic: `request:tg/request/iam-request`
- Response topic: `response:tg/response/iam-response`
- Pattern: request/response, correlated by the `id` message property, the same pattern used by `config-svc` and `flow-svc`.
- Caller: the API gateway only. Under the enforcement-boundary policy (see capabilities spec), the IAM service trusts the bus and performs no per-request authentication or capability check against the caller. The gateway has already evaluated capability membership and workspace scoping before sending the request.
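The correlation pattern described above can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the gateway's actual client: `Correlator`, `send`, and `on_response` are hypothetical names, and the `publish` callable stands in for whatever pub/sub client is in use.

```python
import uuid
from typing import Any, Callable


class Correlator:
    """Matches responses to requests using an `id` message property."""

    def __init__(self, publish: Callable[[dict[str, str], Any], None]):
        # `publish(properties, payload)` is a stand-in for the real bus client.
        self._publish = publish
        self._pending: dict[str, Callable[[Any], None]] = {}

    def send(self, payload: Any, on_reply: Callable[[Any], None]) -> str:
        # A fresh id per request lets many requests be in flight at once.
        request_id = str(uuid.uuid4())
        self._pending[request_id] = on_reply
        self._publish({"id": request_id}, payload)
        return request_id

    def on_response(self, properties: dict[str, str], payload: Any) -> bool:
        # Responses with an unknown or already-consumed id are dropped.
        callback = self._pending.pop(properties.get("id", ""), None)
        if callback is None:
            return False
        callback(payload)
        return True
```

A real client would also need a timeout that evicts entries from the pending map; that is omitted here to keep the sketch short.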
Dataclasses
IamRequest
```python
@dataclass
class IamRequest:
    # One of the operation strings below.
    operation: str = ""

    # Scope of this request. Required on every workspace-scoped
    # operation. Omitted (or empty) for system-level ops
    # (workspace CRUD, signing-key ops, bootstrap, resolve-api-key,
    # login).
    workspace: str = ""

    # Acting user id, for audit. Set by the gateway to the
    # authenticated caller's id on user-initiated operations.
    # Empty for internal-origin (bootstrap, reconcilers) and for
    # resolve-api-key / login (no actor yet).
    actor: str = ""

    # --- identity selectors ---
    user_id: str = ""
    username: str = ""      # login; unique within a workspace
    key_id: str = ""        # revoke-api-key, list-api-keys (own)
    api_key: str = ""       # resolve-api-key (plaintext)

    # --- credentials ---
    password: str = ""      # login, change-password (current)
    new_password: str = ""  # change-password

    # --- user fields ---
    user: UserInput | None = None  # create-user, update-user

    # --- workspace fields ---
    workspace_record: WorkspaceInput | None = None  # create-workspace, update-workspace

    # --- api key fields ---
    key: ApiKeyInput | None = None  # create-api-key
```
IamResponse
```python
@dataclass
class IamResponse:
    # Populated on success of operations that return them.
    user: UserRecord | None = None  # create-user, get-user, update-user
    users: list[UserRecord] = field(default_factory=list)  # list-users
    workspace: WorkspaceRecord | None = None  # create-workspace, get-workspace, update-workspace
    workspaces: list[WorkspaceRecord] = field(default_factory=list)  # list-workspaces

    # create-api-key returns the plaintext once. Never populated
    # on any other operation.
    api_key_plaintext: str = ""
    api_key: ApiKeyRecord | None = None  # create-api-key
    api_keys: list[ApiKeyRecord] = field(default_factory=list)  # list-api-keys

    # login, rotate-signing-key
    jwt: str = ""
    jwt_expires: str = ""  # ISO-8601 UTC

    # get-signing-key-public
    signing_key_public: str = ""  # PEM

    # resolve-api-key returns who this key authenticates as.
    resolved_user_id: str = ""
    resolved_workspace: str = ""
    resolved_roles: list[str] = field(default_factory=list)

    # reset-password
    temporary_password: str = ""  # returned once to the operator

    # bootstrap: on first run, the initial admin's one-time API key
    # is returned for the operator to capture.
    bootstrap_admin_user_id: str = ""
    bootstrap_admin_api_key: str = ""

    # Present on any failed operation.
    error: Error | None = None
```
Value types
```python
@dataclass
class UserInput:
    username: str = ""
    name: str = ""
    email: str = ""
    password: str = ""  # only on create-user; never on update-user
    roles: list[str] = field(default_factory=list)
    enabled: bool = True
    must_change_password: bool = False


@dataclass
class UserRecord:
    id: str = ""
    workspace: str = ""
    username: str = ""
    name: str = ""
    email: str = ""
    roles: list[str] = field(default_factory=list)
    enabled: bool = True
    must_change_password: bool = False
    created: str = ""  # ISO-8601 UTC
    # Password hash is never included in any response.


@dataclass
class WorkspaceInput:
    id: str = ""
    name: str = ""
    enabled: bool = True


@dataclass
class WorkspaceRecord:
    id: str = ""
    name: str = ""
    enabled: bool = True
    created: str = ""  # ISO-8601 UTC


@dataclass
class ApiKeyInput:
    user_id: str = ""
    name: str = ""     # operator-facing label, e.g. "laptop"
    expires: str = ""  # optional ISO-8601 UTC; empty = no expiry


@dataclass
class ApiKeyRecord:
    id: str = ""
    user_id: str = ""
    name: str = ""
    prefix: str = ""     # first 4 chars of plaintext, for identification in lists
    expires: str = ""    # empty = no expiry
    created: str = ""
    last_used: str = ""  # empty if never used
    # key_hash is never included in any response.
```
Operations
| Operation | Request fields | Response fields | Notes |
|---|---|---|---|
| `login` | `username`, `password`, `workspace` (optional) | `jwt`, `jwt_expires` | If `workspace` omitted, IAM resolves to the user's assigned workspace. |
| `resolve-api-key` | `api_key` (plaintext) | `resolved_user_id`, `resolved_workspace`, `resolved_roles` | Gateway-internal. Service returns `auth-failed` for unknown / expired / revoked keys. |
| `change-password` | `user_id`, `password` (current), `new_password` | — | Self-service. IAM validates `password` against stored hash. |
| `reset-password` | `user_id` | `temporary_password` | Admin-initiated. IAM generates a random password, sets `must_change_password=true` on the user, returns the plaintext once. |
| `create-user` | `workspace`, `user` | `user` | Admin-only. `user.password` is hashed and stored; `user.roles` must be subset of known roles. |
| `list-users` | `workspace` | `users` | |
| `get-user` | `workspace`, `user_id` | `user` | |
| `update-user` | `workspace`, `user_id`, `user` | `user` | `password` field on `user` is rejected; use `change-password` / `reset-password`. |
| `disable-user` | `workspace`, `user_id` | — | Soft-delete; sets `enabled=false`. Revokes all the user's API keys. |
| `create-workspace` | `workspace_record` | `workspace` | System-level. |
| `list-workspaces` | — | `workspaces` | System-level. |
| `get-workspace` | `workspace_record` (id only) | `workspace` | System-level. |
| `update-workspace` | `workspace_record` | `workspace` | System-level. |
| `disable-workspace` | `workspace_record` (id only) | — | System-level. Sets `enabled=false`; revokes all workspace API keys; disables all users in the workspace. |
| `create-api-key` | `workspace`, `key` | `api_key_plaintext`, `api_key` | Plaintext returned once; only hash stored. `key.name` required. |
| `list-api-keys` | `workspace`, `user_id` | `api_keys` | |
| `revoke-api-key` | `workspace`, `key_id` | — | Deletes the key record. |
| `get-signing-key-public` | — | `signing_key_public` | Gateway fetches this at startup. |
| `rotate-signing-key` | — | — | System-level. Introduces a new signing key; old key continues to validate JWTs for a grace period (implementation-defined, minimum 1h). |
| `bootstrap` | — | `bootstrap_admin_user_id`, `bootstrap_admin_api_key` | If IAM tables are empty, creates the initial default workspace, an admin user, an initial API key, and an initial signing key; returns them once. No-op on subsequent calls (returns empty fields). |
Error taxonomy
All errors are carried in the IamResponse.error field. error.type
is one of the values below; error.message is a human-readable
string that is not surfaced verbatim to external callers (the
gateway maps to auth failure / access denied per the IAM error
policy).
| `type` | When |
|---|---|
| `invalid-argument` | Malformed request (missing required field, unknown operation, invalid format). |
| `not-found` | Named resource does not exist (`user_id`, `key_id`, workspace). |
| `duplicate` | Create operation collides with an existing resource (username, workspace id, key name). |
| `auth-failed` | `login` with wrong credentials; `resolve-api-key` with unknown / expired / revoked key; `change-password` with wrong current password. Single bucket to deny oracle attacks. |
| `weak-password` | Password does not meet policy (length, complexity; policy defined at service level). |
| `disabled` | Target user or workspace has `enabled=false`. |
| `operation-not-permitted` | Non-admin attempting system-level operation, or workspace-scoped operation attempting to affect another workspace. |
| `internal-error` | Unexpected IAM-side failure. Log and surface as 500 at the gateway. |
The gateway is responsible for translating auth-failed and
operation-not-permitted into the obfuscated external error
response ("auth failure" / "access denied"); invalid-argument
becomes a descriptive 400; not-found / duplicate /
weak-password / disabled become descriptive 4xx but never leak
IAM-internal detail.
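The translation rules above could be sketched at the gateway roughly as follows. The function name `translate_error`, the external message texts for the descriptive cases, and every specific status code other than 400 and 500 are assumptions made for illustration; the spec only fixes the categories.

```python
# Obfuscated categories: one fixed external message each, no detail.
# The 401 / 403 status codes are assumptions.
_OBFUSCATED = {
    "auth-failed": (401, "auth failure"),
    "operation-not-permitted": (403, "access denied"),
}

# Descriptive categories. The messages are gateway-authored so that
# error.message from IAM is never forwarded verbatim. Apart from 400,
# the specific 4xx codes are assumptions.
_DESCRIPTIVE = {
    "invalid-argument": (400, "invalid request"),
    "not-found": (404, "resource not found"),
    "duplicate": (409, "resource already exists"),
    "weak-password": (422, "password does not meet policy"),
    "disabled": (403, "user or workspace is disabled"),
}


def translate_error(error_type: str) -> tuple[int, str]:
    """Map an IamResponse error type to (HTTP status, external message)."""
    if error_type in _OBFUSCATED:
        return _OBFUSCATED[error_type]
    if error_type in _DESCRIPTIVE:
        return _DESCRIPTIVE[error_type]
    # internal-error and anything unrecognised surface as a generic 500.
    return 500, "internal error"
```

Treating unrecognised types as a 500 keeps the mapping closed: a new IAM-side error type cannot leak detail until the gateway is taught how to present it.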
Credential storage
- Passwords are stored using a slow KDF (bcrypt / argon2id; the service picks, documented as an implementation detail). The `password_hash` column stores the full KDF-encoded string (algorithm, cost, salt, hash). Not a plain SHA-256.
- API keys are stored as SHA-256 of the plaintext. API keys are 128-bit random values (`tg_` + base64url); the entropy makes a slow hash unnecessary. The hash serves as the primary key on the `iam_api_keys` table, enabling O(1) lookup on `resolve-api-key`.
- JWT signing key is stored as an RSA or Ed25519 private key (implementation choice) in a dedicated `iam_signing_keys` table with a `kid`, `created`, and optional `retired` timestamp. At most one active key; up to N retired keys are kept for a grace period to validate previously-issued JWTs.
Passwords, API-key plaintext, and signing-key private material are
never returned in any response other than the explicit one-time
responses above (reset-password, create-api-key, bootstrap).
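The API-key scheme described above (128 random bits, a `tg_` prefix, SHA-256 for storage) can be illustrated with the standard library. This is a sketch under assumptions: the function names are hypothetical, and stripping base64 padding and hex-encoding the digest are choices made here, not stated by the spec.

```python
import base64
import hashlib
import secrets


def hash_api_key(plaintext: str) -> str:
    # A single fast SHA-256 is adequate because the input is
    # high-entropy random data, unlike a human-chosen password.
    return hashlib.sha256(plaintext.encode("ascii")).hexdigest()


def generate_api_key() -> tuple[str, str, str]:
    """Return (plaintext, sha256_hex, prefix) for a new API key."""
    # 16 random bytes = 128 bits of entropy.
    raw = secrets.token_bytes(16)
    # base64url without "=" padding (padding removal is an assumption).
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    plaintext = "tg_" + encoded
    # The first 4 characters are kept for identification in lists.
    return plaintext, hash_api_key(plaintext), plaintext[:4]
```

On `resolve-api-key`, the service would hash the presented plaintext the same way and look the digest up directly, which is what makes the lookup a single primary-key read.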
Bootstrap modes
iam-svc requires a bootstrap mode to be chosen at startup. There is
no default — an unset or invalid mode causes the service to refuse
to start. The purpose is to force the operator to make an explicit
security decision rather than rely on an implicit "safe" fallback.
| Mode | Startup behaviour | `bootstrap` operation | Suitability |
|---|---|---|---|
| `token` | On first start with empty tables, auto-seeds the default workspace, admin user, admin API key (using the operator-provided `--bootstrap-token`), and an initial signing key. No-op on subsequent starts. | Refused: returns `auth-failed` / "auth failure" regardless of caller. | Production, any public-exposure deployment. |
| `bootstrap` | No startup seeding. Tables remain empty until the `bootstrap` operation is invoked over the pub/sub bus (typically via `tg-bootstrap-iam`). | Live while tables are empty. Generates and returns the admin API key once. Refused (`auth-failed`) once tables are populated. | Dev / compose up / CI. Not safe under public exposure: any caller reaching the gateway's `/api/v1/iam` forwarder before the operator can cause a token to be issued to them. Operators choosing this mode accept that risk. |
Error masking
In both modes, any refused invocation of the bootstrap operation
returns the same error (auth-failed / "auth failure"). A caller
cannot distinguish:
- "service is in token mode"
- "service is in bootstrap mode but already bootstrapped"
- "operation forbidden"
This matches the general IAM error-policy stance (see iam.md) and
prevents externally enumerating IAM's state.
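The masking rule above amounts to a single refusal value shared by every refused path. A minimal sketch, assuming a hypothetical `bootstrap_refusal` helper and a local stand-in for the `Error` type (which this spec references but does not define beyond `type` and `message`):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Error:
    # Stand-in for the protocol's Error type (assumed shape).
    type: str = ""
    message: str = ""


# One shared value, so refused paths cannot differ by accident.
AUTH_FAILED = Error(type="auth-failed", message="auth failure")


def bootstrap_refusal(mode: str, tables_empty: bool) -> Optional[Error]:
    """Return None if bootstrap may proceed, else the masked error."""
    if mode == "bootstrap" and tables_empty:
        return None
    # Token mode, already bootstrapped, or anything else: same answer.
    return AUTH_FAILED
```

Because every refused branch returns the same object, a caller comparing responses learns nothing about which condition triggered the refusal.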
Configuration sources
The mode and token can each be supplied in one of two ways. Resolution order is fixed; there is no permissive fallback.
| Source | Field |
|---|---|
| Processor-group YAML / CLI argument | bootstrap_mode, bootstrap_token |
| Environment variable | IAM_BOOTSTRAP_MODE, IAM_BOOTSTRAP_TOKEN |
For each setting the service uses the explicit param value if
present; otherwise the environment variable; otherwise the service
refuses to start. The env-var path is intended for the K8s
deployment pattern where the token is injected from a Secret via
secretKeyRef, so the plaintext never has to live in YAML or git.
A typical production manifest holds bootstrap_mode: "token" in
the YAML and pulls IAM_BOOTSTRAP_TOKEN from the Secret; the YAML
is then safe to version-control.
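The resolution order above (explicit parameter, then environment variable, then refuse to start) can be sketched as below. `resolve_setting`, `resolve_bootstrap_mode`, and `BootstrapConfigError` are hypothetical names, and treating an empty string as "unset" is an assumption. Only the mode is validated here; how the token requirement interacts with each mode is left to the service.

```python
import os
from typing import Mapping, Optional


class BootstrapConfigError(RuntimeError):
    """Raised when a required bootstrap setting cannot be resolved."""


def resolve_setting(param: Optional[str], env_name: str,
                    env: Mapping[str, str] = os.environ) -> str:
    # 1. Explicit parameter (processor-group YAML or CLI argument).
    if param:
        return param
    # 2. Environment variable, e.g. injected from a Kubernetes Secret.
    value = env.get(env_name, "")
    if value:
        return value
    # 3. No permissive fallback: the caller should refuse to start.
    raise BootstrapConfigError(
        f"{env_name} is not set and no explicit value was given")


def resolve_bootstrap_mode(param: Optional[str],
                           env: Mapping[str, str] = os.environ) -> str:
    mode = resolve_setting(param, "IAM_BOOTSTRAP_MODE", env)
    if mode not in ("token", "bootstrap"):
        raise BootstrapConfigError(f"invalid bootstrap mode: {mode!r}")
    return mode
```

Passing `env` explicitly keeps the resolver testable without mutating the process environment.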
Bootstrap-token lifecycle
The bootstrap token — whether operator-supplied (token mode) or
service-generated (bootstrap mode) — is a one-time credential. It
is stored as admin's single API key, tagged name="bootstrap". The
operator's first admin action after bootstrap should be:
- Create a durable admin user and API key (or issue a durable API key to the bootstrap admin).
- Revoke the bootstrap key via `revoke-api-key`.
- Remove the bootstrap token from any deployment configuration (Secret, env var, or YAML field, wherever it was sourced).
The name="bootstrap" marker makes bootstrap keys easy to detect in
tooling (e.g. a tg-list-api-keys filter).
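As an illustration of such a filter, the sketch below selects keys that still carry the marker from a `list-api-keys` result. `find_bootstrap_keys` is a hypothetical helper, and the dataclass is a trimmed copy of `ApiKeyRecord` holding only the fields used here.

```python
from dataclasses import dataclass


@dataclass
class ApiKeyRecord:
    # Trimmed copy of the protocol dataclass; other fields omitted.
    id: str = ""
    user_id: str = ""
    name: str = ""


def find_bootstrap_keys(keys: list[ApiKeyRecord]) -> list[ApiKeyRecord]:
    """Return keys still carrying the name="bootstrap" marker."""
    return [k for k in keys if k.name == "bootstrap"]
```

A non-empty result after initial setup suggests the bootstrap key has not yet been revoked.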
HTTP forwarding (initial integration)
For the initial gateway integration — before the IAM service is wired into the authentication middleware — the gateway exposes a single forwarding endpoint:
POST /api/v1/iam
- Request body is a JSON encoding of `IamRequest`.
- Response body is a JSON encoding of `IamResponse`.
- The gateway's existing authentication (`GATEWAY_SECRET` bearer) gates access to this endpoint so the IAM protocol can be exercised end-to-end in tests without touching the live auth path.
- This endpoint is not the final shape. Once the middleware is in place, per-operation REST endpoints replace it (for example `POST /api/v1/auth/login`, `POST /api/v1/users`, `DELETE /api/v1/api-keys/{id}`), and this generic forwarder is removed.
The endpoint performs only message marshalling: it does not read or rewrite fields in the request, and it applies no capability check. All authorisation for user / workspace / key management lands in the subsequent middleware work.
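To make the request body concrete, here is one way a test client might build it. The spec does not define the JSON encoding in detail, so this sketch assumes that JSON keys mirror the dataclass field names and that empty-string fields are left out; `encode_request` is a hypothetical helper and the dataclass is a trimmed copy of `IamRequest`.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class IamRequest:
    # Trimmed copy of the protocol dataclass; only fields used here.
    operation: str = ""
    workspace: str = ""
    user_id: str = ""


def encode_request(request: IamRequest) -> bytes:
    # Assumption: keys mirror field names; empty strings are omitted.
    body = {k: v for k, v in asdict(request).items() if v != ""}
    return json.dumps(body).encode("utf-8")


# Body for POST /api/v1/iam; the workspace and user id are placeholders.
payload = encode_request(
    IamRequest(operation="get-user", workspace="default", user_id="u-123"))
```

The resulting bytes would be sent as the POST body, with the `GATEWAY_SECRET` bearer token supplied in the request headers.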
Non-goals for this spec
- REST endpoint shape for the final gateway surface — covered in Phase 2 of the IAM implementation plan, not here.
- OIDC / SAML external IdP protocol — out of scope for open source.
- Key-signing algorithm choice, password KDF choice, JWT claim layout — implementation details captured in code + ADRs, not locked in the protocol spec.
References
- IAM Contract Specification — the abstract gateway↔IAM regime contract this protocol implements.
- Identity and Access Management Specification
- Capability Vocabulary Specification