mirror of
https://github.com/trustgraph-ai/trustgraph.git
synced 2026-04-30 19:06:21 +02:00
Three threads, all reinforcing the contract's system-level vs.
workspace-association distinction.
WS Mux service routing
- tg-show-flows (and any workspace-level service over the WS) was
failing with "unknown service" because the post-refactor Mux
unconditionally looked up flow-service:<kind>. Now branches on
the envelope's flow field: with flow → flow-service:<kind>;
without flow → <kind>:<op> from the inner body; with bare op
lookup for service=iam. Resource and parameters come from the
matched op's own extractors — same path the HTTP endpoints take.
Optional workspace on system-level user/key ops
- list-users returns the deployment-wide list when no workspace is
supplied, filters when one is. get-user, update-user,
disable-user, enable-user, delete-user, reset-password,
create-api-key, list-api-keys, revoke-api-key all treat workspace
as an optional integrity check rather than a required argument.
- create-user keeps workspace required — there it's the new user's
home-workspace binding, a parameter rather than an address.
- API keys reclassified as SYSTEM-level resources. By the same
reasoning that makes users system-level, an API key is a
credential record on a deployment-wide registry; the workspace it
authenticates to is a property, not a containment.
Self-service surface
- whoami: returns the caller's own user record. AUTHENTICATED-only;
no users:read capability required. Foundation for UI affordances
that depend on the caller's permissions.
- bootstrap-status: POST /api/v1/auth/bootstrap-status, PUBLIC,
side-effect-free. Returns {bootstrap_available: bool} so a
first-run UI can decide whether to render setup without consuming
the bootstrap op.
- Gateway now injects actor=identity.handle on every authenticated
forward to iam-svc (IamEndpoint and WS Mux iam path), overwriting
any caller-supplied value. Underpins whoami, audit logging, and
future regime-side decisions that need actor identity.
- tg-whoami and tg-update-user CLIs.
Spec polish
- iam-contract.md: actor-injection rule documented; whoami /
bootstrap-status added to operations list; permission-scope
framing tightened (workspace scope is a property of the grant,
not the user or role).
- iam.md: self-service section; gateway flow gains the actor-
injection step; role section reframed so iam-svc constraints
don't leak into contract-level prose.
- iam-protocol.md: ops table updated for whoami, bootstrap-status,
optional-workspace pattern; bootstrap_available added to the
IamResponse listing.
386 lines
18 KiB
Markdown
---
layout: default
title: "IAM Service Protocol Technical Specification"
parent: "Tech Specs"
---

# IAM Service Protocol Technical Specification

## Overview

This document specifies the wire protocol of the **open-source IAM
regime** — one implementation of the abstract IAM contract defined
in [`iam-contract.md`](iam-contract.md). Other regimes (OIDC / SSO,
ABAC, ReBAC, external policy engines) implement the same contract
with different transports, data models, and policy semantics; the
gateway is unaware of which regime it's wired against.

The OSS regime is a backend processor (`iam-svc`) reached over the
standard request/response pub/sub pattern. It owns users,
workspaces, API keys, login credentials, and JWT signing keys, all
backed by Cassandra. The API gateway is its only caller.

This document defines:

- the `IamRequest` and `IamResponse` dataclasses on the bus,
- the operation set the OSS regime implements,
- per-operation input and output fields,
- the error taxonomy,
- the bootstrap modes,
- the initial HTTP forwarding endpoint used while the protocol is
  being exercised.

The mapping from this regime onto the abstract contract is direct:

| Contract operation | OSS regime operation |
|---|---|
| `authenticate(credential)` | `resolve-api-key` (for API keys); local JWT validation against `get-signing-key-public` (for JWTs) |
| `authorise(identity, capability, resource, parameters)` | Role-table lookup against the OSS role bundles defined in [`capabilities.md`](capabilities.md), gated by workspace scope. Workspace can come from the resource address (workspace- and flow-level resources) or from a parameter (system-level resources whose parameters reference a workspace, e.g. `create-user with workspace association W`). |
| `authorise_many` | Loop over `authorise` |
| Identity / credential / workspace management | `create-user`, `create-api-key`, etc. as listed below. These are operations on system-level resources (the user / workspace / credential registries); workspace, where it appears in the body, is a parameter. |

Architectural context — roles, capabilities, workspace as resource
scope, enforcement boundary — lives in [`iam.md`](iam.md) and
[`capabilities.md`](capabilities.md). The contract abstraction
lives in [`iam-contract.md`](iam-contract.md).

## Transport

- **Request topic:** `request:tg/request/iam-request`
- **Response topic:** `response:tg/response/iam-response`
- **Pattern:** request/response, correlated by the `id` message
  property, the same pattern used by `config-svc` and `flow-svc`.
- **Caller:** the API gateway only. Under the enforcement-boundary
  policy (see capabilities spec), the IAM service trusts the bus
  and performs no per-request authentication or capability check
  against the caller. The gateway has already evaluated capability
  membership and workspace scoping before sending the request.

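The correlation pattern above can be sketched without a real broker. The following is a minimal in-memory illustration, not the gateway's actual client: the `InMemoryCorrelator` class and the message shape (`topic`, `properties`, `body`) are assumptions made for the example. Only the topic names and the use of an `id` message property come from this spec.

```python
import uuid

REQUEST_TOPIC = "request:tg/request/iam-request"
RESPONSE_TOPIC = "response:tg/response/iam-response"


class InMemoryCorrelator:
    """Hypothetical helper: pairs responses with requests by `id`."""

    def __init__(self):
        self._pending = {}  # id -> request body awaiting a response

    def send(self, body):
        # A fresh id per request, carried as a message property
        # rather than inside the body.
        msg_id = str(uuid.uuid4())
        self._pending[msg_id] = body
        return {
            "topic": REQUEST_TOPIC,
            "properties": {"id": msg_id},
            "body": body,
        }

    def receive(self, message):
        # Responses whose id matches no outstanding request are ignored.
        msg_id = message["properties"].get("id")
        if msg_id not in self._pending:
            return None
        request = self._pending.pop(msg_id)
        return request, message["body"]
```

Because each id is removed from the pending set once matched, a duplicate or late response for the same id is dropped rather than delivered twice.
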
## Dataclasses

### `IamRequest`

```python
@dataclass
class IamRequest:
    # One of the operation strings below.
    operation: str = ""

    # Workspace for this request.  Required on create-user, where
    # it is the new user's home-workspace binding.  Optional on
    # login and on the other user and API-key operations, where it
    # acts as a filter or integrity check (see Operations).
    # Omitted (or empty) for workspace CRUD, signing-key ops,
    # bootstrap, and resolve-api-key.
    workspace: str = ""

    # Acting user id.  Set by the gateway to the authenticated
    # caller's identity handle for every authenticated request
    # (overwrites any caller-supplied value — the gateway is the
    # only authority for actor identity, so handlers can rely on it
    # being authentic).  Used for audit logging, self-service ops
    # like ``whoami`` that resolve "the caller", and future actor-
    # scoped policy checks.  Empty for unauthenticated ops
    # (``login``, ``bootstrap``, ``bootstrap-status``,
    # ``get-signing-key-public``, ``resolve-api-key``).  See the
    # actor-injection rule in the IAM contract spec.
    actor: str = ""

    # --- identity selectors ---
    user_id: str = ""
    username: str = ""             # login; unique within a workspace
    key_id: str = ""               # revoke-api-key, list-api-keys (own)
    api_key: str = ""              # resolve-api-key (plaintext)

    # --- credentials ---
    password: str = ""             # login, change-password (current)
    new_password: str = ""         # change-password

    # --- user fields ---
    user: UserInput | None = None  # create-user, update-user

    # --- workspace fields ---
    workspace_record: WorkspaceInput | None = None  # create-workspace, update-workspace

    # --- api key fields ---
    key: ApiKeyInput | None = None  # create-api-key
```

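The actor-injection rule described in the `actor` comment can be sketched as a small gateway-side function. This is an illustration under stated assumptions, not the gateway's code: `inject_actor` and the dict-shaped request are hypothetical, and clearing `actor` on unauthenticated operations is an assumption consistent with the comment's "empty for unauthenticated ops".

```python
# Operations the spec lists as unauthenticated; `actor` stays empty.
UNAUTHENTICATED_OPS = frozenset({
    "login",
    "bootstrap",
    "bootstrap-status",
    "get-signing-key-public",
    "resolve-api-key",
})


def inject_actor(request: dict, identity_handle: str) -> dict:
    """Return a copy of `request` with `actor` set by the gateway.

    Any caller-supplied `actor` is discarded: the gateway is the only
    authority for actor identity.
    """
    forwarded = dict(request)  # leave the caller's dict untouched
    if forwarded.get("operation") in UNAUTHENTICATED_OPS:
        forwarded["actor"] = ""
    else:
        forwarded["actor"] = identity_handle
    return forwarded
```

Overwriting unconditionally, rather than filling in `actor` only when it is missing, is what lets handlers such as `whoami` trust the field.
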
### `IamResponse`

```python
@dataclass
class IamResponse:
    # Populated on success of operations that return them.
    user: UserRecord | None = None  # create-user, get-user, update-user, whoami
    users: list[UserRecord] = field(default_factory=list)  # list-users
    workspace: WorkspaceRecord | None = None  # create-workspace, get-workspace, update-workspace
    workspaces: list[WorkspaceRecord] = field(default_factory=list)  # list-workspaces

    # create-api-key returns the plaintext once.  Never populated
    # on any other operation.
    api_key_plaintext: str = ""
    api_key: ApiKeyRecord | None = None  # create-api-key
    api_keys: list[ApiKeyRecord] = field(default_factory=list)  # list-api-keys

    # login, rotate-signing-key
    jwt: str = ""
    jwt_expires: str = ""  # ISO-8601 UTC

    # get-signing-key-public
    signing_key_public: str = ""  # PEM

    # resolve-api-key returns who this key authenticates as.
    resolved_user_id: str = ""
    resolved_workspace: str = ""
    resolved_roles: list[str] = field(default_factory=list)

    # reset-password
    temporary_password: str = ""  # returned once to the operator

    # bootstrap: on first run, the initial admin's one-time API key
    # is returned for the operator to capture.
    bootstrap_admin_user_id: str = ""
    bootstrap_admin_api_key: str = ""

    # bootstrap-status: true iff an unconsumed ``bootstrap`` call
    # would currently succeed.  Always emitted by the response
    # translator (the false case is meaningful for first-run UIs).
    bootstrap_available: bool = False

    # Present on any failed operation.
    error: Error | None = None
```

### Value types

```python
@dataclass
class UserInput:
    username: str = ""
    name: str = ""
    email: str = ""
    password: str = ""  # only on create-user; never on update-user
    roles: list[str] = field(default_factory=list)
    enabled: bool = True
    must_change_password: bool = False

@dataclass
class UserRecord:
    id: str = ""
    workspace: str = ""
    username: str = ""
    name: str = ""
    email: str = ""
    roles: list[str] = field(default_factory=list)
    enabled: bool = True
    must_change_password: bool = False
    created: str = ""  # ISO-8601 UTC
    # Password hash is never included in any response.

@dataclass
class WorkspaceInput:
    id: str = ""
    name: str = ""
    enabled: bool = True

@dataclass
class WorkspaceRecord:
    id: str = ""
    name: str = ""
    enabled: bool = True
    created: str = ""  # ISO-8601 UTC

@dataclass
class ApiKeyInput:
    user_id: str = ""
    name: str = ""     # operator-facing label, e.g. "laptop"
    expires: str = ""  # optional ISO-8601 UTC; empty = no expiry

@dataclass
class ApiKeyRecord:
    id: str = ""
    user_id: str = ""
    name: str = ""
    prefix: str = ""     # first 4 chars of plaintext, for identification in lists
    expires: str = ""    # empty = no expiry
    created: str = ""
    last_used: str = ""  # empty if never used
    # key_hash is never included in any response.
```

## Operations

| Operation | Request fields | Response fields | Notes |
|---|---|---|---|
| `login` | `username`, `password`, `workspace` (optional) | `jwt`, `jwt_expires` | If `workspace` omitted, IAM resolves to the user's assigned workspace. |
| `whoami` | `actor` (gateway-injected) | `user` | Returns the calling user's own record. AUTHENTICATED-only; no `users:read` capability required. |
| `resolve-api-key` | `api_key` (plaintext) | `resolved_user_id`, `resolved_workspace`, `resolved_roles` | Gateway-internal. Service returns `auth-failed` for unknown / expired / revoked keys. |
| `change-password` | `user_id`, `password` (current), `new_password` | — | Self-service. IAM validates `password` against stored hash. |
| `reset-password` | `user_id`, `workspace` (optional integrity check) | `temporary_password` | Admin-initiated. IAM generates a random password, sets `must_change_password=true` on the user, returns the plaintext once. |
| `create-user` | `workspace`, `user` | `user` | `user.password` is hashed and stored; `user.roles` must be subset of known roles. `workspace` is the new user's home-workspace binding (a required *parameter*, not an address). |
| `list-users` | `workspace` (optional filter) | `users` | If `workspace` omitted, returns the deployment-wide list. |
| `get-user` | `user_id`, `workspace` (optional integrity check) | `user` | |
| `update-user` | `user_id`, `user`, `workspace` (optional integrity check) | `user` | `password` field on `user` is rejected; use `change-password` / `reset-password`. Username is immutable. |
| `disable-user` | `user_id`, `workspace` (optional integrity check) | — | Soft-delete; sets `enabled=false`. Revokes all the user's API keys. |
| `enable-user` | `user_id`, `workspace` (optional integrity check) | — | Re-enables a previously disabled user; does not restore API keys. |
| `delete-user` | `user_id`, `workspace` (optional integrity check) | — | Hard-delete; removes user record, username lookup, and all the user's API keys. |
| `create-workspace` | `workspace_record` | `workspace` | System-level. |
| `list-workspaces` | — | `workspaces` | System-level. |
| `get-workspace` | `workspace_record` (id only) | `workspace` | System-level. |
| `update-workspace` | `workspace_record` | `workspace` | System-level. |
| `disable-workspace` | `workspace_record` (id only) | — | System-level. Sets `enabled=false`; revokes all workspace API keys; disables all users in the workspace. |
| `create-api-key` | `key`, `workspace` (optional integrity check) | `api_key_plaintext`, `api_key` | Plaintext returned **once**; only hash stored. `key.name` required. |
| `list-api-keys` | `user_id`, `workspace` (optional integrity check) | `api_keys` | |
| `revoke-api-key` | `key_id`, `workspace` (optional integrity check) | — | Deletes the key record. |
| `get-signing-key-public` | — | `signing_key_public` | Gateway fetches this at startup. |
| `rotate-signing-key` | — | — | System-level. Introduces a new signing key; old key continues to validate JWTs for a grace period (implementation-defined, minimum 1h). |
| `bootstrap` | — | `bootstrap_admin_user_id`, `bootstrap_admin_api_key` | If IAM tables are empty and the service is in `bootstrap` mode, creates the initial `default` workspace, an `admin` user, an initial API key, and an initial signing key; returns them once. Otherwise returns a masked auth failure. |
| `bootstrap-status` | — | `bootstrap_available` | Side-effect-free probe; `true` iff iam-svc is in `bootstrap` mode and tables are empty. Intended for first-run UX. |

## Error taxonomy

All errors are carried in the `IamResponse.error` field. `error.type`
is one of the values below; `error.message` is a human-readable
string that is **not** surfaced verbatim to external callers (the
gateway maps to `auth failure` / `access denied` per the IAM error
policy).

| `type` | When |
|---|---|
| `invalid-argument` | Malformed request (missing required field, unknown operation, invalid format). |
| `not-found` | Named resource does not exist (`user_id`, `key_id`, workspace). |
| `duplicate` | Create operation collides with an existing resource (username, workspace id, key name). |
| `auth-failed` | `login` with wrong credentials; `resolve-api-key` with unknown / expired / revoked key; `change-password` with wrong current password. Single bucket to deny oracle attacks. |
| `weak-password` | Password does not meet policy (length, complexity — policy defined at service level). |
| `disabled` | Target user or workspace has `enabled=false`. |
| `operation-not-permitted` | Non-admin attempting system-level operation, or workspace-scoped operation attempting to affect another workspace. |
| `internal-error` | Unexpected IAM-side failure. Log and surface as 500 at the gateway. |

The gateway is responsible for translating `auth-failed` and
`operation-not-permitted` into the obfuscated external error
response (`"auth failure"` / `"access denied"`); `invalid-argument`
becomes a descriptive 400; `not-found` / `duplicate` /
`weak-password` / `disabled` become descriptive 4xx but never leak
IAM-internal detail.

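The translation rule above might look like the following at the gateway. This is a sketch under assumptions: only the two masked strings, the 400 for `invalid-argument`, and the 500 for `internal-error` come from this spec. The other status codes (401, 403, 404, 409, 422), the descriptive messages, and the `translate_error` name are illustrative choices.

```python
# Errors that collapse into the obfuscated external responses.
MASKED = {
    "auth-failed": (401, "auth failure"),
    "operation-not-permitted": (403, "access denied"),
}

# Errors that stay descriptive.  Messages are fixed per type so that
# IAM's own `error.message` is never forwarded to the caller.
DESCRIPTIVE = {
    "invalid-argument": (400, "invalid argument"),
    "not-found": (404, "not found"),
    "duplicate": (409, "already exists"),
    "weak-password": (422, "password does not meet policy"),
    "disabled": (403, "disabled"),
}


def translate_error(error_type: str) -> tuple[int, str]:
    """Map an IAM `error.type` to an external (status, message) pair."""
    if error_type in MASKED:
        return MASKED[error_type]
    if error_type in DESCRIPTIVE:
        return DESCRIPTIVE[error_type]
    # `internal-error` and any unrecognised type surface as a plain 500.
    return (500, "internal error")
```

The function takes only the error type, so the IAM-side message cannot reach the external response by accident.
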
## Credential storage

- **Passwords** are stored using a slow KDF (bcrypt / argon2id — the
  service picks; documented as an implementation detail). The
  `password_hash` column stores the full KDF-encoded string
  (algorithm, cost, salt, hash). Not a plain SHA-256.
- **API keys** are stored as SHA-256 of the plaintext. API keys
  are 128-bit random values (`tg_` + base64url); the entropy
  makes a slow hash unnecessary. The hash serves as the primary
  key on the `iam_api_keys` table, enabling O(1) lookup on
  `resolve-api-key`.
- **JWT signing key** is stored as an RSA or Ed25519 private key
  (implementation choice) in a dedicated `iam_signing_keys` table
  with a `kid`, `created`, and optional `retired` timestamp. At
  most one active key; up to N retired keys are kept for a grace
  period to validate previously-issued JWTs.

Passwords, API-key plaintext, and signing-key private material are
never returned in any response other than the explicit one-time
responses above (`reset-password`, `create-api-key`, `bootstrap`).

## Bootstrap modes

`iam-svc` requires a bootstrap mode to be chosen at startup. There is
no default — an unset or invalid mode causes the service to refuse
to start. The purpose is to force the operator to make an explicit
security decision rather than rely on an implicit "safe" fallback.

| Mode | Startup behaviour | `bootstrap` operation | Suitability |
|---|---|---|---|
| `token` | On first start with empty tables, auto-seeds the `default` workspace, admin user, admin API key (using the operator-provided `--bootstrap-token`), and an initial signing key. No-op on subsequent starts. | Refused — returns `auth-failed` / `"auth failure"` regardless of caller. | Production, any public-exposure deployment. |
| `bootstrap` | No startup seeding. Tables remain empty until the `bootstrap` operation is invoked over the pub/sub bus (typically via `tg-bootstrap-iam`). | Live while tables are empty. Generates and returns the admin API key once. Refused (`auth-failed`) once tables are populated. | Dev / compose up / CI. **Not safe under public exposure** — any caller reaching the gateway's `/api/v1/iam` forwarder before the operator can cause a token to be issued to them. Operators choosing this mode accept that risk. |

### Error masking

In both modes, any refused invocation of the `bootstrap` operation
returns the same error (`auth-failed` / `"auth failure"`). A caller
cannot distinguish:

- "service is in token mode"
- "service is in bootstrap mode but already bootstrapped"
- "operation forbidden"

This matches the general IAM error-policy stance (see `iam.md`) and
prevents externally enumerating IAM's state.

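The masking behaviour amounts to a single refusal path. The sketch below is a simplification under assumptions: the handler names, the dict-shaped responses, and the placeholder return values are hypothetical, and the actual seeding of the workspace, admin user, API key, and signing key is elided.

```python
# The one error every refused `bootstrap` call returns.
AUTH_FAILED = {"type": "auth-failed", "message": "auth failure"}


def handle_bootstrap(mode: str, tables_empty: bool) -> dict:
    """Return an IamResponse-shaped dict for the `bootstrap` operation."""
    if mode == "bootstrap" and tables_empty:
        # Seeding is elided; the values below are placeholders.
        return {
            "bootstrap_admin_user_id": "<admin-user-id>",
            "bootstrap_admin_api_key": "<one-time-api-key>",
        }
    # Token mode, already bootstrapped, or otherwise forbidden: the
    # response is identical, so the caller learns nothing about state.
    return {"error": dict(AUTH_FAILED)}


def handle_bootstrap_status(mode: str, tables_empty: bool) -> dict:
    # Side-effect-free probe: true iff `bootstrap` would succeed now.
    return {"bootstrap_available": mode == "bootstrap" and tables_empty}
```

Funnelling every refusal through the same return statement keeps the three cases byte-for-byte indistinguishable, rather than relying on separately written error branches staying in sync.
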
### Configuration sources

The mode and token can be supplied in two ways. Resolution order is
fixed; there is no permissive fallback.

| Source | Field |
|---|---|
| Processor-group YAML / CLI argument | `bootstrap_mode`, `bootstrap_token` |
| Environment variable | `IAM_BOOTSTRAP_MODE`, `IAM_BOOTSTRAP_TOKEN` |

For each setting the service uses the explicit param value if
present; otherwise the environment variable; otherwise the service
refuses to start. The env-var path is intended for the K8s
deployment pattern where the token is injected from a `Secret` via
`secretKeyRef`, so the plaintext never has to live in YAML or git.
A typical production manifest holds `bootstrap_mode: "token"` in
the YAML and pulls `IAM_BOOTSTRAP_TOKEN` from the Secret; the YAML
is then safe to version-control.

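The resolution order can be expressed as a small function. This is a sketch, assuming that an empty string counts as "not present"; `resolve_setting`, `resolve_bootstrap_mode`, and `BootstrapConfigError` are hypothetical names, and raising an exception stands in for the service refusing to start.

```python
import os

VALID_MODES = ("token", "bootstrap")


class BootstrapConfigError(RuntimeError):
    """Raised where the service would refuse to start."""


def resolve_setting(param_value, env_name, environ=None):
    """Explicit parameter first, then the environment, else fail."""
    environ = os.environ if environ is None else environ
    if param_value:
        return param_value
    value = environ.get(env_name)
    if value:
        return value
    raise BootstrapConfigError(
        f"no explicit value given and {env_name} is not set"
    )


def resolve_bootstrap_mode(param_value=None, environ=None):
    mode = resolve_setting(param_value, "IAM_BOOTSTRAP_MODE", environ)
    if mode not in VALID_MODES:
        # There is no default: an invalid mode is as fatal as an unset one.
        raise BootstrapConfigError(f"invalid bootstrap mode: {mode!r}")
    return mode
```

In the production pattern described above, the mode would arrive as the explicit parameter from YAML, while the token would be resolved with `resolve_setting(None, "IAM_BOOTSTRAP_TOKEN")` and so come from the environment.
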
### Bootstrap-token lifecycle

The bootstrap token — whether operator-supplied (`token` mode) or
service-generated (`bootstrap` mode) — is a one-time credential. It
is stored as the admin's single API key, tagged `name="bootstrap"`.
The operator's first admin actions after bootstrap should be:

1. Create a durable admin user and API key (or issue a durable API
   key to the bootstrap admin).
2. Revoke the bootstrap key via `revoke-api-key`.
3. Remove the bootstrap token from any deployment configuration
   (Secret, env var, or YAML field — wherever it was sourced).

The `name="bootstrap"` marker makes bootstrap keys easy to detect in
tooling (e.g. a `tg-list-api-keys` filter).

## HTTP forwarding (initial integration)

For the initial gateway integration — before the IAM service is
wired into the authentication middleware — the gateway exposes a
single forwarding endpoint:

```
POST /api/v1/iam
```

- Request body is a JSON encoding of `IamRequest`.
- Response body is a JSON encoding of `IamResponse`.
- The gateway's existing authentication (`GATEWAY_SECRET` bearer)
  gates access to this endpoint so the IAM protocol can be
  exercised end-to-end in tests without touching the live auth
  path.
- This endpoint is **not** the final shape. Once the middleware is
  in place, per-operation REST endpoints replace it (for example
  `POST /api/v1/auth/login`, `POST /api/v1/users`, `DELETE
  /api/v1/api-keys/{id}`), and this generic forwarder is removed.

The endpoint performs only message marshalling: it does not read
or rewrite fields in the request, and it applies no capability
check. All authorisation for user / workspace / key management
lands in the subsequent middleware work.

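A test client for the forwarder could build its request as follows. This is a sketch with stated assumptions: the JSON keys are taken to match the `IamRequest` field names, the secret is assumed to travel in an `Authorization: Bearer` header, and `build_iam_request` and its parameters are hypothetical. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_iam_request(base_url, gateway_secret, operation, **fields):
    """Build (but do not send) a POST to the generic IAM forwarder."""
    # Assumed encoding: IamRequest field names used directly as JSON keys.
    body = {"operation": operation, **fields}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/iam",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {gateway_secret}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it; the response body could then be decoded with `json.loads` and inspected for an `error` field before any other field is read.
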
## Non-goals for this spec

- REST endpoint shape for the final gateway surface — covered in
  Phase 2 of the IAM implementation plan, not here.
- OIDC / SAML external IdP protocol — out of scope for open source.
- Key-signing algorithm choice, password KDF choice, JWT claim
  layout — implementation details captured in code + ADRs, not
  locked in the protocol spec.

## References

- [IAM Contract Specification](iam-contract.md) — the abstract
  gateway↔IAM regime contract this protocol implements.
- [Identity and Access Management Specification](iam.md)
- [Capability Vocabulary Specification](capabilities.md)