Updated IAM spec

2026-06-25 22:58:06 +02:00 · 2026-04-23 11:58:07 +01:00 · 2026-04-23 11:58:07 +01:00 · f1e336138d
commit f1e336138d
parent b7c6c3a3a0
1 changed file with 76 additions and 18 deletions
--- a/docs/tech-specs/capabilities.md
+++ b/docs/tech-specs/capabilities.md
@@ -18,12 +18,12 @@ This document defines the capability vocabulary — the closed list
 of capability strings that the gateway recognises — and the
 open-source edition's role bundles.

-The capability mechanism is shared between open-source and
-enterprise editions. The open-source edition ships a fixed
-three-role bundle (`reader`, `writer`, `admin`). Enterprise editions
-define additional roles by composing their own capability bundles
-from the same vocabulary; no protocol, gateway, or backend-service
-change is required.
+The capability mechanism is shared between the open-source edition
+and potential third-party enterprise extensions. The open-source
+edition ships a fixed three-role bundle (`reader`, `writer`,
+`admin`). Enterprises may define additional roles by composing
+their own capability bundles from the same vocabulary; no protocol,
+gateway, or backend-service change is required.

 ## Motivation

@@ -53,16 +53,20 @@ multi-word subsystems.

 | Capability | Covers |
 |---|---|
-| `query` | Read queries: agent, text-completion, prompt, graph-rag, document-rag, embeddings, triples, rows, NLP query, SPARQL, structured-query, mcp-tool |
-| `library:read` | List / fetch documents |
-| `library:write` | Add / replace / delete documents |
+| `agent` | agent (query-only; no write counterpart) |
+| `graph:read` | graph-rag, graph-embeddings-query, triples-query, sparql, graph-embeddings-export, triples-export |
+| `graph:write` | triples-import, graph-embeddings-import |
+| `documents:read` | document-rag, document-embeddings-query, document-embeddings-export, entity-contexts-export, document-stream-export, library list / fetch |
+| `documents:write` | document-embeddings-import, entity-contexts-import, text-load, document-load, library add / replace / delete |
+| `rows:read` | rows-query, row-embeddings-query, nlp-query, structured-query, structured-diag |
+| `rows:write` | rows-import |
+| `llm` | text-completion, prompt (stateless invocation) |
+| `embeddings` | Raw text-embedding service (stateless compute; typed-data embedding stores live under their data-subject capability) |
+| `mcp` | mcp-tool |
 | `collections:read` | List / describe collections |
 | `collections:write` | Create / delete collections |
 | `knowledge:read` | List / get knowledge cores |
 | `knowledge:write` | Create / delete knowledge cores |
-| `ingest` | text-load, document-load |
-| `export` | Streaming exports (triples, graph-embeddings, document-embeddings, entity-contexts, core-export) |
-| `import` | Streaming imports (triples, graph-embeddings, document-embeddings, entity-contexts, rows, core-import) |

 **Control plane**

@@ -87,14 +91,28 @@ The open-source edition ships three roles:

 | Role | Capabilities |
 |---|---|
-| `reader` | `query`, `library:read`, `collections:read`, `knowledge:read`, `flows:read`, `config:read`, `keys:self` |
-| `writer` | everything in `reader` **+** `library:write`, `collections:write`, `knowledge:write`, `ingest`, `export`, `import` |
+| `reader` | `agent`, `graph:read`, `documents:read`, `rows:read`, `llm`, `embeddings`, `mcp`, `collections:read`, `knowledge:read`, `flows:read`, `config:read`, `keys:self` |
+| `writer` | everything in `reader` **+** `graph:write`, `documents:write`, `rows:write`, `collections:write`, `knowledge:write` |
 | `admin` | everything in `writer` **+** `config:write`, `flows:write`, `users:read`, `users:write`, `users:admin`, `keys:admin`, `workspaces:admin`, `iam:admin`, `metrics:read` |

 Open-source bundles are deliberately coarse. `workspaces:admin` and
 `iam:admin` live inside `admin` without a separate role; a single
 `admin` user holds the keys to the whole deployment.

+### The `agent` capability and composition
+
+The `agent` capability is granted independently of the capabilities
+it composes under the hood (`llm`, `graph`, `documents`, `rows`,
+`mcp`, etc.). A user holding `agent` but not `llm` can still cause
+LLM invocations because the agent implementation chooses which
+services to invoke on the caller's behalf.
+
+This is deliberate. A common policy is "allow controlled access
+via the agent, deny raw model calls" — granting `agent` without
+granting `llm` expresses exactly that. An administrator granting
+`agent` should treat it as a grant of everything the agent
+composes at deployment time.
+
 ### Authorisation evaluation

 For a request bearing a resolved set of roles
@@ -109,6 +127,46 @@ No hierarchy, no precedence, no role-order sensitivity. A user
 with a single role is the common case; a user with multiple roles
 gets the union of their bundles.

+### Enforcement boundary
+
+Capability checks — and authentication — are applied **only at the
+API gateway**, on requests arriving from external callers.
+Operations originating inside the platform (backend service to
+backend service, agent to LLM, flow-svc to config-svc, bootstrap
+initialisers, scheduled reconcilers, autonomous flow steps) are
+**not capability-checked**. Backend services trust the workspace
+set by the gateway on inbound pub/sub messages and trust
+internally-originated messages without further authorisation.
+
+This policy has four consequences that are part of the spec, not
+accidents of implementation:
+
+1. **The gateway is the single trust boundary for user
+   authorisation.** Every backend service is a downstream consumer
+   of an already-authorised workspace scope.
+2. **Pub/sub carries workspace, not user identity.** Messages on
+   the bus do not carry credentials or the identity that originated
+   a request; they carry the resolved workspace only. This keeps
+   the bus protocol free of secrets and aligns with the workspace
+   resolver's role as the gateway-side narrowing step.
+3. **Composition is transitive.** Granting a capability that the
+   platform composes internally (for example, `agent`) transitively
+   grants everything that capability composes under the hood,
+   because the downstream calls are internal-origin and are not
+   re-checked. The composite nature of `agent` described above is
+   a consequence of this policy, not a special case.
+4. **Internal-origin operations have no user.** Bootstrap,
+   reconcilers, and other platform-initiated work act with
+   system-level authority. The workspace field on such messages
+   identifies which workspace's data is being touched, not who
+   asked.
+
+**Trust model.** Whoever has pub/sub access is implicitly trusted
+to act as any workspace. Defense-in-depth within the backend is
+not part of this design; the security perimeter is the gateway
+and the bus itself (TLS / network isolation between the bus and
+any untrusted network).
+
 ### Unknown capabilities and unknown roles

 - An endpoint declaring an unknown capability is a server-side bug
@@ -143,13 +201,13 @@ data-engineer:  writer + {flows:read, config:read}
 workspace-owner: admin − {workspaces:admin, iam:admin}
 ```

-None of this requires a protocol change — the wire-protocol
-`roles` field on user records is already a set, the gateway's
+None of this requires a protocol change — the wire-protocol `roles`
+field on user records is already a set, the gateway's
 capability-check is already capability-based, and the capability
-vocabulary is closed. Enterprise introduces roles whose bundles
+vocabulary is closed. Enterprises may introduce roles whose bundles
 compose the same capabilities differently.

-When enterprise introduces a new capability (e.g. for a feature
+When an enterprise introduces a new capability (e.g. for a feature
 that does not exist in open source), the capability string is
 added to the vocabulary and recognised by the gateway build that
 ships that feature.
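
The role bundles and the union-based authorisation evaluation introduced by this change can be illustrated with a minimal Python sketch. It is not the gateway's implementation: the names `ROLE_BUNDLES`, `capabilities_for`, and `is_authorised` are hypothetical, and the treatment of unknown roles (they contribute no capabilities) is an assumption, since the relevant section is not fully visible in this diff. The bundle contents are copied from the updated role table.

```python
# Hypothetical sketch of the updated open-source role bundles.
# All identifiers here are illustrative; only the capability strings
# and bundle contents come from the spec's role table.

READER = frozenset({
    "agent", "graph:read", "documents:read", "rows:read", "llm",
    "embeddings", "mcp", "collections:read", "knowledge:read",
    "flows:read", "config:read", "keys:self",
})

# "everything in reader +" the data-plane write capabilities.
WRITER = READER | frozenset({
    "graph:write", "documents:write", "rows:write",
    "collections:write", "knowledge:write",
})

# "everything in writer +" the control-plane capabilities.
ADMIN = WRITER | frozenset({
    "config:write", "flows:write", "users:read", "users:write",
    "users:admin", "keys:admin", "workspaces:admin", "iam:admin",
    "metrics:read",
})

ROLE_BUNDLES = {"reader": READER, "writer": WRITER, "admin": ADMIN}


def capabilities_for(roles, bundles=ROLE_BUNDLES):
    """Return the union of the bundles of the given roles.

    No hierarchy, precedence, or ordering is applied. Unknown roles
    are assumed to contribute nothing (an assumption of this sketch).
    """
    caps = set()
    for role in roles:
        caps |= bundles.get(role, frozenset())
    return frozenset(caps)


def is_authorised(roles, required, bundles=ROLE_BUNDLES):
    """Allow the request only if the required capability is in the union."""
    return required in capabilities_for(roles, bundles)


if __name__ == "__main__":
    print(is_authorised({"reader"}, "graph:read"))
    print(is_authorised({"reader"}, "graph:write"))
    # A custom bundle expressing "agent access without raw model calls".
    custom = {"agent-only": frozenset({"agent"})}
    print(is_authorised({"agent-only"}, "llm", custom))
```

In this sketch a role is just a named set, so a custom bundle such as the hypothetical `agent-only` role can be expressed without touching the evaluation logic. Note that the check only models the gateway-side decision; per the enforcement-boundary section, whatever the agent invokes internally is not re-checked.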