diff --git a/doc/api-reference.md b/doc/api-reference.md
index 363c865..1069082 100644
--- a/doc/api-reference.md
+++ b/doc/api-reference.md
@@ -75,10 +75,30 @@ A dictionary containing the chat completion response with the following structur
         "prompt_tokens": int,
         "completion_tokens": int,
         "total_tokens": int
+    },
+    "_metadata": {
+        "payload_id": str,
+        "processed_at": int,  # Unix timestamp
+        "is_encrypted": bool,
+        "response_status": str,
+        "security_tier": str,  # "standard", "high", or "maximum"
+        "memory_protection": dict,  # server-side memory protection info
+        "cuda_device": dict,  # privacy-safe GPU info (hashed identifiers)
+        "tpm_attestation": {  # TPM 2.0 hardware attestation (see Security Guide)
+            "is_available": bool,
+            # Present only when is_available is True:
+            "pcr_banks": str,  # e.g. "sha256:0,7,10"
+            "pcr_values": dict,  # {bank: {pcr_index: hex_digest}}
+            "quote_b64": str,  # base64-encoded TPMS_ATTEST (signed by AIK)
+            "signature_b64": str,  # base64-encoded TPMT_SIGNATURE
+            "aik_pubkey_b64": str,  # base64-encoded TPM2B_PUBLIC (ephemeral AIK)
+        }
     }
 }
 ```
 
+The `_metadata` field is added by the client library and is not part of the OpenAI API response format. See the [Security Guide](security-guide.md) for how to interpret and verify `tpm_attestation`.
+
 #### acreate(model, messages, **kwargs)
 
 Async alias for create() method.
diff --git a/doc/security-guide.md b/doc/security-guide.md
index 6c34f71..6e4abdc 100644
--- a/doc/security-guide.md
+++ b/doc/security-guide.md
@@ -162,6 +162,81 @@ Secure memory features:
 - Guarantees zeroing of sensitive memory
 - Prevents memory dumps from containing sensitive data
 
+## Hardware Attestation (TPM 2.0)
+
+### What it is
+
+When the server has a TPM 2.0 chip, every response includes a `tpm_attestation` block in `_metadata`. This is a cryptographically signed hardware quote proving:
+
+- Which firmware and Secure Boot state the server is running (PCR 0, 7)
+- Which application binary is running, when IMA is active (PCR 10)
+
+The quote is signed by an ephemeral AIK (Attestation Identity Key) generated fresh for each request and tied to the `payload_id` nonce, so it cannot be replayed for a different request.
+
+### Reading the attestation
+
+```python
+response = await client.create(
+    model="Qwen/Qwen3-0.6B",
+    messages=[{"role": "user", "content": "..."}],
+    security_tier="maximum"
+)
+
+tpm = response["_metadata"].get("tpm_attestation", {})
+
+if tpm.get("is_available"):
+    print("PCR banks:", tpm["pcr_banks"])  # e.g. "sha256:0,7,10"
+    print("PCR values:", tpm["pcr_values"])  # {bank: {index: hex}}
+    print("AIK key:", tpm["aik_pubkey_b64"][:32], "...")
+else:
+    print("TPM not available on this server")
+```
+
+### Verifying the quote
+
+The response is self-contained: `aik_pubkey_b64` is the full public key of the AIK that signed the quote, so no separate key-fetch round-trip is needed.
+
+Verification steps using `tpm2-pytss`:
+
+```python
+import base64
+from tpm2_pytss.types import TPM2B_PUBLIC, TPMT_SIGNATURE, TPM2B_ATTEST
+
+# 1. Decode the quote components
+aik_pub = TPM2B_PUBLIC.unmarshal(base64.b64decode(tpm["aik_pubkey_b64"]))[0]
+quote = TPM2B_ATTEST.unmarshal(base64.b64decode(tpm["quote_b64"]))[0]
+sig = TPMT_SIGNATURE.unmarshal(base64.b64decode(tpm["signature_b64"]))[0]
+
+# 2. Verify the signature over the quote using the AIK public key
+#    (use a TPM ESAPI verify_signature call or an offline RSA verify)
+
+# 3. Inspect the qualifying_data inside the quote — it must match
+#    SHA-256(payload_id.encode())[:16] to confirm this quote is for this request
+
+# 4. Check pcr_values against your known-good baseline
+```
+
+> Full verification requires `tpm2-pytss` on the client side (`pip install tpm2-pytss` + `sudo apt install libtss2-dev`). It is optional — the attestation is informational unless your deployment policy requires verification.
+
+### Behaviour per security tier
+
+| Tier | TPM unavailable |
+|------|----------------|
+| `standard` | `tpm_attestation: {"is_available": false}` — request proceeds |
+| `high` | same as standard |
+| `maximum` | `ServiceUnavailableError` (HTTP 503) — request rejected |
+
+For `maximum` tier, the server enforces TPM availability as a hard requirement. If your server has no TPM and you request `maximum`, catch the error explicitly:
+
+```python
+from nomyo import ServiceUnavailableError
+
+try:
+    response = await client.create(..., security_tier="maximum")
+except ServiceUnavailableError as e:
+    print("Server does not meet TPM requirements for maximum tier:", e)
+```
+
 ## Compliance Considerations
 
 ### HIPAA Compliance
@@ -207,9 +282,11 @@ response = await client.create(
     messages=[{"role": "user", "content": "Hello"}]
 )
 
-print(response["_metadata"])  # Contains security-related information
+print(response["_metadata"])  # Contains security_tier, memory_protection, tpm_attestation, etc.
 ```
 
+See [Hardware Attestation](#hardware-attestation-tpm-20) for details on the `tpm_attestation` field.
+
 ### Logging
 
 Enable logging to see security operations:
diff --git a/nomyo/SecureCompletionClient.py b/nomyo/SecureCompletionClient.py
index 4c1d96d..6aa5379 100644
--- a/nomyo/SecureCompletionClient.py
+++ b/nomyo/SecureCompletionClient.py
@@ -1,5 +1,5 @@
 import asyncio, ctypes, json, base64, urllib.parse, httpx, os, secrets, sys, warnings, logging
-from typing import Dict, Any, Optional
+from typing import Dict, Any, Optional, Union
 from cryptography.hazmat.primitives import serialization, hashes
 from cryptography.hazmat.primitives.asymmetric import rsa, padding
 from cryptography.hazmat.backends import default_backend
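
Steps 3 and 4 of the verification outline in the security guide are left as comments in this patch. A minimal sketch of what they might look like follows, using only the standard library. The helper names (`expected_qualifying_data`, `nonce_matches`, `pcr_mismatches`) are hypothetical and not part of `nomyo`. The sketch assumes the nonce is the first 16 raw digest bytes (not hex characters) of SHA-256 over the UTF-8 `payload_id`, and that the caller has already extracted the qualifying data from the quote (the `extraData` field of `TPMS_ATTEST`) after verifying the signature in step 2.

```python
import hashlib
import hmac


def expected_qualifying_data(payload_id: str) -> bytes:
    """First 16 raw bytes of SHA-256 over the UTF-8 encoded payload_id (assumed format)."""
    return hashlib.sha256(payload_id.encode()).digest()[:16]


def nonce_matches(payload_id: str, qualifying_data: bytes) -> bool:
    """Step 3: constant-time check that the quote is bound to this request."""
    return hmac.compare_digest(expected_qualifying_data(payload_id), qualifying_data)


def pcr_mismatches(reported: dict, baseline: dict) -> list:
    """Step 4: list (bank, index) pairs whose digest is missing or differs from the baseline.

    Both arguments use the {bank: {pcr_index: hex_digest}} shape of `pcr_values`.
    Indices are compared as strings and digests case-insensitively, since JSON
    object keys arrive as strings.
    """
    mismatches = []
    for bank, expected in baseline.items():
        got = {str(k): str(v).lower() for k, v in reported.get(bank, {}).items()}
        for index, digest in expected.items():
            if got.get(str(index)) != str(digest).lower():
                mismatches.append((bank, str(index)))
    return mismatches
```

An empty list from `pcr_mismatches` means every PCR named in the baseline was reported with the expected digest. PCRs that are reported but absent from the baseline are ignored, which may or may not suit a given deployment policy.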