diff --git a/README.md b/README.md
index 32c6c81..c10b898 100644
--- a/README.md
+++ b/README.md
@@ -10,17 +10,6 @@
 ## 🚀 Quick Start
-### 0. Try It Now (Demo Credentials)
-
-No account needed — use these public demo credentials to test immediately:
-
-| | |
-|---|---|
-| **API key** | `NOMYO_AI_E2EE_INFERENCE` |
-| **Model** | `Qwen/Qwen3-0.6B` |
-
-> **Note:** The demo endpoint uses a fixed 256-token context window and is intended for evaluation only.
-
 ### 1. Install methods via pip (recommended):
diff --git a/doc/api-reference.md b/doc/api-reference.md
index 1069082..363c865 100644
--- a/doc/api-reference.md
+++ b/doc/api-reference.md
@@ -75,30 +75,10 @@ A dictionary containing the chat completion response with the following structur
         "prompt_tokens": int,
         "completion_tokens": int,
         "total_tokens": int
-    },
-    "_metadata": {
-        "payload_id": str,
-        "processed_at": int, # Unix timestamp
-        "is_encrypted": bool,
-        "response_status": str,
-        "security_tier": str, # "standard", "high", or "maximum"
-        "memory_protection": dict, # server-side memory protection info
-        "cuda_device": dict, # privacy-safe GPU info (hashed identifiers)
-        "tpm_attestation": { # TPM 2.0 hardware attestation (see Security Guide)
-            "is_available": bool,
-            # Present only when is_available is True:
-            "pcr_banks": str, # e.g. "sha256:0,7,10"
-            "pcr_values": dict, # {bank: {pcr_index: hex_digest}}
-            "quote_b64": str, # base64-encoded TPMS_ATTEST (signed by AIK)
-            "signature_b64": str, # base64-encoded TPMT_SIGNATURE
-            "aik_pubkey_b64": str, # base64-encoded TPM2B_PUBLIC (ephemeral AIK)
-        }
     }
 }
 ```
-The `_metadata` field is added by the client library and is not part of the OpenAI API response format. See the [Security Guide](security-guide.md) for how to interpret and verify `tpm_attestation`.
-
 #### acreate(model, messages, **kwargs)
 Async alias for create() method.
diff --git a/doc/getting-started.md b/doc/getting-started.md
index 1cf78a4..4ccdf82 100644
--- a/doc/getting-started.md
+++ b/doc/getting-started.md
@@ -1,33 +1,5 @@
 # Getting Started
-## Try It Now (Demo Credentials)
-
-You can test the client immediately using these public demo credentials — no sign-up required:
-
-| | |
-|---|---|
-| **API key** | `NOMYO_AI_E2EE_INFERENCE` |
-| **Model** | `Qwen/Qwen3-0.6B` |
-
-> **Note:** The demo endpoint uses a fixed 256-token context window and is intended for evaluation only.
-
-```python
-import asyncio
-from nomyo import SecureChatCompletion
-
-async def main():
-    client = SecureChatCompletion(api_key="NOMYO_AI_E2EE_INFERENCE")
-
-    response = await client.create(
-        model="Qwen/Qwen3-0.6B",
-        messages=[{"role": "user", "content": "Hello!"}]
-    )
-
-    print(response['choices'][0]['message']['content'])
-
-asyncio.run(main())
-```
-
 ## Basic Usage
 The NOMYO client provides end-to-end encryption (E2E) for all communications between your application and the NOMYO inference endpoints. This ensures that your prompts and responses are protected from unauthorized access or interception.
diff --git a/doc/security-guide.md b/doc/security-guide.md
index 6e4abdc..6c34f71 100644
--- a/doc/security-guide.md
+++ b/doc/security-guide.md
@@ -162,81 +162,6 @@ Secure memory features:
 - Guarantees zeroing of sensitive memory
 - Prevents memory dumps from containing sensitive data
-## Hardware Attestation (TPM 2.0)
-
-### What it is
-
-When the server has a TPM 2.0 chip, every response includes a `tpm_attestation` block in `_metadata`. This is a cryptographically signed hardware quote proving:
-
-- Which firmware and Secure Boot state the server is running (PCR 0, 7)
-- Which application binary is running, when IMA is active (PCR 10)
-
-The quote is signed by an ephemeral AIK (Attestation Identity Key) generated fresh for each request and tied to the `payload_id` nonce, so it cannot be replayed for a different request.
-
-### Reading the attestation
-
-```python
-response = await client.create(
-    model="Qwen/Qwen3-0.6B",
-    messages=[{"role": "user", "content": "..."}],
-    security_tier="maximum"
-)
-
-tpm = response["_metadata"].get("tpm_attestation", {})
-
-if tpm.get("is_available"):
-    print("PCR banks:", tpm["pcr_banks"]) # e.g. "sha256:0,7,10"
-    print("PCR values:", tpm["pcr_values"]) # {bank: {index: hex}}
-    print("AIK key:", tpm["aik_pubkey_b64"][:32], "...")
-else:
-    print("TPM not available on this server")
-```
-
-### Verifying the quote
-
-The response is self-contained: `aik_pubkey_b64` is the full public key of the AIK that signed the quote, so no separate key-fetch round-trip is needed.
-
-Verification steps using `tpm2-pytss`:
-
-```python
-import base64
-from tpm2_pytss.types import TPM2B_PUBLIC, TPMT_SIGNATURE, TPM2B_ATTEST
-
-# 1. Decode the quote components
-aik_pub = TPM2B_PUBLIC.unmarshal(base64.b64decode(tpm["aik_pubkey_b64"]))[0]
-quote = TPM2B_ATTEST.unmarshal(base64.b64decode(tpm["quote_b64"]))[0]
-sig = TPMT_SIGNATURE.unmarshal(base64.b64decode(tpm["signature_b64"]))[0]
-
-# 2. Verify the signature over the quote using the AIK public key
-# (use a TPM ESAPI verify_signature call or an offline RSA verify)
-
-# 3. Inspect the qualifying_data inside the quote — it must match
-# SHA-256(payload_id.encode())[:16] to confirm this quote is for this request
-
-# 4. Check pcr_values against your known-good baseline
-```
-
-> Full verification requires `tpm2-pytss` on the client side (`pip install tpm2-pytss` + `sudo apt install libtss2-dev`). It is optional — the attestation is informational unless your deployment policy requires verification.
-
-### Behaviour per security tier
-
-| Tier | TPM unavailable |
-|------|----------------|
-| `standard` | `tpm_attestation: {"is_available": false}` — request proceeds |
-| `high` | same as standard |
-| `maximum` | `ServiceUnavailableError` (HTTP 503) — request rejected |
-
-For `maximum` tier, the server enforces TPM availability as a hard requirement. If your server has no TPM and you request `maximum`, catch the error explicitly:
-
-```python
-from nomyo import ServiceUnavailableError
-
-try:
-    response = await client.create(..., security_tier="maximum")
-except ServiceUnavailableError as e:
-    print("Server does not meet TPM requirements for maximum tier:", e)
-```
-
 ## Compliance Considerations
 ### HIPAA Compliance
@@ -282,11 +207,9 @@ response = await client.create(
     messages=[{"role": "user", "content": "Hello"}]
 )
-print(response["_metadata"]) # Contains security_tier, memory_protection, tpm_attestation, etc.
+print(response["_metadata"]) # Contains security-related information
 ```
-See [Hardware Attestation](#hardware-attestation-tpm-20) for details on the `tpm_attestation` field.
-
 ### Logging
 Enable logging to see security operations:
diff --git a/nomyo/SecureCompletionClient.py b/nomyo/SecureCompletionClient.py
index 6aa5379..4c1d96d 100644
--- a/nomyo/SecureCompletionClient.py
+++ b/nomyo/SecureCompletionClient.py
@@ -1,5 +1,5 @@
 import asyncio, ctypes, json, base64, urllib.parse, httpx, os, secrets, sys, warnings, logging
-from typing import Dict, Any, Optional, Union
+from typing import Dict, Any, Optional
 from cryptography.hazmat.primitives import serialization, hashes
 from cryptography.hazmat.primitives.asymmetric import rsa, padding
 from cryptography.hazmat.backends import default_backend
diff --git a/nomyo/__init__.py b/nomyo/__init__.py
index 6fb55fe..5773045 100644
--- a/nomyo/__init__.py
+++ b/nomyo/__init__.py
@@ -51,6 +51,6 @@ try:
 except ImportError:
     pass
-__version__ = "0.2.7"
+__version__ = "0.2.6"
 __author__ = "NOMYO AI"
 __license__ = "Apache-2.0"
diff --git a/pyproject.toml b/pyproject.toml
index d0f08cb..5576a31 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "nomyo"
-version = "0.2.7"
+version = "0.2.6"
 description = "OpenAI-compatible secure chat client with end-to-end encryption for NOMYO Inference Endpoints"
 authors = [
     {name = "NOMYO.AI", email = "ichi@nomyo.ai"},