Compare commits
3 commits
| Author | SHA1 | Date |
|---|---|---|
| | 1330a1068f | |
| | e440a7c43b | |
| | 1fad4f15f0 | |
7 changed files with 140 additions and 4 deletions
11
README.md
@@ -10,6 +10,17 @@
 
 ## 🚀 Quick Start
 
+### 0. Try It Now (Demo Credentials)
+
+No account needed — use these public demo credentials to test immediately:
+
+| | |
+|---|---|
+| **API key** | `NOMYO_AI_E2EE_INFERENCE` |
+| **Model** | `Qwen/Qwen3-0.6B` |
+
+> **Note:** The demo endpoint uses a fixed 256-token context window and is intended for evaluation only.
+
 ### 1. Install methods
 
 via pip (recommended):
@@ -75,10 +75,30 @@ A dictionary containing the chat completion response with the following structur
         "prompt_tokens": int,
         "completion_tokens": int,
         "total_tokens": int
+    },
+    "_metadata": {
+        "payload_id": str,
+        "processed_at": int,          # Unix timestamp
+        "is_encrypted": bool,
+        "response_status": str,
+        "security_tier": str,         # "standard", "high", or "maximum"
+        "memory_protection": dict,    # server-side memory protection info
+        "cuda_device": dict,          # privacy-safe GPU info (hashed identifiers)
+        "tpm_attestation": {          # TPM 2.0 hardware attestation (see Security Guide)
+            "is_available": bool,
+            # Present only when is_available is True:
+            "pcr_banks": str,         # e.g. "sha256:0,7,10"
+            "pcr_values": dict,       # {bank: {pcr_index: hex_digest}}
+            "quote_b64": str,         # base64-encoded TPMS_ATTEST (signed by AIK)
+            "signature_b64": str,     # base64-encoded TPMT_SIGNATURE
+            "aik_pubkey_b64": str,    # base64-encoded TPM2B_PUBLIC (ephemeral AIK)
+        }
     }
 }
 ```
 
+The `_metadata` field is added by the client library and is not part of the OpenAI API response format. See the [Security Guide](security-guide.md) for how to interpret and verify `tpm_attestation`.
+
 #### acreate(model, messages, **kwargs)
 
 Async alias for create() method.
@@ -1,5 +1,33 @@
 # Getting Started
 
+## Try It Now (Demo Credentials)
+
+You can test the client immediately using these public demo credentials — no sign-up required:
+
+| | |
+|---|---|
+| **API key** | `NOMYO_AI_E2EE_INFERENCE` |
+| **Model** | `Qwen/Qwen3-0.6B` |
+
+> **Note:** The demo endpoint uses a fixed 256-token context window and is intended for evaluation only.
+
+```python
+import asyncio
+from nomyo import SecureChatCompletion
+
+async def main():
+    client = SecureChatCompletion(api_key="NOMYO_AI_E2EE_INFERENCE")
+
+    response = await client.create(
+        model="Qwen/Qwen3-0.6B",
+        messages=[{"role": "user", "content": "Hello!"}]
+    )
+
+    print(response['choices'][0]['message']['content'])
+
+asyncio.run(main())
+```
+
 ## Basic Usage
 
 The NOMYO client provides end-to-end encryption (E2E) for all communications between your application and the NOMYO inference endpoints. This ensures that your prompts and responses are protected from unauthorized access or interception.
@@ -162,6 +162,81 @@ Secure memory features:
 - Guarantees zeroing of sensitive memory
 - Prevents memory dumps from containing sensitive data
 
+## Hardware Attestation (TPM 2.0)
+
+### What it is
+
+When the server has a TPM 2.0 chip, every response includes a `tpm_attestation` block in `_metadata`. This is a cryptographically signed hardware quote proving:
+
+- Which firmware and Secure Boot state the server is running (PCR 0, 7)
+- Which application binary is running, when IMA is active (PCR 10)
+
+The quote is signed by an ephemeral AIK (Attestation Identity Key) generated fresh for each request and tied to the `payload_id` nonce, so it cannot be replayed for a different request.
+
+### Reading the attestation
+
+```python
+response = await client.create(
+    model="Qwen/Qwen3-0.6B",
+    messages=[{"role": "user", "content": "..."}],
+    security_tier="maximum"
+)
+
+tpm = response["_metadata"].get("tpm_attestation", {})
+
+if tpm.get("is_available"):
+    print("PCR banks:", tpm["pcr_banks"])      # e.g. "sha256:0,7,10"
+    print("PCR values:", tpm["pcr_values"])    # {bank: {index: hex}}
+    print("AIK key:", tpm["aik_pubkey_b64"][:32], "...")
+else:
+    print("TPM not available on this server")
+```
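As a small illustration of working with these fields, the sketch below flattens `pcr_values` into readable lines. It assumes the nested `{bank: {pcr_index: hex_digest}}` shape described in the API reference and that indices may arrive as strings or integers. The `PCR_MEANING` table and the `describe_pcrs` helper are hypothetical names used for illustration, not part of the `nomyo` client, and the digests are made up.

```python
from typing import Dict, List, Mapping, Union

# Labels follow the PCR descriptions in the text above.
# Hypothetical helper data, not a nomyo API.
PCR_MEANING: Dict[int, str] = {
    0: "firmware",
    7: "Secure Boot state",
    10: "application binary (IMA)",
}

def describe_pcrs(pcr_values: Mapping[str, Mapping[Union[str, int], str]]) -> List[str]:
    """Return one 'bank:index (meaning) = digest' line per reported PCR."""
    lines = []
    for bank, pcrs in pcr_values.items():
        # Sort numerically so that "10" comes after "7".
        for index in sorted(pcrs, key=int):
            label = PCR_MEANING.get(int(index), "unlabelled")
            lines.append(f"{bank}:{int(index)} ({label}) = {pcrs[index]}")
    return lines

# Made-up digests, for illustration only.
sample = {"sha256": {"0": "aa" * 32, "7": "bb" * 32, "10": "cc" * 32}}
for line in describe_pcrs(sample):
    print(line)
```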
+
+### Verifying the quote
+
+The response is self-contained: `aik_pubkey_b64` is the full public key of the AIK that signed the quote, so no separate key-fetch round-trip is needed.
+
+Verification steps using `tpm2-pytss`:
+
+```python
+import base64
+from tpm2_pytss.types import TPM2B_PUBLIC, TPMT_SIGNATURE, TPM2B_ATTEST
+
+# 1. Decode the quote components
+aik_pub = TPM2B_PUBLIC.unmarshal(base64.b64decode(tpm["aik_pubkey_b64"]))[0]
+quote   = TPM2B_ATTEST.unmarshal(base64.b64decode(tpm["quote_b64"]))[0]
+sig     = TPMT_SIGNATURE.unmarshal(base64.b64decode(tpm["signature_b64"]))[0]
+
+# 2. Verify the signature over the quote using the AIK public key
+#    (use a TPM ESAPI verify_signature call or an offline RSA verify)
+
+# 3. Inspect the qualifying_data inside the quote — it must match
+#    SHA-256(payload_id.encode())[:16] to confirm this quote is for this request
+
+# 4. Check pcr_values against your known-good baseline
+```
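Step 3 above can be sketched with the standard library alone. This is a minimal illustration that assumes the qualifying data has already been extracted from the decoded quote as raw bytes (in the TPM 2.0 structures it is carried in the `extraData` field of `TPMS_ATTEST`); that extraction is not shown. The function names are hypothetical and the `payload_id` value is invented.

```python
import hashlib
import hmac

def expected_qualifying_data(payload_id: str) -> bytes:
    """First 16 bytes of SHA-256 over the encoded payload_id, as in step 3."""
    return hashlib.sha256(payload_id.encode()).digest()[:16]

def nonce_matches(payload_id: str, qualifying_data: bytes) -> bool:
    """Compare in constant time; a length mismatch also yields False."""
    return hmac.compare_digest(expected_qualifying_data(payload_id), qualifying_data)

# Invented payload_id, for illustration only.
payload_id = "example-payload-id"
good = hashlib.sha256(payload_id.encode()).digest()[:16]
print(nonce_matches(payload_id, good))          # matching nonce
print(nonce_matches(payload_id, b"\x00" * 16))  # nonce from some other request
```

If the check fails, the quote should be treated as belonging to a different request, however valid its signature is.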
+
+> Full verification requires `tpm2-pytss` on the client side (`pip install tpm2-pytss` + `sudo apt install libtss2-dev`). It is optional — the attestation is informational unless your deployment policy requires verification.
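The baseline comparison in step 4 needs no TPM library. The sketch below assumes `pcr_values` and the baseline share the `{bank: {pcr_index: hex_digest}}` shape with string keys, and that hex digests should be compared case-insensitively. The `pcr_mismatches` helper and the sample digests are hypothetical.

```python
from typing import List, Mapping, Tuple

PcrValues = Mapping[str, Mapping[str, str]]

def pcr_mismatches(reported: PcrValues, baseline: PcrValues) -> List[Tuple[str, str, str, str]]:
    """Return (bank, index, expected, actual) for each baseline PCR that differs or is missing."""
    problems = []
    for bank, expected_pcrs in baseline.items():
        actual_pcrs = reported.get(bank, {})
        for index, expected in expected_pcrs.items():
            actual = actual_pcrs.get(index, "<missing>")
            if actual.lower() != expected.lower():
                problems.append((bank, index, expected, actual))
    return problems

# Made-up digests, for illustration only.
baseline = {"sha256": {"0": "aa" * 32, "7": "bb" * 32}}
reported = {"sha256": {"0": "aa" * 32, "7": "ff" * 32}}
for bank, index, expected, actual in pcr_mismatches(reported, baseline):
    print(f"PCR {bank}:{index} differs from baseline")
```

Only PCRs listed in the baseline are checked, so extra PCRs reported by the server are ignored rather than flagged.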
+
+### Behaviour per security tier
+
+| Tier | TPM unavailable |
+|------|----------------|
+| `standard` | `tpm_attestation: {"is_available": false}` — request proceeds |
+| `high` | same as standard |
+| `maximum` | `ServiceUnavailableError` (HTTP 503) — request rejected |
+
+For `maximum` tier, the server enforces TPM availability as a hard requirement. If your server has no TPM and you request `maximum`, catch the error explicitly:
+
+```python
+from nomyo import ServiceUnavailableError
+
+try:
+    response = await client.create(..., security_tier="maximum")
+except ServiceUnavailableError as e:
+    print("Server does not meet TPM requirements for maximum tier:", e)
+```
+
 ## Compliance Considerations
 
 ### HIPAA Compliance
@@ -207,9 +282,11 @@ response = await client.create(
     messages=[{"role": "user", "content": "Hello"}]
 )
 
-print(response["_metadata"])  # Contains security-related information
+print(response["_metadata"])  # Contains security_tier, memory_protection, tpm_attestation, etc.
 ```
 
+See [Hardware Attestation](#hardware-attestation-tpm-20) for details on the `tpm_attestation` field.
+
 ### Logging
 
 Enable logging to see security operations:
@@ -1,5 +1,5 @@
 import asyncio, ctypes, json, base64, urllib.parse, httpx, os, secrets, sys, warnings, logging
-from typing import Dict, Any, Optional
+from typing import Dict, Any, Optional, Union
 from cryptography.hazmat.primitives import serialization, hashes
 from cryptography.hazmat.primitives.asymmetric import rsa, padding
 from cryptography.hazmat.backends import default_backend
@@ -51,6 +51,6 @@ try:
 except ImportError:
     pass
 
-__version__ = "0.2.6"
+__version__ = "0.2.7"
 __author__ = "NOMYO AI"
 __license__ = "Apache-2.0"
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "nomyo"
-version = "0.2.6"
+version = "0.2.7"
 description = "OpenAI-compatible secure chat client with end-to-end encryption for NOMYO Inference Endpoints"
 authors = [
     {name = "NOMYO.AI", email = "ichi@nomyo.ai"},