nomyo4J/TRANSLATION_REFERENCE.md

2026-04-21 17:24:11 +02:00
# NOMYO Python Client — Translation Reference
> Target: Port this library to another language. Every class, method, signature, constant, wire format, and error mapping is documented below.
---
## 1. Package Layout
| File (relative to package root) | Purpose |
|---|---|
| `nomyo/__init__.py` | Public exports, version string |
| `nomyo/nomyo.py` | `SecureChatCompletion` — OpenAI-compatible entrypoint |
| `nomyo/SecureCompletionClient.py` | Key mgmt, hybrid encryption, HTTP roundtrip, retries |
| `nomyo/SecureMemory.py` | Cross-platform memory locking + secure zeroing (optional, platform-specific) |
**Python version:** `>= 3.10`
**Build:** `hatchling` (pyproject.toml)
**Dependencies:** `anyio`, `certifi`, `cffi`, `cryptography`, `exceptiongroup`, `h11`, `httpcore`, `httpx`, `idna`, `pycparser`, `typing_extensions`
---
## 2. Public API Surface (`__all__`)
| Export | Type | Source file |
|---|---|---|
| `SecureChatCompletion` | class | `nomyo.py` |
| `SecurityError` | exception | `SecureCompletionClient.py` |
| `APIError` | exception (base) | `SecureCompletionClient.py` |
| `AuthenticationError` | exception (401) | `SecureCompletionClient.py` |
| `InvalidRequestError` | exception (400) | `SecureCompletionClient.py` |
| `APIConnectionError` | exception (network) | `SecureCompletionClient.py` |
| `ForbiddenError` | exception (403) | `SecureCompletionClient.py` |
| `RateLimitError` | exception (429) | `SecureCompletionClient.py` |
| `ServerError` | exception (500) | `SecureCompletionClient.py` |
| `ServiceUnavailableError` | exception (503) | `SecureCompletionClient.py` |
| `get_memory_protection_info` | function | `SecureMemory.py` |
| `disable_secure_memory` | function | `SecureMemory.py` |
| `enable_secure_memory` | function | `SecureMemory.py` |
| `secure_bytearray` | context manager | `SecureMemory.py` |
| `secure_bytes` | context manager (deprecated) | `SecureMemory.py` |
| `SecureBuffer` | class | `SecureMemory.py` |
---
## 3. `SecureChatCompletion` (entrypoint)
### Constructor
```python
SecureChatCompletion(
base_url: str = "https://api.nomyo.ai",
allow_http: bool = False,
api_key: Optional[str] = None,
secure_memory: bool = True,
key_dir: Optional[str] = None,
max_retries: int = 2
)
```
| Param | Default | Description |
|---|---|---|
| `base_url` | `"https://api.nomyo.ai"` | NOMYO Router base URL. HTTPS enforced unless `allow_http=True`. |
| `allow_http` | `False` | Permit `http://` URLs (dev only). |
| `api_key` | `None` | Bearer token for auth. Can also be passed per-call via `create()`. |
| `secure_memory` | `True` | Enable memory locking/zeroing. Warns if unavailable. |
| `key_dir` | `None` | Directory to persist RSA keys. `None` = ephemeral (in-memory only). |
| `max_retries` | `2` | Retries on 429/500/502/503/504 + network errors. Exponential backoff: 1s, 2s, 4s… |
### `create(model, messages, **kwargs) -> Dict[str, Any]`
Async method. Returns a **dict** (not an object). Same signature as `openai.ChatCompletion.create()`.
| Param | Type | Required | Description |
|---|---|---|---|
| `model` | `str` | yes | Model identifier, e.g. `"Qwen/Qwen3-0.6B"` |
| `messages` | `List[Dict]` | yes | OpenAI-format messages: `[{"role": "user", "content": "..."}]` |
| `temperature` | `float` | no | 0 to 2 |
| `max_tokens` | `int` | no | |
| `top_p` | `float` | no | |
| `stop` | `str \| List[str]` | no | |
| `presence_penalty` | `float` | no | -2.0 to 2.0 |
| `frequency_penalty` | `float` | no | -2.0 to 2.0 |
| `n` | `int` | no | Number of completions |
| `best_of` | `int` | no | |
| `seed` | `int` | no | |
| `logit_bias` | `Dict[str, float]` | no | |
| `user` | `str` | no | |
| `tools` | `List[Dict]` | no | Tool definitions passed through to llama.cpp |
| `tool_choice` | `str` | no | `"auto"`, `"none"`, or specific tool name |
| `response_format` | `Dict` | no | `{"type": "json_object"}` or `{"type": "json_schema", ...}` |
| `stream` | `bool` | no | **NOT supported.** Server rejects with HTTP 400. Always use `False`. |
| `base_url` | `str` | no | Per-call override (creates temp client internally). |
| `security_tier` | `str` | no | `"standard"`, `"high"`, or `"maximum"`. Invalid values raise `ValueError`. |
| `api_key` | `str` | no | Per-call override of instance `api_key`. |
**Return value:** `Dict[str, Any]` — OpenAI-compatible response dict (see §6.2).
### `acreate(model, messages, **kwargs) -> Dict[str, Any]`
Async alias for `create()`. Identical behavior.
---
## 4. `SecureCompletionClient` (low-level)
### Constructor
```python
SecureCompletionClient(
router_url: str = "https://api.nomyo.ai",
allow_http: bool = False,
secure_memory: bool = True,
max_retries: int = 2
)
```
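The parameters have the same semantics as their `SecureChatCompletion` counterparts (`router_url` corresponds to `base_url`); `SecureChatCompletion` passes them through to this inner client.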
### Instance attributes
| Attribute | Type | Description |
|---|---|---|
| `router_url` | `str` | Base URL (trailing slash stripped). |
| `private_key` | `rsa.RSAPrivateKey \| None` | Loaded/generated RSA private key. |
| `public_key_pem` | `str \| None` | PEM-encoded public key string. |
| `key_size` | `int` | Always `4096`. |
| `allow_http` | `bool` | HTTP allowance flag. |
| `max_retries` | `int` | Retry count. |
| `_use_secure_memory` | `bool` | Whether secure memory ops are active. |
### `generate_keys(save_to_file: bool = False, key_dir: str = "client_keys", password: Optional[str] = None) -> None`
Generates a 4096-bit RSA key pair (public exponent `65537`). If `save_to_file=True`:
- Creates `key_dir/` (mode `0o755`).
- Writes `private_key.pem` with mode `0o600`.
- Writes `public_key.pem` with mode `0o644`.
- If `password` is given, private key is encrypted with `BestAvailableEncryption`.
### `load_keys(private_key_path: str, public_key_path: Optional[str] = None, password: Optional[str] = None) -> None`
Loads an RSA private key from disk. If `public_key_path` is omitted, derives the public key from the loaded private key. Validates key size >= 2048 bits.
### `fetch_server_public_key() -> str` (async)
`GET {router_url}/pki/public_key`
- Returns server PEM public key as string.
- Validates it parses as a valid PEM public key.
- Raises `SecurityError` if URL is not HTTPS and `allow_http=False`.
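
The scheme check can be sketched with the standard library alone. This is a minimal illustration, not the library's code: `check_url_scheme` and `public_key_endpoint` are hypothetical helper names, and the `SecurityError` class here is a local stand-in for the one exported by the package.

```python
from urllib.parse import urlparse


class SecurityError(Exception):
    """Local stand-in for the library's SecurityError."""


def check_url_scheme(url: str, allow_http: bool = False) -> str:
    """Return the URL without a trailing slash, or raise SecurityError.

    HTTPS is always accepted; plain HTTP only when allow_http is True.
    """
    scheme = urlparse(url).scheme.lower()
    if scheme == "https" or (scheme == "http" and allow_http):
        return url.rstrip("/")
    raise SecurityError(f"Refusing non-HTTPS URL: {url!r}")


def public_key_endpoint(router_url: str, allow_http: bool = False) -> str:
    """Build the GET target for the server public key."""
    return check_url_scheme(router_url, allow_http) + "/pki/public_key"
```
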
### `encrypt_payload(payload: Dict[str, Any]) -> bytes` (async)
Encrypts a dict payload using hybrid encryption. Returns raw encrypted bytes (JSON package, serialized to bytes).
**Encryption process:**
1. Serialize payload to JSON → `bytearray`.
2. Validate size <= 10 MB.
3. Generate 256-bit AES key via `secrets.token_bytes(32)` → `bytearray`.
4. If secure memory enabled: lock both payload and AES key in memory.
5. Call `_do_encrypt()` (see below).
6. Zero/destroy payload and AES key from memory on exit.
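
Steps 1 to 3 and the zeroing in step 6 need only the standard library and might look roughly like this. The helper names `prepare_payload` and `zero` are hypothetical, and raising `ValueError` for an oversized payload is an assumption, since the exception type is not stated above. Memory locking (step 4) and the actual encryption (step 5) are left out.

```python
import json
import secrets
from typing import Any, Dict, Tuple

MAX_PAYLOAD_SIZE = 10 * 1024 * 1024  # 10 MB limit from the constants table


def zero(buf: bytearray) -> None:
    """Overwrite a mutable buffer in place with zero bytes."""
    for i in range(len(buf)):
        buf[i] = 0


def prepare_payload(payload: Dict[str, Any]) -> Tuple[bytearray, bytearray]:
    """Serialize the payload and create a fresh 256-bit AES key.

    Both values are returned as bytearrays so the caller can zero them.
    """
    payload_bytes = bytearray(json.dumps(payload).encode("utf-8"))
    if len(payload_bytes) > MAX_PAYLOAD_SIZE:
        zero(payload_bytes)
        # Assumption: the exception type for oversized payloads is not documented.
        raise ValueError("payload exceeds the 10 MB limit")
    aes_key = bytearray(secrets.token_bytes(32))
    return payload_bytes, aes_key
```

A caller would wrap the encryption step in `try`/`finally` and call `zero()` on both buffers in the `finally` branch.
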
### `_do_encrypt(payload_bytes: bytes \| bytearray, aes_key: bytes \| bytearray) -> bytes` (async)
Core hybrid encryption routine. **This is the wire format constructor.**
```
1. nonce = secrets.token_bytes(12) # 96-bit GCM nonce
2. ciphertext = AES-256-GCM_encrypt(aes_key, nonce, payload_bytes)
3. tag = GCM_tag
4. server_pubkey = await fetch_server_public_key()
5. encrypted_aes_key = RSA-OAEP-SHA256_encrypt(server_pubkey, aes_key_bytes)
6. Build JSON package (see §5)
7. Return json.dumps(package).encode('utf-8')
```
### `decrypt_response(encrypted_response: bytes, payload_id: str) -> Dict[str, Any]` (async)
Decrypts a server response.
**Validation chain:**
1. Parse JSON.
2. Check `version == "1.0"` — raises `ValueError` if mismatch.
3. Check `algorithm == "hybrid-aes256-rsa4096"` — raises `ValueError` if mismatch.
4. Validate `encrypted_payload` has `ciphertext`, `nonce`, `tag`.
5. Require `self.private_key` is not `None`.
6. Decrypt AES key: `RSA-OAEP-SHA256_decrypt(private_key, encrypted_aes_key)`.
7. Decrypt payload: `AES-256-GCM_decrypt(aes_key, nonce, tag, ciphertext)`.
8. Parse decrypted bytes as JSON → response dict.
9. Attach `_metadata` if not present (see §6.2).
Any decryption failure (except JSON parse errors) raises `SecurityError("Decryption failed: integrity check or authentication failed")`.
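
Steps 1 to 4 of the validation chain can be sketched without any crypto dependency. `validate_envelope` is a hypothetical name, and using `ValueError` for a missing field is an assumption (the text above only specifies it for version and algorithm mismatches). Steps 5 to 8 require an RSA-OAEP and AES-GCM implementation and are not shown.

```python
import base64
import json
from typing import Dict

EXPECTED_VERSION = "1.0"
EXPECTED_ALGORITHM = "hybrid-aes256-rsa4096"


def validate_envelope(raw: bytes) -> Dict[str, bytes]:
    """Parse an encrypted response and return its base64-decoded parts."""
    package = json.loads(raw)
    if package.get("version") != EXPECTED_VERSION:
        raise ValueError("unsupported protocol version")
    if package.get("algorithm") != EXPECTED_ALGORITHM:
        raise ValueError("unsupported algorithm")
    encrypted_payload = package.get("encrypted_payload")
    if not isinstance(encrypted_payload, dict):
        raise ValueError("encrypted_payload is missing")
    for field in ("ciphertext", "nonce", "tag"):
        if field not in encrypted_payload:
            # Assumption: the exception type for a missing field is not documented.
            raise ValueError(f"encrypted_payload lacks {field!r}")
    if "encrypted_aes_key" not in package:
        raise ValueError("encrypted_aes_key is missing")
    parts = {name: base64.b64decode(encrypted_payload[name])
             for name in ("ciphertext", "nonce", "tag")}
    parts["encrypted_aes_key"] = base64.b64decode(package["encrypted_aes_key"])
    return parts
```
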
### `send_secure_request(payload, payload_id, api_key=None, security_tier=None) -> Dict[str, Any]` (async)
Full request lifecycle: encrypt → HTTP POST → retry → decrypt → return.
**Request headers:**
```
Content-Type: application/octet-stream
X-Payload-ID: {payload_id}
X-Public-Key: {url_encoded_pem_public_key}
Authorization: Bearer {api_key} (if api_key is provided)
X-Security-Tier: {tier} (if security_tier is provided)
```
**POST** to `{router_url}/v1/chat/secure_completion` with encrypted payload as body.
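
A port has to reproduce these headers exactly. The sketch below shows one way to assemble them; `build_headers` is a hypothetical helper, and the use of `urllib.parse.quote` with its default settings is an assumption about how the PEM is URL-encoded, so the exact encoding should be checked against the Python source.

```python
from typing import Dict, Optional
from urllib.parse import quote


def build_headers(
    payload_id: str,
    public_key_pem: str,
    api_key: Optional[str] = None,
    security_tier: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble the request headers; optional ones are added only when set."""
    headers = {
        "Content-Type": "application/octet-stream",
        "X-Payload-ID": payload_id,
        # Assumption: percent-encoding via quote(); removes newlines from the PEM.
        "X-Public-Key": quote(public_key_pem),
    }
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    if security_tier:
        headers["X-Security-Tier"] = security_tier
    return headers
```
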
**Retry logic:**
- Retryable status codes: `{429, 500, 502, 503, 504}`.
- Backoff: `2^(attempt-1)` seconds (1s, 2s, 4s…).
- Total attempts: `max_retries + 1`.
- Network errors also retry.
- Non-retryable exceptions propagate immediately.
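
The retry rules above can be sketched as a small wrapper. This is an illustration under assumptions: `RetryableError` is a hypothetical marker exception standing in for retryable status codes and network errors, and `with_retries` and `backoff_delays` are hypothetical names. The `sleep` parameter exists only so the delay can be replaced in tests.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical marker for failures that may be retried."""


def backoff_delays(max_retries: int) -> List[int]:
    """Delays in seconds before each retry: 2^(attempt-1)."""
    return [2 ** (attempt - 1) for attempt in range(1, max_retries + 1)]


async def with_retries(
    operation: Callable[[], Awaitable[T]],
    max_retries: int = 2,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Run operation up to max_retries + 1 times with exponential backoff."""
    for attempt in range(1, max_retries + 2):
        try:
            return await operation()
        except RetryableError:
            if attempt > max_retries:
                raise  # attempts exhausted: surface the last failure
            await sleep(2 ** (attempt - 1))
    raise RuntimeError("max_retries must be >= 0")
```

Any exception other than `RetryableError` leaves the loop immediately, which matches the rule that non-retryable exceptions propagate without further attempts.
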
**Status → exception mapping:**
| Status | Exception |
|---|---|
| 200 | Return decrypted response dict |
| 400 | `InvalidRequestError` |
| 401 | `AuthenticationError` |
| 403 | `ForbiddenError` |
| 404 | `APIError` |
| 429 | `RateLimitError` |
| 500 | `ServerError` |
| 503 | `ServiceUnavailableError` |
| 502/504 | `APIError` (retryable) |
| other | `APIError` (non-retryable) |
| network error | `APIConnectionError` |
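
The table translates naturally into a lookup with a fallback. The classes below are simplified local stand-ins for the library's exceptions (their real constructors may differ), and `error_for_status` and `is_retryable` are hypothetical helper names.

```python
from typing import Dict, Optional, Type


class APIError(Exception):
    def __init__(self, message: str, status_code: Optional[int] = None,
                 error_details: Optional[Dict] = None) -> None:
        super().__init__(message)
        self.message = message
        self.status_code = status_code
        self.error_details = error_details


class InvalidRequestError(APIError): pass
class AuthenticationError(APIError): pass
class ForbiddenError(APIError): pass
class RateLimitError(APIError): pass
class ServerError(APIError): pass
class ServiceUnavailableError(APIError): pass


STATUS_TO_EXCEPTION: Dict[int, Type[APIError]] = {
    400: InvalidRequestError,
    401: AuthenticationError,
    403: ForbiddenError,
    429: RateLimitError,
    500: ServerError,
    503: ServiceUnavailableError,
}
RETRYABLE_STATUS_CODES = frozenset({429, 500, 502, 503, 504})


def error_for_status(status_code: int, message: str = "",
                     error_details: Optional[Dict] = None) -> APIError:
    """Pick the exception class for a non-200 status; unmapped codes use APIError."""
    cls = STATUS_TO_EXCEPTION.get(status_code, APIError)
    return cls(message or f"HTTP {status_code}", status_code, error_details)


def is_retryable(status_code: int) -> bool:
    return status_code in RETRYABLE_STATUS_CODES
```
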
---
## 5. Encryption Wire Format
The encrypted package is a JSON object sent as `application/octet-stream`:
```json
{
"version": "1.0",
"algorithm": "hybrid-aes256-rsa4096",
"encrypted_payload": {
"ciphertext": "<base64>",
"nonce": "<base64>",
"tag": "<base64>"
},
"encrypted_aes_key": "<base64>",
"key_algorithm": "RSA-OAEP-SHA256",
"payload_algorithm": "AES-256-GCM"
}
```
| Field | Encoding | Description |
|---|---|---|
| `version` | string | Protocol version. **Never change** — used for downgrade detection. |
| `algorithm` | string | `"hybrid-aes256-rsa4096"`. **Never change** — used for downgrade detection. |
| `encrypted_payload.ciphertext` | base64 | AES-256-GCM encrypted payload. |
| `encrypted_payload.nonce` | base64 | 12-byte GCM nonce. |
| `encrypted_payload.tag` | base64 | 16-byte GCM authentication tag. |
| `encrypted_aes_key` | base64 | RSA-OAEP-SHA256 encrypted 32-byte AES key. |
| `key_algorithm` | string | `"RSA-OAEP-SHA256"` |
| `payload_algorithm` | string | `"AES-256-GCM"` |
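
As a sketch of how the package might be assembled from raw bytes: `build_package` and `split_ciphertext_and_tag` are hypothetical helpers, and standard base64 with padding is an assumption, since the table says only "base64". The split helper is relevant when the AEAD API in the target language returns ciphertext and tag concatenated (as the `AESGCM` class in Python's `cryptography` package does); whether the original client uses that API is not stated here.

```python
import base64
import json
from typing import Tuple

GCM_NONCE_SIZE = 12  # bytes
GCM_TAG_SIZE = 16    # bytes


def split_ciphertext_and_tag(sealed: bytes) -> Tuple[bytes, bytes]:
    """Split AEAD output of the form ciphertext || tag into its two parts."""
    if len(sealed) < GCM_TAG_SIZE:
        raise ValueError("sealed data is shorter than a GCM tag")
    return sealed[:-GCM_TAG_SIZE], sealed[-GCM_TAG_SIZE:]


def build_package(ciphertext: bytes, nonce: bytes, tag: bytes,
                  encrypted_aes_key: bytes) -> bytes:
    """Serialize the wire-format JSON package to UTF-8 bytes."""
    if len(nonce) != GCM_NONCE_SIZE:
        raise ValueError("nonce must be 12 bytes")
    if len(tag) != GCM_TAG_SIZE:
        raise ValueError("tag must be 16 bytes")

    def b64(data: bytes) -> str:
        # Assumption: standard alphabet with padding.
        return base64.b64encode(data).decode("ascii")

    package = {
        "version": "1.0",
        "algorithm": "hybrid-aes256-rsa4096",
        "encrypted_payload": {
            "ciphertext": b64(ciphertext),
            "nonce": b64(nonce),
            "tag": b64(tag),
        },
        "encrypted_aes_key": b64(encrypted_aes_key),
        "key_algorithm": "RSA-OAEP-SHA256",
        "payload_algorithm": "AES-256-GCM",
    }
    return json.dumps(package).encode("utf-8")
```
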
---
## 6. Data Structures
### 6.1 Encrypted Request Payload (before encryption)
The dict passed to `encrypt_payload()` has this structure:
```json
{
"model": "Qwen/Qwen3-0.6B",
"messages": [
{"role": "user", "content": "Hello"}
],
"temperature": 0.7,
"...": "any other OpenAI-compatible param"
}
```
**Important:** `api_key` is **never** included in the encrypted payload. It is sent only as the `Authorization: Bearer` HTTP header.
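
One way to keep `api_key` out of the encrypted payload is to separate transport-level arguments before building the dict. The sketch below is hypothetical: `split_call_kwargs` is an illustrative name, and treating `base_url` and `security_tier` as transport-only is an assumption (only the handling of `api_key` is stated above).

```python
from typing import Any, Dict, List, Tuple

# Assumption: besides api_key, these per-call options also stay out of the payload.
TRANSPORT_ONLY_KEYS = ("api_key", "base_url", "security_tier")


def split_call_kwargs(
    model: str, messages: List[Dict[str, Any]], **kwargs: Any
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (payload, transport) so secrets never enter the payload dict."""
    transport = {key: kwargs.pop(key) for key in TRANSPORT_ONLY_KEYS if key in kwargs}
    payload = {"model": model, "messages": messages, **kwargs}
    return payload, transport
```
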
### 6.2 Response Dict (after decryption)
```json
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1234567890,
"model": "Qwen/Qwen3-0.6B",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The capital of France is Paris.",
"tool_calls": [...],
"reasoning_content": "..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 20,
"total_tokens": 30
},
"_metadata": {
"payload_id": "openai-compat-abc123",
"processed_at": 1765250382,
"is_encrypted": true,
"encryption_algorithm": "hybrid-aes256-rsa4096",
"security_tier": "standard",
"memory_protection": {
"platform": "linux",
"memory_locking": true,
"secure_zeroing": true,
"core_dump_prevention": true
},
"cuda_device": {
"available": true,
"device_hash": "sha256_hex"
}
}
}
```
### 6.3 Security Tier Values
| Value | Hardware | Use case |
|---|---|---|
| `"standard"` | GPU | General secure inference |
| `"high"` | CPU/GPU | Sensitive business data |
| `"maximum"` | CPU only | PHI, classified data |
Sent as `X-Security-Tier` HTTP header. Invalid values raise `ValueError`.
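
A minimal sketch of the tier check, assuming it runs before the header is set; `validate_security_tier` is a hypothetical name. The comparison is case-sensitive, as noted in the constants table.

```python
from typing import Optional

VALID_SECURITY_TIERS = ("standard", "high", "maximum")


def validate_security_tier(tier: Optional[str]) -> Optional[str]:
    """Return the tier unchanged, or raise ValueError for unknown values."""
    if tier is None:
        return None  # no X-Security-Tier header is sent
    if tier not in VALID_SECURITY_TIERS:
        raise ValueError(
            f"Invalid security_tier {tier!r}; expected one of {VALID_SECURITY_TIERS}"
        )
    return tier
```
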
---
## 7. Error Class Hierarchy
All errors are exceptions. `APIError` subclasses carry `status_code` and `error_details`.
```
Exception
└── APIError (base, has message/status_code/error_details)
├── AuthenticationError (status_code=401)
├── InvalidRequestError (status_code=400)
├── RateLimitError (status_code=429)
├── ForbiddenError (status_code=403)
├── ServerError (status_code=500)
└── ServiceUnavailableError (status_code=503)
Exception
└── SecurityError (crypto/key failure, no status_code)
Exception
└── APIConnectionError (network failure, no status_code)
```
**APIError constructor:**
```python
APIError(message: str, status_code: Optional[int] = None, error_details: Optional[Dict] = None)
```
---
## 8. SecureMemory Module
Optional, platform-specific. Fails gracefully if unavailable (e.g. Windows on some Python builds).
### `SecureBuffer` class
Wraps a `bytearray` with memory locking and guaranteed zeroing on exit.
| Attribute/Method | Type | Description |
|---|---|---|
| `data` | `bytearray` | Underlying mutable buffer |
| `address` | `int` | Memory address (via ctypes) |
| `size` | `int` | Buffer size in bytes |
| `lock() -> bool` | method | Attempt memory lock |
| `unlock() -> bool` | method | Unlock memory |
| `zero()` | method | Securely zero contents |
| `__enter__` / `__exit__` | context mgr | Auto-lock on enter, auto-zero+unlock on exit |
### `secure_bytearray(data: bytes \| bytearray, lock: bool = True) -> SecureBuffer` (context manager)
Recommended secure handling. Converts input to `bytearray`, locks (best-effort), yields `SecureBuffer`. Always zeros on exit, even on exception.
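
The zero-on-exit guarantee can be illustrated with a pure-Python context manager. This is a simplified stand-in, not the library's implementation: `zeroing_bytearray` is a hypothetical name, it performs no memory locking, and it yields a plain `bytearray` rather than a `SecureBuffer`. Zeroing a caller-supplied `bytearray` in place (instead of a copy) is also an assumption.

```python
from contextlib import contextmanager
from typing import Iterator, Union


@contextmanager
def zeroing_bytearray(data: Union[bytes, bytearray]) -> Iterator[bytearray]:
    """Yield a mutable buffer and overwrite it with zeros on exit."""
    # bytes input is copied; a bytearray is used (and later zeroed) in place.
    buf = data if isinstance(data, bytearray) else bytearray(data)
    try:
        yield buf
    finally:
        # Runs on normal exit and when the with-body raises.
        for i in range(len(buf)):
            buf[i] = 0
```
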
### `secure_bytes(data: bytes, lock: bool = True) -> SecureBuffer` (context manager, **deprecated**)
Same as `secure_bytearray` but accepts immutable `bytes`. Emits deprecation warning. Original bytes cannot be zeroed.
### `get_memory_protection_info() -> Dict[str, Any]`
Returns protection capabilities:
```json
{
"enabled": true,
"platform": "linux",
"protection_level": "full",
"has_memory_locking": true,
"has_secure_zeroing": true,
"supports_full_protection": true,
"page_size": 4096
}
```
`protection_level` values: `"full"`, `"zeroing_only"`, `"none"`.
### `disable_secure_memory()` / `enable_secure_memory()`
Globally disable/re-enable secure memory operations.
---
## 9. Constants
| Constant | Value | Location | Notes |
|---|---|---|---|
| Protocol version | `"1.0"` | `SecureCompletionClient.py` | **Never change** — downgrade detection |
| Algorithm string | `"hybrid-aes256-rsa4096"` | `SecureCompletionClient.py` | **Never change** — downgrade detection |
| RSA key size | `4096` | `SecureCompletionClient.py` | Fixed |
| RSA public exponent | `65537` | `SecureCompletionClient.py` | Fixed |
| AES key size | `32` bytes (256-bit) | `SecureCompletionClient.py` | Per-request ephemeral |
| GCM nonce size | `12` bytes (96-bit) | `SecureCompletionClient.py` | Per-request via `secrets.token_bytes` |
| Max payload size | `10 * 1024 * 1024` (10 MB) | `SecureCompletionClient.py` | DoS protection |
| Default max retries | `2` | Both client classes | Exponential backoff: 1s, 2s, 4s… |
| Private key file mode | `0o600` | `SecureCompletionClient.py` | Owner read/write only |
| Public key file mode | `0o644` | `SecureCompletionClient.py` | Owner rw, group/others r |
| Min RSA key size (validation) | `2048` | `SecureCompletionClient.py` | `_validate_rsa_key` |
| Valid security tiers | `["standard", "high", "maximum"]` | `SecureCompletionClient.py` | Case-sensitive |
| Retryable status codes | `{429, 500, 502, 503, 504}` | `SecureCompletionClient.py` | |
| Package version | `"0.2.7"` | `pyproject.toml` + `__init__.py` | Bump both |
---
## 10. Endpoint URLs
| Endpoint | Method | Purpose |
|---|---|---|
| `{router_url}/pki/public_key` | GET | Fetch server RSA public key |
| `{router_url}/v1/chat/secure_completion` | POST | Encrypted chat completion |
---
## 11. Key Lifecycle
1. **First `create()` call** → `_ensure_keys()` runs (async, double-checked locking via `asyncio.Lock`).
2. If `key_dir` is set:
- Try `load_keys()` from `{key_dir}/private_key.pem` + `{key_dir}/public_key.pem`.
- If that fails → `generate_keys(save_to_file=True, key_dir=key_dir)`.
3. If `key_dir` is `None` → `generate_keys()` (ephemeral, in-memory only).
4. Keys are reused across all subsequent calls until the client is discarded.
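
The double-checked locking in step 1 can be sketched as follows. `KeyHolder` is a hypothetical class, and the `generate` callable stands in for the load-or-generate logic of steps 2 and 3; real key generation is not shown.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class KeyHolder:
    """Initializes a key at most once, even under concurrent first calls."""

    def __init__(self, generate: Callable[[], Awaitable[object]]) -> None:
        self._generate = generate
        self._key: Optional[object] = None
        self._lock = asyncio.Lock()

    async def ensure_keys(self) -> object:
        # First check: fast path without taking the lock.
        if self._key is not None:
            return self._key
        async with self._lock:
            # Second check: another task may have finished while we waited.
            if self._key is None:
                self._key = await self._generate()
            return self._key
```

Without the second check, several tasks that pass the first check while the key is still `None` would each generate their own key pair.
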
---
## 12. HTTP Client Details
- Uses `httpx.AsyncClient` with `timeout=60.0`.
- SSL verification enabled for HTTPS URLs; disabled for `http://`.
- Request body is raw bytes (not JSON) — `Content-Type: application/octet-stream`.
- Public key is URL-encoded in the `X-Public-Key` header.
---
## 13. Memory Protection Platform Matrix
| Platform | Locking | Zeroing |
|---|---|---|
| Linux | `mlock()` via `libc.so.6` | `memset()` via `libc.so.6` |
| Windows | `VirtualLock()` via `kernel32` | `RtlZeroMemory()` via `ntdll` + Python-level fallback |
| macOS | `mlock()` via `libc.dylib` | `memset()` via `libc.dylib` |
| Other | No lock | Python-level byte-by-byte zeroing |
`mlock()` may fail with `EPERM` when the process lacks `CAP_IPC_LOCK` or the `ulimit -l` limit is too low; in that case the module degrades gracefully to zeroing-only protection.