nomyo4J/TRANSLATION_REFERENCE.md
2026-04-21 17:24:11 +02:00

# NOMYO Python Client — Translation Reference
> Target: Port this library to another language. Every class, method, signature, constant, wire format, and error mapping is documented below.
---
## 1. Package Layout
| File (relative to package root) | Purpose |
|---|---|
| `nomyo/__init__.py` | Public exports, version string |
| `nomyo/nomyo.py` | `SecureChatCompletion` — OpenAI-compatible entrypoint |
| `nomyo/SecureCompletionClient.py` | Key mgmt, hybrid encryption, HTTP roundtrip, retries |
| `nomyo/SecureMemory.py` | Cross-platform memory locking + secure zeroing (optional, platform-specific) |
**Python version:** `>= 3.10`
**Build:** `hatchling` (pyproject.toml)
**Dependencies:** `anyio`, `certifi`, `cffi`, `cryptography`, `exceptiongroup`, `h11`, `httpcore`, `httpx`, `idna`, `pycparser`, `typing_extensions`
---
## 2. Public API Surface (`__all__`)
| Export | Type | Source file |
|---|---|---|
| `SecureChatCompletion` | class | `nomyo.py` |
| `SecurityError` | exception | `SecureCompletionClient.py` |
| `APIError` | exception (base) | `SecureCompletionClient.py` |
| `AuthenticationError` | exception (401) | `SecureCompletionClient.py` |
| `InvalidRequestError` | exception (400) | `SecureCompletionClient.py` |
| `APIConnectionError` | exception (network) | `SecureCompletionClient.py` |
| `ForbiddenError` | exception (403) | `SecureCompletionClient.py` |
| `RateLimitError` | exception (429) | `SecureCompletionClient.py` |
| `ServerError` | exception (500) | `SecureCompletionClient.py` |
| `ServiceUnavailableError` | exception (503) | `SecureCompletionClient.py` |
| `get_memory_protection_info` | function | `SecureMemory.py` |
| `disable_secure_memory` | function | `SecureMemory.py` |
| `enable_secure_memory` | function | `SecureMemory.py` |
| `secure_bytearray` | context manager | `SecureMemory.py` |
| `secure_bytes` | context manager (deprecated) | `SecureMemory.py` |
| `SecureBuffer` | class | `SecureMemory.py` |
---
## 3. `SecureChatCompletion` (entrypoint)
### Constructor
```python
SecureChatCompletion(
    base_url: str = "https://api.nomyo.ai",
    allow_http: bool = False,
    api_key: Optional[str] = None,
    secure_memory: bool = True,
    key_dir: Optional[str] = None,
    max_retries: int = 2
)
```
| Param | Default | Description |
|---|---|---|
| `base_url` | `"https://api.nomyo.ai"` | NOMYO Router base URL. HTTPS enforced unless `allow_http=True`. |
| `allow_http` | `False` | Permit `http://` URLs (dev only). |
| `api_key` | `None` | Bearer token for auth. Can also be passed per-call via `create()`. |
| `secure_memory` | `True` | Enable memory locking/zeroing. Warns if unavailable. |
| `key_dir` | `None` | Directory to persist RSA keys. `None` = ephemeral (in-memory only). |
| `max_retries` | `2` | Retries on 429/500/502/503/504 + network errors. Exponential backoff: 1s, 2s, 4s… |
### `create(model, messages, **kwargs) -> Dict[str, Any]`
Async method. Returns a **dict** (not an object). Same signature as `openai.ChatCompletion.create()`.
| Param | Type | Required | Description |
|---|---|---|---|
| `model` | `str` | yes | Model identifier, e.g. `"Qwen/Qwen3-0.6B"` |
| `messages` | `List[Dict]` | yes | OpenAI-format messages: `[{"role": "user", "content": "..."}]` |
| `temperature` | `float` | no | 0 to 2 |
| `max_tokens` | `int` | no | |
| `top_p` | `float` | no | |
| `stop` | `str \| List[str]` | no | |
| `presence_penalty` | `float` | no | -2.0 to 2.0 |
| `frequency_penalty` | `float` | no | -2.0 to 2.0 |
| `n` | `int` | no | Number of completions |
| `best_of` | `int` | no | |
| `seed` | `int` | no | |
| `logit_bias` | `Dict[str, float]` | no | |
| `user` | `str` | no | |
| `tools` | `List[Dict]` | no | Tool definitions passed through to llama.cpp |
| `tool_choice` | `str` | no | `"auto"`, `"none"`, or specific tool name |
| `response_format` | `Dict` | no | `{"type": "json_object"}` or `{"type": "json_schema", ...}` |
| `stream` | `bool` | no | **NOT supported.** Server rejects with HTTP 400. Always use `False`. |
| `base_url` | `str` | no | Per-call override (creates temp client internally). |
| `security_tier` | `str` | no | `"standard"`, `"high"`, or `"maximum"`. Invalid values raise `ValueError`. |
| `api_key` | `str` | no | Per-call override of instance `api_key`. |
**Return value:** `Dict[str, Any]` — OpenAI-compatible response dict (see §6.2).
### `acreate(model, messages, **kwargs) -> Dict[str, Any]`
Async alias for `create()`. Identical behavior.
---
## 4. `SecureCompletionClient` (low-level)
### Constructor
```python
SecureCompletionClient(
    router_url: str = "https://api.nomyo.ai",
    allow_http: bool = False,
    secure_memory: bool = True,
    max_retries: int = 2
)
```
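Parameters have the same semantics as the same-named `SecureChatCompletion` constructor parameters, with `router_url` corresponding to `base_url`. `SecureChatCompletion` passes these values directly to its inner client.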
### Instance attributes
| Attribute | Type | Description |
|---|---|---|
| `router_url` | `str` | Base URL (trailing slash stripped). |
| `private_key` | `rsa.RSAPrivateKey \| None` | Loaded/generated RSA private key. |
| `public_key_pem` | `str \| None` | PEM-encoded public key string. |
| `key_size` | `int` | Always `4096`. |
| `allow_http` | `bool` | HTTP allowance flag. |
| `max_retries` | `int` | Retry count. |
| `_use_secure_memory` | `bool` | Whether secure memory ops are active. |
### `generate_keys(save_to_file: bool = False, key_dir: str = "client_keys", password: Optional[str] = None) -> None`
Generates a 4096-bit RSA key pair (public exponent `65537`). If `save_to_file=True`:
- Creates `key_dir/` with mode `0o755`.
- Writes `private_key.pem` with mode `0o600`.
- Writes `public_key.pem` with mode `0o644`.
- If `password` is given, private key is encrypted with `BestAvailableEncryption`.
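
The file handling in the list above can be sketched with the standard library alone. This is a minimal sketch, not the library's code: `write_key_files` is a hypothetical helper, and the PEM bytes are assumed to have been serialized already by the RSA key generation step.

```python
import os


def write_key_files(private_pem: bytes, public_pem: bytes, key_dir: str = "client_keys") -> None:
    """Hypothetical helper: persist already-serialized PEM keys with the documented modes."""
    os.makedirs(key_dir, mode=0o755, exist_ok=True)
    for name, data, mode in (
        ("private_key.pem", private_pem, 0o600),
        ("public_key.pem", public_pem, 0o644),
    ):
        path = os.path.join(key_dir, name)
        # Pass the mode at creation time so a new private key file is never world-readable.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        # Enforce the exact mode in case the file already existed or the umask altered it.
        os.chmod(path, mode)
```

A port to a platform without POSIX permission bits will need its own equivalent (for example ACLs); the modes above only carry meaning on Unix-like systems.
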
### `load_keys(private_key_path: str, public_key_path: Optional[str] = None, password: Optional[str] = None) -> None`
Loads an RSA private key from disk. If `public_key_path` is omitted, derives the public key from the loaded private key. Validates key size >= 2048 bits.
### `fetch_server_public_key() -> str` (async)
`GET {router_url}/pki/public_key`
- Returns server PEM public key as string.
- Validates it parses as a valid PEM public key.
- Raises `SecurityError` if URL is not HTTPS and `allow_http=False`.
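
The HTTPS rule can be checked before any request is made. The following is a minimal sketch under stated assumptions: `check_router_url` is a hypothetical name, the local `SecurityError` class only stands in for the library's exception, and the real client may perform this check at a different point.

```python
from urllib.parse import urlparse


class SecurityError(Exception):
    """Local stand-in for the library's SecurityError."""


def check_router_url(router_url: str, allow_http: bool = False) -> str:
    """Hypothetical check: reject non-HTTPS URLs unless HTTP is explicitly allowed."""
    scheme = urlparse(router_url).scheme.lower()
    if scheme == "https" or (scheme == "http" and allow_http):
        # The documented `router_url` attribute has its trailing slash stripped.
        return router_url.rstrip("/")
    raise SecurityError(f"Refusing non-HTTPS router URL: {router_url!r}")
```
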
### `encrypt_payload(payload: Dict[str, Any]) -> bytes` (async)
Encrypts a dict payload using hybrid encryption. Returns raw encrypted bytes (JSON package, serialized to bytes).
**Encryption process:**
1. Serialize payload to JSON → `bytearray`.
2. Validate size <= 10 MB.
3. Generate 256-bit AES key via `secrets.token_bytes(32)` → `bytearray`.
4. If secure memory enabled: lock both payload and AES key in memory.
5. Call `_do_encrypt()` (see below).
6. Zero/destroy payload and AES key from memory on exit.
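
Steps 1 to 3 and the final zeroing can be sketched without any crypto dependency. This is an illustration only: `prepare_plaintext` and `wipe` are hypothetical names, raising `ValueError` for oversized payloads is an assumption (the exception type is not documented here), and memory locking (step 4) is left out.

```python
import json
import secrets
from typing import Any, Dict, Tuple

MAX_PAYLOAD_SIZE = 10 * 1024 * 1024  # 10 MB, see §9


def wipe(buf: bytearray) -> None:
    """Overwrite a mutable buffer in place (Python-level zeroing only)."""
    buf[:] = bytes(len(buf))


def prepare_plaintext(payload: Dict[str, Any]) -> Tuple[bytearray, bytearray]:
    """Hypothetical helper: serialize the payload and create a fresh AES-256 key."""
    # Note: the intermediate str/bytes objects created here are immutable and
    # cannot be zeroed; only the returned bytearrays can be wiped later.
    payload_bytes = bytearray(json.dumps(payload).encode("utf-8"))
    if len(payload_bytes) > MAX_PAYLOAD_SIZE:
        wipe(payload_bytes)
        raise ValueError("Payload exceeds the 10 MB limit")  # exception type is an assumption
    aes_key = bytearray(secrets.token_bytes(32))  # 256-bit per-request key
    return payload_bytes, aes_key
```

The caller is expected to call `wipe()` on both buffers in a `finally` block once encryption has finished, mirroring step 6.
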
### `_do_encrypt(payload_bytes: bytes \| bytearray, aes_key: bytes \| bytearray) -> bytes` (async)
Core hybrid encryption routine. **This is the wire format constructor.**
```
1. nonce = secrets.token_bytes(12) # 96-bit GCM nonce
2. ciphertext = AES-256-GCM_encrypt(aes_key, nonce, payload_bytes)
3. tag = GCM_tag
4. server_pubkey = await fetch_server_public_key()
5. encrypted_aes_key = RSA-OAEP-SHA256_encrypt(server_pubkey, aes_key_bytes)
6. Build JSON package (see §5)
7. Return json.dumps(package).encode('utf-8')
```
### `decrypt_response(encrypted_response: bytes, payload_id: str) -> Dict[str, Any]` (async)
Decrypts a server response.
**Validation chain:**
1. Parse JSON.
2. Check `version == "1.0"` — raises `ValueError` if mismatch.
3. Check `algorithm == "hybrid-aes256-rsa4096"` — raises `ValueError` if mismatch.
4. Validate `encrypted_payload` has `ciphertext`, `nonce`, `tag`.
5. Require `self.private_key` is not `None`.
6. Decrypt AES key: `RSA-OAEP-SHA256_decrypt(private_key, encrypted_aes_key)`.
7. Decrypt payload: `AES-256-GCM_decrypt(aes_key, nonce, tag, ciphertext)`.
8. Parse decrypted bytes as JSON → response dict.
9. Attach `_metadata` if not present (see §6.2).
Any decryption failure (except JSON parse errors) raises `SecurityError("Decryption failed: integrity check or authentication failed")`.
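
Steps 1 to 4 of the validation chain need no cryptography and can be ported first. A minimal sketch follows; `validate_package` is a hypothetical name, and raising `ValueError` for a missing field is an assumption, since only the version and algorithm mismatches are documented as `ValueError`.

```python
import json
from typing import Any, Dict

EXPECTED_VERSION = "1.0"
EXPECTED_ALGORITHM = "hybrid-aes256-rsa4096"


def validate_package(encrypted_response: bytes) -> Dict[str, Any]:
    """Hypothetical helper: run the structural checks that precede decryption."""
    package = json.loads(encrypted_response.decode("utf-8"))
    if not isinstance(package, dict):
        raise ValueError("Encrypted response must be a JSON object")
    # Exact string comparison: these two fields are used for downgrade detection.
    if package.get("version") != EXPECTED_VERSION:
        raise ValueError(f"Unsupported protocol version: {package.get('version')!r}")
    if package.get("algorithm") != EXPECTED_ALGORITHM:
        raise ValueError(f"Unsupported algorithm: {package.get('algorithm')!r}")
    inner = package.get("encrypted_payload")
    if not isinstance(inner, dict):
        raise ValueError("Missing encrypted_payload object")
    missing = [name for name in ("ciphertext", "nonce", "tag") if name not in inner]
    if missing:
        raise ValueError(f"encrypted_payload is missing fields: {missing}")
    return package
```
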
### `send_secure_request(payload, payload_id, api_key=None, security_tier=None) -> Dict[str, Any]` (async)
Full request lifecycle: encrypt → HTTP POST → retry → decrypt → return.
**Request headers:**
```
Content-Type: application/octet-stream
X-Payload-ID: {payload_id}
X-Public-Key: {url_encoded_pem_public_key}
Authorization: Bearer {api_key} (if api_key is provided)
X-Security-Tier: {tier} (if security_tier is provided)
```
**POST** to `{router_url}/v1/chat/secure_completion` with encrypted payload as body.
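
Header construction can be sketched as follows. `build_headers` is a hypothetical helper, and the use of `urllib.parse.quote` is an assumption: this reference only says the PEM key is URL-encoded, not which encoder or safe-character set the client uses.

```python
from typing import Dict, Optional
from urllib.parse import quote


def build_headers(
    payload_id: str,
    public_key_pem: str,
    api_key: Optional[str] = None,
    security_tier: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: assemble the request headers listed above."""
    headers = {
        "Content-Type": "application/octet-stream",
        "X-Payload-ID": payload_id,
        # PEM contains newlines, which are illegal in header values, hence the encoding.
        "X-Public-Key": quote(public_key_pem),
    }
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    if security_tier:
        headers["X-Security-Tier"] = security_tier
    return headers
```
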
**Retry logic:**
- Retryable status codes: `{429, 500, 502, 503, 504}`.
- Backoff: `2^(attempt-1)` seconds (1s, 2s, 4s…).
- Total attempts: `max_retries + 1`.
- Network errors also retry.
- Non-retryable exceptions propagate immediately.
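
The retry rules above translate into a small loop. This sketch is an illustration, not the client's code: `RetryableError`, `backoff_delays`, and `with_retries` are hypothetical names, and the injectable `sleep` parameter exists only to make the loop testable.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Stand-in for a retryable status code or a network error."""


def backoff_delays(max_retries: int) -> List[int]:
    """Delay in seconds before each retry: 2^(attempt-1) for attempt = 1..max_retries."""
    return [2 ** (attempt - 1) for attempt in range(1, max_retries + 1)]


async def with_retries(
    op: Callable[[], Awaitable[T]],
    max_retries: int = 2,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Run `op` up to max_retries + 1 times, sleeping between attempts."""
    for attempt in range(1, max_retries + 2):
        try:
            return await op()
        except RetryableError:
            if attempt > max_retries:
                raise  # attempts exhausted: surface the last error
            await sleep(2 ** (attempt - 1))
        # Any other exception type propagates immediately, without retry.
    raise RuntimeError("max_retries must be >= 0")
```

With the default `max_retries=2` this yields three attempts separated by waits of 1 s and 2 s; the 4 s delay only appears when `max_retries` is 3 or more.
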
**Status → exception mapping:**
| Status | Exception |
|---|---|
| 200 | Return decrypted response dict |
| 400 | `InvalidRequestError` |
| 401 | `AuthenticationError` |
| 403 | `ForbiddenError` |
| 404 | `APIError` |
| 429 | `RateLimitError` |
| 500 | `ServerError` |
| 503 | `ServiceUnavailableError` |
| 502/504 | `APIError` (retryable) |
| other | `APIError` (non-retryable) |
| network error | `APIConnectionError` |
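
The table can be encoded as a lookup plus the retryable set. The sketch below returns exception names as strings so that it stays independent of any class definitions; `classify_status` is a hypothetical helper.

```python
from typing import Optional, Tuple

RETRYABLE_STATUS_CODES = frozenset({429, 500, 502, 503, 504})

_STATUS_TO_ERROR = {
    400: "InvalidRequestError",
    401: "AuthenticationError",
    403: "ForbiddenError",
    429: "RateLimitError",
    500: "ServerError",
    503: "ServiceUnavailableError",
}


def classify_status(status: int) -> Tuple[Optional[str], bool]:
    """Return (exception name or None for success, whether the status is retryable)."""
    if status == 200:
        return None, False
    # Everything without a dedicated class (404, 502, 504, other) maps to APIError.
    return _STATUS_TO_ERROR.get(status, "APIError"), status in RETRYABLE_STATUS_CODES
```
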
---
## 5. Encryption Wire Format
The encrypted package is a JSON object sent as `application/octet-stream`:
```json
{
  "version": "1.0",
  "algorithm": "hybrid-aes256-rsa4096",
  "encrypted_payload": {
    "ciphertext": "<base64>",
    "nonce": "<base64>",
    "tag": "<base64>"
  },
  "encrypted_aes_key": "<base64>",
  "key_algorithm": "RSA-OAEP-SHA256",
  "payload_algorithm": "AES-256-GCM"
}
```
| Field | Encoding | Description |
|---|---|---|
| `version` | string | Protocol version. **Never change** — used for downgrade detection. |
| `algorithm` | string | `"hybrid-aes256-rsa4096"`. **Never change** — used for downgrade detection. |
| `encrypted_payload.ciphertext` | base64 | AES-256-GCM encrypted payload. |
| `encrypted_payload.nonce` | base64 | 12-byte GCM nonce. |
| `encrypted_payload.tag` | base64 | 16-byte GCM authentication tag. |
| `encrypted_aes_key` | base64 | RSA-OAEP-SHA256 encrypted 32-byte AES key. |
| `key_algorithm` | string | `"RSA-OAEP-SHA256"` |
| `payload_algorithm` | string | `"AES-256-GCM"` |
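
Assembling the package from already-computed cryptographic outputs can be sketched as follows. `build_package` is a hypothetical helper, and the standard base64 alphabet (as opposed to the URL-safe one) is an assumption, since this reference only says "base64".

```python
import base64
import json


def _b64(raw: bytes) -> str:
    return base64.b64encode(raw).decode("ascii")


def build_package(ciphertext: bytes, nonce: bytes, tag: bytes, encrypted_aes_key: bytes) -> bytes:
    """Hypothetical helper: serialize the wire-format package to UTF-8 JSON bytes."""
    if len(nonce) != 12:
        raise ValueError("GCM nonce must be 12 bytes")
    if len(tag) != 16:
        raise ValueError("GCM tag must be 16 bytes")
    package = {
        "version": "1.0",
        "algorithm": "hybrid-aes256-rsa4096",
        "encrypted_payload": {
            "ciphertext": _b64(ciphertext),
            "nonce": _b64(nonce),
            "tag": _b64(tag),
        },
        "encrypted_aes_key": _b64(encrypted_aes_key),
        "key_algorithm": "RSA-OAEP-SHA256",
        "payload_algorithm": "AES-256-GCM",
    }
    return json.dumps(package).encode("utf-8")
```

Many AES-GCM APIs return the ciphertext with the 16-byte tag appended; in that case a port has to split off the last 16 bytes before filling the separate `ciphertext` and `tag` fields.
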
---
## 6. Data Structures
### 6.1 Encrypted Request Payload (before encryption)
The dict passed to `encrypt_payload()` has this structure:
```json
{
  "model": "Qwen/Qwen3-0.6B",
  "messages": [
    {"role": "user", "content": "Hello"}
  ],
  "temperature": 0.7,
  "...": "any other OpenAI-compatible param"
}
```
**Important:** `api_key` is **never** included in the encrypted payload. It is sent only as the `Authorization: Bearer` HTTP header.
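
One way to guarantee this separation is to split the call arguments before building the payload. The sketch below is an assumption about structure, not the library's code: `split_call_kwargs` is a hypothetical helper, and treating `base_url` and `security_tier` as transport-only alongside `api_key` is inferred from their descriptions in §3.

```python
from typing import Any, Dict, List, Tuple

# Arguments that steer the request but must not end up in the encrypted body.
TRANSPORT_ONLY = ("api_key", "base_url", "security_tier")


def split_call_kwargs(
    model: str, messages: List[Dict[str, Any]], **kwargs: Any
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (payload to encrypt, transport options sent as headers or used for routing)."""
    transport = {name: kwargs.pop(name) for name in TRANSPORT_ONLY if name in kwargs}
    payload = {"model": model, "messages": messages, **kwargs}
    return payload, transport
```
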
### 6.2 Response Dict (after decryption)
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "Qwen/Qwen3-0.6B",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris.",
        "tool_calls": [...],
        "reasoning_content": "..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  },
  "_metadata": {
    "payload_id": "openai-compat-abc123",
    "processed_at": 1765250382,
    "is_encrypted": true,
    "encryption_algorithm": "hybrid-aes256-rsa4096",
    "security_tier": "standard",
    "memory_protection": {
      "platform": "linux",
      "memory_locking": true,
      "secure_zeroing": true,
      "core_dump_prevention": true
    },
    "cuda_device": {
      "available": true,
      "device_hash": "sha256_hex"
    }
  }
}
```
### 6.3 Security Tier Values
| Value | Hardware | Use case |
|---|---|---|
| `"standard"` | GPU | General secure inference |
| `"high"` | CPU/GPU | Sensitive business data |
| `"maximum"` | CPU only | PHI, classified data |
Sent as `X-Security-Tier` HTTP header. Invalid values raise `ValueError`.
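
Tier validation is a plain membership test. A minimal sketch, with `validate_security_tier` as a hypothetical name; the comparison is case-sensitive, as noted in §9.

```python
from typing import Optional

VALID_SECURITY_TIERS = ("standard", "high", "maximum")


def validate_security_tier(tier: Optional[str]) -> Optional[str]:
    """Return the tier unchanged, or raise ValueError for anything outside the valid set."""
    if tier is None:
        return None  # no tier requested: the header is simply omitted
    if tier not in VALID_SECURITY_TIERS:
        raise ValueError(
            f"Invalid security_tier {tier!r}; expected one of {VALID_SECURITY_TIERS}"
        )
    return tier
```
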
---
## 7. Error Class Hierarchy
All errors are exceptions. `APIError` subclasses carry `status_code` and `error_details`.
```
Exception
└── APIError (base, has message/status_code/error_details)
    ├── AuthenticationError (status_code=401)
    ├── InvalidRequestError (status_code=400)
    ├── RateLimitError (status_code=429)
    ├── ForbiddenError (status_code=403)
    ├── ServerError (status_code=500)
    └── ServiceUnavailableError (status_code=503)
Exception
└── SecurityError (crypto/key failure, no status_code)
Exception
└── APIConnectionError (network failure, no status_code)
```
**APIError constructor:**
```python
APIError(message: str, status_code: Optional[int] = None, error_details: Optional[Dict] = None)
```
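
The hierarchy can be reproduced in a few lines. This is one possible design, not necessarily how the library sets the codes: the `default_status` class attribute is an assumption introduced here so that each subclass carries its status without repeating the constructor.

```python
from typing import Any, Dict, Optional


class APIError(Exception):
    default_status: Optional[int] = None  # hypothetical mechanism, see lead-in

    def __init__(
        self,
        message: str,
        status_code: Optional[int] = None,
        error_details: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.status_code = status_code if status_code is not None else self.default_status
        self.error_details = error_details


class AuthenticationError(APIError):
    default_status = 401


class InvalidRequestError(APIError):
    default_status = 400


class RateLimitError(APIError):
    default_status = 429


class ForbiddenError(APIError):
    default_status = 403


class ServerError(APIError):
    default_status = 500


class ServiceUnavailableError(APIError):
    default_status = 503


class SecurityError(Exception):
    """Crypto/key failure; deliberately not an APIError."""


class APIConnectionError(Exception):
    """Network failure; deliberately not an APIError."""
```

Because `SecurityError` and `APIConnectionError` sit outside `APIError`, a handler that catches only `APIError` will not swallow crypto or network failures; a port should preserve that separation.
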
---
## 8. SecureMemory Module
Optional and platform-specific. Fails gracefully if unavailable (e.g. on Windows with some Python builds).
### `SecureBuffer` class
Wraps a `bytearray` with memory locking and guaranteed zeroing on exit.
| Attribute/Method | Type | Description |
|---|---|---|
| `data` | `bytearray` | Underlying mutable buffer |
| `address` | `int` | Memory address (via ctypes) |
| `size` | `int` | Buffer size in bytes |
| `lock() -> bool` | method | Attempt memory lock |
| `unlock() -> bool` | method | Unlock memory |
| `zero()` | method | Securely zero contents |
| `__enter__` / `__exit__` | context mgr | Auto-lock on enter, auto-zero+unlock on exit |
### `secure_bytearray(data: bytes \| bytearray, lock: bool = True) -> SecureBuffer` (context manager)
Recommended secure handling. Converts input to `bytearray`, locks (best-effort), yields `SecureBuffer`. Always zeros on exit, even on exception.
### `secure_bytes(data: bytes, lock: bool = True) -> SecureBuffer` (context manager, **deprecated**)
Same as `secure_bytearray` but accepts immutable `bytes`. Emits deprecation warning. Original bytes cannot be zeroed.
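
The zero-on-exit guarantee is the part most easily carried over to other languages. The sketch below shows only that part: `zeroing_bytearray` is a hypothetical, simplified stand-in that yields a plain `bytearray` instead of a `SecureBuffer`, performs no memory locking, and wraps an existing `bytearray` in place (whether the library copies or wraps is not stated here).

```python
from contextlib import contextmanager
from typing import Iterator, Union


@contextmanager
def zeroing_bytearray(data: Union[bytes, bytearray]) -> Iterator[bytearray]:
    """Yield a mutable buffer and overwrite it with zeros on exit, even on exception."""
    # An immutable `bytes` input must be copied; that original cannot be zeroed.
    buf = data if isinstance(data, bytearray) else bytearray(data)
    try:
        yield buf
    finally:
        for i in range(len(buf)):
            buf[i] = 0
```
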
### `get_memory_protection_info() -> Dict[str, Any]`
Returns protection capabilities:
```json
{
  "enabled": true,
  "platform": "linux",
  "protection_level": "full",
  "has_memory_locking": true,
  "has_secure_zeroing": true,
  "supports_full_protection": true,
  "page_size": 4096
}
```
`protection_level` values: `"full"`, `"zeroing_only"`, `"none"`.
### `disable_secure_memory()` / `enable_secure_memory()`
Globally disable/re-enable secure memory operations.
---
## 9. Constants
| Constant | Value | Location | Notes |
|---|---|---|---|
| Protocol version | `"1.0"` | `SecureCompletionClient.py` | **Never change** — downgrade detection |
| Algorithm string | `"hybrid-aes256-rsa4096"` | `SecureCompletionClient.py` | **Never change** — downgrade detection |
| RSA key size | `4096` | `SecureCompletionClient.py` | Fixed |
| RSA public exponent | `65537` | `SecureCompletionClient.py` | Fixed |
| AES key size | `32` bytes (256-bit) | `SecureCompletionClient.py` | Per-request ephemeral |
| GCM nonce size | `12` bytes (96-bit) | `SecureCompletionClient.py` | Per-request via `secrets.token_bytes` |
| Max payload size | `10 * 1024 * 1024` (10 MB) | `SecureCompletionClient.py` | DoS protection |
| Default max retries | `2` | Both client classes | Exponential backoff: 1s, 2s, 4s… |
| Private key file mode | `0o600` | `SecureCompletionClient.py` | Owner read/write only |
| Public key file mode | `0o644` | `SecureCompletionClient.py` | Owner rw, group/others r |
| Min RSA key size (validation) | `2048` | `SecureCompletionClient.py` | `_validate_rsa_key` |
| Valid security tiers | `["standard", "high", "maximum"]` | `SecureCompletionClient.py` | Case-sensitive |
| Retryable status codes | `{429, 500, 502, 503, 504}` | `SecureCompletionClient.py` | |
| Package version | `"0.2.7"` | `pyproject.toml` + `__init__.py` | Bump both |
---
## 10. Endpoint URLs
| Endpoint | Method | Purpose |
|---|---|---|
| `{router_url}/pki/public_key` | GET | Fetch server RSA public key |
| `{router_url}/v1/chat/secure_completion` | POST | Encrypted chat completion |
---
## 11. Key Lifecycle
1. **First `create()` call** → `_ensure_keys()` runs (async, double-checked locking via `asyncio.Lock`).
2. If `key_dir` is set:
   - Try `load_keys()` from `{key_dir}/private_key.pem` + `{key_dir}/public_key.pem`.
   - If that fails → `generate_keys(save_to_file=True, key_dir=key_dir)`.
3. If `key_dir` is `None` → `generate_keys()` (ephemeral, in-memory only).
4. Keys are reused across all subsequent calls until the client is discarded.
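
The double-checked locking in step 1 can be sketched as below. `KeyHolder` and its `generations` counter are hypothetical and exist only for illustration; the `asyncio.sleep(0)` call stands in for the real load-or-generate work.

```python
import asyncio
from typing import Optional


class KeyHolder:
    """Hypothetical stand-in for the client's key state."""

    def __init__(self) -> None:
        self.private_key: Optional[object] = None
        self.generations = 0  # counts how often keys were created (illustration only)
        self._lock = asyncio.Lock()

    async def _ensure_keys(self) -> None:
        if self.private_key is not None:  # first check: fast path without the lock
            return
        async with self._lock:
            if self.private_key is not None:  # second check: another task got here first
                return
            await asyncio.sleep(0)  # stand-in for load_keys() / generate_keys()
            self.generations += 1
            self.private_key = object()
```

Without the second check, every task that was waiting on the lock would generate its own key pair after acquiring it. Languages with real threads need a mutex plus an appropriate memory barrier instead of `asyncio.Lock`.
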
---
## 12. HTTP Client Details
- Uses `httpx.AsyncClient` with `timeout=60.0`.
- SSL verification enabled for HTTPS URLs; disabled for `http://`.
- Request body is raw bytes (not JSON) — `Content-Type: application/octet-stream`.
- Public key is URL-encoded in the `X-Public-Key` header.
---
## 13. Memory Protection Platform Matrix
| Platform | Locking | Zeroing |
|---|---|---|
| Linux | `mlock()` via `libc.so.6` | `memset()` via `libc.so.6` |
| Windows | `VirtualLock()` via `kernel32` | `RtlZeroMemory()` via `ntdll` + Python-level fallback |
| macOS | `mlock()` via `libc.dylib` | `memset()` via `libc.dylib` |
| Other | No lock | Python-level byte-by-byte zeroing |
`mlock()` may fail with `EPERM` when the process lacks `CAP_IPC_LOCK` or the `ulimit -l` limit is too low. In that case protection degrades gracefully to zeroing only.
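
The best-effort behavior on Linux and macOS can be sketched with `ctypes`. This is an assumption-laden illustration, not the module's code: `try_mlock` and `python_zero` are hypothetical names, the C library is located via `ctypes.util.find_library` rather than by the fixed names in the table, and Windows (`VirtualLock`) is not covered.

```python
import ctypes
import ctypes.util
import sys


def try_mlock(buf: bytearray) -> bool:
    """Best-effort mlock() of a bytearray's memory; returns False instead of raising."""
    if not buf or sys.platform.startswith("win"):
        return False
    try:
        libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
        # Borrow the bytearray's buffer to obtain its address without copying.
        view = (ctypes.c_char * len(buf)).from_buffer(buf)
        rc = libc.mlock(ctypes.c_void_p(ctypes.addressof(view)), ctypes.c_size_t(len(buf)))
    except (OSError, AttributeError, TypeError, ValueError):
        return False
    # A non-zero result covers EPERM and memory-lock limit failures.
    return rc == 0


def python_zero(buf: bytearray) -> None:
    """Python-level byte-by-byte zeroing, the fallback listed for 'Other' platforms."""
    for i in range(len(buf)):
        buf[i] = 0
```

A caller treats a `False` result as "zeroing only" and continues; a successfully locked region stays locked until a matching `munlock()` call or process exit.
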