# NOMYO Python Client — Translation Reference

> Target: Port this library to another language. Every class, method, signature, constant, wire format, and error mapping is documented below.

---

## 1. Package Layout

| File (relative to package root) | Purpose |
|---|---|
| `nomyo/__init__.py` | Public exports, version string |
| `nomyo/nomyo.py` | `SecureChatCompletion` — OpenAI-compatible entrypoint |
| `nomyo/SecureCompletionClient.py` | Key mgmt, hybrid encryption, HTTP roundtrip, retries |
| `nomyo/SecureMemory.py` | Cross-platform memory locking + secure zeroing (optional, platform-specific) |

**Python version:** `>= 3.10`

**Build:** `hatchling` (pyproject.toml)

**Dependencies:** `anyio`, `certifi`, `cffi`, `cryptography`, `exceptiongroup`, `h11`, `httpcore`, `httpx`, `idna`, `pycparser`, `typing_extensions`

---

## 2. Public API Surface (`__all__`)

| Export | Type | Source file |
|---|---|---|
| `SecureChatCompletion` | class | `nomyo.py` |
| `SecurityError` | exception | `SecureCompletionClient.py` |
| `APIError` | exception (base) | `SecureCompletionClient.py` |
| `AuthenticationError` | exception (401) | `SecureCompletionClient.py` |
| `InvalidRequestError` | exception (400) | `SecureCompletionClient.py` |
| `APIConnectionError` | exception (network) | `SecureCompletionClient.py` |
| `ForbiddenError` | exception (403) | `SecureCompletionClient.py` |
| `RateLimitError` | exception (429) | `SecureCompletionClient.py` |
| `ServerError` | exception (500) | `SecureCompletionClient.py` |
| `ServiceUnavailableError` | exception (503) | `SecureCompletionClient.py` |
| `get_memory_protection_info` | function | `SecureMemory.py` |
| `disable_secure_memory` | function | `SecureMemory.py` |
| `enable_secure_memory` | function | `SecureMemory.py` |
| `secure_bytearray` | context manager | `SecureMemory.py` |
| `secure_bytes` | context manager (deprecated) | `SecureMemory.py` |
| `SecureBuffer` | class | `SecureMemory.py` |

---

## 3. `SecureChatCompletion` (entrypoint)

### Constructor

```python
SecureChatCompletion(
    base_url: str = "https://api.nomyo.ai",
    allow_http: bool = False,
    api_key: Optional[str] = None,
    secure_memory: bool = True,
    key_dir: Optional[str] = None,
    max_retries: int = 2
)
```

| Param | Default | Description |
|---|---|---|
| `base_url` | `"https://api.nomyo.ai"` | NOMYO Router base URL. HTTPS enforced unless `allow_http=True`. |
| `allow_http` | `False` | Permit `http://` URLs (dev only). |
| `api_key` | `None` | Bearer token for auth. Can also be passed per-call via `create()`. |
| `secure_memory` | `True` | Enable memory locking/zeroing. Warns if unavailable. |
| `key_dir` | `None` | Directory to persist RSA keys. `None` = ephemeral (in-memory only). |
| `max_retries` | `2` | Retries on 429/500/502/503/504 + network errors. Exponential backoff: 1s, 2s, 4s… |

### `create(model, messages, **kwargs) -> Dict[str, Any]`

Async method. Returns a **dict** (not an object). Same signature as `openai.ChatCompletion.create()`.

| Param | Type | Required | Description |
|---|---|---|---|
| `model` | `str` | yes | Model identifier, e.g. `"Qwen/Qwen3-0.6B"` |
| `messages` | `List[Dict]` | yes | OpenAI-format messages: `[{"role": "user", "content": "..."}]` |
| `temperature` | `float` | no | 0–2 |
| `max_tokens` | `int` | no | |
| `top_p` | `float` | no | |
| `stop` | `str \| List[str]` | no | |
| `presence_penalty` | `float` | no | -2.0 to 2.0 |
| `frequency_penalty` | `float` | no | -2.0 to 2.0 |
| `n` | `int` | no | Number of completions |
| `best_of` | `int` | no | |
| `seed` | `int` | no | |
| `logit_bias` | `Dict[str, float]` | no | |
| `user` | `str` | no | |
| `tools` | `List[Dict]` | no | Tool definitions passed through to llama.cpp |
| `tool_choice` | `str` | no | `"auto"`, `"none"`, or specific tool name |
| `response_format` | `Dict` | no | `{"type": "json_object"}` or `{"type": "json_schema", ...}` |
| `stream` | `bool` | no | **NOT supported.** Server rejects with HTTP 400. Always use `False`. |
| `base_url` | `str` | no | Per-call override (creates temp client internally). |
| `security_tier` | `str` | no | `"standard"`, `"high"`, or `"maximum"`. Invalid values raise `ValueError`. |
| `api_key` | `str` | no | Per-call override of instance `api_key`. |

**Return value:** `Dict[str, Any]` — OpenAI-compatible response dict (see §6.2).

### `acreate(model, messages, **kwargs) -> Dict[str, Any]`

Async alias for `create()`. Identical behavior.

---

## 4. `SecureCompletionClient` (low-level)

### Constructor

```python
SecureCompletionClient(
    router_url: str = "https://api.nomyo.ai",
    allow_http: bool = False,
    secure_memory: bool = True,
    max_retries: int = 2
)
```

Same semantics as `SecureChatCompletion` constructor (maps directly to inner client).

### Instance attributes

| Attribute | Type | Description |
|---|---|---|
| `router_url` | `str` | Base URL (trailing slash stripped). |
| `private_key` | `rsa.RSAPrivateKey \| None` | Loaded/generated RSA private key. |
| `public_key_pem` | `str \| None` | PEM-encoded public key string. |
| `key_size` | `int` | Always `4096`. |
| `allow_http` | `bool` | HTTP allowance flag. |
| `max_retries` | `int` | Retry count. |
| `_use_secure_memory` | `bool` | Whether secure memory ops are active. |

### `generate_keys(save_to_file: bool = False, key_dir: str = "client_keys", password: Optional[str] = None) -> None`

Generates a 4096-bit RSA key pair (public exponent `65537`).

If `save_to_file=True`:

- Creates `key_dir/` (mode 755).
- Writes `private_key.pem` with mode `0o600`.
- Writes `public_key.pem` with mode `0o644`.
- If `password` is given, private key is encrypted with `BestAvailableEncryption`.

### `load_keys(private_key_path: str, public_key_path: Optional[str] = None, password: Optional[str] = None) -> None`

Loads an RSA private key from disk. If `public_key_path` is omitted, derives the public key from the loaded private key. Validates key size >= 2048 bits.
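
When porting the `save_to_file=True` path of `generate_keys()`, the file modes are the detail most easily lost. The sketch below shows one way to write already-serialized PEM bytes with the modes listed above, using only the standard library. The helper name `write_key_files` and its parameters are hypothetical and not part of the package; key generation and PEM serialization are assumed to happen elsewhere.

```python
import os
from pathlib import Path


def write_key_files(key_dir: str, private_pem: bytes, public_pem: bytes) -> None:
    """Persist PEM bytes with the documented file modes (POSIX semantics).

    Hypothetical helper for illustration; not the library's implementation.
    """
    directory = Path(key_dir)
    directory.mkdir(parents=True, exist_ok=True)
    os.chmod(directory, 0o755)

    for name, data, mode in (
        ("private_key.pem", private_pem, 0o600),
        ("public_key.pem", public_pem, 0o644),
    ):
        path = directory / name
        # Create the file with the target mode from the start so a new
        # private key file is not briefly readable by other users.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        # os.open() applies the process umask and ignores `mode` for files
        # that already exist, so set the final mode explicitly.
        os.chmod(path, mode)
```

On platforms without POSIX permissions (for example Windows), `os.chmod` only toggles the read-only flag, so a port would need a platform-specific equivalent there.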

### `fetch_server_public_key() -> str` (async)

`GET {router_url}/pki/public_key`

- Returns server PEM public key as string.
- Validates it parses as a valid PEM public key.
- Raises `SecurityError` if URL is not HTTPS and `allow_http=False`.

### `encrypt_payload(payload: Dict[str, Any]) -> bytes` (async)

Encrypts a dict payload using hybrid encryption. Returns raw encrypted bytes (JSON package, serialized to bytes).

**Encryption process:**

1. Serialize payload to JSON → `bytearray`.
2. Validate size <= 10 MB.
3. Generate 256-bit AES key via `secrets.token_bytes(32)` → `bytearray`.
4. If secure memory enabled: lock both payload and AES key in memory.
5. Call `_do_encrypt()` (see below).
6. Zero/destroy payload and AES key from memory on exit.

### `_do_encrypt(payload_bytes: bytes \| bytearray, aes_key: bytes \| bytearray) -> bytes` (async)

Core hybrid encryption routine. **This is the wire format constructor.**

```
1. nonce = secrets.token_bytes(12)  # 96-bit GCM nonce
2. ciphertext = AES-256-GCM_encrypt(aes_key, nonce, payload_bytes)
3. tag = GCM_tag
4. server_pubkey = await fetch_server_public_key()
5. encrypted_aes_key = RSA-OAEP-SHA256_encrypt(server_pubkey, aes_key_bytes)
6. Build JSON package (see §6.1)
7. Return json.dumps(package).encode('utf-8')
```

### `decrypt_response(encrypted_response: bytes, payload_id: str) -> Dict[str, Any]` (async)

Decrypts a server response.

**Validation chain:**

1. Parse JSON.
2. Check `version == "1.0"` — raises `ValueError` if mismatch.
3. Check `algorithm == "hybrid-aes256-rsa4096"` — raises `ValueError` if mismatch.
4. Validate `encrypted_payload` has `ciphertext`, `nonce`, `tag`.
5. Require `self.private_key` is not `None`.
6. Decrypt AES key: `RSA-OAEP-SHA256_decrypt(private_key, encrypted_aes_key)`.
7. Decrypt payload: `AES-256-GCM_decrypt(aes_key, nonce, tag, ciphertext)`.
8. Parse decrypted bytes as JSON → response dict.
9. Attach `_metadata` if not present (see §6.2).
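
The pre-decryption part of the validation chain (steps 1 to 4) needs no cryptography and can be sketched with the standard library alone. The function name `validate_envelope` and the module-level constant names are hypothetical. Raising `ValueError` for a missing field is an assumption; the reference only states the exception type for the version and algorithm checks.

```python
import json
from typing import Any, Dict

PROTOCOL_VERSION = "1.0"
ALGORITHM = "hybrid-aes256-rsa4096"
REQUIRED_PAYLOAD_FIELDS = ("ciphertext", "nonce", "tag")


def validate_envelope(encrypted_response: bytes) -> Dict[str, Any]:
    """Run steps 1-4 of the validation chain and return the parsed package.

    Hypothetical helper; the decryption steps (5-9) are intentionally omitted.
    """
    # Step 1: json.JSONDecodeError is a subclass of ValueError.
    package = json.loads(encrypted_response)
    if not isinstance(package, dict):
        raise ValueError("Encrypted response must be a JSON object")

    # Steps 2-3: reject anything but the exact expected strings, which is
    # what makes these fields usable for downgrade detection.
    if package.get("version") != PROTOCOL_VERSION:
        raise ValueError(f"Unsupported protocol version: {package.get('version')!r}")
    if package.get("algorithm") != ALGORITHM:
        raise ValueError(f"Unsupported algorithm: {package.get('algorithm')!r}")

    # Step 4: all three AES-GCM components must be present.
    payload = package.get("encrypted_payload")
    if not isinstance(payload, dict):
        raise ValueError("Missing encrypted_payload object")
    missing = [name for name in REQUIRED_PAYLOAD_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"encrypted_payload is missing fields: {missing}")

    return package
```

Keeping these checks ahead of any key operation means a malformed or downgraded package is rejected before the private key is touched.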

Any decryption failure (except JSON parse errors) raises `SecurityError("Decryption failed: integrity check or authentication failed")`.

### `send_secure_request(payload, payload_id, api_key=None, security_tier=None) -> Dict[str, Any]` (async)

Full request lifecycle: encrypt → HTTP POST → retry → decrypt → return.

**Request headers:**

```
Content-Type: application/octet-stream
X-Payload-ID: {payload_id}
X-Public-Key: {url_encoded_pem_public_key}
Authorization: Bearer {api_key}   (if api_key is provided)
X-Security-Tier: {tier}           (if security_tier is provided)
```

**POST** to `{router_url}/v1/chat/secure_completion` with encrypted payload as body.

**Retry logic:**

- Retryable status codes: `{429, 500, 502, 503, 504}`.
- Backoff: `2^(attempt-1)` seconds (1s, 2s, 4s…).
- Total attempts: `max_retries + 1`.
- Network errors also retry.
- Non-retryable exceptions propagate immediately.

**Status → exception mapping:**

| Status | Exception |
|---|---|
| 200 | Return decrypted response dict |
| 400 | `InvalidRequestError` |
| 401 | `AuthenticationError` |
| 403 | `ForbiddenError` |
| 404 | `APIError` |
| 429 | `RateLimitError` |
| 500 | `ServerError` |
| 503 | `ServiceUnavailableError` |
| 502/504 | `APIError` (retryable) |
| other | `APIError` (non-retryable) |
| network error | `APIConnectionError` |

---

## 5. Encryption Wire Format

The encrypted package is a JSON object sent as `application/octet-stream`:

```json
{
  "version": "1.0",
  "algorithm": "hybrid-aes256-rsa4096",
  "encrypted_payload": {
    "ciphertext": "",
    "nonce": "",
    "tag": ""
  },
  "encrypted_aes_key": "",
  "key_algorithm": "RSA-OAEP-SHA256",
  "payload_algorithm": "AES-256-GCM"
}
```

| Field | Encoding | Description |
|---|---|---|
| `version` | string | Protocol version. **Never change** — used for downgrade detection. |
| `algorithm` | string | `"hybrid-aes256-rsa4096"`. **Never change** — used for downgrade detection. |
| `encrypted_payload.ciphertext` | base64 | AES-256-GCM encrypted payload. |
| `encrypted_payload.nonce` | base64 | 12-byte GCM nonce. |
| `encrypted_payload.tag` | base64 | 16-byte GCM authentication tag. |
| `encrypted_aes_key` | base64 | RSA-OAEP-SHA256 encrypted 32-byte AES key. |
| `key_algorithm` | string | `"RSA-OAEP-SHA256"` |
| `payload_algorithm` | string | `"AES-256-GCM"` |

---

## 6. Data Structures

### 6.1 Encrypted Request Payload (before encryption)

The dict passed to `encrypt_payload()` has this structure:

```json
{
  "model": "Qwen/Qwen3-0.6B",
  "messages": [
    {"role": "user", "content": "Hello"}
  ],
  "temperature": 0.7,
  "...": "any other OpenAI-compatible param"
}
```

**Important:** `api_key` is **never** included in the encrypted payload. It is sent only as the `Authorization: Bearer` HTTP header.

### 6.2 Response Dict (after decryption)

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "Qwen/Qwen3-0.6B",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris.",
        "tool_calls": [...],
        "reasoning_content": "..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  },
  "_metadata": {
    "payload_id": "openai-compat-abc123",
    "processed_at": 1765250382,
    "is_encrypted": true,
    "encryption_algorithm": "hybrid-aes256-rsa4096",
    "security_tier": "standard",
    "memory_protection": {
      "platform": "linux",
      "memory_locking": true,
      "secure_zeroing": true,
      "core_dump_prevention": true
    },
    "cuda_device": {
      "available": true,
      "device_hash": "sha256_hex"
    }
  }
}
```

### 6.3 Security Tier Values

| Value | Hardware | Use case |
|---|---|---|
| `"standard"` | GPU | General secure inference |
| `"high"` | CPU/GPU | Sensitive business data |
| `"maximum"` | CPU only | PHI, classified data |

Sent as `X-Security-Tier` HTTP header. Invalid values raise `ValueError`.

---

## 7. Error Class Hierarchy

All errors are exceptions. `APIError` subclasses carry `status_code` and `error_details`.

```
Exception
└── APIError (base, has message/status_code/error_details)
    ├── AuthenticationError (status_code=401)
    ├── InvalidRequestError (status_code=400)
    ├── RateLimitError (status_code=429)
    ├── ForbiddenError (status_code=403)
    ├── ServerError (status_code=500)
    └── ServiceUnavailableError (status_code=503)

Exception
└── SecurityError (crypto/key failure, no status_code)

Exception
└── APIConnectionError (network failure, no status_code)
```

**APIError constructor:**

```python
APIError(message: str, status_code: Optional[int] = None, error_details: Optional[Dict] = None)
```

---

## 8. SecureMemory Module

Optional, platform-specific. Fails gracefully if unavailable (e.g. Windows on some Python builds).

### `SecureBuffer` class

Wraps a `bytearray` with memory locking and guaranteed zeroing on exit.

| Attribute/Method | Type | Description |
|---|---|---|
| `data` | `bytearray` | Underlying mutable buffer |
| `address` | `int` | Memory address (via ctypes) |
| `size` | `int` | Buffer size in bytes |
| `lock() -> bool` | method | Attempt memory lock |
| `unlock() -> bool` | method | Unlock memory |
| `zero()` | method | Securely zero contents |
| `__enter__` / `__exit__` | context mgr | Auto-lock on enter, auto-zero+unlock on exit |

### `secure_bytearray(data: bytes \| bytearray, lock: bool = True) -> SecureBuffer` (context manager)

Recommended secure handling. Converts input to `bytearray`, locks (best-effort), yields `SecureBuffer`. Always zeros on exit, even on exception.

### `secure_bytes(data: bytes, lock: bool = True) -> SecureBuffer` (context manager, **deprecated**)

Same as `secure_bytearray` but accepts immutable `bytes`. Emits deprecation warning. Original bytes cannot be zeroed.
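
The "always zeros on exit, even on exception" guarantee is the core contract to reproduce in a port. The following is a minimal sketch of that contract only, with Python-level zeroing and no memory locking. The name `zeroing_bytearray` is hypothetical, and this is not the package's `secure_bytearray` implementation, which yields a `SecureBuffer` rather than a bare `bytearray`.

```python
from contextlib import contextmanager
from typing import Iterator, Union


@contextmanager
def zeroing_bytearray(data: Union[bytes, bytearray]) -> Iterator[bytearray]:
    """Yield a mutable copy of `data` and overwrite it with zeros on exit.

    Illustrative sketch: zeroing only, no mlock()/VirtualLock().
    """
    buffer = bytearray(data)  # always a copy; the caller's object is untouched
    try:
        yield buffer
    finally:
        # Overwrite in place so the same storage is cleared. Rebinding the
        # name (buffer = bytearray(...)) would leave the old contents intact.
        for index in range(len(buffer)):
            buffer[index] = 0


if __name__ == "__main__":
    with zeroing_bytearray(b"example key material") as key:
        print(len(key))  # use the key while it is still populated
    print(any(key))      # False: every byte was zeroed on exit
```

Because the input is copied, an immutable `bytes` argument still lingers in memory after the block exits, which is the limitation the deprecation of `secure_bytes` points at.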

### `get_memory_protection_info() -> Dict[str, Any]`

Returns protection capabilities:

```json
{
  "enabled": true,
  "platform": "linux",
  "protection_level": "full",
  "has_memory_locking": true,
  "has_secure_zeroing": true,
  "supports_full_protection": true,
  "page_size": 4096
}
```

`protection_level` values: `"full"`, `"zeroing_only"`, `"none"`.

### `disable_secure_memory()` / `enable_secure_memory()`

Globally disable/re-enable secure memory operations.

---

## 9. Constants

| Constant | Value | Location | Notes |
|---|---|---|---|
| Protocol version | `"1.0"` | `SecureCompletionClient.py` | **Never change** — downgrade detection |
| Algorithm string | `"hybrid-aes256-rsa4096"` | `SecureCompletionClient.py` | **Never change** — downgrade detection |
| RSA key size | `4096` | `SecureCompletionClient.py` | Fixed |
| RSA public exponent | `65537` | `SecureCompletionClient.py` | Fixed |
| AES key size | `32` bytes (256-bit) | `SecureCompletionClient.py` | Per-request ephemeral |
| GCM nonce size | `12` bytes (96-bit) | `SecureCompletionClient.py` | Per-request via `secrets.token_bytes` |
| Max payload size | `10 * 1024 * 1024` (10 MB) | `SecureCompletionClient.py` | DoS protection |
| Default max retries | `2` | Both client classes | Exponential backoff: 1s, 2s, 4s… |
| Private key file mode | `0o600` | `SecureCompletionClient.py` | Owner read/write only |
| Public key file mode | `0o644` | `SecureCompletionClient.py` | Owner rw, group/others r |
| Min RSA key size (validation) | `2048` | `SecureCompletionClient.py` | `_validate_rsa_key` |
| Valid security tiers | `["standard", "high", "maximum"]` | `SecureCompletionClient.py` | Case-sensitive |
| Retryable status codes | `{429, 500, 502, 503, 504}` | `SecureCompletionClient.py` | |
| Package version | `"0.2.7"` | `pyproject.toml` + `__init__.py` | Bump both |

---

## 10. Endpoint URLs

| Endpoint | Method | Purpose |
|---|---|---|
| `{router_url}/pki/public_key` | GET | Fetch server RSA public key |
| `{router_url}/v1/chat/secure_completion` | POST | Encrypted chat completion |

---

## 11. Key Lifecycle

1. **First `create()` call** → `_ensure_keys()` runs (async, double-checked locking via `asyncio.Lock`).
2. If `key_dir` is set:
   - Try `load_keys()` from `{key_dir}/private_key.pem` + `{key_dir}/public_key.pem`.
   - If that fails → `generate_keys(save_to_file=True, key_dir=key_dir)`.
3. If `key_dir` is `None` → `generate_keys()` (ephemeral, in-memory only).
4. Keys are reused across all subsequent calls until the client is discarded.

---

## 12. HTTP Client Details

- Uses `httpx.AsyncClient` with `timeout=60.0`.
- SSL verification enabled for HTTPS URLs; disabled for `http://`.
- Request body is raw bytes (not JSON) — `Content-Type: application/octet-stream`.
- Public key is URL-encoded in the `X-Public-Key` header.

---

## 13. Memory Protection Platform Matrix

| Platform | Locking | Zeroing |
|---|---|---|
| Linux | `mlock()` via `libc.so.6` | `memset()` via `libc.so.6` |
| Windows | `VirtualLock()` via `kernel32` | `RtlZeroMemory()` via `ntdll` + Python-level fallback |
| macOS | `mlock()` via `libc.dylib` | `memset()` via `libc.dylib` |
| Other | No lock | Python-level byte-by-byte zeroing |

mlock may fail with `EPERM` (need `CAP_IPC_LOCK` or `ulimit -l` increase) — degrades to zeroing-only gracefully.
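
The graceful degradation described above can be sketched as a best-effort `mlock()` call through `ctypes` that reports failure instead of raising. The function name `try_mlock` is hypothetical, and locating libc with `ctypes.util.find_library("c")` is an assumption made for portability of the example; the matrix above names the library files the package loads. Windows (`VirtualLock`) is deliberately left out of this sketch.

```python
import ctypes
import ctypes.util
import sys


def try_mlock(buffer: bytearray) -> bool:
    """Best-effort mlock() of a bytearray's storage on Linux/macOS.

    Returns True if the pages were locked, False on any failure (unsupported
    platform, libc not found, EPERM, RLIMIT_MEMLOCK exceeded, ...), so the
    caller can fall back to zeroing-only protection.
    """
    if len(buffer) == 0 or not sys.platform.startswith(("linux", "darwin")):
        return False
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return False
    try:
        libc = ctypes.CDLL(libc_name, use_errno=True)
        libc.mlock.argtypes = (ctypes.c_void_p, ctypes.c_size_t)
        libc.mlock.restype = ctypes.c_int
        # from_buffer() exposes the bytearray's own storage without copying,
        # so the address refers to the real data rather than a duplicate.
        view = (ctypes.c_char * len(buffer)).from_buffer(buffer)
        result = libc.mlock(ctypes.addressof(view), len(buffer))
        del view  # release the buffer export so the bytearray can be resized
        return result == 0
    except (OSError, AttributeError):
        return False


if __name__ == "__main__":
    secret = bytearray(b"example key material")
    level = "full" if try_mlock(secret) else "zeroing_only"
    print(level)
```

A complete port would also call `munlock()` after zeroing the buffer and would avoid resizing the `bytearray` while it is locked, since a resize can move the storage to unlocked memory.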