# NOMYO Python Client: Translation Reference

> Target: Port this library to another language. Every class, method, signature, constant, wire format, and error mapping is documented below.

---

## 1. Package Layout

| File (relative to package root) | Purpose |
|---|---|
| `nomyo/__init__.py` | Public exports, version string |
| `nomyo/nomyo.py` | `SecureChatCompletion`: OpenAI-compatible entrypoint |
| `nomyo/SecureCompletionClient.py` | Key management, hybrid encryption, HTTP round trip, retries |
| `nomyo/SecureMemory.py` | Cross-platform memory locking + secure zeroing (optional, platform-specific) |

- **Python version:** `>= 3.10`
- **Build:** `hatchling` (pyproject.toml)
- **Dependencies:** `anyio`, `certifi`, `cffi`, `cryptography`, `exceptiongroup`, `h11`, `httpcore`, `httpx`, `idna`, `pycparser`, `typing_extensions`

---

## 2. Public API Surface (`__all__`)

| Export | Type | Source file |
|---|---|---|
| `SecureChatCompletion` | class | `nomyo.py` |
| `SecurityError` | exception | `SecureCompletionClient.py` |
| `APIError` | exception (base) | `SecureCompletionClient.py` |
| `AuthenticationError` | exception (401) | `SecureCompletionClient.py` |
| `InvalidRequestError` | exception (400) | `SecureCompletionClient.py` |
| `APIConnectionError` | exception (network) | `SecureCompletionClient.py` |
| `ForbiddenError` | exception (403) | `SecureCompletionClient.py` |
| `RateLimitError` | exception (429) | `SecureCompletionClient.py` |
| `ServerError` | exception (500) | `SecureCompletionClient.py` |
| `ServiceUnavailableError` | exception (503) | `SecureCompletionClient.py` |
| `get_memory_protection_info` | function | `SecureMemory.py` |
| `disable_secure_memory` | function | `SecureMemory.py` |
| `enable_secure_memory` | function | `SecureMemory.py` |
| `secure_bytearray` | context manager | `SecureMemory.py` |
| `secure_bytes` | context manager (deprecated) | `SecureMemory.py` |
| `SecureBuffer` | class | `SecureMemory.py` |

---

## 3. `SecureChatCompletion` (entrypoint)

### Constructor

```python
SecureChatCompletion(
    base_url: str = "https://api.nomyo.ai",
    allow_http: bool = False,
    api_key: Optional[str] = None,
    secure_memory: bool = True,
    key_dir: Optional[str] = None,
    max_retries: int = 2
)
```

| Param | Default | Description |
|---|---|---|
| `base_url` | `"https://api.nomyo.ai"` | NOMYO Router base URL. HTTPS enforced unless `allow_http=True`. |
| `allow_http` | `False` | Permit `http://` URLs (dev only). |
| `api_key` | `None` | Bearer token for auth. Can also be passed per-call via `create()`. |
| `secure_memory` | `True` | Enable memory locking/zeroing. Warns if unavailable. |
| `key_dir` | `None` | Directory to persist RSA keys. `None` = ephemeral (in-memory only). |
| `max_retries` | `2` | Retries on 429/500/502/503/504 + network errors. Exponential backoff: 1s, 2s, 4s… |

### `create(model, messages, **kwargs) -> Dict[str, Any]`

Async method. Returns a **dict** (not an object). Same signature as `openai.ChatCompletion.create()`.

| Param | Type | Required | Description |
|---|---|---|---|
| `model` | `str` | yes | Model identifier, e.g. `"Qwen/Qwen3-0.6B"` |
| `messages` | `List[Dict]` | yes | OpenAI-format messages: `[{"role": "user", "content": "..."}]` |
| `temperature` | `float` | no | 0–2 |
| `max_tokens` | `int` | no | |
| `top_p` | `float` | no | |
| `stop` | `str \| List[str]` | no | |
| `presence_penalty` | `float` | no | -2.0 to 2.0 |
| `frequency_penalty` | `float` | no | -2.0 to 2.0 |
| `n` | `int` | no | Number of completions |
| `best_of` | `int` | no | |
| `seed` | `int` | no | |
| `logit_bias` | `Dict[str, float]` | no | |
| `user` | `str` | no | |
| `tools` | `List[Dict]` | no | Tool definitions passed through to llama.cpp |
| `tool_choice` | `str` | no | `"auto"`, `"none"`, or specific tool name |
| `response_format` | `Dict` | no | `{"type": "json_object"}` or `{"type": "json_schema", ...}` |
| `stream` | `bool` | no | **NOT supported.** Server rejects with HTTP 400. Always use `False`. |
| `base_url` | `str` | no | Per-call override (creates temp client internally). |
| `security_tier` | `str` | no | `"standard"`, `"high"`, or `"maximum"`. Invalid values raise `ValueError`. |
| `api_key` | `str` | no | Per-call override of instance `api_key`. |

**Return value:** `Dict[str, Any]`, an OpenAI-compatible response dict (see §6.2).

### `acreate(model, messages, **kwargs) -> Dict[str, Any]`

Async alias for `create()`. Identical behavior.

---

## 4. `SecureCompletionClient` (low-level)

### Constructor

```python
SecureCompletionClient(
    router_url: str = "https://api.nomyo.ai",
    allow_http: bool = False,
    secure_memory: bool = True,
    max_retries: int = 2
)
```

Same semantics as the matching `SecureChatCompletion` constructor parameters, which are passed directly to this inner client. `router_url` takes the place of `base_url`.

### Instance attributes

| Attribute | Type | Description |
|---|---|---|
| `router_url` | `str` | Base URL (trailing slash stripped). |
| `private_key` | `rsa.RSAPrivateKey \| None` | Loaded/generated RSA private key. |
| `public_key_pem` | `str \| None` | PEM-encoded public key string. |
| `key_size` | `int` | Always `4096`. |
| `allow_http` | `bool` | HTTP allowance flag. |
| `max_retries` | `int` | Retry count. |
| `_use_secure_memory` | `bool` | Whether secure memory ops are active. |

### `generate_keys(save_to_file: bool = False, key_dir: str = "client_keys", password: Optional[str] = None) -> None`

Generates a 4096-bit RSA key pair (public exponent `65537`). If `save_to_file=True`:
- Creates `key_dir/` (mode `0o755`).
- Writes `private_key.pem` with mode `0o600`.
- Writes `public_key.pem` with mode `0o644`.
- If `password` is given, the private key is encrypted with `BestAvailableEncryption`.

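The file-permission handling above could be sketched as follows with the standard library only. The helper names `write_key_files` and `file_mode` are hypothetical, and the PEM bytes are placeholders: key generation and PEM serialization are deliberately left out. This targets POSIX systems, where the mode bits are meaningful.

```python
import os
import stat
import tempfile


def write_key_files(key_dir: str, private_pem: bytes, public_pem: bytes) -> None:
    """Write PEM files using the modes listed above (hypothetical helper)."""
    os.makedirs(key_dir, mode=0o755, exist_ok=True)
    for name, data, mode in (
        ("private_key.pem", private_pem, 0o600),
        ("public_key.pem", public_pem, 0o644),
    ):
        path = os.path.join(key_dir, name)
        # Create the file with its final mode so it is never briefly
        # readable by other users.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        # chmod enforces the mode even if the file already existed or the
        # process umask removed bits at creation time.
        os.chmod(path, mode)


def file_mode(path: str) -> int:
    """Return only the permission bits of a file."""
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        key_dir = os.path.join(tmp, "client_keys")
        write_key_files(key_dir, b"placeholder private", b"placeholder public")
        print(oct(file_mode(os.path.join(key_dir, "private_key.pem"))))
```

A port to another language should create the private key file with restrictive permissions from the start rather than tightening them after writing.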
### `load_keys(private_key_path: str, public_key_path: Optional[str] = None, password: Optional[str] = None) -> None`

Loads an RSA private key from disk. If `public_key_path` is omitted, derives the public key from the loaded private key. Validates key size >= 2048 bits.

### `fetch_server_public_key() -> str` (async)

`GET {router_url}/pki/public_key`
- Returns the server's PEM public key as a string.
- Validates that the response parses as a PEM public key.
- Raises `SecurityError` if the URL is not HTTPS and `allow_http=False`.

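The HTTPS rule can be illustrated with a small standalone check. This is a sketch, not the library's code: `check_url_scheme` is a hypothetical name, the `SecurityError` class here is a local stand-in, and the error message is invented for illustration. The trailing-slash stripping mirrors the `router_url` attribute description.

```python
from urllib.parse import urlparse


class SecurityError(Exception):
    """Local stand-in for the library's SecurityError."""


def check_url_scheme(url: str, allow_http: bool = False) -> str:
    """Return the URL without a trailing slash, or raise if the scheme is not allowed."""
    scheme = urlparse(url).scheme.lower()
    if scheme == "https" or (scheme == "http" and allow_http):
        return url.rstrip("/")
    # Message text is illustrative only.
    raise SecurityError(f"Refusing non-HTTPS URL: {url!r}")
```

For example, `check_url_scheme("http://localhost:8000")` raises, while the same call with `allow_http=True` returns the URL unchanged.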
### `encrypt_payload(payload: Dict[str, Any]) -> bytes` (async)

Encrypts a dict payload using hybrid encryption. Returns the encrypted package: a JSON object serialized to UTF-8 bytes.

**Encryption process:**
1. Serialize payload to JSON → `bytearray`.
2. Validate size <= 10 MB.
3. Generate 256-bit AES key via `secrets.token_bytes(32)` → `bytearray`.
4. If secure memory enabled: lock both payload and AES key in memory.
5. Call `_do_encrypt()` (see below).
6. Zero/destroy payload and AES key from memory on exit.

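Steps 1 to 3 above can be sketched with the standard library. `prepare_payload` is a hypothetical helper name, and raising `ValueError` for an oversized payload is an assumption, since the source does not name the exception. Memory locking, the encryption call, and zeroing (steps 4 to 6) are not shown.

```python
import json
import secrets
from typing import Any, Dict, Tuple

MAX_PAYLOAD_SIZE = 10 * 1024 * 1024  # 10 MB limit from the process above


def prepare_payload(payload: Dict[str, Any]) -> Tuple[bytearray, bytearray]:
    """Serialize the payload and create a fresh AES key, both as mutable buffers."""
    # Step 1: JSON-serialize into a bytearray so it can be zeroed later.
    payload_bytes = bytearray(json.dumps(payload).encode("utf-8"))
    # Step 2: enforce the size limit (exception type is an assumption).
    if len(payload_bytes) > MAX_PAYLOAD_SIZE:
        raise ValueError("payload exceeds 10 MB limit")
    # Step 3: 256-bit key from the OS CSPRNG, also held in a bytearray.
    aes_key = bytearray(secrets.token_bytes(32))
    return payload_bytes, aes_key
```

Both values are `bytearray` rather than `bytes` because immutable `bytes` objects cannot be overwritten in place when the request finishes.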
### `_do_encrypt(payload_bytes: bytes \| bytearray, aes_key: bytes \| bytearray) -> bytes` (async)

Core hybrid encryption routine. **This is the wire format constructor.**

```
1. nonce = secrets.token_bytes(12)          # 96-bit GCM nonce
2. ciphertext = AES-256-GCM_encrypt(aes_key, nonce, payload_bytes)
3. tag = GCM_tag
4. server_pubkey = await fetch_server_public_key()
5. encrypted_aes_key = RSA-OAEP-SHA256_encrypt(server_pubkey, aes_key_bytes)
6. Build JSON package (see §5)
7. Return json.dumps(package).encode('utf-8')
```

### `decrypt_response(encrypted_response: bytes, payload_id: str) -> Dict[str, Any]` (async)

Decrypts a server response.

**Validation chain:**
1. Parse JSON.
2. Check `version == "1.0"`; raises `ValueError` on mismatch.
3. Check `algorithm == "hybrid-aes256-rsa4096"`; raises `ValueError` on mismatch.
4. Validate `encrypted_payload` has `ciphertext`, `nonce`, `tag`.
5. Require `self.private_key` is not `None`.
6. Decrypt AES key: `RSA-OAEP-SHA256_decrypt(private_key, encrypted_aes_key)`.
7. Decrypt payload: `AES-256-GCM_decrypt(aes_key, nonce, tag, ciphertext)`.
8. Parse decrypted bytes as JSON → response dict.
9. Attach `_metadata` if not present (see §6.2).

Any decryption failure (except JSON parse errors) raises `SecurityError("Decryption failed: integrity check or authentication failed")`.

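Steps 1 to 4 of the validation chain need no cryptography and can be sketched on their own. `parse_envelope` is a hypothetical name. Raising `ValueError` for missing fields and for a non-object body is an assumption: the source only specifies `ValueError` for the version and algorithm checks.

```python
import base64
import json
from typing import Dict

EXPECTED_VERSION = "1.0"
EXPECTED_ALGORITHM = "hybrid-aes256-rsa4096"


def parse_envelope(raw: bytes) -> Dict[str, bytes]:
    """Validate the envelope fields and return the base64-decoded parts."""
    package = json.loads(raw)  # step 1
    if not isinstance(package, dict):
        raise ValueError("envelope must be a JSON object")
    if package.get("version") != EXPECTED_VERSION:  # step 2
        raise ValueError("unsupported protocol version")
    if package.get("algorithm") != EXPECTED_ALGORITHM:  # step 3
        raise ValueError("unsupported algorithm")
    enc = package.get("encrypted_payload")  # step 4
    if not isinstance(enc, dict):
        raise ValueError("encrypted_payload missing")
    missing = [k for k in ("ciphertext", "nonce", "tag") if k not in enc]
    if missing or "encrypted_aes_key" not in package:
        raise ValueError("envelope is missing required fields")
    return {
        "ciphertext": base64.b64decode(enc["ciphertext"]),
        "nonce": base64.b64decode(enc["nonce"]),
        "tag": base64.b64decode(enc["tag"]),
        "encrypted_aes_key": base64.b64decode(package["encrypted_aes_key"]),
    }
```

Checking `version` and `algorithm` before touching any key material keeps the downgrade check independent of the crypto backend chosen for the port.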
### `send_secure_request(payload, payload_id, api_key=None, security_tier=None) -> Dict[str, Any]` (async)

Full request lifecycle: encrypt → HTTP POST → retry → decrypt → return.

**Request headers:**
```
Content-Type: application/octet-stream
X-Payload-ID: {payload_id}
X-Public-Key: {url_encoded_pem_public_key}
Authorization: Bearer {api_key}           (if api_key is provided)
X-Security-Tier: {tier}                   (if security_tier is provided)
```

**POST** to `{router_url}/v1/chat/secure_completion` with encrypted payload as body.

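A sketch of how the header set above might be assembled. `build_headers` is a hypothetical name, and using `urllib.parse.quote` with its default safe characters is an assumption: the source says only that the PEM public key is URL-encoded, not which encoder or safe set is used, so a port should confirm the exact encoding against the server.

```python
from typing import Dict, Optional
from urllib.parse import quote


def build_headers(
    payload_id: str,
    public_key_pem: str,
    api_key: Optional[str] = None,
    security_tier: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble request headers; optional ones are added only when provided."""
    headers = {
        "Content-Type": "application/octet-stream",
        "X-Payload-ID": payload_id,
        # PEM contains newlines, which are not legal in header values.
        "X-Public-Key": quote(public_key_pem),
    }
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    if security_tier:
        headers["X-Security-Tier"] = security_tier
    return headers
```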
**Retry logic:**
- Retryable status codes: `{429, 500, 502, 503, 504}`.
- Backoff: `2^(attempt-1)` seconds (1s, 2s, 4s…).
- Total attempts: `max_retries + 1`.
- Network errors also retry.
- Non-retryable exceptions propagate immediately.

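The retry schedule can be expressed as two small pure functions. The names are hypothetical, and the sketch assumes `attempt` in the backoff formula is the 1-based number of the attempt that just failed, which matches the documented 1s, 2s, 4s sequence.

```python
from typing import List

RETRYABLE_STATUS_CODES = {429, 500, 502, 503, 504}


def backoff_delays(max_retries: int) -> List[float]:
    """Seconds to wait before each retry: 2^(attempt-1) for attempt = 1..max_retries."""
    return [float(2 ** (attempt - 1)) for attempt in range(1, max_retries + 1)]


def should_retry(status_code: int, attempt: int, max_retries: int) -> bool:
    """Decide whether to retry after the given 1-based attempt failed."""
    # attempt <= max_retries caps the total at max_retries + 1 attempts.
    return status_code in RETRYABLE_STATUS_CODES and attempt <= max_retries
```

With the default `max_retries=2` this gives three attempts in total, with waits of 1 and 2 seconds between them.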
**Status → exception mapping:**

| Status | Outcome |
|---|---|
| 200 | Return decrypted response dict |
| 400 | `InvalidRequestError` |
| 401 | `AuthenticationError` |
| 403 | `ForbiddenError` |
| 404 | `APIError` |
| 429 | `RateLimitError` |
| 500 | `ServerError` |
| 503 | `ServiceUnavailableError` |
| 502/504 | `APIError` (retryable) |
| other | `APIError` (non-retryable) |
| network error | `APIConnectionError` |

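For a port, the table above reduces to a lookup plus the retryable set. The sketch below returns exception names as strings to stay independent of any class definitions; `classify_status` is a hypothetical helper, not part of the library.

```python
from typing import Optional, Tuple

STATUS_TO_EXCEPTION = {
    400: "InvalidRequestError",
    401: "AuthenticationError",
    403: "ForbiddenError",
    404: "APIError",
    429: "RateLimitError",
    500: "ServerError",
    502: "APIError",
    503: "ServiceUnavailableError",
    504: "APIError",
}
RETRYABLE_STATUS_CODES = {429, 500, 502, 503, 504}


def classify_status(status_code: int) -> Tuple[Optional[str], bool]:
    """Return (exception name or None on success, whether the status is retryable)."""
    if status_code == 200:
        return None, False
    # Any status not listed falls back to the non-retryable base APIError.
    name = STATUS_TO_EXCEPTION.get(status_code, "APIError")
    return name, status_code in RETRYABLE_STATUS_CODES
```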

---

## 5. Encryption Wire Format

The encrypted package is a JSON object sent as `application/octet-stream`:

```json
{
  "version": "1.0",
  "algorithm": "hybrid-aes256-rsa4096",
  "encrypted_payload": {
    "ciphertext": "<base64>",
    "nonce": "<base64>",
    "tag": "<base64>"
  },
  "encrypted_aes_key": "<base64>",
  "key_algorithm": "RSA-OAEP-SHA256",
  "payload_algorithm": "AES-256-GCM"
}
```

| Field | Encoding | Description |
|---|---|---|
| `version` | string | Protocol version. **Never change**: used for downgrade detection. |
| `algorithm` | string | `"hybrid-aes256-rsa4096"`. **Never change**: used for downgrade detection. |
| `encrypted_payload.ciphertext` | base64 | AES-256-GCM encrypted payload. |
| `encrypted_payload.nonce` | base64 | 12-byte GCM nonce. |
| `encrypted_payload.tag` | base64 | 16-byte GCM authentication tag. |
| `encrypted_aes_key` | base64 | RSA-OAEP-SHA256 encrypted 32-byte AES key. |
| `key_algorithm` | string | `"RSA-OAEP-SHA256"` |
| `payload_algorithm` | string | `"AES-256-GCM"` |

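Given already-computed cryptographic outputs, assembling the package is plain base64 and JSON work. The sketch below assumes standard (not URL-safe) base64, which the source does not state explicitly; `build_package` is a hypothetical name and the length checks are added for illustration. Note that many AES-GCM libraries return the ciphertext with the 16-byte tag appended, so a port may need to split the two before filling the separate fields.

```python
import base64
import json


def _b64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")


def build_package(
    ciphertext: bytes, nonce: bytes, tag: bytes, encrypted_aes_key: bytes
) -> bytes:
    """Serialize the wire-format JSON object to UTF-8 bytes."""
    if len(nonce) != 12:
        raise ValueError("GCM nonce must be 12 bytes")
    if len(tag) != 16:
        raise ValueError("GCM tag must be 16 bytes")
    package = {
        "version": "1.0",
        "algorithm": "hybrid-aes256-rsa4096",
        "encrypted_payload": {
            "ciphertext": _b64(ciphertext),
            "nonce": _b64(nonce),
            "tag": _b64(tag),
        },
        "encrypted_aes_key": _b64(encrypted_aes_key),
        "key_algorithm": "RSA-OAEP-SHA256",
        "payload_algorithm": "AES-256-GCM",
    }
    return json.dumps(package).encode("utf-8")
```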

---

## 6. Data Structures

### 6.1 Request Payload (before encryption)

The dict passed to `encrypt_payload()` has this structure:

```json
{
  "model": "Qwen/Qwen3-0.6B",
  "messages": [
    {"role": "user", "content": "Hello"}
  ],
  "temperature": 0.7,
  "...": "any other OpenAI-compatible param"
}
```

**Important:** `api_key` is **never** included in the encrypted payload. It is sent only as the `Authorization: Bearer` HTTP header.

### 6.2 Response Dict (after decryption)

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "Qwen/Qwen3-0.6B",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris.",
        "tool_calls": [...],
        "reasoning_content": "..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  },
  "_metadata": {
    "payload_id": "openai-compat-abc123",
    "processed_at": 1765250382,
    "is_encrypted": true,
    "encryption_algorithm": "hybrid-aes256-rsa4096",
    "security_tier": "standard",
    "memory_protection": {
      "platform": "linux",
      "memory_locking": true,
      "secure_zeroing": true,
      "core_dump_prevention": true
    },
    "cuda_device": {
      "available": true,
      "device_hash": "sha256_hex"
    }
  }
}
```

### 6.3 Security Tier Values

| Value | Hardware | Use case |
|---|---|---|
| `"standard"` | GPU | General secure inference |
| `"high"` | CPU/GPU | Sensitive business data |
| `"maximum"` | CPU only | PHI, classified data |

Sent as `X-Security-Tier` HTTP header. Invalid values raise `ValueError`.

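The tier check is a plain membership test. In the sketch below `validate_security_tier` is a hypothetical name and the error message is illustrative. Comparison is case-sensitive, so `"High"` is rejected, and `None` means the header is simply omitted.

```python
from typing import Optional

VALID_SECURITY_TIERS = ("standard", "high", "maximum")


def validate_security_tier(tier: Optional[str]) -> Optional[str]:
    """Return the tier unchanged if valid; None means no X-Security-Tier header."""
    if tier is None:
        return None
    if tier not in VALID_SECURITY_TIERS:
        raise ValueError(
            f"Invalid security_tier {tier!r}; expected one of {VALID_SECURITY_TIERS}"
        )
    return tier
```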

---

## 7. Error Class Hierarchy

All errors are exceptions. `APIError` subclasses carry `status_code` and `error_details`.

```
Exception
└── APIError (base, has message/status_code/error_details)
    ├── AuthenticationError (status_code=401)
    ├── InvalidRequestError (status_code=400)
    ├── RateLimitError (status_code=429)
    ├── ForbiddenError (status_code=403)
    ├── ServerError (status_code=500)
    └── ServiceUnavailableError (status_code=503)

Exception
└── SecurityError (crypto/key failure, no status_code)

Exception
└── APIConnectionError (network failure, no status_code)
```

**APIError constructor:**
```python
APIError(message: str, status_code: Optional[int] = None, error_details: Optional[Dict] = None)
```

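One way to reproduce this hierarchy is sketched below. The `APIError` constructor follows the documented signature. The subclass constructors are not documented, so the `default_status_code` class attribute is a hypothetical device to give each subclass its fixed status; the original may hard-code the status differently.

```python
from typing import Any, Dict, Optional


class APIError(Exception):
    """Base class for HTTP-level errors."""

    default_status_code: Optional[int] = None  # hypothetical, for illustration

    def __init__(
        self,
        message: str,
        status_code: Optional[int] = None,
        error_details: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        # Fall back to the subclass default when no status is passed.
        self.status_code = (
            status_code if status_code is not None else self.default_status_code
        )
        self.error_details = error_details


class AuthenticationError(APIError):
    default_status_code = 401


class InvalidRequestError(APIError):
    default_status_code = 400


class RateLimitError(APIError):
    default_status_code = 429


class ForbiddenError(APIError):
    default_status_code = 403


class ServerError(APIError):
    default_status_code = 500


class ServiceUnavailableError(APIError):
    default_status_code = 503


class SecurityError(Exception):
    """Crypto or key failure; deliberately not an APIError."""


class APIConnectionError(Exception):
    """Network failure; deliberately not an APIError."""
```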

---

## 8. SecureMemory Module

Optional, platform-specific. Fails gracefully if unavailable (e.g. Windows on some Python builds).

### `SecureBuffer` class

Wraps a `bytearray` with memory locking and guaranteed zeroing on exit.

| Attribute/Method | Type | Description |
|---|---|---|
| `data` | `bytearray` | Underlying mutable buffer |
| `address` | `int` | Memory address (via ctypes) |
| `size` | `int` | Buffer size in bytes |
| `lock() -> bool` | method | Attempt memory lock |
| `unlock() -> bool` | method | Unlock memory |
| `zero()` | method | Securely zero contents |
| `__enter__` / `__exit__` | context mgr | Auto-lock on enter, auto-zero+unlock on exit |

### `secure_bytearray(data: bytes \| bytearray, lock: bool = True) -> SecureBuffer` (context manager)

Recommended secure handling. Converts input to `bytearray`, locks (best-effort), yields `SecureBuffer`. Always zeros on exit, even on exception.

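The zero-on-exit guarantee can be illustrated without any platform calls. The sketch below is not the module's implementation: `zeroing_bytearray` is a hypothetical name, it yields a bare `bytearray` instead of a `SecureBuffer`, it performs no memory locking, and it uses Python-level byte-by-byte zeroing only. Reusing a caller-supplied `bytearray` in place, so the caller's buffer is also wiped, is an assumption.

```python
from contextlib import contextmanager
from typing import Iterator, Union


@contextmanager
def zeroing_bytearray(data: Union[bytes, bytearray]) -> Iterator[bytearray]:
    """Yield a mutable buffer and overwrite it with zeros on exit."""
    # Copy immutable bytes; operate in place on an existing bytearray.
    buf = data if isinstance(data, bytearray) else bytearray(data)
    try:
        yield buf
    finally:
        # Runs on normal exit and on exceptions alike.
        for i in range(len(buf)):
            buf[i] = 0
```

When `bytes` are passed in, only the copy is zeroed and the original object stays in memory until garbage-collected.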
### `secure_bytes(data: bytes, lock: bool = True) -> SecureBuffer` (context manager, **deprecated**)

Same as `secure_bytearray` but accepts immutable `bytes`. Emits a deprecation warning. The original `bytes` object cannot be zeroed.

### `get_memory_protection_info() -> Dict[str, Any]`

Returns protection capabilities:

```json
{
  "enabled": true,
  "platform": "linux",
  "protection_level": "full",
  "has_memory_locking": true,
  "has_secure_zeroing": true,
  "supports_full_protection": true,
  "page_size": 4096
}
```

`protection_level` values: `"full"`, `"zeroing_only"`, `"none"`.

### `disable_secure_memory()` / `enable_secure_memory()`

Globally disable/re-enable secure memory operations.

---

## 9. Constants

| Constant | Value | Location | Notes |
|---|---|---|---|
| Protocol version | `"1.0"` | `SecureCompletionClient.py` | **Never change** (downgrade detection) |
| Algorithm string | `"hybrid-aes256-rsa4096"` | `SecureCompletionClient.py` | **Never change** (downgrade detection) |
| RSA key size | `4096` | `SecureCompletionClient.py` | Fixed |
| RSA public exponent | `65537` | `SecureCompletionClient.py` | Fixed |
| AES key size | `32` bytes (256-bit) | `SecureCompletionClient.py` | Per-request ephemeral |
| GCM nonce size | `12` bytes (96-bit) | `SecureCompletionClient.py` | Per-request via `secrets.token_bytes` |
| Max payload size | `10 * 1024 * 1024` (10 MB) | `SecureCompletionClient.py` | DoS protection |
| Default max retries | `2` | Both client classes | Exponential backoff: 1s, 2s, 4s… |
| Private key file mode | `0o600` | `SecureCompletionClient.py` | Owner read/write only |
| Public key file mode | `0o644` | `SecureCompletionClient.py` | Owner rw, group/others r |
| Min RSA key size (validation) | `2048` | `SecureCompletionClient.py` | `_validate_rsa_key` |
| Valid security tiers | `["standard", "high", "maximum"]` | `SecureCompletionClient.py` | Case-sensitive |
| Retryable status codes | `{429, 500, 502, 503, 504}` | `SecureCompletionClient.py` | |
| Package version | `"0.2.7"` | `pyproject.toml` + `__init__.py` | Bump both |

---

## 10. Endpoint URLs

| Endpoint | Method | Purpose |
|---|---|---|
| `{router_url}/pki/public_key` | GET | Fetch server RSA public key |
| `{router_url}/v1/chat/secure_completion` | POST | Encrypted chat completion |

---

## 11. Key Lifecycle

1. **First `create()` call** → `_ensure_keys()` runs (async, double-checked locking via `asyncio.Lock`).
2. If `key_dir` is set:
   - Try `load_keys()` from `{key_dir}/private_key.pem` + `{key_dir}/public_key.pem`.
   - If that fails → `generate_keys(save_to_file=True, key_dir=key_dir)`.
3. If `key_dir` is `None` → `generate_keys()` (ephemeral, in-memory only).
4. Keys are reused across all subsequent calls until the client is discarded.


---

## 12. HTTP Client Details

- Uses `httpx.AsyncClient` with `timeout=60.0`.
- SSL verification enabled for HTTPS URLs; disabled for `http://`.
- The request body is sent as raw bytes with `Content-Type: application/octet-stream`, not with a JSON content type.
- Public key is URL-encoded in the `X-Public-Key` header.

---

## 13. Memory Protection Platform Matrix

| Platform | Locking | Zeroing |
|---|---|---|
| Linux | `mlock()` via `libc.so.6` | `memset()` via `libc.so.6` |
| Windows | `VirtualLock()` via `kernel32` | `RtlZeroMemory()` via `ntdll` + Python-level fallback |
| macOS | `mlock()` via `libc.dylib` | `memset()` via `libc.dylib` |
| Other | No lock | Python-level byte-by-byte zeroing |

`mlock()` may fail with `EPERM` (it needs `CAP_IPC_LOCK` or a higher `ulimit -l`); in that case the module degrades gracefully to zeroing-only.