NOMYO Python Client — Translation Reference
Target: Port this library to another language. Every class, method, signature, constant, wire format, and error mapping is documented below.
1. Package Layout
| File (relative to package root) | Purpose |
|---|---|
| `nomyo/__init__.py` | Public exports, version string |
| `nomyo/nomyo.py` | SecureChatCompletion: OpenAI-compatible entrypoint |
| `nomyo/SecureCompletionClient.py` | Key management, hybrid encryption, HTTP roundtrip, retries |
| `nomyo/SecureMemory.py` | Cross-platform memory locking + secure zeroing (optional, platform-specific) |
Python version: >= 3.10
Build: hatchling (pyproject.toml)
Dependencies: anyio, certifi, cffi, cryptography, exceptiongroup, h11, httpcore, httpx, idna, pycparser, typing_extensions
2. Public API Surface (__all__)
| Export | Type | Source file |
|---|---|---|
| `SecureChatCompletion` | class | nomyo.py |
| `SecurityError` | exception | SecureCompletionClient.py |
| `APIError` | exception (base) | SecureCompletionClient.py |
| `AuthenticationError` | exception (401) | SecureCompletionClient.py |
| `InvalidRequestError` | exception (400) | SecureCompletionClient.py |
| `APIConnectionError` | exception (network) | SecureCompletionClient.py |
| `ForbiddenError` | exception (403) | SecureCompletionClient.py |
| `RateLimitError` | exception (429) | SecureCompletionClient.py |
| `ServerError` | exception (500) | SecureCompletionClient.py |
| `ServiceUnavailableError` | exception (503) | SecureCompletionClient.py |
| `get_memory_protection_info` | function | SecureMemory.py |
| `disable_secure_memory` | function | SecureMemory.py |
| `enable_secure_memory` | function | SecureMemory.py |
| `secure_bytearray` | context manager | SecureMemory.py |
| `secure_bytes` | context manager (deprecated) | SecureMemory.py |
| `SecureBuffer` | class | SecureMemory.py |
3. SecureChatCompletion (entrypoint)
Constructor
SecureChatCompletion(
base_url: str = "https://api.nomyo.ai",
allow_http: bool = False,
api_key: Optional[str] = None,
secure_memory: bool = True,
key_dir: Optional[str] = None,
max_retries: int = 2
)
| Param | Default | Description |
|---|---|---|
| `base_url` | `"https://api.nomyo.ai"` | NOMYO Router base URL. HTTPS enforced unless allow_http=True. |
| `allow_http` | `False` | Permit http:// URLs (dev only). |
| `api_key` | `None` | Bearer token for auth. Can also be passed per-call via create(). |
| `secure_memory` | `True` | Enable memory locking/zeroing. Warns if unavailable. |
| `key_dir` | `None` | Directory to persist RSA keys. None = ephemeral (in-memory only). |
| `max_retries` | `2` | Retries on 429/500/502/503/504 + network errors. Exponential backoff: 1s, 2s, 4s… |
create(model, messages, **kwargs) -> Dict[str, Any]
Async method. Returns a dict (not an object). Same signature as openai.ChatCompletion.create().
| Param | Type | Required | Description |
|---|---|---|---|
| `model` | `str` | yes | Model identifier, e.g. "Qwen/Qwen3-0.6B" |
| `messages` | `List[Dict]` | yes | OpenAI-format messages: [{"role": "user", "content": "..."}] |
| `temperature` | `float` | no | 0–2 |
| `max_tokens` | `int` | no | |
| `top_p` | `float` | no | |
| `stop` | `str \| List[str]` | no | |
| `presence_penalty` | `float` | no | -2.0 to 2.0 |
| `frequency_penalty` | `float` | no | -2.0 to 2.0 |
| `n` | `int` | no | Number of completions |
| `best_of` | `int` | no | |
| `seed` | `int` | no | |
| `logit_bias` | `Dict[str, float]` | no | |
| `user` | `str` | no | |
| `tools` | `List[Dict]` | no | Tool definitions passed through to llama.cpp |
| `tool_choice` | `str` | no | "auto", "none", or specific tool name |
| `response_format` | `Dict` | no | {"type": "json_object"} or {"type": "json_schema", ...} |
| `stream` | `bool` | no | NOT supported. Server rejects with HTTP 400. Always use False. |
| `base_url` | `str` | no | Per-call override (creates temp client internally). |
| `security_tier` | `str` | no | "standard", "high", or "maximum". Invalid values raise ValueError. |
| `api_key` | `str` | no | Per-call override of instance api_key. |
Return value: Dict[str, Any] — OpenAI-compatible response dict (see §6.2).
acreate(model, messages, **kwargs) -> Dict[str, Any]
Async alias for create(). Identical behavior.
4. SecureCompletionClient (low-level)
Constructor
SecureCompletionClient(
router_url: str = "https://api.nomyo.ai",
allow_http: bool = False,
secure_memory: bool = True,
max_retries: int = 2
)
Same semantics as SecureChatCompletion constructor (maps directly to inner client).
Instance attributes
| Attribute | Type | Description |
|---|---|---|
| `router_url` | `str` | Base URL (trailing slash stripped). |
| `private_key` | `rsa.RSAPrivateKey \| None` | Loaded/generated RSA private key. |
| `public_key_pem` | `str \| None` | PEM-encoded public key string. |
| `key_size` | `int` | Always 4096. |
| `allow_http` | `bool` | HTTP allowance flag. |
| `max_retries` | `int` | Retry count. |
| `_use_secure_memory` | `bool` | Whether secure memory ops are active. |
generate_keys(save_to_file: bool = False, key_dir: str = "client_keys", password: Optional[str] = None) -> None
Generates a 4096-bit RSA key pair (public exponent 65537). If save_to_file=True:
- Creates `key_dir/` (mode 755).
- Writes `private_key.pem` with mode `0o600`.
- Writes `public_key.pem` with mode `0o644`.
- If `password` is given, the private key is encrypted with `BestAvailableEncryption`.
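The file-permission behavior above can be sketched as follows, assuming POSIX permission semantics. The helper names `write_key_files` and `file_mode` are hypothetical, and the PEM bytes are placeholders: the real library serializes actual RSA keys through `cryptography`, which this sketch does not do.

```python
import os
import stat
import tempfile


def write_key_files(key_dir: str, private_pem: bytes, public_pem: bytes) -> None:
    """Hypothetical helper: persist PEM blobs with the modes listed above."""
    os.makedirs(key_dir, mode=0o755, exist_ok=True)
    for name, data, mode in (
        ("private_key.pem", private_pem, 0o600),
        ("public_key.pem", public_pem, 0o644),
    ):
        path = os.path.join(key_dir, name)
        # Create the file with its final mode up front so the private key is
        # never briefly world-readable; chmod afterwards overrides the umask.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        os.chmod(path, mode)


def file_mode(path: str) -> int:
    """Return only the permission bits of a file."""
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        key_dir = os.path.join(tmp, "client_keys")
        write_key_files(key_dir, b"placeholder private", b"placeholder public")
        print(oct(file_mode(os.path.join(key_dir, "private_key.pem"))))
```

A port to a language without POSIX modes (or running on Windows) would need an equivalent ACL-based restriction for the private key file.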
load_keys(private_key_path: str, public_key_path: Optional[str] = None, password: Optional[str] = None) -> None
Loads an RSA private key from disk. If public_key_path is omitted, derives the public key from the loaded private key. Validates key size >= 2048 bits.
fetch_server_public_key() -> str (async)
GET {router_url}/pki/public_key
- Returns server PEM public key as string.
- Validates it parses as a valid PEM public key.
- Raises `SecurityError` if URL is not HTTPS and `allow_http=False`.
encrypt_payload(payload: Dict[str, Any]) -> bytes (async)
Encrypts a dict payload using hybrid encryption. Returns raw encrypted bytes (JSON package, serialized to bytes).
Encryption process:
- Serialize payload to JSON → `bytearray`.
- Validate size <= 10 MB.
- Generate 256-bit AES key via `secrets.token_bytes(32)` → `bytearray`.
- If secure memory enabled: lock both payload and AES key in memory.
- Call `_do_encrypt()` (see below).
- Zero/destroy payload and AES key from memory on exit.
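A minimal sketch of the preparation and cleanup steps in this list, using only the standard library. The names `prepare_payload` and `zero` are hypothetical, and raising `ValueError` for an oversized payload is an assumption, since the exception type is not stated here. Memory locking and the actual encryption are left out.

```python
import json
import secrets
from typing import Any, Dict, Tuple

MAX_PAYLOAD_SIZE = 10 * 1024 * 1024  # 10 MB limit from the list above


def zero(buf: bytearray) -> None:
    """Overwrite a mutable buffer in place (Python-level zeroing only)."""
    for i in range(len(buf)):
        buf[i] = 0


def prepare_payload(payload: Dict[str, Any]) -> Tuple[bytearray, bytearray]:
    """Hypothetical helper: serialize, size-check, and create the AES key."""
    payload_bytes = bytearray(json.dumps(payload).encode("utf-8"))
    if len(payload_bytes) > MAX_PAYLOAD_SIZE:
        zero(payload_bytes)  # do not leave the rejected plaintext behind
        # Assumption: the exception type is not specified in this document.
        raise ValueError("payload exceeds 10 MB limit")
    aes_key = bytearray(secrets.token_bytes(32))  # 256-bit, per-request key
    return payload_bytes, aes_key


if __name__ == "__main__":
    data, key = prepare_payload({"model": "Qwen/Qwen3-0.6B", "messages": []})
    try:
        print(len(data), len(key))
    finally:
        # Mirror the last step: wipe both buffers on exit, even on error.
        zero(data)
        zero(key)
```

Mutable `bytearray` buffers matter here because immutable `bytes` objects cannot be overwritten in place; a port should pick the equivalent mutable type in the target language.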
_do_encrypt(payload_bytes: bytes | bytearray, aes_key: bytes | bytearray) -> bytes (async)
Core hybrid encryption routine. This is the wire format constructor.
1. nonce = secrets.token_bytes(12) # 96-bit GCM nonce
2. ciphertext = AES-256-GCM_encrypt(aes_key, nonce, payload_bytes)
3. tag = GCM_tag
4. server_pubkey = await fetch_server_public_key()
5. encrypted_aes_key = RSA-OAEP-SHA256_encrypt(server_pubkey, aes_key_bytes)
6. Build JSON package (see §5)
7. Return json.dumps(package).encode('utf-8')
decrypt_response(encrypted_response: bytes, payload_id: str) -> Dict[str, Any] (async)
Decrypts a server response.
Validation chain:
- Parse JSON.
- Check `version == "1.0"`; raises `ValueError` if mismatch.
- Check `algorithm == "hybrid-aes256-rsa4096"`; raises `ValueError` if mismatch.
- Validate `encrypted_payload` has `ciphertext`, `nonce`, `tag`.
- Require `self.private_key` is not `None`.
- Decrypt AES key: `RSA-OAEP-SHA256_decrypt(private_key, encrypted_aes_key)`.
- Decrypt payload: `AES-256-GCM_decrypt(aes_key, nonce, tag, ciphertext)`.
- Parse decrypted bytes as JSON → response dict.
- Attach `_metadata` if not present (see §6.2).
Any decryption failure (except JSON parse errors) raises SecurityError("Decryption failed: integrity check or authentication failed").
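The pre-decryption part of the validation chain can be sketched with the standard library alone. `validate_envelope` is a hypothetical name. Raising `ValueError` for missing fields and decoding with the standard base64 alphabet are assumptions; the RSA and AES-GCM steps are omitted.

```python
import base64
import json
from typing import Dict

EXPECTED_VERSION = "1.0"
EXPECTED_ALGORITHM = "hybrid-aes256-rsa4096"


def validate_envelope(raw: bytes) -> Dict[str, bytes]:
    """Hypothetical helper: run the envelope checks, then decode the fields."""
    package = json.loads(raw)
    if not isinstance(package, dict):
        raise ValueError("encrypted response must be a JSON object")
    if package.get("version") != EXPECTED_VERSION:
        raise ValueError("unsupported protocol version")
    if package.get("algorithm") != EXPECTED_ALGORITHM:
        raise ValueError("unsupported encryption algorithm")
    enc = package.get("encrypted_payload")
    required = ("ciphertext", "nonce", "tag")
    if not isinstance(enc, dict) or any(k not in enc for k in required):
        # Assumption: the exception type for missing fields is not specified.
        raise ValueError("encrypted_payload is missing required fields")
    if "encrypted_aes_key" not in package:
        raise ValueError("encrypted_aes_key is missing")
    decoded = {k: base64.b64decode(enc[k]) for k in required}
    decoded["encrypted_aes_key"] = base64.b64decode(package["encrypted_aes_key"])
    return decoded
```

Checking `version` and `algorithm` before touching any key material keeps a downgraded or malformed package from ever reaching the decryption calls.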
send_secure_request(payload, payload_id, api_key=None, security_tier=None) -> Dict[str, Any] (async)
Full request lifecycle: encrypt → HTTP POST → retry → decrypt → return.
Request headers:
Content-Type: application/octet-stream
X-Payload-ID: {payload_id}
X-Public-Key: {url_encoded_pem_public_key}
Authorization: Bearer {api_key} (if api_key is provided)
X-Security-Tier: {tier} (if security_tier is provided)
POST to {router_url}/v1/chat/secure_completion with encrypted payload as body.
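A small sketch of how the header set above might be assembled. `build_headers` is a hypothetical name, and using `urllib.parse.quote` with its default safe characters for the PEM is an assumption: this document states only that the key is URL-encoded, so a port should confirm the exact encoding against the server.

```python
from typing import Dict, Optional
from urllib.parse import quote


def build_headers(
    payload_id: str,
    public_key_pem: str,
    api_key: Optional[str] = None,
    security_tier: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: assemble the request headers listed above."""
    headers = {
        "Content-Type": "application/octet-stream",
        "X-Payload-ID": payload_id,
        # Percent-encoding removes the newlines a PEM contains, which are not
        # legal inside an HTTP header value.
        "X-Public-Key": quote(public_key_pem),
    }
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    if security_tier:
        headers["X-Security-Tier"] = security_tier
    return headers
```

The two optional headers are added only when a value is supplied, matching the "(if ... is provided)" notes in the header list.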
Retry logic:
- Retryable status codes: `{429, 500, 502, 503, 504}`.
- Backoff: `2^(attempt-1)` seconds (1s, 2s, 4s…).
- Total attempts: `max_retries + 1`.
- Network errors also retry.
- Non-retryable exceptions propagate immediately.
Status → exception mapping:
| Status | Exception |
|---|---|
| 200 | Return decrypted response dict |
| 400 | InvalidRequestError |
| 401 | AuthenticationError |
| 403 | ForbiddenError |
| 404 | APIError |
| 429 | RateLimitError |
| 500 | ServerError |
| 503 | ServiceUnavailableError |
| 502/504 | APIError (retryable) |
| other | APIError (non-retryable) |
| network error | APIConnectionError |
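The mapping table can be expressed as a lookup with `APIError` as the fallback. The class bodies below are minimal stand-ins that follow the constructor signature and attributes documented in §7; `error_for_status` is a hypothetical helper name, and the default message format is an assumption.

```python
from typing import Dict, Optional, Type


class APIError(Exception):
    """Base error carrying message, status_code and error_details."""

    def __init__(
        self,
        message: str,
        status_code: Optional[int] = None,
        error_details: Optional[Dict] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.status_code = status_code
        self.error_details = error_details


class InvalidRequestError(APIError): pass
class AuthenticationError(APIError): pass
class ForbiddenError(APIError): pass
class RateLimitError(APIError): pass
class ServerError(APIError): pass
class ServiceUnavailableError(APIError): pass


STATUS_TO_EXCEPTION: Dict[int, Type[APIError]] = {
    400: InvalidRequestError,
    401: AuthenticationError,
    403: ForbiddenError,
    429: RateLimitError,
    500: ServerError,
    503: ServiceUnavailableError,
}


def error_for_status(status: int, message: str = "") -> APIError:
    """Hypothetical helper: build the exception for a non-200 status.

    Statuses without a dedicated class (404, 502, 504, others) fall back
    to plain APIError, as in the table above.
    """
    exc_type = STATUS_TO_EXCEPTION.get(status, APIError)
    return exc_type(message or f"HTTP {status}", status_code=status)
```

Because every specific class derives from `APIError`, callers in the target language can catch the base type for generic handling and a subclass where the status matters.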
5. Encryption Wire Format
The encrypted package is a JSON object sent as application/octet-stream:
{
"version": "1.0",
"algorithm": "hybrid-aes256-rsa4096",
"encrypted_payload": {
"ciphertext": "<base64>",
"nonce": "<base64>",
"tag": "<base64>"
},
"encrypted_aes_key": "<base64>",
"key_algorithm": "RSA-OAEP-SHA256",
"payload_algorithm": "AES-256-GCM"
}
| Field | Encoding | Description |
|---|---|---|
| `version` | string | Protocol version. Never change; used for downgrade detection. |
| `algorithm` | string | "hybrid-aes256-rsa4096". Never change; used for downgrade detection. |
| `encrypted_payload.ciphertext` | base64 | AES-256-GCM encrypted payload. |
| `encrypted_payload.nonce` | base64 | 12-byte GCM nonce. |
| `encrypted_payload.tag` | base64 | 16-byte GCM authentication tag. |
| `encrypted_aes_key` | base64 | RSA-OAEP-SHA256 encrypted 32-byte AES key. |
| `key_algorithm` | string | "RSA-OAEP-SHA256" |
| `payload_algorithm` | string | "AES-256-GCM" |
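A sketch of assembling this package from already-encrypted parts. `build_package` is a hypothetical name, and the use of the standard (not URL-safe) base64 alphabet is an assumption, since the table says only "base64". The length checks follow the nonce and tag sizes in the table.

```python
import base64
import json


def _b64(data: bytes) -> str:
    """Encode bytes as standard-alphabet base64 text."""
    return base64.b64encode(data).decode("ascii")


def build_package(
    ciphertext: bytes, nonce: bytes, tag: bytes, encrypted_aes_key: bytes
) -> bytes:
    """Hypothetical helper: serialize the wire-format JSON package to bytes."""
    if len(nonce) != 12:
        raise ValueError("GCM nonce must be 12 bytes")
    if len(tag) != 16:
        raise ValueError("GCM tag must be 16 bytes")
    package = {
        "version": "1.0",
        "algorithm": "hybrid-aes256-rsa4096",
        "encrypted_payload": {
            "ciphertext": _b64(ciphertext),
            "nonce": _b64(nonce),
            "tag": _b64(tag),
        },
        "encrypted_aes_key": _b64(encrypted_aes_key),
        "key_algorithm": "RSA-OAEP-SHA256",
        "payload_algorithm": "AES-256-GCM",
    }
    return json.dumps(package).encode("utf-8")
```

Some AES-GCM APIs return the tag appended to the ciphertext; in that case a port would split off the final 16 bytes before filling the separate `tag` field.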
6. Data Structures
6.1 Encrypted Request Payload (before encryption)
The dict passed to encrypt_payload() has this structure:
{
"model": "Qwen/Qwen3-0.6B",
"messages": [
{"role": "user", "content": "Hello"}
],
"temperature": 0.7,
"...": "any other OpenAI-compatible param"
}
Important: api_key is never included in the encrypted payload. It is sent only as the Authorization: Bearer HTTP header.
6.2 Response Dict (after decryption)
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1234567890,
"model": "Qwen/Qwen3-0.6B",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The capital of France is Paris.",
"tool_calls": [...],
"reasoning_content": "..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 20,
"total_tokens": 30
},
"_metadata": {
"payload_id": "openai-compat-abc123",
"processed_at": 1765250382,
"is_encrypted": true,
"encryption_algorithm": "hybrid-aes256-rsa4096",
"security_tier": "standard",
"memory_protection": {
"platform": "linux",
"memory_locking": true,
"secure_zeroing": true,
"core_dump_prevention": true
},
"cuda_device": {
"available": true,
"device_hash": "sha256_hex"
}
}
}
6.3 Security Tier Values
| Value | Hardware | Use case |
|---|---|---|
| `"standard"` | GPU | General secure inference |
| `"high"` | CPU/GPU | Sensitive business data |
| `"maximum"` | CPU only | PHI, classified data |
Sent as X-Security-Tier HTTP header. Invalid values raise ValueError.
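A minimal sketch of the tier check, assuming the comparison is case-sensitive as noted in the constants table (§9). `tier_header` is a hypothetical name.

```python
from typing import Dict, Optional

VALID_SECURITY_TIERS = ("standard", "high", "maximum")


def tier_header(security_tier: Optional[str]) -> Dict[str, str]:
    """Hypothetical helper: validate the tier and return the header to add."""
    if security_tier is None:
        return {}  # header is omitted when no tier is requested
    if security_tier not in VALID_SECURITY_TIERS:
        raise ValueError(
            f"invalid security_tier {security_tier!r}; "
            f"expected one of {VALID_SECURITY_TIERS}"
        )
    return {"X-Security-Tier": security_tier}
```

Validating on the client before the request is built means an invalid tier fails fast, without an encryption pass or a network round trip.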
7. Error Class Hierarchy
All errors are exceptions. APIError subclasses carry status_code and error_details.
Exception
└── APIError (base, has message/status_code/error_details)
├── AuthenticationError (status_code=401)
├── InvalidRequestError (status_code=400)
├── RateLimitError (status_code=429)
├── ForbiddenError (status_code=403)
├── ServerError (status_code=500)
└── ServiceUnavailableError (status_code=503)
Exception
└── SecurityError (crypto/key failure, no status_code)
Exception
└── APIConnectionError (network failure, no status_code)
APIError constructor:
APIError(message: str, status_code: Optional[int] = None, error_details: Optional[Dict] = None)
8. SecureMemory Module
Optional, platform-specific. Fails gracefully if unavailable (e.g. Windows on some Python builds).
SecureBuffer class
Wraps a bytearray with memory locking and guaranteed zeroing on exit.
| Attribute/Method | Type | Description |
|---|---|---|
| `data` | `bytearray` | Underlying mutable buffer |
| `address` | `int` | Memory address (via ctypes) |
| `size` | `int` | Buffer size in bytes |
| `lock() -> bool` | method | Attempt memory lock |
| `unlock() -> bool` | method | Unlock memory |
| `zero()` | method | Securely zero contents |
| `__enter__` / `__exit__` | context mgr | Auto-lock on enter, auto-zero+unlock on exit |
secure_bytearray(data: bytes | bytearray, lock: bool = True) -> SecureBuffer (context manager)
Recommended secure handling. Converts input to bytearray, locks (best-effort), yields SecureBuffer. Always zeros on exit, even on exception.
secure_bytes(data: bytes, lock: bool = True) -> SecureBuffer (context manager, deprecated)
Same as secure_bytearray but accepts immutable bytes. Emits deprecation warning. Original bytes cannot be zeroed.
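The zero-on-exit guarantee of these context managers can be illustrated with a simplified stand-in. `zeroing_buffer` is a hypothetical name: it yields a plain `bytearray` rather than a `SecureBuffer`, performs no memory locking, and zeroes at the Python level only.

```python
from contextlib import contextmanager
from typing import Iterator, Union


@contextmanager
def zeroing_buffer(data: Union[bytes, bytearray]) -> Iterator[bytearray]:
    """Hypothetical stand-in: copy into a bytearray, wipe the copy on exit."""
    buf = bytearray(data)  # mutable copy; an immutable input is left untouched
    try:
        yield buf
    finally:
        # Runs on normal exit and when the body raises.
        for i in range(len(buf)):
            buf[i] = 0


if __name__ == "__main__":
    with zeroing_buffer(b"secret") as secret:
        kept = secret  # keep a reference to inspect after the block
    print(kept)  # every byte is now zero
```

The copy step is why passing immutable `bytes` is discouraged: the copy is wiped, but the caller's original object stays in memory until it is garbage collected.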
get_memory_protection_info() -> Dict[str, Any]
Returns protection capabilities:
{
"enabled": true,
"platform": "linux",
"protection_level": "full",
"has_memory_locking": true,
"has_secure_zeroing": true,
"supports_full_protection": true,
"page_size": 4096
}
protection_level values: "full", "zeroing_only", "none".
disable_secure_memory() / enable_secure_memory()
Globally disable/re-enable secure memory operations.
9. Constants
| Constant | Value | Location | Notes |
|---|---|---|---|
| Protocol version | `"1.0"` | `SecureCompletionClient.py` | Never change; downgrade detection |
| Algorithm string | `"hybrid-aes256-rsa4096"` | `SecureCompletionClient.py` | Never change; downgrade detection |
| RSA key size | `4096` | `SecureCompletionClient.py` | Fixed |
| RSA public exponent | `65537` | `SecureCompletionClient.py` | Fixed |
| AES key size | `32 bytes (256-bit)` | `SecureCompletionClient.py` | Per-request ephemeral |
| GCM nonce size | `12 bytes (96-bit)` | `SecureCompletionClient.py` | Per-request via secrets.token_bytes |
| Max payload size | `10 * 1024 * 1024 (10 MB)` | `SecureCompletionClient.py` | DoS protection |
| Default max retries | `2` | Both client classes | Exponential backoff: 1s, 2s, 4s… |
| Private key file mode | `0o600` | `SecureCompletionClient.py` | Owner read/write only |
| Public key file mode | `0o644` | `SecureCompletionClient.py` | Owner rw, group/others r |
| Min RSA key size (validation) | `2048` | `SecureCompletionClient.py` | `_validate_rsa_key` |
| Valid security tiers | `["standard", "high", "maximum"]` | `SecureCompletionClient.py` | Case-sensitive |
| Retryable status codes | `{429, 500, 502, 503, 504}` | `SecureCompletionClient.py` | |
| Package version | `"0.2.7"` | `pyproject.toml` + `__init__.py` | Bump both |
10. Endpoint URLs
| Endpoint | Method | Purpose |
|---|---|---|
| `{router_url}/pki/public_key` | GET | Fetch server RSA public key |
| `{router_url}/v1/chat/secure_completion` | POST | Encrypted chat completion |
11. Key Lifecycle
- First `create()` call → `_ensure_keys()` runs (async, double-checked locking via `asyncio.Lock`).
- If `key_dir` is set:
  - Try `load_keys()` from `{key_dir}/private_key.pem` + `{key_dir}/public_key.pem`.
  - If that fails → `generate_keys(save_to_file=True, key_dir=key_dir)`.
- If `key_dir` is `None` → `generate_keys()` (ephemeral, in-memory only).
- Keys are reused across all subsequent calls until the client is discarded.
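The double-checked locking pattern named in the first item can be sketched as below. `KeyHolder` and its members are hypothetical; `_generate` is a placeholder for the real load-or-generate logic and returns a dummy string instead of an RSA key pair.

```python
import asyncio
from typing import List, Optional


class KeyHolder:
    """Hypothetical stand-in for the client's lazy key initialization."""

    def __init__(self) -> None:
        self._keys: Optional[str] = None
        self._lock = asyncio.Lock()
        self.generations = 0  # counts how often key generation actually ran

    async def _generate(self) -> str:
        await asyncio.sleep(0)  # yield control, like slow key generation would
        self.generations += 1
        return "keypair"

    async def ensure_keys(self) -> str:
        if self._keys is None:  # first check: cheap, no lock taken
            async with self._lock:
                if self._keys is None:  # second check: another task may have won
                    self._keys = await self._generate()
        return self._keys


async def demo() -> List[str]:
    holder = KeyHolder()
    # Five concurrent first calls should trigger exactly one generation.
    return await asyncio.gather(*(holder.ensure_keys() for _ in range(5)))


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The second check under the lock is what prevents concurrent first calls from each generating their own key pair; later calls skip the lock entirely.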
12. HTTP Client Details
- Uses `httpx.AsyncClient` with `timeout=60.0`.
- SSL verification enabled for HTTPS URLs; disabled for `http://`.
- Request body is raw bytes (not JSON): `Content-Type: application/octet-stream`.
- Public key is URL-encoded in the `X-Public-Key` header.
13. Memory Protection Platform Matrix
| Platform | Locking | Zeroing |
|---|---|---|
| Linux | `mlock()` via `libc.so.6` | `memset()` via `libc.so.6` |
| Windows | `VirtualLock()` via `kernel32` | `RtlZeroMemory()` via `ntdll` + Python-level fallback |
| macOS | `mlock()` via `libc.dylib` | `memset()` via `libc.dylib` |
| Other | No lock | Python-level byte-by-byte zeroing |
mlock may fail with EPERM (need CAP_IPC_LOCK or ulimit -l increase) — degrades to zeroing-only gracefully.