nomyo4J/TRANSLATION_REFERENCE.md
2026-04-21 17:24:11 +02:00


NOMYO Python Client — Translation Reference

Target: Port this library to another language. Every class, method, signature, constant, wire format, and error mapping is documented below.


1. Package Layout

| File (relative to package root) | Purpose |
| --- | --- |
| `nomyo/__init__.py` | Public exports, version string |
| `nomyo/nomyo.py` | `SecureChatCompletion` — OpenAI-compatible entrypoint |
| `nomyo/SecureCompletionClient.py` | Key mgmt, hybrid encryption, HTTP roundtrip, retries |
| `nomyo/SecureMemory.py` | Cross-platform memory locking + secure zeroing (optional, platform-specific) |

Python version: >= 3.10

Build: hatchling (pyproject.toml)

Dependencies: anyio, certifi, cffi, cryptography, exceptiongroup, h11, httpcore, httpx, idna, pycparser, typing_extensions


2. Public API Surface (__all__)

| Export | Type | Source file |
| --- | --- | --- |
| `SecureChatCompletion` | class | `nomyo.py` |
| `SecurityError` | exception | `SecureCompletionClient.py` |
| `APIError` | exception (base) | `SecureCompletionClient.py` |
| `AuthenticationError` | exception (401) | `SecureCompletionClient.py` |
| `InvalidRequestError` | exception (400) | `SecureCompletionClient.py` |
| `APIConnectionError` | exception (network) | `SecureCompletionClient.py` |
| `ForbiddenError` | exception (403) | `SecureCompletionClient.py` |
| `RateLimitError` | exception (429) | `SecureCompletionClient.py` |
| `ServerError` | exception (500) | `SecureCompletionClient.py` |
| `ServiceUnavailableError` | exception (503) | `SecureCompletionClient.py` |
| `get_memory_protection_info` | function | `SecureMemory.py` |
| `disable_secure_memory` | function | `SecureMemory.py` |
| `enable_secure_memory` | function | `SecureMemory.py` |
| `secure_bytearray` | context manager | `SecureMemory.py` |
| `secure_bytes` | context manager (deprecated) | `SecureMemory.py` |
| `SecureBuffer` | class | `SecureMemory.py` |

3. SecureChatCompletion (entrypoint)

Constructor

SecureChatCompletion(
    base_url: str = "https://api.nomyo.ai",
    allow_http: bool = False,
    api_key: Optional[str] = None,
    secure_memory: bool = True,
    key_dir: Optional[str] = None,
    max_retries: int = 2
)

| Param | Default | Description |
| --- | --- | --- |
| `base_url` | `"https://api.nomyo.ai"` | NOMYO Router base URL. HTTPS enforced unless `allow_http=True`. |
| `allow_http` | `False` | Permit `http://` URLs (dev only). |
| `api_key` | `None` | Bearer token for auth. Can also be passed per-call via `create()`. |
| `secure_memory` | `True` | Enable memory locking/zeroing. Warns if unavailable. |
| `key_dir` | `None` | Directory to persist RSA keys. `None` = ephemeral (in-memory only). |
| `max_retries` | `2` | Retries on 429/500/502/503/504 + network errors. Exponential backoff: 1s, 2s, 4s… |

create(model, messages, **kwargs) -> Dict[str, Any]

Async method. Returns a dict (not an object). Same signature as openai.ChatCompletion.create().

| Param | Type | Required | Description |
| --- | --- | --- | --- |
| `model` | `str` | yes | Model identifier, e.g. `"Qwen/Qwen3-0.6B"` |
| `messages` | `List[Dict]` | yes | OpenAI-format messages: `[{"role": "user", "content": "..."}]` |
| `temperature` | `float` | no | 0 to 2 |
| `max_tokens` | `int` | no | |
| `top_p` | `float` | no | |
| `stop` | `str \| List[str]` | no | |
| `presence_penalty` | `float` | no | -2.0 to 2.0 |
| `frequency_penalty` | `float` | no | -2.0 to 2.0 |
| `n` | `int` | no | Number of completions |
| `best_of` | `int` | no | |
| `seed` | `int` | no | |
| `logit_bias` | `Dict[str, float]` | no | |
| `user` | `str` | no | |
| `tools` | `List[Dict]` | no | Tool definitions passed through to llama.cpp |
| `tool_choice` | `str` | no | `"auto"`, `"none"`, or specific tool name |
| `response_format` | `Dict` | no | `{"type": "json_object"}` or `{"type": "json_schema", ...}` |
| `stream` | `bool` | no | NOT supported. Server rejects with HTTP 400. Always use `False`. |
| `base_url` | `str` | no | Per-call override (creates temp client internally). |
| `security_tier` | `str` | no | `"standard"`, `"high"`, or `"maximum"`. Invalid values raise `ValueError`. |
| `api_key` | `str` | no | Per-call override of instance `api_key`. |

Return value: Dict[str, Any] — OpenAI-compatible response dict (see §6.2).

acreate(model, messages, **kwargs) -> Dict[str, Any]

Async alias for create(). Identical behavior.


4. SecureCompletionClient (low-level)

Constructor

SecureCompletionClient(
    router_url: str = "https://api.nomyo.ai",
    allow_http: bool = False,
    secure_memory: bool = True,
    max_retries: int = 2
)

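Same semantics as the SecureChatCompletion constructor, which passes these values straight to its inner client. Here router_url takes the place of base_url, and api_key and key_dir are not constructor parameters.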

Instance attributes

| Attribute | Type | Description |
| --- | --- | --- |
| `router_url` | `str` | Base URL (trailing slash stripped). |
| `private_key` | `rsa.RSAPrivateKey \| None` | Loaded/generated RSA private key. |
| `public_key_pem` | `str \| None` | PEM-encoded public key string. |
| `key_size` | `int` | Always 4096. |
| `allow_http` | `bool` | HTTP allowance flag. |
| `max_retries` | `int` | Retry count. |
| `_use_secure_memory` | `bool` | Whether secure memory ops are active. |

generate_keys(save_to_file: bool = False, key_dir: str = "client_keys", password: Optional[str] = None) -> None

Generates a 4096-bit RSA key pair (public exponent 65537). If save_to_file=True:

  • Creates key_dir/ (mode 0o755).
  • Writes private_key.pem with mode 0o600.
  • Writes public_key.pem with mode 0o644.
  • If password is given, private key is encrypted with BestAvailableEncryption.
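
The file layout and permissions above can be sketched with the standard library alone. This is a minimal sketch, not the library's code: `write_key_files` and `_write_with_mode` are hypothetical helper names, the PEM bytes are assumed to have been produced elsewhere, and password-based encryption of the private key is left out.

```python
import os


def _write_with_mode(path: str, data: bytes, mode: int) -> None:
    """Create (or truncate) a file and make sure it ends up with the given mode."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
    with os.fdopen(fd, "wb") as handle:  # fdopen takes ownership of the descriptor
        handle.write(data)
    # The mode passed to os.open is ignored for pre-existing files, so set it explicitly.
    os.chmod(path, mode)


def write_key_files(key_dir: str, private_pem: bytes, public_pem: bytes) -> None:
    """Persist a key pair as private_key.pem (0o600) and public_key.pem (0o644)."""
    os.makedirs(key_dir, mode=0o755, exist_ok=True)
    _write_with_mode(os.path.join(key_dir, "private_key.pem"), private_pem, 0o600)
    _write_with_mode(os.path.join(key_dir, "public_key.pem"), public_pem, 0o644)
```

On platforms without POSIX permission bits (for example Windows) the modes are only partially honored, so a port may need a platform-specific equivalent.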

load_keys(private_key_path: str, public_key_path: Optional[str] = None, password: Optional[str] = None) -> None

Loads an RSA private key from disk. If public_key_path is omitted, derives the public key from the loaded private key. Validates key size >= 2048 bits.

fetch_server_public_key() -> str (async)

GET {router_url}/pki/public_key

  • Returns server PEM public key as string.
  • Validates it parses as a valid PEM public key.
  • Raises SecurityError if URL is not HTTPS and allow_http=False.
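
The HTTPS check can be expressed independently of the HTTP call itself. The following is a minimal sketch under assumptions: `public_key_url` is a hypothetical helper, and the local `SecurityError` class only stands in for the library's exception.

```python
from urllib.parse import urlparse


class SecurityError(Exception):
    """Stand-in for the library's SecurityError."""


def public_key_url(router_url: str, allow_http: bool = False) -> str:
    """Return the PKI endpoint URL, refusing plain HTTP unless explicitly allowed."""
    base = router_url.rstrip("/")  # router_url is stored without a trailing slash
    scheme = urlparse(base).scheme.lower()
    if scheme != "https" and not (allow_http and scheme == "http"):
        raise SecurityError(f"Refusing non-HTTPS router URL: {base}")
    return f"{base}/pki/public_key"
```

A port would follow this with the GET request and a PEM parse of the response body before trusting the key.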

encrypt_payload(payload: Dict[str, Any]) -> bytes (async)

Encrypts a dict payload using hybrid encryption. Returns raw encrypted bytes (JSON package, serialized to bytes).

Encryption process:

  1. Serialize payload to JSON → bytearray.
  2. Validate size <= 10 MB.
  3. Generate 256-bit AES key via secrets.token_bytes(32) → bytearray.
  4. If secure memory enabled: lock both payload and AES key in memory.
  5. Call _do_encrypt() (see below).
  6. Zero/destroy payload and AES key from memory on exit.
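
Steps 1 to 3 and the final zeroing can be sketched with the standard library. This is an illustration only: `prepare_payload` and `zero` are hypothetical names, memory locking (step 4) is omitted, and raising `ValueError` on an oversized payload is an assumption.

```python
import json
import secrets
from typing import Any, Dict, Tuple

MAX_PAYLOAD_SIZE = 10 * 1024 * 1024  # 10 MB limit


def zero(buf: bytearray) -> None:
    """Overwrite a mutable buffer in place with zero bytes."""
    buf[:] = bytes(len(buf))


def prepare_payload(payload: Dict[str, Any]) -> Tuple[bytearray, bytearray]:
    """Serialize the payload and create a fresh 256-bit AES key, both as bytearrays."""
    payload_bytes = bytearray(json.dumps(payload).encode("utf-8"))
    if len(payload_bytes) > MAX_PAYLOAD_SIZE:
        zero(payload_bytes)  # do not leave the rejected plaintext behind
        raise ValueError("Payload exceeds the 10 MB limit")
    aes_key = bytearray(secrets.token_bytes(32))
    return payload_bytes, aes_key
```

Mutable bytearrays are used because immutable `bytes` objects cannot be wiped after use; the caller is expected to call `zero()` on both buffers in a `finally` block.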

_do_encrypt(payload_bytes: bytes | bytearray, aes_key: bytes | bytearray) -> bytes (async)

Core hybrid encryption routine. This is the wire format constructor.

1. nonce = secrets.token_bytes(12)           # 96-bit GCM nonce
2. ciphertext = AES-256-GCM_encrypt(aes_key, nonce, payload_bytes)
3. tag = GCM_tag
4. server_pubkey = await fetch_server_public_key()
5. encrypted_aes_key = RSA-OAEP-SHA256_encrypt(server_pubkey, aes_key_bytes)
6. Build JSON package (see §5)
7. Return json.dumps(package).encode('utf-8')
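
The routine above can be sketched with the `cryptography` package, which is already a dependency. Several details are assumptions rather than confirmed facts about the library: `hybrid_encrypt` is a hypothetical name, the server key is passed in instead of fetched, MGF1 is assumed to use SHA-256 with no OAEP label, and the high-level `AESGCM` helper is used, which returns ciphertext and tag concatenated.

```python
import base64
import json
import secrets

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Assumption: OAEP with SHA-256 for both the hash and MGF1, and no label.
OAEP_SHA256 = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def _b64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")


def hybrid_encrypt(payload_bytes: bytes, aes_key: bytes,
                   server_pubkey: rsa.RSAPublicKey) -> bytes:
    """Encrypt payload_bytes with AES-256-GCM and wrap aes_key with RSA-OAEP."""
    nonce = secrets.token_bytes(12)  # 96-bit GCM nonce, fresh per request
    # bytes(...) copies; a hardened port would avoid extra plaintext copies.
    sealed = AESGCM(bytes(aes_key)).encrypt(nonce, bytes(payload_bytes), None)
    ciphertext, tag = sealed[:-16], sealed[-16:]  # AESGCM appends the 16-byte tag
    encrypted_aes_key = server_pubkey.encrypt(bytes(aes_key), OAEP_SHA256)
    package = {
        "version": "1.0",
        "algorithm": "hybrid-aes256-rsa4096",
        "encrypted_payload": {
            "ciphertext": _b64(ciphertext),
            "nonce": _b64(nonce),
            "tag": _b64(tag),
        },
        "encrypted_aes_key": _b64(encrypted_aes_key),
        "key_algorithm": "RSA-OAEP-SHA256",
        "payload_algorithm": "AES-256-GCM",
    }
    return json.dumps(package).encode("utf-8")
```

Ports to languages whose GCM API returns the tag separately can skip the slicing step; what matters for interoperability is that `ciphertext`, `nonce`, and `tag` are base64-encoded as three distinct fields.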

decrypt_response(encrypted_response: bytes, payload_id: str) -> Dict[str, Any] (async)

Decrypts a server response.

Validation chain:

  1. Parse JSON.
  2. Check version == "1.0" — raises ValueError if mismatch.
  3. Check algorithm == "hybrid-aes256-rsa4096" — raises ValueError if mismatch.
  4. Validate encrypted_payload has ciphertext, nonce, tag.
  5. Require self.private_key is not None.
  6. Decrypt AES key: RSA-OAEP-SHA256_decrypt(private_key, encrypted_aes_key).
  7. Decrypt payload: AES-256-GCM_decrypt(aes_key, nonce, tag, ciphertext).
  8. Parse decrypted bytes as JSON → response dict.
  9. Attach _metadata if not present (see §6.2).

Any decryption failure (except JSON parse errors) raises SecurityError("Decryption failed: integrity check or authentication failed").
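
Steps 1 to 4 of the validation chain need no cryptography and can be sketched on their own. `parse_encrypted_package` is a hypothetical helper, and raising `ValueError` for a malformed or incomplete package is an assumption; the source only specifies `ValueError` for version and algorithm mismatches.

```python
import base64
import json
from typing import Dict

PROTOCOL_VERSION = "1.0"
ALGORITHM = "hybrid-aes256-rsa4096"


def parse_encrypted_package(raw: bytes) -> Dict[str, bytes]:
    """Validate the envelope and return its base64-decoded binary fields."""
    package = json.loads(raw)
    if not isinstance(package, dict):
        raise ValueError("Encrypted package must be a JSON object")
    if package.get("version") != PROTOCOL_VERSION:
        raise ValueError(f"Unsupported protocol version: {package.get('version')!r}")
    if package.get("algorithm") != ALGORITHM:
        raise ValueError(f"Unsupported algorithm: {package.get('algorithm')!r}")
    enc = package.get("encrypted_payload")
    if not isinstance(enc, dict):
        raise ValueError("Missing encrypted_payload")
    missing = [name for name in ("ciphertext", "nonce", "tag") if name not in enc]
    if missing or "encrypted_aes_key" not in package:
        raise ValueError(f"Incomplete encrypted package, missing: {missing}")
    return {
        "ciphertext": base64.b64decode(enc["ciphertext"]),
        "nonce": base64.b64decode(enc["nonce"]),
        "tag": base64.b64decode(enc["tag"]),
        "encrypted_aes_key": base64.b64decode(package["encrypted_aes_key"]),
    }
```

Rejecting unknown versions and algorithms before touching any key material is what makes the two strings usable for downgrade detection.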

send_secure_request(payload, payload_id, api_key=None, security_tier=None) -> Dict[str, Any] (async)

Full request lifecycle: encrypt → HTTP POST → retry → decrypt → return.

Request headers:

Content-Type: application/octet-stream
X-Payload-ID: {payload_id}
X-Public-Key: {url_encoded_pem_public_key}
Authorization: Bearer {api_key}            (if api_key is provided)
X-Security-Tier: {tier}                    (if security_tier is provided)

POST to {router_url}/v1/chat/secure_completion with encrypted payload as body.
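
Header construction can be sketched as follows. `build_request_headers` is a hypothetical helper, and the use of `urllib.parse.quote` with its default safe characters is an assumption; the source only states that the PEM public key is URL-encoded.

```python
from typing import Dict, Optional
from urllib.parse import quote


def build_request_headers(
    payload_id: str,
    public_key_pem: str,
    api_key: Optional[str] = None,
    security_tier: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble the headers for POST /v1/chat/secure_completion."""
    headers = {
        "Content-Type": "application/octet-stream",
        "X-Payload-ID": payload_id,
        # URL-encoding removes the newlines that a raw PEM string would contain.
        "X-Public-Key": quote(public_key_pem),
    }
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    if security_tier:
        headers["X-Security-Tier"] = security_tier
    return headers
```

The optional headers are omitted entirely when their value is not provided, rather than sent empty.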

Retry logic:

  • Retryable status codes: {429, 500, 502, 503, 504}.
  • Backoff: 2^(attempt-1) seconds (1s, 2s, 4s…).
  • Total attempts: max_retries + 1.
  • Network errors also retry.
  • Non-retryable exceptions propagate immediately.
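
The retry rules can be sketched as a small loop. This is an illustration under assumptions: `post_with_retries` and `backoff_delays` are hypothetical names, `send` stands for one HTTP attempt returning `(status, body)`, and the built-in `ConnectionError` stands in for whatever network exception the HTTP client raises.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple

RETRYABLE_STATUS_CODES = {429, 500, 502, 503, 504}


def backoff_delays(max_retries: int) -> List[int]:
    """Delays before each retry: 2^(attempt-1) seconds for attempt = 1..max_retries."""
    return [2 ** (attempt - 1) for attempt in range(1, max_retries + 1)]


async def post_with_retries(
    send: Callable[[], Awaitable[Tuple[int, bytes]]],
    max_retries: int = 2,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Tuple[int, bytes]:
    """Run send() up to max_retries + 1 times, backing off between attempts."""
    for attempt in range(max_retries + 1):  # attempt is 0-based here
        try:
            status, body = await send()
        except ConnectionError:
            if attempt == max_retries:
                raise  # out of attempts: surface the network error
        else:
            if status not in RETRYABLE_STATUS_CODES or attempt == max_retries:
                return status, body  # success, non-retryable status, or last attempt
        await sleep(2 ** attempt)  # 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

With the default `max_retries=2` this makes at most three attempts and waits 1 s and then 2 s between them. The injectable `sleep` exists only to make the loop testable without real delays.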

Status → exception mapping:

| Status | Exception |
| --- | --- |
| 200 | Return decrypted response dict |
| 400 | `InvalidRequestError` |
| 401 | `AuthenticationError` |
| 403 | `ForbiddenError` |
| 404 | `APIError` |
| 429 | `RateLimitError` |
| 500 | `ServerError` |
| 503 | `ServiceUnavailableError` |
| 502/504 | `APIError` (retryable) |
| other | `APIError` (non-retryable) |
| network error | `APIConnectionError` |
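
For a port, the mapping is easiest to keep as data. The sketch below returns exception names rather than classes so that it stays self-contained; `classify_status` is a hypothetical helper.

```python
from typing import Optional, Tuple

STATUS_TO_EXCEPTION = {
    400: "InvalidRequestError",
    401: "AuthenticationError",
    403: "ForbiddenError",
    404: "APIError",
    429: "RateLimitError",
    500: "ServerError",
    503: "ServiceUnavailableError",
}
RETRYABLE_STATUS_CODES = {429, 500, 502, 503, 504}


def classify_status(status: int) -> Tuple[Optional[str], bool]:
    """Return (exception name or None for success, whether the status is retryable)."""
    if status == 200:
        return None, False
    # 502, 504 and any unlisted status fall back to the APIError base class.
    name = STATUS_TO_EXCEPTION.get(status, "APIError")
    return name, status in RETRYABLE_STATUS_CODES
```

For example, `classify_status(502)` yields `("APIError", True)` while `classify_status(404)` yields `("APIError", False)`.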

5. Encryption Wire Format

The encrypted package is a JSON object sent as application/octet-stream:

{
  "version": "1.0",
  "algorithm": "hybrid-aes256-rsa4096",
  "encrypted_payload": {
    "ciphertext": "<base64>",
    "nonce": "<base64>",
    "tag": "<base64>"
  },
  "encrypted_aes_key": "<base64>",
  "key_algorithm": "RSA-OAEP-SHA256",
  "payload_algorithm": "AES-256-GCM"
}

| Field | Encoding | Description |
| --- | --- | --- |
| `version` | string | Protocol version. Never change — used for downgrade detection. |
| `algorithm` | string | `"hybrid-aes256-rsa4096"`. Never change — used for downgrade detection. |
| `encrypted_payload.ciphertext` | base64 | AES-256-GCM encrypted payload. |
| `encrypted_payload.nonce` | base64 | 12-byte GCM nonce. |
| `encrypted_payload.tag` | base64 | 16-byte GCM authentication tag. |
| `encrypted_aes_key` | base64 | RSA-OAEP-SHA256 encrypted 32-byte AES key. |
| `key_algorithm` | string | `"RSA-OAEP-SHA256"` |
| `payload_algorithm` | string | `"AES-256-GCM"` |

6. Data Structures

6.1 Encrypted Request Payload (before encryption)

The dict passed to encrypt_payload() has this structure:

{
  "model": "Qwen/Qwen3-0.6B",
  "messages": [
    {"role": "user", "content": "Hello"}
  ],
  "temperature": 0.7,
  "...": "any other OpenAI-compatible param"
}

Important: api_key is never included in the encrypted payload. It is sent only as the Authorization: Bearer HTTP header.

6.2 Response Dict (after decryption)

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "Qwen/Qwen3-0.6B",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris.",
        "tool_calls": [...],
        "reasoning_content": "..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  },
  "_metadata": {
    "payload_id": "openai-compat-abc123",
    "processed_at": 1765250382,
    "is_encrypted": true,
    "encryption_algorithm": "hybrid-aes256-rsa4096",
    "security_tier": "standard",
    "memory_protection": {
      "platform": "linux",
      "memory_locking": true,
      "secure_zeroing": true,
      "core_dump_prevention": true
    },
    "cuda_device": {
      "available": true,
      "device_hash": "sha256_hex"
    }
  }
}
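
Because the response is a plain dict rather than an object, callers index into it directly. A minimal sketch of defensive access, using hypothetical helper names `first_choice_text` and `security_summary`:

```python
from typing import Any, Dict, Optional


def first_choice_text(response: Dict[str, Any]) -> Optional[str]:
    """Return the assistant text of the first choice, or None if absent."""
    choices = response.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message") or {}
    return message.get("content")


def security_summary(response: Dict[str, Any]) -> Dict[str, Any]:
    """Pull a few fields out of _metadata, tolerating missing keys."""
    meta = response.get("_metadata") or {}
    protection = meta.get("memory_protection") or {}
    return {
        "payload_id": meta.get("payload_id"),
        "security_tier": meta.get("security_tier"),
        "memory_locking": protection.get("memory_locking"),
    }
```

In a statically typed target language the same shape would normally become a set of structs with optional fields for `tool_calls`, `reasoning_content`, and the `_metadata` sub-objects.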

6.3 Security Tier Values

| Value | Hardware | Use case |
| --- | --- | --- |
| `"standard"` | GPU | General secure inference |
| `"high"` | CPU/GPU | Sensitive business data |
| `"maximum"` | CPU only | PHI, classified data |

Sent as X-Security-Tier HTTP header. Invalid values raise ValueError.
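
Tier validation is a case-sensitive membership check. A minimal sketch, with `validate_security_tier` as a hypothetical helper name:

```python
from typing import Optional

VALID_SECURITY_TIERS = ("standard", "high", "maximum")


def validate_security_tier(tier: Optional[str]) -> Optional[str]:
    """Return the tier unchanged if valid; None means the header is not sent."""
    if tier is None:
        return None
    if tier not in VALID_SECURITY_TIERS:  # comparison is case-sensitive
        raise ValueError(
            f"Invalid security_tier {tier!r}; expected one of {VALID_SECURITY_TIERS}"
        )
    return tier
```

Validating on the client before encryption avoids spending an RSA operation and a round trip on a request that would be rejected anyway.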


7. Error Class Hierarchy

All errors are exceptions. APIError subclasses carry status_code and error_details.

Exception
└── APIError                     (base, has message/status_code/error_details)
    ├── AuthenticationError      (status_code=401)
    ├── InvalidRequestError      (status_code=400)
    ├── RateLimitError           (status_code=429)
    ├── ForbiddenError           (status_code=403)
    ├── ServerError              (status_code=500)
    └── ServiceUnavailableError  (status_code=503)

Exception
└── SecurityError                (crypto/key failure, no status_code)

Exception
└── APIConnectionError           (network failure, no status_code)

APIError constructor:

APIError(message: str, status_code: Optional[int] = None, error_details: Optional[Dict] = None)
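
The hierarchy can be mirrored with plain class definitions. This is a sketch, not the library's source: the `default_status_code` class attribute is a hypothetical device for giving each subclass its fixed status code, and the real subclass constructors may differ.

```python
from typing import Any, Dict, Optional


class APIError(Exception):
    """Base class for errors reported by the API."""

    default_status_code: Optional[int] = None  # hypothetical, see lead-in

    def __init__(self, message: str, status_code: Optional[int] = None,
                 error_details: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(message)
        self.message = message
        self.status_code = (
            status_code if status_code is not None else self.default_status_code
        )
        self.error_details = error_details


class AuthenticationError(APIError):
    default_status_code = 401


class InvalidRequestError(APIError):
    default_status_code = 400


class RateLimitError(APIError):
    default_status_code = 429


class ForbiddenError(APIError):
    default_status_code = 403


class ServerError(APIError):
    default_status_code = 500


class ServiceUnavailableError(APIError):
    default_status_code = 503


class SecurityError(Exception):
    """Crypto/key failure; deliberately not an APIError and has no status_code."""


class APIConnectionError(Exception):
    """Network failure; deliberately not an APIError and has no status_code."""
```

Keeping `SecurityError` and `APIConnectionError` outside the `APIError` tree means a caller catching `APIError` never swallows a cryptographic or transport failure by accident.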

8. SecureMemory Module

Optional, platform-specific. Fails gracefully if unavailable (e.g. on some Python builds for Windows).

SecureBuffer class

Wraps a bytearray with memory locking and guaranteed zeroing on exit.

| Attribute/Method | Type | Description |
| --- | --- | --- |
| `data` | `bytearray` | Underlying mutable buffer |
| `address` | `int` | Memory address (via ctypes) |
| `size` | `int` | Buffer size in bytes |
| `lock() -> bool` | method | Attempt memory lock |
| `unlock() -> bool` | method | Unlock memory |
| `zero()` | method | Securely zero contents |
| `__enter__` / `__exit__` | context mgr | Auto-lock on enter, auto-zero+unlock on exit |

secure_bytearray(data: bytes | bytearray, lock: bool = True) -> SecureBuffer (context manager)

Recommended secure handling. Converts input to bytearray, locks (best-effort), yields SecureBuffer. Always zeros on exit, even on exception.
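
The zero-on-exit guarantee can be illustrated without any platform calls. The sketch below covers only the zeroing half: `zeroing_bytearray` is a hypothetical name, it yields a bare `bytearray` instead of a `SecureBuffer`, it performs no memory locking, and reusing a caller-supplied bytearray without copying is an assumption.

```python
from contextlib import contextmanager
from typing import Iterator, Union


@contextmanager
def zeroing_bytearray(data: Union[bytes, bytearray]) -> Iterator[bytearray]:
    """Yield a mutable buffer and overwrite it with zeros on exit."""
    # A bytearray argument is used as-is so the caller's own buffer is wiped too;
    # immutable bytes must be copied, and the original copy cannot be zeroed.
    buf = data if isinstance(data, bytearray) else bytearray(data)
    try:
        yield buf
    finally:
        for i in range(len(buf)):  # runs on normal exit and on exceptions alike
            buf[i] = 0
```

Pure-Python zeroing like this corresponds to the "Python-level" fallback; a full port would add the platform locking and zeroing primitives on top.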

secure_bytes(data: bytes, lock: bool = True) -> SecureBuffer (context manager, deprecated)

Same as secure_bytearray but accepts immutable bytes. Emits deprecation warning. Original bytes cannot be zeroed.

get_memory_protection_info() -> Dict[str, Any]

Returns protection capabilities:

{
  "enabled": true,
  "platform": "linux",
  "protection_level": "full",
  "has_memory_locking": true,
  "has_secure_zeroing": true,
  "supports_full_protection": true,
  "page_size": 4096
}

protection_level values: "full", "zeroing_only", "none".

disable_secure_memory() / enable_secure_memory()

Globally disable/re-enable secure memory operations.


9. Constants

| Constant | Value | Location | Notes |
| --- | --- | --- | --- |
| Protocol version | `"1.0"` | `SecureCompletionClient.py` | Never change — downgrade detection |
| Algorithm string | `"hybrid-aes256-rsa4096"` | `SecureCompletionClient.py` | Never change — downgrade detection |
| RSA key size | 4096 | `SecureCompletionClient.py` | Fixed |
| RSA public exponent | 65537 | `SecureCompletionClient.py` | Fixed |
| AES key size | 32 bytes (256-bit) | `SecureCompletionClient.py` | Per-request ephemeral |
| GCM nonce size | 12 bytes (96-bit) | `SecureCompletionClient.py` | Per-request via `secrets.token_bytes` |
| Max payload size | `10 * 1024 * 1024` (10 MB) | `SecureCompletionClient.py` | DoS protection |
| Default max retries | 2 | Both client classes | Exponential backoff: 1s, 2s, 4s… |
| Private key file mode | `0o600` | `SecureCompletionClient.py` | Owner read/write only |
| Public key file mode | `0o644` | `SecureCompletionClient.py` | Owner rw, group/others r |
| Min RSA key size (validation) | 2048 | `SecureCompletionClient.py` | `_validate_rsa_key` |
| Valid security tiers | `["standard", "high", "maximum"]` | `SecureCompletionClient.py` | Case-sensitive |
| Retryable status codes | `{429, 500, 502, 503, 504}` | `SecureCompletionClient.py` | |
| Package version | `"0.2.7"` | `pyproject.toml` + `__init__.py` | Bump both |

10. Endpoint URLs

| Endpoint | Method | Purpose |
| --- | --- | --- |
| `{router_url}/pki/public_key` | GET | Fetch server RSA public key |
| `{router_url}/v1/chat/secure_completion` | POST | Encrypted chat completion |

11. Key Lifecycle

  1. First create() call → _ensure_keys() runs (async, double-checked locking via asyncio.Lock).
  2. If key_dir is set:
    • Try load_keys() from {key_dir}/private_key.pem + {key_dir}/public_key.pem.
    • If that fails → generate_keys(save_to_file=True, key_dir=key_dir).
  3. If key_dir is None → generate_keys() (ephemeral, in-memory only).
  4. Keys are reused across all subsequent calls until the client is discarded.
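
The double-checked locking in step 1 can be sketched as follows. `KeyHolder` and its `generate` callback are hypothetical; the callback stands in for the load-or-generate logic of steps 2 and 3, and `asyncio.sleep(0)` merely simulates a suspension point during key setup.

```python
import asyncio
from typing import Callable, Optional


class KeyHolder:
    """Lazily creates a key once, even when many coroutines ask concurrently."""

    def __init__(self, generate: Callable[[], str]) -> None:
        self._generate = generate
        self._key: Optional[str] = None
        self._lock = asyncio.Lock()

    async def ensure_key(self) -> str:
        if self._key is None:  # fast path: no lock once the key exists
            async with self._lock:
                if self._key is None:  # re-check: another task may have won the race
                    await asyncio.sleep(0)  # stands in for slow key load/generation
                    self._key = self._generate()
        return self._key
```

The second check matters because several coroutines can pass the first one before any of them acquires the lock; without it each would generate its own key pair. Targets with threads instead of an event loop need a mutex plus the memory-ordering guarantees their language requires for this pattern.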

12. HTTP Client Details

  • Uses httpx.AsyncClient with timeout=60.0.
  • SSL verification enabled for HTTPS URLs; disabled for http://.
  • Request body is raw bytes (not JSON) — Content-Type: application/octet-stream.
  • Public key is URL-encoded in the X-Public-Key header.

13. Memory Protection Platform Matrix

| Platform | Locking | Zeroing |
| --- | --- | --- |
| Linux | `mlock()` via `libc.so.6` | `memset()` via `libc.so.6` |
| Windows | `VirtualLock()` via kernel32 | `RtlZeroMemory()` via ntdll + Python-level fallback |
| macOS | `mlock()` via `libc.dylib` | `memset()` via `libc.dylib` |
| Other | No lock | Python-level byte-by-byte zeroing |

mlock may fail with EPERM (need CAP_IPC_LOCK or ulimit -l increase) — degrades to zeroing-only gracefully.