nomyo/README.md
alpha-nerd-nomyo 5641a746b7 feat: enhance security by adding security_tier parameter and improving secure memory handling
This commit introduces the security_tier parameter to all SecureChatCompletion calls, allowing users to specify security levels (standard, high, maximum). It also refactors secure memory management in SecureCompletionClient.py to use secure_bytearray instead of secure_bytes, improves memory protection by aligning addresses to page boundaries, and enhances the secure memory locking mechanism. The README.md has been updated to document the new security_tier parameter and include missing import statements.
2026-02-03 13:59:46 +01:00

11 KiB

NOMYO Secure Python Chat Client

OpenAI-compatible secure chat client with end-to-end encryption with NOMYO Inference Endpoints

🔒 All prompts and responses are automatically encrypted and decrypted 🔑 Uses hybrid encryption (AES-256-GCM + RSA-OAEP with 4096-bit keys) 🔄 Drop-in replacement for OpenAI's ChatCompletion API

🚀 Quick Start

1. Install dependencies

pip install -r requirements.txt

2. Use the client (same API as OpenAI)

import asyncio
from nomyo import SecureChatCompletion

async def main():
    # Initialize client (defaults to http://api.nomyo.ai:12434)
    client = SecureChatCompletion(base_url="https://api.nomyo.ai:12434")

    # Simple chat completion
    response = await client.create(
        model="Qwen/Qwen3-0.6B",
        messages=[
            {"role": "user", "content": "Hello! How are you today?"}
        ],
        security_tier="standard", #optional: standard, high or maximum
        temperature=0.7
    )

    print(response['choices'][0]['message']['content'])

# Run the async function
asyncio.run(main())

3. Run tests

python3 test.py

🔐 Security Features

Hybrid Encryption

  • Payload encryption: AES-256-GCM (authenticated encryption)
  • Key exchange: RSA-OAEP with SHA-256
  • Key size: 4096-bit RSA keys
  • All communication: End-to-end encrypted

Key Management

  • Automatic key generation: Keys are automatically generated on first use
  • Automatic key loading: Existing keys are loaded automatically from client_keys/ directory
  • No manual intervention required: The library handles key management automatically
  • Keys kept in memory: Active session keys are stored in memory for performance
  • Optional persistence: Keys can be saved to client_keys/ directory for reuse across sessions
  • Password protection: Optional password encryption for private keys (recommended for production)
  • Secure permissions: Private keys stored with restricted permissions (600 - owner-only access)

Secure Memory Protection

Ephemeral AES Keys

  • Per-request encryption keys: A unique AES-256 key is generated for each request
  • Automatic rotation: AES keys are never reused - a fresh key is created for every encryption operation
  • Forward secrecy: Compromise of one AES key only affects that single request
  • Secure generation: AES keys are generated using cryptographically secure random number generation (secrets.token_bytes)
  • Automatic cleanup: AES keys are zeroed from memory immediately after use
  • Automatic protection: Plaintext payloads are automatically protected during encryption
  • Prevents memory swapping: Sensitive data cannot be swapped to disk
  • Guaranteed zeroing: Memory is zeroed after encryption completes
  • Fallback mechanism: Graceful degradation if SecureMemory module unavailable

🔄 OpenAI Compatibility

The SecureChatCompletion class provides exact API compatibility with OpenAI's ChatCompletion.create() method.

Supported Parameters

All standard OpenAI parameters are supported:

  • model: Model identifier
  • messages: List of message objects
  • temperature: Sampling temperature (0-2)
  • max_tokens: Maximum tokens to generate
  • top_p: Nucleus sampling
  • frequency_penalty: Frequency penalty
  • presence_penalty: Presence penalty
  • stop: Stop sequences
  • n: Number of completions
  • stream: Streaming (not yet implemented)
  • tools: Tool definitions
  • tool_choice: Tool selection strategy
  • user: User identifier
  • And more...

Response Format

Responses follow the OpenAI format exactly, with an additional _metadata field for debugging and security information:

{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1234567890,
    "model": "Qwen/Qwen3-0.6B",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! I'm doing well, thank you for asking.",
                "tool_calls": [...]  # if tools were used
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 20,
        "total_tokens": 30
    },
    "_metadata": {
        "payload_id": "openai-compat-abc123",  # Unique identifier for this request
        "processed_at": 1765250382,  # Timestamp when server processed the request
        "is_encrypted": True,  # Indicates this response was decrypted
        "encryption_algorithm": "hybrid-aes256-rsa4096",  # Encryption method used
        "response_status": "success"  # Status of the decryption/processing
    }
}

The _metadata field contains security-related information about the encrypted communication and is automatically added to all responses.

🛠️ Usage Examples

Basic Chat

import asyncio
from nomyo import SecureChatCompletion

async def main():
    client = SecureChatCompletion(base_url="https://api.nomyo.ai:12434")

    response = await client.create(
        model="Qwen/Qwen3-0.6B",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is the capital of France?"}
        ],
        security_tier="standard", #optional: standard, high or maximum
        temperature=0.7
    )

    print(response['choices'][0]['message']['content'])

asyncio.run(main())

With Tools


import asyncio
from nomyo import SecureChatCompletion

async def main():
    client = SecureChatCompletion(base_url="https://api.nomyo.ai:12434")

    response = await client.create(
        model="Qwen/Qwen3-0.6B",
        messages=[
            {"role": "user", "content": "What's the weather in Paris?"}
        ],
        tools=[
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Get weather information",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "location": {"type": "string"}
                        },
                        "required": ["location"]
                    }
                }
            }
        ],
        security_tier="standard", #optional: standard, high or maximum
        temperature=0.7
    )

    print(response['choices'][0]['message']['content'])

asyncio.run(main())

Using acreate() Alias

import asyncio
from nomyo import SecureChatCompletion

async def main():
    client = SecureChatCompletion(base_url="https://api.nomyo.ai:12434")

    response = await client.acreate(
        model="Qwen/Qwen3-0.6B",
        messages=[
            {"role": "user", "content": "Hello!"}
        ],
        temperature=0.7
    )

    print(response['choices'][0]['message']['content'])

asyncio.run(main())

📦 Dependencies

See requirements.txt for the complete list:

  • cryptography: Cryptographic primitives (RSA, AES, etc.)
  • httpx: Async HTTP client
  • anyio: Async compatibility layer

🔧 Configuration

Custom Base URL

import asyncio
from nomyo import SecureChatCompletion

async def main():
    client = SecureChatCompletion(base_url="https://NOMYO-Pro-Router:12434")
    # ... rest of your code
    asyncio.run(main())
```### API Key Authentication

```python
import asyncio
from nomyo import SecureChatCompletion

async def main():
    # Initialize with API key (recommended for production)
    client = SecureChatCompletion(
        base_url="https://api.nomyo.ai:12434",
        api_key="your-api-key-here"
    )

    # Or pass API key in the create() method
    response = await client.create(
        model="Qwen/Qwen3-0.6B",
        messages=[
            {"role": "user", "content": "Hello!"}
        ],
        api_key="your-api-key-here"  # Overrides instance API key
    )

asyncio.run(main())

Secure Memory Configuration

import asyncio
from nomyo import SecureChatCompletion

async def main():
    # Enable secure memory protection (default, recommended)
    client = SecureChatCompletion(
        base_url="https://api.nomyo.ai:12434",
        secure_memory=True  # Default
    )

    # Disable secure memory (not recommended, for testing only)
    client = SecureChatCompletion(
        base_url="https://api.nomyo.ai:12434",
        secure_memory=False
    )

asyncio.run(main())

Key Management

Keys are automatically generated on first use.

Generate Keys Manually

import asyncio
from nomyo.SecureCompletionClient import SecureCompletionClient

async def main():
    client = SecureCompletionClient()
    await client.generate_keys(save_to_file=True, password="your-password")

asyncio.run(main())

Load Existing Keys

import asyncio
from nomyo.SecureCompletionClient import SecureCompletionClient

async def main():
    client = SecureCompletionClient()
    await client.load_keys("client_keys/private_key.pem", "client_keys/public_key.pem", password="your-password")

asyncio.run(main())

🧪 Testing

Run the comprehensive test suite:

python3 test.py

Tests verify:

  • OpenAI API compatibility
  • Basic chat completion
  • Tool usage
  • All OpenAI parameters
  • Async methods
  • Error handling

📚 API Reference

SecureChatCompletion

Constructor

SecureChatCompletion(
    base_url: str = "https://api.nomyo.ai:12434",
    allow_http: bool = False,
    api_key: Optional[str] = None,
    secure_memory: bool = True
)

Parameters:

  • base_url: Base URL of the NOMYO Router (must use HTTPS for production)
  • allow_http: Allow HTTP connections (ONLY for local development, never in production)
  • api_key: Optional API key for bearer authentication
  • secure_memory: Enable secure memory protection (default: True)

Methods

  • create(model, messages, **kwargs): Create a chat completion
  • acreate(model, messages, **kwargs): Async alias for create()

SecureCompletionClient

Constructor

SecureCompletionClient(router_url: str = "http://api.nomyo.ai:12434")

Methods

  • generate_keys(save_to_file=False, key_dir="client_keys", password=None): Generate RSA key pair
  • load_keys(private_key_path, public_key_path=None, password=None): Load keys from files
  • fetch_server_public_key(): Fetch server's public key
  • encrypt_payload(payload): Encrypt a payload
  • decrypt_response(encrypted_response, payload_id): Decrypt a response
  • send_secure_request(payload, payload_id): Send encrypted request and receive decrypted response

📝 Notes

Security Best Practices

  • Always use password protection for private keys in production
  • Keep private keys secure (permissions set to 600)
  • Never share your private key
  • Verify server's public key fingerprint before first use

Performance

  • Key generation takes ~1-2 seconds (one-time operation)
  • Encryption/decryption adds minimal overhead (~10-20ms per request)

Compatibility

  • Works with any OpenAI-compatible code
  • No changes needed to existing OpenAI client code
  • Simply replace openai.ChatCompletion.create() with SecureChatCompletion.create()

🤝 Contributing

Contributions are welcome! Please open issues or pull requests on the project repository.

📄 License

See LICENSE file for licensing information.

📞 Support

For questions or issues, please refer to the project documentation or open an issue.