OpenAI compatible secure chat client with end-to-end encryption on NOMYO Inference Endpoints
Find a file
alpha-nerd-nomyo 77084737dd fix: improve error handling and update base URL for SecureCompletionClient
Enhanced error message handling in SecureCompletionClient to provide more detailed information when unexpected status codes occur. The base URL for SecureChatCompletion has been updated.
2026-02-03 15:51:13 +01:00
nomyo fix: improve error handling and update base URL for SecureCompletionClient 2026-02-03 15:51:13 +01:00
.gitignore feat: add API key authentication support 2025-12-18 09:18:05 +01:00
LICENSE Hello World! 2025-12-17 16:03:20 +01:00
MANIFEST.in Hello World! 2025-12-17 16:03:20 +01:00
pyproject.toml Hello World! 2025-12-17 16:03:20 +01:00
README.md feat: enhance security by adding security_tier parameter and improving secure memory handling 2026-02-03 13:59:46 +01:00
requirements.txt Hello World! 2025-12-17 16:03:20 +01:00
SECURE_MEMORY.md feat: implement secure memory operations for enhanced data protection 2026-01-13 12:13:05 +01:00
SECURITY.md docs: enhancement and clarification 2026-01-17 11:12:40 +01:00
test.py feat: add API key authentication support 2025-12-18 09:18:05 +01:00

NOMYO Secure Python Chat Client

OpenAI-compatible secure chat client with end-to-end encryption with NOMYO Inference Endpoints

🔒 All prompts and responses are automatically encrypted and decrypted 🔑 Uses hybrid encryption (AES-256-GCM + RSA-OAEP with 4096-bit keys) 🔄 Drop-in replacement for OpenAI's ChatCompletion API

🚀 Quick Start

1. Install dependencies

pip install -r requirements.txt

2. Use the client (same API as OpenAI)

import asyncio
from nomyo import SecureChatCompletion

async def main():
    # Initialize client (defaults to http://api.nomyo.ai:12434)
    client = SecureChatCompletion(base_url="https://api.nomyo.ai:12434")

    # Simple chat completion
    response = await client.create(
        model="Qwen/Qwen3-0.6B",
        messages=[
            {"role": "user", "content": "Hello! How are you today?"}
        ],
        security_tier="standard", #optional: standard, high or maximum
        temperature=0.7
    )

    print(response['choices'][0]['message']['content'])

# Run the async function
asyncio.run(main())

3. Run tests

python3 test.py

🔐 Security Features

Hybrid Encryption

  • Payload encryption: AES-256-GCM (authenticated encryption)
  • Key exchange: RSA-OAEP with SHA-256
  • Key size: 4096-bit RSA keys
  • All communication: End-to-end encrypted

Key Management

  • Automatic key generation: Keys are automatically generated on first use
  • Automatic key loading: Existing keys are loaded automatically from client_keys/ directory
  • No manual intervention required: The library handles key management automatically
  • Keys kept in memory: Active session keys are stored in memory for performance
  • Optional persistence: Keys can be saved to client_keys/ directory for reuse across sessions
  • Password protection: Optional password encryption for private keys (recommended for production)
  • Secure permissions: Private keys stored with restricted permissions (600 - owner-only access)

Secure Memory Protection

Ephemeral AES Keys

  • Per-request encryption keys: A unique AES-256 key is generated for each request
  • Automatic rotation: AES keys are never reused - a fresh key is created for every encryption operation
  • Forward secrecy: Compromise of one AES key only affects that single request
  • Secure generation: AES keys are generated using cryptographically secure random number generation (secrets.token_bytes)
  • Automatic cleanup: AES keys are zeroed from memory immediately after use
  • Automatic protection: Plaintext payloads are automatically protected during encryption
  • Prevents memory swapping: Sensitive data cannot be swapped to disk
  • Guaranteed zeroing: Memory is zeroed after encryption completes
  • Fallback mechanism: Graceful degradation if SecureMemory module unavailable

🔄 OpenAI Compatibility

The SecureChatCompletion class provides exact API compatibility with OpenAI's ChatCompletion.create() method.

Supported Parameters

All standard OpenAI parameters are supported:

  • model: Model identifier
  • messages: List of message objects
  • temperature: Sampling temperature (0-2)
  • max_tokens: Maximum tokens to generate
  • top_p: Nucleus sampling
  • frequency_penalty: Frequency penalty
  • presence_penalty: Presence penalty
  • stop: Stop sequences
  • n: Number of completions
  • stream: Streaming (not yet implemented)
  • tools: Tool definitions
  • tool_choice: Tool selection strategy
  • user: User identifier
  • And more...

Response Format

Responses follow the OpenAI format exactly, with an additional _metadata field for debugging and security information:

{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1234567890,
    "model": "Qwen/Qwen3-0.6B",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! I'm doing well, thank you for asking.",
                "tool_calls": [...]  # if tools were used
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 20,
        "total_tokens": 30
    },
    "_metadata": {
        "payload_id": "openai-compat-abc123",  # Unique identifier for this request
        "processed_at": 1765250382,  # Timestamp when server processed the request
        "is_encrypted": True,  # Indicates this response was decrypted
        "encryption_algorithm": "hybrid-aes256-rsa4096",  # Encryption method used
        "response_status": "success"  # Status of the decryption/processing
    }
}

The _metadata field contains security-related information about the encrypted communication and is automatically added to all responses.

🛠️ Usage Examples

Basic Chat

import asyncio
from nomyo import SecureChatCompletion

async def main():
    client = SecureChatCompletion(base_url="https://api.nomyo.ai:12434")

    response = await client.create(
        model="Qwen/Qwen3-0.6B",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is the capital of France?"}
        ],
        security_tier="standard", #optional: standard, high or maximum
        temperature=0.7
    )

    print(response['choices'][0]['message']['content'])

asyncio.run(main())

With Tools


import asyncio
from nomyo import SecureChatCompletion

async def main():
    client = SecureChatCompletion(base_url="https://api.nomyo.ai:12434")

    response = await client.create(
        model="Qwen/Qwen3-0.6B",
        messages=[
            {"role": "user", "content": "What's the weather in Paris?"}
        ],
        tools=[
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Get weather information",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "location": {"type": "string"}
                        },
                        "required": ["location"]
                    }
                }
            }
        ],
        security_tier="standard", #optional: standard, high or maximum
        temperature=0.7
    )

    print(response['choices'][0]['message']['content'])

asyncio.run(main())

Using acreate() Alias

import asyncio
from nomyo import SecureChatCompletion

async def main():
    client = SecureChatCompletion(base_url="https://api.nomyo.ai:12434")

    response = await client.acreate(
        model="Qwen/Qwen3-0.6B",
        messages=[
            {"role": "user", "content": "Hello!"}
        ],
        temperature=0.7
    )

    print(response['choices'][0]['message']['content'])

asyncio.run(main())

📦 Dependencies

See requirements.txt for the complete list:

  • cryptography: Cryptographic primitives (RSA, AES, etc.)
  • httpx: Async HTTP client
  • anyio: Async compatibility layer

🔧 Configuration

Custom Base URL

import asyncio
from nomyo import SecureChatCompletion

async def main():
    client = SecureChatCompletion(base_url="https://NOMYO-Pro-Router:12434")
    # ... rest of your code
    asyncio.run(main())
```### API Key Authentication

```python
import asyncio
from nomyo import SecureChatCompletion

async def main():
    # Initialize with API key (recommended for production)
    client = SecureChatCompletion(
        base_url="https://api.nomyo.ai:12434",
        api_key="your-api-key-here"
    )

    # Or pass API key in the create() method
    response = await client.create(
        model="Qwen/Qwen3-0.6B",
        messages=[
            {"role": "user", "content": "Hello!"}
        ],
        api_key="your-api-key-here"  # Overrides instance API key
    )

asyncio.run(main())

Secure Memory Configuration

import asyncio
from nomyo import SecureChatCompletion

async def main():
    # Enable secure memory protection (default, recommended)
    client = SecureChatCompletion(
        base_url="https://api.nomyo.ai:12434",
        secure_memory=True  # Default
    )

    # Disable secure memory (not recommended, for testing only)
    client = SecureChatCompletion(
        base_url="https://api.nomyo.ai:12434",
        secure_memory=False
    )

asyncio.run(main())

Key Management

Keys are automatically generated on first use.

Generate Keys Manually

import asyncio
from nomyo.SecureCompletionClient import SecureCompletionClient

async def main():
    client = SecureCompletionClient()
    await client.generate_keys(save_to_file=True, password="your-password")

asyncio.run(main())

Load Existing Keys

import asyncio
from nomyo.SecureCompletionClient import SecureCompletionClient

async def main():
    client = SecureCompletionClient()
    await client.load_keys("client_keys/private_key.pem", "client_keys/public_key.pem", password="your-password")

asyncio.run(main())

🧪 Testing

Run the comprehensive test suite:

python3 test.py

Tests verify:

  • OpenAI API compatibility
  • Basic chat completion
  • Tool usage
  • All OpenAI parameters
  • Async methods
  • Error handling

📚 API Reference

SecureChatCompletion

Constructor

SecureChatCompletion(
    base_url: str = "https://api.nomyo.ai:12434",
    allow_http: bool = False,
    api_key: Optional[str] = None,
    secure_memory: bool = True
)

Parameters:

  • base_url: Base URL of the NOMYO Router (must use HTTPS for production)
  • allow_http: Allow HTTP connections (ONLY for local development, never in production)
  • api_key: Optional API key for bearer authentication
  • secure_memory: Enable secure memory protection (default: True)

Methods

  • create(model, messages, **kwargs): Create a chat completion
  • acreate(model, messages, **kwargs): Async alias for create()

SecureCompletionClient

Constructor

SecureCompletionClient(router_url: str = "http://api.nomyo.ai:12434")

Methods

  • generate_keys(save_to_file=False, key_dir="client_keys", password=None): Generate RSA key pair
  • load_keys(private_key_path, public_key_path=None, password=None): Load keys from files
  • fetch_server_public_key(): Fetch server's public key
  • encrypt_payload(payload): Encrypt a payload
  • decrypt_response(encrypted_response, payload_id): Decrypt a response
  • send_secure_request(payload, payload_id): Send encrypted request and receive decrypted response

📝 Notes

Security Best Practices

  • Always use password protection for private keys in production
  • Keep private keys secure (permissions set to 600)
  • Never share your private key
  • Verify server's public key fingerprint before first use

Performance

  • Key generation takes ~1-2 seconds (one-time operation)
  • Encryption/decryption adds minimal overhead (~10-20ms per request)

Compatibility

  • Works with any OpenAI-compatible code
  • No changes needed to existing OpenAI client code
  • Simply replace openai.ChatCompletion.create() with SecureChatCompletion.create()

🤝 Contributing

Contributions are welcome! Please open issues or pull requests on the project repository.

📄 License

See LICENSE file for licensing information.

📞 Support

For questions or issues, please refer to the project documentation or open an issue.