diff --git a/.gitignore b/.gitignore index 8355d64..1693057 100644 --- a/.gitignore +++ b/.gitignore @@ -7,3 +7,6 @@ coverage/ .nyc_output/ build/ *.node +settings.json +*.pem +client_keys/ \ No newline at end of file diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..24b1f8f --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,11 @@ +# Contributing + +## Development Setup + +```bash +npm install +npm run build +npm test +``` + +Node.js 18 LTS or later is required for tests and the TypeScript compiler. diff --git a/LICENSE b/LICENSE index c2366ea..83d6df7 100644 --- a/LICENSE +++ b/LICENSE @@ -186,7 +186,7 @@ same "printed page" as the copyright notice for easier identification within third-party archives. - Copyright 2025 OpenAI + Copyright 2025 NOMYO LLC Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. diff --git a/README.md b/README.md index 7c9bdfb..e48516f 100644 --- a/README.md +++ b/README.md @@ -1,13 +1,13 @@ -# NOMYO.js - Secure JavaScript Chat Client +# NOMYO.js — Secure JavaScript Chat Client **OpenAI-compatible secure chat client with end-to-end encryption for NOMYO Inference Endpoints** -🔒 **All prompts and responses are automatically encrypted and decrypted** -🔑 **Uses hybrid encryption (AES-256-GCM + RSA-OAEP with 4096-bit keys)** -🔄 **Drop-in replacement for OpenAI's ChatCompletion API** -🌐 **Works in both Node.js and browsers** +- All prompts and responses are automatically encrypted and decrypted +- Hybrid encryption: AES-256-GCM payload + RSA-OAEP-SHA256 key exchange, 4096-bit keys +- Drop-in replacement for OpenAI's ChatCompletion API +- Works in both Node.js and browsers -## 🚀 Quick Start +## Quick Start ### Installation @@ -20,330 +20,323 @@ npm install nomyo-js ```javascript import { SecureChatCompletion } from 'nomyo-js'; -// Initialize client (defaults to https://api.nomyo.ai:12434) const client = new 
SecureChatCompletion({ - baseUrl: 'https://api.nomyo.ai:12434' + apiKey: process.env.NOMYO_API_KEY, }); -// Simple chat completion const response = await client.create({ model: 'Qwen/Qwen3-0.6B', - messages: [ - { role: 'user', content: 'Hello! How are you today?' } - ], - temperature: 0.7 + messages: [{ role: 'user', content: 'Hello!' }], + temperature: 0.7, }); console.log(response.choices[0].message.content); +client.dispose(); ``` ### Basic Usage (Browser) ```html - - - - - - -

NOMYO Secure Chat

- - + console.log(response.choices[0].message.content); + ``` -## 🔐 Security Features +## Documentation + +Full documentation is in the [`doc/`](doc/) directory: + +- [Getting Started](doc/getting-started.md) — walkthrough for new users +- [API Reference](doc/api-reference.md) — complete constructor options, methods, types, and error classes +- [Models](doc/models.md) — available models and selection guide +- [Security Guide](doc/security-guide.md) — encryption, memory protection, key management, compliance +- [Rate Limits](doc/rate-limits.md) — limits, automatic retry behaviour, batch throttling +- [Examples](doc/examples.md) — 12+ code examples for common scenarios +- [Troubleshooting](doc/troubleshooting.md) — error reference and debugging tips + +## Security Features ### Hybrid Encryption --**Payload encryption**: AES-256-GCM (authenticated encryption) -- **Key exchange**: RSA-OAEP with SHA-256 +- **Payload encryption**: AES-256-GCM (authenticated encryption) +- **Key exchange**: RSA-OAEP-SHA256 - **Key size**: 4096-bit RSA keys -- **All communication**: End-to-end encrypted +- **Scope**: All communication is end-to-end encrypted ### Key Management -- **Automatic key generation**: Keys are automatically generated on first use -- **Automatic key loading**: Existing keys are loaded automatically from `client_keys/` directory (Node.js only) -- **No manual intervention required**: The library handles key management automatically -- **Optional persistence**: Keys can be saved to `client_keys/` directory for reuse across sessions (Node.js only) -- **Password protection**: Optional password encryption for private keys (recommended for production) -- **Secure permissions**: Private keys stored with restricted permissions (600 - owner-only access) +- **Automatic**: Keys are generated on first use and saved to `keyDir` (default: `client_keys/`). Existing keys are reloaded on subsequent runs. Node.js only. 
+- **Password protection**: Optional AES-encrypted private key files (minimum 8 characters). +- **Secure permissions**: Private key files saved at `0600` (owner-only). +- **Auto-rotation**: Keys rotate every 24 hours by default (configurable via `keyRotationInterval`). +- **Explicit lifecycle**: Call `dispose()` to zero in-memory key material and stop the rotation timer. -### Secure Memory Protection +### Secure Memory -> [!NOTE] -> **Pure JavaScript Implementation**: This version uses pure JavaScript with immediate memory zeroing. -> OS-level memory locking (`mlock`) is NOT available without a native addon. -> For enhanced security in production, consider implementing the optional native addon (see `native/` directory). +The library wraps all intermediate sensitive buffers (AES keys, plaintext payload, decrypted bytes) in `SecureByteContext`, which zeroes them in a `finally` block immediately after use. -- **Automatic cleanup**: Sensitive data is zeroed from memory immediately after use -- **Best-effort protection**: Minimizes exposure time of sensitive data -- **Fallback mechanism**: Graceful degradation if enhanced security is unavailable - -## 🔄 OpenAI Compatibility - -The `SecureChatCompletion` class provides **exact API compatibility** with OpenAI's `ChatCompletion.create()` method. 
- -### Supported Parameters - -All standard OpenAI parameters are supported: - -- `model`: Model identifier -- `messages`: List of message objects -- `temperature`: Sampling temperature (0-2) -- `max_tokens`: Maximum tokens to generate -- `top_p`: Nucleus sampling -- `frequency_penalty`: Frequency penalty -- `presence_penalty`: Presence penalty -- `stop`: Stop sequences -- `n`: Number of completions -- `tools`: Tool definitions -- `tool_choice`: Tool selection strategy -- `user`: User identifier - -### Response Format - -Responses follow the OpenAI format exactly, with an additional `_metadata` field for debugging and security information: +Pure JavaScript cannot lock pages to prevent OS swapping (`mlock`). For environments where swap-file exposure is unacceptable, install the optional `nomyo-native` addon. Check the current protection level: ```javascript -{ - "id": "chatcmpl-123", - "object": "chat.completion", - "created": 1234567890, - "model": "Qwen/Qwen3-0.6B", - "choices": [ - { - "index": 0, - "message": { - "role": "assistant", - "content": "Hello! I'm doing well, thank you for asking." 
- }, - "finish_reason": "stop" - } - ], - "usage": { - "prompt_tokens": 10, - "completion_tokens": 20, - "total_tokens": 30 - }, - "_metadata": { - "payload_id": "openai-compat-abc123", - "processed_at": 1765250382, - "is_encrypted": true, - "encryption_algorithm": "hybrid-aes256-rsa4096", - "response_status": "success" - } +import { getMemoryProtectionInfo } from 'nomyo-js'; + +const info = getMemoryProtectionInfo(); +// Without addon: { method: 'zero-only', canLock: false } +// With addon: { method: 'mlock', canLock: true } +``` + +### Security Tiers + +Pass `security_tier` per request to route inference to increasingly isolated hardware: + +| Tier | Hardware | Use case | +|------|----------|----------| +| `"standard"` | GPU | General secure inference | +| `"high"` | CPU/GPU balanced | Sensitive business data | +| `"maximum"` | CPU only | HIPAA PHI, classified data | + +```javascript +const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Patient record summary...' }], + security_tier: 'maximum', +}); +``` + +## Usage Examples + +### With API Key + +```javascript +const client = new SecureChatCompletion({ apiKey: process.env.NOMYO_API_KEY }); +``` + +### Error Handling + +```javascript +import { + SecureChatCompletion, + AuthenticationError, + RateLimitError, + ForbiddenError, +} from 'nomyo-js'; + +try { + const response = await client.create({ model: 'Qwen/Qwen3-0.6B', messages: [...] 
}); +} catch (err) { + if (err instanceof AuthenticationError) console.error('Check API key:', err.message); + else if (err instanceof RateLimitError) console.error('Rate limit hit:', err.message); + else if (err instanceof ForbiddenError) console.error('Model/tier mismatch:', err.message); + else throw err; } ``` -## 🛠️ Usage Examples +### Per-Request Router Override -### Basic Chat +Send a single request to a different router without changing the main client: ```javascript -import { SecureChatCompletion } from 'nomyo-js'; - -const client = new SecureChatCompletion({ - baseUrl: 'https://api.nomyo.ai:12434' -}); - const response = await client.create({ - model: 'Qwen/Qwen3-0.6B', - messages: [ - { role: 'system', content: 'You are a helpful assistant.' }, - { role: 'user', content: 'What is the capital of France?' } - ], - temperature: 0.7 + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Hello from secondary router' }], + base_url: 'https://secondary.nomyo.ai:12435', // temporary — main client unchanged }); - -console.log(response.choices[0].message.content); ``` -### With Tools +### Tool / Function Calling ```javascript const response = await client.create({ - model: 'Qwen/Qwen3-0.6B', - messages: [ - { role: 'user', content: "What's the weather in Paris?" } - ], + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: "What's the weather in Paris?" 
}], tools: [ { type: 'function', function: { - name: 'get_weather', - description: 'Get weather information', + name: 'get_weather', + description: 'Get weather information for a location', parameters: { type: 'object', - properties: { - location: { type: 'string' } - }, - required: ['location'] - } - } - } - ] + properties: { location: { type: 'string' } }, + required: ['location'], + }, + }, + }, + ], + tool_choice: 'auto', }); ``` -### With API Key Authentication +### Thinking Models ```javascript -const client = new SecureChatCompletion({ - baseUrl: 'https://api.nomyo.ai:12434', - apiKey: 'your-api-key-here' -}); - -// API key will be automatically included in all requests const response = await client.create({ - model: 'Qwen/Qwen3-0.6B', - messages: [ - { role: 'user', content: 'Hello!' } - ] + model: 'LiquidAI/LFM2.5-1.2B-Thinking', + messages: [{ role: 'user', content: 'Is 9.9 larger than 9.11?' }], }); + +const { content, reasoning_content } = response.choices[0].message; +console.log('Reasoning:', reasoning_content); +console.log('Answer:', content); ``` -### Custom Key Management (Node.js) +### Resource Management ```javascript -import { SecureCompletionClient } from 'nomyo-js'; +const client = new SecureChatCompletion({ apiKey: process.env.NOMYO_API_KEY }); -const client = new SecureCompletionClient({ - routerUrl: 'https://api.nomyo.ai:12434' -}); - -// Generate keys with password protection -await client.generateKeys({ - saveToFile: true, - keyDir: 'client_keys', - password: 'your-secure-password' -}); - -// Or load existing keys -await client.loadKeys( - 'client_keys/private_key.pem', - 'client_keys/public_key.pem', - 'your-secure-password' -); +try { + const response = await client.create({ model: 'Qwen/Qwen3-0.6B', messages: [...] 
}); + console.log(response.choices[0].message.content); +} finally { + client.dispose(); // zeros key material, stops rotation timer +} ``` -## 🧪 Platform Support - -### Node.js - -- **Minimum version**: Node.js 15+ (for `crypto.webcrypto`) -- **Recommended**: Node.js 18 LTS or later -- **Key storage**: File system (`client_keys/` directory) -- **Security**: Full implementation with automatic key persistence - -### Browsers - -- **Supported browsers**: Modern browsers with Web Crypto API support - - Chrome 37+ - - Firefox 34+ - - Safari 11+ - - Edge 79+ -- **Key storage**: In-memory only (keys not persisted for security) -- **Security**: Best-effort memory protection (no OS-level locking) - -## 📚 API Reference - -### SecureChatCompletion - -#### Constructor - -```typescript -new SecureChatCompletion(config?: { - baseUrl?: string; // Default: 'https://api.nomyo.ai:12434' - allowHttp?: boolean; // Default: false - apiKey?: string; // Default: undefined - secureMemory?: boolean; // Default: true -}) -``` - -#### Methods - -- `create(request: ChatCompletionRequest): Promise` -- `acreate(request: ChatCompletionRequest): Promise` (alias) - -### SecureCompletionClient - -Lower-level API for advanced use cases. 
- -#### Constructor - -```typescript -new SecureCompletionClient(config?: { - routerUrl?: string; // Default: 'https://api.nomyo.ai:12434' - allowHttp?: boolean; // Default: false - secureMemory?: boolean; // Default: true - keySize?: 2048 | 4096; // Default: 4096 -}) -``` - -#### Methods - -- `generateKeys(options?: KeyGenOptions): Promise` -- `loadKeys(privateKeyPath: string, publicKeyPath?: string, password?: string): Promise` -- `fetchServerPublicKey(): Promise` -- `encryptPayload(payload: object): Promise` -- `decryptResponse(encrypted: ArrayBuffer, payloadId: string): Promise` -- `sendSecureRequest(payload: object, payloadId: string, apiKey?: string): Promise` - -## 🔧 Configuration - ### Local Development (HTTP) ```javascript const client = new SecureChatCompletion({ - baseUrl: 'http://localhost:12434', - allowHttp: true // Required for HTTP connections + baseUrl: 'http://localhost:12435', + allowHttp: true, // required — also prints a visible warning }); ``` -⚠️ **Warning**: Only use HTTP for local development. Never use in production! +## API Reference -### Disable Secure Memory +### `SecureChatCompletion` — Constructor Options -```javascript -const client = new SecureChatCompletion({ - baseUrl: 'https://api.nomyo.ai:12434', - secureMemory: false // Disable memory protection (not recommended) -}); +```typescript +new SecureChatCompletion(config?: ChatCompletionConfig) ``` -## 📝 Security Best Practices +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `baseUrl` | `string` | `'https://api.nomyo.ai'` | NOMYO router URL. Must be HTTPS in production. | +| `allowHttp` | `boolean` | `false` | Allow HTTP connections. Local development only. | +| `apiKey` | `string` | `undefined` | Bearer token for `Authorization` header. | +| `secureMemory` | `boolean` | `true` | Zero sensitive buffers immediately after use. | +| `timeout` | `number` | `60000` | Request timeout in milliseconds. 
| +| `debug` | `boolean` | `false` | Print verbose logging to the console. | +| `keyDir` | `string` | `'client_keys'` | Directory to load/save RSA keys on startup. | +| `keyRotationInterval` | `number` | `86400000` | Auto-rotate keys every N ms. `0` disables rotation. | +| `keyRotationDir` | `string` | `'client_keys'` | Directory for rotated key files. Node.js only. | +| `keyRotationPassword` | `string` | `undefined` | Password for encrypted rotated key files. | +| `maxRetries` | `number` | `2` | Extra retry attempts on 429/5xx/network errors. Exponential backoff (1 s, 2 s, …). | -- ✅ Always use HTTPS in production -- ✅ Use password protection for private keys (Node.js) -- ✅ Keep private keys secure (permissions set to 600) -- ✅ Never share your private key -- ✅ Verify server's public key fingerprint before first use -- ✅ Enable secure memory protection (default) +#### Methods -## 🤝 Contributing +- `create(request): Promise` — send an encrypted chat completion +- `acreate(request): Promise` — alias for `create()` +- `dispose(): void` — zero key material and stop the rotation timer -Contributions are welcome! Please open issues or pull requests on the project repository. +#### `create()` Request Fields -## 📄 License +All standard OpenAI fields (`model`, `messages`, `temperature`, `top_p`, `max_tokens`, `stop`, `n`, `tools`, `tool_choice`, `user`, `frequency_penalty`, `presence_penalty`, `logit_bias`) plus: -See LICENSE file for licensing information. +| Field | Description | +|-------|-------------| +| `security_tier` | `"standard"` \| `"high"` \| `"maximum"` — hardware isolation level | +| `api_key` | Per-request API key override | +| `base_url` | Per-request router URL override — creates a temporary client, used once, then disposed | -## 📞 Support +### `SecureCompletionClient` — Constructor Options -For questions or issues, please refer to the project documentation or open an issue. +Lower-level client. 
All options above apply, with these differences: + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `routerUrl` | `string` | `'https://api.nomyo.ai'` | Base URL (`baseUrl` is renamed here) | +| `keySize` | `2048 \| 4096` | `4096` | RSA modulus length | + +#### Methods + +- `generateKeys(options?)` — generate a new RSA key pair +- `loadKeys(privateKeyPath, publicKeyPath?, password?)` — load existing PEM files +- `fetchServerPublicKey()` — fetch the server's RSA public key +- `encryptPayload(payload)` — encrypt a request payload +- `decryptResponse(encrypted, payloadId)` — decrypt a response body +- `sendSecureRequest(payload, payloadId, apiKey?, securityTier?)` — full encrypt → POST → decrypt cycle +- `dispose()` — zero key material and stop rotation timer + +### Secure Memory Public API + +```typescript +import { + getMemoryProtectionInfo, + disableSecureMemory, + enableSecureMemory, + SecureByteContext, +} from 'nomyo-js'; +``` + +| Export | Description | +|--------|-------------| +| `getMemoryProtectionInfo()` | Returns `{ method, canLock, isPlatformSecure, details? 
}` | +| `disableSecureMemory()` | Disable global secure-memory zeroing | +| `enableSecureMemory()` | Re-enable global secure-memory zeroing | +| `SecureByteContext` | Low-level buffer wrapper — zeros in `finally` block | + +### Error Classes + +```typescript +import { + AuthenticationError, InvalidRequestError, RateLimitError, + ForbiddenError, ServerError, ServiceUnavailableError, + APIConnectionError, SecurityError, DisposedError, APIError, +} from 'nomyo-js'; +``` + +| Class | HTTP | Thrown when | +|-------|------|-------------| +| `AuthenticationError` | 401 | Invalid or missing API key | +| `InvalidRequestError` | 400 | Malformed request | +| `ForbiddenError` | 403 | Model not allowed for the security tier | +| `RateLimitError` | 429 | Rate limit exceeded (after all retries) | +| `ServerError` | 500 | Internal server error (after all retries) | +| `ServiceUnavailableError` | 503 | Backend unavailable (after all retries) | +| `APIError` | varies | Other HTTP errors | +| `APIConnectionError` | — | Network failure or timeout (after all retries) | +| `SecurityError` | — | HTTPS not used, header injection, or crypto failure | +| `DisposedError` | — | Method called after `dispose()` | + +## Platform Support + +### Node.js + +- **Minimum**: Node.js 14.17+ +- **Recommended**: Node.js 18 LTS or later +- **Key storage**: File system (`keyDir` directory, default `client_keys/`) + +### Browsers + +- **Supported**: Chrome 37+, Firefox 34+, Safari 11+, Edge 79+ +- **Key storage**: In-memory only (not persisted) +- **Limitation**: File-based key operations (`keyDir`, `loadKeys`) are not available + +## Security Best Practices + +- Always use HTTPS (`allowHttp` is `false` by default) +- Load API key from an environment variable, never hardcode it +- Use password-protected key files (`keyRotationPassword`) +- Store keys outside the project directory and outside version control +- Add `client_keys/` and `*.pem` to `.gitignore` +- Call `dispose()` when the client is no 
longer needed +- Use `security_tier: 'maximum'` for HIPAA PHI or classified data +- Consider the `nomyo-native` addon if swap-file exposure is unacceptable + +## License + +See LICENSE file. diff --git a/doc/README.md b/doc/README.md new file mode 100644 index 0000000..8d2e4d9 --- /dev/null +++ b/doc/README.md @@ -0,0 +1,49 @@ +# NOMYO.js Documentation + +Comprehensive documentation for the NOMYO secure JavaScript/TypeScript chat client — a drop-in replacement for OpenAI's `ChatCompletion` API with end-to-end encryption. + +To use this library you need an active subscription on [NOMYO Inference](https://chat.nomyo.ai/). + +## Quick Start + +```javascript +import { SecureChatCompletion } from 'nomyo-js'; + +const client = new SecureChatCompletion({ apiKey: process.env.NOMYO_API_KEY }); + +const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Hello!' }], + security_tier: 'standard', +}); + +console.log(response.choices[0].message.content); +``` + +## Documentation + +1. [Installation](installation.md) — npm, CDN, and native addon setup +2. [Getting Started](getting-started.md) — first request, auth, security tiers, error handling +3. [API Reference](api-reference.md) — complete constructor options, methods, and types +4. [Models](models.md) — available models and selection guidance +5. [Security Guide](security-guide.md) — encryption architecture, best practices, and compliance +6. [Rate Limits](rate-limits.md) — request limits, burst behaviour, and retry strategy +7. [Examples](examples.md) — real-world scenarios, browser usage, and advanced patterns +8. [Troubleshooting](troubleshooting.md) — common errors and their fixes + +--- + +## Key Features + +- **End-to-end encryption** — AES-256-GCM + RSA-OAEP-4096. No plaintext ever leaves your process. +- **OpenAI-compatible API** — `create()` / `acreate()` accept the same parameters as the OpenAI SDK. 
+- **Browser + Node.js** — single package, separate entry points for each runtime. +- **Automatic key management** — keys are generated on first use and optionally persisted to disk (Node.js). +- **Automatic key rotation** — RSA keys rotate on a configurable interval (default 24 h) to limit fingerprint lifetime. +- **Security tiers** — per-request routing to `standard`, `high`, or `maximum` isolation hardware. +- **Retry with exponential backoff** — automatic retries on 429 / 5xx / network errors (configurable). +- **Resource lifecycle** — `dispose()` immediately zeros in-memory key material and stops the rotation timer. + +## Technical Security Docs + +For cryptographic architecture, threat model, and implementation status see [SECURITY.md](SECURITY.md). diff --git a/docs/SECURITY.md b/doc/SECURITY.md similarity index 85% rename from docs/SECURITY.md rename to doc/SECURITY.md index eef7e37..e229b6a 100644 --- a/docs/SECURITY.md +++ b/doc/SECURITY.md @@ -53,18 +53,28 @@ NOMYO.js implements end-to-end encryption for OpenAI-compatible chat completions - Automatic key generation on first use - File-based persistence (Node.js) - In-memory keys (browsers) - - Password protection via PBKDF2 + AES-256-CBC + - Password protection via PBKDF2 + AES-256-CBC (minimum 8-character password enforced) + - Automatic periodic key rotation (default: 24 hours, configurable, or disabled with `keyRotationInterval: 0`) + - `dispose()` method severs in-memory key references and cancels the rotation timer 4. **Transport Security** - - HTTPS enforcement (with warnings for HTTP) + - HTTPS enforcement using proper URL parsing (`new URL()`) — not string prefix matching - Certificate validation (browsers/Node.js) - Optional HTTP for local development (explicit opt-in) + - API key validated to reject CR/LF characters (prevents HTTP header injection) + - Server error detail truncated to 100 printable characters (prevents log injection) 5. 
**Memory Protection (Pure JavaScript)** - Immediate zeroing of sensitive buffers - - Context managers for automatic cleanup + - Context managers for automatic cleanup (`SecureByteContext`) with guarded `finally` blocks + - Intermediate crypto buffers (password bytes, salt, IV) wrapped in `SecureByteContext` during key encryption + - HTTP request body (`ArrayBuffer`) zeroed after data is handed to the socket - Best-effort memory management +6. **Response Integrity** + - Decrypted response validated against required `ChatCompletionResponse` schema fields before use + - Generic error messages from all crypto operations (no internal engine details leaked) + ### ⚠️ Limitations (Pure JavaScript) 1. **No OS-Level Memory Locking** @@ -94,9 +104,10 @@ NOMYO.js implements end-to-end encryption for OpenAI-compatible chat completions ✅ **DO:** - Use HTTPS in production (enforced by default) - Enable secure memory protection (default: `secureMemory: true`) -- Use password-protected private keys in Node.js +- Use password-protected private keys in Node.js (minimum 8 characters) - Set private key file permissions to 600 (owner-only) -- Rotate keys periodically +- Rely on automatic key rotation (`keyRotationInterval`, default 24h) to limit fingerprint lifetime +- Call `dispose()` when the client is no longer needed - Validate server public key fingerprint on first use ❌ **DON'T:** @@ -132,7 +143,7 @@ const client = new SecureChatCompletion({ baseUrl: 'https://...' }); # .env file (never commit to git) NOMYO_API_KEY=your-api-key NOMYO_KEY_PASSWORD=your-key-password -NOMYO_SERVER_URL=https://api.nomyo.ai:12434 +NOMYO_SERVER_URL=https://api.nomyo.ai ``` --- @@ -248,9 +259,12 @@ class SecureByteContext { try { return await fn(this.data); } finally { - // Always zero, even if exception occurs + // Always zero, even if exception occurs. + // zeroMemory failure is swallowed so it cannot mask the original error. 
if (this.useSecure) { - new Uint8Array(this.data).fill(0); + try { + this.secureMemory.zeroMemory(this.data); + } catch (_zeroErr) { /* intentional */ } } } } @@ -328,6 +342,11 @@ npm install nomyo-native ✅ **Timing Attacks (Partial)** - Web Crypto API uses constant-time operations - No length leakage in comparisons +- Generic error messages from all crypto operations (RSA, AES) — internal engine errors not forwarded + +✅ **Concurrent Key Generation Race** +- Promise-chain mutex serialises all `ensureKeys()` callers +- No risk of multiple simultaneous key generations overwriting each other ✅ **Key Compromise (Forward Secrecy)** - Ephemeral AES keys diff --git a/doc/api-reference.md b/doc/api-reference.md new file mode 100644 index 0000000..d49e702 --- /dev/null +++ b/doc/api-reference.md @@ -0,0 +1,272 @@ +# API Reference + +## `SecureChatCompletion` + +High-level OpenAI-compatible client. The recommended entry point for most use cases. + +### Constructor + +```typescript +new SecureChatCompletion(config?: ChatCompletionConfig) +``` + +#### `ChatCompletionConfig` + + +| Option | Type | Default | Description | +| ----------------------- | ----------- | -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `baseUrl` | `string` | `'https://api.nomyo.ai'` | NOMYO router URL. Must be HTTPS in production. | +| `allowHttp` | `boolean` | `false` | Allow HTTP connections.**Local development only.** | +| `apiKey` | `string` | `undefined` | Bearer token sent in`Authorization` header. | +| `secureMemory` | `boolean` | `true` | Enable immediate zeroing of sensitive buffers after use. | +| `timeout` | `number` | `60000` | Request timeout in milliseconds. | +| `debug` | `boolean` | `false` | Print verbose logging to the console. | +| `keyDir` | `string` | `'client_keys'` | Directory to load/save RSA keys on startup. 
If the directory contains an existing key pair it is loaded; otherwise a new pair is generated and saved there. Node.js only. | +| `keyRotationInterval` | `number` | `86400000` (24 h) | Auto-rotate RSA keys every N milliseconds. Set to`0` to disable. | +| `keyRotationDir` | `string` | `'client_keys'` | Directory where rotated key files are saved. Node.js only. | +| `keyRotationPassword` | `string` | `undefined` | Password used to encrypt rotated key files. | +| `maxRetries` | `number` | `2` | Maximum extra attempts on retryable errors (429, 500, 502, 503, 504, network errors). Uses exponential backoff (1 s, 2 s, …). Set to`0` to disable retries. | + +### Methods + +#### `create(request): Promise` + +Send an encrypted chat completion request. Returns the decrypted response. + +```typescript +async create(request: ChatCompletionRequest): Promise +``` + +**`ChatCompletionRequest` fields:** + + +| Field | Type | Description | +| --------------------- | -------------------------- | -------------------------------------------------------------------------------------------------------------------------------- | +| `model` | `string` | **Required.** Model ID (see [Models](models.md)). | +| `messages` | `Message[]` | **Required.** Conversation history. | +| `temperature` | `number` | Sampling temperature (0–2). | +| `top_p` | `number` | Nucleus sampling. | +| `max_tokens` | `number` | Maximum tokens to generate. | +| `stop` | `string | string[]` | Stop sequences. | +| `n` | `number` | Number of completions to generate. | +| `stream` | `boolean` | Ignored server-side (encryption requires full response). | +| `presence_penalty` | `number` | Presence penalty (−2.0–2.0). | +| `frequency_penalty` | `number` | Frequency penalty (−2.0–2.0). | +| `logit_bias` | `Record` | Token bias map. | +| `user` | `string` | End-user identifier (passed through). | +| `tools` | `Tool[]` | Tool/function definitions. 
| +| `tool_choice` | `ToolChoice` | Tool selection strategy (`"auto"`, `"none"`, `"required"`, or specific tool). | +| `security_tier` | `string` | NOMYO-specific.`"standard"` \| `"high"` \| `"maximum"`. Not encrypted into the payload. | +| `api_key` | `string` | NOMYO-specific. Per-request API key override. Not encrypted into the payload. | +| `base_url` | `string` | NOMYO-specific. Per-request router URL override. Creates a temporary client for this one call. Not encrypted into the payload. | + +**Response shape (`ChatCompletionResponse`):** + +```typescript +{ + id: string; + object: 'chat.completion'; + created: number; + model: string; + choices: Array<{ + index: number; + message: { + role: string; + content: string; + tool_calls?: ToolCall[]; // present if tools were invoked + reasoning_content?: string; // chain-of-thought (Qwen3, DeepSeek-R1, etc.) + }; + finish_reason: 'stop' | 'length' | 'tool_calls' | 'content_filter' | null; + }>; + usage?: { + prompt_tokens: number; + completion_tokens: number; + total_tokens: number; + }; + _metadata?: { + payload_id: string; // echoes the X-Payload-ID sent with the request + processed_at: number; // Unix timestamp of server-side processing + is_encrypted: boolean; // always true for this endpoint + encryption_algorithm: string; // e.g. "hybrid-aes256-rsa4096" + response_status: string; // "success" on success + security_tier?: string; // active tier used by the server + memory_protection?: { + platform: string; + memory_locking: boolean; + secure_zeroing: boolean; + core_dump_prevention: boolean; + }; + cuda_device?: { + available: boolean; + device_hash: string; // SHA-256 of device name (not the raw name) + }; + }; +} +``` + +#### `acreate(request): Promise` + +Alias for `create()`. Provided for code that follows the OpenAI SDK naming convention. + +#### `dispose(): void` + +Stop the key-rotation timer and sever in-memory RSA key references so they can be garbage-collected. 
After calling `dispose()`, all methods throw `DisposedError`. + +```javascript +client.dispose(); +``` + +--- + +## `SecureCompletionClient` + +Lower-level client that exposes key management and individual encryption/decryption operations. +Use this when you need fine-grained control; for most use cases prefer `SecureChatCompletion`. + +### Constructor + +```typescript +new SecureCompletionClient(config?: ClientConfig) +``` + +#### `ClientConfig` + +All options from `ChatCompletionConfig`, plus: + + +| Option | Type | Default | Description | +| ------------- | --------------- | -------------------------------- | --------------------------------------------------------------- | +| `routerUrl` | `string` | `'https://api.nomyo.ai'` | NOMYO router base URL. | +| `keySize` | `2048 | 4096` | `4096` | RSA modulus length. 2048 is accepted but 4096 is recommended. | + +(`baseUrl` is renamed to `routerUrl` at this level; all other options are identical.) + +### Methods + +#### `generateKeys(options?): Promise` + +Generate a fresh RSA key pair. + +```typescript +await client.generateKeys({ + keySize?: 2048 | 4096, // default: 4096 + saveToFile?: boolean, // default: false + keyDir?: string, // default: 'client_keys' + password?: string, // minimum 8 characters if provided +}); +``` + +#### `loadKeys(privateKeyPath, publicKeyPath?, password?): Promise` + +Load an existing key pair from PEM files. Node.js only. + +```typescript +await client.loadKeys( + 'client_keys/private_key.pem', + 'client_keys/public_key.pem', // optional; derived from private key path if omitted + 'your-password' // required if private key is encrypted +); +``` + +#### `fetchServerPublicKey(): Promise` + +Fetch the server's RSA public key from `/pki/public_key` over HTTPS. Called automatically on every encryption; exposed for diagnostics. + +#### `encryptPayload(payload): Promise` + +Encrypt a request payload. Returns the encrypted binary package ready to POST. 
+ +#### `decryptResponse(encrypted, payloadId): Promise` + +Decrypt a response body received from the secure endpoint. + +#### `sendSecureRequest(payload, payloadId, apiKey?, securityTier?): Promise` + +Full encrypt → POST → decrypt cycle with retry logic. Called internally by `SecureChatCompletion.create()`. + +#### `dispose(): void` + +Same as `SecureChatCompletion.dispose()`. + +--- + +## Secure Memory API + +```typescript +import { + getMemoryProtectionInfo, + disableSecureMemory, + enableSecureMemory, + SecureByteContext, +} from 'nomyo-js'; +``` + +### `getMemoryProtectionInfo(): ProtectionInfo` + +Returns information about the memory protection available on the current platform: + +```typescript +interface ProtectionInfo { + canLock: boolean; // true if mlock is available (requires native addon) + isPlatformSecure: boolean; + method: 'mlock' | 'zero-only' | 'none'; + details?: string; +} +``` + +### `disableSecureMemory(): void` + +Disable secure-memory zeroing globally. Affects new `SecureByteContext` instances that do not pass an explicit `useSecure` argument. Existing client instances are unaffected (they pass `useSecure` explicitly). + +### `enableSecureMemory(): void` + +Re-enable secure memory operations globally. + +### `SecureByteContext` + +Low-level context manager that zeros an `ArrayBuffer` in a `finally` block even if an exception occurs. Analogous to Python's `secure_bytearray()` context manager. + +```typescript +const context = new SecureByteContext(sensitiveBuffer); +const result = await context.use(async (data) => { + return doSomethingWith(data); +}); +// sensitiveBuffer is zeroed here regardless of whether doSomethingWith threw +``` + +--- + +## Error Classes + +All errors are exported from the package root. 
+ +```typescript +import { + APIError, + AuthenticationError, + InvalidRequestError, + RateLimitError, + ForbiddenError, + ServerError, + ServiceUnavailableError, + APIConnectionError, + SecurityError, + DisposedError, +} from 'nomyo-js'; +``` + + +| Class | HTTP status | Thrown when | +| --------------------------- | ------------- | -------------------------------------------------------------- | +| `AuthenticationError` | 401 | Invalid or missing API key | +| `InvalidRequestError` | 400 | Malformed request (e.g. streaming requested) | +| `ForbiddenError` | 403 | Model not allowed for the requested security tier | +| `RateLimitError` | 429 | Rate limit exceeded (after all retries exhausted) | +| `ServerError` | 500 | Internal server error (after all retries exhausted) | +| `ServiceUnavailableError` | 503 | Inference backend unavailable (after all retries exhausted) | +| `APIError` | varies | Other HTTP errors (404, 502, 504, etc.) | +| `APIConnectionError` | - | Network failure or timeout (after all retries exhausted) | +| `SecurityError` | - | HTTPS not used, header injection detected, or crypto failure | +| `DisposedError` | - | Method called after `dispose()` | + +All errors that extend `APIError` expose `statusCode?: number` and `errorDetails?: object`. diff --git a/doc/examples.md b/doc/examples.md new file mode 100644 index 0000000..4a57179 --- /dev/null +++ b/doc/examples.md @@ -0,0 +1,437 @@ +# Examples + +## Basic Usage + +### Simple Chat + +```javascript +import { SecureChatCompletion } from 'nomyo-js'; + +const client = new SecureChatCompletion({ apiKey: process.env.NOMYO_API_KEY }); + +const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Hello, how are you?' 
}], + temperature: 0.7, +}); + +console.log(response.choices[0].message.content); +client.dispose(); +``` + +### Chat with System Message + +```javascript +const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [ + { role: 'system', content: 'You are a concise technical assistant.' }, + { role: 'user', content: 'What is the capital of France?' }, + ], + temperature: 0.7, +}); + +console.log(response.choices[0].message.content); +``` + +--- + +## Security Tiers + +```javascript +// Standard: general use (GPU) +const r1 = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'General query' }], + security_tier: 'standard', +}); + +// High: sensitive business data +const r2 = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Review this contract clause...' }], + security_tier: 'high', +}); + +// Maximum: HIPAA PHI / classified data (CPU-only) +const r3 = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Patient record summary...' }], + security_tier: 'maximum', +}); +``` + +--- + +## Tool / Function Calling + +```javascript +const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: "What's the weather in Paris?" 
}], + tools: [ + { + type: 'function', + function: { + name: 'get_weather', + description: 'Get weather information for a location', + parameters: { + type: 'object', + properties: { + location: { type: 'string', description: 'City name' }, + }, + required: ['location'], + }, + }, + }, + ], + tool_choice: 'auto', +}); + +const message = response.choices[0].message; +if (message.tool_calls?.length) { + const call = message.tool_calls[0]; + const args = JSON.parse(call.function.arguments); + console.log(`Call ${call.function.name}(location="${args.location}")`); + // → Call get_weather(location="Paris") +} +``` + +--- + +## Error Handling + +```javascript +import { + SecureChatCompletion, + AuthenticationError, + RateLimitError, + ForbiddenError, + InvalidRequestError, + ServerError, + ServiceUnavailableError, + APIConnectionError, + SecurityError, +} from 'nomyo-js'; + +const client = new SecureChatCompletion({ apiKey: process.env.NOMYO_API_KEY }); + +try { + const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Hello' }], + }); + console.log(response.choices[0].message.content); + +} catch (err) { + if (err instanceof AuthenticationError) { + console.error('Check your API key:', err.message); + } else if (err instanceof RateLimitError) { + console.error('Rate limit hit after all retries:', err.message); + } else if (err instanceof ForbiddenError) { + console.error('Model not allowed for this security tier:', err.message); + } else if (err instanceof InvalidRequestError) { + console.error('Bad request:', err.message, err.errorDetails); + } else if (err instanceof ServerError || err instanceof ServiceUnavailableError) { + console.error('Server error after retries:', err.message); + } else if (err instanceof APIConnectionError) { + console.error('Network error after retries:', err.message); + } else if (err instanceof SecurityError) { + console.error('Security/crypto failure:', err.message); + } else { + throw err; 
+ } +} +``` + +--- + +## Real-World Scenarios + +### Chat Application with History + +```javascript +import { SecureChatCompletion } from 'nomyo-js'; + +class SecureChatApp { + constructor(apiKey) { + this.client = new SecureChatCompletion({ apiKey }); + this.history = []; + } + + async chat(userMessage) { + this.history.push({ role: 'user', content: userMessage }); + + const response = await this.client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: this.history, + temperature: 0.7, + }); + + const assistantMessage = response.choices[0].message; + this.history.push({ role: assistantMessage.role, content: assistantMessage.content }); + return assistantMessage.content; + } + + dispose() { + this.client.dispose(); + } +} + +const app = new SecureChatApp(process.env.NOMYO_API_KEY); + +const r1 = await app.chat("What's your name?"); +console.log('Assistant:', r1); + +const r2 = await app.chat('What did I just ask you?'); +console.log('Assistant:', r2); + +app.dispose(); +``` + +### Per-Request Base URL Override + +For multi-tenant setups or testing against different router instances from a single client: + +```javascript +const client = new SecureChatCompletion({ + baseUrl: 'https://primary.nomyo.ai:12435', + apiKey: process.env.NOMYO_API_KEY, +}); + +// This single request goes to a different router; a temporary client is +// created, used, and disposed automatically; the main client is unchanged +const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Hello from secondary router' }], + base_url: 'https://secondary.nomyo.ai:12435', +}); +``` + +### Environment-Based Configuration + +```javascript +import 'dotenv/config'; +import { SecureChatCompletion } from 'nomyo-js'; + +const client = new SecureChatCompletion({ + baseUrl: process.env.NOMYO_SERVER_URL ?? 'https://api.nomyo.ai', + apiKey: process.env.NOMYO_API_KEY, + keyDir: process.env.NOMYO_KEY_DIR ?? 
'client_keys', + maxRetries: Number(process.env.NOMYO_MAX_RETRIES ?? '2'), + debug: process.env.NODE_ENV === 'development', +}); +``` + +--- + +## Batch Processing + +### Sequential (Rate-Limit-Safe) + +```javascript +const queries = [ + 'Summarise document A', + 'Summarise document B', + 'Summarise document C', +]; + +const summaries = []; +for (const query of queries) { + const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: query }], + }); + summaries.push(response.choices[0].message.content); + // Optional: add a small delay to stay within rate limits + await new Promise(r => setTimeout(r, 600)); +} +``` + +### Concurrent (With Throttling) + +```javascript +// Process in batches of 2 (the default rate limit) +async function batchN(items, batchSize, fn) { + const results = []; + for (let i = 0; i < items.length; i += batchSize) { + const batch = items.slice(i, i + batchSize); + const batchResults = await Promise.all(batch.map(fn)); + results.push(...batchResults); + if (i + batchSize < items.length) { + await new Promise(r => setTimeout(r, 1100)); // wait >1 s between batches + } + } + return results; +} + +const summaries = await batchN(documents, 2, async (doc) => { + const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: `Summarise: ${doc}` }], + }); + return response.choices[0].message.content; +}); +``` + +--- + +## Thinking Models + +```javascript +const response = await client.create({ + model: 'LiquidAI/LFM2.5-1.2B-Thinking', + messages: [{ role: 'user', content: 'Is 9.9 larger than 9.11?' }], +}); + +const { content, reasoning_content } = response.choices[0].message; +console.log('Reasoning:', reasoning_content); // internal chain-of-thought +console.log('Answer:', content); // final answer to the user +``` + +--- + +## Browser Usage + +```html + + + + NOMYO Secure Chat + + + + +
+ + + + +``` + +--- + +## Advanced Key Management + +### Custom Key Directory + +```javascript +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, + keyDir: '/var/lib/myapp/nomyo-keys', // outside project directory + keyRotationDir: '/var/lib/myapp/nomyo-keys', + keyRotationPassword: process.env.NOMYO_KEY_PASSWORD, +}); +``` + +### Generating Keys Manually + +```javascript +import { SecureCompletionClient } from 'nomyo-js'; + +const client = new SecureCompletionClient({ + routerUrl: 'https://api.nomyo.ai', +}); + +// Generate a new 4096-bit key pair and save it with password protection +await client.generateKeys({ + saveToFile: true, + keyDir: 'client_keys', + password: process.env.NOMYO_KEY_PASSWORD, +}); + +console.log('Keys generated and saved to client_keys/'); +client.dispose(); +``` + +### Loading Keys Explicitly + +```javascript +import { SecureCompletionClient } from 'nomyo-js'; + +const client = new SecureCompletionClient({ routerUrl: 'https://api.nomyo.ai' }); + +await client.loadKeys( + 'client_keys/private_key.pem', + 'client_keys/public_key.pem', + process.env.NOMYO_KEY_PASSWORD +); + +// Now send requests using the loaded keys +const result = await client.sendSecureRequest( + { model: 'Qwen/Qwen3-0.6B', messages: [{ role: 'user', content: 'Hello' }] }, + crypto.randomUUID() +); +client.dispose(); +``` + +--- + +## Inspecting Memory Protection + +```javascript +import { getMemoryProtectionInfo } from 'nomyo-js'; + +const info = getMemoryProtectionInfo(); + +console.log(`Memory method: ${info.method}`); // 'zero-only' or 'mlock' +console.log(`Can lock: ${info.canLock}`); // true if native addon present +console.log(`Details: ${info.details}`); +``` + +--- + +## TypeScript + +Full type safety out of the box: + +```typescript +import { + SecureChatCompletion, + ChatCompletionRequest, + ChatCompletionResponse, + Message, +} from 'nomyo-js'; + +const client = new SecureChatCompletion({ apiKey: process.env.NOMYO_API_KEY! 
}); + +const messages: Message[] = [ + { role: 'user', content: 'Hello!' }, +]; + +const request: ChatCompletionRequest = { + model: 'Qwen/Qwen3-0.6B', + messages, + temperature: 0.7, +}; + +const response: ChatCompletionResponse = await client.create(request); +const content = response.choices[0].message.content; + +client.dispose(); +``` diff --git a/doc/getting-started.md b/doc/getting-started.md new file mode 100644 index 0000000..d224213 --- /dev/null +++ b/doc/getting-started.md @@ -0,0 +1,279 @@ +# Getting Started + +## Overview + +NOMYO.js provides end-to-end encryption for all communication between your application and NOMYO inference endpoints. Your prompts and responses are encrypted before leaving your process and decrypted only after they arrive back; the server never sees plaintext. + +The API mirrors OpenAI's `ChatCompletion`, making it easy to integrate into existing code. + +> **Note on streaming:** The API is non-streaming. Setting `stream: true` in a request is ignored server-side to maintain full response encryption. + +--- + +## Simple Chat Completion + +```javascript +import { SecureChatCompletion } from 'nomyo-js'; + +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, +}); + +const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Hello! How are you today?' }], + temperature: 0.7, +}); + +// Extract what you need, then let the response go out of scope promptly. +// This minimises the time decrypted data lives in process memory +// (reduces exposure from swap files, core dumps, or memory inspection). +const reply = response.choices[0].message.content; +console.log(reply); +``` + +### With a System Message + +```javascript +const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [ + { role: 'system', content: 'You are a helpful assistant.' }, + { role: 'user', content: 'What is the capital of France?' 
}, + ], + temperature: 0.7, +}); + +console.log(response.choices[0].message.content); +``` + +--- + +## API Key Authentication + +```javascript +// Constructor-level key (used for all requests from this instance) +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, +}); + +// Per-request key override (takes precedence over constructor key) +const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Hello!' }], + api_key: 'override-key-for-this-request', +}); +``` + +--- + +## Security Tiers + +Pass `security_tier` in the request to control hardware routing and isolation level: + + +| Tier | Use case | +| -------------- | ------------------------------------------------------- | +| `"standard"` | General secure inference (GPU) | +| `"high"` | Sensitive business data; enforces secure tokenizer | +| `"maximum"` | HIPAA PHI, classified data; E2EE, maximum isolation | + +```javascript +// Standard: general use +const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'General query' }], + security_tier: 'standard', +}); + +// High: sensitive business data +const response2 = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Summarise this contract clause...' }], + security_tier: 'high', +}); + +// Maximum: PHI / classified data +const response3 = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Patient record summary...' }], + security_tier: 'maximum', +}); +``` + +> Using `"high"` or `"maximum"` adds latency vs `"standard"` due to additional isolation measures. + +--- + +## Using Tools (Function Calling) + +```javascript +const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: "What's the weather in Paris?" 
}], + tools: [ + { + type: 'function', + function: { + name: 'get_weather', + description: 'Get weather information for a location', + parameters: { + type: 'object', + properties: { + location: { type: 'string', description: 'City name' }, + }, + required: ['location'], + }, + }, + }, + ], + tool_choice: 'auto', + temperature: 0.7, +}); + +const message = response.choices[0].message; +if (message.tool_calls) { + const call = message.tool_calls[0]; + console.log('Tool called:', call.function.name); + console.log('Arguments:', call.function.arguments); +} +``` + +--- + +## Error Handling + +Import typed error classes to distinguish failure modes: + +```javascript +import { + SecureChatCompletion, + AuthenticationError, + RateLimitError, + InvalidRequestError, + ForbiddenError, + ServerError, + ServiceUnavailableError, + APIConnectionError, + SecurityError, +} from 'nomyo-js'; + +const client = new SecureChatCompletion({ apiKey: process.env.NOMYO_API_KEY }); + +try { + const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Hello!' }], + }); + console.log(response.choices[0].message.content); + +} catch (err) { + if (err instanceof AuthenticationError) { + console.error('Bad API key:', err.message); + + } else if (err instanceof RateLimitError) { + // The client already retried automatically (default: 2 retries). + // If you reach here, all retries were exhausted. 
+ console.error('Rate limit exceeded after retries:', err.message); + + } else if (err instanceof ForbiddenError) { + // Model not allowed for the requested security_tier + console.error('Forbidden:', err.message); + + } else if (err instanceof InvalidRequestError) { + console.error('Bad request:', err.message); + + } else if (err instanceof ServerError || err instanceof ServiceUnavailableError) { + console.error('Server error (retries exhausted):', err.message); + + } else if (err instanceof APIConnectionError) { + console.error('Network error (retries exhausted):', err.message); + + } else if (err instanceof SecurityError) { + console.error('Encryption/decryption failure:', err.message); + + } else { + throw err; // re-throw unexpected errors + } +} +``` + +All typed errors expose: + +- `message: string`: human-readable description +- `statusCode?: number`: HTTP status (where applicable) +- `errorDetails?: object`: raw response body (where applicable) + +--- + +## Resource Management + +Always call `dispose()` when you're done with a client to stop the background key-rotation timer and zero in-memory key material: + +```javascript +const client = new SecureChatCompletion({ apiKey: process.env.NOMYO_API_KEY }); + +try { + const response = await client.create({ ... }); + console.log(response.choices[0].message.content); +} finally { + client.dispose(); +} +``` + +For long-running servers (HTTP handlers, daemons), create one client instance and reuse it; don't create a new one per request. + +--- + +## `acreate()` Alias + +`acreate()` is a direct alias for `create()` provided for code that follows the OpenAI naming convention: + +```javascript +const response = await client.acreate({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'Hello!' }], +}); +``` + +--- + +## Browser Usage + +In browsers, keys are kept in memory only (no file system). Everything else is identical to Node.js. 
+ +```html + +``` + +> **Security note:** Embedding API keys in browser-side code exposes them to end users. In a real application, proxy requests through your backend or use short-lived tokens. + +--- + +## Local Development (HTTP) + +For a local NOMYO router running over plain HTTP: + +```javascript +const client = new SecureChatCompletion({ + baseUrl: 'http://localhost:12435', + allowHttp: true, // required; prints a security warning +}); +``` + +Never use `allowHttp: true` in production. diff --git a/doc/installation.md b/doc/installation.md new file mode 100644 index 0000000..7b757bd --- /dev/null +++ b/doc/installation.md @@ -0,0 +1,107 @@ +# Installation + +## Prerequisites + +- **Node.js**: 14.17 or higher (18 LTS recommended) +- **npm** / **yarn** / **pnpm** +- For TypeScript projects: TypeScript 4.7+ + +## Install from npm + +```bash +npm install nomyo-js +``` + +```bash +yarn add nomyo-js +``` + +```bash +pnpm add nomyo-js +``` + +## Browser (CDN) + +```html + +``` + +--- + +## Verify Installation + +### Node.js + +```javascript +import { SecureChatCompletion, getMemoryProtectionInfo } from 'nomyo-js'; + +const info = getMemoryProtectionInfo(); +console.log('Memory protection:', info.method); // e.g. "zero-only" +console.log('Can lock:', info.canLock); // true if native addon present + +const client = new SecureChatCompletion({ apiKey: 'test' }); +console.log('nomyo-js installed successfully'); +client.dispose(); +``` + +### CommonJS + +```javascript +const { SecureChatCompletion } = require('nomyo-js'); +``` + +## Optional: Native Memory Addon + +The pure-JS implementation zeroes buffers immediately after use but cannot prevent the OS from paging sensitive data to swap. +The optional native addon adds `mlock`/`VirtualLock` support for true OS-level memory locking. 
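As a rough illustration of what "zeroes buffers immediately after use" means in pure JavaScript, the sketch below runs a callback inside `try`/`finally` and overwrites the bytes afterwards. `withZeroedBuffer` is a hypothetical helper written for this page, not part of the `nomyo-js` API, and it is synchronous for brevity.

```javascript
// Hypothetical helper for illustration only; not exported by nomyo-js.
// Runs fn with the sensitive bytes, then overwrites them with zeros,
// even if fn throws.
function withZeroedBuffer(bytes, fn) {
  try {
    return fn(bytes);
  } finally {
    bytes.fill(0); // Uint8Array.prototype.fill overwrites in place
  }
}

const secret = new Uint8Array([1, 2, 3, 4]);
const sum = withZeroedBuffer(secret, (data) => data.reduce((a, b) => a + b, 0));

console.log(sum);                // 10
console.log(Array.from(secret)); // [ 0, 0, 0, 0 ]
```

Zeroing only clears this one view of the data. It does not stop the OS from paging the bytes to swap before they are cleared, which is the gap the native addon's memory locking addresses.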
+ +```bash +cd node_modules/nomyo-js/native +npm install +npm run build +``` + +Or install `nomyo-native` separately: + +```bash +npm install nomyo-native +``` + +When the addon is present, `getMemoryProtectionInfo()` will report `method: 'mlock'` and `canLock: true`. + +## TypeScript + +All public APIs ship with bundled type declarations; no `@types/` package is required. + +```typescript +import { + SecureChatCompletion, + ChatCompletionRequest, + ChatCompletionResponse, + getMemoryProtectionInfo, +} from 'nomyo-js'; +``` + +## Environment Variables + +Store secrets outside source code: + +```bash +# .env (never commit this file) +NOMYO_API_KEY=your-api-key +NOMYO_SERVER_URL=https://api.nomyo.ai +NOMYO_KEY_PASSWORD=your-key-password +``` + +```javascript +import 'dotenv/config'; // or use process.env directly +import { SecureChatCompletion } from 'nomyo-js'; + +const client = new SecureChatCompletion({ + baseUrl: process.env.NOMYO_SERVER_URL, + apiKey: process.env.NOMYO_API_KEY, +}); +``` diff --git a/doc/models.md b/doc/models.md new file mode 100644 index 0000000..046406e --- /dev/null +++ b/doc/models.md @@ -0,0 +1,85 @@ +# Available Models + +All models are available via `api.nomyo.ai`. Pass the model ID string directly to the `model` field of `create()`. 
+ +## Model List + +| Model ID | Parameters | Type | Notes | +|---|---|---|---| +| `Qwen/Qwen3-0.6B` | 0.6B | General | Lightweight, fast inference | +| `Qwen/Qwen3.5-0.8B` | 0.8B | General | Lightweight, fast inference | +| `LiquidAI/LFM2.5-1.2B-Thinking` | 1.2B | Thinking | Reasoning model | +| `ibm-granite/granite-4.0-h-small` | Small | General | IBM Granite 4.0, enterprise-focused | +| `Qwen/Qwen3.5-9B` | 9B | General | Balanced quality and speed | +| `utter-project/EuroLLM-9B-Instruct-2512` | 9B | General | Multilingual, strong European language support | +| `zai-org/GLM-4.7-Flash` | - | General | Fast GLM variant | +| `mistralai/Ministral-3-14B-Instruct-2512-GGUF` | 14B | General | Mistral instruction-tuned | +| `ServiceNow-AI/Apriel-1.6-15b-Thinker` | 15B | Thinking | Reasoning model | +| `openai/gpt-oss-20b` | 20B | General | OpenAI open-weight release | +| `LiquidAI/LFM2-24B-A2B` | 24B (2B active) | General | MoE, efficient inference | +| `Qwen/Qwen3.5-27B` | 27B | General | High quality, large context | +| `google/medgemma-27b-it` | 27B | Specialized | Medical domain, instruction-tuned | +| `nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4` | 30B (3B active) | General | MoE, efficient inference | +| `Qwen/Qwen3.5-35B-A3B` | 35B (3B active) | General | MoE, efficient inference | +| `moonshotai/Kimi-Linear-48B-A3B-Instruct` | 48B (3B active) | General | MoE, large capacity, efficient inference | + +> **MoE** (Mixture of Experts) models show total/active parameter counts. Only active parameters are used per token, keeping inference cost low relative to total model size. + +## Usage + +```javascript +import { SecureChatCompletion } from 'nomyo-js'; + +const client = new SecureChatCompletion({ apiKey: process.env.NOMYO_API_KEY }); + +const response = await client.create({ + model: 'Qwen/Qwen3.5-9B', + messages: [{ role: 'user', content: 'Hello!' 
}], +}); +``` + +## Choosing a Model + +| Goal | Recommended models | +|------|--------------------| +| **Low latency / edge** | `Qwen/Qwen3-0.6B`, `Qwen/Qwen3.5-0.8B`, `LiquidAI/LFM2.5-1.2B-Thinking` | +| **Balanced quality + speed** | `Qwen/Qwen3.5-9B`, `mistralai/Ministral-3-14B-Instruct-2512-GGUF` | +| **Reasoning / chain-of-thought** | `LiquidAI/LFM2.5-1.2B-Thinking`, `ServiceNow-AI/Apriel-1.6-15b-Thinker` | +| **Multilingual** | `utter-project/EuroLLM-9B-Instruct-2512` | +| **Medical** | `google/medgemma-27b-it` | +| **Highest quality** | `moonshotai/Kimi-Linear-48B-A3B-Instruct`, `Qwen/Qwen3.5-35B-A3B` | + +## Thinking Models + +Models marked **Thinking** return an additional `reasoning_content` field in the response message alongside the normal `content`. This contains the model's internal chain-of-thought: + +```javascript +const response = await client.create({ + model: 'LiquidAI/LFM2.5-1.2B-Thinking', + messages: [{ role: 'user', content: 'Is 9.9 or 9.11 larger?' }], +}); + +const { content, reasoning_content } = response.choices[0].message; +console.log('Reasoning:', reasoning_content); // internal chain-of-thought +console.log('Answer:', content); // final answer +``` + +## Security Tier Compatibility + +Not all models are available on all security tiers. If a model is not permitted for the requested tier, the server returns HTTP 403 and the client throws `ForbiddenError`. + +```javascript +import { ForbiddenError } from 'nomyo-js'; + +try { + const response = await client.create({ + model: 'Qwen/Qwen3.5-27B', + messages: [{ role: 'user', content: '...' 
}], + security_tier: 'maximum', + }); +} catch (err) { + if (err instanceof ForbiddenError) { + // Model not available at this security tier; retry with a different tier or model + } +} +``` diff --git a/doc/rate-limits.md b/doc/rate-limits.md new file mode 100644 index 0000000..1b3bd41 --- /dev/null +++ b/doc/rate-limits.md @@ -0,0 +1,115 @@ +# Rate Limits + +The NOMYO API (`api.nomyo.ai`) enforces rate limits to ensure fair usage and service stability for all users. + +## Default Rate Limit + +By default, each API key is limited to **2 requests per second**. + +## Burst Allowance + +Short bursts above the default limit are permitted. You may send up to **4 requests per second** in burst mode, provided you have not exceeded burst usage within the current **10-second window**. + +Burst capacity is granted once per 10-second window. If you consume the burst allowance, you must wait for the window to reset before burst is available again. + +## Rate Limit Summary + +| Mode | Limit | Condition | +|------|-------|-----------| +| Default | 2 requests/second | Always active | +| Burst | 4 requests/second | Once per 10-second window | + +## Error Responses + +### 429 Too Many Requests + +Returned when your request rate exceeds the allowed limit. + +The client retries automatically (see below). If all retries are exhausted, `RateLimitError` is thrown: + +```javascript +import { SecureChatCompletion, RateLimitError } from 'nomyo-js'; + +try { + const response = await client.create({ ... }); +} catch (err) { + if (err instanceof RateLimitError) { + // All retries exhausted; back off manually before trying again + console.error('Rate limit exceeded:', err.message); + } +} +``` + +### 503 Service Unavailable (Cool-down) + +Returned when burst limits are abused repeatedly. A **30-minute cool-down** is applied to the offending API key. + +**What to do:** Wait 30 minutes before retrying. Review your request patterns to ensure you stay within the permitted limits. 
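An application may want to stop sending requests altogether once a cool-down is suspected, rather than keep hitting the API. The sketch below is one way to track that. `CooldownGate` is a hypothetical helper (not part of `nomyo-js`), and it assumes any `ServiceUnavailableError` should be treated as a cool-down, which may be too conservative because `503` can also mean the inference backend is temporarily unavailable.

```javascript
// Hypothetical helper for illustration only; not part of nomyo-js.
const COOLDOWN_MS = 30 * 60 * 1000; // the 30-minute cool-down described above

class CooldownGate {
  constructor(now = () => Date.now()) {
    this.now = now;        // injectable clock, handy for tests
    this.blockedUntil = 0; // timestamp (ms) until which requests are paused
  }

  // Call this when a request fails with ServiceUnavailableError.
  trip() {
    this.blockedUntil = this.now() + COOLDOWN_MS;
  }

  // True when no cool-down is active and requests may be sent.
  isOpen() {
    return this.now() >= this.blockedUntil;
  }

  // Milliseconds left before requests may resume (0 if open).
  remainingMs() {
    return Math.max(0, this.blockedUntil - this.now());
  }
}

// Usage with a fake clock so the behaviour is easy to follow.
let fakeNow = 0;
const gate = new CooldownGate(() => fakeNow);
gate.trip();
fakeNow = 10 * 60 * 1000;        // 10 minutes later
console.log(gate.isOpen());      // false
console.log(gate.remainingMs()); // 1200000
fakeNow = 30 * 60 * 1000;        // 30 minutes later
console.log(gate.isOpen());      // true
```

In application code you would check `gate.isOpen()` before calling `client.create()` and call `gate.trip()` in the `catch` branch that handles `ServiceUnavailableError`.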
+ +## Automatic Retry Behaviour + +The client retries automatically on `429`, `500`, `502`, `503`, `504`, and network errors using exponential backoff: + +| Attempt | Delay before attempt | +|---------|----------------------| +| 1st (initial) | none | +| 2nd | 1 second | +| 3rd | 2 seconds | + +The default is **2 retries** (3 total attempts). Adjust per client: + +```javascript +// More retries for high-throughput workloads +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, + maxRetries: 5, +}); + +// Disable retries entirely (fail fast) +const client2 = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, + maxRetries: 0, +}); +``` + +## Best Practices + +- **Throttle requests client-side** to stay at or below 2 requests/second under normal load. +- **Use burst sparingly**: it is intended for occasional spikes, not sustained high-throughput usage. +- **Increase `maxRetries`** for background jobs that can tolerate extra latency. +- **Monitor for `503` responses**: repeated occurrences indicate your usage pattern is triggering the abuse threshold. +- **Parallel requests** (e.g. `Promise.all`) count against the same rate limit; be careful with large batches. 
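To get a feel for how much extra latency a larger `maxRetries` can add, the sketch below lists the delay before each retry. It assumes the documented pattern (1 second, then 2 seconds) keeps doubling for further retries, with no jitter and no upper cap; only the first two delays are documented, so anything beyond them is an assumption. `backoffDelaysMs` is a hypothetical helper, not a `nomyo-js` export.

```javascript
// Hypothetical helper for illustration only; not part of nomyo-js.
// Assumes delay = baseMs * 2^retryIndex, matching the documented 1 s and 2 s.
function backoffDelaysMs(maxRetries, baseMs = 1000) {
  const delays = [];
  for (let retry = 0; retry < maxRetries; retry++) {
    delays.push(baseMs * 2 ** retry);
  }
  return delays;
}

const defaultDelays = backoffDelaysMs(2);
console.log(defaultDelays);   // [ 1000, 2000 ]

const fiveRetries = backoffDelaysMs(5);
const worstCaseWaitMs = fiveRetries.reduce((a, b) => a + b, 0);
console.log(fiveRetries);     // [ 1000, 2000, 4000, 8000, 16000 ]
console.log(worstCaseWaitMs); // 31000
```

Under that assumption, `maxRetries: 5` could add about 31 seconds of waiting on top of the request times themselves, which is why a higher value suits background jobs better than interactive paths.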
+ +## Batch Processing Example + +Throttle parallel requests to stay within the rate limit: + +```javascript +import { SecureChatCompletion } from 'nomyo-js'; + +const client = new SecureChatCompletion({ apiKey: process.env.NOMYO_API_KEY }); + +async function throttledBatch(queries, requestsPerSecond = 2) { + const results = []; + const delayMs = 1000 / requestsPerSecond; + + for (const query of queries) { + const start = Date.now(); + + const response = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: query }], + }); + results.push(response.choices[0].message.content); + + // Throttle: wait for the remainder of the time slot + const elapsed = Date.now() - start; + if (elapsed < delayMs) { + await new Promise(resolve => setTimeout(resolve, delayMs - elapsed)); + } + } + + client.dispose(); + return results; +} +``` diff --git a/doc/security-guide.md b/doc/security-guide.md new file mode 100644 index 0000000..4f572ac --- /dev/null +++ b/doc/security-guide.md @@ -0,0 +1,237 @@ +# Security Guide + +## Overview + +NOMYO.js provides end-to-end encryption for all communication between your application and NOMYO inference endpoints. Your prompts and responses are encrypted before leaving your process; the inference server never processes plaintext. + +For the full cryptographic architecture and threat model see [SECURITY.md](SECURITY.md). + +--- + +## Encryption Mechanism + +### Hybrid Encryption + +Each request uses a two-layer scheme: + +1. **AES-256-GCM** encrypts the payload (authenticated encryption, which detects tampering). +2. **RSA-OAEP-SHA256** wraps the AES key for secure key exchange. + +The server holds the RSA private key; your client generates the AES key fresh for every request. + +### Per-Request Ephemeral AES Keys + +- A new 256-bit AES key is generated for every `create()` call using the Web Crypto API. +- The key is never reused; forward secrecy is ensured per request. 
+- The key is zeroed from memory immediately after encryption. + +### Key Exchange + +Your client's RSA public key is sent in the `X-Public-Key` request header. The server encrypts the response with it so only your client can decrypt the reply. + +--- + +## Memory Protection + +### What the Library Does + +All intermediate sensitive buffers (AES key, plaintext payload, decrypted response bytes) are wrapped in `SecureByteContext`. This guarantees they are zeroed in a `finally` block immediately after use, even if an exception occurs. + +The encrypted request body (`ArrayBuffer`) is also zeroed by the Node.js HTTP client after the data is handed to the socket. + +### Limitations (Pure JavaScript) + +JavaScript has no direct access to OS memory management. The library cannot: + +- Lock pages to prevent swapping (`mlock` / `VirtualLock`) +- Prevent the garbage collector from copying data internally +- Guarantee memory won't appear in core dumps + +**Impact:** On a system under memory pressure, sensitive data could briefly be written to swap. For environments where this is unacceptable (PHI, classified), install the optional native addon or run on a system with swap disabled. + +### Native Addon (Optional) + +The `nomyo-native` addon adds true `mlock` support. When installed, `getMemoryProtectionInfo()` reports `method: 'mlock'` and `canLock: true`: + +```javascript +import { getMemoryProtectionInfo } from 'nomyo-js'; + +const info = getMemoryProtectionInfo(); +// Without addon: { method: 'zero-only', canLock: false } +// With addon: { method: 'mlock', canLock: true } +``` + +--- + +## Minimise Response Lifetime + +The library protects all intermediate crypto material in secure memory. However, the **final parsed response object** is returned to your code, and you are responsible for how long it lives. 
+ +```javascript +// GOOD: extract what you need, then drop the response immediately +const response = await client.create({ + model: 'Qwen/Qwen3.5-9B', + messages: [{ role: 'user', content: 'Summarise patient record #1234' }], + security_tier: 'maximum', +}); +const reply = response.choices[0].message.content; +// Let response go out of scope here; don't hold it in a variable +// longer than necessary + +// BAD: holding the full response object in a long-lived scope +this.lastResponse = response; // stored for minutes / hours +``` + +JavaScript's `delete` and variable reassignment do not zero the underlying memory. For sensitive data (PHI, classified), process and discard as quickly as possible; do not store in class attributes, global caches, or log files. + +--- + +## Key Management + +### Default Behaviour + +Keys are automatically generated on first use and saved to `client_keys/` (Node.js). On subsequent runs the saved keys are reloaded automatically. + +``` +client_keys/ + private_key.pem # permissions 0600 (owner-only) + public_key.pem # permissions 0644 +``` + +### Configure the Key Directory + +```javascript +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, + keyDir: '/etc/myapp/nomyo-keys', // custom path, outside project directory +}); +``` + +### Password-Protected Keys (Recommended for Production) + +Protect key files with a password so they cannot be used even if the file is leaked: + +```javascript +import { SecureCompletionClient } from 'nomyo-js'; + +const client = new SecureCompletionClient({ routerUrl: 'https://api.nomyo.ai' }); + +await client.generateKeys({ + saveToFile: true, + keyDir: 'client_keys', + password: process.env.NOMYO_KEY_PASSWORD, // minimum 8 characters +}); +``` + +To load password-protected keys manually: + +```javascript +await client.loadKeys( + 'client_keys/private_key.pem', + 'client_keys/public_key.pem', + process.env.NOMYO_KEY_PASSWORD +); +``` + +### Key Rotation + +Keys rotate 
automatically every 24 hours by default. Configure or disable: + +```javascript +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, + keyRotationInterval: 3600000, // rotate every hour + keyRotationDir: '/var/lib/myapp/keys', + keyRotationPassword: process.env.KEY_PWD, +}); + +// Or disable entirely for short-lived processes +const client2 = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, + keyRotationInterval: 0, +}); +``` + +### File Permissions + +Private key files are saved with `0600` permissions (owner read/write only) on Unix-like systems. Add `client_keys/` and `*.pem` to your `.gitignore`; both are already included if you use this package's default `.gitignore`. + +--- + +## Security Tiers + +| Tier | Hardware | Use case | +|------|----------|----------| +| `"standard"` | GPU | General secure inference | +| `"high"` | CPU/GPU balanced | Sensitive business data, enforces secure tokenizer | +| `"maximum"` | CPU only | HIPAA PHI, classified data; maximum isolation | + +Higher tiers add round-trip latency but increase hardware-level isolation. + +--- + +## HTTPS Enforcement + +The client enforces HTTPS by default. HTTP connections require explicit opt-in and print a visible warning: + +```javascript +// Production: HTTPS only (default) +const client = new SecureChatCompletion({ baseUrl: 'https://api.nomyo.ai' }); + +// Local development: HTTP allowed with explicit flag +const devClient = new SecureChatCompletion({ + baseUrl: 'http://localhost:12435', + allowHttp: true, // prints: "WARNING: Using HTTP instead of HTTPS..." +}); +``` + +Without `allowHttp: true`, connecting over HTTP throws `SecurityError`. + +The server's public key is fetched over HTTPS with TLS certificate verification to prevent man-in-the-middle attacks. + +--- + +## API Key Security + +API keys are sent as `Bearer` tokens in the `Authorization` header. 
The client validates that the key does not contain CR or LF characters to prevent HTTP header injection. + +Never hardcode API keys in source code; use environment variables: + +```javascript +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, +}); +``` + +--- + +## Production Checklist + +- [ ] Always use HTTPS (`allowHttp` is `false` by default) +- [ ] Load API key from environment variable, not hardcoded +- [ ] Enable `secureMemory: true` (default) +- [ ] Use password-protected key files (`keyRotationPassword`) +- [ ] Store keys outside the project directory and outside version control +- [ ] Add `client_keys/` and `*.pem` to `.gitignore` +- [ ] Call `client.dispose()` when the client is no longer needed +- [ ] Consider the native addon if swap-file exposure is unacceptable + +--- + +## Compliance Considerations + +### HIPAA + +For Protected Health Information (PHI): +- Use `security_tier: 'maximum'` on requests containing PHI +- Enable password-protected key files +- Ensure HTTPS is enforced (the default) +- Minimise response lifetime in memory (extract, use, discard) + +### Data Classification + +| Classification | Recommended tier | +|---------------|-----------------| +| Public / internal | `"standard"` | +| Confidential business data | `"high"` | +| PHI, PII, classified | `"maximum"` | diff --git a/doc/troubleshooting.md b/doc/troubleshooting.md new file mode 100644 index 0000000..2ac32e1 --- /dev/null +++ b/doc/troubleshooting.md @@ -0,0 +1,314 @@ +# Troubleshooting + +## Authentication Errors + +### `AuthenticationError: Invalid or missing API key` + +The server rejected your API key. + +**Causes and fixes:** + +- Key not set: pass `apiKey` to the constructor or use `process.env.NOMYO_API_KEY`. +- Key has leading/trailing whitespace: check the value with `console.log(JSON.stringify(process.env.NOMYO_API_KEY))`. 
+- Key contains CR or LF characters: the client rejects keys with `\r` or `\n` and throws `SecurityError` before the request is sent. Regenerate the key. + +```javascript +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, // never hardcode +}); +``` + +--- + +## Connection Errors + +### `APIConnectionError: Network error` / `connect ECONNREFUSED` + +The client could not reach the router. + +**Check:** + +1. `baseUrl` is correct; the default is `https://api.nomyo.ai` (port **12435**). +2. You have network access to the host. +3. TLS is not being blocked by a proxy or firewall. + +### `SecurityError: HTTPS is required` + +You passed an `http://` URL without setting `allowHttp: true`. + +```javascript +// Local dev only +const client = new SecureChatCompletion({ + baseUrl: 'http://localhost:12435', + allowHttp: true, +}); +``` + +Never set `allowHttp: true` in production; the server public key fetch and all request data would travel unencrypted. + +### `APIConnectionError: Request timed out` + +The default timeout is 60 seconds. Larger models or busy endpoints may need more: + +```javascript +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, + timeout: 120000, // 2 minutes +}); +``` + +--- + +## Key Loading Failures + +### `Error: Failed to load keys: no such file or directory` + +The `keyDir` directory or the PEM files inside it don't exist. On first run the library generates and saves a new key pair automatically. If you specified a custom `keyDir`, make sure the directory is writable: + +```javascript +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, + keyDir: '/var/lib/myapp/nomyo-keys', // directory must exist and be writable +}); +``` + +### `Error: Invalid passphrase` / `Error: Failed to decrypt private key` + +The password you passed to `loadKeys()` or `keyRotationPassword` doesn't match what was used to encrypt the file. 
+ +```javascript +await client.loadKeys( + 'client_keys/private_key.pem', + 'client_keys/public_key.pem', + process.env.NOMYO_KEY_PASSWORD, // must match the password used on generateKeys() +); +``` + +### `Error: RSA key too small` + +The library enforces a minimum key size of 2048 bits. If you have old 1024-bit keys, regenerate them: + +```javascript +await client.generateKeys({ + saveToFile: true, + keyDir: 'client_keys', + keySize: 4096, // recommended +}); +``` + +### `Error: Failed to load keys` (browser) + +Key loading from files is a Node.js-only feature. In browsers, keys are generated in memory on first use. Do not call `loadKeys()` in a browser context. + +--- + +## Rate Limit Errors + +### `RateLimitError: Rate limit exceeded` + +All automatic retries were exhausted. The default limit is 2 requests/second; burst allows 4 requests/second once per 10-second window. + +**Fixes:** + +- Reduce concurrency: avoid large `Promise.all` batches. +- Add client-side throttling (see [Rate Limits](rate-limits.md)). +- Increase `maxRetries` so the client backs off longer before giving up: + +```javascript +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, + maxRetries: 5, +}); +``` + +### `ServiceUnavailableError` with 30-minute cool-down + +Burst limits were hit repeatedly and a cool-down was applied to your key. Wait 30 minutes, then review your request patterns. + +--- + +## Model / Tier Errors + +### `ForbiddenError: Model not allowed for this security tier` + +The model you requested is not available at the security tier you specified. Try a lower tier or a different model: + +```javascript +// If 'maximum' tier rejects the model, try 'high' or 'standard' +const response = await client.create({ + model: 'Qwen/Qwen3.5-27B', + messages: [...], + security_tier: 'high', // try 'standard' if still rejected +}); +``` + +See [Models: Security Tier Compatibility](models.md#security-tier-compatibility) for details. 
+ +--- + +## Crypto / Security Errors + +### `SecurityError: Decryption failed` + +The response could not be decrypted. This is intentionally vague to avoid leaking crypto details. + +**Possible causes:** + +- The server returned a malformed response (check `debug: true` output). +- A network proxy modified the response body. +- The server's public key changed mid-session; the next request will re-fetch it automatically. + +Enable debug mode to log the raw response and narrow the cause: + +```javascript +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, + debug: true, +}); +``` + +### `Error: Unsupported protocol version` / `Error: Unsupported encryption algorithm` + +The server sent a response in a protocol version or with an encryption algorithm not supported by this client version. Update the package: + +```bash +npm update nomyo-js +``` + +--- + +## `DisposedError`: Method called after `dispose()` + +You called a method on a client that has already been disposed. + +```javascript +client.dispose(); +await client.create(...); // throws DisposedError +``` + +Create a new client instance if you need to make more requests after disposal. + +--- + +## Memory Protection Warnings + +### `getMemoryProtectionInfo()` returns `method: 'zero-only'` + +This is normal for a pure JavaScript installation. The library zeroes sensitive buffers immediately after use but cannot lock pages to prevent swapping (OS `mlock` requires a native addon). + +```javascript +import { getMemoryProtectionInfo } from 'nomyo-js'; + +const info = getMemoryProtectionInfo(); +// { method: 'zero-only', canLock: false, isPlatformSecure: false } +``` + +For environments where swap-file exposure is unacceptable (HIPAA PHI, classified data), install the optional `nomyo-native` addon or run on a system with swap disabled. 
+ +--- + +## Node.js-Specific Issues + +### `ReferenceError: crypto is not defined` + +In CommonJS modules on Node.js before v19, `crypto` is not a global. Import it explicitly: + +```javascript +// CommonJS +const { webcrypto } = require('crypto'); +global.crypto = webcrypto; + +// Or switch to ES modules (recommended) +// package.json: "type": "module" +``` + +The library itself imports `crypto` correctly; this error only appears if your own application code tries to use `crypto` directly. + +### `SyntaxError: Cannot use import statement outside a module` / CommonJS vs ESM + +The package ships both CommonJS (`dist/node/`) and ESM (`dist/esm/`) builds. Node.js selects the correct one automatically via `package.json` `exports`. If you see import errors, check that your `package.json` or bundler is not forcing the wrong format. + +For ESM: set `"type": "module"` in your `package.json` or use `.mjs` file extensions. +For CommonJS: use `require('nomyo-js')` or `.cjs` extensions. + +### TypeScript: `Cannot find module 'nomyo-js'` / missing types + +Ensure your `tsconfig.json` includes `"moduleResolution": "bundler"` or `"moduleResolution": "node16"` and that `nomyo-js` is in `dependencies` (not just `devDependencies`): + +```bash +npm install nomyo-js +``` + +--- + +## Browser-Specific Issues + +### `Content Security Policy blocked` + +If your app's CSP restricts `script-src` or `connect-src`, add the NOMYO API domain: + +``` +Content-Security-Policy: connect-src https://api.nomyo.ai; +``` + +### `TypeError: Failed to fetch` (CORS) + +The NOMYO API includes CORS headers. If you see CORS errors in a browser, verify the `baseUrl` is correct (HTTPS, correct port) and that no browser extension is blocking the request. + +### Keys not persisted across page reloads + +This is expected behaviour; browsers do not have file system access. Keys are generated fresh on each page load. 
If you need persistent keys in a browser context, implement your own `loadKeys`/`generateKeys` wrapper using `localStorage` or `IndexedDB` (not recommended for high-security scenarios). + +--- + +## Debugging Tips + +### Enable verbose logging + +```javascript +const client = new SecureChatCompletion({ + apiKey: process.env.NOMYO_API_KEY, + debug: true, +}); +``` + +Debug mode logs: key generation/loading, server public key fetches, request encryption details, retry attempts, and response decryption. + +### Check memory protection status + +```javascript +import { getMemoryProtectionInfo } from 'nomyo-js'; +console.log(getMemoryProtectionInfo()); +``` + +### Inspect response metadata + +The `_metadata` field in every response carries server-side diagnostics: + +```javascript +const response = await client.create({ ... }); +console.log(response._metadata); +// { +// payload_id: '...', +// is_encrypted: true, +// encryption_algorithm: 'hybrid-aes256-rsa4096', +// security_tier: 'standard', +// memory_protection: { ... 
}, +// } +``` + +### Test with minimum configuration + +Strip all optional configuration and test with the simplest possible call to isolate the issue: + +```javascript +import { SecureChatCompletion } from 'nomyo-js'; + +const client = new SecureChatCompletion({ apiKey: process.env.NOMYO_API_KEY }); +const r = await client.create({ + model: 'Qwen/Qwen3-0.6B', + messages: [{ role: 'user', content: 'ping' }], +}); +console.log(r.choices[0].message.content); +client.dispose(); +``` diff --git a/examples/node/basic.js b/examples/node/basic.js index c29f530..72e54fb 100644 --- a/examples/node/basic.js +++ b/examples/node/basic.js @@ -7,7 +7,7 @@ import { SecureChatCompletion } from 'nomyo-js'; async function main() { // Initialize client const client = new SecureChatCompletion({ - baseUrl: 'https://api.nomyo.ai:12434', + baseUrl: 'https://api.nomyo.ai', // For local development, use: // baseUrl: 'http://localhost:12434', // allowHttp: true diff --git a/examples/node/with-tools.js b/examples/node/with-tools.js index 937eac9..aa15b72 100644 --- a/examples/node/with-tools.js +++ b/examples/node/with-tools.js @@ -6,7 +6,7 @@ import { SecureChatCompletion } from 'nomyo-js'; async function main() { const client = new SecureChatCompletion({ - baseUrl: 'https://api.nomyo.ai:12434' + baseUrl: 'https://api.nomyo.ai' }); try { diff --git a/src/api/SecureChatCompletion.ts b/src/api/SecureChatCompletion.ts index 685100d..0f18018 100644 --- a/src/api/SecureChatCompletion.ts +++ b/src/api/SecureChatCompletion.ts @@ -10,20 +10,37 @@ import { ChatCompletionRequest, ChatCompletionResponse } from '../types/api'; export class SecureChatCompletion { private client: SecureCompletionClient; private apiKey?: string; + /** Stored config used to spin up a temporary per-request instance when base_url is overridden */ + private readonly _config: ChatCompletionConfig; constructor(config: ChatCompletionConfig = {}) { const { - baseUrl = 'https://api.nomyo.ai:12434', + baseUrl = 
'https://api.nomyo.ai', allowHttp = false, apiKey, secureMemory = true, + timeout, + debug, + keyRotationInterval, + keyRotationDir, + keyRotationPassword, + maxRetries, + keyDir, } = config; + this._config = config; this.apiKey = apiKey; this.client = new SecureCompletionClient({ routerUrl: baseUrl, allowHttp, secureMemory, + ...(timeout !== undefined && { timeout }), + ...(debug !== undefined && { debug }), + ...(keyRotationInterval !== undefined && { keyRotationInterval }), + ...(keyRotationDir !== undefined && { keyRotationDir }), + ...(keyRotationPassword !== undefined && { keyRotationPassword }), + ...(maxRetries !== undefined && { maxRetries }), + ...(keyDir !== undefined && { keyDir }), }); } @@ -33,18 +50,19 @@ export class SecureChatCompletion { * Supports additional NOMYO-specific fields: * - `security_tier`: "standard" | "high" | "maximum" - controls hardware routing * - `api_key`: per-request API key override (takes precedence over constructor key) + * - `base_url`: per-request router URL override (creates a temporary client for + * this single call, matching the Python SDK's `create(base_url=...)` behaviour) */ async create(request: ChatCompletionRequest): Promise { const payloadId = generateUUID(); // Extract NOMYO-specific fields that must not go into the encrypted payload - const { security_tier, api_key, ...payload } = request as ChatCompletionRequest & { + const { security_tier, api_key, base_url, ...payload } = request as ChatCompletionRequest & { security_tier?: string; api_key?: string; + base_url?: string; }; - const apiKey = api_key ?? this.apiKey; - if (!payload.model) { throw new Error('Missing required field: model'); } @@ -52,6 +70,29 @@ export class SecureChatCompletion { throw new Error('Missing or invalid required field: messages'); } + const apiKey = api_key ?? this.apiKey; + + // Per-request base_url: spin up a temporary client for this one call, + // inheriting all other config from the current instance. 
+ if (base_url !== undefined) { + const tempInstance = new SecureChatCompletion({ + ...this._config, + baseUrl: base_url, + apiKey: this.apiKey, + }); + try { + const response = await tempInstance.client.sendSecureRequest( + payload, + payloadId, + apiKey, + security_tier + ); + return response as unknown as ChatCompletionResponse; + } finally { + tempInstance.dispose(); + } + } + const response = await this.client.sendSecureRequest( payload, payloadId, @@ -68,4 +109,11 @@ export class SecureChatCompletion { async acreate(request: ChatCompletionRequest): Promise { return this.create(request); } + + /** + * Release resources: stop key rotation timer and zero in-memory key material. + */ + dispose(): void { + this.client.dispose(); + } } diff --git a/src/core/SecureCompletionClient.ts b/src/core/SecureCompletionClient.ts index 8e32850..1ed7872 100644 --- a/src/core/SecureCompletionClient.ts +++ b/src/core/SecureCompletionClient.ts @@ -21,6 +21,7 @@ import { ServerError, ForbiddenError, ServiceUnavailableError, + DisposedError, } from '../errors'; import { arrayBufferToBase64, @@ -64,41 +65,132 @@ export class SecureCompletionClient { private secureMemoryImpl = createSecureMemory(); private readonly keySize: 2048 | 4096; - constructor(config: ClientConfig = { routerUrl: 'https://api.nomyo.ai:12434' }) { + private disposed = false; + private readonly debugMode: boolean; + private readonly requestTimeout: number; + private readonly keyRotationInterval: number; + private keyRotationTimer?: ReturnType; + private readonly keyRotationDir?: string; + private readonly keyRotationPassword?: string; + private readonly maxRetries: number; + private readonly keyDir: string; + private _isHttps: boolean = true; + + // Promise-based mutex: serialises concurrent ensureKeys() calls + private ensureKeysLock: Promise = Promise.resolve(); + + constructor(config: ClientConfig = { routerUrl: 'https://api.nomyo.ai' }) { const { - routerUrl = 'https://api.nomyo.ai:12434', + routerUrl = 
'https://api.nomyo.ai', allowHttp = false, secureMemory = true, keySize = 4096, + timeout = 60000, + debug = false, + keyRotationInterval = 86400000, // 24 hours + keyRotationDir, + keyRotationPassword, + maxRetries = 2, + keyDir = 'client_keys', } = config; + this.debugMode = debug; + this.requestTimeout = timeout; + this.keyRotationInterval = keyRotationInterval; + this.keyRotationDir = keyRotationDir; + this.keyRotationPassword = keyRotationPassword; + this.maxRetries = maxRetries; + this.keyDir = keyDir; this.keySize = keySize; - this.routerUrl = routerUrl.replace(/\/$/, ''); this.allowHttp = allowHttp; this.secureMemory = secureMemory; - // Validate HTTPS for security - if (!this.routerUrl.startsWith('https://')) { + // Validate and parse URL + let parsedUrl: URL; + try { + parsedUrl = new URL(routerUrl); + } catch { + throw new Error(`Invalid routerUrl: "${routerUrl}" is not a valid URL`); + } + this._isHttps = parsedUrl.protocol === 'https:'; + this.routerUrl = routerUrl.replace(/\/$/, ''); + + if (!this._isHttps) { if (!allowHttp) { console.warn( - '⚠️ WARNING: Using HTTP instead of HTTPS. ' + + 'WARNING: Using HTTP instead of HTTPS. ' + 'This is INSECURE and should only be used for local development. ' + 'Man-in-the-middle attacks are possible!' 
); } else { - console.log('HTTP mode enabled for local development (INSECURE)'); + if (this.debugMode) console.log('HTTP mode enabled for local development (INSECURE)'); } } // Initialize components - this.keyManager = new KeyManager(); + this.keyManager = new KeyManager(this.debugMode); this.aes = new AESEncryption(); this.rsa = new RSAOperations(); this.httpClient = createHttpClient(); // Log memory protection info const protectionInfo = this.secureMemoryImpl.getProtectionInfo(); - console.log(`Memory protection: ${protectionInfo.method} (${protectionInfo.details})`); + if (this.debugMode) console.log(`Memory protection: ${protectionInfo.method} (${protectionInfo.details})`); + + // Start key rotation timer + if (this.keyRotationInterval > 0) { + this.startKeyRotationTimer(); + } + } + + /** + * Release resources: cancel key rotation timer and zero in-memory key material. + * After calling dispose(), all methods throw DisposedError. + */ + dispose(): void { + if (this.disposed) return; + this.disposed = true; + + if (this.keyRotationTimer !== undefined) { + clearInterval(this.keyRotationTimer); + this.keyRotationTimer = undefined; + } + + this.keyManager.zeroKeys(); + } + + private assertNotDisposed(): void { + if (this.disposed) { + throw new DisposedError(); + } + } + + private startKeyRotationTimer(): void { + this.keyRotationTimer = setInterval( + () => { void this.rotateKeys(); }, + this.keyRotationInterval + ); + // Allow the process to exit without waiting for the next rotation tick + const timer = this.keyRotationTimer as unknown as { unref?: () => void }; + if (typeof timer.unref === 'function') { + timer.unref(); + } + } + + private async rotateKeys(): Promise { + if (this.disposed) return; + if (this.debugMode) console.log('Key rotation: generating new key pair...'); + try { + await this.keyManager.rotateKeys({ + keySize: this.keySize, + saveToFile: typeof window === 'undefined', + keyDir: this.keyRotationDir ?? 
'client_keys', + password: this.keyRotationPassword, + }); + if (this.debugMode) console.log('Key rotation: complete'); + } catch (err) { + console.error('Key rotation failed:', err instanceof Error ? err.message : 'unknown error'); + } } /** @@ -109,6 +201,7 @@ export class SecureCompletionClient { keyDir?: string; password?: string; } = {}): Promise { + this.assertNotDisposed(); await this.keyManager.generateKeys({ keySize: this.keySize, ...options, @@ -123,6 +216,7 @@ export class SecureCompletionClient { publicKeyPath?: string, password?: string ): Promise { + this.assertNotDisposed(); await this.keyManager.loadKeys( { privateKeyPath, publicKeyPath }, password @@ -130,36 +224,54 @@ export class SecureCompletionClient { } /** - * Ensure keys are loaded, generate if necessary + * Ensure keys are loaded, generate if necessary. + * Uses a Promise-chain mutex to prevent concurrent key generation races. */ - private async ensureKeys(): Promise { - if (this.keyManager.hasKeys()) { - return; - } + private ensureKeys(): Promise { + let resolve!: () => void; + let reject!: (e: unknown) => void; + const callerPromise = new Promise((res, rej) => { + resolve = res; + reject = rej; + }); + // Append to the shared chain so callers queue up + this.ensureKeysLock = this.ensureKeysLock.then(async () => { + try { + await this._doEnsureKeys(); + resolve(); + } catch (e) { + reject(e); + } + }); + return callerPromise; + } - // Try to load keys from default location (Node.js only) + private async _doEnsureKeys(): Promise { + if (this.keyManager.hasKeys()) return; + + // Try to load keys from the configured directory (Node.js only) if (typeof window === 'undefined') { try { const fs = require('fs').promises as { access: (p: string) => Promise }; const path = require('path') as { join: (...p: string[]) => string }; - const privateKeyPath = path.join('client_keys', 'private_key.pem'); - const publicKeyPath = path.join('client_keys', 'public_key.pem'); + const privateKeyPath = 
path.join(this.keyDir, 'private_key.pem'); + const publicKeyPath = path.join(this.keyDir, 'public_key.pem'); await fs.access(privateKeyPath); await fs.access(publicKeyPath); await this.loadKeys(privateKeyPath, publicKeyPath); - console.log('Loaded existing keys from client_keys/'); + if (this.debugMode) console.log(`Loaded existing keys from ${this.keyDir}/`); return; } catch (_error) { - console.log('No existing keys found, generating new keys...'); + if (this.debugMode) console.log(`No existing keys found in ${this.keyDir}/, generating new keys...`); } } await this.generateKeys({ saveToFile: typeof window === 'undefined', - keyDir: 'client_keys', + keyDir: this.keyDir, }); } @@ -167,14 +279,15 @@ export class SecureCompletionClient { * Fetch server's public key from /pki/public_key endpoint */ async fetchServerPublicKey(): Promise { - console.log("Fetching server's public key..."); + this.assertNotDisposed(); + if (this.debugMode) console.log("Fetching server's public key..."); - if (!this.routerUrl.startsWith('https://')) { + if (!this._isHttps) { if (!this.allowHttp) { throw new SecurityError( 'Server public key must be fetched over HTTPS to prevent MITM attacks. 
' + 'For local development, initialize with allowHttp=true: ' + - 'new SecureChatCompletion({ baseUrl: "http://localhost:12434", allowHttp: true })' + 'new SecureChatCompletion({ baseUrl: "http://localhost:12435", allowHttp: true })' ); } else { console.warn('Fetching key over HTTP (local development mode)'); @@ -184,7 +297,7 @@ export class SecureCompletionClient { const url = `${this.routerUrl}/pki/public_key`; try { - const response = await this.httpClient.get(url, { timeout: 60000 }); + const response = await this.httpClient.get(url, { timeout: this.requestTimeout }); if (response.statusCode === 200) { const serverPublicKey = arrayBufferToString(response.body); @@ -196,8 +309,8 @@ export class SecureCompletionClient { throw new Error('Server returned invalid public key format'); } - if (this.routerUrl.startsWith('https://')) { - console.log("Server's public key fetched securely over HTTPS"); + if (this._isHttps) { + if (this.debugMode) console.log("Server's public key fetched securely over HTTPS"); } else { console.warn("Server's public key fetched over HTTP (INSECURE)"); } @@ -227,7 +340,7 @@ export class SecureCompletionClient { * - encrypted_aes_key: AES key encrypted with server's RSA public key */ async encryptPayload(payload: object): Promise { - console.log('Encrypting payload...'); + this.assertNotDisposed(); if (!payload || typeof payload !== 'object') { throw new Error('Payload must be an object'); @@ -243,7 +356,7 @@ export class SecureCompletionClient { throw new Error(`Payload too large: ${payloadBytes.byteLength} bytes (max: ${MAX_PAYLOAD_SIZE})`); } - console.log(`Payload size: ${payloadBytes.byteLength} bytes`); + if (this.debugMode) console.log(`Payload size: ${payloadBytes.byteLength} bytes`); if (this.secureMemory) { const context = new SecureByteContext(payloadBytes, true); @@ -297,7 +410,7 @@ export class SecureCompletionClient { const packageJson = JSON.stringify(encryptedPackage); const packageBytes = stringToArrayBuffer(packageJson); - 
console.log(`Encrypted package size: ${packageBytes.byteLength} bytes`); + if (this.debugMode) console.log(`Encrypted package size: ${packageBytes.byteLength} bytes`); return packageBytes; }); @@ -310,7 +423,7 @@ export class SecureCompletionClient { * Web Crypto AES-GCM decrypt expects ciphertext || tag concatenated. */ async decryptResponse(encryptedResponse: ArrayBuffer, payloadId: string): Promise> { - console.log('Decrypting response...'); + this.assertNotDisposed(); if (!encryptedResponse || encryptedResponse.byteLength === 0) { throw new Error('Empty encrypted response'); @@ -332,6 +445,22 @@ export class SecureCompletionClient { } } + // Validate version and algorithm to prevent downgrade attacks + const SUPPORTED_VERSION = '1.0'; + const SUPPORTED_ALGORITHM = 'hybrid-aes256-rsa4096'; + if (packageData.version !== SUPPORTED_VERSION) { + throw new Error( + `Unsupported protocol version: '${String(packageData.version)}'. ` + + `Expected: '${SUPPORTED_VERSION}'` + ); + } + if (packageData.algorithm !== SUPPORTED_ALGORITHM) { + throw new Error( + `Unsupported encryption algorithm: '${String(packageData.algorithm)}'. 
` + + `Expected: '${SUPPORTED_ALGORITHM}'` + ); + } + const encryptedPayload = packageData.encrypted_payload as Record; if (typeof encryptedPayload !== 'object' || encryptedPayload === null) { throw new Error('Invalid encrypted_payload: must be an object'); @@ -368,7 +497,22 @@ export class SecureCompletionClient { const plaintextContext = new SecureByteContext(plaintext, this.secureMemory); return await plaintextContext.use(async (protectedPlaintext) => { const responseJson = arrayBufferToString(protectedPlaintext); - return JSON.parse(responseJson) as Record; + const decoded = JSON.parse(responseJson) as Record; + + // Validate required ChatCompletionResponse fields + if ( + typeof decoded.id !== 'string' || + typeof decoded.object !== 'string' || + typeof decoded.created !== 'number' || + typeof decoded.model !== 'string' || + !Array.isArray(decoded.choices) + ) { + throw new SecurityError( + 'Decrypted response does not conform to expected schema' + ); + } + + return decoded; }); }); @@ -382,7 +526,7 @@ export class SecureCompletionClient { encryption_algorithm: packageData.algorithm, }; - console.log('Response decrypted successfully'); + if (this.debugMode) console.log('Response decrypted successfully'); return response; } catch (error) { // Don't leak specific decryption errors (timing attacks) @@ -393,6 +537,9 @@ export class SecureCompletionClient { /** * Send a secure chat completion request to the router. * + * Retries on transient errors (429, 500, 502, 503, 504, network errors) + * with exponential backoff matching the Python SDK's `max_retries` behaviour. 
+ * + * @param securityTier Optional routing tier: "standard" | "high" | "maximum" */ async sendSecureRequest( @@ -401,7 +548,8 @@ export class SecureCompletionClient { apiKey?: string, securityTier?: string ): Promise> { - console.log('Sending secure chat completion request...'); + this.assertNotDisposed(); + if (this.debugMode) console.log('Sending secure chat completion request...'); // Validate security tier if (securityTier !== undefined) { @@ -413,9 +561,14 @@ export class SecureCompletionClient { } } - await this.ensureKeys(); + // Validate API key does not contain header injection characters + if (apiKey !== undefined) { + if (/[\r\n]/.test(apiKey)) { + throw new SecurityError('Invalid API key: must not contain line separator characters'); + } + } - const encryptedPayload = await this.encryptPayload(payload); + await this.ensureKeys(); const publicKeyPem = await this.keyManager.getPublicKeyPEM(); const headers: Record = { @@ -433,33 +586,88 @@ export class SecureCompletionClient { } const url = `${this.routerUrl}/v1/chat/secure_completion`; - console.log(`Target URL: ${url}`); + if (this.debugMode) console.log(`Target URL: ${url}`); - let response: { statusCode: number; body: ArrayBuffer }; - try { - response = await this.httpClient.post(url, { - headers, - body: encryptedPayload, - timeout: 60000, - }); - } catch (error) { - if (error instanceof Error) { - if (error.message === 'Request timeout') { - throw new APIConnectionError('Connection to server timed out'); + // Retry loop: mirrors Python SDK's max_retries + exponential backoff. + // The payload is re-encrypted on every attempt so each attempt gets a + // fresh AES key and nonce (the HTTP client zeros the buffer after write). 
+ let lastError: Error = new APIConnectionError('Request failed'); + + for (let attempt = 0; attempt <= this.maxRetries; attempt++) { + if (attempt > 0) { + const delaySec = Math.pow(2, attempt - 1); // 1 s, 2 s, 4 s, … + if (this.debugMode) { + console.warn( + `Retrying request (attempt ${attempt}/${this.maxRetries}) ` + + `after ${delaySec}s...` + ); } - throw new APIConnectionError(`Failed to connect to router: ${error.message}`); + await new Promise(resolve => setTimeout(resolve, delaySec * 1000)); } - throw error; + + // Re-encrypt each attempt (throws non-retryable errors like SecurityError + // or DisposedError - let those propagate immediately) + const encryptedPayload = await this.encryptPayload(payload); + + let response: { statusCode: number; body: ArrayBuffer }; + try { + response = await this.httpClient.post(url, { + headers, + body: encryptedPayload, + timeout: this.requestTimeout, + }); + } catch (error) { + // Network / timeout errors from the HTTP client + let connError: APIConnectionError; + if (error instanceof Error) { + connError = error.message === 'Request timeout' + ? 
new APIConnectionError('Connection to server timed out') + : new APIConnectionError(`Failed to connect to router: ${error.message}`); + } else { + connError = new APIConnectionError('Failed to connect to router: unknown error'); + } + lastError = connError; + if (attempt < this.maxRetries) { + if (this.debugMode) console.warn(`Network error on attempt ${attempt}: ${connError.message}`); + continue; + } + throw lastError; + } + + if (this.debugMode) console.log(`HTTP Status: ${response.statusCode}`); + + if (response.statusCode === 200) { + return await this.decryptResponse(response.body, payloadId); + } + + const err = this.buildErrorFromResponse(response); + + if (this.isRetryableError(err) && attempt < this.maxRetries) { + if (this.debugMode) { + console.warn(`Got retryable status ${response.statusCode}: retrying...`); + } + lastError = err; + continue; + } + + throw err; } - console.log(`HTTP Status: ${response.statusCode}`); + throw lastError; + } - if (response.statusCode === 200) { - return await this.decryptResponse(response.body, payloadId); - } - - // Map HTTP error status codes to typed errors - throw this.buildErrorFromResponse(response); + /** + * Return true for errors that warrant a retry (transient failures). + * Non-retryable errors (auth, bad request, forbidden, etc.) propagate immediately. + */ + private isRetryableError(error: Error): boolean { + if (error instanceof APIConnectionError) return true; + if (error instanceof RateLimitError) return true; + if (error instanceof ServerError) return true; + if (error instanceof ServiceUnavailableError) return true; + // 502 Bad Gateway and 504 Gateway Timeout fall through as generic APIError + if (error instanceof APIError && (error.statusCode === 502 || error.statusCode === 504)) return true; + return false; } /** @@ -474,7 +682,9 @@ export class SecureCompletionClient { // Ignore JSON parse errors } - const detail = (errorData.detail as string | undefined) ?? 
'Unknown error'; + // Truncate and strip non-printable chars to prevent log injection + const rawDetail = (errorData.detail as string | undefined) ?? 'Unknown error'; + const detail = rawDetail.slice(0, 100).replace(/[^\x20-\x7E]/g, ''); switch (response.statusCode) { case 400: @@ -526,7 +736,7 @@ export class SecureCompletionClient { ); } - console.log(`Valid ${algorithm.modulusLength}-bit RSA ${keyType} key`); + if (this.debugMode) console.log(`Valid ${algorithm.modulusLength}-bit RSA ${keyType} key`); } } diff --git a/src/core/crypto/encryption.ts b/src/core/crypto/encryption.ts index 780696b..f511e96 100644 --- a/src/core/crypto/encryption.ts +++ b/src/core/crypto/encryption.ts @@ -36,14 +36,20 @@ export class AESEncryption { data: ArrayBuffer, key: CryptoKey ): Promise<{ ciphertext: ArrayBuffer; nonce: ArrayBuffer }> { - // Generate random 96-bit (12-byte) nonce - const nonce = generateRandomBytes(12); + // Generate random 96-bit (12-byte) nonce - copy into a plain ArrayBuffer + // so the buffer type is strictly ArrayBuffer (not ArrayBufferLike) + const nonceRaw = generateRandomBytes(12); + const nonce = nonceRaw.buffer.slice( + nonceRaw.byteOffset, + nonceRaw.byteOffset + nonceRaw.byteLength + ) as ArrayBuffer; + const nonceView = new Uint8Array(nonce); // Encrypt with AES-GCM const ciphertext = await this.subtle.encrypt( { name: 'AES-GCM', - iv: nonce, + iv: nonceView, tagLength: 128, // 128-bit authentication tag }, key, @@ -52,7 +58,7 @@ export class AESEncryption { return { ciphertext, - nonce: nonce.buffer, + nonce, }; } @@ -80,8 +86,8 @@ export class AESEncryption { ); return plaintext; - } catch (error) { - throw new Error(`AES-GCM decryption failed: ${error instanceof Error ? 
error.message : 'Unknown error'}`); + } catch (_error) { + throw new Error('AES-GCM decryption failed'); } } diff --git a/src/core/crypto/keys.ts b/src/core/crypto/keys.ts index ae54de2..1fa4e02 100644 --- a/src/core/crypto/keys.ts +++ b/src/core/crypto/keys.ts @@ -1,12 +1,13 @@ /** * Key management for RSA key pairs * Handles key generation, loading, and persistence - * + * * NOTE: Browser storage is NOT implemented in this version for security reasons. * Keys are kept in-memory only in browsers. For persistent keys, use Node.js. */ import { RSAOperations } from './rsa'; +import { getCrypto } from './utils'; import { KeyGenOptions, KeyPaths } from '../../types/client'; export class KeyManager { @@ -14,9 +15,11 @@ export class KeyManager { private publicKey?: CryptoKey; private privateKey?: CryptoKey; private publicKeyPem?: string; + private debug: boolean; - constructor() { + constructor(debug = false) { this.rsa = new RSAOperations(); + this.debug = debug; } /** @@ -31,7 +34,11 @@ export class KeyManager { password, } = options; - console.log(`Generating ${keySize}-bit RSA key pair...`); + if (password !== undefined && password.length < 8) { + throw new Error('Password must be at least 8 characters'); + } + + if (this.debug) console.log(`Generating ${keySize}-bit RSA key pair...`); // Generate key pair const keyPair = await this.rsa.generateKeyPair(keySize); @@ -41,7 +48,7 @@ export class KeyManager { // Export public key to PEM this.publicKeyPem = await this.rsa.exportPublicKey(this.publicKey); - console.log(`Generated ${keySize}-bit RSA key pair`); + if (this.debug) console.log(`Generated ${keySize}-bit RSA key pair`); // Save to file if requested (Node.js only) if (saveToFile) { @@ -60,14 +67,40 @@ export class KeyManager { throw new Error('File-based key loading is not supported in browsers. 
Use in-memory keys only.'); } - console.log('Loading keys from files...'); + if (password !== undefined && password.length < 8) { + throw new Error('Password must be at least 8 characters'); + } + + if (this.debug) console.log('Loading keys from files...'); const fs = require('fs').promises; const path = require('path'); // Load private key - const privateKeyPem = await fs.readFile(paths.privateKeyPath, 'utf-8'); - this.privateKey = await this.rsa.importPrivateKey(privateKeyPem, password); + const privateKeyPem = await fs.readFile(paths.privateKeyPath, 'utf-8') as string; + + if (password && privateKeyPem.includes('BEGIN ENCRYPTED PRIVATE KEY')) { + // Standard PKCS#8 encrypted format (produced by Python/OpenSSL) + const { createPrivateKey } = require('crypto') as typeof import('crypto'); + const keyObject = createPrivateKey({ key: privateKeyPem, format: 'pem', passphrase: password }); + const pkcs8Der = keyObject.export({ type: 'pkcs8', format: 'der' }) as Buffer; + // Copy into a plain ArrayBuffer to satisfy strict Web Crypto typings + const pkcs8Buf = pkcs8Der.buffer.slice( + pkcs8Der.byteOffset, + pkcs8Der.byteOffset + pkcs8Der.byteLength + ) as ArrayBuffer; + const subtle = getCrypto(); + this.privateKey = await subtle.importKey( + 'pkcs8', + pkcs8Buf, + { name: 'RSA-OAEP', hash: 'SHA-256' }, + true, + ['decrypt'] + ); + } else { + // Unencrypted PKCS#8 or legacy JS custom-encrypted format + this.privateKey = await this.rsa.importPrivateKey(privateKeyPem, password); + } // Validate private key size (minimum 2048 bits) const privAlgorithm = this.privateKey.algorithm as RsaHashedKeyAlgorithm; @@ -79,21 +112,25 @@ export class KeyManager { ); } - // Load or derive public key + // Load or derive public key - use local variables to satisfy strict null checks if (paths.publicKeyPath) { - this.publicKeyPem = await fs.readFile(paths.publicKeyPath, 'utf-8'); - this.publicKey = await this.rsa.importPublicKey(this.publicKeyPem); + const pem = await 
fs.readFile(paths.publicKeyPath, 'utf-8') as string; + this.publicKeyPem = pem; + this.publicKey = await this.rsa.importPublicKey(pem); } else { const publicKeyPath = path.join( path.dirname(paths.privateKeyPath), 'public_key.pem' ); - this.publicKeyPem = await fs.readFile(publicKeyPath, 'utf-8'); - this.publicKey = await this.rsa.importPublicKey(this.publicKeyPem); + const pem = await fs.readFile(publicKeyPath, 'utf-8') as string; + this.publicKeyPem = pem; + this.publicKey = await this.rsa.importPublicKey(pem); } - console.log(`Valid ${privAlgorithm.modulusLength}-bit RSA private key`); - console.log('Keys loaded successfully'); + if (this.debug) { + console.log(`Valid ${privAlgorithm.modulusLength}-bit RSA private key`); + console.log('Keys loaded successfully'); + } } /** @@ -114,20 +151,30 @@ export class KeyManager { const fs = require('fs').promises; const path = require('path'); - console.log(`Saving keys to ${directory}/...`); + if (this.debug) console.log(`Saving keys to ${directory}/...`); // Create directory if it doesn't exist await fs.mkdir(directory, { recursive: true }); // Export and save private key - const privateKeyPem = await this.rsa.exportPrivateKey(this.privateKey, password); + let privateKeyPem: string; + if (password) { + // Use standard PKCS#8 encrypted format (compatible with Python/OpenSSL) + const { createPrivateKey } = require('crypto') as typeof import('crypto'); + const subtle = getCrypto(); + const pkcs8Der = await subtle.exportKey('pkcs8', this.privateKey); + const keyObject = createPrivateKey({ key: Buffer.from(pkcs8Der), format: 'der', type: 'pkcs8' }); + privateKeyPem = keyObject.export({ type: 'pkcs8', format: 'pem', cipher: 'aes-256-cbc', passphrase: password }) as string; + } else { + privateKeyPem = await this.rsa.exportPrivateKey(this.privateKey); + } const privateKeyPath = path.join(directory, 'private_key.pem'); await fs.writeFile(privateKeyPath, privateKeyPem, 'utf-8'); // Set restrictive permissions on private key 
(Unix-like systems) try { await fs.chmod(privateKeyPath, 0o600); // Owner read/write only - console.log('Private key permissions set to 600 (owner-only access)'); + if (this.debug) console.log('Private key permissions set to 600 (owner-only access)'); } catch (error) { console.warn('Could not set private key permissions:', error); } @@ -142,18 +189,18 @@ export class KeyManager { // Set permissions on public key try { await fs.chmod(publicKeyPath, 0o644); // Owner read/write, others read - console.log('Public key permissions set to 644'); + if (this.debug) console.log('Public key permissions set to 644'); } catch (error) { console.warn('Could not set public key permissions:', error); } if (password) { - console.log('Private key encrypted with password'); + if (this.debug) console.log('Private key encrypted with password'); } else { console.warn('Private key saved UNENCRYPTED (not recommended for production)'); } - console.log(`Keys saved to ${directory}/`); + if (this.debug) console.log(`Keys saved to ${directory}/`); } /** @@ -195,4 +242,26 @@ export class KeyManager { hasKeys(): boolean { return !!(this.privateKey && this.publicKey); } + + /** + * Zero in-memory key references. + * CryptoKey objects are opaque handles - their backing memory is owned by the + * Web Crypto engine and cannot be zeroed from JavaScript. We sever the + * references so the GC can collect them as soon as possible. + */ + zeroKeys(): void { + this.privateKey = undefined; + this.publicKey = undefined; + // Strings are immutable; we can only null the reference. + this.publicKeyPem = undefined; + } + + /** + * Rotate keys: zero the existing pair then generate a fresh one. 
+ * @param options Key generation options (same as generateKeys) + */ + async rotateKeys(options: KeyGenOptions = {}): Promise<void> { + this.zeroKeys(); + await this.generateKeys(options); + } } diff --git a/src/core/crypto/rsa.ts b/src/core/crypto/rsa.ts index fbf6668..e521f7d 100644 --- a/src/core/crypto/rsa.ts +++ b/src/core/crypto/rsa.ts @@ -3,7 +3,8 @@ * Matches the Python implementation using RSA-OAEP with SHA-256 */ -import { getCrypto, pemToArrayBuffer, arrayBufferToPem, stringToArrayBuffer, arrayBufferToString } from './utils'; +import { getCrypto, pemToArrayBuffer, arrayBufferToPem, stringToArrayBuffer, arrayBufferToString, generateRandomBytes } from './utils'; +import { SecureByteContext } from '../memory/secure'; export class RSAOperations { private subtle: SubtleCrypto; @@ -60,8 +61,8 @@ export class RSAOperations { privateKey, encryptedKey ); - } catch (error) { - throw new Error(`RSA-OAEP decryption failed: ${error instanceof Error ? error.message : 'Unknown error'}`); + } catch (_error) { + throw new Error('RSA key decryption failed'); } } @@ -148,47 +149,52 @@ export class RSAOperations { * @returns PEM-encoded encrypted private key */ private async encryptPrivateKeyWithPassword(keyData: ArrayBuffer, password: string): Promise<string> { - // Derive encryption key from password using PBKDF2 - const passwordKey = await this.subtle.importKey( - 'raw', - stringToArrayBuffer(password), - 'PBKDF2', - false, - ['deriveKey'] - ); + // Wrap password bytes so they are zeroed after key derivation + const passwordBytes = stringToArrayBuffer(password); + const pwContext = new SecureByteContext(passwordBytes, true); + return pwContext.use(async (pwData) => { + const passwordKey = await this.subtle.importKey( + 'raw', + pwData, + 'PBKDF2', + false, + ['deriveKey'] + ); - const salt = crypto.getRandomValues(new Uint8Array(16)); - const derivedKey = await this.subtle.deriveKey( - { - name: 'PBKDF2', - salt: salt, - iterations: 100000, - hash: 'SHA-256', - }, - passwordKey, - { 
name: 'AES-CBC', length: 256 }, - false, - ['encrypt'] - ); + // Wrap salt so it is zeroed after use + const saltBytes = generateRandomBytes(16); + const saltContext = new SecureByteContext(saltBytes.buffer as ArrayBuffer, true); + return saltContext.use(async (saltBuf) => { + const saltView = new Uint8Array(saltBuf); + const derivedKey = await this.subtle.deriveKey( + { name: 'PBKDF2', salt: saltView, iterations: 100000, hash: 'SHA-256' }, + passwordKey, + { name: 'AES-CBC', length: 256 }, + false, + ['encrypt'] + ); - // Encrypt private key with AES-256-CBC - const iv = crypto.getRandomValues(new Uint8Array(16)); - const encrypted = await this.subtle.encrypt( - { - name: 'AES-CBC', - iv: iv, - }, - derivedKey, - keyData - ); + // Wrap IV so it is zeroed after use + const ivBytes = generateRandomBytes(16); + const ivContext = new SecureByteContext(ivBytes.buffer as ArrayBuffer, true); + return ivContext.use(async (ivBuf) => { + const ivView = new Uint8Array(ivBuf); + const encrypted = await this.subtle.encrypt( + { name: 'AES-CBC', iv: ivView }, + derivedKey, + keyData + ); - // Combine salt + iv + encrypted data - const combined = new Uint8Array(salt.length + iv.length + encrypted.byteLength); - combined.set(salt, 0); - combined.set(iv, salt.length); - combined.set(new Uint8Array(encrypted), salt.length + iv.length); + // Combine salt + iv + encrypted data + const combined = new Uint8Array(saltView.length + ivView.length + encrypted.byteLength); + combined.set(saltView, 0); + combined.set(ivView, saltView.length); + combined.set(new Uint8Array(encrypted), saltView.length + ivView.length); - return arrayBufferToPem(combined.buffer, 'PRIVATE'); + return arrayBufferToPem(combined.buffer, 'PRIVATE'); + }); + }); + }); } /** diff --git a/src/core/http/node.ts b/src/core/http/node.ts index 9579ab7..3f7cc12 100644 --- a/src/core/http/node.ts +++ b/src/core/http/node.ts @@ -75,6 +75,10 @@ export class NodeHttpClient implements HttpClient { }); req.write(bodyBuffer); + // 
Zero the source ArrayBuffer after data has been handed to the socket + if (body instanceof ArrayBuffer) { + new Uint8Array(body).fill(0); + } req.end(); }); } diff --git a/src/core/memory/secure.ts b/src/core/memory/secure.ts index 8ae115b..9d6e0d7 100644 --- a/src/core/memory/secure.ts +++ b/src/core/memory/secure.ts @@ -1,15 +1,46 @@ /** - * Secure memory interface and context manager - * + * Secure memory interface, context manager, and public API. + * * IMPORTANT: This is a pure JavaScript implementation that provides memory zeroing only. * OS-level memory locking (mlock) is NOT implemented in this version. - * + * * For production use, consider implementing a native addon for true memory locking. * See SECURITY.md for details on memory protection limitations. */ import { ProtectionInfo } from '../../types/crypto'; +// ─── Global secure-memory state ────────────────────────────────────────────── + +/** Module-level flag, mirrors Python's global _secure_memory.enabled. */ +let _globalSecureMemoryEnabled = true; + +/** + * Disable secure memory operations globally. + * Affects new SecureByteContext instances created without an explicit `useSecure` argument. + * Existing client instances are unaffected (they pass `useSecure` explicitly). + * Mirrors Python's `disable_secure_memory()`. + */ +export function disableSecureMemory(): void { + _globalSecureMemoryEnabled = false; +} + +/** + * Re-enable secure memory operations globally. + * Mirrors Python's `enable_secure_memory()`. + */ +export function enableSecureMemory(): void { + _globalSecureMemoryEnabled = true; +} + +/** + * Return information about the memory protection capabilities available on this + * platform/runtime. Mirrors Python's `get_memory_protection_info()`. 
+ */ +export function getMemoryProtectionInfo(): ProtectionInfo { + return createSecureMemory().getProtectionInfo(); +} + export interface SecureMemory { /** * Zero memory (fill with zeros) @@ -24,15 +55,19 @@ export interface SecureMemory { } /** - * Secure byte context manager - * Ensures memory is zeroed even if an exception occurs (similar to Python's context manager) + * Secure byte context manager. + * Ensures memory is zeroed even if an exception occurs (analogous to Python's + * `secure_bytearray()` context manager and `SecureBuffer` class). + * + * When `useSecure` is omitted, the module-level global flag set by + * `disableSecureMemory()` / `enableSecureMemory()` is consulted. */ export class SecureByteContext { private data: ArrayBuffer; private secureMemory: SecureMemory; private useSecure: boolean; - constructor(data: ArrayBuffer, useSecure: boolean = true) { + constructor(data: ArrayBuffer, useSecure: boolean = _globalSecureMemoryEnabled) { this.data = data; this.useSecure = useSecure; this.secureMemory = createSecureMemory(); @@ -48,7 +83,11 @@ export class SecureByteContext { } finally { // Always zero memory, even if exception occurred if (this.useSecure) { - this.secureMemory.zeroMemory(this.data); + try { + this.secureMemory.zeroMemory(this.data); + } catch (_zeroErr) { + // zeroMemory failure must not mask the original error + } } } } diff --git a/src/errors/index.ts b/src/errors/index.ts index 60b1d47..f8ecbb9 100644 --- a/src/errors/index.ts +++ b/src/errors/index.ts @@ -84,3 +84,14 @@ export class SecurityError extends Error { } } } + +export class DisposedError extends Error { + constructor(message = 'This client instance has been disposed and can no longer be used') { + super(message); + this.name = 'DisposedError'; + + if (captureStackTrace) { + captureStackTrace(this, this.constructor); + } + } +} diff --git a/src/index.ts b/src/index.ts index ccc92c7..7baa07a 100644 --- a/src/index.ts +++ b/src/index.ts @@ -13,3 +13,13 @@ export * from 
'./types/crypto'; // Export errors export * from './errors'; + +// Secure memory public API - mirrors Python's get_memory_protection_info(), +// disable_secure_memory(), enable_secure_memory(), and SecureBuffer/secure_bytearray() +export { + getMemoryProtectionInfo, + disableSecureMemory, + enableSecureMemory, + SecureByteContext, + createSecureMemory, +} from './core/memory/secure'; diff --git a/src/types/api.ts b/src/types/api.ts index 27442ef..3d10b9e 100644 --- a/src/types/api.ts +++ b/src/types/api.ts @@ -9,6 +9,8 @@ export interface Message { name?: string; tool_calls?: ToolCall[]; tool_call_id?: string; + /** Thinking-model reasoning output (Qwen3, DeepSeek-R1, etc.) */ + reasoning_content?: string; } export interface ToolCall { @@ -70,12 +72,28 @@ export interface Choice { logprobs?: unknown; } +export interface MemoryProtectionInfo { + enabled: boolean; + platform: string; + protection_level: string; + has_memory_locking: boolean; + has_secure_zeroing: boolean; + supports_full_protection: boolean; + page_size?: number; +} + export interface ResponseMetadata { payload_id: string; processed_at: number; is_encrypted: boolean; encryption_algorithm: string; response_status: string; + /** Hardware routing tier used for this request */ + security_tier?: string; + /** Server-side memory protection details */ + memory_protection?: MemoryProtectionInfo; + /** CUDA device ID used for inference (if applicable) */ + cuda_device?: string | number; } export interface ChatCompletionResponse { diff --git a/src/types/client.ts b/src/types/client.ts index 6f9accb..5422ea7 100644 --- a/src/types/client.ts +++ b/src/types/client.ts @@ -3,7 +3,7 @@ */ export interface ClientConfig { - /** Base URL of the NOMYO router (e.g., https://api.nomyo.ai:12434) */ + /** Base URL of the NOMYO router (e.g., https://api.nomyo.ai) */ routerUrl: string; /** Allow HTTP connections (ONLY for local development, never in production) */ @@ -17,6 +17,35 @@ export interface ClientConfig { /** 
Optional API key for authentication */ apiKey?: string; + + /** Request timeout in milliseconds (default: 60000) */ + timeout?: number; + + /** Enable debug logging (default: false) */ + debug?: boolean; + + /** Key rotation interval in milliseconds. Set to 0 to disable. (default: 86400000 = 24h) */ + keyRotationInterval?: number; + + /** Directory for rotated key files (Node.js only, default: 'client_keys') */ + keyRotationDir?: string; + + /** Password to encrypt rotated private key files */ + keyRotationPassword?: string; + + /** + * Directory to load/save RSA keys on startup. + * If the directory contains an existing key pair it is loaded; otherwise a + * new pair is generated and saved there. Default: 'client_keys'. + * Matches the Python SDK's `key_dir` constructor parameter. + */ + keyDir?: string; + + /** + * Maximum number of retries on retryable errors (429, 500, 502, 503, 504, + * network errors). Uses exponential backoff (1 s, 2 s, 4 s, …). Default: 2. + */ + maxRetries?: number; } export interface KeyGenOptions { @@ -53,4 +82,34 @@ export interface ChatCompletionConfig { /** Enable secure memory protection */ secureMemory?: boolean; + + /** Request timeout in milliseconds (default: 60000) */ + timeout?: number; + + /** Enable debug logging (default: false) */ + debug?: boolean; + + /** Key rotation interval in milliseconds. Set to 0 to disable. (default: 86400000 = 24h) */ + keyRotationInterval?: number; + + /** Directory for rotated key files (Node.js only, default: 'client_keys') */ + keyRotationDir?: string; + + /** Password to encrypt rotated private key files */ + keyRotationPassword?: string; + + /** + * Directory to load/save RSA keys on startup. + * If the directory contains an existing key pair it is loaded; otherwise a + * new pair is generated and saved there. + * Omit (or set to undefined) to use the default 'client_keys/' directory. + * Matches the Python SDK's `key_dir` constructor parameter. 
+ */ + keyDir?: string; + + /** + * Maximum number of retries on retryable errors (429, 500, 502, 503, 504, + * network errors). Uses exponential backoff (1 s, 2 s, 4 s, …). Default: 2. + */ + maxRetries?: number; } diff --git a/tests/unit/crypto.test.ts b/tests/unit/crypto.test.ts index 42b0298..e4c3ac7 100644 --- a/tests/unit/crypto.test.ts +++ b/tests/unit/crypto.test.ts @@ -40,6 +40,20 @@ describe('AESEncryption', () => { await expect(aes.decrypt(ciphertext, nonce, key2)).rejects.toThrow(); }); + test('decrypt with wrong key throws generic message (no internal details)', async () => { + const key1 = await aes.generateKey(); + const key2 = await aes.generateKey(); + const { ciphertext, nonce } = await aes.encrypt(stringToArrayBuffer('secret'), key1); + + await expect(aes.decrypt(ciphertext, nonce, key2)) + .rejects.toThrow('AES-GCM decryption failed'); + try { + await aes.decrypt(ciphertext, nonce, key2); + } catch (e) { + expect((e as Error).message).toBe('AES-GCM decryption failed'); + } + }); + test('exportKey / importKey roundtrip', async () => { const key = await aes.generateKey(); const exported = await aes.exportKey(key); @@ -124,6 +138,64 @@ describe('RSAOperations', () => { const pem = await rsa.exportPrivateKey(kp.privateKey, 'correct-password'); await expect(rsa.importPrivateKey(pem, 'wrong-password')).rejects.toThrow(); }, 30000); + + test('decryptKey with wrong private key throws generic message', async () => { + const kp1 = await rsa.generateKeyPair(2048); + const kp2 = await rsa.generateKeyPair(2048); + const aes = new AESEncryption(); + const aesKey = await aes.generateKey(); + const aesKeyBytes = await aes.exportKey(aesKey); + + const encrypted = await rsa.encryptKey(aesKeyBytes, kp1.publicKey); + + await expect(rsa.decryptKey(encrypted, kp2.privateKey)) + .rejects.toThrow('RSA key decryption failed'); + try { + await rsa.decryptKey(encrypted, kp2.privateKey); + } catch (e) { + // Must not contain internal engine error details + expect((e as 
Error).message).toBe('RSA key decryption failed'); + } + }, 30000); +}); + +describe('KeyManager password validation', () => { + test('generateKeys rejects password shorter than 8 characters', async () => { + const { KeyManager } = await import('../../src/core/crypto/keys'); + const km = new KeyManager(); + await expect(km.generateKeys({ keySize: 2048, password: 'short' })) + .rejects.toThrow('at least 8 characters'); + }, 30000); + + test('generateKeys rejects empty password', async () => { + const { KeyManager } = await import('../../src/core/crypto/keys'); + const km = new KeyManager(); + await expect(km.generateKeys({ keySize: 2048, password: '' })) + .rejects.toThrow('at least 8 characters'); + }, 30000); + + test('generateKeys accepts password of exactly 8 characters', async () => { + const { KeyManager } = await import('../../src/core/crypto/keys'); + const km = new KeyManager(); + await expect(km.generateKeys({ keySize: 2048, password: '12345678' })) + .resolves.toBeUndefined(); + }, 30000); + + test('generateKeys accepts undefined password (no encryption)', async () => { + const { KeyManager } = await import('../../src/core/crypto/keys'); + const km = new KeyManager(); + await expect(km.generateKeys({ keySize: 2048 })) + .resolves.toBeUndefined(); + }, 30000); + + test('zeroKeys clears key references', async () => { + const { KeyManager } = await import('../../src/core/crypto/keys'); + const km = new KeyManager(); + await km.generateKeys({ keySize: 2048 }); + expect(km.hasKeys()).toBe(true); + km.zeroKeys(); + expect(km.hasKeys()).toBe(false); + }, 30000); }); describe('Base64 utilities', () => { diff --git a/tests/unit/secure_client.test.ts b/tests/unit/secure_client.test.ts index c16b673..28e139a 100644 --- a/tests/unit/secure_client.test.ts +++ b/tests/unit/secure_client.test.ts @@ -10,6 +10,7 @@ import { ForbiddenError, ServiceUnavailableError, RateLimitError, + DisposedError, } from '../../src/errors'; import { stringToArrayBuffer } from 
'../../src/core/crypto/utils'; @@ -50,20 +51,74 @@ describe('SecureCompletionClient constructor', () => { test('removes trailing slash from routerUrl', () => { const client = new SecureCompletionClient({ routerUrl: 'https://api.example.com:12434/' }); - // We can verify indirectly via fetchServerPublicKey URL construction expect((client as unknown as { routerUrl: string }).routerUrl).toBe('https://api.example.com:12434'); + client.dispose(); + }); + + test('throws on invalid URL', () => { + expect(() => new SecureCompletionClient({ routerUrl: 'not-a-url' })) + .toThrow('Invalid routerUrl'); + }); + + test('http:// URL with allowHttp=true does not throw', () => { + expect(() => new SecureCompletionClient({ + routerUrl: 'http://localhost:1234', + allowHttp: true, + })).not.toThrow(); + }); +}); + +describe('SecureCompletionClient.dispose()', () => { + test('calling dispose() twice does not throw', () => { + const client = new SecureCompletionClient({ routerUrl: 'https://api.example.com', keyRotationInterval: 0 }); + client.dispose(); + expect(() => client.dispose()).not.toThrow(); + }); + + test('methods throw DisposedError after dispose()', async () => { + const client = new SecureCompletionClient({ routerUrl: 'https://api.example.com', keyRotationInterval: 0 }); + client.dispose(); + await expect(client.fetchServerPublicKey()).rejects.toBeInstanceOf(DisposedError); + await expect(client.encryptPayload({})).rejects.toBeInstanceOf(DisposedError); + await expect(client.sendSecureRequest({}, 'id')).rejects.toBeInstanceOf(DisposedError); + }); + + test('dispose() clears key rotation timer', () => { + jest.useFakeTimers(); + const client = new SecureCompletionClient({ + routerUrl: 'https://api.example.com', + keyRotationInterval: 1000, + }); + const timerBefore = (client as unknown as { keyRotationTimer: unknown }).keyRotationTimer; + expect(timerBefore).toBeDefined(); + client.dispose(); + const timerAfter = (client as unknown as { keyRotationTimer: unknown 
}).keyRotationTimer; + expect(timerAfter).toBeUndefined(); + jest.useRealTimers(); + }); + + test('keyRotationInterval=0 does not start timer', () => { + const client = new SecureCompletionClient({ + routerUrl: 'https://api.example.com', + keyRotationInterval: 0, + }); + const timer = (client as unknown as { keyRotationTimer: unknown }).keyRotationTimer; + expect(timer).toBeUndefined(); + client.dispose(); }); }); describe('SecureCompletionClient.fetchServerPublicKey', () => { test('throws SecurityError over HTTP without allowHttp', async () => { + const warnSpy = jest.spyOn(console, 'warn').mockImplementation(() => undefined); const client = new SecureCompletionClient({ routerUrl: 'http://localhost:1234', allowHttp: false, + keyRotationInterval: 0, }); - // Suppress console.warn from constructor - jest.spyOn(console, 'warn').mockImplementation(() => undefined); await expect(client.fetchServerPublicKey()).rejects.toBeInstanceOf(SecurityError); + client.dispose(); + warnSpy.mockRestore(); }); }); @@ -71,42 +126,122 @@ describe('SecureCompletionClient.sendSecureRequest - security tier validation' test('throws for invalid security tier', async () => { const client = new SecureCompletionClient({ routerUrl: 'https://api.example.com:12434', + keyRotationInterval: 0, }); await expect( client.sendSecureRequest({}, 'test-id', undefined, 'ultra') ).rejects.toThrow("Invalid securityTier: 'ultra'"); + client.dispose(); }); test('accepts valid security tiers', async () => { - // We just need to verify no validation error is thrown at the tier check stage - // (subsequent network call will fail, which is expected in unit tests) const client = new SecureCompletionClient({ routerUrl: 'https://api.example.com:12434', + keyRotationInterval: 0, }); for (const tier of ['standard', 'high', 'maximum']) { - // Should not throw a tier validation error (will throw something else) await expect( client.sendSecureRequest({}, 'test-id', undefined, tier) ).rejects.not.toThrow("Invalid 
securityTier"); } + client.dispose(); + }); +}); + +describe('SecureCompletionClient — header injection validation', () => { + test('apiKey containing CR throws SecurityError', async () => { + const client = new SecureCompletionClient({ + routerUrl: 'https://api.example.com', + keyRotationInterval: 0, + }); + await (client as unknown as { generateKeys: () => Promise }).generateKeys(); + jest.spyOn(client as unknown as { encryptPayload: (p: object) => Promise }, 'encryptPayload') + .mockResolvedValue(new ArrayBuffer(8)); + + await expect( + client.sendSecureRequest({}, 'id', 'key\rwith\rcr') + ).rejects.toBeInstanceOf(SecurityError); + client.dispose(); + }, 30000); + + test('apiKey containing LF throws SecurityError', async () => { + const client = new SecureCompletionClient({ + routerUrl: 'https://api.example.com', + keyRotationInterval: 0, + }); + await (client as unknown as { generateKeys: () => Promise }).generateKeys(); + jest.spyOn(client as unknown as { encryptPayload: (p: object) => Promise }, 'encryptPayload') + .mockResolvedValue(new ArrayBuffer(8)); + + await expect( + client.sendSecureRequest({}, 'id', 'key\nwith\nlf') + ).rejects.toBeInstanceOf(SecurityError); + client.dispose(); + }, 30000); +}); + +describe('SecureCompletionClient — error detail sanitization', () => { + test('long server detail is truncated to ≤100 chars in error message', async () => { + const client = new SecureCompletionClient({ + routerUrl: 'https://api.example.com:12434', + keyRotationInterval: 0, + }); + + const http = mockHttpClient(async () => makeJsonResponse(400, { detail: 'x'.repeat(200) })); + (client as unknown as { httpClient: typeof http }).httpClient = http; + + await (client as unknown as { generateKeys: () => Promise }).generateKeys(); + jest.spyOn(client as unknown as { encryptPayload: (p: object) => Promise }, 'encryptPayload') + .mockResolvedValue(new ArrayBuffer(8)); + + try { + await client.sendSecureRequest({}, 'id'); + } catch (err) { + 
expect(err).toBeInstanceOf(Error); + // "Bad request: " prefix + max 100 char detail + expect((err as Error).message.length).toBeLessThanOrEqual(115); + } + client.dispose(); + }, 30000); +}); + +describe('SecureCompletionClient — debug flag', () => { + test('console.log not called during construction when debug=false', () => { + const spy = jest.spyOn(console, 'log').mockImplementation(() => undefined); + const client = new SecureCompletionClient({ + routerUrl: 'https://api.example.com', + debug: false, + keyRotationInterval: 0, + }); + expect(spy).not.toHaveBeenCalled(); + spy.mockRestore(); + client.dispose(); + }); + + test('console.log called during construction when debug=true', () => { + const spy = jest.spyOn(console, 'log').mockImplementation(() => undefined); + const client = new SecureCompletionClient({ + routerUrl: 'https://api.example.com', + debug: true, + keyRotationInterval: 0, + }); + expect(spy).toHaveBeenCalled(); + spy.mockRestore(); + client.dispose(); }); }); describe('SecureCompletionClient.buildErrorFromResponse (via sendSecureRequest)', () => { - // We can test error mapping by making the HTTP mock return specific status codes - // and verifying the correct typed error is thrown. 
- async function clientWithMockedHttp(statusCode: number, body: object) { const client = new SecureCompletionClient({ routerUrl: 'https://api.example.com:12434', + keyRotationInterval: 0, }); - // Inject mocked HTTP client const http = mockHttpClient(async (url: string) => { if (url.includes('/pki/public_key')) { - // Should not be reached in error tests throw new Error('unexpected pki call'); } return makeJsonResponse(statusCode, body); @@ -118,104 +253,99 @@ describe('SecureCompletionClient.buildErrorFromResponse (via sendSecureRequest)' test('401 → AuthenticationError', async () => { const client = await clientWithMockedHttp(401, { detail: 'bad key' }); - // Keys must be generated first, so inject a pre-generated key set await (client as unknown as { generateKeys: () => Promise }).generateKeys(); - - // Mock encryptPayload to skip actual encryption - jest.spyOn(client as unknown as { encryptPayload: () => Promise }, 'encryptPayload') + jest.spyOn(client as unknown as { encryptPayload: (p: object) => Promise }, 'encryptPayload') .mockResolvedValue(new ArrayBuffer(8)); await expect( client.sendSecureRequest({ model: 'test', messages: [] }, 'id-1') ).rejects.toBeInstanceOf(AuthenticationError); + client.dispose(); }, 30000); test('403 → ForbiddenError', async () => { const client = await clientWithMockedHttp(403, { detail: 'not allowed' }); await (client as unknown as { generateKeys: () => Promise }).generateKeys(); - jest.spyOn(client as unknown as { encryptPayload: () => Promise }, 'encryptPayload') + jest.spyOn(client as unknown as { encryptPayload: (p: object) => Promise }, 'encryptPayload') .mockResolvedValue(new ArrayBuffer(8)); await expect( client.sendSecureRequest({ model: 'test', messages: [] }, 'id-1') ).rejects.toBeInstanceOf(ForbiddenError); + client.dispose(); }, 30000); test('429 → RateLimitError', async () => { const client = await clientWithMockedHttp(429, { detail: 'too many' }); await (client as unknown as { generateKeys: () => Promise 
}).generateKeys(); - jest.spyOn(client as unknown as { encryptPayload: () => Promise }, 'encryptPayload') + jest.spyOn(client as unknown as { encryptPayload: (p: object) => Promise }, 'encryptPayload') .mockResolvedValue(new ArrayBuffer(8)); await expect( client.sendSecureRequest({ model: 'test', messages: [] }, 'id-1') ).rejects.toBeInstanceOf(RateLimitError); + client.dispose(); }, 30000); test('503 → ServiceUnavailableError', async () => { const client = await clientWithMockedHttp(503, { detail: 'down' }); await (client as unknown as { generateKeys: () => Promise }).generateKeys(); - jest.spyOn(client as unknown as { encryptPayload: () => Promise }, 'encryptPayload') + jest.spyOn(client as unknown as { encryptPayload: (p: object) => Promise }, 'encryptPayload') .mockResolvedValue(new ArrayBuffer(8)); await expect( client.sendSecureRequest({ model: 'test', messages: [] }, 'id-1') ).rejects.toBeInstanceOf(ServiceUnavailableError); + client.dispose(); }, 30000); - test('network error → APIConnectionError (not wrapping typed errors)', async () => { + test('network error → APIConnectionError', async () => { const client = new SecureCompletionClient({ routerUrl: 'https://api.example.com:12434', + keyRotationInterval: 0, }); const http = mockHttpClient(async () => { throw new Error('ECONNREFUSED'); }); (client as unknown as { httpClient: typeof http }).httpClient = http; await (client as unknown as { generateKeys: () => Promise }).generateKeys(); - jest.spyOn(client as unknown as { encryptPayload: () => Promise }, 'encryptPayload') + jest.spyOn(client as unknown as { encryptPayload: (p: object) => Promise }, 'encryptPayload') .mockResolvedValue(new ArrayBuffer(8)); await expect( client.sendSecureRequest({ model: 'test', messages: [] }, 'id-1') ).rejects.toBeInstanceOf(APIConnectionError); + client.dispose(); }, 30000); }); describe('SecureCompletionClient encrypt/decrypt roundtrip', () => { test('encryptPayload + decryptResponse roundtrip', async () => { - // Use 
two clients: one for "client", one to simulate "server" - const clientA = new SecureCompletionClient({ routerUrl: 'https://x', allowHttp: true }); - const clientB = new SecureCompletionClient({ routerUrl: 'https://x', allowHttp: true }); + const clientA = new SecureCompletionClient({ routerUrl: 'https://x', allowHttp: true, keyRotationInterval: 0 }); + const clientB = new SecureCompletionClient({ routerUrl: 'https://x', allowHttp: true, keyRotationInterval: 0 }); await (clientA as unknown as { generateKeys: () => Promise }).generateKeys(); await (clientB as unknown as { generateKeys: () => Promise }).generateKeys(); const payload = { model: 'test', messages: [{ role: 'user', content: 'hi' }] }; - // clientA encrypts, clientB decrypts (simulating server responding) - // We can only test the client-side encrypt → client-side decrypt roundtrip - // because the server uses its own key pair to encrypt the response. - - // Directly test encryptPayload → decryptResponse using the SAME client's keys - // (as the server would decrypt with its private key and re-encrypt with client's public key) - // For a full roundtrip test we encrypt with clientB's public key and decrypt with clientB's private key. 
const serverPublicKeyPem = await (clientB as unknown as { keyManager: { getPublicKeyPEM: () => Promise } }).keyManager.getPublicKeyPEM(); - // Mock fetchServerPublicKey on clientA to return clientB's public key jest.spyOn(clientA as unknown as { fetchServerPublicKey: () => Promise }, 'fetchServerPublicKey') .mockResolvedValue(serverPublicKeyPem); const encrypted = await clientA.encryptPayload(payload); expect(encrypted.byteLength).toBeGreaterThan(0); - // Now simulate clientB decrypting (server decrypts the payload — we can only test - // structure here since decryptResponse expects server-format encrypted response) const pkg = JSON.parse(new TextDecoder().decode(encrypted)); expect(pkg.version).toBe('1.0'); expect(pkg.algorithm).toBe('hybrid-aes256-rsa4096'); expect(pkg.encrypted_payload.ciphertext).toBeTruthy(); expect(pkg.encrypted_payload.nonce).toBeTruthy(); - expect(pkg.encrypted_payload.tag).toBeTruthy(); // tag must be present + expect(pkg.encrypted_payload.tag).toBeTruthy(); expect(pkg.encrypted_aes_key).toBeTruthy(); + + clientA.dispose(); + clientB.dispose(); }, 60000); });