YetheSamartaka eca4a92a33 add: Optional router-level API key that gates router/API/web UI access

Optional router-level API key that gates router/API/web UI access (leave empty to disable)

## Supplying the router API key

If you set `nomyo-router-api-key` in `config.yaml` (or the `NOMYO_ROUTER_API_KEY` environment variable), every request to NOMYO Router must include the key:

- HTTP header (recommended): `Authorization: Bearer <router_key>`
- Query param (fallback): `?api_key=<router_key>`

Examples:
```bash
curl -H "Authorization: Bearer $NOMYO_ROUTER_API_KEY" http://localhost:12434/api/tags
curl "http://localhost:12434/api/tags?api_key=$NOMYO_ROUTER_API_KEY"
```
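
For scripted clients, the two options above can be sketched in Python. This is a minimal illustration using only the standard library; `build_request` is a hypothetical helper name, not part of NOMYO Router, and the key value is a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(url: str, router_key: str, use_header: bool = True) -> Request:
    """Attach the router key to a request for `url` (hypothetical helper)."""
    if use_header:
        # Recommended form: Authorization: Bearer <router_key>
        return Request(url, headers={"Authorization": f"Bearer {router_key}"})
    # Fallback form: add api_key=<router_key> to the query string.
    separator = "&" if "?" in url else "?"
    return Request(f"{url}{separator}{urlencode({'api_key': router_key})}")


if __name__ == "__main__":
    # Placeholder key; the requests are only built here, not sent.
    tags_url = "http://localhost:12434/api/tags"
    print(build_request(tags_url, "example-key").get_header("Authorization"))
    print(build_request(tags_url, "example-key", use_header=False).full_url)
```

The header form keeps the key out of URLs, which tend to end up in access logs and browser history; that is a likely reason it is the recommended option.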

2026-01-14 09:28:02 +01:00


NOMYO Router Documentation

Welcome to the NOMYO Router documentation! This folder contains comprehensive guides for using, configuring, and deploying the NOMYO Router.

Documentation Structure

doc/
├── architecture.md          # Technical architecture overview
├── configuration.md         # Detailed configuration guide
├── usage.md                 # API usage examples
├── deployment.md            # Deployment scenarios
├── monitoring.md            # Monitoring and troubleshooting
└── examples/                # Example configurations and scripts
    ├── docker-compose.yml
    ├── sample-config.yaml
    └── k8s-deployment.yaml

Getting Started

Quick Start Guide

1. **Install the router**:

```bash
git clone https://github.com/nomyo-ai/nomyo-router.git
cd nomyo-router
python3 -m venv .venv/router
source .venv/router/bin/activate
pip3 install -r requirements.txt
```

2. **Configure endpoints** in `config.yaml`:

```yaml
endpoints:
  - http://localhost:11434
max_concurrent_connections: 2

# Optional router-level API key (leave blank to disable)
nomyo-router-api-key: ""
```

3. **Run the router**:

```bash
uvicorn router:app --host 0.0.0.0 --port 12434
```

4. **Use the router**: Point your frontend to `http://localhost:12434` instead of your Ollama instance.

Key Features

- **Intelligent Routing**: Model-deployment-aware routing with load balancing
- **Multi-Endpoint Support**: Combine Ollama and OpenAI-compatible endpoints
- **Token Tracking**: Comprehensive token usage monitoring
- **Real-time Monitoring**: Server-Sent Events for live usage updates
- **OpenAI Compatibility**: Full OpenAI API compatibility layer
- **MOE System**: Multiple Opinions Ensemble for improved responses with smaller models
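
To give a feel for consuming Server-Sent Events on the client side, here is a minimal sketch of a parser for the standard SSE wire format. The router's monitoring endpoint path and payload shape are not given in this document, so the sample input is made up and `iter_sse_data` is a hypothetical helper.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an SSE stream."""
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield "\n".join(data_lines)
                data_lines = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.


if __name__ == "__main__":
    # Made-up sample stream; real payloads depend on the router.
    sample = ['data: {"example": 1}\n', "\n", ": keep-alive\n", "data: more\n", "\n"]
    for payload in iter_sse_data(sample):
        print(payload)
```

An event that is still open when the stream ends is dropped, which matches how the SSE format treats incomplete events.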

Documentation Guides

Architecture

Learn about the router's internal architecture, routing algorithm, caching mechanisms, and advanced features like the MOE system.

Configuration

Detailed guide on configuring the router with multiple endpoints, API keys, and environment variables.

Usage

Comprehensive API reference with examples for making requests, streaming responses, and using advanced features.

Deployment

Step-by-step deployment guides for bare metal, Docker, Kubernetes, and production environments.

Monitoring

Monitoring endpoints, troubleshooting guides, performance tuning, and best practices for maintaining your router.

Examples

The examples directory contains ready-to-use configuration files:

- `docker-compose.yml`: Complete Docker Compose setup with multiple Ollama instances
- `sample-config.yaml`: Example configuration with comments
- `k8s-deployment.yaml`: Kubernetes deployment manifests

Need Help?

Common Issues

Check the Monitoring Guide for troubleshooting common problems:

- Endpoint unavailable
- Model not found
- High latency
- Connection limits reached
- Token tracking issues

Support

For additional help:

- Check the GitHub Issues
- Review the Monitoring Guide for diagnostics
- Examine the router logs for detailed error messages

Best Practices

Configuration

- Use environment variables for API keys
- Set an appropriate `max_concurrent_connections` based on your hardware
- Monitor endpoint health regularly
- Keep models loaded on multiple endpoints for redundancy
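
As one way to follow the first point, a client script can read the router key from `NOMYO_ROUTER_API_KEY` rather than hard-coding it. The helper name `router_auth_headers` is hypothetical, and treating an unset or blank variable as "no key required" is an assumption that mirrors the leave-blank-to-disable setting.

```python
import os
from typing import Dict, Mapping, Optional


def router_auth_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return headers carrying the router key, or {} when no key is set."""
    env = os.environ if env is None else env
    key = env.get("NOMYO_ROUTER_API_KEY", "").strip()
    # Assumption: a blank key means the router-level check is disabled,
    # so no Authorization header is sent.
    return {"Authorization": f"Bearer {key}"} if key else {}


if __name__ == "__main__":
    # Prints True when a key is configured in the environment, without
    # echoing the key itself.
    print("Authorization" in router_auth_headers())
```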

Deployment

- Use Docker for containerized deployments
- Consider Kubernetes for production at scale
- Set up monitoring and alerting
- Back up the token counts database regularly

Performance

- Balance load across multiple endpoints
- Keep frequently used models loaded
- Monitor connection counts and token usage
- Scale horizontally when needed
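
The load-balancing advice can be pictured with a small least-connections sketch. This is an illustration only, not the router's actual algorithm (the Architecture guide covers that); the `Endpoint` record, the `pick_endpoint` function, and the model names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Endpoint:
    """Hypothetical view of one backend, not the router's own data model."""
    url: str
    loaded_models: Set[str] = field(default_factory=set)
    active_connections: int = 0


def pick_endpoint(endpoints: List[Endpoint], model: str,
                  max_concurrent_connections: int) -> Optional[Endpoint]:
    """Pick the least-busy endpoint that has `model` loaded and spare capacity."""
    candidates = [
        e for e in endpoints
        if model in e.loaded_models
        and e.active_connections < max_concurrent_connections
    ]
    if not candidates:
        return None  # nothing suitable: the caller would have to wait or fail
    return min(candidates, key=lambda e: e.active_connections)


if __name__ == "__main__":
    pool = [
        Endpoint("http://host-a:11434", {"model-x"}, active_connections=1),
        Endpoint("http://host-b:11434", {"model-x"}, active_connections=0),
    ]
    chosen = pick_endpoint(pool, "model-x", max_concurrent_connections=2)
    print(chosen.url if chosen else "no endpoint available")
```

Under this picture, keeping a model loaded on several endpoints gives the selection step more candidates, which is why redundancy and balancing go together.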

Next Steps

1. Read the Architecture Guide to understand how the router works
2. Configure your endpoints in `config.yaml`
3. Deploy the router using your preferred method
4. Monitor your setup using the monitoring endpoints
5. Scale as needed by adding more endpoints

Happy routing! 🚀

Router API key usage

If the router API key is set (`NOMYO_ROUTER_API_KEY` env or `nomyo-router-api-key` in config), include it in every request:

- Header (preferred): `Authorization: Bearer <router_key>`
- Query param: `?api_key=<router_key>`

Example:

```bash
curl -H "Authorization: Bearer $NOMYO_ROUTER_API_KEY" http://localhost:12434/api/tags
```
