NOMYO Router Documentation
Welcome to the NOMYO Router documentation! This folder contains comprehensive guides for using, configuring, and deploying the NOMYO Router.
Documentation Structure
doc/
├── architecture.md # Technical architecture overview
├── configuration.md # Detailed configuration guide
├── usage.md # API usage examples
├── deployment.md # Deployment scenarios
├── monitoring.md # Monitoring and troubleshooting
└── examples/ # Example configurations and scripts
    ├── docker-compose.yml
    ├── sample-config.yaml
    └── k8s-deployment.yaml
Getting Started
Quick Start Guide
1. **Install the router**:

```bash
git clone https://github.com/nomyo-ai/nomyo-router.git
cd nomyo-router
python3 -m venv .venv/router
source .venv/router/bin/activate
pip3 install -r requirements.txt
```

2. **Configure endpoints** in `config.yaml`:

```yaml
endpoints:
  - http://localhost:11434
max_concurrent_connections: 2

# Optional router-level API key (leave blank to disable)
nomyo-router-api-key: ""
```

3. **Run the router**:

```bash
uvicorn router:app --host 0.0.0.0 --port 12434
```

4. **Use the router**: Point your frontend to `http://localhost:12434` instead of your Ollama instance.
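Before starting the router, it can help to sanity-check the values that go into `config.yaml`. The following is a minimal sketch, not part of the router itself: `check_config` is a hypothetical helper, it assumes the file has already been parsed into a dictionary with the keys used in the quick start, and the rules it enforces are reasonable guesses rather than the router's documented validation.

```python
from urllib.parse import urlparse


def check_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found.

    Hypothetical helper: the checks are assumptions, not the router's own rules.
    """
    problems = []

    endpoints = config.get("endpoints")
    if not isinstance(endpoints, list) or not endpoints:
        problems.append("endpoints must be a non-empty list")
    else:
        for url in endpoints:
            parts = urlparse(str(url))
            if parts.scheme not in ("http", "https") or not parts.netloc:
                problems.append(f"endpoint is not an http(s) URL: {url!r}")

    limit = config.get("max_concurrent_connections")
    # bool is a subclass of int, so reject it explicitly.
    if not isinstance(limit, int) or isinstance(limit, bool) or limit < 1:
        problems.append("max_concurrent_connections must be a positive integer")

    return problems


# Same values as the quick-start configuration.
quick_start = {
    "endpoints": ["http://localhost:11434"],
    "max_concurrent_connections": 2,
    "nomyo-router-api-key": "",
}
print(check_config(quick_start))  # prints []
```

An empty list means the sketch found nothing to complain about; it does not guarantee that the endpoint is reachable.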
Key Features
- Intelligent Routing: Model deployment-aware routing with load balancing
- Multi-Endpoint Support: Combine Ollama and OpenAI-compatible endpoints
- Token Tracking: Comprehensive token usage monitoring
- Real-time Monitoring: Server-Sent Events for live usage updates
- OpenAI Compatibility: Full OpenAI API compatibility layer
- MOE System: Multiple Opinions Ensemble for improved responses with smaller models
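To make the first feature more concrete, here is a minimal sketch of one way deployment-aware, load-balanced selection could work. It is an illustration under stated assumptions, not the router's actual algorithm (the Architecture guide covers that): the `Endpoint` class, its fields, the model names, the second port, and the least-busy rule are all hypothetical. Only `max_concurrent_connections` comes from the router's configuration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Endpoint:
    """Hypothetical view of one backend, for illustration only."""
    url: str
    models: set = field(default_factory=set)  # models this endpoint serves
    active: int = 0                            # requests currently in flight


def pick_endpoint(endpoints, model, max_concurrent_connections) -> Optional[Endpoint]:
    """Pick the least-busy endpoint that serves `model` and has spare capacity."""
    candidates = [
        ep for ep in endpoints
        if model in ep.models and ep.active < max_concurrent_connections
    ]
    if not candidates:
        return None  # no endpoint has the model, or all are at their limit
    return min(candidates, key=lambda ep: ep.active)


# Made-up pool: the first endpoint is already at the limit of 2.
pool = [
    Endpoint("http://localhost:11434", {"llama3"}, active=2),
    Endpoint("http://localhost:11435", {"llama3", "mistral"}, active=1),
]
chosen = pick_endpoint(pool, "llama3", max_concurrent_connections=2)
print(chosen.url if chosen else "no capacity")  # prints http://localhost:11435
```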
Documentation Guides
Architecture
Learn about the router's internal architecture, routing algorithm, caching mechanisms, and advanced features like the MOE system.
Configuration
Detailed guide on configuring the router with multiple endpoints, API keys, and environment variables.
Usage
Comprehensive API reference with examples for making requests, streaming responses, and using advanced features.
Deployment
Step-by-step deployment guides for bare metal, Docker, Kubernetes, and production environments.
Monitoring
Monitoring endpoints, troubleshooting guides, performance tuning, and best practices for maintaining your router.
Examples
The examples directory contains ready-to-use configuration files:
- docker-compose.yml: Complete Docker Compose setup with multiple Ollama instances
- sample-config.yaml: Example configuration with comments
- k8s-deployment.yaml: Kubernetes deployment manifests
Need Help?
Common Issues
Check the Monitoring Guide for troubleshooting common problems:
- Endpoint unavailable
- Model not found
- High latency
- Connection limits reached
- Token tracking issues
Support
For additional help:
- Check the GitHub Issues
- Review the Monitoring Guide for diagnostics
- Examine the router logs for detailed error messages
Best Practices
Configuration
- Use environment variables for API keys
- Set appropriate `max_concurrent_connections` based on your hardware
- Monitor endpoint health regularly
- Keep models loaded on multiple endpoints for redundancy
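As an illustration of the first practice, the sketch below resolves the router API key from the `NOMYO_ROUTER_API_KEY` environment variable and falls back to the `nomyo-router-api-key` config entry. `resolve_router_key` is a hypothetical helper, and the precedence (environment first) is an assumption; this document does not state which source wins when both are set.

```python
import os
from typing import Optional


def resolve_router_key(config: dict, env=os.environ) -> Optional[str]:
    """Return the router API key, or None when no key is configured.

    Assumption: the environment variable takes precedence over config.yaml.
    """
    key = env.get("NOMYO_ROUTER_API_KEY") or config.get("nomyo-router-api-key")
    return key or None  # a blank value means the key is disabled


# Blank config value and no environment variable: the key is disabled.
print(resolve_router_key({"nomyo-router-api-key": ""}, env={}))  # prints None
```

Passing `env` explicitly keeps the helper easy to test without touching the real environment.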
Deployment
- Use Docker for containerized deployments
- Consider Kubernetes for production at scale
- Set up monitoring and alerting
- Back up the token count database regularly
Performance
- Balance load across multiple endpoints
- Keep frequently used models loaded
- Monitor connection counts and token usage
- Scale horizontally when needed
Next Steps
- Read the Architecture Guide to understand how the router works
- Configure your endpoints in `config.yaml`
- Deploy the router using your preferred method
- Monitor your setup using the monitoring endpoints
- Scale as needed by adding more endpoints
Happy routing! 🚀
Router API key usage
If a router API key is set (through the `NOMYO_ROUTER_API_KEY` environment variable or `nomyo-router-api-key` in the config file), include it in every request:
- Header (preferred): Authorization: Bearer <router_key>
- Query param: ?api_key=<router_key>
Example:
curl -H "Authorization: Bearer $NOMYO_ROUTER_API_KEY" http://localhost:12434/api/tags
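The same request can be built from Python. This is a minimal sketch using only the standard library: `build_request` is a hypothetical helper, `example-key` is a placeholder, and the query-parameter branch assumes the path has no existing query string. The request is constructed but not sent, so the snippet runs without a router.

```python
import urllib.request
from urllib.parse import urlencode


def build_request(base_url: str, path: str, router_key: str = "",
                  use_header: bool = True) -> urllib.request.Request:
    """Build a request to the router, attaching the key if one is given."""
    url = base_url.rstrip("/") + path
    headers = {}
    if router_key:
        if use_header:
            # Preferred: Authorization: Bearer <router_key>
            headers["Authorization"] = f"Bearer {router_key}"
        else:
            # Alternative: ?api_key=<router_key>
            url += "?" + urlencode({"api_key": router_key})
    return urllib.request.Request(url, headers=headers)


req = build_request("http://localhost:12434", "/api/tags", router_key="example-key")
print(req.full_url, req.get_header("Authorization"))
# To send it against a running router: urllib.request.urlopen(req)
```

The header form is generally the safer choice, since query strings tend to end up in access logs.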