# config.yaml

# Ollama endpoints (local + remote)
endpoints:
  - http://localhost:11434
  - http://192.168.0.51:11434
  - http://192.168.0.52:11434

  # External OpenAI-compatible endpoints (will NOT be queried for /api/ps or /api/ps_details)
  - https://api.openai.com/v1
# llama-server endpoints (OpenAI-compatible with /v1/models status info)
# These endpoints will be queried for /api/tags, /api/ps, /api/ps_details
# and included in the model selection pool for inference routing
llama_server_endpoints:
  - http://localhost:8080/v1
  - http://192.168.0.33:8081/v1
# Maximum concurrent connections per endpoint-model pair (equivalent to OLLAMA_NUM_PARALLEL)
max_concurrent_connections: 2
# Optional router-level API key that gates router/API/web UI access (leave empty to disable)
nomyo-router-api-key: ""
# API keys for remote endpoints
# Set an environment variable like OPENAI_KEY
# Ensure the endpoint URLs here match those in the endpoints block exactly
api_keys:
  # NOTE(review): http://192.168.0.50:11434 is not listed in the endpoints block above — confirm this entry is intentional
  "http://192.168.0.50:11434": "ollama"
  "http://192.168.0.51:11434": "ollama"
  "http://192.168.0.52:11434": "ollama"
  "https://api.openai.com/v1": "${OPENAI_KEY}"
  "http://localhost:8080/v1": "llama-server"  # Optional API key for llama-server - depends on llama_server config
  "http://192.168.0.33:8081/v1": "llama-server"