Models
Alpha Nerd edited this page 2026-04-13 18:04:42 +02:00
Available Models
All models are available via api.nomyo.ai. Pass the model ID string directly to the `model` parameter of `create()`.
Model List
| Model ID | Parameters | Type | Context | Notes |
|---|---|---|---|---|
| Qwen/Qwen3-0.6B | 0.6B | General | 40k | Lightweight, fast inference |
| Qwen/Qwen3.5-0.8B | 0.8B | General | 40k | Lightweight, fast inference |
| LiquidAI/LFM2.5-1.2B-Thinking | 1.2B | Thinking | 4k | Reasoning model |
| ibm-granite/granite-4.0-h-small | 32B | General | 131k | IBM Granite 4.0, enterprise-focused |
| Qwen/Qwen3.5-9B | 9B | General | 200k | Balanced quality and speed |
| utter-project/EuroLLM-9B-Instruct-2512 | 9B | General | 32k | Multilingual, strong European language support |
| zai-org/GLM-4.7-Flash | 30B (3B active) | General | 131k | Fast GLM variant |
| mistralai/Ministral-3-14B-Instruct-2512-GGUF | 14B | General | 32k | Mistral instruction-tuned |
| ServiceNow-AI/Apriel-1.6-15b-Thinker | 15B | Specialized | 32k | Reasoning model strong in math/physics/science |
| openai/gpt-oss-20b | 20B | General | 131k | OpenAI open-weight release |
| LiquidAI/LFM2-24B-A2B | 24B (2B active) | General | 128k | MoE, efficient inference |
| Qwen/Qwen3.5-27B | 27B | General | 200k | High quality, large context |
| google/medgemma-27b-it | 27B | Specialized | 131k | Medical domain, instruction-tuned |
| nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4 | 30B (3B active) | General | 200k | MoE, efficient inference |
| Qwen/Qwen3.5-35B-A3B | 35B (3B active) | General | 200k | MoE, efficient inference |
| moonshotai/Kimi-Linear-48B-A3B-Instruct | 48B (3B active) | General | 1M | MoE, large capacity, efficient inference |
For MoE (Mixture of Experts) models, the Parameters column shows the total parameter count with the active-per-token count in parentheses.
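Since the models differ mainly in context window, one practical selection criterion is whether a model can hold your whole prompt. Below is a minimal sketch of that check, using approximate context sizes taken from the table above; the `CONTEXT_WINDOWS` dict and `models_for_context` helper are illustrative, not part of the nomyo SDK.

```python
# Approximate context windows (tokens) for a subset of the models listed above.
# Values mirror the table; exact limits may differ slightly (e.g. 131k vs 131072).
CONTEXT_WINDOWS = {
    "Qwen/Qwen3-0.6B": 40_000,
    "Qwen/Qwen3.5-9B": 200_000,
    "zai-org/GLM-4.7-Flash": 131_000,
    "moonshotai/Kimi-Linear-48B-A3B-Instruct": 1_000_000,
}

def models_for_context(required_tokens: int) -> list[str]:
    """Return model IDs whose context window can hold the request."""
    return [m for m, ctx in CONTEXT_WINDOWS.items() if ctx >= required_tokens]

print(models_for_context(150_000))
# ['Qwen/Qwen3.5-9B', 'moonshotai/Kimi-Linear-48B-A3B-Instruct']
```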
Usage Example
```python
import asyncio

from nomyo import SecureChatCompletion

client = SecureChatCompletion(api_key="your-api-key")

async def main():
    # create() is a coroutine, so it must be awaited inside an async function
    response = await client.create(
        model="Qwen/Qwen3.5-9B",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response)

asyncio.run(main())
```
Choosing a Model
- Low latency / edge use: `Qwen/Qwen3-0.6B`, `Qwen/Qwen3.5-0.8B`, `LiquidAI/LFM2.5-1.2B-Thinking`
- Balanced quality and speed: `Qwen/Qwen3.5-9B`, `mistralai/Ministral-3-14B-Instruct-2512-GGUF`
- Reasoning / chain-of-thought: `LiquidAI/LFM2.5-1.2B-Thinking`, `ServiceNow-AI/Apriel-1.6-15b-Thinker`
- Multilingual: `utter-project/EuroLLM-9B-Instruct-2512`
- Medical: `google/medgemma-27b-it`
- Highest quality: `moonshotai/Kimi-Linear-48B-A3B-Instruct`, `Qwen/Qwen3.5-35B-A3B`
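If you route requests by use case in application code, the recommendations above can be captured in a small lookup. This is a hypothetical convenience helper built from the list above; the `RECOMMENDED` mapping and `pick_model` function are illustrative and not part of the nomyo SDK.

```python
# Use-case keys -> recommended model IDs, taken from the guidance above.
RECOMMENDED = {
    "edge": "Qwen/Qwen3-0.6B",
    "balanced": "Qwen/Qwen3.5-9B",
    "reasoning": "ServiceNow-AI/Apriel-1.6-15b-Thinker",
    "multilingual": "utter-project/EuroLLM-9B-Instruct-2512",
    "medical": "google/medgemma-27b-it",
    "quality": "moonshotai/Kimi-Linear-48B-A3B-Instruct",
}

def pick_model(use_case: str) -> str:
    """Return a recommended model ID, falling back to the balanced default."""
    return RECOMMENDED.get(use_case, RECOMMENDED["balanced"])

print(pick_model("medical"))   # google/medgemma-27b-it
print(pick_model("unknown"))   # Qwen/Qwen3.5-9B (fallback)
```

The returned string can be passed directly as the `model` argument in the usage example above.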