Available Models
All models are available via api.nomyo.ai. Pass the model ID string directly as the `model` field of `create()`.
Model List
| Model ID | Parameters | Type | Notes |
|---|---|---|---|
| Qwen/Qwen3-0.6B | 0.6B | General | Lightweight, fast inference |
| Qwen/Qwen3.5-0.8B | 0.8B | General | Lightweight, fast inference |
| LiquidAI/LFM2.5-1.2B-Thinking | 1.2B | Thinking | Reasoning model |
| ibm-granite/granite-4.0-h-small | Small | General | IBM Granite 4.0, enterprise-focused |
| Qwen/Qwen3.5-9B | 9B | General | Balanced quality and speed |
| utter-project/EuroLLM-9B-Instruct-2512 | 9B | General | Multilingual, strong European language support |
| zai-org/GLM-4.7-Flash | — | General | Fast GLM variant |
| mistralai/Ministral-3-14B-Instruct-2512-GGUF | 14B | General | Mistral instruction-tuned |
| ServiceNow-AI/Apriel-1.6-15b-Thinker | 15B | Thinking | Reasoning model |
| openai/gpt-oss-20b | 20B | General | OpenAI open-weight release |
| LiquidAI/LFM2-24B-A2B | 24B (2B active) | General | MoE — efficient inference |
| Qwen/Qwen3.5-27B | 27B | General | High quality, large context |
| google/medgemma-27b-it | 27B | Specialized | Medical domain, instruction-tuned |
| nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4 | 30B (3B active) | General | MoE — efficient inference |
| Qwen/Qwen3.5-35B-A3B | 35B (3B active) | General | MoE — efficient inference |
| moonshotai/Kimi-Linear-48B-A3B-Instruct | 48B (3B active) | General | MoE — large capacity, efficient inference |
MoE (Mixture of Experts) models show total/active parameter counts. Only active parameters are used per token, keeping inference cost low relative to total model size.
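As a rough illustration of that point, the active/total ratio gives a feel for per-token compute relative to a dense model of the same total size. This is a sketch using the parameter counts from the table above, not an official cost formula:

```typescript
// Per-token "active weight" fraction for the MoE models listed above.
// Parameter counts are read off the model IDs; illustrative only.
type MoeModel = { id: string; totalB: number; activeB: number };

const moeModels: MoeModel[] = [
  { id: 'LiquidAI/LFM2-24B-A2B', totalB: 24, activeB: 2 },
  { id: 'nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4', totalB: 30, activeB: 3 },
  { id: 'Qwen/Qwen3.5-35B-A3B', totalB: 35, activeB: 3 },
  { id: 'moonshotai/Kimi-Linear-48B-A3B-Instruct', totalB: 48, activeB: 3 },
];

// Fraction of parameters touched per token, versus a dense model of equal size.
function activeFraction(m: MoeModel): number {
  return m.activeB / m.totalB;
}

for (const m of moeModels) {
  console.log(`${m.id}: ~${(activeFraction(m) * 100).toFixed(0)}% of weights active per token`);
}
```

So Kimi-Linear-48B-A3B touches only about 6% of its weights per token, which is why its inference cost sits far below what its total size suggests.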
Usage
```ts
import { SecureChatCompletion } from 'nomyo-js';

const client = new SecureChatCompletion({ apiKey: process.env.NOMYO_API_KEY });

const response = await client.create({
  model: 'Qwen/Qwen3.5-9B',
  messages: [{ role: 'user', content: 'Hello!' }],
});
```
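Assuming the response follows the OpenAI-compatible shape used throughout this page (`choices[0].message.content`), the reply text can be pulled out with a small helper like this sketch:

```typescript
// Minimal sketch: extract the assistant's text from a response, assuming the
// choices[0].message.content shape shown elsewhere on this page.
interface ChatMessage {
  role: string;
  content: string;
  reasoning_content?: string; // present on Thinking models (see below)
}

interface ChatResponse {
  choices: { message: ChatMessage }[];
}

// Returns the first choice's text, or '' if there are no choices.
function replyText(response: ChatResponse): string {
  return response.choices[0]?.message.content ?? '';
}
```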
Choosing a Model
| Goal | Recommended models |
|---|---|
| Low latency / edge | Qwen/Qwen3-0.6B, Qwen/Qwen3.5-0.8B, LiquidAI/LFM2.5-1.2B-Thinking |
| Balanced quality + speed | Qwen/Qwen3.5-9B, mistralai/Ministral-3-14B-Instruct-2512-GGUF |
| Reasoning / chain-of-thought | LiquidAI/LFM2.5-1.2B-Thinking, ServiceNow-AI/Apriel-1.6-15b-Thinker |
| Multilingual | utter-project/EuroLLM-9B-Instruct-2512 |
| Medical | google/medgemma-27b-it |
| Highest quality | moonshotai/Kimi-Linear-48B-A3B-Instruct, Qwen/Qwen3.5-35B-A3B |
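The table above can also be encoded as a small lookup when selecting a model programmatically. The goal keys here are documentation shorthand, not an API parameter:

```typescript
// Illustrative mapping of the goals in the table above to model IDs.
// The goal names are this page's shorthand, not part of the API.
const recommendations: Record<string, string[]> = {
  'low-latency': ['Qwen/Qwen3-0.6B', 'Qwen/Qwen3.5-0.8B', 'LiquidAI/LFM2.5-1.2B-Thinking'],
  'balanced': ['Qwen/Qwen3.5-9B', 'mistralai/Ministral-3-14B-Instruct-2512-GGUF'],
  'reasoning': ['LiquidAI/LFM2.5-1.2B-Thinking', 'ServiceNow-AI/Apriel-1.6-15b-Thinker'],
  'multilingual': ['utter-project/EuroLLM-9B-Instruct-2512'],
  'medical': ['google/medgemma-27b-it'],
  'highest-quality': ['moonshotai/Kimi-Linear-48B-A3B-Instruct', 'Qwen/Qwen3.5-35B-A3B'],
};

// First recommendation for a goal, or undefined for unknown goals.
function recommendModel(goal: string): string | undefined {
  return recommendations[goal]?.[0];
}
```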
Thinking Models
Models marked Thinking return an additional `reasoning_content` field in the response message alongside the normal `content`. This contains the model's internal chain-of-thought:
```ts
const response = await client.create({
  model: 'LiquidAI/LFM2.5-1.2B-Thinking',
  messages: [{ role: 'user', content: 'Is 9.9 or 9.11 larger?' }],
});

const { content, reasoning_content } = response.choices[0].message;
console.log('Reasoning:', reasoning_content); // internal chain-of-thought
console.log('Answer:', content); // final answer
```
Security Tier Compatibility
Not all models are available on every security tier. If a model is not permitted for the requested tier, the server returns HTTP 403 and the client throws `ForbiddenError`.
```ts
import { ForbiddenError } from 'nomyo-js';

try {
  const response = await client.create({
    model: 'Qwen/Qwen3.5-27B',
    messages: [{ role: 'user', content: '...' }],
    security_tier: 'maximum',
  });
} catch (err) {
  if (err instanceof ForbiddenError) {
    // Model not available at this security tier — retry with a different tier or model
  }
}
```
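One way to handle this is a fallback chain: try models in order and move to the next when one is rejected. The helper below is a sketch, not part of the SDK; the `attempt` and `isForbidden` callbacks are injected so it stays agnostic to the client's exact API:

```typescript
// Sketch of a fallback chain: try each model in order, falling through when an
// attempt is rejected (e.g. a ForbiddenError from a tier mismatch).
// `attempt` and `isForbidden` are injected; this helper is not part of nomyo-js.
async function createWithFallback<T>(
  models: string[],
  attempt: (model: string) => Promise<T>,
  isForbidden: (err: unknown) => boolean,
): Promise<T> {
  let lastErr: unknown;
  for (const model of models) {
    try {
      return await attempt(model);
    } catch (err) {
      if (!isForbidden(err)) throw err; // only fall back on tier rejections
      lastErr = err;
    }
  }
  throw lastErr ?? new Error('no models supplied');
}
```

With the client from earlier, a call might look like `createWithFallback(['Qwen/Qwen3.5-27B', 'Qwen/Qwen3.5-9B'], (m) => client.create({ model: m, messages, security_tier: 'maximum' }), (e) => e instanceof ForbiddenError)`.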