feat(llm): add Gemini CLI provider as primary; set qwen3.5:9b as default Ollama model

- Add GeminiCliProvider: shells out to `gemini -p` with `--output-format json`,
  injection-safe prompt passing, MCP server suppression via temp workdir,
  6-slot concurrency semaphore, 60s subprocess deadline (see the sketch
  after this list)
- Add `--llm-provider`, `--llm-model`, `--llm-base-url` CLI flags for per-call overrides
- Provider chain: Gemini CLI → OpenAI → Ollama → Anthropic
- Move LLM call timing into the dispatch layer (prints `LLM: Xs` on stderr)
- Default Ollama model: qwen3:8b → qwen3.5:9b (benchmark shows better schema extraction)
- Add `noxa mcp` subcommand
- Add docs/reports/llm-benchmark-2026-04-11.md (Gemini vs qwen3.5:4b vs qwen3.5:9b)
- Bump version 0.3.11 → 0.4.0
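
A minimal sketch of the GeminiCliProvider pattern described in the first
bullet, assuming Python asyncio; the function and constant names are
hypothetical, not the noxa source, and passing the model via `GEMINI_MODEL`
(listed in the .env example below) is an assumption:

```python
import asyncio
import json
import os
import sys
import tempfile
import time

# Constants mirror the commit message; names are hypothetical.
GEMINI_SLOTS = 6        # "6-slot concurrency semaphore"
GEMINI_DEADLINE = 60.0  # "60s subprocess deadline"

_slots = asyncio.Semaphore(GEMINI_SLOTS)

async def gemini_complete(prompt: str, model: str = "gemini-2.5-pro") -> dict:
    """Shell out to `gemini -p` and return the parsed JSON reply."""
    async with _slots:
        # An empty temp workdir gives the CLI no project config to discover,
        # which is one way to suppress MCP server startup.
        with tempfile.TemporaryDirectory() as workdir:
            start = time.monotonic()
            proc = await asyncio.create_subprocess_exec(
                # The prompt travels as a single argv element, never through
                # a shell, which keeps the call injection-safe.
                "gemini", "--output-format", "json", "-p", prompt,
                cwd=workdir,
                env={**os.environ, "GEMINI_MODEL": model},
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.DEVNULL,
            )
            try:
                stdout, _ = await asyncio.wait_for(
                    proc.communicate(), timeout=GEMINI_DEADLINE
                )
            except asyncio.TimeoutError:
                proc.kill()
                await proc.wait()
                raise
            # The commit moves this timing line to the dispatch layer;
            # it sits here only to keep the sketch self-contained.
            print(f"LLM: {time.monotonic() - start:.1f}s", file=sys.stderr)
            if proc.returncode != 0:
                raise RuntimeError(f"gemini exited with code {proc.returncode}")
            return json.loads(stdout)
```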

Co-authored-by: Claude <claude@anthropic.com>
commit adf4b6ba55 (parent 464eb1baec)
Author: Jacob Magar
Date:   2026-04-12 00:52:53 -04:00

39 changed files with 1999 additions and 1789 deletions

@@ -13,8 +13,16 @@ NOXA_PROXY_FILE=
 # Webhook URL for completion notifications
 NOXA_WEBHOOK_URL=
-# LLM base URL (Ollama or OpenAI-compatible endpoint)
-NOXA_LLM_BASE_URL=
+# LLM provider configuration and backend defaults
+# NOXA_LLM_PROVIDER=gemini
+# NOXA_LLM_MODEL=gemini-2.5-pro
+# NOXA_LLM_BASE_URL= (Ollama or OpenAI-compatible endpoint)
+# GEMINI_MODEL=gemini-2.5-pro
+# OLLAMA_HOST=http://localhost:11434
+# OLLAMA_MODEL=qwen3.5:9b
+# OLLAMA_HEALTH_TIMEOUT_MS=2000
+# OPENAI_API_KEY=
+# ANTHROPIC_API_KEY=
 # Optional: path to a non-default config file (default: ./config.json)
 # NOXA_CONFIG=/path/to/my-config.json
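
For orientation, a hypothetical reading of how the provider chain from the
commit message (Gemini CLI → OpenAI → Ollama → Anthropic) might consume the
variables above; illustrative Python, not the noxa source, and the health
probe against the Ollama root endpoint is an assumption:

```python
import os
import shutil
import urllib.request

def ollama_is_healthy(host: str) -> bool:
    # Probe the Ollama root endpoint within OLLAMA_HEALTH_TIMEOUT_MS.
    timeout_s = int(os.environ.get("OLLAMA_HEALTH_TIMEOUT_MS", "2000")) / 1000
    try:
        with urllib.request.urlopen(host, timeout=timeout_s):
            return True
    except OSError:
        return False

def resolve_provider(cli_override: str | None = None) -> str:
    # A --llm-provider flag or NOXA_LLM_PROVIDER pins one backend outright.
    pinned = cli_override or os.environ.get("NOXA_LLM_PROVIDER")
    if pinned:
        return pinned
    # Otherwise walk the chain, taking the first backend that looks usable:
    # Gemini CLI -> OpenAI -> Ollama -> Anthropic.
    if shutil.which("gemini"):
        return "gemini"
    if os.environ.get("OPENAI_API_KEY"):
        return "openai"
    if ollama_is_healthy(os.environ.get("OLLAMA_HOST", "http://localhost:11434")):
        return "ollama"
    if os.environ.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    raise RuntimeError("no usable LLM provider configured")
```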