plano/crates/hermesllm
Musa 5a4487fc6e
ci+fix: add update-providers workflow + non-destructive fetch_models (#914)
* ci: add update-providers workflow

Adds .github/workflows/update-providers.yml so the provider_models.yaml
refresh can be triggered via workflow_dispatch (manual UI / gh CLI) or
repository_dispatch (from the PlanoHelper Slack bot).

The workflow:
  - Runs cargo run --bin fetch_models --features model-fetch with all
    provider API keys + AWS creds available as env from secrets.
  - Opens a PR via peter-evans/create-pull-request scoped to just
    crates/hermesllm/src/bin/provider_models.yaml.
  - On repository_dispatch, posts the PR link (or failure) back to Slack
    via the response_url in the dispatch payload.

Includes keys for the providers fetch_models reads today (OpenAI,
Anthropic, Mistral, DeepSeek, Grok, Moonshot, Dashscope/Qwen, Zhipu,
Xiaomi/Mimo, Google) plus forward-compat env for OpenRouter and Vercel
AI Gateway (added in #902).

The workflow has no push: or schedule: trigger, so landing this is inert
until something dispatches it. Required secrets are documented in
apps/planohelper/README.md (in a follow-up PR).

* fix(fetch_models): preserve existing providers when keys are missing

Previously fetch_models rebuilt provider_models.yaml from scratch on
every run, so running locally (or in CI) without e.g. ANTHROPIC_API_KEY,
GOOGLE_API_KEY, or AWS Bedrock credentials would silently drop those
providers' entries from the file. The user only meant to refresh what
they had keys for.

Now fetch_models loads the existing provider_models.yaml first and
treats each provider independently:

  - Successful fetch -> entry replaced with fresh data ("updated")
  - Missing API key  -> existing entry preserved ("skipped")
  - Failed fetch     -> existing entry preserved ("failed, kept existing")
  - Missing AWS creds -> Amazon entry preserved instead of running
    `aws bedrock list-foundation-models` and erroring out

If the file doesn't exist yet it starts fresh, same as before. If the
file exists but can't be parsed, the binary refuses to overwrite it and
exits with an error rather than silently nuking it.
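
The per-provider rules above can be sketched as a small merge function. This is a minimal illustration, not the code in fetch_models: the names `FetchOutcome`, `Status`, and `merge_providers` are hypothetical, and model lists are simplified to plain strings.

```rust
use std::collections::BTreeMap;

/// Result of trying to refresh one provider (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum FetchOutcome {
    /// Fetch succeeded; carries the fresh model list.
    Fetched(Vec<String>),
    /// No API key (or AWS credentials) available in the environment.
    MissingKey,
    /// Credentials were present but the request failed.
    Failed(String),
}

/// Status reported per provider in the end-of-run summary.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Status {
    Updated,
    Skipped,
    FailedKeptExisting,
}

/// Merge fetch outcomes into the existing provider map.
/// Only a successful fetch touches an entry; nothing is ever removed.
pub fn merge_providers(
    existing: BTreeMap<String, Vec<String>>,
    outcomes: Vec<(String, FetchOutcome)>,
) -> (BTreeMap<String, Vec<String>>, BTreeMap<String, Status>) {
    let mut merged = existing;
    let mut summary = BTreeMap::new();
    for (provider, outcome) in outcomes {
        let status = match outcome {
            FetchOutcome::Fetched(models) => {
                // Replace (or create) the entry with fresh data.
                merged.insert(provider.clone(), models);
                Status::Updated
            }
            // Leave whatever was already in the file untouched.
            FetchOutcome::MissingKey => Status::Skipped,
            FetchOutcome::Failed(_reason) => Status::FailedKeptExisting,
        };
        summary.insert(provider, status);
    }
    (merged, summary)
}
```

Starting from the existing map and only ever inserting means a missing key or failed request cannot remove an entry, which is the property the fix is after.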

Other changes that come along for the ride:

  - HashMap -> BTreeMap for the providers map. Output YAML now has a
    stable, alphabetical provider order across runs (eliminates
    HashMap-iteration churn in PR diffs). The first PR after this
    lands will reorder existing entries one time.
  - Per-provider summary at the end (updated / skipped / failed)
    so the workflow logs and Slack PR body make it obvious what
    actually changed vs. what was left alone.
  - File-level usage comment updated to match the new behavior and
    list the additional env vars (MISTRAL_API_KEY, MIMO_API_KEY).
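
The effect of the HashMap -> BTreeMap switch can be seen in a small sketch. The `render_providers` helper below is hypothetical and hand-rolls a YAML-like rendering instead of using whatever serializer the binary actually uses; the point is only that BTreeMap iterates in key order regardless of insertion order.

```rust
use std::collections::BTreeMap;

/// Render a provider map as simple YAML-like text.
/// Hand-rolled for this sketch; the real binary's serialization may differ.
pub fn render_providers(providers: &BTreeMap<String, Vec<String>>) -> String {
    let mut out = String::new();
    // BTreeMap iterates in ascending key order, so output is stable across runs.
    for (provider, models) in providers {
        out.push_str(provider);
        out.push_str(":\n");
        for model in models {
            out.push_str("  - ");
            out.push_str(model);
            out.push('\n');
        }
    }
    out
}
```

Two maps built with the same entries in different insertion orders render to identical text, which is what keeps PR diffs free of reordering noise.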

No tests existed for this binary; manually verified with `env -i` (no
keys at all) that all 13 existing providers are preserved with their
original model counts.
2026-05-05 14:19:52 -07:00
src ci+fix: add update-providers workflow + non-destructive fetch_models (#914) 2026-05-05 14:19:52 -07:00
Cargo.toml Adding support for wildcard models in the model_providers config (#696) 2026-01-28 17:47:33 -08:00
README.md updating the implementation of /v1/chat/completions to use the generi… (#548) 2025-08-20 12:55:29 -07:00

hermesllm

A Rust library for handling LLM (Large Language Model) API requests and responses with unified abstractions across multiple providers.

Features

  • Unified request/response types with provider-specific parsing
  • Support for both streaming and non-streaming responses
  • Type-safe provider identification
  • OpenAI-compatible API structure with extensible provider support

Supported Providers

  • OpenAI
  • Mistral
  • Groq
  • DeepSeek
  • Gemini
  • Claude
  • GitHub

Installation

Add to your Cargo.toml:

[dependencies]
hermesllm = { path = "../hermesllm" }  # or appropriate path in workspace

Usage

Basic Request Parsing

use hermesllm::providers::{ProviderRequestType, ProviderRequest, ProviderId};

// Parse request from JSON bytes
let request_bytes = r#"{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]}"#;

// Parse with provider context
let request = ProviderRequestType::try_from((request_bytes.as_bytes(), &ProviderId::OpenAI))?;

// Access request properties
println!("Model: {}", request.model());
println!("User message: {:?}", request.get_recent_user_message());
println!("Is streaming: {}", request.is_streaming());

Working with Responses

use hermesllm::providers::{ProviderId, ProviderResponseType, ProviderResponse};

// Parse response from provider
let response_bytes = /* JSON response from LLM */;
let response = ProviderResponseType::try_from((response_bytes, ProviderId::OpenAI))?;

// Extract token usage
if let Some((prompt, completion, total)) = response.extract_usage_counts() {
    println!("Tokens used: {}/{}/{}", prompt, completion, total);
}

Handling Streaming Responses

use hermesllm::providers::{ProviderId, ProviderStreamResponseIter, ProviderStreamResponse};

// Create streaming iterator from SSE data
let sse_data = /* Server-Sent Events data */;
let stream = ProviderStreamResponseIter::try_from((sse_data, &ProviderId::OpenAI))?;

// Process streaming chunks
for chunk_result in stream {
    match chunk_result {
        Ok(chunk) => {
            if let Some(content) = chunk.content_delta() {
                print!("{}", content);
            }
            if chunk.is_final() {
                break;
            }
        }
        Err(e) => eprintln!("Stream error: {}", e),
    }
}
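
For readers unfamiliar with the input format, the sketch below shows the kind of framing a streaming iterator has to handle, assuming OpenAI-style Server-Sent Events where each event is a `data:` line and the stream ends with `data: [DONE]`. The `sse_payloads` function is hypothetical and is not how ProviderStreamResponseIter is implemented; it only extracts the raw payload strings.

```rust
/// Iterate over the `data:` payloads in a buffer of Server-Sent Events text,
/// stopping at the OpenAI-style `[DONE]` sentinel.
/// Hypothetical helper for illustration; not part of hermesllm.
pub fn sse_payloads<'a>(sse_data: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    sse_data
        .lines()
        // Keep only `data:` lines; blank separator lines are dropped here.
        .filter_map(|line| line.strip_prefix("data:"))
        .map(str::trim)
        // Everything after the sentinel is ignored.
        .take_while(|payload| *payload != "[DONE]")
}
```

Each payload yielded here would still need to be parsed as JSON into a provider-specific chunk type before something like `content_delta()` could be read from it.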

Provider Compatibility

use hermesllm::providers::{ProviderId, has_compatible_api, supported_apis};

// Check API compatibility
let provider = ProviderId::Groq;
if has_compatible_api(&provider, "/v1/chat/completions") {
    println!("Provider supports chat completions");
}

// List supported APIs
let apis = supported_apis(&provider);
println!("Supported APIs: {:?}", apis);

Core Types

Provider Types

  • ProviderId - Enum identifying supported providers (OpenAI, Mistral, Groq, etc.)
  • ProviderRequestType - Enum wrapping provider-specific request types
  • ProviderResponseType - Enum wrapping provider-specific response types
  • ProviderStreamResponseIter - Iterator for streaming response chunks

Traits

  • ProviderRequest - Common interface for all request types
  • ProviderResponse - Common interface for all response types
  • ProviderStreamResponse - Interface for streaming response chunks
  • TokenUsage - Interface for token usage information

OpenAI API Types

  • ChatCompletionsRequest - Chat completion request structure
  • ChatCompletionsResponse - Chat completion response structure
  • Message, Role, MessageContent - Message building blocks

Architecture

The library uses a type-safe enum-based approach that:

  • Provides Type Safety: All provider operations are checked at compile time
  • Enables Runtime Provider Selection: Provider can be determined from request headers or config
  • Maintains Clean Abstractions: Common traits hide provider-specific details
  • Supports Extensibility: New providers can be added by extending the enums

All requests are parsed into a common ProviderRequestType enum which implements the ProviderRequest trait, allowing uniform access to request properties regardless of the underlying provider format.
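
The enum-plus-trait pattern described above can be illustrated with toy types. This is a sketch under stated assumptions: `RequestLike`, `OpenAiStyleRequest`, `OtherStyleRequest`, and `AnyRequest` are hypothetical stand-ins, not the library's actual definitions of ProviderRequest or ProviderRequestType.

```rust
/// Common interface, analogous in spirit to a shared request trait.
pub trait RequestLike {
    fn model(&self) -> &str;
    fn is_streaming(&self) -> bool;
}

/// Toy request shape with an explicit `stream` flag (hypothetical).
pub struct OpenAiStyleRequest {
    pub model: String,
    pub stream: bool,
}

/// Toy request shape that names its model field differently (hypothetical).
pub struct OtherStyleRequest {
    pub model_id: String,
}

impl RequestLike for OpenAiStyleRequest {
    fn model(&self) -> &str {
        &self.model
    }
    fn is_streaming(&self) -> bool {
        self.stream
    }
}

impl RequestLike for OtherStyleRequest {
    fn model(&self) -> &str {
        &self.model_id
    }
    fn is_streaming(&self) -> bool {
        false // this toy format has no streaming flag
    }
}

/// Wrapper enum chosen at runtime; callers only see `RequestLike`.
pub enum AnyRequest {
    OpenAiStyle(OpenAiStyleRequest),
    OtherStyle(OtherStyleRequest),
}

impl RequestLike for AnyRequest {
    fn model(&self) -> &str {
        // The match is exhaustive, so adding a variant forces an update here.
        match self {
            AnyRequest::OpenAiStyle(r) => r.model(),
            AnyRequest::OtherStyle(r) => r.model(),
        }
    }
    fn is_streaming(&self) -> bool {
        match self {
            AnyRequest::OpenAiStyle(r) => r.is_streaming(),
            AnyRequest::OtherStyle(r) => r.is_streaming(),
        }
    }
}
```

The variant is picked at runtime, but because every `match` must be exhaustive, adding a provider without handling it in each method is a compile error rather than a runtime surprise.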

Examples

See the src/lib.rs tests for complete working examples of:

  • Parsing requests with provider context
  • Handling streaming responses
  • Working with token usage information

License

This project is licensed under the MIT License.