plano/skills/rules/routing-passthrough.md at 743d074184af534e39011ac1bd2c940590046fbe

apunkt/plano

Fork 0

mirror of https://github.com/katanemo/plano.git synced 2026-04-24 16:26:34 +02:00

Musa 743d074184

CI / pre-commit (push) Waiting to run

Details

CI / plano-tools-tests (push) Waiting to run

Details

CI / native-smoke-test (push) Waiting to run

Details

CI / docker-build (push) Waiting to run

Details

CI / validate-config (push) Waiting to run

Details

CI / security-scan (push) Blocked by required conditions

Details

CI / test-prompt-gateway (push) Blocked by required conditions

Details

CI / test-model-alias-routing (push) Blocked by required conditions

Details

CI / test-responses-api-with-state (push) Blocked by required conditions

Details

CI / e2e-plano-tests (3.10) (push) Blocked by required conditions

Details

CI / e2e-plano-tests (3.11) (push) Blocked by required conditions

Details

CI / e2e-plano-tests (3.12) (push) Blocked by required conditions

Details

CI / e2e-plano-tests (3.13) (push) Blocked by required conditions

Details

CI / e2e-plano-tests (3.14) (push) Blocked by required conditions

Details

CI / e2e-demo-preference (push) Blocked by required conditions

Details

CI / e2e-demo-currency (push) Blocked by required conditions

Details

Publish docker image (latest) / build-arm64 (push) Waiting to run

Details

Publish docker image (latest) / build-amd64 (push) Waiting to run

Details

Publish docker image (latest) / create-manifest (push) Blocked by required conditions

Details

Build and Deploy Documentation / build (push) Waiting to run

Details

add Plano agent skills framework and rule set (#797 )

* feat: add initial documentation for Plano Agent Skills

* feat: readme with examples

* feat: add detailed skills documentation and examples for Plano

---------

Co-authored-by: Adil Hafeez <adil.hafeez@gmail.com>

2026-04-16 13:16:51 -07:00

2.3 KiB

Raw Blame History

title	impact	impactDescription	tags
Use Passthrough Auth for Proxy and Multi-Tenant Setups	MEDIUM	Without passthrough auth, self-hosted proxy services (LiteLLM, vLLM, etc.) reject Plano's requests because the wrong Authorization header is sent	routing, authentication, proxy, litellm, multi-tenant

Use Passthrough Auth for Proxy and Multi-Tenant Setups

When routing to a self-hosted LLM proxy (LiteLLM, vLLM, OpenRouter, Azure APIM) or in multi-tenant setups where clients supply their own keys, set passthrough_auth: true. This forwards the client's Authorization header rather than Plano's configured access_key. Combine with a base_url pointing to the proxy.

Incorrect (Plano sends its own key to a proxy that expects the client's key):

model_providers:
  - model: custom/proxy
    base_url: http://host.docker.internal:8000
    access_key: $SOME_KEY    # Plano overwrites the client's auth — proxy rejects it

Correct (forward client Authorization header to the proxy):

version: v0.3.0

listeners:
  - type: model
    name: model_listener
    port: 12000

model_providers:
  - model: custom/litellm-proxy
    base_url: http://host.docker.internal:4000    # LiteLLM server
    provider_interface: openai                    # LiteLLM uses OpenAI format
    passthrough_auth: true                        # Forward client's Bearer token
    default: true

Multi-tenant pattern (client supplies their own API key):

model_providers:
  # Plano acts as a passthrough gateway; each client has their own OpenAI key
  - model: openai/gpt-4o
    passthrough_auth: true    # No access_key here — client's key is forwarded
    default: true

Combined: proxy for some models, Plano-managed for others:

model_providers:
  - model: openai/gpt-4o-mini
    access_key: $OPENAI_API_KEY    # Plano manages this key
    default: true
    routing_preferences:
      - name: quick tasks
        description: Short answers, simple lookups, fast completions

  - model: custom/vllm-llama
    base_url: http://gpu-server:8000
    provider_interface: openai
    passthrough_auth: true         # vLLM cluster handles its own auth
    routing_preferences:
      - name: long context
        description: Processing very long documents, multi-document analysis

Reference: https://github.com/katanemo/archgw

2.3 KiB Raw Blame History

Use Passthrough Auth for Proxy and Multi-Tenant Setups

2.3 KiB

Raw Blame History