.. _deployment:

Deployment
==========

Plano can be deployed in two ways: **natively** on the host (default) or inside a **Docker container**.

Native Deployment (Default)
---------------------------

Plano runs natively by default. Pre-compiled binaries (Envoy, WASM plugins, brightstaff) are automatically downloaded on the first run and cached at ``~/.plano/``.

Supported platforms: Linux (x86_64, aarch64), macOS (Apple Silicon).

Start Plano
~~~~~~~~~~~~

.. code-block:: bash

    planoai up plano_config.yaml

Options:

- ``--foreground``: stay attached and stream logs (Ctrl+C to stop)
- ``--with-tracing``: start a local OTLP trace collector

Runtime files (rendered configs, logs, PID file) are stored in ``~/.plano/run/``.

Stop Plano
~~~~~~~~~~

.. code-block:: bash

    planoai down

Build from Source (Developer)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To build from source instead of using the pre-compiled binaries, you need:

- `Rust `_ with the ``wasm32-wasip1`` target
- OpenSSL dev headers (``libssl-dev`` on Debian/Ubuntu, ``openssl`` on macOS)

.. code-block:: bash

    planoai build --native

Docker Deployment
-----------------

Below is a minimal, production-ready example showing how to deploy the Plano Docker image directly and run basic runtime checks. Adjust image names, tags, and the ``plano_config.yaml`` path to match your environment.

.. note::
    You must pass every environment variable that your ``plano_config.yaml`` file references.

    For ``plano_config.yaml``, you can use any sample configuration defined earlier in the documentation. For example, you can try the :ref:`LLM Routing ` sample config.

Docker Compose Setup
~~~~~~~~~~~~~~~~~~~~

Create a ``docker-compose.yml`` file with the following configuration:

.. code-block:: yaml

    # docker-compose.yml
    services:
      plano:
        image: katanemo/plano:0.4.16
        container_name: plano
        ports:
          - "10000:10000" # ingress (client -> plano)
          - "12000:12000" # egress (plano -> upstream/llm proxy)
        volumes:
          - ./plano_config.yaml:/app/plano_config.yaml:ro
        environment:
          - OPENAI_API_KEY=${OPENAI_API_KEY:?error}
          - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?error}

Starting the Stack
~~~~~~~~~~~~~~~~~~

Start the services from the directory containing ``docker-compose.yml`` and ``plano_config.yaml``:

.. code-block:: bash

    # Set required environment variables and start services
    OPENAI_API_KEY=xxx ANTHROPIC_API_KEY=yyy docker compose up -d

Check container health and logs:

.. code-block:: bash

    docker compose ps
    docker compose logs -f plano

You can also use the CLI in Docker mode:

.. code-block:: bash

    planoai up plano_config.yaml --docker
    planoai down --docker

Kubernetes Deployment
---------------------

Plano runs as a single container in Kubernetes. The container bundles Envoy, WASM plugins, and brightstaff, managed internally by supervisord. Deploy it as a standard Kubernetes Deployment with your ``plano_config.yaml`` mounted via a ConfigMap and API keys injected via a Secret.

.. note::
    All environment variables referenced in your ``plano_config.yaml`` (e.g. ``$OPENAI_API_KEY``) must be set in the container environment. Use Kubernetes Secrets for API keys.

Step 1: Create the Config
~~~~~~~~~~~~~~~~~~~~~~~~~

Store your ``plano_config.yaml`` in a ConfigMap:

.. code-block:: bash

    kubectl create configmap plano-config --from-file=plano_config.yaml=./plano_config.yaml

Step 2: Create API Key Secrets
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Store your LLM provider API keys in a Secret:

.. code-block:: bash

    kubectl create secret generic plano-secrets \
      --from-literal=OPENAI_API_KEY=sk-... \
      --from-literal=ANTHROPIC_API_KEY=sk-ant-...

Step 3: Deploy Plano
~~~~~~~~~~~~~~~~~~~~

Create a ``plano-deployment.yaml``:

.. code-block:: yaml

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: plano
      labels:
        app: plano
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: plano
      template:
        metadata:
          labels:
            app: plano
        spec:
          containers:
            - name: plano
              image: katanemo/plano:0.4.16
              ports:
                - containerPort: 12000 # LLM gateway (chat completions, model routing)
                  name: llm-gateway
              envFrom:
                - secretRef:
                    name: plano-secrets
              env:
                - name: LOG_LEVEL
                  value: "info"
              volumeMounts:
                - name: plano-config
                  mountPath: /app/plano_config.yaml
                  subPath: plano_config.yaml
                  readOnly: true
              readinessProbe:
                httpGet:
                  path: /healthz
                  port: 12000
                initialDelaySeconds: 5
                periodSeconds: 10
              livenessProbe:
                httpGet:
                  path: /healthz
                  port: 12000
                initialDelaySeconds: 10
                periodSeconds: 30
              resources:
                requests:
                  memory: "256Mi"
                  cpu: "250m"
                limits:
                  memory: "512Mi"
                  cpu: "1000m"
          volumes:
            - name: plano-config
              configMap:
                name: plano-config
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: plano
    spec:
      selector:
        app: plano
      ports:
        - name: llm-gateway
          port: 12000
          targetPort: 12000

Apply it:

.. code-block:: bash

    kubectl apply -f plano-deployment.yaml

Step 4: Verify
~~~~~~~~~~~~~~

.. code-block:: bash

    # Check pod status
    kubectl get pods -l app=plano

    # Check logs
    kubectl logs -l app=plano -f

    # Port-forward for local testing (this command blocks, so run it in a separate terminal)
    kubectl port-forward svc/plano 12000:12000

    # Test routing through the forwarded port
    curl -s -H "Content-Type: application/json" \
      -d '{"messages":[{"role":"user","content":"tell me a joke"}], "model":"none"}' \
      http://localhost:12000/v1/chat/completions | jq .model

Updating Configuration
~~~~~~~~~~~~~~~~~~~~~~

To update ``plano_config.yaml``, replace the ConfigMap and restart the pod. The restart is needed because the file is mounted with ``subPath``, and Kubernetes does not refresh ``subPath`` mounts when a ConfigMap changes:

.. code-block:: bash

    kubectl create configmap plano-config \
      --from-file=plano_config.yaml=./plano_config.yaml \
      --dry-run=client -o yaml | kubectl apply -f -

    kubectl rollout restart deployment/plano

Enabling OTEL Tracing
~~~~~~~~~~~~~~~~~~~~~

Plano emits OpenTelemetry traces for every request, including routing decisions, model selection, and upstream latency. To export traces to an OTEL collector in your cluster, add the ``tracing`` section to your ``plano_config.yaml``:

.. code-block:: yaml

    tracing:
      opentracing_grpc_endpoint: "http://otel-collector.monitoring:4317"
      random_sampling: 100 # percentage of requests to trace (1-100)
      trace_arch_internal: true # include internal Plano spans
      span_attributes:
        header_prefixes: # capture request headers as span attributes
          - "x-"
        static: # add static attributes to all spans
          environment: "production"
          service: "plano"

Set the ``OTEL_TRACING_GRPC_ENDPOINT`` environment variable or configure the endpoint directly in the config. Plano propagates the ``traceparent`` header end-to-end, so traces correlate across your upstream and downstream services.

Environment Variables Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following environment variables can be set on the container:

.. list-table::
   :header-rows: 1
   :widths: 30 50 20

   * - Variable
     - Description
     - Default
   * - ``LOG_LEVEL``
     - Log verbosity (``debug``, ``info``, ``warn``, ``error``)
     - ``info``
   * - ``OPENAI_API_KEY``
     - OpenAI API key (if referenced in config)
     -
   * - ``ANTHROPIC_API_KEY``
     - Anthropic API key (if referenced in config)
     -
   * - ``OTEL_TRACING_GRPC_ENDPOINT``
     - OTEL collector endpoint for trace export
     - ``http://localhost:4317``

Any environment variable referenced in ``plano_config.yaml`` with ``$VAR_NAME`` syntax is substituted at startup. Use Kubernetes Secrets for sensitive values and ConfigMaps or ``env`` entries for non-sensitive configuration.

Runtime Tests
-------------

Perform basic runtime tests to verify routing and functionality.

Gateway Smoke Test
~~~~~~~~~~~~~~~~~~

Test the chat completion endpoint with automatic routing:

.. code-block:: bash

    # Request handled by the gateway. 'model: "none"' lets Plano decide routing
    curl --silent --header 'Content-Type: application/json' \
      --data '{"messages":[{"role":"user","content":"tell me a joke"}], "model":"none"}' \
      http://localhost:12000/v1/chat/completions | jq .model

Expected output (the model name depends on your routing configuration):

.. code-block:: json

    "gpt-5.2"

Model-Based Routing
~~~~~~~~~~~~~~~~~~~

Test explicit provider and model routing:

.. code-block:: bash

    curl -s -H "Content-Type: application/json" \
      -d '{"messages":[{"role":"user","content":"Explain quantum computing"}], "model":"anthropic/claude-sonnet-4-5"}' \
      http://localhost:12000/v1/chat/completions | jq .model

Expected output:

.. code-block:: json

    "claude-sonnet-4-5"

Troubleshooting
---------------

Common Issues and Solutions
~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Environment Variables**
    Ensure all environment variables (``OPENAI_API_KEY``, ``ANTHROPIC_API_KEY``, etc.) used by ``plano_config.yaml`` are set before starting services.

**TLS/Connection Errors**
    If you encounter TLS or connection errors to upstream providers:

    - Check DNS resolution
    - Verify proxy settings
    - Confirm the protocol and port of each endpoint in your ``plano_config.yaml``

**Verbose Logging**
    To enable more detailed logs for debugging:

    - Run Plano with a more verbose log level (for example, ``LOG_LEVEL=debug``)
    - See the :ref:`Observability ` guide for logging and monitoring details
    - Rebuild the image with updated log configuration if required

**CI/Automated Checks**
    For continuous integration or automated testing, you can use the curl commands above as health checks in your deployment pipeline.
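As a rough illustration of the CI/automated check idea, the sketch below wraps the smoke-test request in a retry loop so a pipeline can wait for the gateway to come up. It assumes ``curl`` and ``jq`` are installed on the CI runner. The function names (``retry``, ``gateway_responds``) and the variables ``PLANO_URL``, ``MAX_ATTEMPTS``, and ``SLEEP_SECONDS`` are hypothetical names chosen for this example and are not part of Plano.

.. code-block:: bash

    #!/usr/bin/env bash
    # Hypothetical CI smoke check; every name below is illustrative, not part of Plano.
    PLANO_URL="${PLANO_URL:-http://localhost:12000}"  # gateway address used in this guide
    MAX_ATTEMPTS="${MAX_ATTEMPTS:-10}"                # assumed retry budget
    SLEEP_SECONDS="${SLEEP_SECONDS:-3}"               # assumed pause between attempts

    # Run the given command until it succeeds or MAX_ATTEMPTS is reached.
    retry() {
        local attempt=1
        while true; do
            if "$@"; then
                return 0
            fi
            if [ "$attempt" -ge "$MAX_ATTEMPTS" ]; then
                return 1
            fi
            attempt=$((attempt + 1))
            sleep "$SLEEP_SECONDS"
        done
    }

    # Succeeds only if the gateway returns JSON with a non-null "model" field.
    gateway_responds() {
        curl -sf -H 'Content-Type: application/json' \
            -d '{"messages":[{"role":"user","content":"ping"}],"model":"none"}' \
            "$PLANO_URL/v1/chat/completions" | jq -e '.model != null' > /dev/null
    }

In a pipeline step, you could source this file and then run ``retry gateway_responds || exit 1``, so the job fails if the gateway never answers within the retry budget.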