This quickstart assumes basic familiarity with agents and prompt targets from the Concepts section. For background, see :ref:`Agents <agents>` and :ref:`Prompt Target <prompt_target>`.
The full agent and backend API implementations used here are available in the `plano-quickstart repository <https://github.com/plano-ai/plano-quickstart>`_. This guide focuses on wiring and configuring Plano (orchestration, prompt targets, and the model proxy), not application code.
Plano is driven by a configuration file in which you define LLM providers, prompt targets, guardrails, and more. Below is an example configuration that defines OpenAI and Anthropic LLM providers.
Create a ``plano_config.yaml`` file with the following content:
.. code-block:: yaml

   version: v0.3.0

   listeners:
     - type: model
       name: model_1
       address: 0.0.0.0
       port: 12000

   model_providers:
     - access_key: $OPENAI_API_KEY
       model: openai/gpt-4o
       default: true
     - access_key: $ANTHROPIC_API_KEY
       model: anthropic/claude-sonnet-4-5
Step 2. Start Plano
~~~~~~~~~~~~~~~~~~~
Once the config file is created, ensure that the ``OPENAI_API_KEY`` and ``ANTHROPIC_API_KEY`` environment variables are set (or defined in a ``.env`` file).
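For example, in a shell session (placeholder values shown for illustration; substitute your real provider keys):

.. code-block:: shell

   # Placeholder keys for illustration only; substitute your real keys.
   export OPENAI_API_KEY="sk-your-openai-key"
   export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"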
When the requested model is not found in the configuration, Plano falls back to the default model among the configured providers. In this example, we send ``"model": "none"`` and Plano selects the default model, ``openai/gpt-4o``.
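For illustration, the request body for that case might look like the following (OpenAI-compatible chat completions schema; ``"model": "none"`` is simply a name that matches no configured provider, so the default applies):

.. code-block:: python

   import json

   # Body you would POST to the listener's /v1/chat/completions endpoint.
   body = {
       "model": "none",  # not in plano_config.yaml, so Plano picks the default
       "messages": [{"role": "user", "content": "What is the capital of France?"}],
   }

   payload = json.dumps(body)
   print(payload)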
Step 3.2: Using OpenAI Python client
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Make outbound calls via the Plano gateway:
.. code-block:: python

   from openai import OpenAI

   # Use the OpenAI client as usual
   client = OpenAI(
       # No need to set a real API key since providers are configured in Plano's gateway
       api_key="--",
       # Point the client at the Plano gateway endpoint
       base_url="http://127.0.0.1:12000/v1",
   )

   response = client.chat.completions.create(
       # The model is selected from the plano_config file
       model="--",
       messages=[{"role": "user", "content": "What is the capital of France?"}],
   )

   print(response.choices[0].message.content)
Agents are where your business logic lives (the "inner loop"). Plano takes care of the "outer loop"—routing, sequencing, and managing calls across agents and LLMs.
1. **Implement your agent** in your framework of choice (Python, JS/TS, etc.), exposing it as an HTTP service.
2. **Route LLM calls through Plano's Model Proxy**, so all models share a consistent interface and observability.
3. **Configure Plano to orchestrate**: define which agent(s) can handle which kinds of prompts, and let Plano decide when to call an agent vs. an LLM.
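To make step 1 concrete, here is a minimal, hypothetical agent exposed as an HTTP service using only the Python standard library. The request/response shape (OpenAI-style ``messages`` in, ``choices`` out) and the port are assumptions for illustration, not Plano's required contract:

.. code-block:: python

   import json
   from http.server import BaseHTTPRequestHandler, HTTPServer

   def handle_prompt(messages):
       # Business logic lives here (the "inner loop"); this toy agent
       # just echoes the last user turn.
       last = messages[-1]["content"]
       return {"role": "assistant", "content": f"flight_agent received: {last}"}

   class AgentHandler(BaseHTTPRequestHandler):
       def do_POST(self):
           body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
           reply = handle_prompt(body.get("messages", []))
           payload = json.dumps({"choices": [{"message": reply}]}).encode()
           self.send_response(200)
           self.send_header("Content-Type", "application/json")
           self.end_headers()
           self.wfile.write(payload)

   def main(port=8080):
       # Plano's orchestrator would then be pointed at http://localhost:8080.
       HTTPServer(("0.0.0.0", port), AgentHandler).serve_forever()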
This quickstart uses a simplified version of the Travel Booking Assistant; for the full multi-agent walkthrough, see :ref:`Orchestration <agent_routing>`.
Step 1. Minimal orchestration config
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here is a minimal configuration that wires Plano-Orchestrator to two HTTP services: one for flights and one for hotels.
.. code-block:: yaml

   agents:
     - id: flight_agent
       description: Search for flights and provide flight status.
     - id: hotel_agent
       description: Find hotels and check availability.

   tracing:
     random_sampling: 100
Step 2. Start your agents and Plano
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Run your ``flight_agent`` and ``hotel_agent`` services (see :ref:`Orchestration <agent_routing>` for a full Travel Booking example), then start Plano with the config above:
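Assuming the ``planoai`` CLI reads ``plano_config.yaml`` from the current directory (as with the ``planoai up`` invocation shown later in this guide), starting Plano could look like:

.. code-block:: console

   $ planoai up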
Plano will start the orchestrator and expose an agent listener on port ``8001``.
Step 3. Send a prompt and let Plano route
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Now send a request to Plano using the OpenAI-compatible chat completions API—the orchestrator will analyze the prompt and route it to the right agent based on intent:
.. code-block:: console

   $ curl --header 'Content-Type: application/json' \
     --data '{"messages": [{"role": "user","content": "Find me flights from SFO to JFK tomorrow"}], "model": "openai/gpt-4o"}' \
     http://localhost:8001/v1/chat/completions
You can then ask a follow-up like "Also book me a hotel near JFK" and Plano-Orchestrator will route to ``hotel_agent``—your agents stay focused on business logic while Plano handles routing.
.. _quickstart_prompt_targets:
Deterministic API calls with prompt targets
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Next, we'll show Plano's deterministic API calling using a single prompt target. We'll build a currency exchange backend powered by the Frankfurter API (``https://api.frankfurter.dev``), assuming USD as the base currency.
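As a sketch of the backend's core logic, the following assumes a Frankfurter-style "latest rates" endpoint and response shape (``{"base": ..., "rates": {...}}``); the ``/v1/latest`` path and field names are assumptions for illustration, not taken from this guide:

.. code-block:: python

   from urllib.parse import urlencode

   API_BASE = "https://api.frankfurter.dev/v1/latest"  # assumed endpoint path

   def rates_url(symbols, base="USD"):
       # Build the query for a base-USD rate lookup.
       return f"{API_BASE}?{urlencode({'base': base, 'symbols': ','.join(symbols)})}"

   def convert(amount_usd, currency, payload):
       # payload is the parsed JSON body returned by the rates endpoint.
       return round(amount_usd * payload["rates"][currency], 5)

   sample = {"base": "USD", "rates": {"GBP": 0.78558}}
   print(rates_url(["GBP"]))
   print(convert(1, "GBP", sample))  # 0.78558, matching the sample response below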
Step 1. Create plano config file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Create a ``plano_config.yaml`` file with the following content:
"As of the date provided in your context, December 5, 2024, the exchange rate for GBP (British Pound) from USD (United States Dollar) is 0.78558. This means that 1 USD is equivalent to 0.78558 GBP."
"Here is a list of the currencies that are supported for conversion from USD, along with their symbols:\n\n1. AUD - Australian Dollar\n2. BGN - Bulgarian Lev\n3. BRL - Brazilian Real\n4. CAD - Canadian Dollar\n5. CHF - Swiss Franc\n6. CNY - Chinese Renminbi Yuan\n7. CZK - Czech Koruna\n8. DKK - Danish Krone\n9. EUR - Euro\n10. GBP - British Pound\n11. HKD - Hong Kong Dollar\n12. HUF - Hungarian Forint\n13. IDR - Indonesian Rupiah\n14. ILS - Israeli New Sheqel\n15. INR - Indian Rupee\n16. ISK - Icelandic Króna\n17. JPY - Japanese Yen\n18. KRW - South Korean Won\n19. MXN - Mexican Peso\n20. MYR - Malaysian Ringgit\n21. NOK - Norwegian Krone\n22. NZD - New Zealand Dollar\n23. PHP - Philippine Peso\n24. PLN - Polish Złoty\n25. RON - Romanian Leu\n26. SEK - Swedish Krona\n27. SGD - Singapore Dollar\n28. THB - Thai Baht\n29. TRY - Turkish Lira\n30. USD - United States Dollar\n31. ZAR - South African Rand\n\nIf you want to convert USD to any of these currencies, you can select the one you are interested in."
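A hypothetical guard the backend might apply before calling the rates API, using the supported set from the sample response above (``is_supported`` is an illustrative helper, not part of Plano):

.. code-block:: python

   # Supported target currencies, taken from the sample response above.
   SUPPORTED = {
       "AUD", "BGN", "BRL", "CAD", "CHF", "CNY", "CZK", "DKK", "EUR", "GBP",
       "HKD", "HUF", "IDR", "ILS", "INR", "ISK", "JPY", "KRW", "MXN", "MYR",
       "NOK", "NZD", "PHP", "PLN", "RON", "SEK", "SGD", "THB", "TRY", "USD",
       "ZAR",
   }

   def is_supported(code):
       # Reject symbols outside the supported set before hitting the API.
       return code.upper() in SUPPORTED

   print(is_supported("gbp"))  # True
   print(is_supported("XYZ"))  # False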
Plano ships two CLI tools for visibility into LLM traffic. Both consume the same OTLP/gRPC span stream from brightstaff; they just slice it differently — use whichever (or both) fits the question you're answering.
.. list-table::
   :header-rows: 1

   * - Command
     - When to use it
     - What it shows
   * - ``planoai obs``
     - Live view while you drive traffic
     - Per-request rows + aggregates: tokens (prompt / completion / cached / cache-creation / reasoning), TTFT, latency, cost, session id, route name, totals by model
   * - ``planoai trace``
     - Deep-dive into a single request after the fact
     - Full span tree for a trace id: brightstaff → routing → upstream LLM, attributes on every span, status codes, errors
Both require brightstaff to be exporting spans. If you're running the zero-config path (``planoai up`` with no config file), tracing is auto-wired to ``http://localhost:4317``. If you have your own ``plano_config.yaml``, add:
.. code-block:: yaml

   tracing:
     random_sampling: 100
     opentracing_grpc_endpoint: http://localhost:4317
Live console — ``planoai obs``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: console

   $ planoai obs
   # In another terminal:
   $ planoai up
Cost is populated automatically from DigitalOcean's public pricing catalog — no signup or token required.
With no API keys set, every provider runs in pass-through mode — supply the ``Authorization`` header yourself on each request:
.. code-block:: console

   $ curl http://localhost:12000/v1/chat/completions \
     --header 'Content-Type: application/json' \
     --header "Authorization: Bearer $OPENAI_API_KEY" \
     --data '{"model":"openai/gpt-4o",
       "messages":[{"role":"user","content":"write code to print prime numbers in python"}],
       "stream":false}'
When you export ``OPENAI_API_KEY`` / ``ANTHROPIC_API_KEY`` / ``DO_API_KEY`` / etc. before ``planoai up``, Plano picks them up and clients no longer need to send ``Authorization``.
Press ``Ctrl-C`` in the obs terminal to exit. Data lives in memory only — nothing is persisted to disk.
Single-request traces — ``planoai trace``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When you need to understand what happened on one specific request (which model was picked, how long each hop took, what an upstream returned), use ``trace``:
.. code-block:: console

   $ planoai trace listen       # start the OTLP listener (daemon)
   # drive some traffic through localhost:12000 ...
   $ planoai trace              # show the most recent trace
   $ planoai trace <trace-id>   # show a specific trace by id
   $ planoai trace --list       # list the last 50 trace ids
Use ``obs`` to spot that p95 latency spiked for ``openai-gpt-5.4``; switch to ``trace`` on one of those slow request ids to see which hop burned the time.
Congratulations! You've successfully set up Plano and made your first prompt-based request. To further enhance your GenAI applications, explore the following resources:
With Plano, building scalable, fast, and personalized GenAI applications has never been easier. Dive deeper into Plano's capabilities and start creating innovative AI-driven experiences today!