<span id="llm-router"></span><h1>LLM Routing</h1>
<p>Large language models (LLMs) are proliferating rapidly, each optimized for different strengths, styles, or latency/cost profiles, and routing has become an essential technique for putting multiple models to work. Plano provides three routing approaches for different use cases: <a class="reference internal" href="#model-based-routing"><span class="std std-ref">Model-based routing</span></a>, <a class="reference internal" href="#alias-based-routing"><span class="std std-ref">Alias-based routing</span></a>, and <a class="reference internal" href="#preference-aligned-routing"><span class="std std-ref">Preference-aligned routing</span></a>. Matching each request with the most suitable model in your LLM fleet improves performance, cost efficiency, and response quality.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>For details on supported model providers, configuration options, and client libraries, see <a class="reference internal" href="../concepts/llm_providers/llm_providers.html#llm-providers"><span class="std std-ref">LLM Providers</span></a>.</p>
<p>Direct routing allows you to specify exact provider and model combinations using the format <code class="docutils literal notranslate"><span class="pre">provider/model-name</span></code>:</p>
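<p>As a rough illustration of the <code class="docutils literal notranslate"><span class="pre">provider/model-name</span></code> format, the sketch below splits such a string into its two parts. The helper <code class="docutils literal notranslate"><span class="pre">parse_model_ref</span></code>, the <code class="docutils literal notranslate"><span class="pre">ModelRef</span></code> type, and the example name <code class="docutils literal notranslate"><span class="pre">openai/gpt-4o</span></code> are assumptions made for this example and are not part of Plano's API. Splitting at the first slash is also an assumed convention, chosen so that model names containing further slashes stay intact.</p>

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    """Hypothetical container for the two halves of a provider/model string."""
    provider: str
    model: str


def parse_model_ref(value: str) -> ModelRef:
    """Split 'provider/model-name' at the first slash (assumed convention)."""
    provider, sep, model = value.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model-name', got {value!r}")
    return ModelRef(provider, model)


# Illustrative name only; use whatever your configuration defines.
ref = parse_model_ref("openai/gpt-4o")
print(ref.provider, ref.model)
```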
<h4>Configuration</h4>
<p>Configure your LLM providers with specific provider/model names:</p>
<h4>Client usage</h4>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><code><span id="line-1"><span class="c1"># Direct provider/model specification</span>
</span><span id="line-9"><span class="n">messages</span><span class="o">=</span><span class="p">[{</span><span class="s2">"role"</span><span class="p">:</span><span class="s2">"user"</span><span class="p">,</span><span class="s2">"content"</span><span class="p">:</span><span class="s2">"Write a story"</span><span class="p">}]</span>
<span id="id2"></span><h3>Alias-based routing</h3>
<p>Alias-based routing lets you create semantic model names that decouple your application from specific providers:</p>
<ul class="simple">
<li><p>Use meaningful names like <code class="docutils literal notranslate"><span class="pre">fast-model</span></code>, <code class="docutils literal notranslate"><span class="pre">reasoning-model</span></code>, or <code class="docutils literal notranslate"><span class="pre">plano.summarize.v1</span></code> (see <a class="reference internal" href="../concepts/llm_providers/model_aliases.html#model-aliases"><span class="std std-ref">Model Aliases</span></a>)</p></li>
<li><p>Map semantic names to underlying provider models for easier experimentation and provider switching</p></li>
<li><p>Ideal for applications that want abstraction from specific model names while maintaining control</p></li>
</ul>
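<p>The idea behind aliases can be sketched as a simple lookup from a semantic name to a concrete target. This is a minimal illustration, not Plano's implementation: the <code class="docutils literal notranslate"><span class="pre">ALIASES</span></code> table, the <code class="docutils literal notranslate"><span class="pre">resolve_model</span></code> helper, and the <code class="docutils literal notranslate"><span class="pre">example-provider/...</span></code> targets are all hypothetical.</p>

```python
# Hypothetical alias table: semantic name -> provider/model target.
ALIASES = {
    "fast-model": "example-provider/small-model",
    "reasoning-model": "example-provider/large-model",
}


def resolve_model(requested: str, aliases: dict[str, str]) -> str:
    """Return the alias target if one is defined, else the name unchanged."""
    return aliases.get(requested, requested)


# The application only ever refers to the alias.
print(resolve_model("fast-model", ALIASES))

# Switching providers means editing the table, not the application code.
ALIASES["fast-model"] = "other-provider/small-model"
print(resolve_model("fast-model", ALIASES))
```

<p>Names that are not aliases pass through unchanged in this sketch, so direct <code class="docutils literal notranslate"><span class="pre">provider/model-name</span></code> requests would keep working alongside aliases.</p>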
<section id="id3">
<h4>Configuration</h4>
<h4>Client usage</h4>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><code><span id="line-1"><span class="c1"># Using semantic aliases</span>
</span><span id="line-3"><span class="n">model</span><span class="o">=</span><span class="s2">"fast-model"</span><span class="p">,</span><span class="c1"># Routes to best available fast model</span>
</span><span id="line-8"><span class="n">model</span><span class="o">=</span><span class="s2">"reasoning-model"</span><span class="p">,</span><span class="c1"># Routes to best reasoning model</span>
</span><span id="line-9"><span class="n">messages</span><span class="o">=</span><span class="p">[{</span><span class="s2">"role"</span><span class="p">:</span><span class="s2">"user"</span><span class="p">,</span><span class="s2">"content"</span><span class="p">:</span><span class="s2">"Solve this complex problem"</span><span class="p">}]</span>
<span id="preference-aligned-routing"></span><h3>Preference-aligned routing (Arch-Router)</h3>
<p>Preference-aligned routing uses the <a class="reference external" href="https://huggingface.co/katanemo/Arch-Router-1.5B" rel="nofollow noopener">Arch-Router<svg fill="currentColor" height="1em" stroke="none" viewbox="0 96 960 960" width="1em" xmlns="http://www.w3.org/2000/svg"><path d="M188 868q-11-11-11-28t11-28l436-436H400q-17 0-28.5-11.5T360 336q0-17 11.5-28.5T400 296h320q17 0 28.5 11.5T760 336v320q0 17-11.5 28.5T720 696q-17 0-28.5-11.5T680 656V432L244 868q-11 11-28 11t-28-11Z"></path></svg></a> model to pick the best LLM based on domain, action, and your configured preferences instead of hard-coding a model.</p>
<ul class="simple">
<li><p><strong>Domain</strong>: High-level topic of the request (e.g., legal, healthcare, programming).</p></li>
<li><p><strong>Action</strong>: What the user wants to do (e.g., summarize, generate code, translate).</p></li>
<li><p><strong>Routing preferences</strong>: Your mapping from (domain, action) to preferred models.</p></li>
</ul>
<p>Arch-Router analyzes each prompt to infer domain and action, then applies your preferences to select a model. This decouples <strong>routing policy</strong> (how to choose) from <strong>model assignment</strong> (what to run), making routing transparent, controllable, and easy to extend as you add or swap models.</p>
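<p>The separation between routing policy and model assignment can be pictured with two independent data structures. The sketch below is illustrative only: the <code class="docutils literal notranslate"><span class="pre">Route</span></code> class, the route names, the <code class="docutils literal notranslate"><span class="pre">model_for</span></code> helper, and every <code class="docutils literal notranslate"><span class="pre">example-provider/...</span></code> model name are hypothetical and do not reflect how Plano stores its configuration.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical routing-policy entry: a name plus a usage description."""
    name: str
    description: str


# Routing policy: what each route means. The router matches prompts against these.
ROUTES = [
    Route("code generation",
          "generating new code snippets, functions, or boilerplate based on user prompts"),
    Route("complex reasoning",
          "deep analysis, mathematical problem solving, and logical reasoning"),
]

# Model assignment: which model serves each route (hypothetical model names).
ASSIGNMENT = {
    "code generation": "example-provider/code-model",
    "complex reasoning": "example-provider/reasoning-model",
}


def model_for(route_name: str, assignment: dict[str, str], default: str) -> str:
    """Look up the model assigned to a route, falling back to a default."""
    return assignment.get(route_name, default)


# Swapping a model touches only the assignment; the policy stays as it was.
updated = {**ASSIGNMENT, "code generation": "other-provider/new-code-model"}
print(model_for("code generation", ASSIGNMENT, "example-provider/default-model"))
print(model_for("code generation", updated, "example-provider/default-model"))
```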
<section id="id5">
<h4>Configuration</h4>
<p>To configure preference-aligned dynamic routing, define routing preferences that map domains and actions to specific models:</p>
</span><span id="line-19"><span class="w"></span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">deep analysis, mathematical problem solving, and logical reasoning</span>
</span><span id="line-27"><span class="w"></span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">generating new code snippets, functions, or boilerplate based on user prompts</span>
<h4>Client usage</h4>
<p>Clients can let the router decide or still specify aliases:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><code><span id="line-1"><span class="c1"># Let Arch-Router choose based on content</span>
</span><span id="line-3"><span class="n">messages</span><span class="o">=</span><span class="p">[{</span><span class="s2">"role"</span><span class="p">:</span><span class="s2">"user"</span><span class="p">,</span><span class="s2">"content"</span><span class="p">:</span><span class="s2">"Write a creative story about space exploration"</span><span class="p">}]</span>
<h2>Arch-Router</h2>
<p><a class="reference external" href="https://huggingface.co/katanemo/Arch-Router-1.5B" rel="nofollow noopener">Arch-Router<svg fill="currentColor" height="1em" stroke="none" viewbox="0 96 960 960" width="1em" xmlns="http://www.w3.org/2000/svg"><path d="M188 868q-11-11-11-28t11-28l436-436H400q-17 0-28.5-11.5T360 336q0-17 11.5-28.5T400 296h320q17 0 28.5 11.5T760 336v320q0 17-11.5 28.5T720 696q-17 0-28.5-11.5T680 656V432L244 868q-11 11-28 11t-28-11Z"></path></svg></a> is a <strong>preference-based routing model</strong> designed specifically to address the limitations of traditional LLM routing. This compact 1.5B-parameter model delivers production-ready performance, with low latency and high accuracy, while solving the key routing challenges described below.</p>
<p><strong>Addressing Traditional Routing Limitations:</strong></p>
<p><strong>Human Preference Alignment</strong>
Unlike benchmark-driven approaches, Arch-Router learns to match queries with human preferences by using domain-action mappings that capture subjective evaluation criteria, ensuring routing decisions align with real-world user needs.</p>
<p><strong>Flexible Model Integration</strong>
New models can be added for routing without retraining or architectural changes, so the system adapts as the model landscape evolves.</p>
<p><strong>Preference-Encoded Routing</strong>
Arch-Router provides a practical mechanism for encoding user preferences through domain-action mappings, making routing decisions transparent, controllable, and customizable for specific use cases.</p>
<p>To support effective routing, Arch-Router introduces two key concepts:</p>
<ul class="simple">
<li><p><strong>Domain</strong>: the high-level thematic category or subject matter of a request (e.g., legal, healthcare, programming).</p></li>
<li><p><strong>Action</strong>: the specific type of operation the user wants performed (e.g., summarization, code generation, appointment booking, translation).</p></li>
</ul>
<p>Both domain and action configs are associated with preferred models or model variants. At inference time, Arch-Router analyzes the incoming prompt to infer its domain and action using semantic similarity, task indicators, and contextual cues. It then applies the user-defined routing preferences to select the model best suited to handle the request.</p>
<p>In summary, Arch-Router demonstrates:</p>
<ul class="simple">
<li><p><strong>Structured Preference Routing</strong>: Aligns each request with model strengths using explicit domain–action mappings.</p></li>
<li><p><strong>Transparent and Controllable</strong>: Makes routing decisions transparent and configurable, empowering users to customize system behavior.</p></li>
<li><p><strong>Flexible and Adaptive</strong>: Supports evolving user needs, model updates, and new domains/actions without retraining the router.</p></li>
<li><p><strong>Production-Ready Performance</strong>: Optimized for low-latency, high-throughput applications in multi-model environments.</p></li>
</span><span id="line-10"><span class="w"></span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">deep analysis and complex problem solving</span>
<li><p><strong>Use direct model selection</strong>: <code class="docutils literal notranslate"><span class="pre">model="fast-model"</span></code></p></li>
<li><p><strong>Let the router decide</strong>: No model specified, router analyzes content</p></li>
<h2>Example Use Cases</h2>
<p>Here are common scenarios where Arch-Router excels:</p>
<ul class="simple">
<li><p><strong>Coding Tasks</strong>: Distinguish between code generation requests (“write a Python function”), debugging needs (“fix this error”), and code optimization (“make this faster”), routing each to appropriately specialized models.</p></li>
<li><p><strong>Content Processing Workflows</strong>: Classify requests as summarization (“summarize this document”), translation (“translate to Spanish”), or analysis (“what are the key themes”), enabling targeted model selection.</p></li>
<li><p><strong>Multi-Domain Applications</strong>: Accurately identify whether requests fall into legal, healthcare, technical, or general domains, even when the subject matter isn’t explicitly stated in the prompt.</p></li>
<li><p><strong>Conversational Routing</strong>: Track conversation context to identify when topics shift between domains or when the type of assistance needed changes mid-conversation.</p></li>
<li><p><strong>💡 Clear Usage Description:</strong> Make your route names and descriptions specific and unambiguous, and minimize overlap between routes. The router performs better when it can clearly distinguish between different types of requests.</p></li>
<li><p><strong>💡 Noun Descriptors:</strong> Preference-based routers perform better with noun-centric descriptors, which offer more stable and semantically rich signals for matching.</p></li>
<li><p><strong>💡 Domain Inclusion:</strong> For the best user experience, always include a domain route. This lets the router fall back to the domain when the action cannot be inferred confidently.</p></li>
</ul>
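<p>The domain fallback mentioned in the last tip can be sketched as a lookup chain: try the exact (domain, action) pair, then the domain-only route, then a default. This is a simplified illustration under stated assumptions, not a description of Arch-Router's internals. The <code class="docutils literal notranslate"><span class="pre">select_model</span></code> function, the use of <code class="docutils literal notranslate"><span class="pre">None</span></code> to mark a domain-only route, and all model names are hypothetical.</p>

```python
from typing import Optional

# Hypothetical preference table keyed by (domain, action).
# An action of None marks a domain-only route.
PREFERENCES: dict[tuple[str, Optional[str]], str] = {
    ("programming", "code generation"): "example-provider/code-model",
    ("programming", None): "example-provider/general-code-model",
    ("legal", None): "example-provider/legal-model",
}


def select_model(
    domain: Optional[str],
    action: Optional[str],
    preferences: dict[tuple[str, Optional[str]], str],
    default: str,
) -> str:
    """Pick a model: exact match first, then the domain route, then the default."""
    if domain is not None:
        if action is not None and (domain, action) in preferences:
            return preferences[(domain, action)]
        if (domain, None) in preferences:
            return preferences[(domain, None)]
    return default


DEFAULT = "example-provider/default-model"
# Action inferred: the exact route wins.
print(select_model("programming", "code generation", PREFERENCES, DEFAULT))
# Action not inferred: the domain route catches the request.
print(select_model("programming", None, PREFERENCES, DEFAULT))
# Nothing inferred: the default model is used.
print(select_model(None, None, PREFERENCES, DEFAULT))
```

<p>Without the domain-only entries, the second request in this sketch would drop straight to the default model, which is why a domain route is worth defining.</p>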
</section>
<section id="unsupported-features">
<h2>Unsupported Features</h2>
<p>The following features are <strong>not supported</strong> by the Arch-Router model:</p>
<ul class="simple">
<li><p><strong>Multi-modality</strong>: The model is not trained to process raw image or audio inputs. It can handle textual queries <em>about</em> these modalities (e.g., “generate an image of a cat”), but cannot interpret encoded multimedia data directly.</p></li>
<li><p><strong>Function calling</strong>: Arch-Router is designed for <strong>semantic preference matching</strong>, not exact intent classification or tool execution. For structured function invocation, use models in the Plano Function Calling collection instead.</p></li>
<li><p><strong>System prompt dependency</strong>: Arch-Router routes based solely on the user’s conversation history. It does not use or rely on system prompts for routing decisions.</p></li>