mirror of https://github.com/katanemo/plano.git
synced 2026-05-05 13:53:03 +02:00

Update docs to Plano (#639)

This commit is contained in:
parent 15fbb6c3af
commit e224cba3e3
139 changed files with 4407 additions and 24735 deletions
.. _llm_providers:

Model (LLM) Providers
=====================

**Model Providers** are a top-level primitive in Plano, helping developers centrally define, secure, observe, and manage the usage of their models. Plano builds on Envoy's reliable `cluster subsystem <https://www.envoyproxy.io/docs/envoy/v1.31.2/intro/arch_overview/upstream/cluster_manager>`_ to manage egress traffic to models, including intelligent routing and retry and fail-over mechanisms that ensure high availability and fault tolerance. This abstraction also lets developers seamlessly switch between model providers or upgrade model versions, simplifying the integration and scaling of models across applications.

Today, we are enabling you to connect to 15+ different AI providers through a unified interface with advanced routing and management capabilities. Whether you're using OpenAI, Anthropic, Azure OpenAI, local Ollama models, or any OpenAI-compatible provider, Plano provides seamless integration with enterprise-grade features.
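To make the "unified interface" concrete, here is a minimal sketch showing that switching providers is just a different ``model`` string in the same OpenAI-style request body. The endpoint path and model names below are assumptions for illustration, not values taken from this page.

```python
import json

# Assumed OpenAI-compatible path exposed by the gateway (illustrative only).
GATEWAY_ENDPOINT = "/v1/chat/completions"

def chat_request(model: str, prompt: str) -> str:
    """Build the JSON body a client would POST to the gateway."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })

# The payload shape is identical regardless of which provider serves it.
openai_body = chat_request("gpt-4o-mini", "Hello!")  # hypothetical hosted model
ollama_body = chat_request("llama3.2", "Hello!")     # hypothetical local Ollama model
```

Because every backend is reached through the same body shape, client code does not change when a request is re-pointed at a different provider.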

.. note::
   Please refer to the quickstart guide :ref:`here <llm_routing_quickstart>` to configure and use LLM providers via common client libraries like the OpenAI and Anthropic Python SDKs, or via direct HTTP/cURL requests.

Core Capabilities
-----------------
**Multi-Provider Support**
Connect to any combination of providers simultaneously (see :ref:`supported_providers` for full details):

- **First-Class Providers**: Native integrations with OpenAI, Anthropic, DeepSeek, Mistral, Groq, Google Gemini, Together AI, xAI, Azure OpenAI, and Ollama
- **OpenAI-Compatible Providers**: Any provider implementing the OpenAI Chat Completions API standard
**Intelligent Routing**
Three powerful routing approaches to optimize model selection:

- **Model-based Routing**: Direct routing to specific models using provider/model names (see :ref:`supported_providers`)
- **Alias-based Routing**: Semantic routing using custom aliases (see :ref:`model_aliases`)
- **Preference-aligned Routing**: Intelligent routing using the Plano-Router model (see :ref:`preference_aligned_routing`)
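From the caller's perspective, the difference between the three approaches is simply what the request's ``model`` field carries. The toy resolver below (hypothetical names, not Plano's actual logic) illustrates that distinction: an explicit provider/model pair routes directly, a known alias is mapped to its target, and anything else would be left to the router model.

```python
# Hypothetical alias table; real aliases would come from your gateway config.
ALIASES = {"fast": "groq/llama-3.1-8b-instant", "smart": "openai/gpt-4o"}

def resolve(model_field: str) -> str:
    """Toy illustration of the three routing styles (not the real implementation)."""
    if "/" in model_field:          # model-based: explicit provider/model name
        return model_field
    if model_field in ALIASES:      # alias-based: semantic alias -> target model
        return ALIASES[model_field]
    # preference-aligned: selection is delegated to a router model instead
    raise LookupError(f"{model_field!r} would be resolved by the router model")
```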
**Unified Client Interface**
Use your preferred client library without changing existing code (see :ref:`client_libraries` for details):

- **OpenAI Python SDK**: Full compatibility with all providers
- **Anthropic Python SDK**: Native support with cross-provider capabilities
- **cURL & HTTP Clients**: Direct REST API access for any programming language
- **Custom Integrations**: Standard HTTP interfaces for seamless integration
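For the direct HTTP path, any language's standard HTTP client is enough. The Python-stdlib sketch below only constructs the request (the host, port, and model name are assumptions); actually sending it requires a running gateway, so the final ``urlopen`` call is left commented out.

```python
import json
import urllib.request

# Assumed local gateway address; substitute your actual listener.
req = urllib.request.Request(
    "http://localhost:12000/v1/chat/completions",
    data=json.dumps({
        "model": "gpt-4o-mini",  # hypothetical model name
        "messages": [{"role": "user", "content": "Hello!"}],
    }).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# with urllib.request.urlopen(req) as resp:   # requires a running gateway
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The equivalent cURL invocation would POST the same JSON body to the same path, which is why custom integrations in any language reduce to a standard HTTP call.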
Key Benefits
------------

- **Provider Flexibility**: Switch between providers without changing client code
- **Three Routing Methods**: Choose from model-based, alias-based, or preference-aligned routing strategies (the latter using `Plano-Router-1.5B <https://huggingface.co/katanemo/Plano-Router-1.5B>`_)
- **Cost Optimization**: Route requests to cost-effective models based on complexity
- **Performance Optimization**: Use fast models for simple tasks, powerful models for complex reasoning
- **Environment Management**: Configure different models for different environments