Configuration Reference

The following is a complete reference for the prompt-config.yml file that controls the behavior of a single instance of the Arch gateway. We've kept things simple (under 80 lines) and held off on exposing additional functionality (e.g. supporting push-based observability stats, managing prompt endpoints as virtual clusters, exposing more load-balancing options, etc.). Our belief is that simple things should be simple, so we offer good defaults that let developers spend more of their time building the features unique to their AI experience.

    version: "0.1-beta"

    listener:
      address: 0.0.0.0 # or 127.0.0.1
      port: 10000
      # Defines how Arch should parse the content from application/json or text/plain Content-type in the http request
      message_format: huggingface
      common_tls_context: # If you configure port 443, you'll need to update the listener with your TLS certificates
        tls_certificates:
          - certificate_chain:
              filename: "/etc/certs/cert.pem"
            private_key:
              filename: "/etc/certs/key.pem"

    # Arch performs round-robin load balancing between endpoints, managed via the cluster subsystem.
    endpoints:
      app_server:
        # value could be an IP address or a hostname with port
        # this could also be a list of endpoints for load balancing
        # for example endpoint: [ ip1:port, ip2:port ]
        endpoint: "127.0.0.1:80"
        # max time to wait for a connection to be established
        connect_timeout: 0.005s

      mistral_local:
        endpoint: "127.0.0.1:8001"

      error_target:
        endpoint: "error_target_1"

    # Centralized way to manage LLMs: keys, retry logic, failover, and limits
    llm_providers:
      - name: "OpenAI"
        provider: "openai"
        access_key: $OPENAI_API_KEY
        model: gpt-4o
        default: true
        stream: true
        rate_limits:
          selector: # optional; rate-limit based on HTTP headers like JWT tokens or API keys
            http_header:
              name: "Authorization"
              value: "" # Empty value means each separate value has a separate limit
          limit:
            tokens: 100000 # Tokens per unit
            unit: "minute"

      - name: "Mistral8x7b"
        provider: "mistral"
        access_key: $MISTRAL_API_KEY
        model: "mistral-8x7b"

      - name: "MistralLocal7b"
        provider: "local"
        model: "mistral-7b-instruct"
        endpoint: "mistral_local"

    # Provides a way to override default settings for the Arch system
    overrides:
      # By default Arch uses an NLI + embedding approach to match an incoming prompt to a prompt target.
      # The intent-matching threshold defaults to 0.80; you can override it if you would like
      prompt_target_intent_matching_threshold: 0.60

    # default system prompt used by all prompt targets
    system_prompt: |
      You are a network assistant that just offers facts; not advice on manufacturers or purchasing decisions.

    prompt_guards:
      input_guards:
        jailbreak:
          on_exception:
            message: "Looks like you're curious about my abilities, but I can only provide assistance within my programmed parameters."

    prompt_targets:
      - name: "reboot_network_device"
        description: "Helps network operators perform device operations like rebooting a device."
        endpoint:
          name: app_server
          path: "/agent/action"
        parameters:
          - name: "device_id"
            # additional type options include: int | float | bool | string | list | dict
            type: "string"
            description: "Identifier of the network device to reboot."
            required: true
          - name: "confirmation"
            type: "string"
            description: "Confirmation flag to proceed with reboot."
            default: "no"
            enum: ["yes", "no"] # quoted so YAML does not parse these as booleans

      - name: "information_extraction"
        default: true
        description: "Handles question-and-answer scenarios such as summarization and information extraction."
        endpoint:
          name: app_server
          path: "/agent/summary"
        # Arch uses the default LLM and treats the response from the endpoint as the prompt to send to the LLM
        auto_llm_dispatch_on_response: true
        # override system prompt for this prompt target
        system_prompt: |
          You are a helpful information extraction assistant. Use the information that is provided to you.

    error_target:
      endpoint:
        name: error_target_1
        path: /error

    tracing: 100 # sampling rate; by default Arch emits OpenTelemetry-compatible traces
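For illustration, the application code behind the `/agent/action` endpoint of the `reboot_network_device` prompt target could validate the parameters Arch extracts against the schema declared above. This is a minimal framework-free sketch; the function name and return shape are assumptions, not part of Arch:

```python
# Hypothetical validator for the "reboot_network_device" prompt target.
# Arch forwards the extracted parameters to the configured endpoint; this
# checks them against the schema declared in the config: device_id is a
# required string, confirmation is optional, defaults to "no", and is
# restricted to the declared enum.

def validate_reboot_params(params: dict) -> dict:
    device_id = params.get("device_id")
    if not isinstance(device_id, str) or not device_id:
        raise ValueError("device_id is required and must be a non-empty string")

    confirmation = params.get("confirmation", "no")
    if confirmation not in ("yes", "no"):
        raise ValueError('confirmation must be "yes" or "no"')

    return {"device_id": device_id, "confirmation": confirmation}
```

A handler would typically call this before dispatching the reboot, returning an HTTP 400 to Arch when a `ValueError` is raised.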