# plano/envoyfilter/katanemo-config.yaml
---
katanemo-prompt-config:
  default-prompt-endpoint: "127.0.0.1"
  load-balancing: "round-robin"
  timeout-ms: 5000

  # Embedding model used for prompt-target matching.
  embedding-provider:
    name: "SentenceTransformer"
    model: "all-MiniLM-L6-v2"

  # Upstream LLM providers. API key is resolved from the environment.
  llm-providers:
    - name: "open-ai-gpt-4"
      api-key: "$OPEN_AI_API_KEY"
      model: "gpt-4"

  # NOTE(review): original file had flattened indentation; this prompt is
  # placed at the top level rather than inside the provider entry — confirm
  # against the consumer's schema.
  system-prompt: |
    You are a helpful weather forecaster. Please follow the following guidelines when responding to user queries:
    - Use Fahrenheit for temperature
    - Use miles per hour for wind speed

  prompt-targets:
    - type: context-resolver
      name: weather-forecast
      few-shot-examples:
        - what is the weather in New York?
      endpoint: "POST:$WEATHER_FORECAST_API_ENDPOINT"
      cache-response: true
      # Kept as a sequence of single-key maps to match the original layout;
      # a plain mapping may be what the consumer expects — verify schema.
      cache-response-settings:
        - cache-ttl-secs: 3600  # cache expiry in seconds
        - cache-max-size: 1000  # in number of items
        - cache-eviction-strategy: LRU