plano/demos/samples_python/weather_forecast/arch_config.yaml
Salman Paracha fb0581fd39
add support for v1/messages and transformations (#558)
* pushing draft PR

* transformations are working; tests are next

* updated tests and added the necessary response transformations for Anthropic's Messages response object

* fixed bugs for integration tests

* fixed doc tests

* fixed serialization issues with enums on response

* adding some debug logs to help

* fixed issues with non-streaming responses

* updated the stream_context to update response bytes

* the serialized bytes length must be set on the response side

* fixed the debug statement that was causing the integration tests for wasm to fail

* fixing json parsing errors

* intentionally removing the headers

* making sure that we convert the raw bytes to the correct provider type upstream

* fixing non-streaming responses to transform correctly

* /v1/messages works with transformations to and from /v1/chat/completions (see the sketch after this list)

* updating the CLI and demos to support anthropic vs. claude

* adding the anthropic key to the preference based routing tests

* fixed test cases and added more structured logs

* fixed integration tests and cleaned up logs

* added python client tests for anthropic and openai

* cleaned up logs and fixed issue with connectivity for llm gateway in weather forecast demo

* fixing the tests; the Python dependency order was broken

* updated the OpenAI client to fix demos

* removed the raw response debug statement

* fixed the duplicate-cloning issue and cleaned up the ProviderRequestType enum and traits

* fixing logs

* moved away from string literals to consts

* fixed streaming from Anthropic Client to OpenAI

* removed debug statement that would likely trip up integration tests

* fixed integration tests for llm_gateway

* cleaned up test cases and removed unnecessary crates

* addressing review comments from the PR

* fixed bug whereby we were sending an OpenAIChatCompletions request object to llm_gateway even though the request may have been AnthropicMessages
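
The mapping that the transformation bullets describe can be pictured with a short Python sketch. This is illustrative only: the actual transformation in this PR lives in the gateway itself, and the function below is hypothetical, but the field names follow the public Anthropic Messages and OpenAI Chat Completions request shapes.

def messages_to_chat_completions(body: dict) -> dict:
    """Rewrite an Anthropic /v1/messages request body as an
    OpenAI /v1/chat/completions request body (sketch, not gateway code)."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first chat message.
    if body.get("system"):
        messages.append({"role": "system", "content": body["system"]})
    for m in body.get("messages", []):
        content = m["content"]
        # Anthropic content may be a list of typed blocks; flatten the text blocks.
        if isinstance(content, list):
            content = "".join(
                block.get("text", "") for block in content if block.get("type") == "text"
            )
        messages.append({"role": m["role"], "content": content})
    out = {
        "model": body["model"],
        "messages": messages,
        # max_tokens is required by Anthropic and optional for OpenAI,
        # so it always carries over.
        "max_tokens": body.get("max_tokens"),
    }
    # Pass through the knobs that exist in both APIs.
    for key in ("temperature", "top_p", "stream"):
        if key in body:
            out[key] = body[key]
    return out

The response-side bullets above describe the inverse step: the provider's Chat Completions response (or stream chunks) is converted back into an Anthropic Messages response object before it reaches the client, with the serialized byte length reset on the response side.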

---------

Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-4.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-9.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-10.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-41.local>
Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-136.local>
2025-09-10 07:40:30 -07:00


version: v0.1.0

listeners:
  ingress_traffic:
    address: 0.0.0.0
    port: 10000
    message_format: openai
    timeout: 30s
  egress_traffic:
    address: 0.0.0.0
    port: 12000
    message_format: openai
    timeout: 30s

endpoints:
  weather_forecast_service:
    endpoint: host.docker.internal:18083
    connect_timeout: 0.005s

overrides:
  # confidence threshold for prompt target intent matching
  prompt_target_intent_matching_threshold: 0.6

llm_providers:
  - access_key: $GROQ_API_KEY
    model: groq/llama-3.2-3b-preview
  - access_key: $OPENAI_API_KEY
    model: openai/gpt-4o
    default: true
  - access_key: $OPENAI_API_KEY
    model: openai/gpt-4o-mini
  - access_key: $ANTHROPIC_API_KEY
    model: anthropic/claude-sonnet-4-20250514

system_prompt: |
  You are a helpful assistant.

prompt_guards:
  input_guards:
    jailbreak:
      on_exception:
        message: Looks like you're curious about my abilities, but I can only provide assistance for weather forecasting.

prompt_targets:
  - name: get_current_weather
    description: Get current weather at a location.
    parameters:
      - name: location
        description: The location to get the weather for
        required: true
        type: string
        format: City, State
      - name: days
        description: The number of days for the request
        required: true
        type: int
    endpoint:
      name: weather_forecast_service
      path: /weather
      http_method: POST
  - name: default_target
    default: true
    description: This is the default target for all unmatched prompts.
    endpoint:
      name: weather_forecast_service
      path: /default_target
      http_method: POST
    system_prompt: |
      You are a helpful assistant! Summarize the user's request and provide a helpful response.
    # if set to false, arch sends the response it received from this prompt target directly to the user;
    # if set to true, arch forwards the response to the default LLM
    auto_llm_dispatch_on_response: false

tracing:
  random_sampling: 100
  trace_arch_internal: true
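
A minimal way to exercise both routes against this config, in the spirit of the Python client tests mentioned in the commit, is to point the official SDKs at the gateway. This is a sketch, not the demo's actual test code: the base URLs assume the egress_traffic listener above (localhost:12000), the model names are taken from the llm_providers list, and the placeholder API key assumes the gateway substitutes the configured provider keys.

from anthropic import Anthropic
from openai import OpenAI

GATEWAY = "http://localhost:12000"  # egress_traffic listener from the config above (assumption)
QUESTION = "What is the weather in Seattle, WA for the next 3 days?"

# OpenAI-style client: exercises /v1/chat/completions on the gateway.
openai_client = OpenAI(base_url=f"{GATEWAY}/v1", api_key="n/a")
chat = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": QUESTION}],
)
print(chat.choices[0].message.content)

# Anthropic-style client: exercises the new /v1/messages route, which the
# gateway transforms to and from /v1/chat/completions as needed.
anthropic_client = Anthropic(base_url=GATEWAY, api_key="n/a")
msg = anthropic_client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=256,
    messages=[{"role": "user", "content": QUESTION}],
)
print(msg.content[0].text)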