Implement flexible HTTP agent adapter with request templates and connection guides

- Add request_template, response_path, method, query_params, and parse_structured_input to AgentConfig
- Implement structured input parser for key-value extraction from golden prompts
- Implement template engine with variable substitution for {prompt} and {field_name}
- Implement response extractor supporting JSONPath and dot notation
- Update HTTPAgentAdapter to support all HTTP methods (GET, POST, PUT, PATCH, DELETE)
- Add comprehensive connection guide explaining localhost vs public endpoints
- Update documentation with examples for TypeScript/JavaScript developers
- Add tests for all new features

2026-04-25 00:36:54 +02:00 · 2025-12-31 23:04:47 +08:00 · 2025-12-31 23:04:47 +08:00 · 859566ee59
commit 859566ee59
parent 050204ef42
10 changed files with 1839 additions and 31 deletions
--- a/docs/CONFIGURATION_GUIDE.md
+++ b/docs/CONFIGURATION_GUIDE.md
@@ -47,6 +47,10 @@ Define how flakestorm connects to your AI agent.

 ### HTTP Agent

+FlakeStorm's HTTP adapter can be pointed at most JSON endpoints: `request_template` shapes the outgoing request, and `response_path` selects the agent's reply from the response body.
+
+#### Basic Configuration
+
 ```yaml
 agent:
  endpoint: "http://localhost:8000/invoke"
@@ -57,7 +61,7 @@ agent:
    Content-Type: "application/json"
 ```

-**Expected API Format:**
+**Default Format (if no template specified):**

 Request:
 ```json
@@ -70,6 +74,126 @@ Response:
 {"output": "agent response text"}
 ```

+#### Custom Request Template
+
+Map your endpoint's exact format using `request_template`:
+
+```yaml
+agent:
+  endpoint: "http://localhost:8000/api/chat"
+  type: "http"
+  method: "POST"
+  request_template: |
+    {"message": "{prompt}", "stream": false}
+  response_path: "$.reply"
+```
+
+**Template Variables:**
+- `{prompt}` - Full golden prompt text
+- `{field_name}` - Parsed structured input fields (see Structured Input below)
+
+#### Structured Input Parsing
+
+For agents that accept structured input (such as a Reddit search query generator):
+
+```yaml
+agent:
+  endpoint: "http://localhost:8000/generate-query"
+  type: "http"
+  method: "POST"
+  request_template: |
+    {
+      "industry": "{industry}",
+      "productName": "{productName}",
+      "businessModel": "{businessModel}",
+      "targetMarket": "{targetMarket}",
+      "description": "{description}"
+    }
+  response_path: "$.query"
+  parse_structured_input: true  # Default: true
+```
+
+**Golden Prompt Format:**
+```yaml
+golden_prompts:
+  - |
+    Industry: Fitness tech
+    Product/Service: AI personal trainer app
+    Business Model: B2C
+    Target Market: fitness enthusiasts
+    Description: An app that provides personalized workout plans
+```
+
+FlakeStorm will automatically parse this and map fields to your template.
+
+#### HTTP Methods
+
+The adapter supports `GET`, `POST`, `PUT`, `PATCH`, and `DELETE`. For example:
+
+**GET Request:**
+```yaml
+agent:
+  endpoint: "http://api.example.com/search"
+  type: "http"
+  method: "GET"
+  request_template: "q={prompt}"
+  query_params:
+    api_key: "${API_KEY}"
+    format: "json"
+```
+
+**PUT Request:**
+```yaml
+agent:
+  endpoint: "http://api.example.com/update"
+  type: "http"
+  method: "PUT"
+  request_template: |
+    {"id": "123", "content": "{prompt}"}
+```
+
+#### Response Path Extraction
+
+Extract responses from complex JSON structures:
+
+```yaml
+agent:
+  endpoint: "http://api.example.com/chat"
+  type: "http"
+  response_path: "$.choices[0].message.content"  # JSONPath
+  # OR, using dot notation (set only one response_path per agent):
+  # response_path: "data.result"
+```
+
+**Supported Formats:**
+- JSONPath: `"$.data.result"`, `"$.choices[0].message.content"`
+- Dot notation: `"data.result"`, `"response.text"`
+- Simple key: `"output"`, `"response"`
+
+#### Complete Example
+
+```yaml
+agent:
+  endpoint: "http://localhost:8000/api/v1/agent"
+  type: "http"
+  method: "POST"
+  timeout: 30000
+  headers:
+    Authorization: "Bearer ${API_KEY}"
+    Content-Type: "application/json"
+  request_template: |
+    {
+      "messages": [
+        {"role": "user", "content": "{prompt}"}
+      ],
+      "temperature": 0.7
+    }
+  response_path: "$.choices[0].message.content"
+  query_params:
+    version: "v1"
+  parse_structured_input: true
+```
+
 ### Python Agent

 ```yaml
@@ -109,6 +233,11 @@ chain: Runnable = ...  # Your LangChain chain
 |--------|------|---------|-------------|
 | `endpoint` | string | required | URL or module path |
 | `type` | string | `"http"` | `http`, `python`, or `langchain` |
+| `method` | string | `"POST"` | HTTP method: `GET`, `POST`, `PUT`, `PATCH`, `DELETE` |
+| `request_template` | string | `null` | Template for request body/query with `{prompt}` or `{field_name}` variables |
+| `response_path` | string | `null` | JSONPath or dot notation to extract response (e.g., `"$.data.result"`) |
+| `query_params` | object | `{}` | Static query parameters (supports env vars) |
+| `parse_structured_input` | boolean | `true` | Whether to parse structured golden prompts into key-value pairs |
 | `timeout` | integer | `30000` | Request timeout in ms (1000-300000) |
 | `headers` | object | `{}` | HTTP headers (supports env vars) |
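
The `request_template` and `response_path` options in the table above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not FlakeStorm's actual implementation: `render_template` and `extract_path` are hypothetical names, JSON-escaping the substituted values is an assumption, and only the path forms listed in the guide (`$.a.b[0].c`, `a.b`, `key`) are handled.

```python
import json
import re
from typing import Any

# Matches {name} placeholders but not JSON braces such as {"message": ...}.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")
# Matches either a plain key or a [N] list index inside a path.
_PATH_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def render_template(template: str, values: dict[str, str]) -> str:
    """Substitute {name} placeholders (hypothetical helper).

    Values are JSON-escaped so that quotes or newlines in a prompt
    do not break a JSON request body (an assumption of this sketch).
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            return match.group(0)  # leave unknown placeholders untouched
        # json.dumps adds surrounding quotes; strip them, keep the escapes.
        return json.dumps(values[name])[1:-1]

    return _PLACEHOLDER.sub(substitute, template)


def extract_path(data: Any, path: str) -> Any:
    """Walk a "$.a.b[0].c", "a.b", or "key" style path (hypothetical helper)."""
    if path.startswith("$"):
        path = path[1:]  # drop the JSONPath root marker
    current = data
    for key, index in _PATH_TOKEN.findall(path):
        # Exactly one of key/index is non-empty for each token.
        current = current[int(index)] if index else current[key]
    return current
```

With the basic template from the usage guide, `render_template('{"message": "{prompt}", "stream": false}', {"prompt": "Book a flight to Paris"})` produces the request body shown there, and `extract_path(response, "$.reply")` then picks out the reply text. A missing key or index raises `KeyError` or `IndexError` in this sketch; a real adapter would probably report that as a response-format error.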

--- a/docs/CONNECTION_GUIDE.md
+++ b/docs/CONNECTION_GUIDE.md
@@ -0,0 +1,317 @@
+# FlakeStorm Connection Guide
+
+This guide explains how to connect FlakeStorm to your agent, covering different scenarios from localhost to public endpoints, and options for internal code.
+
+---
+
+## Table of Contents
+
+1. [Connection Requirements](#connection-requirements)
+2. [Localhost vs Public Endpoints](#localhost-vs-public-endpoints)
+3. [Internal Code Options](#internal-code-options)
+4. [Exposing Local Endpoints](#exposing-local-endpoints)
+5. [Troubleshooting](#troubleshooting)
+6. [Best Practices](#best-practices)
+7. [Quick Reference](#quick-reference)
+
+---
+
+## Connection Requirements
+
+### When Do You Need an HTTP Endpoint?
+
+| Your Agent Code | Adapter Type | Endpoint Needed? | Notes |
+|----------------|--------------|------------------|-------|
+| Python (internal) | Python adapter | ❌ No | Use `type: "python"`, call function directly |
+| TypeScript/JavaScript | HTTP adapter | ✅ Yes | Must create HTTP endpoint (can be localhost) |
+| Java/Go/Rust | HTTP adapter | ✅ Yes | Must create HTTP endpoint (can be localhost) |
+| Already has HTTP API | HTTP adapter | ✅ Yes | Use existing endpoint |
+
+**Key Point:** FlakeStorm is a Python CLI tool. It can only directly call Python functions. For non-Python code, you **must** create an HTTP endpoint wrapper.
+
+---
+
+## Localhost vs Public Endpoints
+
+### When Localhost Works
+
+| FlakeStorm Location | Agent Location | Endpoint Type | Works? |
+|---------------------|----------------|---------------|--------|
+| Same machine | Same machine | `localhost:8000` | ✅ Yes |
+| Different machine | Your machine | `localhost:8000` | ❌ No |
+| CI/CD server | Your machine | `localhost:8000` | ❌ No |
+| CI/CD server | Cloud (AWS/GCP) | `https://api.example.com` | ✅ Yes |
+
+**Rule of Thumb:** If FlakeStorm and your agent run on the **same machine**, use `localhost`. Otherwise, you need an endpoint that the FlakeStorm machine can reach, such as a **public endpoint** or a tunnel.
+
+---
+
+## Internal Code Options
+
+### Option 1: Python Adapter (Recommended for Python Code)
+
+If your agent code is in Python, use the Python adapter - **no HTTP endpoint needed**:
+
+```python
+# my_agent.py
+async def flakestorm_agent(input: str) -> str:
+    """
+    FlakeStorm will call this function directly.
+
+    Args:
+        input: The golden prompt text (may be structured)
+
+    Returns:
+        The agent's response as a string
+    """
+    # Parse input, call your internal functions
+    # (parse_structured_input and your_internal_function are placeholders for your own code)
+    params = parse_structured_input(input)
+    result = await your_internal_function(params)
+    return result
+```
+
+```yaml
+# flakestorm.yaml
+agent:
+  endpoint: "my_agent:flakestorm_agent"
+  type: "python"  # ← No HTTP endpoint needed!
+```
+
+**Benefits:**
+- No server setup required
+- Faster (no HTTP overhead)
+- Works offline
+- No network configuration
+
+### Option 2: HTTP Wrapper Endpoint (Required for Non-Python Code)
+
+For TypeScript/JavaScript/Java/Go/Rust, create a simple HTTP wrapper:
+
+**TypeScript/Node.js Example:**
+```typescript
+// test-endpoint.ts
+import express from 'express';
+import { generateRedditSearchQuery } from './your-internal-code';
+
+const app = express();
+app.use(express.json());
+
+app.post('/flakestorm-test', async (req, res) => {
+  // FlakeStorm sends: {"input": "Industry: X\nProduct: Y..."}
+  const structuredText = req.body.input;
+
+  // Parse structured input (parseStructuredInput is your own helper, not shown here)
+  const params = parseStructuredInput(structuredText);
+
+  // Call your internal function
+  const query = await generateRedditSearchQuery(params);
+
+  // Return in FlakeStorm's expected format
+  res.json({ output: query });
+});
+
+app.listen(8000, () => {
+  console.log('FlakeStorm test endpoint: http://localhost:8000/flakestorm-test');
+});
+```
+
+**Python FastAPI Example:**
+```python
+# test_endpoint.py
+from fastapi import FastAPI
+from pydantic import BaseModel
+
+app = FastAPI()
+
+class Request(BaseModel):
+    input: str
+
+@app.post("/flakestorm-test")
+async def flakestorm_test(request: Request):
+    # Parse structured input
+    # (parse_structured_input and your_internal_function are placeholders for your own code)
+    params = parse_structured_input(request.input)
+
+    # Call your internal function
+    result = await your_internal_function(params)
+
+    return {"output": result}
+```
+
+Then in `flakestorm.yaml`:
+```yaml
+agent:
+  endpoint: "http://localhost:8000/flakestorm-test"
+  type: "http"
+  # No request_template here: the wrapper above reads the default body,
+  # {"input": "<golden prompt>"}, and parses the structured text itself.
+  response_path: "$.output"
+```
+
+---
+
+## Exposing Local Endpoints
+
+If FlakeStorm runs on a different machine (e.g., CI/CD), you need to expose your local endpoint publicly.
+
+### Option 1: ngrok (Recommended)
+
+```bash
+# Install ngrok
+brew install ngrok  # macOS
+# Or download from https://ngrok.com/download
+
+# Expose local port 8000
+ngrok http 8000
+
+# Output:
+# Forwarding  https://abc123.ngrok.io -> http://localhost:8000
+```
+
+Then use the ngrok URL in your config:
+```yaml
+agent:
+  endpoint: "https://abc123.ngrok.io/flakestorm-test"
+  type: "http"
+```
+
+### Option 2: localtunnel
+
+```bash
+# Install
+npm install -g localtunnel
+
+# Expose port
+lt --port 8000
+
+# Output:
+# your url is: https://xyz.localtunnel.me
+```
+
+### Option 3: Deploy to Cloud
+
+Deploy your test endpoint to a cloud service:
+- **Vercel** (for Node.js/TypeScript)
+- **Railway** (any language)
+- **Fly.io** (any language)
+- **AWS Lambda** (serverless)
+
+### Option 4: VPN/SSH Tunnel
+
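+If the FlakeStorm machine can reach the agent machine over SSH or a VPN:
+```bash
+# SSH tunnel: run this on the FlakeStorm machine.
+# It forwards local port 8000 to port 8000 on agent-machine.
+ssh -L 8000:localhost:8000 user@agent-machine
+
+# Then use localhost:8000 in config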
+```
+
+---
+
+## Troubleshooting
+
+### "Connection Refused" Error
+
+**Problem:** FlakeStorm can't reach your endpoint.
+
+**Solutions:**
+1. **Check if agent is running:**
+   ```bash
+   curl http://localhost:8000/health  # replace /health with a route your server exposes
+   ```
+
+2. **Verify endpoint URL in config:**
+   ```yaml
+   agent:
+     endpoint: "http://localhost:8000/invoke"  # Check this matches your server
+   ```
+
+3. **Check firewall:**
+   ```bash
+   # macOS: System Preferences > Security & Privacy > Firewall
+   # Linux: sudo ufw allow 8000
+   ```
+
+4. **For Docker/containers:**
+   - If FlakeStorm runs inside a container and the agent runs on the host, use `host.docker.internal:8000` instead of `localhost:8000`
+   - Or use container networking
+
+### "Timeout" Error
+
+**Problem:** Agent takes too long to respond.
+
+**Solutions:**
+1. **Increase timeout:**
+   ```yaml
+   agent:
+     timeout: 60000  # 60 seconds
+   ```
+
+2. **Check agent performance:**
+   - Is the agent actually processing requests?
+   - Are there network issues?
+
+### "Invalid Response Format" Error
+
+**Problem:** Response doesn't match expected format.
+
+**Solutions:**
+1. **Use response_path:**
+   ```yaml
+   agent:
+     response_path: "$.data.result"  # Extract from nested JSON
+   ```
+
+2. **Check actual response:**
+   ```bash
+   curl -X POST http://localhost:8000/invoke \
+     -H "Content-Type: application/json" \
+     -d '{"input": "test"}'
+   ```
+
+3. **Update request_template if needed:**
+   ```yaml
+   agent:
+     request_template: |
+       {"your_field": "{prompt}"}
+   ```
+
+### Network Connectivity Issues
+
+**Problem:** Can't connect from CI/CD or remote machine.
+
+**Solutions:**
+1. **Use public endpoint** (ngrok, cloud deployment)
+2. **Check network policies** (corporate firewall, VPN)
+3. **Verify DNS resolution** (if using domain name)
+4. **Test with curl** from the same machine FlakeStorm runs on
+
+---
+
+## Best Practices
+
+1. **For Development:** Use Python adapter if possible (fastest, simplest)
+2. **For Testing:** Use localhost HTTP endpoint (easy to debug)
+3. **For CI/CD:** Use public endpoint or cloud deployment
+4. **For Production Testing:** Use production endpoint with proper authentication
+5. **Security:** Never commit API keys - use environment variables
+
+---
+
+## Quick Reference
+
+| Scenario | Solution |
+|----------|----------|
+| Python code, same machine | Python adapter (`type: "python"`) |
+| TypeScript/JS, same machine | HTTP endpoint (`localhost:8000`) |
+| Any language, CI/CD | Public endpoint (ngrok/cloud) |
+| Already has HTTP API | Use existing endpoint |
+| Need custom request format | Use `request_template` |
+| Complex response structure | Use `response_path` |
+
+---
+
+*For more examples, see [Configuration Guide](CONFIGURATION_GUIDE.md) and [Usage Guide](USAGE_GUIDE.md).*
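
The wrapper examples in this guide call `parse_structured_input` without defining it. A minimal Python sketch of such a helper, assuming one `Key: value` pair per line and camelCase field names, might look like the block below. The normalization rule is an assumption made for illustration, not FlakeStorm's documented behavior. Note that it turns `Product/Service` into `productService`, so a template that uses `{productName}` would need an explicit alias.

```python
import re


def to_camel_case(label: str) -> str:
    """Turn a label such as "Business Model" into "businessModel"."""
    parts = [p for p in re.split(r"[^0-9A-Za-z]+", label) if p]
    if not parts:
        return ""
    head, tail = parts[0].lower(), parts[1:]
    # Upper-case only the first letter so acronyms like "B2C" keep their casing.
    return head + "".join(p[:1].upper() + p[1:] for p in tail)


def parse_structured_input(text: str) -> dict[str, str]:
    """Parse "Key: value" lines into a dict (hypothetical helper).

    Lines without a colon, and lines whose key is empty, are skipped.
    Only the first colon splits the line, so values may contain colons.
    """
    fields: dict[str, str] = {}
    for line in text.splitlines():
        label, separator, value = line.partition(":")
        key = to_camel_case(label)
        if not separator or not key:
            continue
        fields[key] = value.strip()
    return fields
```

Keeping unparsed lines out of the result, rather than raising, means a free-form golden prompt simply yields an empty dict, and the caller can fall back to treating the whole text as the prompt.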
--- a/docs/DEVELOPER_FAQ.md
+++ b/docs/DEVELOPER_FAQ.md
@@ -456,6 +456,109 @@ class PythonAgentAdapter:

 ---

+### Q: When do I need to create an HTTP endpoint instead of using the Python adapter?
+
+**A:** It depends on your agent's language and setup:
+
+| Your Agent Code | Adapter Type | Endpoint Needed? | Notes |
+|----------------|--------------|------------------|-------|
+| Python (internal) | Python adapter | ❌ No | Use `type: "python"`, call function directly |
+| TypeScript/JavaScript | HTTP adapter | ✅ Yes | Must create HTTP endpoint (can be localhost) |
+| Java/Go/Rust | HTTP adapter | ✅ Yes | Must create HTTP endpoint (can be localhost) |
+| Already has HTTP API | HTTP adapter | ✅ Yes | Use existing endpoint |
+
+**For non-Python code (TypeScript example):**
+
+Since FlakeStorm is a Python CLI tool, it can only directly call Python functions. For TypeScript/JavaScript/other languages, you **must** create an HTTP endpoint:
+
+```typescript
+// test-endpoint.ts - Wrapper endpoint for FlakeStorm
+import express from 'express';
+import { generateRedditSearchQuery } from './your-internal-code';
+
+const app = express();
+app.use(express.json());
+
+app.post('/flakestorm-test', async (req, res) => {
+  // FlakeStorm sends: {"input": "Industry: X\nProduct: Y..."}
+  const structuredText = req.body.input;
+
+  // Parse structured input (parseStructuredInput is your own helper, not shown here)
+  const params = parseStructuredInput(structuredText);
+
+  // Call your internal function
+  const query = await generateRedditSearchQuery(params);
+
+  // Return in FlakeStorm's expected format
+  res.json({ output: query });
+});
+
+app.listen(8000, () => {
+  console.log('FlakeStorm test endpoint: http://localhost:8000/flakestorm-test');
+});
+```
+
+Then in `flakestorm.yaml`:
+```yaml
+agent:
+  endpoint: "http://localhost:8000/flakestorm-test"
+  type: "http"
+  # No request_template here: the wrapper above reads the default body,
+  # {"input": "<golden prompt>"}, and parses the structured text itself.
+  response_path: "$.output"
+```
+
+---
+
+### Q: Do I need a public endpoint or can I use localhost?
+
+**A:** It depends on where FlakeStorm runs:
+
+| FlakeStorm Location | Agent Location | Endpoint Type | Works? |
+|---------------------|----------------|---------------|--------|
+| Same machine | Same machine | `localhost:8000` | ✅ Yes |
+| Different machine | Your machine | `localhost:8000` | ❌ No - use public endpoint or ngrok |
+| CI/CD server | Your machine | `localhost:8000` | ❌ No - use public endpoint |
+| CI/CD server | Cloud (AWS/GCP) | `https://api.example.com` | ✅ Yes |
+
+**Options for exposing local endpoint:**
+1. **ngrok**: `ngrok http 8000` → get public URL
+2. **localtunnel**: `lt --port 8000` → get public URL
+3. **Deploy to cloud**: Deploy your test endpoint to a cloud service
+4. **VPN/SSH tunnel**: If the FlakeStorm machine can reach the agent machine over a VPN or SSH
+
+---
+
+### Q: Can I test internal code without creating an endpoint?
+
+**A:** Only if your code is in Python:
+
+```python
+# my_agent.py
+async def flakestorm_agent(input: str) -> str:
+    # Parse input, call your internal functions
+    result = await your_internal_function(input)  # placeholder for your own code
+    return result
+```
+
+```yaml
+# flakestorm.yaml
+agent:
+  endpoint: "my_agent:flakestorm_agent"
+  type: "python"  # ← No HTTP endpoint needed!
+```
+
+For non-Python code, you **must** create an HTTP endpoint wrapper.
+
+See [Connection Guide](CONNECTION_GUIDE.md) for detailed examples and troubleshooting.
+
+---
+
 ## Testing & Quality

 ### Q: Why are tests split by module?
--- a/docs/USAGE_GUIDE.md
+++ b/docs/USAGE_GUIDE.md
@@ -455,23 +455,280 @@ open reports/flakestorm-*.html

 **What they are:** Carefully crafted prompts that represent your agent's core use cases. These are prompts that *should always work correctly*.

-**How to choose them:**
- Cover all major user intents
- Include edge cases you've seen in production
- Represent different complexity levels
+#### Understanding Golden Prompts vs System Prompts

+**Key Distinction:**
+- **System Prompt**: Instructions that define your agent's role and behavior (stays in your code)
+- **Golden Prompt**: Example user inputs that should work correctly (what FlakeStorm mutates and tests)
+
+**Example:**
+```javascript
+// System Prompt (in your agent code - NOT in flakestorm.yaml)
+const systemPrompt = `You are a helpful assistant that books flights...`;
+
+// Golden Prompts (in flakestorm.yaml - what FlakeStorm tests), shown here as comments:
+// golden_prompts:
+//   - "Book a flight from NYC to LA"
+//   - "I need to fly to Paris next Monday"
+```
+
+FlakeStorm takes your golden prompts, mutates them (adds typos, paraphrases, etc.), and sends them to your agent. Your agent processes them using its system prompt.
+
+#### How to Choose Golden Prompts
+
+**1. Cover All Major User Intents**
+```yaml
+golden_prompts:
+  # Primary use case
+  - "Book a flight from New York to Los Angeles"
+
+  # Secondary use case
+  - "What's my account balance?"
+
+  # Another feature
+  - "Cancel my reservation #12345"
+```
+
+**2. Include Different Complexity Levels**
 ```yaml
 golden_prompts:
  # Simple intent
  - "Hello, how are you?"

-  # Complex intent with parameters
-  - "Book a flight from New York to Los Angeles departing March 15th"
+  # Medium complexity
+  - "Book a flight to Paris"

-  # Edge case
-  - "What if I need to cancel my booking?"
+  # Complex with multiple parameters
+  - "Book a flight from New York to Los Angeles departing March 15th, returning March 22nd, economy class, window seat"
 ```

+**3. Include Edge Cases**
+```yaml
+golden_prompts:
+  # Normal case
+  - "Book a flight to Paris"
+
+  # Edge case: unusual request
+  - "What if I need to cancel my booking?"
+
+  # Edge case: minimal input
+  - "Paris"
+
+  # Edge case: ambiguous request
+  - "I need to travel somewhere warm"
+```
+
+#### Examples by Agent Type
+
+**1. Simple Chat Agent**
+```yaml
+golden_prompts:
+  - "What is the weather in New York?"
+  - "Tell me a joke"
+  - "How do I make a paper airplane?"
+  - "What's 2 + 2?"
+```
+
+**2. E-commerce Assistant**
+```yaml
+golden_prompts:
+  - "I'm looking for a red dress size medium"
+  - "Show me running shoes under $100"
+  - "What's the return policy?"
+  - "Add this to my cart"
+  - "Track my order #ABC123"
+```
+
+**3. Structured Input Agent (Reddit Search Query Generator)**
+
+For agents that accept structured input (like a Reddit community discovery assistant):
+
+```yaml
+golden_prompts:
+  # B2C SaaS example
+  - |
+    Industry: Fitness tech
+    Product/Service: AI personal trainer app
+    Business Model: B2C
+    Target Market: fitness enthusiasts, people who want to lose weight
+    Description: An app that provides personalized workout plans using AI
+
+  # B2B SaaS example
+  - |
+    Industry: Marketing tech
+    Product/Service: Email automation platform
+    Business Model: B2B SaaS
+    Target Market: small business owners, marketing teams
+    Description: Automated email campaigns for small businesses
+
+  # Marketplace example
+  - |
+    Industry: E-commerce
+    Product/Service: Handmade crafts marketplace
+    Business Model: Marketplace
+    Target Market: crafters, DIY enthusiasts, gift buyers
+    Description: Platform connecting artisans with buyers
+
+  # Edge case - minimal description
+  - |
+    Industry: Healthcare tech
+    Product/Service: Telemedicine platform
+    Business Model: B2C
+    Target Market: busy professionals
+    Description: Video consultations
+```
+
+**4. API/Function-Calling Agent**
+```yaml
+golden_prompts:
+  - "Get the weather for San Francisco"
+  - "Send an email to john@example.com with subject 'Meeting'"
+  - "Create a calendar event for tomorrow at 3pm"
+  - "What's my schedule for next week?"
+```
+
+**5. Code Generation Agent**
+```yaml
+golden_prompts:
+  - "Write a Python function to sort a list"
+  - "Create a React component for a login form"
+  - "How do I connect to a PostgreSQL database in Node.js?"
+  - "Fix this bug: [code snippet]"
+```
+
+#### Best Practices
+
+**1. Start Small, Then Expand**
+```yaml
+# Phase 1: Start with 2-3 core prompts
+golden_prompts:
+  - "Primary use case 1"
+  - "Primary use case 2"
+
+# Phase 2: Add more as you validate
+golden_prompts:
+  - "Primary use case 1"
+  - "Primary use case 2"
+  - "Secondary use case"
+  - "Edge case 1"
+  - "Edge case 2"
+```
+
+**2. Cover Different User Personas**
+```yaml
+golden_prompts:
+  # Professional user
+  - "I need to schedule a meeting with the team for Q4 planning"
+
+  # Casual user
+  - "hey can u help me book something"
+
+  # Technical user
+  - "Query the database for all users created after 2024-01-01"
+
+  # Non-technical user
+  - "Show me my account"
+```
+
+**3. Include Real Production Examples**
+```yaml
+golden_prompts:
+  # From your production logs
+  - "Actual user query from logs"
+  - "Another real example"
+  - "Edge case that caused issues before"
+```
+
+**4. Test Different Input Formats**
+```yaml
+golden_prompts:
+  # Well-formatted
+  - "Book a flight from New York to Los Angeles on March 15th"
+
+  # Informal
+  - "need a flight nyc to la march 15"
+
+  # With extra context
+  - "Hi! I'm planning a trip and I need to book a flight from New York City to Los Angeles on March 15th, 2024. Can you help?"
+```
+
+**5. For Structured Input: Cover All Variations**
+```yaml
+golden_prompts:
+  # Complete input
+  - |
+    Industry: Tech
+    Product: SaaS platform
+    Model: B2B
+    Market: Enterprises
+    Description: Full description here
+
+  # Minimal input (edge case)
+  - |
+    Industry: Tech
+    Product: Platform
+
+  # Different business models
+  - |
+    Industry: Retail
+    Product: E-commerce site
+    Model: B2C
+    Market: Consumers
+```
+
+#### Common Patterns
+
+**Pattern 1: Question-Answer Agent**
+```yaml
+golden_prompts:
+  - "What is X?"
+  - "How do I Y?"
+  - "Why does Z happen?"
+  - "When should I do A?"
+```
+
+**Pattern 2: Task-Oriented Agent**
+```yaml
+golden_prompts:
+  - "Do X"                      # imperative
+  - "I need to do X"            # declarative
+  - "Can you help me with X?"   # question form
+  - "X please"                  # polite request
+```
+
+**Pattern 3: Multi-Turn Context Agent**
+```yaml
+golden_prompts:
+  # First turn
+  - "I'm looking for a hotel"
+  # Second turn (test separately)
+  - "In Paris"
+  # Third turn (test separately)
+  - "Under $200 per night"
+```
+
+**Pattern 4: Data Processing Agent**
+```yaml
+golden_prompts:
+  - "Analyze this data: [data]"
+  - "Summarize the following: [text]"
+  - "Extract key information from: [content]"
+```
+
+#### What NOT to Include
+
+❌ **Don't include:**
+- Prompts your agent is already known to fail (fix the agent first; a golden prompt must pass before it is mutated)
+- System prompts or instructions (those stay in your code)
+- Malformed inputs (FlakeStorm will generate those as mutations)
+- Test-only prompts that users would never send
+
+✅ **Do include:**
+- Real user queries from production
+- Expected use cases
+- Prompts that should always work
+- Representative examples of your user base
+
 ### Mutation Types

 flakestorm generates adversarial variations of your golden prompts:
@@ -862,6 +1119,143 @@ agent = AgentExecutor(...)

 ---

+## Request Templates and Connection Setup
+
+### Understanding Request Templates
+
+Request templates let you reshape FlakeStorm's default request into the exact format your agent's API expects.
+
+#### Basic Template
+
+```yaml
+agent:
+  endpoint: "http://localhost:8000/api/chat"
+  type: "http"
+  request_template: |
+    {"message": "{prompt}", "stream": false}
+  response_path: "$.reply"
+```
+
+**What happens:**
+1. FlakeStorm takes golden prompt: `"Book a flight to Paris"`
+2. Replaces `{prompt}` in template: `{"message": "Book a flight to Paris", "stream": false}`
+3. Sends to your endpoint
+4. Extracts response from `$.reply` path
+
+#### Structured Input Mapping
+
+For agents that accept structured input:
+
+```yaml
+agent:
+  endpoint: "http://localhost:8000/generate-query"
+  type: "http"
+  method: "POST"
+  request_template: |
+    {
+      "industry": "{industry}",
+      "productName": "{productName}",
+      "businessModel": "{businessModel}",
+      "targetMarket": "{targetMarket}",
+      "description": "{description}"
+    }
+  response_path: "$.query"
+  parse_structured_input: true
+```
+
+**Golden Prompt:**
+```yaml
+golden_prompts:
+  - |
+    Industry: Fitness tech
+    Product/Service: AI personal trainer app
+    Business Model: B2C
+    Target Market: fitness enthusiasts
+    Description: An app that provides personalized workout plans
+```
+
+**What happens:**
+1. FlakeStorm parses structured input into key-value pairs
+2. Maps fields to template: `{"industry": "Fitness tech", "productName": "AI personal trainer app", ...}`
+3. Sends to your endpoint
+4. Extracts response from `$.query`
+
+#### Different HTTP Methods
+
+**GET Request:**
+```yaml
+agent:
+  endpoint: "http://api.example.com/search"
+  type: "http"
+  method: "GET"
+  request_template: "q={prompt}"
+  query_params:
+    api_key: "${API_KEY}"
+    format: "json"
+```
+
+**PUT Request:**
+```yaml
+agent:
+  endpoint: "http://api.example.com/update"
+  type: "http"
+  method: "PUT"
+  request_template: |
+    {"id": "123", "content": "{prompt}"}
+```
+
+### Connection Setup
+
+#### For Python Code (No Endpoint Needed)
+
+```python
+# my_agent.py
+async def flakestorm_agent(input: str) -> str:
+    # Your agent logic
+    result = await your_internal_function(input)  # placeholder for your own code
+    return result
+```
+
+```yaml
+agent:
+  endpoint: "my_agent:flakestorm_agent"
+  type: "python"
+```
+
+#### For TypeScript/JavaScript (Need HTTP Endpoint)
+
+Create a wrapper endpoint:
+
+```typescript
+// test-endpoint.ts
+import express from 'express';
+import { yourAgentFunction } from './your-code';
+
+const app = express();
+app.use(express.json());
+
+app.post('/flakestorm-test', async (req, res) => {
+  const result = await yourAgentFunction(req.body.input);
+  res.json({ output: result });
+});
+
+app.listen(8000);
+```
+
+```yaml
+agent:
+  endpoint: "http://localhost:8000/flakestorm-test"
+  type: "http"
+```
+
+#### Localhost vs Public Endpoint
+
+- **Same machine:** Use `localhost:8000`
+- **Different machine/CI/CD:** Use public endpoint (ngrok, cloud deployment)
+
+See [Connection Guide](CONNECTION_GUIDE.md) for detailed setup instructions.
+
+---
+
 ## Advanced Usage

 ### Custom Mutation Templates
@@ -921,6 +1315,306 @@ advanced:
  retries: 3      # Retry failed requests 3 times
 ```

+### Golden Prompt Guide
+
+A comprehensive guide to creating effective golden prompts for your agent.
+
+#### Step-by-Step: Creating Golden Prompts
+
+**Step 1: Identify Core Use Cases**
+```yaml
+# List your agent's primary functions
+# Example: Flight booking agent
+golden_prompts:
+  - "Book a flight"           # Core function
+  - "Check flight status"     # Core function
+  - "Cancel booking"           # Core function
+```
+
+**Step 2: Add Variations for Each Use Case**
+```yaml
+golden_prompts:
+  # Booking variations
+  - "Book a flight from NYC to LA"
+  - "I need to fly to Paris"
+  - "Reserve a ticket to Tokyo"
+  - "Can you book me a flight?"
+
+  # Status check variations
+  - "What's my flight status?"
+  - "Check my booking"
+  - "Is my flight on time?"
+```
+
+**Step 3: Include Edge Cases**
+```yaml
+golden_prompts:
+  # Normal cases (from Step 2)
+  - "Book a flight from NYC to LA"
+
+  # Edge cases
+  - "Book a flight"                    # Minimal input
+  - "I need to travel somewhere"      # Vague request
+  - "What if I need to change my flight?"  # Conditional
+  - "Book a flight for next year"     # Far future
+```
+
+**Step 4: Cover Different User Styles**
+```yaml
+golden_prompts:
+  # Formal
+  - "I would like to book a flight from New York to Los Angeles"
+
+  # Casual
+  - "hey can u book me a flight nyc to la"
+
+  # Technical/precise
+  - "Flight booking: JFK -> LAX, 2024-03-15, economy"
+
+  # Verbose
+  - "Hi! I'm planning a trip and I need to book a flight from New York City to Los Angeles on March 15th, 2024. Can you help me with that?"
+```
+
+#### Golden Prompts for Structured Input Agents
+
+For agents that accept structured data (JSON, YAML, key-value pairs):
+
+**Example: Reddit Community Discovery Agent**
+```yaml
+golden_prompts:
+  # Complete structured input
+  - |
+    Industry: Fitness tech
+    Product/Service: AI personal trainer app
+    Business Model: B2C
+    Target Market: fitness enthusiasts, people who want to lose weight
+    Description: An app that provides personalized workout plans using AI
+
+  # Different business model
+  - |
+    Industry: Marketing tech
+    Product/Service: Email automation platform
+    Business Model: B2B SaaS
+    Target Market: small business owners, marketing teams
+    Description: Automated email campaigns for small businesses
+
+  # Minimal input (edge case)
+  - |
+    Industry: Healthcare tech
+    Product/Service: Telemedicine platform
+    Business Model: B2C
+
+  # Different industry
+  - |
+    Industry: E-commerce
+    Product/Service: Handmade crafts marketplace
+    Business Model: Marketplace
+    Target Market: crafters, DIY enthusiasts
+    Description: Platform connecting artisans with buyers
+```
+
+**Example: API Request Builder Agent**
+```yaml
+golden_prompts:
+  - |
+    Method: GET
+    Endpoint: /users
+    Headers: {"Authorization": "Bearer token"}
+
+  - |
+    Method: POST
+    Endpoint: /orders
+    Body: {"product_id": 123, "quantity": 2}
+
+  - |
+    Method: PUT
+    Endpoint: /users/123
+    Body: {"name": "John Doe"}
+```
+
+#### Domain-Specific Examples
+
+**E-commerce Agent:**
+```yaml
+golden_prompts:
+  # Product search
+  - "I'm looking for a red dress size medium"
+  - "Show me running shoes under $100"
+  - "Find blue jeans for men"
+
+  # Cart operations
+  - "Add this to my cart"
+  - "What's in my cart?"
+  - "Remove item from cart"
+
+  # Orders
+  - "Track my order #ABC123"
+  - "What's my order status?"
+  - "Cancel my order"
+
+  # Support
+  - "What's the return policy?"
+  - "How do I exchange an item?"
+  - "Contact customer service"
+```
+
+**Code Generation Agent:**
+```yaml
+golden_prompts:
+  # Simple functions
+  - "Write a Python function to sort a list"
+  - "Create a function to calculate factorial"
+
+  # Components
+  - "Create a React component for a login form"
+  - "Build a Vue component for a todo list"
+
+  # Integration
+  - "How do I connect to PostgreSQL in Node.js?"
+  - "Show me how to use Redis with Python"
+
+  # Debugging
+  - "Fix this bug: [code snippet]"
+  - "Why is this code not working?"
+```
+
+**Customer Support Agent:**
+```yaml
+golden_prompts:
+  # Account questions
+  - "What's my account balance?"
+  - "How do I change my password?"
+  - "Update my email address"
+
+  # Product questions
+  - "How do I use feature X?"
+  - "What are the system requirements?"
+  - "Is there a mobile app?"
+
+  # Billing
+  - "What's my subscription status?"
+  - "How do I cancel my subscription?"
+  - "Update my payment method"
+```
+
+#### Quality Checklist
+
+Before finalizing your golden prompts, verify:
+
+- [ ] **Coverage**: All major features/use cases included
+- [ ] **Diversity**: Different complexity levels (simple, medium, complex)
+- [ ] **Realism**: Based on actual user queries from production
+- [ ] **Edge Cases**: Unusual but valid inputs included
+- [ ] **User Styles**: Formal, casual, technical, verbose variations
+- [ ] **Quantity**: 5-15 prompts recommended (start with 5, expand)
+- [ ] **Clarity**: Each prompt represents a distinct use case
+- [ ] **Relevance**: All prompts are things users would actually send
+
+#### Iterative Improvement
+
+**Phase 1: Initial Set (5 prompts)**
+```yaml
+golden_prompts:
+  - "Primary use case 1"
+  - "Primary use case 2"
+  - "Primary use case 3"
+  - "Secondary use case 1"
+  - "Edge case 1"
+```
+
+**Phase 2: Expand (10 prompts)**
+```yaml
+# Add variations and more edge cases
+golden_prompts:
+  # ... previous 5 ...
+  - "Primary use case 1 variation"
+  - "Primary use case 2 variation"
+  - "Secondary use case 2"
+  - "Edge case 2"
+  - "Edge case 3"
+```
+
+**Phase 3: Refine (15+ prompts)**
+```yaml
+# Add based on test results and production data
+golden_prompts:
+  # ... previous 10 ...
+  - "Real user query from logs"
+  - "Another production example"
+  - "Failure case that should work"
+```
+
+#### Common Mistakes to Avoid
+
+❌ **Too Generic**
+```yaml
+# Bad: Too vague
+golden_prompts:
+  - "Help me"
+  - "Do something"
+  - "Question"
+```
+
+✅ **Specific and Actionable**
+```yaml
+# Good: Clear intent
+golden_prompts:
+  - "Book a flight from NYC to LA"
+  - "What's my account balance?"
+  - "Cancel my subscription"
+```
+
+❌ **Including System Prompts**
+```yaml
+# Bad: This is a system prompt, not a golden prompt
+golden_prompts:
+  - "You are a helpful assistant that..."
+```
+
+✅ **User Inputs Only**
+```yaml
+# Good: Actual user queries
+golden_prompts:
+  - "Book a flight"
+  - "What's the weather?"
+```
+
+❌ **Only Happy Path**
+```yaml
+# Bad: Only perfect inputs
+golden_prompts:
+  - "Book a flight from New York to Los Angeles on March 15th, 2024, economy class, window seat, no meals"
+```
+
+✅ **Include Variations**
+```yaml
+# Good: Various input styles
+golden_prompts:
+  - "Book a flight from NYC to LA"
+  - "I need to fly to Los Angeles"
+  - "flight booking please"
+  - "Can you help me book a flight?"
+```
+
+#### Testing Your Golden Prompts
+
+Before running FlakeStorm, manually test your golden prompts:
+
+```bash
+# Test each golden prompt manually
+curl -X POST http://localhost:8000/invoke \
+  -H "Content-Type: application/json" \
+  -d '{"input": "Your golden prompt here"}'
+```
+
+Verify:
+- ✅ Agent responds correctly
+- ✅ Response time is reasonable
+- ✅ No errors occur
+- ✅ Response format matches expectations
+
+If a golden prompt fails when you send it manually, fix your agent first, then add the prompt to FlakeStorm.
+
 ---

 ## Troubleshooting