chore(release): v0.6.11 — Gemini provider + Anthropic model fix

feat(llm): add Gemini provider and fix stale Anthropic default model

Adds a Google Gemini provider (Generative Language API) to the chain, ordered Ollama -> OpenAI -> Gemini -> Anthropic so Google credits are preferred, with Anthropic as the last-resort fallback.

Request mapping: system -> systemInstruction, assistant -> model, json_mode -> responseMimeType. The model name is validated before URL interpolation, and maxOutputTokens defaults high for 2.5 thinking models.

Also fixes the AnthropicProvider default: the retired claude-sonnet-4-20250514 returned 404. The default is now claude-sonnet-4-6 and honors ANTHROPIC_MODEL.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 23:45:13 +02:00 · 2026-06-16 16:12:11 +02:00 · 2026-06-16 15:52:37 +02:00 · 2026-06-15 18:15:19 +02:00 · 2026-06-15 15:06:00 +02:00 · 2026-06-15 11:27:04 +02:00
24 changed files with 923 additions and 109 deletions
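
The role mapping and model-name check described in the commit message can be sketched as follows. This is a minimal, std-only illustration and not the code in `gemini.rs`: `Role`, `Placement`, `place`, and `generate_url` are hypothetical names introduced here, and the `models/{model}:generateContent` path is an assumption about how the provider builds its request URL. Only `is_safe_model_name` mirrors the function added in this diff.

```rust
/// Chat roles as a generic chat client might name them (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
}

/// Where a message with a given role lands in a Gemini-style request body.
#[derive(Debug, PartialEq)]
enum Placement {
    /// Lifted out of the conversation into the top-level `systemInstruction`.
    SystemInstruction,
    /// Kept as a turn in `contents`, tagged with this role string.
    Contents(&'static str),
}

/// Map a generic role to its Gemini placement: system -> systemInstruction,
/// assistant -> "model", user stays "user".
fn place(role: Role) -> Placement {
    match role {
        Role::System => Placement::SystemInstruction,
        Role::User => Placement::Contents("user"),
        Role::Assistant => Placement::Contents("model"),
    }
}

/// Same rule as the diff: ASCII alphanumerics plus `-`, `.`, `_`, and non-empty.
fn is_safe_model_name(model: &str) -> bool {
    !model.is_empty()
        && model
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'-' | b'.' | b'_'))
}

/// Build the request URL only for a validated model name (assumed URL shape).
fn generate_url(base_url: &str, model: &str) -> Option<String> {
    if !is_safe_model_name(model) {
        return None;
    }
    let base = base_url.trim_end_matches('/');
    Some(format!("{base}/models/{model}:generateContent"))
}

fn main() {
    for role in [Role::System, Role::User, Role::Assistant] {
        println!("{:?} -> {:?}", role, place(role));
    }
    let base = "https://generativelanguage.googleapis.com/v1beta";
    println!("{:?}", generate_url(base, "gemini-2.5-flash"));
    // A name containing `/` or `:` is rejected before it can reach the path.
    println!("{:?}", generate_url(base, "x/../y:other"));
}
```

Validating before interpolation means a model name taken from `GEMINI_MODEL` or a caller cannot change the path or method segment of the request.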
--- a/.github/banner.png
+++ b/.github/banner.png
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@ -3,6 +3,15 @@ name: Release
 on:
  push:
    tags: ["v*"]
+  # Manual re-publish of the Docker image for an existing release, without
+  # rebuilding binaries or cutting a new version. Runs only the docker (+
+  # homebrew) jobs against the given tag's already-published release assets.
+  workflow_dispatch:
+    inputs:
+      tag:
+        description: "Existing release tag to (re)build + push the Docker image for, e.g. v0.6.9"
+        required: true
+        type: string

 permissions:
  contents: read
@ -12,6 +21,9 @@ env:

 jobs:
  build:
+    # Binaries are only built when a tag is pushed. A manual dispatch reuses
+    # the existing release's binaries, so it skips this job entirely.
+    if: github.event_name == 'push'
    permissions:
      contents: read
    name: Build ${{ matrix.target }}
@ -105,6 +117,7 @@ jobs:

  release:
    name: Release
+    if: github.event_name == 'push'
    needs: build
    runs-on: ubuntu-latest
    permissions:
@ -137,6 +150,10 @@ jobs:
  docker:
    name: Docker
    needs: release
+    # Runs after a successful release on tag push, or standalone via
+    # workflow_dispatch to (re)publish an existing tag's image. `always()` lets
+    # it run even though `release` is skipped on a manual dispatch.
+    if: ${{ always() && (github.event_name == 'workflow_dispatch' || needs.release.result == 'success') }}
    runs-on: ubuntu-latest
    permissions:
      contents: read
@ -156,49 +173,48 @@ jobs:
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

-      # Download pre-built binaries for both architectures
+      # The pushed tag, or the workflow_dispatch input for a manual re-publish.
+      - name: Resolve tag
+        id: tag
+        run: echo "tag=${{ github.event.inputs.tag || github.ref_name }}" >> "$GITHUB_OUTPUT"
+
+      # Download pre-built binaries into TARGETARCH-named dirs (amd64/arm64) so
+      # a single multi-platform build picks the matching binary per platform.
      - name: Download release binaries
        run: |
-          tag="${GITHUB_REF#refs/tags/}"
+          tag="${{ steps.tag.outputs.tag }}"
+          declare -A arch=( [x86_64-unknown-linux-gnu]=amd64 [aarch64-unknown-linux-gnu]=arm64 )
          for target in x86_64-unknown-linux-gnu aarch64-unknown-linux-gnu; do
            dir="webclaw-${tag}-${target}"
            curl -sSL "https://github.com/0xMassi/webclaw/releases/download/${tag}/${dir}.tar.gz" -o "${target}.tar.gz"
            tar xzf "${target}.tar.gz"
-            mkdir -p "binaries-${target}"
-            cp "${dir}/webclaw" "binaries-${target}/webclaw"
-            cp "${dir}/webclaw-mcp" "binaries-${target}/webclaw-mcp"
-            cp "${dir}/webclaw-server" "binaries-${target}/webclaw-server"
-            chmod +x "binaries-${target}"/*
+            a="${arch[$target]}"
+            mkdir -p "binaries-${a}"
+            cp "${dir}/webclaw" "${dir}/webclaw-mcp" "${dir}/webclaw-server" "binaries-${a}/"
+            chmod +x "binaries-${a}"/*
          done
          ls -laR binaries-*/

-      # Build per-arch images with plain docker build (no buildx manifest nesting)
+      # One atomic multi-platform build + push. buildx assembles a single
+      # manifest list and pushes it in one shot, so there is no separate
+      # `imagetools create` step to race GHCR's read-after-write (that is what
+      # failed before: "v0.6.9-arm64: not found"). Provenance/SBOM attestations
+      # are disabled so each platform entry stays a plain image manifest.
      - name: Build and push
        run: |
-          tag="${GITHUB_REF#refs/tags/}"
-
-          # amd64
-          docker build -f Dockerfile.ci --build-arg BINARY_DIR=binaries-x86_64-unknown-linux-gnu \
-            --platform linux/amd64 -t ghcr.io/0xmassi/webclaw:${tag}-amd64 --push .
-
-          # arm64
-          docker build -f Dockerfile.ci --build-arg BINARY_DIR=binaries-aarch64-unknown-linux-gnu \
-            --platform linux/arm64 -t ghcr.io/0xmassi/webclaw:${tag}-arm64 --push .
-
-          # Multi-arch manifest
-          docker manifest create ghcr.io/0xmassi/webclaw:${tag} \
-            ghcr.io/0xmassi/webclaw:${tag}-amd64 \
-            ghcr.io/0xmassi/webclaw:${tag}-arm64
-          docker manifest push ghcr.io/0xmassi/webclaw:${tag}
-
-          docker manifest create ghcr.io/0xmassi/webclaw:latest \
-            ghcr.io/0xmassi/webclaw:${tag}-amd64 \
-            ghcr.io/0xmassi/webclaw:${tag}-arm64
-          docker manifest push ghcr.io/0xmassi/webclaw:latest
+          tag="${{ steps.tag.outputs.tag }}"
+          docker buildx build -f Dockerfile.ci \
+            --platform linux/amd64,linux/arm64 \
+            --provenance=false --sbom=false \
+            -t "ghcr.io/0xmassi/webclaw:${tag}" \
+            -t ghcr.io/0xmassi/webclaw:latest \
+            --push .

  homebrew:
    name: Update Homebrew
    needs: [release, docker]
+    # Runs once Docker succeeds, on both tag push and manual re-publish.
+    if: ${{ always() && needs.docker.result == 'success' }}
    runs-on: ubuntu-latest
    permissions:
      contents: read
@ -207,7 +223,7 @@ jobs:
        env:
          COMMITTER_TOKEN: ${{ secrets.HOMEBREW_TAP_TOKEN }}
        run: |
-          tag="${GITHUB_REF#refs/tags/}"
+          tag="${{ github.event.inputs.tag || github.ref_name }}"
          base="https://github.com/0xMassi/webclaw/releases/download/${tag}"

          # Download all tarballs (Linux + macOS) and compute SHAs
--- a/.mcp.json
+++ b/.mcp.json
@ -0,0 +1,7 @@
+{
+  "mcpServers": {
+    "webclaw": {
+      "command": "~/.webclaw/webclaw-mcp"
+    }
+  }
+}
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@ -3,6 +3,38 @@
 All notable changes to webclaw are documented here.
 Format follows [Keep a Changelog](https://keepachangelog.com/).

+## [Unreleased]
+
+## [0.6.11] - 2026-06-16
+
+### Added
+- New **Google Gemini** provider in the LLM provider chain. Set `GEMINI_API_KEY` (and optionally `GEMINI_MODEL`, default `gemini-2.5-flash`) to enable it; the chain tries Ollama → OpenAI → Gemini → Anthropic and uses the first available provider.
+
+### Fixed
+- The Anthropic provider's default model pointed at a retired model id that now returns `404`, which could fail extraction/summarization when falling back to Anthropic. It now defaults to a current model and is overridable via `ANTHROPIC_MODEL`.
+
+## [0.6.10] - 2026-06-15
+
+### Fixed
+- MCP tools that take numeric arguments now accept those values whether the client sends them as numbers or as numeric strings. Some MCP clients (e.g. Claude Desktop) send `"5"` instead of `5`, which previously failed the call with a deserialization error. Affects `crawl` (depth, max_pages, concurrency), `batch` (concurrency), `search` (num_results), and `summarize` (max_sentences).
+
+## [0.6.9] - 2026-06-10
+
+### Fixed
+- The multi-arch Docker image (linux/amd64 + linux/arm64) now publishes reliably on each release. The build moved to Buildx so registry pushes no longer fail intermittently, and the Homebrew formula update that depends on it is no longer skipped.
+
+## [0.6.8] - 2026-06-10
+
+### Fixed
+- Pages with multibyte text (accented or CJK characters) no longer panic or get mangled during extraction. API-endpoint discovery now cuts oversized scripts on a character boundary instead of crashing mid-character, and structured-data parsing preserves non-ASCII string values instead of turning them into mojibake.
+- LLM error messages from a provider no longer panic when the error body contains multibyte characters near the truncation point.
+- LLM provider requests now have explicit connect and overall timeouts, so a stalled or unreachable provider fails fast instead of hanging.
+- Batch extraction in the MCP server no longer aborts the whole batch when a single URL fails to resolve; bad URLs are reported as individual per-URL errors and the rest still run.
+- CLI crawl and batch runs now wait for the completion webhook to actually send before exiting, replacing a fixed delay that could cut the request off or waste time.
+- Homepage warm-up requests now include the port for hosts on a non-default port, so those sites are warmed correctly.
+
+---
+
 ## [0.6.7] — 2026-06-09

 ### Changed
--- a/Cargo.lock
+++ b/Cargo.lock
@ -3221,7 +3221,7 @@ dependencies = [

 [[package]]
 name = "webclaw-cli"
-version = "0.6.7"
+version = "0.6.11"
 dependencies = [
 "clap",
 "dotenvy",
@ -3242,7 +3242,7 @@ dependencies = [

 [[package]]
 name = "webclaw-core"
-version = "0.6.7"
+version = "0.6.11"
 dependencies = [
 "ego-tree",
 "once_cell",
@ -3260,7 +3260,7 @@ dependencies = [

 [[package]]
 name = "webclaw-fetch"
-version = "0.6.7"
+version = "0.6.11"
 dependencies = [
 "async-trait",
 "bytes",
@ -3287,7 +3287,7 @@ dependencies = [

 [[package]]
 name = "webclaw-llm"
-version = "0.6.7"
+version = "0.6.11"
 dependencies = [
 "async-trait",
 "reqwest",
@ -3300,7 +3300,7 @@ dependencies = [

 [[package]]
 name = "webclaw-mcp"
-version = "0.6.7"
+version = "0.6.11"
 dependencies = [
 "dirs",
 "dotenvy",
@ -3320,7 +3320,7 @@ dependencies = [

 [[package]]
 name = "webclaw-pdf"
-version = "0.6.7"
+version = "0.6.11"
 dependencies = [
 "pdf-extract",
 "thiserror",
@ -3329,7 +3329,7 @@ dependencies = [

 [[package]]
 name = "webclaw-server"
-version = "0.6.7"
+version = "0.6.11"
 dependencies = [
 "anyhow",
 "axum",
--- a/Cargo.toml
+++ b/Cargo.toml
@ -3,7 +3,7 @@ resolver = "2"
 members = ["crates/*"]

 [workspace.package]
-version = "0.6.7"
+version = "0.6.11"
 edition = "2024"
 license = "AGPL-3.0"
 repository = "https://github.com/0xMassi/webclaw"
--- a/Dockerfile.ci
+++ b/Dockerfile.ci
@ -1,7 +1,6 @@
 # Slim runtime image — uses pre-built binaries from the release.
 # The full Dockerfile (multi-stage Rust build) is for local development.
 # CI uses this to avoid 60+ min QEMU cross-compilation.
-ARG BINARY_DIR=binaries

 FROM ubuntu:24.04

@ -10,10 +9,13 @@ FROM ubuntu:24.04
 # CI runners and breaks the multi-arch release build. No build-time network.
 COPY --from=gcr.io/distroless/static-debian12 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt

-ARG BINARY_DIR
-COPY ${BINARY_DIR}/webclaw /usr/local/bin/webclaw
-COPY ${BINARY_DIR}/webclaw-mcp /usr/local/bin/webclaw-mcp
-COPY ${BINARY_DIR}/webclaw-server /usr/local/bin/webclaw-server
+# TARGETARCH (amd64 / arm64) is provided automatically by buildx for each
+# target platform, so one multi-platform build copies the matching binaries.
+# The release workflow stages them in binaries-amd64 / binaries-arm64.
+ARG TARGETARCH
+COPY binaries-${TARGETARCH}/webclaw /usr/local/bin/webclaw
+COPY binaries-${TARGETARCH}/webclaw-mcp /usr/local/bin/webclaw-mcp
+COPY binaries-${TARGETARCH}/webclaw-server /usr/local/bin/webclaw-server

 # Default REST API port when running `webclaw-server` inside the container.
 EXPOSE 3000
@ -25,8 +27,9 @@ ENV WEBCLAW_HOST=0.0.0.0

 # Entrypoint shim: forwards webclaw args/URL to the binary, but exec's other
 # commands directly so this image can be used as a FROM base with custom CMD.
-COPY docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
-RUN chmod +x /usr/local/bin/docker-entrypoint.sh
+# `--chmod` sets the bit at copy time so the build needs no in-container `RUN`
+# (and thus no QEMU emulation for the arm64 platform).
+COPY --chmod=755 docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh

 ENTRYPOINT ["docker-entrypoint.sh"]
 CMD ["webclaw", "--help"]
--- a/README.md
+++ b/README.md
@ -77,7 +77,7 @@ brew install webclaw

 ### Prebuilt binaries

-Download macOS and Linux binaries from [GitHub Releases](https://github.com/0xMassi/webclaw/releases).
+Download macOS, Linux, and Windows binaries from [GitHub Releases](https://github.com/0xMassi/webclaw/releases).

 ### Docker

@ -142,7 +142,7 @@ webclaw https://docs.rust-lang.org --crawl --depth 2 --max-pages 50
 - [HTML to Markdown for RAG](examples/html-to-markdown-rag/)
 - [Firecrawl-compatible API](examples/firecrawl-compatible-api/)
 - [MCP web scraping](examples/mcp-web-scraping/)
-- [Proxy-backed crawling](examples/proxy-backed-crawling/)
+- [Proxy-backed crawling with ColdProxy](examples/proxy-backed-crawling/)
 - [Cloudflare diagnostics](examples/cloudflare-diagnostics/)

 ### Extract brand assets
@ -401,6 +401,8 @@ Please remove secrets, cookies, private tokens, and customer data from logs befo
      residential IPv6, and datacenter IPv6 proxy infrastructure across 195+ countries for public data
      collection, regional testing, monitoring, and web scraping workflows. Explore
      <a href="https://coldproxy.com/">ColdProxy</a>'s latest plans and available offers directly on the website.
+      See the <a href="examples/proxy-backed-crawling/#using-coldproxy">proxy-backed crawling guide</a>
+      for a hands-on walkthrough of wiring ColdProxy into webclaw.
    </td>
  </tr>
 </table>
@ -410,6 +412,21 @@ Please remove secrets, cookies, private tokens, and customer data from logs befo
 ## Studio Partners

 <table>
+  <tr>
+    <td width="340" align="center">
+      <a href="https://go.nodemaven.com/webclaw">
+        <img src="./assets/sponsors/nodemaven-banner.png" alt="NodeMaven" width="300" />
+      </a>
+    </td>
+    <td>
+      <strong>NodeMaven</strong> is the most reliable proxy provider with the highest-quality IPs on the market.
+      Best solution for automation, web scraping, SEO research, and social media management: 99.9% uptime,
+      sticky sessions up to 7 days, IP filtering (all proxies under a 97% fraud score), no KYC, and cashback up
+      to 10% on traffic. Use <code>WEBCLAW35</code> for 35% off Mobile and Residential proxies, or
+      <code>WEBCLAW40</code> for 40% off ISP (Static) proxies at
+      <a href="https://go.nodemaven.com/webclaw">NodeMaven</a>.
+    </td>
+  </tr>
  <tr>
    <td width="340" align="center">
      <a href="https://quantumproxies.net/?utm_source=webclaw&utm_medium=github&utm_campaign=sponsor">
@ -448,6 +465,18 @@ Please remove secrets, cookies, private tokens, and customer data from logs befo
      <a href="https://www.rapidproxy.io/?ref=webclaw">Try it free</a>.
    </td>
  </tr>
+  <tr>
+    <td width="340" align="center">
+      <a href="https://mangoproxy.com/?utm_source=github&utm_medium=partner&utm_campaign=0xmassi">
+        <img src="./assets/sponsors/mangoproxy-banner.png" alt="MangoProxy" width="300" />
+      </a>
+    </td>
+    <td>
+      <strong>MangoProxy</strong> provides residential, ISP, datacenter, and mobile proxies across 200+ locations, backed by a 90M+ IP pool with HTTP and SOCKS5 support and high stability for web scraping and data collection at scale.
+      Use code <code>0XMASSI</code> for 8% off ISP (Static) proxies at
+      <a href="https://mangoproxy.com/?utm_source=github&utm_medium=partner&utm_campaign=0xmassi">mangoproxy.com</a>.
+    </td>
+  </tr>
 </table>

 ---
--- a/assets/sponsors/mangoproxy-banner.png
+++ b/assets/sponsors/mangoproxy-banner.png
--- a/assets/sponsors/nodemaven-banner.png
+++ b/assets/sponsors/nodemaven-banner.png
--- a/crates/webclaw-cli/src/main.rs
+++ b/crates/webclaw-cli/src/main.rs
@ -1548,7 +1548,7 @@ async fn run_crawl(cli: &Cli) -> Result<(), String> {
    // Fire webhook on crawl complete
    if let Some(ref webhook_url) = cli.webhook {
        let urls: Vec<&str> = result.pages.iter().map(|p| p.url.as_str()).collect();
-        fire_webhook(
+        let handle = fire_webhook(
            webhook_url,
            &serde_json::json!({
                "event": "crawl_complete",
@ -1559,8 +1559,8 @@ async fn run_crawl(cli: &Cli) -> Result<(), String> {
                "urls": urls,
            }),
        );
-        // Brief pause so the async webhook has time to fire
-        tokio::time::sleep(std::time::Duration::from_millis(500)).await;
+        // Wait for the webhook to finish so the process doesn't exit mid-send.
+        let _ = handle.await;
    }

    if result.errors > 0 {
@ -1658,7 +1658,7 @@ async fn run_batch(cli: &Cli, entries: &[(String, Option<String>)]) -> Result<()
    // Fire webhook on batch complete
    if let Some(ref webhook_url) = cli.webhook {
        let urls: Vec<&str> = results.iter().map(|r| r.url.as_str()).collect();
-        fire_webhook(
+        let handle = fire_webhook(
            webhook_url,
            &serde_json::json!({
                "event": "batch_complete",
@ -1668,7 +1668,7 @@ async fn run_batch(cli: &Cli, entries: &[(String, Option<String>)]) -> Result<()
                "urls": urls,
            }),
        );
-        tokio::time::sleep(std::time::Duration::from_millis(500)).await;
+        let _ = handle.await;
    }

    if errors > 0 {
@ -1742,9 +1742,12 @@ async fn spawn_on_change(cmd: &str, stdin_payload: &[u8]) {
    }
 }

-/// Fire a webhook POST with a JSON payload. Non-blocking — errors logged to stderr.
-/// Auto-detects Discord and Slack webhook URLs and wraps the payload accordingly.
-fn fire_webhook(url: &str, payload: &serde_json::Value) {
+/// Fire a webhook POST with a JSON payload. Spawns the send on a background task
+/// and returns its `JoinHandle` so callers that need delivery (e.g. one-shot
+/// crawl/batch runs that exit immediately after) can `.await` it; long-running
+/// loops can drop the handle and let it run fire-and-forget. Errors are logged
+/// to stderr. Auto-detects Discord and Slack webhook URLs and wraps the payload.
+fn fire_webhook(url: &str, payload: &serde_json::Value) -> tokio::task::JoinHandle<()> {
    let url = url.to_string();
    let is_discord = url.contains("discord.com/api/webhooks");
    let is_slack = url.contains("hooks.slack.com");
@ -1806,7 +1809,7 @@ fn fire_webhook(url: &str, payload: &serde_json::Value) {
            },
            Err(e) => eprintln!("[webhook] client error: {e}"),
        }
-    });
+    })
 }

 async fn run_watch(cli: &Cli, urls: &[String]) -> Result<(), String> {
@ -2318,7 +2321,7 @@ async fn run_batch_llm(cli: &Cli, entries: &[(String, Option<String>)]) -> Resul
    eprintln!("Processed {total} URLs ({ok} ok, {errors} errors)");

    if let Some(ref webhook_url) = cli.webhook {
-        fire_webhook(
+        let handle = fire_webhook(
            webhook_url,
            &serde_json::json!({
                "event": "batch_llm_complete",
@ -2327,7 +2330,7 @@ async fn run_batch_llm(cli: &Cli, entries: &[(String, Option<String>)]) -> Resul
                "errors": errors,
            }),
        );
-        tokio::time::sleep(std::time::Duration::from_millis(500)).await;
+        let _ = handle.await;
    }

    if errors > 0 {
--- a/crates/webclaw-core/src/endpoints.rs
+++ b/crates/webclaw-core/src/endpoints.rs
@ -233,7 +233,13 @@ pub fn extract_endpoints(
        }
        let slice = if text.len() > *budget {
            *truncated = true;
-            &text[..*budget]
+            // Snap the cut to a UTF-8 char boundary so non-ASCII content
+            // (multibyte codepoints straddling the budget) can't panic.
+            let mut cut = (*budget).min(text.len());
+            while cut > 0 && !text.is_char_boundary(cut) {
+                cut -= 1;
+            }
+            &text[..cut]
        } else {
            text
        };
@ -512,4 +518,16 @@ mod tests {
        );
        assert!(r.hosts.iter().any(|h| h == "pubapi.ticketmaster.co.uk"));
    }
+
+    #[test]
+    fn scan_truncation_at_non_ascii_boundary_does_not_panic() {
+        // A bundle just over the scan budget, padded with a multibyte char
+        // ('é' is 2 bytes) so the cut lands mid-codepoint. The old
+        // `&text[..budget]` slice panicked here; the boundary snap must not.
+        let pad = "é".repeat(MAX_SCAN_BYTES); // ~2× budget in bytes
+        let bundle = format!("{pad} fetch(\"/api/x\")");
+        let bundles = vec![("big.js".to_string(), bundle)];
+        let r = extract_endpoints("<html></html>", "https://example.com/", &bundles);
+        assert!(r.truncated, "oversized bundle should mark truncated");
+    }
 }
--- a/crates/webclaw-core/src/structured_data.rs
+++ b/crates/webclaw-core/src/structured_data.rs
@ -178,7 +178,12 @@ pub fn extract_sveltekit(html: &str) -> Vec<Value> {
 /// Preserves already-quoted keys and string values.
 fn js_literal_to_json(input: &str) -> String {
    let bytes = input.as_bytes();
-    let mut out = String::with_capacity(input.len() + input.len() / 10);
+    // Accumulate raw bytes, not `byte as char`. The input is valid UTF-8 and we
+    // only ever copy its bytes verbatim or insert ASCII quotes, so the result is
+    // guaranteed valid UTF-8 — copying byte-by-byte preserves multibyte
+    // codepoints (e.g. accented/CJK string values) instead of mangling them
+    // into Latin-1 mojibake.
+    let mut out: Vec<u8> = Vec::with_capacity(input.len() + input.len() / 10);
    let mut i = 0;
    let len = bytes.len();

@ -187,14 +192,14 @@ fn js_literal_to_json(input: &str) -> String {

        // Skip through strings
        if b == b'"' {
-            out.push('"');
+            out.push(b'"');
            i += 1;
            while i < len {
                let c = bytes[i];
-                out.push(c as char);
+                out.push(c);
                i += 1;
                if c == b'\\' && i < len {
-                    out.push(bytes[i] as char);
+                    out.push(bytes[i]);
                    i += 1;
                } else if c == b'"' {
                    break;
@ -205,11 +210,11 @@ fn js_literal_to_json(input: &str) -> String {

        // After { or , — look for unquoted key followed by :
        if (b == b'{' || b == b',' || b == b'[') && i + 1 < len {
-            out.push(b as char);
+            out.push(b);
            i += 1;
            // Skip whitespace
            while i < len && bytes[i].is_ascii_whitespace() {
-                out.push(bytes[i] as char);
+                out.push(bytes[i]);
                i += 1;
            }
            // Check if next is an unquoted identifier (key)
@ -218,29 +223,30 @@ fn js_literal_to_json(input: &str) -> String {
                while i < len && (bytes[i].is_ascii_alphanumeric() || bytes[i] == b'_') {
                    i += 1;
                }
-                let key = &input[key_start..i];
+                let key = &bytes[key_start..i];
                // Skip whitespace after key
                while i < len && bytes[i].is_ascii_whitespace() {
                    i += 1;
                }
                // If followed by :, it's an unquoted key — quote it
                if i < len && bytes[i] == b':' {
-                    out.push('"');
-                    out.push_str(key);
-                    out.push('"');
+                    out.push(b'"');
+                    out.extend_from_slice(key);
+                    out.push(b'"');
                } else {
                    // Not a key — might be a bare value like true/false/null
-                    out.push_str(key);
+                    out.extend_from_slice(key);
                }
            }
            continue;
        }

-        out.push(b as char);
+        out.push(b);
        i += 1;
    }

-    out
+    // Safe: we only copied bytes from valid-UTF-8 `input` plus ASCII quotes.
+    String::from_utf8(out).unwrap_or_else(|e| String::from_utf8_lossy(e.as_bytes()).into_owned())
 }

 /// Replace raw newlines/tabs inside JSON string values with escape sequences.
@ -440,4 +446,17 @@ newline"}"#;
        assert_eq!(parsed["text"], "line1\nline2");
        assert_eq!(parsed["raw"], "has\nnewline");
    }
+
+    #[test]
+    fn js_literal_to_json_preserves_multibyte_utf8() {
+        // Unquoted ASCII keys with accented and CJK string values (the shape
+        // SvelteKit emits). The old `byte as char` path turned the multibyte
+        // values into Latin-1 mojibake; they must now survive intact.
+        let input = r#"{name:"déjà vu", city:"東京", emoji:"🌱"}"#;
+        let json = js_literal_to_json(input);
+        let parsed: Value = serde_json::from_str(&json).unwrap();
+        assert_eq!(parsed["name"], "déjà vu");
+        assert_eq!(parsed["city"], "東京");
+        assert_eq!(parsed["emoji"], "🌱");
+    }
 }
--- a/crates/webclaw-fetch/src/client.rs
+++ b/crates/webclaw-fetch/src/client.rs
@ -801,11 +801,17 @@ fn is_challenge_html(html: &str) -> bool {
    false
 }

-/// Extract the homepage URL (scheme + host) from a full URL.
+/// Extract the homepage URL (scheme + host[:port]) from a full URL.
 fn extract_homepage(url: &str) -> Option<String> {
-    url::Url::parse(url)
-        .ok()
-        .map(|u| format!("{}://{}/", u.scheme(), u.host_str().unwrap_or("")))
+    url::Url::parse(url).ok().map(|u| {
+        let host = u.host_str().unwrap_or("");
+        // `port()` is `Some` only for a non-default port; include it so a
+        // host like example.com:8443 is warmed on the right port.
+        match u.port() {
+            Some(port) => format!("{}://{}:{}/", u.scheme(), host, port),
+            None => format!("{}://{}/", u.scheme(), host),
+        }
+    })
 }

 /// Convert a webclaw-pdf PdfResult into a webclaw-core ExtractionResult.
--- a/crates/webclaw-llm/src/chain.rs
+++ b/crates/webclaw-llm/src/chain.rs
@ -1,5 +1,5 @@
 /// Provider chain — tries providers in order until one succeeds.
-/// Default order: Ollama (local, free) -> OpenAI -> Anthropic.
+/// Default order: Ollama (local, free) -> OpenAI -> Gemini -> Anthropic.
 /// Only includes providers that are actually configured/available.
 use async_trait::async_trait;
 use tracing::{debug, warn};
@ -7,7 +7,8 @@ use tracing::{debug, warn};
 use crate::error::LlmError;
 use crate::provider::{CompletionRequest, LlmProvider};
 use crate::providers::{
-    anthropic::AnthropicProvider, ollama::OllamaProvider, openai::OpenAiProvider,
+    anthropic::AnthropicProvider, gemini::GeminiProvider, ollama::OllamaProvider,
+    openai::OpenAiProvider,
 };

 pub struct ProviderChain {
@ -15,9 +16,11 @@ pub struct ProviderChain {
 }

 impl ProviderChain {
-    /// Build the default chain: Ollama -> OpenAI -> Anthropic.
+    /// Build the default chain: Ollama -> OpenAI -> Gemini -> Anthropic.
    /// Ollama is always added (availability checked at call time).
    /// Cloud providers are only added if their API keys are configured.
+    /// Gemini sits ahead of Anthropic so Google Cloud credits are preferred,
+    /// with Anthropic as the last-resort fallback.
    pub async fn default() -> Self {
        let mut providers: Vec<Box<dyn LlmProvider>> = Vec::new();

@ -34,6 +37,11 @@ impl ProviderChain {
            providers.push(Box::new(openai));
        }

+        if let Some(gemini) = GeminiProvider::new(None, None, None) {
+            debug!("gemini configured, adding to chain");
+            providers.push(Box::new(gemini));
+        }
+
        if let Some(anthropic) = AnthropicProvider::with_base_url(None, None, None) {
            debug!("anthropic configured, adding to chain");
            providers.push(Box::new(anthropic));
--- a/crates/webclaw-llm/src/lib.rs
+++ b/crates/webclaw-llm/src/lib.rs
@ -1,6 +1,6 @@
 /// webclaw-llm: LLM integration with local-first hybrid architecture.
 ///
-/// Provider chain tries Ollama (local) first, falls back to OpenAI, then Anthropic.
+/// Provider chain tries Ollama (local) first, falls back to OpenAI, then Gemini, then Anthropic.
 /// Provides schema-based extraction, prompt extraction, and summarization
 /// on top of webclaw-core's content pipeline.
 pub mod chain;
--- a/crates/webclaw-llm/src/providers/anthropic.rs
+++ b/crates/webclaw-llm/src/providers/anthropic.rs
@ -1,6 +1,8 @@
 /// Anthropic provider — Claude models via api.anthropic.com.
 /// Anthropic's API differs from OpenAI: system message is a top-level param,
 /// not part of the messages array.
+use std::time::Duration;
+
 use async_trait::async_trait;
 use serde_json::json;

@ -35,14 +37,20 @@ impl AnthropicProvider {
        let key = load_api_key(key_override, "ANTHROPIC_API_KEY")?;

        Some(Self {
-            client: reqwest::Client::new(),
+            client: reqwest::Client::builder()
+                .timeout(Duration::from_secs(120))
+                .connect_timeout(Duration::from_secs(10))
+                .build()
+                .unwrap_or_else(|_| reqwest::Client::new()),
            key,
            base_url: base_url
                .or_else(|| std::env::var("ANTHROPIC_BASE_URL").ok())
                .unwrap_or_else(|| DEFAULT_ANTHROPIC_BASE_URL.into())
                .trim_end_matches('/')
                .to_string(),
-            default_model: model.unwrap_or_else(|| "claude-sonnet-4-20250514".into()),
+            default_model: model
+                .or_else(|| std::env::var("ANTHROPIC_MODEL").ok())
+                .unwrap_or_else(|| "claude-sonnet-4-6".into()),
        })
    }

@ -108,11 +116,7 @@ impl LlmProvider for AnthropicProvider {
        if !resp.status().is_success() {
            let status = resp.status();
            let text = resp.text().await.unwrap_or_default();
-            let safe_text = if text.len() > 500 {
-                &text[..500]
-            } else {
-                &text
-            };
+            let safe_text = text.chars().take(500).collect::<String>();
            return Err(LlmError::ProviderError(format!(
                "anthropic returned {status}: {safe_text}"
            )));
@ -156,7 +160,7 @@ mod tests {
        let provider =
            AnthropicProvider::new(Some("sk-ant-test".into()), None).expect("should construct");
        assert_eq!(provider.name(), "anthropic");
-        assert_eq!(provider.default_model, "claude-sonnet-4-20250514");
+        assert_eq!(provider.default_model, "claude-sonnet-4-6");
        assert_eq!(provider.key, "sk-ant-test");
        assert_eq!(provider.base_url, "https://api.anthropic.com/v1");
        assert_eq!(
@ -176,7 +180,7 @@ mod tests {
    #[test]
    fn default_model_accessor() {
        let provider = AnthropicProvider::new(Some("sk-ant-test".into()), None).unwrap();
-        assert_eq!(provider.default_model(), "claude-sonnet-4-20250514");
+        assert_eq!(provider.default_model(), "claude-sonnet-4-6");
    }

    #[test]
--- a/crates/webclaw-llm/src/providers/gemini.rs
+++ b/crates/webclaw-llm/src/providers/gemini.rs
@ -0,0 +1,363 @@
+/// Google Gemini provider — Gemini models via the Generative Language API.
+/// Gemini's request shape differs from OpenAI/Anthropic: the system message is a
+/// top-level `systemInstruction`, conversation turns live in `contents` (with the
+/// assistant role renamed to `model`), and generation knobs sit under
+/// `generationConfig`. API-key auth is sent as an `x-goog-api-key` header.
+use std::time::Duration;
+
+use async_trait::async_trait;
+use serde_json::json;
+
+use crate::clean::strip_thinking_tags;
+use crate::error::LlmError;
+use crate::provider::{CompletionRequest, LlmProvider};
+
+use super::load_api_key;
+
+const DEFAULT_GEMINI_BASE_URL: &str = "https://generativelanguage.googleapis.com/v1beta";
+/// Default model. Gemini 2.5 Flash/Pro are "thinking" models: internal reasoning
+/// tokens count against `maxOutputTokens`, so the output budget must comfortably
+/// exceed the visible response (see `request_body`) or the model returns
+/// `finishReason=MAX_TOKENS` with no text. Set `GEMINI_MODEL` to a non-thinking
+/// model (e.g. `gemini-2.0-flash`) to avoid the reasoning overhead entirely.
+const DEFAULT_GEMINI_MODEL: &str = "gemini-2.5-flash";
+
+/// Gemini puts the model in the URL path, so only plain model identifiers are
+/// safe to interpolate. Real model names are ASCII alphanumerics plus `-`/`.`/`_`
+/// (e.g. `gemini-2.5-flash`, `gemini-2.0-flash-001`); anything else (`/`, `:`,
+/// `?`, `#`, whitespace) could redirect the request to a different path/method.
+fn is_safe_model_name(model: &str) -> bool {
+    !model.is_empty()
+        && model
+            .bytes()
+            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'-' | b'.' | b'_'))
+}
+
+pub struct GeminiProvider {
+    client: reqwest::Client,
+    key: String,
+    base_url: String,
+    default_model: String,
+}
+
+impl GeminiProvider {
+    /// Returns `None` if no API key is available (param or `GEMINI_API_KEY` env).
+    pub fn new(
+        key_override: Option<String>,
+        base_url: Option<String>,
+        model: Option<String>,
+    ) -> Option<Self> {
+        let key = load_api_key(key_override, "GEMINI_API_KEY")?;
+
+        Some(Self {
+            client: reqwest::Client::builder()
+                .timeout(Duration::from_secs(120))
+                .connect_timeout(Duration::from_secs(10))
+                .build()
+                .unwrap_or_else(|_| reqwest::Client::new()),
+            key,
+            base_url: base_url
+                .or_else(|| std::env::var("GEMINI_BASE_URL").ok())
+                .unwrap_or_else(|| DEFAULT_GEMINI_BASE_URL.into())
+                .trim_end_matches('/')
+                .to_string(),
+            default_model: model
+                .or_else(|| std::env::var("GEMINI_MODEL").ok())
+                .unwrap_or_else(|| DEFAULT_GEMINI_MODEL.into()),
+        })
+    }
+
+    pub fn default_model(&self) -> &str {
+        &self.default_model
+    }
+
+    /// Build the `generateContent` body from a generic completion request.
+    /// System messages become `systemInstruction`; user/assistant turns become
+    /// `contents` (assistant → `model`); `json_mode` constrains the model to
+    /// valid JSON via `responseMimeType`.
+    fn request_body(&self, request: &CompletionRequest) -> serde_json::Value {
+        let contents: Vec<serde_json::Value> = request
+            .messages
+            .iter()
+            .filter(|m| m.role != "system")
+            .map(|m| {
+                let role = if m.role == "assistant" {
+                    "model"
+                } else {
+                    "user"
+                };
+                json!({ "role": role, "parts": [{ "text": m.content }] })
+            })
+            .collect();
+
+        let system_parts: Vec<serde_json::Value> = request
+            .messages
+            .iter()
+            .filter(|m| m.role == "system")
+            .map(|m| json!({ "text": m.content }))
+            .collect();
+
+        // `maxOutputTokens` is a ceiling, not a reservation — you're billed per
+        // token actually produced — so default generously. Gemini 2.5 "thinking"
+        // models spend part of this budget on internal reasoning; too low a cap
+        // makes them return `finishReason=MAX_TOKENS` with no visible text.
+        let mut generation_config = json!({
+            "maxOutputTokens": request.max_tokens.unwrap_or(8192),
+        });
+        if let Some(temp) = request.temperature {
+            generation_config["temperature"] = json!(temp);
+        }
+        if request.json_mode {
+            generation_config["responseMimeType"] = json!("application/json");
+        }
+
+        let mut body = json!({
+            "contents": contents,
+            "generationConfig": generation_config,
+        });
+
+        // Gemini rejects an empty `systemInstruction`, so only attach it when a
+        // system message is actually present.
+        if !system_parts.is_empty() {
+            body["systemInstruction"] = json!({ "parts": system_parts });
+        }
+
+        body
+    }
+}
+
+#[async_trait]
+impl LlmProvider for GeminiProvider {
+    async fn complete(&self, request: &CompletionRequest) -> Result<String, LlmError> {
+        let model = if request.model.is_empty() {
+            &self.default_model
+        } else {
+            &request.model
+        };
+
+        // The model goes in the URL path (Gemini's API requires it there, unlike
+        // OpenAI/Anthropic which take it in the body), so reject anything that
+        // isn't a plain model identifier to prevent path/query injection from a
+        // caller-supplied `request.model`.
+        if !is_safe_model_name(model) {
+            return Err(LlmError::ProviderError(format!(
+                "invalid gemini model name: {model:?}"
+            )));
+        }
+
+        let body = self.request_body(request);
+
+        // API-key auth goes in the header, never the URL, so the key can't leak
+        // into request logs, proxies, or referrer headers.
+        let url = format!("{}/models/{model}:generateContent", self.base_url);
+        let resp = self
+            .client
+            .post(&url)
+            .header("x-goog-api-key", &self.key)
+            .header("content-type", "application/json")
+            .json(&body)
+            .send()
+            .await?;
+
+        if !resp.status().is_success() {
+            let status = resp.status();
+            let text = resp.text().await.unwrap_or_default();
+            let safe_text = text.chars().take(500).collect::<String>();
+            return Err(LlmError::ProviderError(format!(
+                "gemini returned {status}: {safe_text}"
+            )));
+        }
+
+        // Cap response body size to defend against adversarial payloads.
+        let json = super::response_json_capped(resp).await?;
+
+        // Gemini response: {"candidates":[{"content":{"parts":[{"text":"..."}]}}]}.
+        // A candidate may carry multiple text parts; concatenate them in order.
+        let text = json["candidates"][0]["content"]["parts"]
+            .as_array()
+            .map(|parts| {
+                parts
+                    .iter()
+                    .filter_map(|p| p["text"].as_str())
+                    .collect::<String>()
+            })
+            .unwrap_or_default();
+
+        if text.is_empty() {
+            // No usable text. Surface Gemini's finishReason (or a prompt-level
+            // block reason) so MAX_TOKENS — e.g. a "thinking" model that spent
+            // its whole maxOutputTokens budget on reasoning — and SAFETY blocks
+            // are visible in logs/telemetry instead of masquerading as a parse
+            // failure. The chain falls through to the next provider on any Err.
+            let reason = json["candidates"][0]["finishReason"]
+                .as_str()
+                .or_else(|| json["promptFeedback"]["blockReason"].as_str())
+                .unwrap_or("unknown");
+            return Err(LlmError::ProviderError(format!(
+                "gemini returned no text (finishReason={reason})"
+            )));
+        }
+
+        Ok(strip_thinking_tags(&text))
+    }
+
+    async fn is_available(&self) -> bool {
+        !self.key.is_empty()
+    }
+
+    fn name(&self) -> &str {
+        "gemini"
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::provider::Message;
+
+    fn provider() -> GeminiProvider {
+        GeminiProvider::new(Some("test-key".into()), None, None).expect("should construct")
+    }
+
+    fn msg(role: &str, content: &str) -> Message {
+        Message {
+            role: role.into(),
+            content: content.into(),
+        }
+    }
+
+    fn request(messages: Vec<Message>, json_mode: bool) -> CompletionRequest {
+        CompletionRequest {
+            model: String::new(),
+            messages,
+            temperature: None,
+            max_tokens: None,
+            json_mode,
+        }
+    }
+
+    #[test]
+    fn empty_key_returns_none() {
+        assert!(GeminiProvider::new(Some(String::new()), None, None).is_none());
+    }
+
+    #[test]
+    fn model_name_validation_blocks_path_injection() {
+        // Real model identifiers pass.
+        assert!(is_safe_model_name("gemini-2.5-flash"));
+        assert!(is_safe_model_name("gemini-2.0-flash-001"));
+        assert!(is_safe_model_name("gemini-1.5-pro-002"));
+        // Anything that could alter the request path/method is rejected.
+        assert!(!is_safe_model_name(""));
+        assert!(!is_safe_model_name(
+            "gemini-2.5-flash:streamGenerateContent"
+        ));
+        assert!(!is_safe_model_name("../../models/x"));
+        assert!(!is_safe_model_name("model?alt=sse"));
+        assert!(!is_safe_model_name("a b"));
+    }
+
+    #[test]
+    fn explicit_key_constructs_with_defaults() {
+        let p = provider();
+        assert_eq!(p.name(), "gemini");
+        assert_eq!(p.key, "test-key");
+        assert_eq!(p.default_model, DEFAULT_GEMINI_MODEL);
+        assert_eq!(p.default_model(), DEFAULT_GEMINI_MODEL);
+        assert_eq!(p.base_url, DEFAULT_GEMINI_BASE_URL);
+    }
+
+    #[test]
+    fn custom_base_url_trims_trailing_slash_and_model() {
+        let p = GeminiProvider::new(
+            Some("test-key".into()),
+            Some("https://example.test/v1beta/".into()),
+            Some("gemini-2.5-pro".into()),
+        )
+        .unwrap();
+        assert_eq!(p.base_url, "https://example.test/v1beta");
+        assert_eq!(p.default_model, "gemini-2.5-pro");
+    }
+
+    #[test]
+    fn maps_user_and_assistant_roles_into_contents() {
+        let p = provider();
+        let body = p.request_body(&request(
+            vec![msg("user", "hello"), msg("assistant", "hi there")],
+            false,
+        ));
+        let contents = body["contents"].as_array().unwrap();
+        assert_eq!(contents.len(), 2);
+        assert_eq!(contents[0]["role"], "user");
+        assert_eq!(contents[0]["parts"][0]["text"], "hello");
+        // assistant must be renamed to Gemini's "model" role.
+        assert_eq!(contents[1]["role"], "model");
+        assert_eq!(contents[1]["parts"][0]["text"], "hi there");
+        // No system message -> no systemInstruction key at all.
+        assert!(body.get("systemInstruction").is_none());
+    }
+
+    #[test]
+    fn system_message_becomes_system_instruction_not_contents() {
+        let p = provider();
+        let body = p.request_body(&request(
+            vec![msg("system", "be terse"), msg("user", "hello")],
+            false,
+        ));
+        let contents = body["contents"].as_array().unwrap();
+        assert_eq!(contents.len(), 1, "system message lifted out of contents");
+        assert_eq!(contents[0]["role"], "user");
+        assert_eq!(body["systemInstruction"]["parts"][0]["text"], "be terse");
+    }
+
+    #[test]
+    fn json_mode_toggles_response_mime_type() {
+        let p = provider();
+        let on = p.request_body(&request(vec![msg("user", "x")], true));
+        assert_eq!(
+            on["generationConfig"]["responseMimeType"],
+            "application/json"
+        );
+        let off = p.request_body(&request(vec![msg("user", "x")], false));
+        assert!(off["generationConfig"].get("responseMimeType").is_none());
+    }
+
+    #[test]
+    fn max_output_tokens_default_and_temperature_override() {
+        let p = provider();
+        let default_body = p.request_body(&request(vec![msg("user", "x")], false));
+        assert_eq!(default_body["generationConfig"]["maxOutputTokens"], 8192);
+        // No temperature set -> key omitted.
+        assert!(
+            default_body["generationConfig"]
+                .get("temperature")
+                .is_none()
+        );
+
+        let mut req = request(vec![msg("user", "x")], false);
+        req.max_tokens = Some(256);
+        req.temperature = Some(0.5); // 0.5 is exact in both f32 and f64
+        let body = p.request_body(&req);
+        assert_eq!(body["generationConfig"]["maxOutputTokens"], 256);
+        assert_eq!(body["generationConfig"]["temperature"], 0.5);
+    }
+
+    // Env var fallback tests mutate process-global state and race with parallel
+    // tests. Run in isolation if needed (no name filter, so both ignored tests
+    // below are selected):
+    //   cargo test -p webclaw-llm -- --ignored --test-threads=1
+    #[test]
+    #[ignore = "mutates process env; run with --test-threads=1"]
+    fn env_var_key_fallback() {
+        unsafe { std::env::set_var("GEMINI_API_KEY", "gemini-env-key") };
+        let p = GeminiProvider::new(None, None, None).expect("should construct from env");
+        assert_eq!(p.key, "gemini-env-key");
+        unsafe { std::env::remove_var("GEMINI_API_KEY") };
+    }
+
+    #[test]
+    #[ignore = "mutates process env; run with --test-threads=1"]
+    fn no_key_returns_none() {
+        unsafe { std::env::remove_var("GEMINI_API_KEY") };
+        assert!(GeminiProvider::new(None, None, None).is_none());
+    }
+}
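Review note: the model-name guard in `gemini.rs` is easy to check on its own. The sketch below copies `is_safe_model_name` from the diff and pairs it with `generate_content_url`, a hypothetical helper that is not part of the commit and only mirrors the validate-then-interpolate order used in `complete`. It uses the standard library only.

```rust
/// Same allow-list as `is_safe_model_name` in the diff: ASCII alphanumerics
/// plus `-`, `.` and `_`, and never empty.
fn is_safe_model_name(model: &str) -> bool {
    !model.is_empty()
        && model
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'-' | b'.' | b'_'))
}

/// Hypothetical helper (not in the commit): validate the model name first,
/// then interpolate it into the `generateContent` path.
fn generate_content_url(base_url: &str, model: &str) -> Result<String, String> {
    if !is_safe_model_name(model) {
        return Err(format!("invalid gemini model name: {model:?}"));
    }
    Ok(format!(
        "{}/models/{model}:generateContent",
        base_url.trim_end_matches('/')
    ))
}
```

Because `:`, `/` and `?` are outside the allow-list, a caller-supplied model such as `x:streamGenerateContent` is rejected before any URL is built.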
--- a/crates/webclaw-llm/src/providers/mod.rs
+++ b/crates/webclaw-llm/src/providers/mod.rs
@ -1,4 +1,5 @@
 pub mod anthropic;
+pub mod gemini;
 pub mod ollama;
 pub mod openai;

--- a/crates/webclaw-llm/src/providers/ollama.rs
+++ b/crates/webclaw-llm/src/providers/ollama.rs
@ -1,5 +1,7 @@
 /// Ollama provider — talks to a local Ollama instance (default localhost:11434).
 /// First choice in the provider chain: free, private, fast on Apple Silicon.
+use std::time::Duration;
+
 use async_trait::async_trait;
 use serde_json::json;

@ -24,7 +26,11 @@ impl OllamaProvider {
            .unwrap_or_else(|| "qwen3:8b".into());

        Self {
-            client: reqwest::Client::new(),
+            client: reqwest::Client::builder()
+                .timeout(Duration::from_secs(120))
+                .connect_timeout(Duration::from_secs(10))
+                .build()
+                .unwrap_or_else(|_| reqwest::Client::new()),
            base_url,
            default_model,
        }
@ -70,11 +76,7 @@ impl LlmProvider for OllamaProvider {
        if !resp.status().is_success() {
            let status = resp.status();
            let text = resp.text().await.unwrap_or_default();
-            let safe_text = if text.len() > 500 {
-                &text[..500]
-            } else {
-                &text
-            };
+            let safe_text = text.chars().take(500).collect::<String>();
            return Err(LlmError::ProviderError(format!(
                "ollama returned {status}: {safe_text}"
            )));
@ -98,7 +100,8 @@ impl LlmProvider for OllamaProvider {

    async fn is_available(&self) -> bool {
        let url = format!("{}/api/tags", self.base_url);
-        matches!(self.client.get(&url).send().await, Ok(r) if r.status().is_success())
+        let req = self.client.get(&url).timeout(Duration::from_secs(10));
+        matches!(req.send().await, Ok(r) if r.status().is_success())
    }

    fn name(&self) -> &str {
--- a/crates/webclaw-llm/src/providers/openai.rs
+++ b/crates/webclaw-llm/src/providers/openai.rs
@ -1,4 +1,6 @@
 /// OpenAI provider — works with api.openai.com and any OpenAI-compatible endpoint.
+use std::time::Duration;
+
 use async_trait::async_trait;
 use serde_json::json;

@ -69,7 +71,11 @@ impl OpenAiProvider {
        let key = load_api_key(key_override, "OPENAI_API_KEY")?;

        Some(Self {
-            client: reqwest::Client::new(),
+            client: reqwest::Client::builder()
+                .timeout(Duration::from_secs(120))
+                .connect_timeout(Duration::from_secs(10))
+                .build()
+                .unwrap_or_else(|_| reqwest::Client::new()),
            key,
            base_url: base_url
                .or_else(|| std::env::var("OPENAI_BASE_URL").ok())
@ -132,11 +138,7 @@ impl LlmProvider for OpenAiProvider {
        if !resp.status().is_success() {
            let status = resp.status();
            let text = resp.text().await.unwrap_or_default();
-            let safe_text = if text.len() > 500 {
-                &text[..500]
-            } else {
-                &text
-            };
+            let safe_text = text.chars().take(500).collect::<String>();
            return Err(LlmError::ProviderError(format!(
                "openai returned {status}: {safe_text}"
            )));
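Review note: the `safe_text` change repeated across the providers replaces a byte slice with a character-based cut. A minimal standard-library sketch of the difference follows; both function names are illustrative and do not appear in the commit. `&text[..500]` panics when byte 500 falls inside a multi-byte UTF-8 sequence, which `str::get` reports as `None` instead.

```rust
/// New behaviour: keep at most `max_chars` Unicode scalar values.
/// This can never split a character, so it cannot panic.
fn truncate_chars(text: &str, max_chars: usize) -> String {
    text.chars().take(max_chars).collect()
}

/// Old behaviour, rewritten with `str::get` so the failure is observable
/// as `None` rather than a panic: slicing at a byte offset only works when
/// the offset lands on a character boundary.
fn truncate_bytes_checked(text: &str, max_bytes: usize) -> Option<&str> {
    if text.len() > max_bytes {
        text.get(..max_bytes)
    } else {
        Some(text)
    }
}
```

One consequence worth keeping in mind: the cap is now 500 characters rather than 500 bytes, so a truncated error body can be up to 2000 bytes of UTF-8.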
--- a/crates/webclaw-mcp/src/server.rs
+++ b/crates/webclaw-mcp/src/server.rs
@ -323,9 +323,10 @@ impl WebclawMcp {
        if params.urls.len() > 100 {
            return Err("batch is limited to 100 URLs per request".into());
        }
-        for u in &params.urls {
-            validate_url(u).await?;
-        }
+        // No up-front DNS pre-validation: it aborted the whole batch on a
+        // single unresolvable URL. The fetch layer applies the same SSRF
+        // guard (validate_public_http_url) per URL, so bad entries surface
+        // as individual per-URL errors below instead of failing the batch.

        let format = params.format.as_deref().unwrap_or("markdown");
        let concurrency = params.concurrency.unwrap_or(5);
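Review note: the behavioural change in the batch tool is the move from "first bad URL aborts everything" to "each URL reports its own outcome". The sketch below illustrates the two shapes with the standard library only. `check_url` is a hypothetical stand-in for the fetch layer's guard; the real `validate_public_http_url` does more (the comment in the diff describes it as an SSRF guard), and none of these function names come from the commit.

```rust
/// Hypothetical stand-in for the per-URL guard. It only checks the scheme;
/// the real guard in the fetch layer is assumed to do considerably more.
fn check_url(url: &str) -> Result<(), String> {
    if url.starts_with("http://") || url.starts_with("https://") {
        Ok(())
    } else {
        Err(format!("unsupported URL: {url}"))
    }
}

/// Old shape: `?` returns on the first failure, so one bad entry
/// fails the whole batch.
fn validate_all_up_front(urls: &[&str]) -> Result<(), String> {
    for u in urls {
        check_url(u)?;
    }
    Ok(())
}

/// New shape: every URL is paired with its own result, so good entries
/// still succeed when a neighbour fails.
fn validate_per_url<'a>(urls: &[&'a str]) -> Vec<(&'a str, Result<(), String>)> {
    urls.iter().map(|u| (*u, check_url(u))).collect()
}
```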
--- a/crates/webclaw-mcp/src/tools.rs
+++ b/crates/webclaw-mcp/src/tools.rs
@ -4,6 +4,61 @@
 use schemars::JsonSchema;
 use serde::Deserialize;

+// ── Coercion helpers ────────────────────────────────────────────────────────
+//
+// MCP clients (Claude Desktop, VS Code extension, etc.) sometimes pass numeric
+// parameters as JSON strings (e.g. `"depth": "3"` instead of `"depth": 3`).
+// serde's default u32/usize deserialisers reject strings with:
+//
+//   "invalid type: string \"3\", expected u32"
+//
+// These two helpers accept both forms transparently so callers never see that
+// error regardless of which representation their client sends.
+
+fn deser_opt_u32_or_str<'de, D>(d: D) -> Result<Option<u32>, D::Error>
+where
+    D: serde::Deserializer<'de>,
+{
+    #[derive(serde::Deserialize)]
+    #[serde(untagged)]
+    enum NumOrStr {
+        Num(u32),
+        Str(String),
+    }
+    match Option::<NumOrStr>::deserialize(d)? {
+        None => Ok(None),
+        Some(NumOrStr::Num(n)) => Ok(Some(n)),
+        Some(NumOrStr::Str(s)) => {
+            s.trim().parse::<u32>().map(Some).map_err(|_| {
+                serde::de::Error::custom(format!("expected a u32, got string \"{s}\""))
+            })
+        }
+    }
+}
+
+fn deser_opt_usize_or_str<'de, D>(d: D) -> Result<Option<usize>, D::Error>
+where
+    D: serde::Deserializer<'de>,
+{
+    #[derive(serde::Deserialize)]
+    #[serde(untagged)]
+    enum NumOrStr {
+        Num(usize),
+        Str(String),
+    }
+    match Option::<NumOrStr>::deserialize(d)? {
+        None => Ok(None),
+        Some(NumOrStr::Num(n)) => Ok(Some(n)),
+        Some(NumOrStr::Str(s)) => {
+            s.trim().parse::<usize>().map(Some).map_err(|_| {
+                serde::de::Error::custom(format!("expected a usize, got string \"{s}\""))
+            })
+        }
+    }
+}
+
+// ── Parameter structs ───────────────────────────────────────────────────────
+
 #[derive(Debug, Deserialize, JsonSchema)]
 pub struct ScrapeParams {
    /// URL to scrape
@ -27,10 +82,13 @@ pub struct CrawlParams {
    /// Seed URL to start crawling from
    pub url: String,
    /// Maximum link depth to follow (default: 2)
+    #[serde(default, deserialize_with = "deser_opt_u32_or_str")]
    pub depth: Option<u32>,
    /// Maximum number of pages to crawl (default: 50)
+    #[serde(default, deserialize_with = "deser_opt_usize_or_str")]
    pub max_pages: Option<usize>,
    /// Number of concurrent requests (default: 5)
+    #[serde(default, deserialize_with = "deser_opt_usize_or_str")]
    pub concurrency: Option<usize>,
    /// Seed the frontier from sitemap discovery before crawling
    pub use_sitemap: Option<bool>,
@ -51,6 +109,7 @@ pub struct BatchParams {
    /// Output format: "markdown" (default), "llm", "text"
    pub format: Option<String>,
    /// Number of concurrent requests (default: 5)
+    #[serde(default, deserialize_with = "deser_opt_usize_or_str")]
    pub concurrency: Option<usize>,
 }

@ -69,6 +128,7 @@ pub struct SummarizeParams {
    /// URL to fetch and summarize
    pub url: String,
    /// Number of sentences in the summary (default: 3)
+    #[serde(default, deserialize_with = "deser_opt_usize_or_str")]
    pub max_sentences: Option<usize>,
 }

@ -101,6 +161,7 @@ pub struct SearchParams {
    /// Search query
    pub query: String,
    /// Number of results to return (default: 10)
+    #[serde(default, deserialize_with = "deser_opt_u32_or_str")]
    pub num_results: Option<u32>,
 }

@ -120,3 +181,179 @@ pub struct VerticalParams {
 /// so rmcp can generate a schema and parse the (empty) JSON-RPC params.
 #[derive(Debug, Deserialize, JsonSchema)]
 pub struct ListExtractorsParams {}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    // ── CrawlParams.depth (u32) ──────────────────────────────────────────────
+
+    #[test]
+    fn crawl_depth_from_numeric_string() {
+        let v: CrawlParams =
+            serde_json::from_str(r#"{"url":"https://x.com","depth":"3"}"#).unwrap();
+        assert_eq!(v.depth, Some(3));
+    }
+
+    #[test]
+    fn crawl_depth_from_number() {
+        let v: CrawlParams = serde_json::from_str(r#"{"url":"https://x.com","depth":3}"#).unwrap();
+        assert_eq!(v.depth, Some(3));
+    }
+
+    #[test]
+    fn crawl_depth_absent_is_none() {
+        let v: CrawlParams = serde_json::from_str(r#"{"url":"https://x.com"}"#).unwrap();
+        assert_eq!(v.depth, None);
+    }
+
+    #[test]
+    fn crawl_depth_non_numeric_string_errors() {
+        let e = serde_json::from_str::<CrawlParams>(r#"{"url":"https://x.com","depth":"abc"}"#);
+        assert!(e.is_err(), "expected Err, got {e:?}");
+    }
+
+    // ── CrawlParams.max_pages (usize) ────────────────────────────────────────
+
+    #[test]
+    fn crawl_max_pages_from_numeric_string() {
+        let v: CrawlParams =
+            serde_json::from_str(r#"{"url":"https://x.com","max_pages":"50"}"#).unwrap();
+        assert_eq!(v.max_pages, Some(50));
+    }
+
+    #[test]
+    fn crawl_max_pages_from_number() {
+        let v: CrawlParams =
+            serde_json::from_str(r#"{"url":"https://x.com","max_pages":50}"#).unwrap();
+        assert_eq!(v.max_pages, Some(50));
+    }
+
+    #[test]
+    fn crawl_max_pages_absent_is_none() {
+        let v: CrawlParams = serde_json::from_str(r#"{"url":"https://x.com"}"#).unwrap();
+        assert_eq!(v.max_pages, None);
+    }
+
+    #[test]
+    fn crawl_max_pages_non_numeric_string_errors() {
+        let e = serde_json::from_str::<CrawlParams>(r#"{"url":"https://x.com","max_pages":"abc"}"#);
+        assert!(e.is_err(), "expected Err, got {e:?}");
+    }
+
+    // ── CrawlParams.concurrency (usize) ──────────────────────────────────────
+
+    #[test]
+    fn crawl_concurrency_from_numeric_string() {
+        let v: CrawlParams =
+            serde_json::from_str(r#"{"url":"https://x.com","concurrency":"5"}"#).unwrap();
+        assert_eq!(v.concurrency, Some(5));
+    }
+
+    #[test]
+    fn crawl_concurrency_from_number() {
+        let v: CrawlParams =
+            serde_json::from_str(r#"{"url":"https://x.com","concurrency":5}"#).unwrap();
+        assert_eq!(v.concurrency, Some(5));
+    }
+
+    #[test]
+    fn crawl_concurrency_absent_is_none() {
+        let v: CrawlParams = serde_json::from_str(r#"{"url":"https://x.com"}"#).unwrap();
+        assert_eq!(v.concurrency, None);
+    }
+
+    #[test]
+    fn crawl_concurrency_non_numeric_string_errors() {
+        let e =
+            serde_json::from_str::<CrawlParams>(r#"{"url":"https://x.com","concurrency":"abc"}"#);
+        assert!(e.is_err(), "expected Err, got {e:?}");
+    }
+
+    // ── BatchParams.concurrency (usize) ──────────────────────────────────────
+
+    #[test]
+    fn batch_concurrency_from_numeric_string() {
+        let v: BatchParams =
+            serde_json::from_str(r#"{"urls":["https://x.com"],"concurrency":"5"}"#).unwrap();
+        assert_eq!(v.concurrency, Some(5));
+    }
+
+    #[test]
+    fn batch_concurrency_from_number() {
+        let v: BatchParams =
+            serde_json::from_str(r#"{"urls":["https://x.com"],"concurrency":5}"#).unwrap();
+        assert_eq!(v.concurrency, Some(5));
+    }
+
+    #[test]
+    fn batch_concurrency_absent_is_none() {
+        let v: BatchParams = serde_json::from_str(r#"{"urls":["https://x.com"]}"#).unwrap();
+        assert_eq!(v.concurrency, None);
+    }
+
+    #[test]
+    fn batch_concurrency_non_numeric_string_errors() {
+        let e = serde_json::from_str::<BatchParams>(
+            r#"{"urls":["https://x.com"],"concurrency":"abc"}"#,
+        );
+        assert!(e.is_err(), "expected Err, got {e:?}");
+    }
+
+    // ── SearchParams.num_results (u32) ───────────────────────────────────────
+
+    #[test]
+    fn search_num_results_from_numeric_string() {
+        let v: SearchParams =
+            serde_json::from_str(r#"{"query":"rust","num_results":"10"}"#).unwrap();
+        assert_eq!(v.num_results, Some(10));
+    }
+
+    #[test]
+    fn search_num_results_from_number() {
+        let v: SearchParams = serde_json::from_str(r#"{"query":"rust","num_results":10}"#).unwrap();
+        assert_eq!(v.num_results, Some(10));
+    }
+
+    #[test]
+    fn search_num_results_absent_is_none() {
+        let v: SearchParams = serde_json::from_str(r#"{"query":"rust"}"#).unwrap();
+        assert_eq!(v.num_results, None);
+    }
+
+    #[test]
+    fn search_num_results_non_numeric_string_errors() {
+        let e = serde_json::from_str::<SearchParams>(r#"{"query":"rust","num_results":"abc"}"#);
+        assert!(e.is_err(), "expected Err, got {e:?}");
+    }
+
+    // ── SummarizeParams.max_sentences (usize) ────────────────────────────────
+
+    #[test]
+    fn summarize_max_sentences_from_numeric_string() {
+        let v: SummarizeParams =
+            serde_json::from_str(r#"{"url":"https://x.com","max_sentences":"3"}"#).unwrap();
+        assert_eq!(v.max_sentences, Some(3));
+    }
+
+    #[test]
+    fn summarize_max_sentences_from_number() {
+        let v: SummarizeParams =
+            serde_json::from_str(r#"{"url":"https://x.com","max_sentences":3}"#).unwrap();
+        assert_eq!(v.max_sentences, Some(3));
+    }
+
+    #[test]
+    fn summarize_max_sentences_absent_is_none() {
+        let v: SummarizeParams = serde_json::from_str(r#"{"url":"https://x.com"}"#).unwrap();
+        assert_eq!(v.max_sentences, None);
+    }
+
+    #[test]
+    fn summarize_max_sentences_non_numeric_string_errors() {
+        let e = serde_json::from_str::<SummarizeParams>(
+            r#"{"url":"https://x.com","max_sentences":"abc"}"#,
+        );
+        assert!(e.is_err(), "expected Err, got {e:?}");
+    }
+}
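Review note: the coercion rule in `deser_opt_u32_or_str` can be exercised without serde. The sketch below is a serde-free illustration, not the code from the commit: `NumOrStr` mirrors the untagged enum in the diff, and `coerce_opt_u32` is a hypothetical function that reproduces only the match arms (absent, number, numeric string, non-numeric string).

```rust
/// Mirrors the untagged `NumOrStr` enum from the diff, without serde derives.
enum NumOrStr {
    Num(u32),
    Str(String),
}

/// Hypothetical serde-free version of the helper's match: numbers pass
/// through, strings are trimmed and parsed, anything unparseable is an error.
fn coerce_opt_u32(value: Option<NumOrStr>) -> Result<Option<u32>, String> {
    match value {
        None => Ok(None),
        Some(NumOrStr::Num(n)) => Ok(Some(n)),
        Some(NumOrStr::Str(s)) => s
            .trim()
            .parse::<u32>()
            .map(Some)
            .map_err(|_| format!("expected a u32, got string \"{s}\"")),
    }
}
```

Since the string form goes through `str::parse::<u32>`, negative values and values above `u32::MAX` are rejected the same way as non-numeric text.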
--- a/examples/proxy-backed-crawling/README.md
+++ b/examples/proxy-backed-crawling/README.md
@ -1,6 +1,68 @@
 # Proxy-Backed Crawling

-Use proxy rotation when you need to distribute a crawl across a proxy pool. webclaw supports a single proxy or a proxy file.
+Use proxy rotation when you need to distribute a crawl across a proxy pool. webclaw supports a single proxy or a proxy file, and accepts any standard HTTP/HTTPS or SOCKS5 proxy URL.
+
+## Using ColdProxy
+
+[ColdProxy](https://coldproxy.com/) is webclaw's infrastructure partner, providing residential IPv4, residential IPv6, and datacenter IPv6 proxies across 195+ countries. Use a ColdProxy endpoint as a full URL with `--proxy` / `WEBCLAW_PROXY`, or list several in a `--proxy-file` pool.
+
+### 1. Get your endpoint
+
+Sign in to your [ColdProxy dashboard](https://coldproxy.com/) and copy your proxy host, port, and credentials. Assemble them into a standard proxy URL:
+
+```text
+http://USERNAME:PASSWORD@HOST:PORT
+```
+
+### 2. One ColdProxy endpoint
+
+```bash
+export WEBCLAW_PROXY="http://USERNAME:PASSWORD@HOST:PORT"
+webclaw https://example.com --format markdown
+```
+
+Or pass it inline:
+
+```bash
+webclaw https://example.com \
+  --proxy "http://USERNAME:PASSWORD@HOST:PORT" \
+  --format markdown
+```
+
+### 3. Rotate a ColdProxy pool
+
+List one ColdProxy endpoint per line in `coldproxy.txt`, using the pool-file format `host:port:user:pass` (lines starting with `#` are ignored). Mix product types and regions to match your workload:
+
+```text
+# residential IPv4
+HOST:PORT:USERNAME:PASSWORD
+# residential IPv6
+HOST:PORT:USERNAME:PASSWORD
+# datacenter IPv6
+HOST:PORT:USERNAME:PASSWORD
+```
+
+webclaw rotates across the pool per request:
+
+```bash
+webclaw https://docs.example.com \
+  --crawl \
+  --depth 2 \
+  --max-pages 200 \
+  --concurrency 10 \
+  --delay 200 \
+  --proxy-file coldproxy.txt \
+  --format markdown
+```
+
+### 4. Target a country
+
+ColdProxy offers access across 195+ countries. Use the country-specific endpoint from your ColdProxy dashboard for each region you want to collect from (for example, a France residential endpoint for fr-localized pages). Add one endpoint per country to your pool file to spread a single crawl across regions.
+
+### Choosing a product
+
+- **Residential IPv4 / IPv6** — suitable for region-specific testing, localized content validation, public data collection, market monitoring, and regional QA.
+- **Datacenter IPv6** — fastest and most cost-effective; best for high-volume crawling of tolerant endpoints.

 ## Single Proxy

@ -20,12 +82,12 @@ webclaw https://example.com \

 ## Proxy Pool

-Create `proxies.txt` with one proxy per line:
+Create `proxies.txt` with one proxy per line in `host:port:user:pass` format (lines starting with `#` are ignored):

 ```text
-http://user:pass@proxy-1.example.com:8080
-http://user:pass@proxy-2.example.com:8080
-http://user:pass@proxy-3.example.com:8080
+proxy-1.example.com:8080:user:pass
+proxy-2.example.com:8080:user:pass
+proxy-3.example.com:8080:user:pass
 ```

 Run a crawl with controlled concurrency:
Author	SHA1	Message	Date
Valerio	0c6f323f51	chore(release): v0.6.11 — Gemini provider + Anthropic model fix	2026-06-16 16:12:11 +02:00
Valerio	d9e3d0b2bb	feat(llm): add Gemini provider and fix stale Anthropic default model Adds a Google Gemini provider (Generative Language API) to the chain, ordered Ollama -> OpenAI -> Gemini -> Anthropic so Google credits are preferred with Anthropic as last-resort fallback. System->systemInstruction, assistant->model, json_mode->responseMimeType; model name validated before URL interpolation; maxOutputTokens defaults high for 2.5 thinking models. Also fixes AnthropicProvider default (retired claude-sonnet-4-20250514 -> 404); now claude-sonnet-4-6, honors ANTHROPIC_MODEL. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 15:52:37 +02:00
Valerio	8a0768526f	chore(mcp): add .mcp.json so Cursor / Open Plugins directories detect the MCP server Declares the webclaw MCP server at the repo root (matches the README manual config). Cursor's plugin scanner looks for .mcp.json/mcp.json at root. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 18:15:19 +02:00
Valerio	e7ec76bce9	docs(sponsors): add MangoProxy studio partner (#60)	2026-06-15 15:06:00 +02:00
Valerio	da6c6af724	chore(release): bump version to 0.6.10 Release the MCP numeric-param string-coercion fix (#58, PR #59): crawl/batch/search/summarize numeric args now accept JSON numbers or numeric strings, fixing clients (e.g. Claude Desktop) that send "5" instead of 5. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 11:27:04 +02:00
Valerio	243e7032d0	Merge pull request #59 from crossi-dev/fix/numeric-params-string-coercion fix: accept numeric MCP params sent as strings (#58)	2026-06-15 11:26:05 +02:00
Valerio	24ae3a7af2	style(mcp): apply rustfmt to numeric param coercion Reformat the string-or-number deserialize helpers and tests to satisfy `cargo fmt --check` (style_edition 2024), which the lint CI job enforces. Formatting only — no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 11:25:55 +02:00
Charles Rossi	b5ee838d5f	fix(tools): accept numeric params as JSON strings MCP clients (Claude Desktop, VS Code Copilot, etc.) serialize numeric tool arguments as JSON strings ("3" instead of 3). serde's built-in u32/usize deserialisers reject these with: invalid type: string "N", expected u32 Add two private coercion helpers — `deser_opt_u32_or_str` and `deser_opt_usize_or_str` — that accept both JSON number and JSON string representations, falling back to `str::parse` for the string form and returning a clear custom error for non-numeric strings. Annotate the six affected optional fields: CrawlParams: depth (u32), max_pages (usize), concurrency (usize) BatchParams: concurrency (usize) SearchParams: num_results (u32) SummarizeParams: max_sentences (usize) Add 24 unit tests (4 per field: numeric string → value, native number → value, absent → None, non-numeric string → Err) verified green via an isolated serde-only crate. Fixes #58	2026-06-15 01:04:35 -03:00
Valerio	28cd53efcb	Merge pull request #57 from raffaelemancuso/patch-1 Add Windows binaries to README	2026-06-12 17:59:55 +02:00
Raffaele Mancuso	c133478994	Add Windows binaries to README	2026-06-12 17:56:47 +02:00
Valerio	3c726060bf	docs(proxy-example): reword residential product line; refresh NodeMaven banner	2026-06-11 15:16:56 +02:00
Valerio	cb78363466	chore(sponsors): update NodeMaven banner to new branding	2026-06-11 11:50:23 +02:00
Valerio	df7336d55b	Merge pull request #56 from 0xMassi/docs/nodemaven-partner docs: add NodeMaven studio partner to README	2026-06-10 17:46:55 +02:00
Valerio	acd3021f38	docs(readme): add NodeMaven studio partner	2026-06-10 17:46:49 +02:00
Valerio	bcc58dbadd	Merge pull request #55 from 0xMassi/fix/docker-multiarch-single-build ci(release): single multi-platform Docker build + dispatch re-publish	2026-06-10 15:56:36 +02:00
Valerio	8015de7db5	ci(release): build the Docker image in one multi-platform pass The per-arch build + 'imagetools create' combine failed at the manifest step with 'v0.6.9-arm64: not found' — buildx's default provenance/SBOM attestations turn each per-arch tag into an index, and assembling them races GHCR's read-after-write. Replace it with a single 'docker buildx build --platform linux/amd64,linux/arm64 --push' (attestations off) so one manifest list is pushed atomically. Dockerfile.ci now selects binaries by TARGETARCH. Adds a workflow_dispatch path to re-publish an existing tag's image without rebuilding binaries or bumping the version.	2026-06-10 15:54:28 +02:00
Valerio	be64409d62	Merge pull request #54 from 0xMassi/fix/docker-multiarch-release chore: release v0.6.9 (fix multi-arch Docker publish)	2026-06-10 15:30:46 +02:00
Valerio	2773474984	chore: release v0.6.9 Publish the multi-arch Docker image with Buildx instead of the legacy docker driver, whose GHCR push intermittently failed with 'unknown blob'. The manifest list is now assembled registry-side with `imagetools create`. This also unblocks the Homebrew formula update, which depends on the Docker job. No library or CLI behavior changes.	2026-06-10 15:30:39 +02:00
Valerio	7dfa180e86	chore: release v0.6.8	2026-06-10 14:42:05 +02:00
Valerio	598f319bf3	Merge pull request #52 from 0xMassi/audit-fixes-2026-06-09 fix: harden LLM providers, UTF-8 handling, and webhook/batch reliability	2026-06-10 14:40:29 +02:00
Valerio	fae2766db1	Merge pull request #53 from 0xMassi/docs-coldproxy docs: add ColdProxy proxy-backed crawling walkthrough	2026-06-10 14:40:01 +02:00
Valerio	d0909a25e3	docs: add ColdProxy proxy-backed crawling walkthrough	2026-06-10 10:42:47 +02:00
Valerio	499345046c	fix: harden LLM providers, UTF-8 handling, and webhook/batch reliability - webclaw-llm: add explicit request + connect timeouts to the reqwest client in every provider (anthropic, openai, ollama) with a shorter timeout on the ollama health check, so a stalled provider fails fast. - webclaw-llm: fix a panic when truncating a provider error body that contains multibyte characters near the 500-char cut (char-safe take). - webclaw-core: snap the endpoint-scan budget cut to a UTF-8 char boundary so oversized scripts with non-ASCII content no longer panic. - webclaw-core: rewrite js_literal_to_json to copy raw bytes instead of `byte as char`, preserving multibyte UTF-8 in SvelteKit string values rather than producing Latin-1 mojibake. - webclaw-cli: have fire_webhook return its JoinHandle and await it at the crawl/batch/batch-llm call sites, removing the fixed 500ms sleeps. - webclaw-mcp: drop the up-front DNS pre-validation loop in batch that aborted the whole request on one bad URL; the fetch layer already applies the same SSRF guard per URL and reports per-URL errors. - webclaw-fetch: include the port in the warmup homepage URL so hosts on a non-default port are warmed correctly. Adds regression tests for the UTF-8 endpoint-scan and SvelteKit cases.	2026-06-09 21:10:15 +02:00
Valerio	d0d7b835f2	docs(readme): update banner to new webclaw branding	2026-06-09 18:53:14 +02:00