Commit graph

223 commits

Author SHA1 Message Date
Valerio
a3d3744104
Merge pull request #77 from 0xMassi/chore/release-0.6.14
chore(release): bump version to 0.6.14
2026-06-27 15:48:26 +02:00
Valerio
25df9ef7b7 chore(release): bump version to 0.6.14
Distribution release — the extraction engine is unchanged from 0.6.13. Ships
the broader-Linux-compatibility work: gnu binaries are now built on glibc 2.35
(so they run on Debian 12 / Ubuntu 22.04), plus new static musl binaries that
run on any Linux (Alpine, Amazon Linux 2023, RHEL 9). Also carries the
create-webclaw Windows/all-platform install fix and the WEBCLAW_API_KEY
setup-script fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 15:44:57 +02:00
Valerio
472f059e4c
Merge pull request #76 from 0xMassi/ci/guard-prereleases
ci(release): guard prerelease tags from clobbering :latest / Homebrew
2026-06-27 14:49:19 +02:00
Valerio
d5d58ab612 ci(release): don't let prerelease tags clobber :latest / Homebrew / stable
A v*-rc* tag triggered the same release pipeline as a stable tag, which
would publish a normal release, overwrite ghcr.io/0xmassi/webclaw:latest,
and repoint the Homebrew formula at the rc — shipping a prerelease to every
stable Docker/brew user. Guard all three on the SemVer prerelease hyphen:

- release: add --prerelease for hyphenated tags
- docker: still push :${tag} (testable rc image) but only move :latest for stable
- homebrew: skip entirely for prereleases

Lets us cut rc tags (e.g. to validate the new musl build job) without
touching stable users.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 14:45:49 +02:00
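
The guard described in d5d58ab612 above can be sketched as a pure function. This is a minimal illustration, not the workflow's actual logic (which presumably lives in CI configuration): `ReleasePlan`, `is_prerelease`, and `plan_release` are hypothetical names, and the handling of `+build` metadata is an assumption based on the SemVer grammar.

```rust
/// What the release pipeline would do for one tag (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct ReleasePlan {
    pub mark_prerelease: bool,  // pass --prerelease to the release step
    pub push_version_tag: bool, // the :<tag> image is pushed either way
    pub move_latest: bool,      // only stable tags move :latest
    pub update_homebrew: bool,  // skipped entirely for prereleases
}

/// SemVer marks a prerelease with a hyphen after the core version.
/// Build metadata (after '+') may itself contain hyphens, so drop it first.
pub fn is_prerelease(tag: &str) -> bool {
    let without_build = tag.split('+').next().unwrap_or(tag);
    without_build.contains('-')
}

/// Map a tag to the three guarded actions listed in the commit message.
pub fn plan_release(tag: &str) -> ReleasePlan {
    let pre = is_prerelease(tag);
    ReleasePlan {
        mark_prerelease: pre,
        push_version_tag: true,
        move_latest: !pre,
        update_homebrew: !pre,
    }
}
```

Under these assumptions, a tag such as `v0.6.15-rc1` still gets a testable image but leaves `:latest` and the Homebrew formula alone.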
Valerio
1d49b4404e
Merge pull request #75 from 0xMassi/feat/musl-static-builds
ci(release): add musl static Linux builds (glibc-independent)
2026-06-27 14:42:29 +02:00
Valerio
77fe3b52e2 ci(release): add musl static Linux builds (glibc-independent)
The gnu Linux binaries are glibc-floored (2.35 after #74), so they still
won't run on Amazon Linux 2023 / RHEL 9 (glibc 2.34), Alpine, or anything
older. Add fully static musl builds that run on ANY Linux regardless of
glibc.

Adds x86_64-unknown-linux-musl and aarch64-unknown-linux-musl to the build
matrix, built with cargo-zigbuild (zig as the C/C++ cross-compiler for
BoringSSL). Build scripts (bindgen) run as the glibc host so libclang loads,
and the linked output is fully static. A native Alpine build can't do this —
its static build scripts can't dlopen libclang.

musl assets ship ALONGSIDE the gnu ones (gnu stays default; musl is the
runs-anywhere fallback). The release job globs *.tar.gz, so the new assets
are checksummed + uploaded automatically; the docker/homebrew jobs enumerate
gnu targets explicitly and are unaffected.

Validated in Docker: cargo-zigbuild produced a fully static aarch64-musl
webclaw-mcp (ldd: not a dynamic executable) that answered an MCP handshake on
Alpine, Debian 11 (glibc 2.31), Debian 12, Amazon Linux 2023 (2.34), and
Ubuntu 24.04 — everywhere, including where the gnu builds fail.

Closes #73

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 13:49:40 +02:00
Valerio
07c105b5cb
Merge pull request #74 from 0xMassi/fix/linux-glibc-ubuntu-2204
ci(release): build Linux binaries on ubuntu-22.04 for older-glibc support
2026-06-27 13:01:24 +02:00
Valerio
a9567aa661 ci(release): build Linux binaries on ubuntu-22.04 for older-glibc support
Linux release binaries were built on ubuntu-latest (now Ubuntu 24.04,
glibc 2.39), so they required GLIBC_2.38 and failed to start on older LTS
distros (Debian 12 = 2.36, Ubuntu 22.04 = 2.35) with:

  libc.so.6: version `GLIBC_2.38' not found (required by webclaw-mcp)

Binaries built against an older glibc run on newer ones, but not the
reverse, so the build host's glibc sets the floor. Pin both Linux targets to
ubuntu-22.04 (glibc 2.35); the aarch64 cross toolchain tracks the runner
distro, so this lowers both.

Note: 2.35 still won't cover Amazon Linux 2023 / RHEL 9 (2.34) — full
coverage needs a musl static build. Tracked in #73.

Refs #73

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 12:33:05 +02:00
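
The floor rule from a9567aa661 above comes down to a version comparison. The sketch below is illustrative only: `parse_glibc` and `can_run` are hypothetical helpers, and the real dynamic loader checks individual symbol versions rather than a single number.

```rust
/// Parse "2.38" or "GLIBC_2.38" into (major, minor).
pub fn parse_glibc(v: &str) -> Option<(u32, u32)> {
    let v = v.strip_prefix("GLIBC_").unwrap_or(v);
    let (major, minor) = v.split_once('.')?;
    Some((major.parse().ok()?, minor.parse().ok()?))
}

/// A binary starts only if the host glibc is at least the highest
/// version it requires. Returns None when either string is malformed.
pub fn can_run(required: &str, host: &str) -> Option<bool> {
    Some(parse_glibc(host)? >= parse_glibc(required)?)
}
```

With the versions quoted in the commit message, `can_run("GLIBC_2.38", "2.36")` is `Some(false)` (the Debian 12 failure), and a binary requiring at most 2.35 still yields `Some(false)` against a 2.34 host such as Amazon Linux 2023.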
Valerio
f528629f2a
Merge pull request #72 from 0xMassi/fix/create-webclaw-windows
fix(create-webclaw): repair binary install on Windows (and all platforms)
2026-06-27 12:12:44 +02:00
Valerio
9af55c2a2d fix(create-webclaw): repair binary install on Windows (and all platforms)
`npx create-webclaw` never used the prebuilt binary on any platform and
silently fell back to `cargo install`, which fails with "'cargo' is not
recognized" / "cargo: not found" unless Rust is installed. Four bugs:

1. Asset name mismatch: getAssetName() hardcoded `webclaw-mcp-<target>`,
   but release assets are `webclaw-<tag>-<target>` (versioned, no `mcp-`
   infix). The `find()` always returned undefined, so the prebuilt path
   was never taken — on every OS, not just Windows. Now the asset name is
   built from the release tag_name + a platform→target map.

2. `unzip` is absent on Windows. The `.zip` branch now uses PowerShell
   `Expand-Archive` (ships with Windows 10/11) and keeps `unzip` only for
   the non-Windows case.

3. The prebuilt failure was swallowed by a bare `catch {}`, hiding the
   real cause (a 403 is almost always a GitHub API rate limit). The error
   is now surfaced, with a rate-limit hint + GITHUB_TOKEN support on the
   api.github.com request (token dropped on CDN redirects).

4. (missed by the report's own suggested fix) Archives extract into a
   `webclaw-<tag>-<target>/` subdirectory holding three binaries, so the
   old `chmod(BINARY_PATH)` hit a nonexistent path. webclaw-mcp is now
   lifted out of that subdir to BINARY_PATH and the rest is cleaned up.
   BINARY_NAME/BINARY_PATH also gain the `.exe` suffix on Windows so the
   written MCP config points at a real file.

Tested in Docker (no Windows machine available):
- Linux amd64 + arm64 on Debian trixie: full flow installs the binary and
  it answers a real MCP initialize handshake (serverInfo webclaw-mcp
  0.6.13, 12 tools).
- Windows .zip path validated against the real release zip: Expand-Archive
  equivalent extraction, nested `.exe` resolved + lifted, PE header `MZ`.
  Executing the .exe needs Windows (the reporter confirmed that on Win11).
- Bug 3: with the GitHub API blocked, the new build prints the real reason
  instead of "No pre-built binary found".

Closes #71

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 11:58:14 +02:00
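
Bug 1 in 9af55c2a2d above is about deriving the asset name from the release tag plus a platform-to-target map. The installer itself is a Node script; the sketch below only mirrors the naming rule in Rust. The map keys follow Node's `process.platform` / `process.arch` values, while the specific target triples, the `v`-prefixed tag, and the archive extensions are assumptions for illustration.

```rust
/// Hypothetical platform -> Rust target map (triples are assumed).
pub fn target_for(os: &str, arch: &str) -> Option<&'static str> {
    match (os, arch) {
        ("linux", "x64") => Some("x86_64-unknown-linux-gnu"),
        ("linux", "arm64") => Some("aarch64-unknown-linux-gnu"),
        ("darwin", "x64") => Some("x86_64-apple-darwin"),
        ("darwin", "arm64") => Some("aarch64-apple-darwin"),
        ("win32", "x64") => Some("x86_64-pc-windows-msvc"),
        _ => None,
    }
}

/// Build `webclaw-<tag>-<target>` plus an archive extension.
/// The old code hardcoded `webclaw-mcp-<target>`, which never matched.
pub fn asset_name(tag: &str, os: &str, arch: &str) -> Option<String> {
    let target = target_for(os, arch)?;
    // Assumption: Windows ships a .zip, everything else a .tar.gz.
    let ext = if os == "win32" { "zip" } else { "tar.gz" };
    Some(format!("webclaw-{tag}-{target}.{ext}"))
}
```

Returning `None` for an unknown platform lets the caller report a clear "no prebuilt binary for this platform" error instead of silently falling back.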
Valerio
1137419a09 chore(sponsors): remove Quantum Proxies (no longer sponsoring)
Quantum Proxies is no longer a sponsor. Remove their Studio Partners entry
from the README and delete the banner/logo assets.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 17:19:06 +02:00
Valerio
b7ace81374
Merge pull request #68 from 0xMassi/fix/setup-env-api-key
fix(deploy): write WEBCLAW_API_KEY in generated .env
2026-06-20 14:49:05 +02:00
Valerio
e9abc8f459 docs(claude-md): correct LLM chain, fetch modules, extractor count
- LLM provider chain is Ollama -> OpenAI -> Gemini -> Anthropic; Gemini
  was added ahead of Anthropic (Google Cloud credits preferred) but the
  docs still listed Ollama -> OpenAI -> Anthropic.
- Document the top-level webclaw-fetch verticals reddit.rs / linkedin.rs
  (distinct from extractors/ and webclaw-core parsers) and progress.rs.
- Bump extractor count ~28 -> ~30 and call out the shared helpers
  (og.rs, github_common.rs, jsonld_product.rs, ecommerce_product.rs).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 14:44:03 +02:00
Valerio
a8eb6b3cfa fix(deploy): write WEBCLAW_API_KEY in generated .env, not WEBCLAW_AUTH_KEY
setup.sh and deploy/hetzner.sh emitted WEBCLAW_AUTH_KEY into the server's
.env, but webclaw-server reads WEBCLAW_API_KEY (env = "WEBCLAW_API_KEY").
The generated key was silently ignored — and since hetzner.sh binds
0.0.0.0, the server refused to start at all (it rejects a public bind
without WEBCLAW_API_KEY). Fix both .env writers, plus the hetzner help
line that told users to grep the wrong name and the env.example sample.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 14:43:55 +02:00
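
The startup rule mentioned in a8eb6b3cfa above (a public bind is rejected without `WEBCLAW_API_KEY`) might look roughly like this. `check_bind` is a hypothetical helper, and treating every non-loopback address as "public" is an assumption; the server's real check may differ.

```rust
use std::net::IpAddr;

/// Refuse to start on a non-loopback address when no API key is set.
/// `api_key` stands for the value read from WEBCLAW_API_KEY, if any.
pub fn check_bind(host: &str, api_key: Option<&str>) -> Result<(), String> {
    let ip: IpAddr = host
        .parse()
        .map_err(|e| format!("invalid bind address {host:?}: {e}"))?;
    // Assumption: an empty or whitespace-only key counts as unset.
    let has_key = api_key.map_or(false, |k| !k.trim().is_empty());
    if !ip.is_loopback() && !has_key {
        return Err(format!(
            "refusing to bind {ip} without WEBCLAW_API_KEY set"
        ));
    }
    Ok(())
}
```

This is why writing the key under the wrong name (`WEBCLAW_AUTH_KEY`) was fatal on a `0.0.0.0` bind: from the server's point of view no key was configured at all.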
Valerio
3caca67cd1
Merge pull request #67 from 0xMassi/docs/document-search-map-perf
docs(claude-md): document search, map, and perf
2026-06-17 17:14:53 +02:00
Valerio
480d3187db docs(claude-md): document search, map, and perf; refresh stale details
Bring core/CLAUDE.md current with the slices rescued this cycle, and fold
in earlier /init corrections that were never committed.

New capabilities documented:
- search: webclaw-fetch `search.rs` (Serper BYO-key) + the CLI `search`
  subcommand + the OSS `POST /v1/search` route (gated on SERPER_API_KEY)
  + the now-local-first MCP `search` tool.
- map: webclaw-fetch `map.rs` (`discover_urls`/`MapOptions`, sitemap +
  bounded crawl fallback), gzip sitemap support, and the new
  `--map-pages`/`--no-map-crawl`/`--map-limit` CLI flags.
- perf: shared `extractors/og.rs` parser and the QuickJS runtime gate /
  parsed-document reuse noted on `js_eval.rs`.

Corrections folded in: real browser fingerprint versions live in tls.rs
(not browser.rs), accurate module/route lists, Repo Layout section, and
removal of the now-false "search lives only in production" notes.
Bumped the stated workspace version to 0.6.13.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 17:10:36 +02:00
Valerio
ecfb72a1a3 chore(release): bump version to 0.6.13
Ship the hot-path extraction speedups (#66): selector hoisting, shared
Open Graph parsing, QuickJS gating + parsed-document reuse, and HTTP
connection-pool tuning. Byte-identical extraction output (verified).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 16:52:38 +02:00
Valerio
febe56d177
Merge pull request #66 from 0xMassi/perf/hot-path-speedups
perf: hot-path extraction speedups (selector hoist, shared og, QuickJS gating)
2026-06-17 16:47:22 +02:00
Valerio
3c54bea300 perf: hot-path extraction speedups (selector hoist, shared og, QuickJS gating)
Rescued from the stale perf/audit-fixes branch — the *perf-only* subset of
that branch's big mixed commit, ported cleanly onto current main with
byte-identical extraction output.

- markdown: hoist the `img[alt]` / `a[href]` selectors out of the per-node
  noise path into `Lazy` statics (stop recompiling them per element).
- extractors: single shared `og()` / `parse_og()` module replaces the
  per-field Open Graph re-scan duplicated across 7 vertical extractors
  (amazon, ebay, ecommerce, etsy, substack, trustpilot, youtube). Each
  vertical now does one pass. Raw-vs-unescaped behaviour preserved exactly.
- core: gate the QuickJS VM on a cheap marker check (skip it entirely when
  the page has no JS-assigned data) and reuse the already-parsed document
  instead of re-parsing the HTML.
- fetch: connection-pool tuning on the wreq client (connect_timeout, idle
  pool, max-idle-per-host, tcp keepalive) for connection reuse.

Output-equivalence is covered by existing tests (amazon quot-entity,
trustpilot title parse, ecommerce/youtube/etsy/substack og fallbacks) — all
green. No new dependencies; no public API change.

Deliberately EXCLUDED from this slice (separate concerns bundled in the
original commit): the `#[non_exhaustive]` API-breaking changes, the LLM/PDF/
server reliability hardening (much already shipped in 0.6.8), the tooling
(cargo-deny, release profile, MSRV), and the retry-loop dedup refactor (a
code-cleanup with no runtime benefit — not worth churning client.rs for).

Original work by the prior author on perf/audit-fixes; this re-applies only
the performance subset onto main.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 16:41:45 +02:00
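
The selector hoist in 3c54bea300 above follows a common pattern: compile once into a static, then reuse per node. The commit uses `Lazy` statics; the sketch below swaps in the standard library's `OnceLock` and a stand-in `Selector` type so that it needs no dependencies. The counter exists only to show that compilation happens once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the (stand-in) selector is compiled.
pub static COMPILE_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for a compiled CSS selector (hypothetical type).
pub struct Selector {
    pub pattern: &'static str,
}

fn compile(pattern: &'static str) -> Selector {
    COMPILE_COUNT.fetch_add(1, Ordering::SeqCst);
    Selector { pattern }
}

/// Hoisted selector: compiled on first use, shared afterwards.
pub fn img_alt_selector() -> &'static Selector {
    static SEL: OnceLock<Selector> = OnceLock::new();
    SEL.get_or_init(|| compile("img[alt]"))
}

/// Per-node path: fetches the shared selector instead of recompiling.
pub fn process_nodes(node_count: usize) -> usize {
    let mut visited = 0;
    for _ in 0..node_count {
        let sel = img_alt_selector();
        if !sel.pattern.is_empty() {
            visited += 1;
        }
    }
    visited
}
```

Processing any number of nodes leaves `COMPILE_COUNT` at 1, which is the whole point of moving the selector out of the per-element path.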
Valerio
51d0c538f1 chore(release): bump version to 0.6.12
Bundle three changes landed since 0.6.11:
- feat(search): standalone web search via Serper.dev (#63)
- feat(map): layered URL discovery with bounded crawl fallback (#64)
- fix(mcp): accept boolean params sent as JSON strings (#62 / #65)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 16:10:45 +02:00
Valerio
c5dfce8ed5
Merge pull request #65 from 0xMassi/fix/mcp-bool-param-coercion
fix(mcp): accept boolean params sent as JSON strings (#62)
2026-06-17 15:41:42 +02:00
Valerio
b5d0f78bb8
Merge pull request #64 from 0xMassi/feat/map-crawl-fallback
feat(map): layered URL discovery with bounded crawl fallback
2026-06-17 15:38:43 +02:00
Valerio
884f06a5d3 fix(mcp): accept boolean params sent as JSON strings (#62)
Follow-up to #58/#59, which fixed numeric params but left the booleans.
MCP clients (e.g. Claude Desktop) send `true` as the JSON string `"true"`,
which serde's default bool deserializer rejects with
`invalid type: string "true", expected a boolean`, failing the call.

Adds a `deser_opt_bool_or_str` helper (same untagged pattern as the #59
numeric helpers) that accepts a JSON boolean OR "true"/"false"
(case-insensitive, trimmed) and rejects anything else with a clear error.
Numeric-looking strings like "1" are intentionally NOT coerced to bool.

Applied to every Option<bool> tool param:
- scrape   -> only_main_content
- crawl    -> use_sitemap
- research -> deep
- search   -> scrape   (added by the standalone-search slice, #63)

16 unit tests (bool / "true"-string / absent->None / garbage->error per
field). No new dependencies.

Fixes #62.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 15:37:36 +02:00
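
The string branch of the coercion in 884f06a5d3 above can be illustrated without serde. `parse_bool_str` is a hypothetical standalone function, not the commit's `deser_opt_bool_or_str` helper, and the error wording is invented; only the acceptance rule (trimmed, case-insensitive "true"/"false", nothing else) comes from the commit message.

```rust
/// Accept "true"/"false" (trimmed, case-insensitive); reject all else.
/// Numeric-looking strings such as "1" are deliberately not coerced.
pub fn parse_bool_str(raw: &str) -> Result<bool, String> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "true" => Ok(true),
        "false" => Ok(false),
        other => Err(format!(
            "expected a boolean or \"true\"/\"false\", got {other:?}"
        )),
    }
}
```

Rejecting "1" and "yes" keeps the coercion narrow: it covers the known client behaviour without guessing at other truthy spellings.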
Valerio
179efbcf87 feat(map): layered URL discovery with bounded crawl fallback
Rescued from the stale perf/audit-fixes branch and ported cleanly onto
current main (fetch + CLI only — the original commit never touched the
server/MCP map surfaces).

`--map` used to return only what a site advertises in sitemap.xml, which
is nothing for sites with no sitemap (e.g. Hacker News) or a thin one.
Now discovery is layered:

- webclaw-fetch::discover_urls() / MapOptions — sitemaps first
  (authoritative, carries lastmod/priority/changefreq); when the sitemap
  is thin (< min_sitemap_urls) and the fallback is enabled, run a bounded
  same-origin crawl and harvest links from every fetched page plus the
  unfetched frontier, deduped against the sitemap set.
- sitemap.rs: gzip (.xml.gz) support via a new decode_sitemap_body() +
  FetchClient::fetch_raw() (raw bytes, no lossy UTF-8); deeper index
  recursion (3->5); 4 more fallback paths.
- CLI: --map-pages / --no-map-crawl / --map-limit; crawler logs now go to
  stderr so `--map -f json` stays machine-parseable.

One new dependency: flate2 (already resolved in the lockfile transitively).
Includes the commit's unit tests (map dedup/origin, gzip decode). Original
work by the prior author on perf/audit-fixes; this re-applies only the map
slice onto main.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 15:33:49 +02:00
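
The layering described in 179efbcf87 above (sitemap first, bounded crawl only when the sitemap is thin, results deduped against the sitemap set) can be sketched as follows. `SketchOptions` and `layered_discover` are hypothetical stand-ins; the real `discover_urls()` / `MapOptions` signatures are not shown in the log, and the crawl is abstracted as a closure.

```rust
use std::collections::HashSet;

/// Hypothetical subset of the map options.
pub struct SketchOptions {
    pub min_sitemap_urls: usize,
    pub crawl_fallback: bool,
}

/// Sitemap URLs come first; the crawl closure runs only when the
/// sitemap is thin and the fallback is enabled. Output is deduped
/// while preserving first-seen order.
pub fn layered_discover<F>(
    sitemap: Vec<String>,
    opts: &SketchOptions,
    crawl: F,
) -> Vec<String>
where
    F: FnOnce() -> Vec<String>,
{
    let mut seen: HashSet<String> = HashSet::new();
    let mut out = Vec::new();
    for url in sitemap {
        if seen.insert(url.clone()) {
            out.push(url);
        }
    }
    if out.len() < opts.min_sitemap_urls && opts.crawl_fallback {
        for url in crawl() {
            if seen.insert(url.clone()) {
                out.push(url);
            }
        }
    }
    out
}
```

Passing the crawl as a closure keeps the expensive step lazy: a site with a healthy sitemap never triggers it.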
Valerio
c3e5ef5143
Merge pull request #63 from 0xMassi/feat/standalone-search
feat(search): standalone web search via Serper.dev (bring-your-own-key)
2026-06-17 15:21:02 +02:00
Valerio
06f151c560 feat(search): standalone web search via Serper.dev (bring-your-own-key)
Rescued from the stale perf/audit-fixes branch and ported cleanly onto
current main. OSS surfaces can now search without the hosted webclaw API
when the caller supplies their own Serper.dev key (free at serper.dev).

- webclaw-fetch::search() — calls Serper.dev directly (plain wreq client;
  a JSON API needs no fingerprinting) and, with scrape=true, fetches +
  extracts the top result pages concurrently (bounded) via the caller's
  FetchClient. parse_serper_organic() is pure and unit-tested.
- MCP `search` tool: local-first — uses SERPER_API_KEY when set, else
  falls back to the hosted webclaw API. Adds country/lang/scrape params.
- OSS REST server: POST /v1/search, gated on SERPER_API_KEY (501 when
  unset, with a setup hint). Adds ApiError::NotImplemented.
- CLI: `webclaw search <query> [--serper-key|SERPER_API_KEY] [--num]
  [--country] [--lang] [--scrape] [--format]`.

No new dependencies (reuses futures-util already in the tree). Original
work by the prior author on perf/audit-fixes; this re-applies only the
search slice onto main.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 15:10:58 +02:00
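
The two gating behaviours in 06f151c560 above differ by surface: the MCP tool falls back to the hosted API, while the REST route answers 501. A minimal sketch, with hypothetical names (`SearchBackend`, `mcp_search_backend`, `rest_search_gate`) and the assumption that an empty key counts as unset:

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum SearchBackend {
    LocalSerper, // call Serper.dev directly with the caller's key
    HostedApi,   // fall back to the hosted webclaw API
}

fn key_is_set(key: Option<&str>) -> bool {
    key.map_or(false, |k| !k.trim().is_empty())
}

/// MCP `search` tool: local-first, hosted fallback.
pub fn mcp_search_backend(serper_key: Option<&str>) -> SearchBackend {
    if key_is_set(serper_key) {
        SearchBackend::LocalSerper
    } else {
        SearchBackend::HostedApi
    }
}

/// REST `POST /v1/search`: no fallback, 501 with a setup hint.
pub fn rest_search_gate(
    serper_key: Option<&str>,
) -> Result<(), (u16, &'static str)> {
    if key_is_set(serper_key) {
        Ok(())
    } else {
        Err((501, "set SERPER_API_KEY to enable /v1/search"))
    }
}
```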
Valerio
0c6f323f51 chore(release): v0.6.11 — Gemini provider + Anthropic model fix
2026-06-16 16:12:11 +02:00
Valerio
d9e3d0b2bb feat(llm): add Gemini provider and fix stale Anthropic default model
Adds a Google Gemini provider (Generative Language API) to the chain,
ordered Ollama -> OpenAI -> Gemini -> Anthropic, so Google credits are
preferred and Anthropic is the last-resort fallback.

Request mapping: system -> systemInstruction, assistant -> model,
json_mode -> responseMimeType. The model name is validated before URL
interpolation, and maxOutputTokens defaults high for 2.5 thinking models.

Also fixes the AnthropicProvider default model: the retired
claude-sonnet-4-20250514 returned 404. The default is now claude-sonnet-4-6,
and ANTHROPIC_MODEL is honored.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 15:52:37 +02:00
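
The role mapping and model-name check from d9e3d0b2bb above can be illustrated as below. `GeminiSlot`, `gemini_slot`, and `is_safe_model_name` are hypothetical names, and the allowed character set is an assumption; the commit only says the name is validated before being interpolated into the URL.

```rust
/// Where a chat message ends up in a Gemini request (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum GeminiSlot {
    /// System text goes in the separate `systemInstruction` field.
    SystemInstruction,
    /// Everything else becomes a `contents` entry with this role.
    Content { role: &'static str },
}

pub fn gemini_slot(role: &str) -> Option<GeminiSlot> {
    match role {
        "system" => Some(GeminiSlot::SystemInstruction),
        "assistant" => Some(GeminiSlot::Content { role: "model" }),
        "user" => Some(GeminiSlot::Content { role: "user" }),
        _ => None,
    }
}

/// Assumed rule: non-empty, ASCII alphanumerics plus '.', '-', '_'.
/// Anything else (slashes, '?', spaces) could alter the request URL.
pub fn is_safe_model_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '-' | '_'))
}
```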
Valerio
8a0768526f chore(mcp): add .mcp.json so Cursor / Open Plugins directories detect the MCP server
Declares the webclaw MCP server at the repo root (matches the README manual
config). Cursor's plugin scanner looks for .mcp.json/mcp.json at root.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 18:15:19 +02:00
Valerio
e7ec76bce9
docs(sponsors): add MangoProxy studio partner (#60)
2026-06-15 15:06:00 +02:00
Valerio
da6c6af724 chore(release): bump version to 0.6.10
Release the MCP numeric-param string-coercion fix (#58, PR #59):
crawl/batch/search/summarize numeric args now accept JSON numbers or
numeric strings, fixing clients (e.g. Claude Desktop) that send "5"
instead of 5.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 11:27:04 +02:00
Valerio
243e7032d0
Merge pull request #59 from crossi-dev/fix/numeric-params-string-coercion
fix: accept numeric MCP params sent as strings (#58)
2026-06-15 11:26:05 +02:00
Valerio
24ae3a7af2 style(mcp): apply rustfmt to numeric param coercion
Reformat the string-or-number deserialize helpers and tests to satisfy
`cargo fmt --check` (style_edition 2024), which the lint CI job enforces.
Formatting only — no behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 11:25:55 +02:00
Charles Rossi
b5ee838d5f fix(tools): accept numeric params as JSON strings
MCP clients (Claude Desktop, VS Code Copilot, etc.) serialize numeric
tool arguments as JSON strings ("3" instead of 3). serde's built-in
u32/usize deserialisers reject these with:

  invalid type: string "N", expected u32

Add two private coercion helpers — `deser_opt_u32_or_str` and
`deser_opt_usize_or_str` — that accept both JSON number and JSON string
representations, falling back to `str::parse` for the string form and
returning a clear custom error for non-numeric strings.

Annotate the six affected optional fields:
  CrawlParams: depth (u32), max_pages (usize), concurrency (usize)
  BatchParams: concurrency (usize)
  SearchParams: num_results (u32)
  SummarizeParams: max_sentences (usize)

Add 24 unit tests (4 per field: numeric string → value, native number
→ value, absent → None, non-numeric string → Err) verified green via
an isolated serde-only crate.

Fixes #58
2026-06-15 01:04:35 -03:00
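
The number-or-string rule from b5ee838d5f above, reduced to a dependency-free sketch. `RawArg` and `coerce_opt_u32` are hypothetical; the commit's helpers (`deser_opt_u32_or_str`, `deser_opt_usize_or_str`) are serde deserializers, and the error text here is invented.

```rust
/// A tool argument as it might arrive from an MCP client.
pub enum RawArg {
    Number(u64),
    Text(String),
}

/// Accept a native number or a numeric string; absent stays None.
pub fn coerce_opt_u32(arg: Option<RawArg>) -> Result<Option<u32>, String> {
    match arg {
        None => Ok(None),
        Some(RawArg::Number(n)) => u32::try_from(n)
            .map(Some)
            .map_err(|_| format!("{n} does not fit in u32")),
        // Fall back to str::parse for the string form.
        Some(RawArg::Text(s)) => s
            .parse::<u32>()
            .map(Some)
            .map_err(|_| format!("expected a number, got {s:?}")),
    }
}
```

The four cases line up with the four tests per field described in the commit: numeric string, native number, absent, and non-numeric string.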
Valerio
28cd53efcb
Merge pull request #57 from raffaelemancuso/patch-1
Add Windows binaries to README
2026-06-12 17:59:55 +02:00
Raffaele Mancuso
c133478994
Add Windows binaries to README
2026-06-12 17:56:47 +02:00
Valerio
3c726060bf docs(proxy-example): reword residential product line; refresh NodeMaven banner
2026-06-11 15:16:56 +02:00
Valerio
cb78363466 chore(sponsors): update NodeMaven banner to new branding
2026-06-11 11:50:23 +02:00
Valerio
df7336d55b
Merge pull request #56 from 0xMassi/docs/nodemaven-partner
docs: add NodeMaven studio partner to README
2026-06-10 17:46:55 +02:00
Valerio
acd3021f38 docs(readme): add NodeMaven studio partner
2026-06-10 17:46:49 +02:00
Valerio
bcc58dbadd
Merge pull request #55 from 0xMassi/fix/docker-multiarch-single-build
ci(release): single multi-platform Docker build + dispatch re-publish
2026-06-10 15:56:36 +02:00
Valerio
8015de7db5 ci(release): build the Docker image in one multi-platform pass
The per-arch build + 'imagetools create' combine failed at the manifest
step with 'v0.6.9-arm64: not found' — buildx's default provenance/SBOM
attestations turn each per-arch tag into an index, and assembling them
races GHCR's read-after-write. Replace it with a single
'docker buildx build --platform linux/amd64,linux/arm64 --push'
(attestations off) so one manifest list is pushed atomically. Dockerfile.ci
now selects binaries by TARGETARCH. Adds a workflow_dispatch path to
re-publish an existing tag's image without rebuilding binaries or bumping
the version.
2026-06-10 15:54:28 +02:00
Valerio
be64409d62
Merge pull request #54 from 0xMassi/fix/docker-multiarch-release
chore: release v0.6.9 (fix multi-arch Docker publish)
2026-06-10 15:30:46 +02:00
Valerio
2773474984 chore: release v0.6.9
Publish the multi-arch Docker image with Buildx instead of the legacy
docker driver, whose GHCR push intermittently failed with 'unknown
blob'. The manifest list is now assembled registry-side with
`imagetools create`. This also unblocks the Homebrew formula update,
which depends on the Docker job. No library or CLI behavior changes.
2026-06-10 15:30:39 +02:00
Valerio
7dfa180e86 chore: release v0.6.8
2026-06-10 14:42:05 +02:00
Valerio
598f319bf3
Merge pull request #52 from 0xMassi/audit-fixes-2026-06-09
fix: harden LLM providers, UTF-8 handling, and webhook/batch reliability
2026-06-10 14:40:29 +02:00
Valerio
fae2766db1
Merge pull request #53 from 0xMassi/docs-coldproxy
docs: add ColdProxy proxy-backed crawling walkthrough
2026-06-10 14:40:01 +02:00
Valerio
d0909a25e3 docs: add ColdProxy proxy-backed crawling walkthrough
2026-06-10 10:42:47 +02:00
Valerio
499345046c fix: harden LLM providers, UTF-8 handling, and webhook/batch reliability
- webclaw-llm: add explicit request + connect timeouts to the reqwest
  client in every provider (anthropic, openai, ollama) with a shorter
  timeout on the ollama health check, so a stalled provider fails fast.
- webclaw-llm: fix a panic when truncating a provider error body that
  contains multibyte characters near the 500-char cut (char-safe take).
- webclaw-core: snap the endpoint-scan budget cut to a UTF-8 char
  boundary so oversized scripts with non-ASCII content no longer panic.
- webclaw-core: rewrite js_literal_to_json to copy raw bytes instead of
  `byte as char`, preserving multibyte UTF-8 in SvelteKit string values
  rather than producing Latin-1 mojibake.
- webclaw-cli: have fire_webhook return its JoinHandle and await it at
  the crawl/batch/batch-llm call sites, removing the fixed 500ms sleeps.
- webclaw-mcp: drop the up-front DNS pre-validation loop in batch that
  aborted the whole request on one bad URL; the fetch layer already
  applies the same SSRF guard per URL and reports per-URL errors.
- webclaw-fetch: include the port in the warmup homepage URL so hosts
  on a non-default port are warmed correctly.

Adds regression tests for the UTF-8 endpoint-scan and SvelteKit cases.
2026-06-09 21:10:15 +02:00
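
Two of the UTF-8 fixes in 499345046c above (the char-safe take on error bodies and snapping a byte budget to a char boundary) rest on standard-library calls. The helper names below are hypothetical; only `char_indices` and `is_char_boundary` are real std APIs.

```rust
/// Keep at most `max_chars` characters without splitting a code point.
/// Slicing with a raw byte index (`&s[..500]`) panics if the index
/// lands inside a multibyte character.
pub fn truncate_chars(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s, // already short enough
    }
}

/// Move a byte offset back to the nearest char boundary at or before it.
pub fn snap_to_char_boundary(s: &str, mut cut: usize) -> usize {
    if cut >= s.len() {
        return s.len();
    }
    // Offset 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(cut) {
        cut -= 1;
    }
    cut
}
```

For `"héllo"`, where `é` occupies bytes 1..3, a budget cut at byte 2 snaps back to 1, so the slice `&s[..1]` is valid instead of panicking.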
Valerio
d0d7b835f2 docs(readme): update banner to new webclaw branding
2026-06-09 18:53:14 +02:00