webclaw/crates
Valerio 4bf11d902f
Some checks are pending
CI / Test (push) Waiting to run
CI / Lint (push) Waiting to run
CI / Docs (push) Waiting to run
fix(mcp): vertical_scrape uses Firefox profile, not default Chrome
Reddit's .json API rejects the wreq-Chrome TLS fingerprint with a
403 even from residential IPs. Their block list includes known
browser-emulation library fingerprints. wreq-Firefox passes. The
CLI `vertical` subcommand already forced Firefox; MCP
`vertical_scrape` was still falling back to the long-lived
`self.fetch_client` which defaults to Chrome, so reddit failed
on MCP and nobody noticed because the earlier test runs all had
an API key set that masked the issue.

Switched vertical_scrape to reuse `self.firefox_or_build()` which
gives us the cached Firefox client (same pattern the scrape tool
uses when the caller requests `browser: firefox`). Firefox is
strictly-safer-than-Chrome for every vertical in the catalog, so
making it the hard default for `vertical_scrape` is the right call.

Verified end-to-end from a clean shell with no WEBCLAW_API_KEY:
- MCP reddit: 679ms, post/author/6 comments correct
- MCP instagram_profile: 1157ms, 18471 followers

No change to the `scrape` tool -- it keeps the user-selectable
browser param.

Bumps version to 0.5.3.
2026-04-22 23:18:11 +02:00
..
webclaw-cli feat(cli+mcp): vertical extractor support (28 extractors discoverable + callable) 2026-04-22 21:41:15 +02:00
webclaw-core style: cargo fmt 2026-04-17 12:03:22 +02:00
webclaw-fetch feat(fetch): Fetcher trait so vertical extractors work under any HTTP backend 2026-04-22 21:17:50 +02:00
webclaw-llm feat(fetch,llm): DoS hardening + glob validation + cleanup (P2) (#22) 2026-04-16 19:44:08 +02:00
webclaw-mcp fix(mcp): vertical_scrape uses Firefox profile, not default Chrome 2026-04-22 23:18:11 +02:00
webclaw-pdf Initial release: webclaw v0.1.0 — web content extraction for LLMs 2026-03-23 18:31:11 +01:00
webclaw-server feat(extractors): wave 5 \u2014 Amazon, eBay, Trustpilot via cloud fallback 2026-04-22 16:16:11 +02:00