webclaw/packages/create-webclaw
Valerio 9af55c2a2d fix(create-webclaw): repair binary install on Windows (and all platforms)
`npx create-webclaw` never used the prebuilt binary on any platform and
silently fell back to `cargo install`, which fails with "'cargo' is not
recognized" / "cargo: not found" unless Rust is installed. Four bugs:

1. Asset name mismatch: getAssetName() hardcoded `webclaw-mcp-<target>`,
   but release assets are `webclaw-<tag>-<target>` (versioned, no `mcp-`
   infix). The `find()` always returned undefined, so the prebuilt path
   was never taken — on every OS, not just Windows. Now the asset name is
   built from the release tag_name + a platform→target map.

2. `unzip` is absent on Windows. The `.zip` branch now uses PowerShell
   `Expand-Archive` (ships with Windows 10/11) and keeps `unzip` only for
   the non-Windows case.

3. The prebuilt failure was swallowed by a bare `catch {}`, hiding the
   real cause (a 403 is almost always a GitHub API rate limit). The error
   is now surfaced, with a rate-limit hint + GITHUB_TOKEN support on the
   api.github.com request (token dropped on CDN redirects).

4. (missed by the report's own suggested fix) Archives extract into a
   `webclaw-<tag>-<target>/` subdirectory holding three binaries, so the
   old `chmod(BINARY_PATH)` hit a nonexistent path. webclaw-mcp is now
   lifted out of that subdir to BINARY_PATH and the rest is cleaned up.
   BINARY_NAME/BINARY_PATH also gain the `.exe` suffix on Windows so the
   written MCP config points at a real file.

Tested in Docker (no Windows machine available):
- Linux amd64 + arm64 on Debian trixie: full flow installs the binary and
  it answers a real MCP initialize handshake (serverInfo webclaw-mcp
  0.6.13, 12 tools).
- Windows .zip path validated against the real release zip: Expand-Archive
  equivalent extraction, nested `.exe` resolved + lifted, PE header `MZ`.
  Executing the .exe needs Windows (the reporter confirmed that on Win11).
- Bug 3: with the GitHub API blocked, the new build prints the real reason
  instead of "No pre-built binary found".

Closes #71

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 11:58:14 +02:00
..
index.mjs fix(create-webclaw): repair binary install on Windows (and all platforms) 2026-06-27 11:58:14 +02:00
package.json fix(create-webclaw): repair binary install on Windows (and all platforms) 2026-06-27 11:58:14 +02:00
README.md docs: update npm package license to AGPL-3.0 2026-04-02 11:33:43 +02:00
server.json feat(fetch,llm): DoS hardening + glob validation + cleanup (P2) (#22) 2026-04-16 19:44:08 +02:00

webclaw

One command to give your AI agent reliable web access.
No headless browser. No Puppeteer. No 403s.

npm installs Stars License


Quick Start

npx create-webclaw

That's it. Auto-detects your AI tools, downloads the MCP server, configures everything.

Works with Claude Desktop, Claude Code, Cursor, Windsurf, VS Code, OpenCode, Codex CLI, and Antigravity.


The Problem

Your AI agent calls fetch() and gets a 403. Cloudflare, Akamai, and every major CDN fingerprint the TLS handshake and block non-browser clients before the request hits the server.

When it does work, you get 100KB+ of raw HTML — navigation, ads, cookie banners, scripts. Your agent burns 4,000+ tokens parsing noise.

The Fix

webclaw impersonates Chrome 146 at the TLS protocol level. Perfect JA4 fingerprint. Perfect HTTP/2 Akamai hash. 99% bypass rate on 102 tested sites.

Then it extracts just the content — clean markdown, 67% fewer tokens.

                     Raw HTML                          webclaw
┌──────────────────────────────────┐    ┌──────────────────────────────────┐
│ <div class="ad-wrapper">         │    │ # Breaking: AI Breakthrough      │
│ <nav class="global-nav">         │    │                                  │
│ <script>window.__NEXT_DATA__     │    │ Researchers achieved 94%         │
│ ={...8KB of JSON...}</script>    │    │ accuracy on cross-domain         │
│ <div class="social-share">       │    │ reasoning benchmarks.            │
│ <!-- 142,847 characters -->      │    │                                  │
│                                  │    │ ## Key Findings                  │
│         4,820 tokens             │    │         1,590 tokens             │
└──────────────────────────────────┘    └──────────────────────────────────┘

What It Does

npx create-webclaw
  1. Detects installed AI tools (Claude, Cursor, Windsurf, VS Code, OpenCode, Codex, Antigravity)
  2. Downloads the webclaw-mcp binary for your platform (macOS arm64/x86, Linux x86/arm64)
  3. Asks for your API key (optional — works locally without one)
  4. Writes the MCP config for each detected tool

10 MCP Tools

After setup, your AI agent has access to:

Tool What it does API key needed?
scrape Extract content from any URL No
crawl Recursively crawl a website No
search Web search + parallel scrape Yes (Serper)
map Discover URLs from sitemaps No
batch Extract multiple URLs in parallel No
extract LLM-powered structured extraction Yes
summarize Content summarization Yes
diff Track content changes No
brand Extract brand identity No
research Deep multi-page research Yes

8 of 10 tools work fully offline. No API key, no cloud, no tracking.

Supported Tools

Tool Config location
Claude Desktop ~/Library/Application Support/Claude/claude_desktop_config.json
Claude Code ~/.claude.json
Cursor .cursor/mcp.json
Windsurf ~/.codeium/windsurf/mcp_config.json
VS Code (Continue) ~/.continue/config.json
OpenCode ~/.opencode/config.json
Codex CLI ~/.codex/config.json
Antigravity ~/.antigravity/mcp.json

Sites That Work

webclaw gets through where default fetch() gets blocked:

Nike, Cloudflare, Bloomberg, Zillow, Indeed, Viagogo, Fansale, Wikipedia, Stripe, and 93 more. Tested on 102 sites with 99% success rate.

Alternative Install Methods

Homebrew

brew tap 0xMassi/webclaw && brew install webclaw

Docker

docker run --rm ghcr.io/0xmassi/webclaw https://example.com

Cargo

cargo install --git https://github.com/0xMassi/webclaw.git webclaw-cli

Prebuilt Binaries

Download from GitHub Releases for macOS (arm64, x86_64) and Linux (x86_64, aarch64).


License

AGPL-3.0