Self-hosters hitting docs/self-hosting were promised three binaries
but the OSS Docker image only shipped two. webclaw-server lived in
the closed-source hosted-platform repo, which couldn't be opened. This
adds a minimal axum REST API in the OSS repo so self-hosting actually
works without pretending to ship the cloud platform.
Crate at crates/webclaw-server/. Stateless, no database, no job queue,
single binary. Endpoints: GET /health, POST /v1/{scrape, crawl, map,
batch, extract, summarize, diff, brand}. JSON shapes mirror
api.webclaw.io for the endpoints OSS can support, so swapping between
self-hosted and hosted is a base-URL change.
Auth: optional bearer token via WEBCLAW_API_KEY / --api-key. Comparison
is constant-time (subtle::ConstantTimeEq). Open mode (no key) is
allowed and binds 127.0.0.1 by default; the Docker image flips
WEBCLAW_HOST=0.0.0.0 so the container is reachable out of the box.
Hard caps to keep naive callers from OOMing the process: crawl capped
at 500 pages synchronously, batch capped at 100 URLs / 20 concurrent.
For unbounded crawls or anti-bot bypass the docs point users at the
hosted API.
Dockerfile + Dockerfile.ci updated to copy webclaw-server into
/usr/local/bin and EXPOSE 3000. Workspace version bumped to 0.4.0
(new public binary).
v0.3.13 switched ENTRYPOINT to ["webclaw"] to make `docker run IMAGE
https://example.com` work. That broke a different use case: downstream
Dockerfiles that `FROM ghcr.io/0xmassi/webclaw` and set their own
CMD ["./setup.sh"] — the child's ./setup.sh becomes arg to webclaw,
which tries to fetch it as a URL and fails:
fetch error: request failed: error sending request for uri
(https://./setup.sh): client error (Connect)
Both Dockerfile and Dockerfile.ci now use docker-entrypoint.sh which:
- forwards flags (-*) and URLs (http://, https://) to `webclaw`
- exec's anything else directly
Test matrix (all pass locally):
docker run IMAGE https://example.com → webclaw scrape ok
docker run IMAGE --help → webclaw --help ok
docker run IMAGE → default CMD, --help
docker run IMAGE bash → bash runs
FROM IMAGE + CMD ["./setup.sh"] → setup.sh runs, webclaw available
Default CMD is ["webclaw", "--help"] so bare `docker run IMAGE` still
prints help.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Docker CMD gets overridden by any args, while ENTRYPOINT receives them.
This fixes `docker run webclaw <url>` silently ignoring the URL argument.
Bump to 0.3.13.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
QEMU arm64 Rust builds took 60+ min and timed out in CI. Now the Docker
job downloads the pre-built release binaries and packages them directly.
- Dockerfile.ci: slim image for CI (downloads pre-built binaries)
- Dockerfile: full source build for local dev (unchanged build stage)
- Both use ubuntu:24.04 (GLIBC 2.39 matches CI build environment)
- Multi-arch manifest combines amd64 + arm64 images
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>