fix(docker): entrypoint shim so child images with custom CMD work

v0.3.13 switched ENTRYPOINT to ["webclaw"] so that `docker run IMAGE
https://example.com` works. That broke a different use case: downstream
Dockerfiles that `FROM ghcr.io/0xmassi/webclaw` and set their own
CMD ["./setup.sh"]. The child's ./setup.sh becomes an argument to
webclaw, which tries to fetch it as a URL and fails:

  fetch error: request failed: error sending request for uri
  (https://./setup.sh): client error (Connect)

Both Dockerfile and Dockerfile.ci now use docker-entrypoint.sh, which:
- prepends `webclaw` to the args when the first arg is a flag (-*) or
  a URL (http://, https://)
- exec's anything else directly

Test matrix (all pass locally):
  docker run IMAGE https://example.com     → webclaw scrape ok
  docker run IMAGE --help                   → webclaw --help ok
  docker run IMAGE                          → default CMD, --help
  docker run IMAGE bash                     → bash runs
  FROM IMAGE + CMD ["./setup.sh"]           → setup.sh runs, webclaw available
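
The matrix above can be approximated without Docker by replaying the shim's decision in a plain shell function. This is only a sketch: `shim_argv` is a hypothetical helper, not part of this commit, that mirrors the first-argument rule with a `case` statement and prints the command line the shim would exec instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not in this commit): mirrors the shim's rule and
# prints the final command line instead of exec'ing it.
shim_argv() {
  case "${1-}" in
    # Flag or URL as first arg: treat the whole list as webclaw args.
    -*|http://*|https://*) set -- webclaw "$@" ;;
  esac
  printf '%s\n' "$*"
}

shim_argv https://example.com   # webclaw https://example.com
shim_argv --help                # webclaw --help
shim_argv bash                  # bash
shim_argv ./setup.sh            # ./setup.sh
shim_argv webclaw --help        # webclaw --help (the default CMD)
```

It checks the dispatch rule only. Whether the resulting command succeeds inside the container still needs the real `docker run` matrix.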

Default CMD is ["webclaw", "--help"] so bare `docker run IMAGE` still
prints help.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Valerio 2026-04-17 15:53:34 +02:00
parent e27ee1f86f
commit 3396dc8ce7
6 changed files with 61 additions and 10 deletions


@ -3,6 +3,13 @@
All notable changes to webclaw are documented here.
Format follows [Keep a Changelog](https://keepachangelog.com/).
## [0.3.19] — 2026-04-17
### Fixed
- **Docker image can be used as a FROM base again.** v0.3.13 switched the Docker `CMD` to `ENTRYPOINT ["webclaw"]` so that `docker run IMAGE https://example.com` would pass the URL through as expected. That change trapped a different use case: downstream Dockerfiles that `FROM ghcr.io/0xmassi/webclaw` and set their own `CMD ["./setup.sh"]` — the child's `./setup.sh` became the first arg to `webclaw`, which tried to fetch it as a URL and failed with `error sending request for uri (https://./setup.sh)`. Both `Dockerfile` and `Dockerfile.ci` now use a small `docker-entrypoint.sh` shim that forwards flags (`-*`) and URLs (`http://`, `https://`) to `webclaw`, but `exec`s anything else directly. All four use cases now work: `docker run IMAGE https://example.com`, `docker run IMAGE --help`, child-image `CMD ["./setup.sh"]`, and `docker run IMAGE bash` for debugging. Default `CMD` is `["webclaw", "--help"]`.
---
## [0.3.18] — 2026-04-16
### Fixed

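The downstream case described in the changelog entry above can be reproduced with a throwaway child image. The script below is a sketch under assumptions: the `child/` directory, the `/app` working directory, the `webclaw-child` tag, and the contents of `setup.sh` are all hypothetical. It only writes the files; the Docker commands are left as comments because they need a base image that already contains this fix.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory for the child image.
mkdir -p child

# Hypothetical setup script: proves it ran and that webclaw is on PATH.
cat > child/setup.sh <<'EOF'
#!/bin/sh
echo "setup running"
webclaw --help >/dev/null && echo "webclaw available"
EOF
chmod +x child/setup.sh

# Child Dockerfile: inherits the base ENTRYPOINT and overrides only CMD.
cat > child/Dockerfile <<'EOF'
FROM ghcr.io/0xmassi/webclaw
WORKDIR /app
COPY setup.sh ./setup.sh
CMD ["./setup.sh"]
EOF

# To try it (requires Docker):
#   docker build -t webclaw-child child
#   docker run --rm webclaw-child
```

Against a v0.3.13-style base, the same child image would hand `./setup.sh` to `webclaw` as an argument, which is the failure this release fixes.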
Cargo.lock generated

@ -3102,7 +3102,7 @@ dependencies = [
[[package]]
name = "webclaw-cli"
-version = "0.3.18"
+version = "0.3.19"
dependencies = [
"clap",
"dotenvy",
@ -3123,7 +3123,7 @@ dependencies = [
[[package]]
name = "webclaw-core"
-version = "0.3.18"
+version = "0.3.19"
dependencies = [
"ego-tree",
"once_cell",
@ -3141,7 +3141,7 @@ dependencies = [
[[package]]
name = "webclaw-fetch"
-version = "0.3.18"
+version = "0.3.19"
dependencies = [
"bytes",
"calamine",
@ -3163,7 +3163,7 @@ dependencies = [
[[package]]
name = "webclaw-llm"
-version = "0.3.18"
+version = "0.3.19"
dependencies = [
"async-trait",
"reqwest",
@ -3176,7 +3176,7 @@ dependencies = [
[[package]]
name = "webclaw-mcp"
-version = "0.3.18"
+version = "0.3.19"
dependencies = [
"dirs",
"dotenvy",
@ -3197,7 +3197,7 @@ dependencies = [
[[package]]
name = "webclaw-pdf"
-version = "0.3.18"
+version = "0.3.19"
dependencies = [
"pdf-extract",
"thiserror",


@ -3,7 +3,7 @@ resolver = "2"
members = ["crates/*"]
[workspace.package]
-version = "0.3.18"
+version = "0.3.19"
edition = "2024"
license = "AGPL-3.0"
repository = "https://github.com/0xMassi/webclaw"


@ -58,5 +58,10 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
COPY --from=builder /build/target/release/webclaw /usr/local/bin/webclaw
COPY --from=builder /build/target/release/webclaw-mcp /usr/local/bin/webclaw-mcp
-# Default: run the CLI (ENTRYPOINT so args pass through)
-ENTRYPOINT ["webclaw"]
+# Entrypoint shim: forwards webclaw args/URL to the binary, but exec's other
+# commands directly so this image can be used as a FROM base with custom CMD.
+COPY docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
+RUN chmod +x /usr/local/bin/docker-entrypoint.sh
+ENTRYPOINT ["docker-entrypoint.sh"]
+CMD ["webclaw", "--help"]

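The ENTRYPOINT/CMD pairing matters because Docker appends the effective CMD to an exec-form ENTRYPOINT, `docker run` arguments replace the CMD, and a child image that sets only CMD inherits the base ENTRYPOINT. The sketch below models that composition; `container_argv` is a hypothetical helper, and it joins words with spaces purely for display.

```shell
#!/bin/sh
# Hypothetical model of how the container command line is assembled.
# Usage: container_argv ENTRYPOINT DEFAULT_CMD [override args...]
# Override args stand in for `docker run` arguments or a child image's CMD.
container_argv() {
  entrypoint=$1
  default_cmd=$2
  shift 2
  if [ "$#" -gt 0 ]; then
    cmd=$*            # an override replaces the default CMD entirely
  else
    cmd=$default_cmd  # otherwise the image's own CMD is used
  fi
  printf '%s\n' "$entrypoint${cmd:+ $cmd}"
}

# v0.3.13 layout: the child's CMD lands on webclaw as an argument.
container_argv webclaw "" ./setup.sh
# This commit: the shim receives the CMD and decides what to exec.
container_argv docker-entrypoint.sh "webclaw --help" ./setup.sh
# Bare `docker run IMAGE`: the default CMD reaches the shim.
container_argv docker-entrypoint.sh "webclaw --help"
```

The first call prints `webclaw ./setup.sh`, which is the command line that produced the fetch error quoted in the commit message.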

@ -13,4 +13,10 @@ ARG BINARY_DIR
COPY ${BINARY_DIR}/webclaw /usr/local/bin/webclaw
COPY ${BINARY_DIR}/webclaw-mcp /usr/local/bin/webclaw-mcp
-ENTRYPOINT ["webclaw"]
+# Entrypoint shim: forwards webclaw args/URL to the binary, but exec's other
+# commands directly so this image can be used as a FROM base with custom CMD.
+COPY docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
+RUN chmod +x /usr/local/bin/docker-entrypoint.sh
+ENTRYPOINT ["docker-entrypoint.sh"]
+CMD ["webclaw", "--help"]

docker-entrypoint.sh Executable file

@ -0,0 +1,33 @@
#!/bin/sh
# webclaw docker entrypoint.
#
# Behaves like the real binary when the first arg looks like a webclaw arg
# (URL or flag), so `docker run ghcr.io/0xmassi/webclaw https://example.com`
# still works. But gets out of the way when the first arg looks like a
# different command (e.g. `./setup.sh`, `bash`, `sh -c ...`), so this image
# can be used as a FROM base in downstream Dockerfiles with a custom CMD.
#
# Test matrix:
# docker run IMAGE https://example.com → webclaw https://example.com
# docker run IMAGE --help → webclaw --help
# docker run IMAGE --file page.html → webclaw --file page.html
# docker run IMAGE --stdin < page.html → webclaw --stdin
# docker run IMAGE bash → bash
# docker run IMAGE ./setup.sh → ./setup.sh
# docker run IMAGE → webclaw --help (default CMD)
#
# Root cause fixed: v0.3.13 switched CMD→ENTRYPOINT to make the first use
# case work, which trapped non-webclaw commands such as `bash` and
# `./setup.sh`. This shim restores them.
set -e
# If the first arg starts with `-`, `http://`, or `https://`, treat the
# whole arg list as webclaw flags/URL.
if [ "$#" -gt 0 ] && {
    [ "${1#-}" != "$1" ] || \
    [ "${1#http://}" != "$1" ] || \
    [ "${1#https://}" != "$1" ]; }; then
    set -- webclaw "$@"
fi
exec "$@"
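
The `[ "${1#-}" != "$1" ]` tests in the script rely on POSIX prefix stripping: `${1#pattern}` removes a leading match, so the result differs from `$1` only when the prefix was present. A small sketch of the idiom follows, with `has_prefix` as a hypothetical helper name that does not appear in the script.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when $1 starts with the literal prefix $2.
# Quoting "$2" inside the expansion keeps it from being treated as a glob.
has_prefix() {
  [ "${1#"$2"}" != "$1" ]
}

has_prefix --help -                      && echo "flag"
has_prefix https://example.com https://  && echo "url"
has_prefix ./setup.sh -                  || echo "other command"
# A scheme-less host such as example.com matches none of the shim's
# prefixes, so the shim would exec it as a command, not forward it.
has_prefix example.com https://          || echo "not forwarded"
```

One consequence of the rule, assuming callers ever pass a URL without a scheme: `docker run IMAGE example.com` is exec'd as a command named `example.com`, so URLs should be written with `http://` or `https://`.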