webclaw/docker-entrypoint.sh
Valerio b4bfff120e
fix(docker): entrypoint shim so child images with custom CMD work (#28)
v0.3.13 switched ENTRYPOINT to ["webclaw"] to make `docker run IMAGE
https://example.com` work. That broke a different use case: downstream
Dockerfiles that `FROM ghcr.io/0xmassi/webclaw` and set their own
CMD ["./setup.sh"]. The child's ./setup.sh becomes an argument to webclaw,
which tries to fetch it as a URL and fails:

  fetch error: request failed: error sending request for uri
  (https://./setup.sh): client error (Connect)

Both Dockerfile and Dockerfile.ci now use docker-entrypoint.sh which:
- forwards flags (-*) and URLs (http://, https://) to `webclaw`
- exec's anything else directly

Test matrix (all pass locally):
  docker run IMAGE https://example.com     → webclaw scrape ok
  docker run IMAGE --help                   → webclaw --help ok
  docker run IMAGE                          → default CMD, --help
  docker run IMAGE bash                     → bash runs
  FROM IMAGE + CMD ["./setup.sh"]           → setup.sh runs, webclaw available

Default CMD is ["webclaw", "--help"] so bare `docker run IMAGE` still
prints help.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 15:57:47 +02:00


#!/bin/sh
# webclaw docker entrypoint.
#
# Behaves like the real binary when the first arg looks like a webclaw arg
# (URL or flag), so `docker run ghcr.io/0xmassi/webclaw https://example.com`
# still works. But gets out of the way when the first arg looks like a
# different command (e.g. `./setup.sh`, `bash`, `sh -c ...`), so this image
# can be used as a FROM base in downstream Dockerfiles with a custom CMD.
#
# Test matrix:
#   docker run IMAGE https://example.com    → webclaw https://example.com
#   docker run IMAGE --help                 → webclaw --help
#   docker run IMAGE --file page.html       → webclaw --file page.html
#   docker run IMAGE --stdin < page.html    → webclaw --stdin
#   docker run IMAGE bash                   → bash
#   docker run IMAGE ./setup.sh             → ./setup.sh
#   docker run IMAGE                        → webclaw --help (default CMD)
#
# Root cause fixed: v0.3.13 switched CMD→ENTRYPOINT to make the first use
# case work, which trapped every command that is not a webclaw argument
# (e.g. `bash`, `./setup.sh`). This shim restores those cases.
set -e
# If the first arg starts with `-`, `http://`, or `https://`, treat the
# whole arg list as webclaw flags/URL.
if [ "$#" -gt 0 ] && {
    [ "${1#-}" != "$1" ] || \
    [ "${1#http://}" != "$1" ] || \
    [ "${1#https://}" != "$1" ]; }; then
    set -- webclaw "$@"
fi
exec "$@"
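
The dispatch rule above can be checked without building an image. The following is a minimal dry-run sketch, not part of the repository: it assumes a hypothetical helper named `dispatch` that repeats the same prefix test (`${1#prefix}` differs from `$1` only when `$1` starts with `prefix`) and prints the resulting command line instead of exec'ing it.

```shell
#!/bin/sh
# Dry-run harness for the entrypoint's dispatch rule.
# `dispatch` is a hypothetical name used only for this illustration.
set -e

dispatch() {
    # Same test as the entrypoint: a leading `-`, `http://`, or `https://`
    # means the argument list belongs to webclaw.
    if [ "$#" -gt 0 ] && {
        [ "${1#-}" != "$1" ] ||
        [ "${1#http://}" != "$1" ] ||
        [ "${1#https://}" != "$1" ]; }; then
        set -- webclaw "$@"
    fi
    # Print the command line that the real script would exec.
    printf '%s\n' "$*"
}

dispatch https://example.com    # webclaw https://example.com
dispatch --help                 # webclaw --help
dispatch --file page.html       # webclaw --file page.html
dispatch bash                   # bash
dispatch ./setup.sh             # ./setup.sh
dispatch webclaw --help         # webclaw --help (what the default CMD passes in)
```

The last call shows why the default CMD works: its first argument is `webclaw`, which matches none of the prefixes, so the shim execs it unchanged.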