mirror of
https://github.com/0xMassi/webclaw.git
synced 2026-04-25 00:06:21 +02:00
Replaces the previous benchmarks/README.md, which claimed specific numbers (94.2% accuracy, 0.8ms extraction, 97% Cloudflare bypass, etc.) with no reproducing code committed to the repo. The `webclaw-bench` crate and `benchmarks/fixtures`, `benchmarks/ground-truth` directories it referenced never existed. This is what #18 was calling out. New benchmarks/ is fully reproducible. Every number ships with the script that produced it. `./benchmarks/run.sh` regenerates everything. Results (18 sites, 90 hand-curated facts, median of 3 runs, webclaw 0.3.18, cl100k_base tokenizer): tool reduction_mean fidelity latency_mean webclaw 92.5% 76/90 (84.4%) 0.41s firecrawl 92.4% 70/90 (77.8%) 0.99s trafilatura 97.8% 45/90 (50.0%) 0.21s webclaw matches or beats both competitors on fidelity on all 18 sites while running 2.4x faster than Firecrawl's hosted API. Includes: - README.md — headline table + per-site breakdown - methodology.md — tokenizer, fact selection, run rationale - sites.txt — 18 canonical URLs - facts.json — 90 curated facts (PRs welcome to add sites) - scripts/bench.py — the runner - results/2026-04-17.json — today's raw data, median of 3 runs - run.sh — one-command reproduction Closes #18 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
27 lines
793 B
Bash
Executable file
27 lines
793 B
Bash
Executable file
#!/usr/bin/env bash
|
|
# Reproduce the webclaw benchmark.
|
|
# Requires: python3, tiktoken, trafilatura. Optional: firecrawl-py + FIRECRAWL_API_KEY.
|
|
|
|
set -euo pipefail
|
|
cd "$(dirname "$0")"
|
|
|
|
# Build webclaw if not present
|
|
if [ ! -x "../target/release/webclaw" ]; then
|
|
echo "→ building webclaw..."
|
|
(cd .. && cargo build --release)
|
|
fi
|
|
|
|
# Install python deps if missing
|
|
missing=""
|
|
python3 -c "import tiktoken" 2>/dev/null || missing+=" tiktoken"
|
|
python3 -c "import trafilatura" 2>/dev/null || missing+=" trafilatura"
|
|
if [ -n "${FIRECRAWL_API_KEY:-}" ]; then
|
|
python3 -c "import firecrawl" 2>/dev/null || missing+=" firecrawl-py"
|
|
fi
|
|
if [ -n "$missing" ]; then
|
|
echo "→ installing python deps:$missing"
|
|
python3 -m pip install --quiet $missing
|
|
fi
|
|
|
|
# Run
|
|
python3 scripts/bench.py
|