webclaw/crates
webclaw b7bd1155c6 feat(map): layered URL discovery with crawl fallback
map falls back to a bounded same-origin crawl when a site has no sitemap
or a thin one, harvesting links from each fetched page (the rich source).
Adds gzip (.xml.gz) sitemap support, deeper sitemap-index recursion + more
fallback paths, uncapped-by-default results with an optional --map-limit /
--map-pages, and routes crawler logs to stderr so --map -f json stays
machine-parseable.
2026-06-06 12:08:26 +02:00
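
The fallback described in the commit message can be pictured with a minimal sketch. This is not webclaw's actual code: `origin`, `crawl_fallback`, `map_urls`, `thin_threshold`, and `max_pages` are hypothetical names, the "thin sitemap" cutoff is an assumed parameter, and page fetching is replaced by a closure over an in-memory link table so the example runs without a network.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Returns the "scheme://host[:port]" prefix of an absolute URL, if any.
/// (Hypothetical helper; a real implementation would use a URL parser.)
fn origin(url: &str) -> Option<&str> {
    let scheme_end = url.find("://")? + 3;
    let rest = &url[scheme_end..];
    let host_len = rest
        .find(|c: char| c == '/' || c == '?' || c == '#')
        .unwrap_or(rest.len());
    if host_len == 0 {
        return None;
    }
    Some(&url[..scheme_end + host_len])
}

/// Breadth-first crawl from `seed` that fetches at most `max_pages` pages
/// and keeps only links sharing the seed's origin.
/// `fetch_links` stands in for "fetch a page and extract its links".
fn crawl_fallback<F>(seed: &str, max_pages: usize, mut fetch_links: F) -> Vec<String>
where
    F: FnMut(&str) -> Vec<String>,
{
    let Some(seed_origin) = origin(seed).map(str::to_owned) else {
        return Vec::new();
    };
    let mut seen: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<String> = VecDeque::new();
    let mut discovered: Vec<String> = Vec::new();

    seen.insert(seed.to_owned());
    queue.push_back(seed.to_owned());
    discovered.push(seed.to_owned());

    let mut fetched = 0;
    while let Some(url) = queue.pop_front() {
        if fetched >= max_pages {
            break; // the crawl is bounded by the page budget
        }
        fetched += 1;
        for link in fetch_links(&url) {
            if origin(&link) != Some(seed_origin.as_str()) {
                continue; // skip cross-origin links
            }
            if seen.insert(link.clone()) {
                discovered.push(link.clone());
                queue.push_back(link);
            }
        }
    }
    discovered
}

/// Layered discovery: use the sitemap when it has at least `thin_threshold`
/// URLs (an assumed cutoff), otherwise merge in the crawl results.
fn map_urls<F>(
    seed: &str,
    sitemap_urls: Vec<String>,
    thin_threshold: usize,
    max_pages: usize,
    fetch_links: F,
) -> Vec<String>
where
    F: FnMut(&str) -> Vec<String>,
{
    if sitemap_urls.len() >= thin_threshold {
        return sitemap_urls;
    }
    let mut out = sitemap_urls;
    let mut seen: HashSet<String> = out.iter().cloned().collect();
    for url in crawl_fallback(seed, max_pages, fetch_links) {
        if seen.insert(url.clone()) {
            out.push(url);
        }
    }
    out
}

/// A tiny in-memory "site": page URL -> links found on that page.
fn demo_site() -> HashMap<String, Vec<String>> {
    let page = |links: &[&str]| links.iter().map(|s| s.to_string()).collect::<Vec<_>>();
    let mut site = HashMap::new();
    site.insert(
        "https://example.com/".to_string(),
        page(&["https://example.com/a", "https://other.org/x", "https://example.com/b"]),
    );
    site.insert(
        "https://example.com/a".to_string(),
        page(&["https://example.com/", "https://example.com/c"]),
    );
    site.insert("https://example.com/c".to_string(), page(&["https://example.com/d"]));
    site
}

fn main() {
    let site = demo_site();
    let fetch = |u: &str| site.get(u).cloned().unwrap_or_default();
    // An empty sitemap triggers the crawl fallback.
    for url in map_urls("https://example.com/", Vec::new(), 1, 10, fetch) {
        println!("{url}");
    }
}
```

In this sketch the page budget limits fetches rather than results, so a crawl that stops early can still report links harvested from the last page it fetched. Whether webclaw's `--map-pages` and `--map-limit` draw the same distinction is not stated in the commit message.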
webclaw-cli feat(map): layered URL discovery with crawl fallback 2026-06-06 12:08:26 +02:00
webclaw-core perf(core): hot-path extraction speedups + senior-grade hardening 2026-06-04 20:22:00 +02:00
webclaw-fetch feat(map): layered URL discovery with crawl fallback 2026-06-06 12:08:26 +02:00
webclaw-llm perf(core): hot-path extraction speedups + senior-grade hardening 2026-06-04 20:22:00 +02:00
webclaw-mcp perf(core): hot-path extraction speedups + senior-grade hardening 2026-06-04 20:22:00 +02:00
webclaw-pdf perf(core): hot-path extraction speedups + senior-grade hardening 2026-06-04 20:22:00 +02:00
webclaw-server perf(core): hot-path extraction speedups + senior-grade hardening 2026-06-04 20:22:00 +02:00