# Changelog

All notable changes to webclaw are documented here.
Format follows [Keep a Changelog](https://keepachangelog.com/).

## [0.3.4] - 2026-04-01

### Added
- **SvelteKit data island extraction**: extracts structured JSON from `kit.start()` data arrays. Handles unquoted JS object keys by converting to valid JSON before parsing. Data appears in the `structured_data` field.

### Changed
- **License changed from MIT to AGPL-3.0**.

---

## [0.3.3] - 2026-04-01

### Changed
- **Replaced custom TLS stack with wreq**: migrated from webclaw-tls (patched rustls/h2/hyper/reqwest) to [wreq](https://github.com/0x676e67/wreq) by [@0x676e67](https://github.com/0x676e67). wreq uses BoringSSL for TLS and the [http2](https://github.com/0x676e67/http2) crate for HTTP/2 fingerprinting; both are battle-tested with 60+ browser profiles.
- **Removed all `[patch.crates-io]` entries**: consumers no longer need to patch rustls, h2, hyper, hyper-util, or reqwest. Just depend on webclaw normally.
- **Browser profiles rebuilt on wreq's Emulation API**: Chrome 145, Firefox 135, Safari 18, Edge 145 with correct TLS options (cipher suites, curves, GREASE, ECH, PSK session resumption), HTTP/2 SETTINGS ordering, pseudo-header order, and header wire order.
- **Better TLS compatibility**: BoringSSL handles more server configurations than patched rustls (e.g. servers that previously returned IllegalParameter alerts).

### Removed
- webclaw-tls dependency and all 5 forked crates (webclaw-rustls, webclaw-h2, webclaw-hyper, webclaw-hyper-util, webclaw-reqwest).

### Acknowledgments
- TLS and HTTP/2 fingerprinting powered by [wreq](https://github.com/0x676e67/wreq) and [http2](https://github.com/0x676e67/http2) by [@0x676e67](https://github.com/0x676e67), who pioneered browser-grade HTTP/2 fingerprinting in Rust.

---

## [0.3.2] - 2026-03-31

### Added
- **`--cookie-file` flag**: load cookies from JSON files exported by browser extensions (EditThisCookie, Cookie-Editor).
  Format: `[{name, value, domain, ...}]`.
- **MCP `cookies` parameter**: the `scrape` tool now accepts a `cookies` array for authenticated scraping.
- **Combined cookies**: `--cookie` and `--cookie-file` can be used together and merge automatically.

---

## [0.3.1] - 2026-03-30

### Added
- **Cookie warmup fallback**: when a fetch returns an Akamai challenge page, automatically visits the homepage first to collect `_abck`/`bm_sz` cookies, then retries the original URL. Enables extraction of Akamai-protected subpages (e.g. fansale ticket pages) without JS rendering.

### Changed
- Fixed HTTP header wire order (accept/user-agent were in wrong positions) and added H2 PRIORITY flag in HEADERS frames.
- `FetchResult.headers` now uses `http::HeaderMap` instead of `HashMap`, which avoids per-response allocation and preserves multi-value headers.

## [0.3.0] - 2026-03-29

### Changed
- **Replaced primp with webclaw-tls**: switched to custom TLS fingerprinting stack.
- **Browser profiles**: Chrome 146 (Win/Mac), Firefox 135+, Safari 18, Edge 146, captured from real browsers.
- **HTTP/2 fingerprinting**: SETTINGS frame ordering and pseudo-header ordering based on concepts pioneered by [@0x676e67](https://github.com/0x676e67).

### Fixed
- **HTTPS completely broken (#5)**: primp's forked rustls rejected valid certificates (UnknownIssuer on cross-signed chains like example.com). Fixed by using native OS root CAs alongside Mozilla bundle.
- **Unknown certificate extensions**: servers returning SCT in certificate entries no longer cause TLS errors.

### Added
- **Native root CA support**: uses OS trust store (macOS Keychain, Windows cert store) in addition to webpki-roots.
- **HTTP/2 fingerprinting**: SETTINGS frame ordering and pseudo-header ordering match real browsers.
- **Per-browser header ordering**: HTTP headers sent in browser-specific wire order.
- **Bandwidth tracking**: atomic byte counters shared across cloned clients.
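
The bandwidth-tracking entry above can be pictured with a minimal sketch, assuming the counters sit behind an `Arc` that every cloned client shares. `TrackedClient`, `BandwidthCounters`, and all method names here are hypothetical and are not webclaw's actual API.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical byte counters (illustrative names, not webclaw's API).
#[derive(Debug, Default)]
pub struct BandwidthCounters {
    sent: AtomicU64,
    received: AtomicU64,
}

/// Hypothetical client handle. Cloning it clones the `Arc`, not the
/// counters, so every clone updates the same totals.
#[derive(Clone, Debug, Default)]
pub struct TrackedClient {
    counters: Arc<BandwidthCounters>,
}

impl TrackedClient {
    pub fn new() -> Self {
        Self::default()
    }

    /// Add `bytes` to the shared upload total.
    pub fn record_sent(&self, bytes: u64) {
        // Relaxed ordering is enough for a plain statistics counter.
        self.counters.sent.fetch_add(bytes, Ordering::Relaxed);
    }

    /// Add `bytes` to the shared download total.
    pub fn record_received(&self, bytes: u64) {
        self.counters.received.fetch_add(bytes, Ordering::Relaxed);
    }

    /// Total bytes sent across all clones of this client.
    pub fn bytes_sent(&self) -> u64 {
        self.counters.sent.load(Ordering::Relaxed)
    }

    /// Total bytes received across all clones of this client.
    pub fn bytes_received(&self) -> u64 {
        self.counters.received.load(Ordering::Relaxed)
    }
}
```

In this sketch, calling `record_sent` on one clone is visible through `bytes_sent` on any other clone, without a mutex.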

---

## [0.2.2] - 2026-03-27

### Fixed
- **`cargo install` broken with primp 1.2.0**: added missing `reqwest` patch to `[patch.crates-io]`. primp moved to reqwest 0.13 which requires a patched fork.
- **Weekly dependency check**: CI now runs every Monday to catch primp patch drift before users hit it.

---

## [0.2.1] - 2026-03-27

### Added
- **Docker image on GHCR**: `docker run ghcr.io/0xmassi/webclaw`, auto-built on every release
- **QuickJS data island extraction**: inline `