From 9a63c1a3ca05d52899ac9104ad5e9517c65f9bb3 Mon Sep 17 00:00:00 2001
From: Valerio
Date: Thu, 4 Jun 2026 17:56:24 +0200
Subject: [PATCH] docs(contributing): describe in-process wreq TLS, drop stale patched-deps

The TLS layer moved to wreq (BoringSSL) in-process; there is no longer a
[patch.crates-io] section or a separate TLS fork. Update the architecture
tree and crate-boundary notes to match.
---
 CONTRIBUTING.md | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 3358e48..b046212 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -91,18 +91,16 @@ Body is optional but encouraged for non-trivial changes.
 
 ```
 webclaw (this repo)
-├── crates/
-│   ├── webclaw-core/     # Pure extraction engine (HTML → markdown/json/text)
-│   ├── webclaw-fetch/    # HTTP client + crawler + sitemap + batch
-│   ├── webclaw-llm/      # LLM provider chain (Ollama → OpenAI → Anthropic)
-│   ├── webclaw-pdf/      # PDF text extraction
-│   ├── webclaw-cli/      # CLI binary
-│   └── webclaw-mcp/      # MCP server binary
-│
-└── [patch.crates-io]     # Points to webclaw-tls for TLS fingerprinting
+└── crates/
+    ├── webclaw-core/     # Pure extraction engine (HTML → markdown/json/text)
+    ├── webclaw-fetch/    # HTTP client (wreq/BoringSSL) + crawler + sitemap + batch
+    ├── webclaw-llm/      # LLM provider chain (Ollama → OpenAI → Anthropic)
+    ├── webclaw-pdf/      # PDF text extraction
+    ├── webclaw-cli/      # CLI binary
+    └── webclaw-mcp/      # MCP server binary
 ```
 
-TLS fingerprinting lives in a separate repo: [webclaw-tls](https://github.com/0xMassi/webclaw-tls). The `[patch.crates-io]` section in `Cargo.toml` overrides rustls, h2, hyper, hyper-util, and reqwest with our patched forks for browser-grade JA4 + HTTP/2 Akamai fingerprinting.
+TLS fingerprinting is handled in-process by [wreq](https://crates.io/crates/wreq) (BoringSSL), so `webclaw-fetch` impersonates real browser TLS directly. There are no `[patch.crates-io]` forks or external TLS dependencies.
 
 ## Crate Boundaries
 
@@ -111,7 +109,7 @@ Changes that cross crate boundaries need extra care:
 | Crate | Network? | Key constraint |
 |-------|----------|----------------|
 | webclaw-core | No | Zero network deps, WASM-safe |
-| webclaw-fetch | Yes (webclaw-http) | Uses [webclaw-tls](https://github.com/0xMassi/webclaw-tls) for TLS fingerprinting |
+| webclaw-fetch | Yes (wreq) | Browser TLS impersonation via wreq (BoringSSL); no patched deps |
 | webclaw-llm | Yes (reqwest) | Plain reqwest — LLM APIs don't need TLS fingerprinting |
 | webclaw-pdf | No | Minimal, wraps pdf-extract |
 | webclaw-cli | Yes | Depends on all above |