Mirror of https://github.com/0xMassi/webclaw.git (synced 2026-04-25 00:06:21 +02:00)
fix(server): switch default browser profile to Firefox
Reddit blocks wreq's Chrome 145 BoringSSL fingerprint at the JA3/JA4
TLS layer even though our HTTP headers correctly impersonate Chrome.
curl from the same machine, sending the same Chrome User-Agent
string, gets a 200 from Reddit's .json endpoint; webclaw with the
Chrome profile gets a 403. The detector is therefore fingerprinting
below the HTTP header layer.
Tested all six vertical extractors with the Firefox profile:
reddit, hackernews, github_repo, pypi, npm, huggingface_model all
return correct typed JSON. Firefox is a strict improvement on the
Chrome default for sites with active TLS-level bot detection, with
no regressions on the API-flavored sites that were already working.
The real fix is a per-extractor preferred profile, but the structural
change needed to allow per-call profile selection in FetchClient is a
larger refactor. Flipping the global default is a one-line change
that ships the unblock now and lets users hit the new
/v1/scrape/{vertical} routes against Reddit immediately.
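
As a rough illustration of the per-extractor idea mentioned above, here is a
minimal sketch. It is not the project's code: `preferred_profile` and
`resolve_profile` are hypothetical names, the local `BrowserProfile` enum only
mirrors the two variants seen in this commit, and the choice of which
verticals declare a preference is an assumption made for the example.

```rust
/// Stand-in for the real enum; only the two variants named in this commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BrowserProfile {
    Chrome,
    Firefox,
}

/// Hypothetical: the profile a vertical extractor would ask for, if any.
/// Only Reddit is pinned here, since it is the one site the commit
/// message reports as rejecting the Chrome TLS fingerprint.
fn preferred_profile(vertical: &str) -> Option<BrowserProfile> {
    match vertical {
        "reddit" => Some(BrowserProfile::Firefox),
        _ => None,
    }
}

/// Hypothetical: fall back to the global default when the extractor
/// expresses no preference.
fn resolve_profile(vertical: &str, global_default: BrowserProfile) -> BrowserProfile {
    preferred_profile(vertical).unwrap_or(global_default)
}

fn main() {
    // With per-extractor selection, the global default could stay on Chrome
    // while Reddit alone is routed through the Firefox profile.
    let verticals = [
        "reddit",
        "hackernews",
        "github_repo",
        "pypi",
        "npm",
        "huggingface_model",
    ];
    for vertical in verticals {
        let profile = resolve_profile(vertical, BrowserProfile::Chrome);
        println!("{vertical}: {profile:?}");
    }
}
```

Under this design the global default would no longer need to change when a
single site starts fingerprinting; only that extractor's entry would.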
parent 8ba7538c37
commit 86182ef28a
1 changed file with 1 addition and 1 deletion
@@ -26,7 +26,7 @@ impl AppState {
     /// state don't churn per request.
     pub fn new(api_key: Option<String>) -> anyhow::Result<Self> {
         let config = FetchConfig {
-            browser: BrowserProfile::Chrome,
+            browser: BrowserProfile::Firefox,
             ..FetchConfig::default()
         };
         let fetch = FetchClient::new(config)