mirror of
https://github.com/0xMassi/webclaw.git
synced 2026-06-13 23:15:13 +02:00
Webclaw's default -t timeout is 30s; slow sites previously sat
silently with no feedback. Now during a fetch, every 10s of elapsed
time webclaw writes one line to stderr:
# webclaw: still fetching <URL> (Ns)
Fetches completing in under 10s emit nothing (the timer never fires).
Stdout is untouched; the line is purely a feedback signal on stderr.
No timeout change. No new flags. Default behavior is augmented at
stderr only.
Implemented via tokio::select! between the fetch future and a
tokio::time::interval. Latency cost: a single tokio task spawn
and a 10s tick - microseconds on the fast path.
10 new tests in webclaw-fetch::progress::tests (none ignored; the
slow-future test uses a 50ms test interval to keep cargo test fast).
Workspace total 710 -> 720.
(cherry picked from commit 06f065cb08)
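
The behavior described above (silent on the fast path, one stderr line per elapsed interval) can be sketched without any dependencies. This is only an illustration: it uses a std worker thread and `recv_timeout` in place of the `tokio::select!` plus `tokio::time::interval` approach the commit names, and `run_with_progress` and `progress_line` are hypothetical helpers, not the crate's `with_progress`, whose signature is not shown here. The sink is a generic writer so the output can be captured; the real feature writes to stderr.

```rust
use std::io::Write;
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Formats a progress line in the shape quoted in the commit message.
/// (Hypothetical helper, not part of webclaw-fetch.)
fn progress_line(url: &str, elapsed_secs: u64) -> String {
    format!("# webclaw: still fetching {url} ({elapsed_secs}s)")
}

/// Hypothetical std-only analogue of the progress wrapper: runs `work` on a
/// worker thread and writes one line to `sink` each time `interval` passes
/// without a result. Work that finishes before the first interval writes nothing.
fn run_with_progress<T, F, W>(url: &str, interval: Duration, sink: &mut W, work: F) -> T
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
    W: Write,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the caller panicked; ignore that.
        let _ = tx.send(work());
    });

    let start = Instant::now();
    loop {
        match rx.recv_timeout(interval) {
            // Result arrived: return it without touching the sink.
            Ok(value) => return value,
            // Interval elapsed with no result yet: emit one progress line.
            Err(mpsc::RecvTimeoutError::Timeout) => {
                let _ = writeln!(sink, "{}", progress_line(url, start.elapsed().as_secs()));
            }
            // The worker dropped its sender without sending, i.e. it panicked.
            Err(mpsc::RecvTimeoutError::Disconnected) => {
                panic!("worker thread exited without producing a result")
            }
        }
    }
}
```

Passing `&mut std::io::stderr()` as the sink with a 10s interval would approximate the described output. As with the 50ms test interval mentioned above, a short interval keeps any test of the slow path fast.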
31 lines
1,010 B
Rust
//! webclaw-fetch: HTTP client layer with browser TLS fingerprint impersonation.
//! Uses wreq (BoringSSL) for browser-grade TLS + HTTP/2 fingerprinting.
//! Automatically detects PDF responses and delegates to webclaw-pdf.
pub mod browser;
pub mod client;
pub mod cloud;
pub mod crawler;
pub mod document;
pub mod error;
pub mod extractors;
pub mod fetcher;
pub mod linkedin;
pub mod locale;
pub mod progress;
pub mod proxy;
pub mod reddit;
pub mod sitemap;
pub mod tls;
pub mod url_security;

pub use browser::BrowserProfile;
pub use client::{BatchExtractResult, BatchResult, FetchClient, FetchConfig, FetchResult};
pub use crawler::{CrawlConfig, CrawlResult, CrawlState, Crawler, PageResult};
pub use error::FetchError;
pub use fetcher::Fetcher;
pub use http::HeaderMap;
pub use locale::{accept_language_for_tld, accept_language_for_url};
pub use progress::{with_progress, PROGRESS_INTERVAL};
pub use proxy::{parse_proxy_file, parse_proxy_line};
pub use sitemap::SitemapEntry;
pub use webclaw_pdf::PdfMode;