feat(fetch): periodic progress stderr line on slow fetches

Webclaw's default -t timeout is 30s; slow sites previously sat
silently with no feedback. Now during a fetch, every 10s of elapsed
time webclaw writes one line to stderr:

  # webclaw: still fetching <URL> (Ns)

Fetches completing in under 10s emit nothing (the timer never fires).
Stdout output is untouched - pure feedback signal on stderr.

No timeout change. No new flags. Default behavior is augmented at
stderr only.

Implemented via tokio::select! between the fetch future and a
tokio::time::interval. Latency cost: a single tokio task spawn
and a 10s tick - microseconds on the fast path.

10 new tests in webclaw-fetch::progress::tests (none ignored; the
slow-future test uses a 50ms test interval to keep cargo test fast).
Workspace total 710 -> 720.

(cherry picked from commit 06f065cb08)
This commit is contained in:
devnen 2026-05-24 00:24:51 +02:00 committed by Valerio
parent a1abf625a0
commit 985a90b083
3 changed files with 299 additions and 2 deletions

View file

@ -859,8 +859,11 @@ async fn fetch_and_extract(cli: &Cli) -> Result<FetchOutput, String> {
let client =
FetchClient::new(build_fetch_config(cli)).map_err(|e| format!("client error: {e}"))?;
let options = build_extraction_options(cli);
let result = client
.fetch_and_extract_with_options(url, &options)
// M13: wrap with periodic stderr progress emitter. Fast fetches see
// zero emissions (timer never fires in <10s); slow fetches get a
// line every 10s of elapsed time so the CLI doesn't appear hung.
let fetch_fut = client.fetch_and_extract_with_options(url, &options);
let result = webclaw_fetch::with_progress(url, fetch_fut)
.await
.map_err(|e| format!("fetch error: {e}"))?;