Mirror of https://github.com/0xMassi/webclaw.git, synced 2026-06-06 22:05:13 +02:00
Port the valid PR #43 LLM cleanup fixes onto current main without stale branch regressions.

Includes comment-count link cleanup, bare numeric paragraph cleanup, pagination leftover cleanup, JSON-LD article body scrubbing, clearer CLI consent-wall warnings, and quieter parser logs by default.

Thanks to @devnen for the report and patch work.

| Name | | |
|---|---|---|
| llm | ||
| brand.rs | ||
| data_island.rs | ||
| diff.rs | ||
| domain.rs | ||
| error.rs | ||
| extractor.rs | ||
| js_eval.rs | ||
| lib.rs | ||
| markdown.rs | ||
| metadata.rs | ||
| noise.rs | ||
| structured_data.rs | ||
| types.rs | ||
| youtube.rs | ||