mirror of
https://github.com/0xMassi/webclaw.git
synced 2026-06-09 22:35:12 +02:00
Extraction ~22% faster on the corpus benchmark with byte-identical output: - hoist recompiled CSS selectors in the markdown noise path - single-pass shared og() meta parsing across vertical extractors - output-safe QuickJS gating (skip the JS VM when no candidate data) + reuse the already-parsed document instead of re-parsing - wreq connect_timeout + connection-pool tuning; dedup the retry loop Reliability + correctness: - char-boundary-safe truncation of LLM error bodies (shared helper) - HTTP connect/read timeouts on all LLM provider clients - isolate pdf-extract behind catch_unwind + spawn_blocking - OSS server: crawl inherits the shared fetch profile; ProviderChain built once in AppState; request TimeoutLayer API / safety / docs: - #[non_exhaustive] on public enums + result structs (+ builders) - #![forbid(unsafe_code)] on pure crates, deny on llm - //! crate docs + doctests; scrub bypass/vendor/target specifics from public crate docs and comments Tooling: [profile.release] lto/codegen-units/strip, MSRV pin, deny.toml + cargo-deny CI, macOS test matrix. CLI main.rs split into focused modules.
69 lines
1.7 KiB
YAML
69 lines
1.7 KiB
YAML
name: CI
|
|
|
|
on:
|
|
push:
|
|
branches: [main]
|
|
pull_request:
|
|
|
|
env:
|
|
CARGO_TERM_COLOR: always
|
|
RUSTFLAGS: "--cfg reqwest_unstable"
|
|
|
|
jobs:
|
|
test:
|
|
name: Test
|
|
runs-on: ${{ matrix.os }}
|
|
strategy:
|
|
fail-fast: false
|
|
matrix:
|
|
os: [ubuntu-latest, macos-latest]
|
|
steps:
|
|
- uses: actions/checkout@v5
|
|
- uses: dtolnay/rust-toolchain@stable
|
|
- uses: Swatinem/rust-cache@v2
|
|
- run: cargo test --workspace
|
|
|
|
lint:
|
|
name: Lint
|
|
runs-on: ubuntu-latest
|
|
steps:
|
|
- uses: actions/checkout@v5
|
|
- uses: dtolnay/rust-toolchain@stable
|
|
with:
|
|
components: clippy, rustfmt
|
|
- uses: Swatinem/rust-cache@v2
|
|
- run: cargo fmt --check --all
|
|
- run: cargo clippy --all --all-targets -- -D warnings
|
|
|
|
deny:
|
|
name: Supply chain
|
|
runs-on: ubuntu-latest
|
|
steps:
|
|
- uses: actions/checkout@v5
|
|
- uses: EmbarkStudios/cargo-deny-action@v2
|
|
with:
|
|
command: check advisories bans licenses sources
|
|
|
|
wasm:
|
|
name: WASM
|
|
runs-on: ubuntu-latest
|
|
steps:
|
|
- uses: actions/checkout@v5
|
|
- uses: dtolnay/rust-toolchain@stable
|
|
with:
|
|
targets: wasm32-unknown-unknown
|
|
- uses: Swatinem/rust-cache@v2
|
|
# webclaw-core must stay WASM-safe (zero network deps, no threads).
|
|
# Check both with and without default features so the quickjs gate
|
|
# can't regress.
|
|
- run: cargo check --target wasm32-unknown-unknown -p webclaw-core
|
|
- run: cargo check --target wasm32-unknown-unknown -p webclaw-core --no-default-features
|
|
|
|
docs:
|
|
name: Docs
|
|
runs-on: ubuntu-latest
|
|
steps:
|
|
- uses: actions/checkout@v5
|
|
- uses: dtolnay/rust-toolchain@stable
|
|
- uses: Swatinem/rust-cache@v2
|
|
- run: cargo doc --no-deps --workspace
|