No commits in common. "firefox-1" and "main" have entirely different histories.

111 changed files with 24222 additions and 614 deletions

.githooks/pre-push

@@ -0,0 +1,31 @@
#!/bin/sh
# Pre-push hook: blocks push if the test suite isn't fully green.
#
# Enable once with:
# git config core.hooksPath .githooks
#
# Bypass for a known-broken WIP push (NOT for releases):
# git push --no-verify
# The --no-verify flag is the only escape hatch. Use it sparingly and never
# for branches that feed into a release.
set -e
echo "[pre-push] running unit + integration tests before push..."
# Run from this script's directory so it works regardless of where the user
# invoked git push from.
cd "$(dirname "$0")/.."
# Default pyproject addopts skip slow/e2e. That's the gate we want for every
# push — fast feedback. e2e is reserved for explicit release runs.
if ! python -m pytest -q --tb=short; then
echo ""
echo "[pre-push] TESTS FAILED — push aborted."
echo "[pre-push] Either fix the failure or use 'git push --no-verify' if"
echo "[pre-push] you really know what you're doing (NOT for release branches)."
exit 1
fi
echo "[pre-push] all tests green — push proceeding."
exit 0
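
The gate logic of the hook above, sketched in Python for readers who want the same check outside a shell (for example on Windows without `sh`). This is an illustration only, not part of the change set; `run_gate` is a hypothetical name, and unlike the hook it does not `cd` to the repository root first.

```python
import subprocess
import sys


def run_gate(cmd):
    """Run the test command; return 0 if it passed, 1 otherwise."""
    result = subprocess.run(cmd)
    if result.returncode != 0:
        # Mirrors the hook: any non-zero test exit aborts the push.
        print("[pre-push] TESTS FAILED, push aborted.", file=sys.stderr)
        return 1
    print("[pre-push] all tests green, push proceeding.")
    return 0
```

Calling `run_gate([sys.executable, "-m", "pytest", "-q", "--tb=short"])` runs the same command as the hook; git aborts the push when the hook process exits non-zero.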


@@ -0,0 +1,98 @@
name: Launch failure
description: Browser or wrapper fails to start (install errors, missing deps, profile load fails, never reaches new_page)
title: "[launch] "
labels: ["bug", "launch-failure"]
body:
- type: markdown
attributes:
value: |
Use this when the browser never reaches a usable state.
If the browser starts and the bug appears on a specific site or action, use the site/action template instead.
- type: input
id: version
attributes:
label: Version
description: Output of `python -m invisible_playwright version`.
placeholder: 0.1.7 (binary firefox-7)
validations:
required: true
- type: dropdown
id: os
attributes:
label: OS
options:
- Windows 10/11 x86_64
- Linux x86_64
- macOS (unsupported)
- Other
validations:
required: true
- type: input
id: python
attributes:
label: Python
placeholder: 3.11.7
validations:
required: true
- type: input
id: install_cmd
attributes:
label: How you installed
placeholder: pip install invisible_playwright
validations:
required: true
- type: textarea
id: snippet
attributes:
label: What you ran
description: Stop at the line that errors out. Redact creds.
render: python
value: |
from invisible_playwright import InvisiblePlaywright
with InvisiblePlaywright(seed=42) as browser:
    ctx = browser.new_context()
validations:
required: true
- type: textarea
id: traceback
attributes:
label: Full traceback
description: The whole stack trace verbatim. Don't summarize.
render: text
validations:
required: true
- type: textarea
id: logs
attributes:
label: Extra logs
description: Output of `DEBUG=pw:browser* python yourscript.py 2>&1`. Optional but speeds things up.
render: text
validations:
required: false
- type: textarea
id: tried
attributes:
label: What you already tried
description: Reinstall, clear cache, different Python version, different proxy, etc.
validations:
required: false
- type: checkboxes
id: confirm
attributes:
label: Before submitting
options:
- label: Searched existing issues.
required: true
- label: On the latest released version.
required: true
- label: Removed credentials and personal paths from the snippet and logs.
required: true


@@ -0,0 +1,167 @@
name: Site or action bug
description: Browser starts fine but a navigation, click, evaluate, or other operation fails or behaves wrong
title: "[bug] "
labels: ["bug"]
body:
- type: markdown
attributes:
value: |
For bugs that happen after the browser is up.
If the browser never launches, use the launch failure template.
If a fingerprint detector flags the browser, use the stealth detection template.
- type: input
id: version
attributes:
label: Version
description: Output of `python -m invisible_playwright version`.
placeholder: 0.1.7 (binary firefox-7)
validations:
required: true
- type: dropdown
id: os
attributes:
label: OS
options:
- Windows 10/11 x86_64
- Linux x86_64
- macOS (unsupported)
- Other
validations:
required: true
- type: input
id: python
attributes:
label: Python
placeholder: 3.11.7
validations:
required: true
- type: dropdown
id: headless
attributes:
label: headless=
description: Some bugs only repro on Windows headless=True (hidden alt-desktop path).
options:
- "True"
- "False"
validations:
required: true
- type: dropdown
id: proxy
attributes:
label: Proxy
description: Sites often vary by IP geo (e.g. GDPR consent shows only on UK/EU).
options:
- No proxy (host network)
- Residential, UK/GB
- Residential, US
- Residential, other country (specify in notes)
- Datacenter (specify provider in notes)
validations:
required: true
- type: dropdown
id: profile
attributes:
label: Profile dir
options:
- Fresh each run (no profile_dir)
- Persistent profile_dir, reusing across runs
- Persistent profile_dir, first run creating it
validations:
required: true
- type: input
id: url
attributes:
label: URL
description: The exact URL passed to `page.goto`. Not "the homepage" — the literal string.
placeholder: https://id.sky.com/
validations:
required: true
- type: textarea
id: snippet
attributes:
label: Runnable reproduction
description: A complete snippet we can copy, paste, run. Stub creds with placeholders, keep everything else literal.
render: python
value: |
from invisible_playwright import InvisiblePlaywright
with InvisiblePlaywright(seed=42, headless=True) as browser:
    ctx = browser.new_context()
    page = ctx.new_page()
    page.goto("https://example.com/")
    # the exact operation that fails:
    page.click("button:has-text('Accept all')")
validations:
required: true
- type: input
id: selector
attributes:
label: Selector or locator
description: The exact string passed to locator/click/frame_locator. Write N/A if not a selector bug.
placeholder: page.frame_locator("iframe[id^='sp_message_iframe_']").get_by_text("Accept all")
validations:
required: true
- type: textarea
id: expected
attributes:
label: Expected
description: What should happen when the snippet runs?
validations:
required: true
- type: textarea
id: actual
attributes:
label: Actual
description: What happens instead? Full traceback, error string verbatim, any page.on('crash') firing.
validations:
required: true
- type: textarea
id: screenshot
attributes:
label: Screenshot
description: Drag-drop a screenshot if the bug is visual. Optional but useful.
validations:
required: false
- type: textarea
id: logs
attributes:
label: Browser logs
description: Output of `DEBUG=pw:browser* python yourscript.py 2>&1 | tail -200`. Redact creds and real IPs.
render: text
validations:
required: false
- type: textarea
id: notes
attributes:
label: Notes
description: Anything else, hypotheses, related issues, things you've already tried.
validations:
required: false
- type: checkboxes
id: confirm
attributes:
label: Before submitting
options:
- label: Searched existing issues.
required: true
- label: On the latest released version.
required: true
- label: The snippet above runs end-to-end on a clean Python install.
required: true
- label: Removed credentials, proxy passwords, real IPs, personal file paths.
required: true
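
The templates above ask reporters to remove credentials, proxy passwords and real IPs before pasting logs. A minimal sketch of such a scrubber, assuming credentials appear as `user:password@` inside URLs and IPs as dotted IPv4; `redact` is a hypothetical helper, not something the package provides, and the patterns are deliberately loose.

```python
import re

# user:password@ inside a URL, e.g. http://user:pw@proxy.example:8000
_CREDS = re.compile(r"(?<=://)[^/\s:@]+:[^/\s@]+(?=@)")
# Dotted-quad IPv4 addresses (loose match; octet ranges are not validated).
_IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact(text):
    """Replace URL credentials and IPv4 addresses with placeholders."""
    text = _CREDS.sub("<user>:<password>", text)
    return _IPV4.sub("<ip>", text)
```

Three-part version strings such as `0.1.7` are left alone because the IPv4 pattern needs four numeric groups. Personal file paths are not handled and still need a manual pass.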


@@ -0,0 +1,141 @@
name: Stealth detection
description: A fingerprint detector flagged the browser as a bot, VM, VPN, anti-detect, tampered, or otherwise non-human
title: "[detect] "
labels: ["bug", "stealth"]
body:
- type: markdown
attributes:
value: |
Use this when something detects the browser (Fingerprint Pro, CreepJS, BotD, reCAPTCHA, Cloudflare, sannysoft, etc).
Bugs in operations (clicks, navigation) go to the site/action template.
Browser failing to start goes to the launch failure template.
- type: input
id: version
attributes:
label: Version
placeholder: 0.1.7 (binary firefox-7)
validations:
required: true
- type: dropdown
id: os
attributes:
label: OS
options:
- Windows 10/11 x86_64
- Linux x86_64
- macOS (unsupported)
- Other
validations:
required: true
- type: dropdown
id: headless
attributes:
label: headless=
options:
- "True"
- "False"
validations:
required: true
- type: dropdown
id: proxy
attributes:
label: Proxy
description: Datacenter or wrong-country proxies trip most detectors regardless of the browser. Be honest about what you used.
options:
- No proxy (host network)
- Residential, matching target geo
- Residential, different geo than target
- Datacenter (specify provider in notes)
- Mobile / 4G
validations:
required: true
- type: input
id: detector
attributes:
label: Detector name and URL
description: Exact site / service / product that flagged us.
placeholder: Fingerprint Pro — https://demo.fingerprint.com/playground
validations:
required: true
- type: textarea
id: scores
attributes:
label: Detector verdict
description: Paste the relevant flags / scores verbatim. For Fingerprint Pro paste `bot`, `vpn`, `virtual_machine`, `tampering*`, `vm_ml_score`, `suspect_score`. For CreepJS the headless / lies / trust scores. For reCAPTCHA v3 the score number.
render: text
placeholder: |
bot: bad
vpn: true
virtual_machine: true
vm_ml_score: 0.74
suspect_score: 22
validations:
required: true
- type: textarea
id: screenshot
attributes:
label: Screenshot of the detector result
description: Drag-drop a screenshot of the detector page so we see what you see.
validations:
required: true
- type: textarea
id: snippet
attributes:
label: How you launched
description: The InvisiblePlaywright launch + navigation that produced the result above. Redact creds.
render: python
value: |
from invisible_playwright import InvisiblePlaywright
with InvisiblePlaywright(seed=42, headless=True) as browser:
    ctx = browser.new_context()
    page = ctx.new_page()
    page.goto("https://demo.fingerprint.com/playground")
validations:
required: true
- type: textarea
id: expected
attributes:
label: What you expected
description: Most detectors will never give a perfect score for any browser. Tell us what threshold you'd accept (e.g. bot=not_detected, vm_ml_score < 0.3).
validations:
required: true
- type: textarea
id: full_report
attributes:
label: Full detector response
description: For Fingerprint Pro paste the JSON from /api/event/v4/ if you have it. For CreepJS paste the full Smart Signals block. Optional but speeds things up a lot.
render: json
validations:
required: false
- type: textarea
id: notes
attributes:
label: Notes
validations:
required: false
- type: checkboxes
id: confirm
attributes:
label: Before submitting
options:
- label: Searched existing issues.
required: true
- label: On the latest released version.
required: true
- label: The detector verdict above is from a real run, not a hypothesis.
required: true
- label: Removed credentials, real IPs, FpJS visitor_id values, personal file paths from the snippet and full report.
required: true
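
The stealth template asks for a verdict pasted as `key: value` lines plus an acceptance threshold (its own example is `bot=not_detected, vm_ml_score < 0.3`). A sketch of how a triager might check one against the other; the field names come from the template's placeholder, while `parse_verdict`, `acceptable` and the default cutoff are hypothetical.

```python
def parse_verdict(text):
    """Parse 'key: value' lines into a dict, converting booleans and numbers."""
    verdict = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a key/value separator
        key, value = key.strip(), value.strip()
        if value.lower() in ("true", "false"):
            verdict[key] = value.lower() == "true"
            continue
        try:
            verdict[key] = float(value)
        except ValueError:
            verdict[key] = value  # keep non-numeric values as plain strings
    return verdict


def acceptable(verdict, max_vm_ml_score=0.3):
    """Example rule: bot not detected and VM score under the cutoff."""
    return (
        verdict.get("bot") == "not_detected"
        and verdict.get("vm_ml_score", 1.0) < max_vm_ml_score
    )
```

A missing `vm_ml_score` defaults to 1.0, so an incomplete report is treated as failing the threshold instead of passing it.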

.github/ISSUE_TEMPLATE/config.yml

@@ -0,0 +1,11 @@
blank_issues_enabled: false
contact_links:
- name: Security vulnerability
url: https://github.com/feder-cr/invisible_playwright/security/advisories/new
about: Report a security issue privately. Do NOT open a public issue.
- name: Bug in the patched Firefox source (C++, IDL, Juggler JS)
url: https://github.com/feder-cr/invisible_firefox/issues
about: Source-level patches in the Firefox fork go in the invisible_firefox repo. Detection results (FpJS, CreepJS, etc.) use the stealth detection template here.
- name: Question or general discussion
url: https://github.com/feder-cr/invisible_playwright/discussions
about: Usage questions, ideas, chat. Bugs and features still go in issues.


@@ -0,0 +1,47 @@
name: Feature request
description: Suggest a new feature or improvement
title: "[feature] "
labels: ["enhancement"]
body:
- type: markdown
attributes:
value: |
Thanks for the suggestion! Please check that:
- Your idea is **in scope** for this repo (the Python wrapper, sampler, CLI, packaging).
- Changes to the patched Firefox C++ source belong at [feder-cr/invisible_firefox](https://github.com/feder-cr/invisible_firefox) instead.
- You have searched [existing issues](https://github.com/feder-cr/invisible_playwright/issues?q=is%3Aissue) for similar requests.
- type: textarea
id: problem
attributes:
label: Problem
description: What problem does this solve? What can't you currently do, or what is awkward today?
validations:
required: true
- type: textarea
id: proposal
attributes:
label: Proposed solution
description: How would the feature work? API sketches, CLI examples, or pseudocode welcome.
validations:
required: true
- type: textarea
id: alternatives
attributes:
label: Alternatives considered
description: Other approaches you thought about and why they fall short.
validations:
required: false
- type: textarea
id: context
attributes:
label: Additional context
description: Links to related issues, prior art in other libraries, screenshots, etc.
validations:
required: false
- type: checkboxes
id: contribute
attributes:
label: Are you willing to contribute?
options:
- label: I'd be willing to open a PR for this if accepted.
required: false

.github/PULL_REQUEST_TEMPLATE.md

@@ -0,0 +1,40 @@
<!--
Thanks for your contribution! Please fill in the sections below.
PRs that don't follow this template may be asked for revision before review.
-->
## Summary
<!-- One or two sentences: what does this PR change and why? -->
## Type of change
<!-- Tick all that apply -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that changes existing behavior)
- [ ] Documentation only
- [ ] Tests / CI / tooling
## Related issues
<!-- Link any related issues, e.g. "Closes #123", "Refs #456" -->
## How was this tested?
<!--
Describe what you ran:
- `pytest` (default, unit + integration)
- `pytest -m e2e` (against the patched binary)
- Manual repro steps, screenshots, etc.
-->
## Checklist
- [ ] I have read [CONTRIBUTING.md](../CONTRIBUTING.md).
- [ ] My commits follow [Conventional Commits](https://www.conventionalcommits.org/).
- [ ] I added or updated tests covering the change.
- [ ] `pytest` passes locally.
- [ ] I updated `README.md` / `docs/` if user-visible behavior changed.
- [ ] My change is in scope for this repo (Python wrapper / sampler / CLI / packaging — not the patched Firefox C++ source).
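
The checklist above asks for Conventional Commits. A rough local check of a commit subject line might look like the sketch below. The type list is the commonly used Angular-style set, which is an assumption here: the spec itself allows other types, and this repo may accept more. `is_conventional` is a hypothetical helper.

```python
import re

# <type>[(scope)][!]: <description>
_HEADER = re.compile(
    r"^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)"
    r"(\([^()\s]+\))?!?: \S.*$"
)


def is_conventional(subject):
    """True if a commit subject line matches the simplified pattern above."""
    return _HEADER.match(subject) is not None
```

Only the first line of a commit message is checked. Footers such as `BREAKING CHANGE:` are out of scope for this sketch.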

.github/workflows/e2e.yml

@@ -0,0 +1,52 @@
# ─────────────────────────────────────────────────────────────────────────────
# e2e.yml: run the FULL browser-driving e2e suite (the 127 tests marked
# @pytest.mark.e2e) on GitHub, on every push/PR to main.
#
# Why this can run on CI when the drive-gate had to stay light: the drive-gate
# launched Firefox in true HEADLESS mode, which is content-process unstable on
# the hosted runners (eval-CSP / context-destroyed). The stealth wrapper instead
# launches Firefox HEADED on a real display; under `xvfb-run` (a virtual X
# server) that's exactly what we get on a headless CI box — stable, and the same
# thing webrtc-e2e.yml already relies on.
#
# Secret-free, so it's safe in public CI: the binary is the PUBLIC firefox-9
# release (no token), and the webrtc e2e tests fake a local TCP-only SOCKS proxy.
# The proxy realness gate (fppro / smartproxy) is NOT here: it needs secrets and
# stays a local pre-release gate.
# ─────────────────────────────────────────────────────────────────────────────
name: e2e
on:
push:
branches: [main]
pull_request:
branches: [main]
workflow_dispatch:
permissions:
contents: read
jobs:
e2e:
name: e2e (linux, xvfb)
runs-on: ubuntu-24.04
timeout-minutes: 40
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with: { fetch-depth: 1 }
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with: { python-version: '3.11' }
- name: Install wrapper + test deps (+ pinned Playwright)
run: |
python -m pip install --upgrade pip
python -m pip install ".[dev]"
python -m pip install "playwright==$(cat scripts/playwright_pin.txt)"
- name: System deps (xvfb + Firefox runtime libs)
run: |
sudo apt-get update
sudo apt-get install -y xvfb
sudo "$(which python)" -m playwright install-deps firefox
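- name: Fetch the published firefox binary
run: |
# Capture first: inside `echo "$(...)"` a fetch failure would be silently ignored.
FF="$(python -m invisible_playwright fetch | tail -1)"
[ -e "$FF" ] || { echo "ERROR: fetch did not print an existing path (got: '$FF')"; exit 1; }
echo "FF=$FF" >> "$GITHUB_ENV"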
- name: Run the full e2e suite under a virtual display
run: xvfb-run -a python scripts/run_e2e.py "$FF"
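
The fetch step above hands the binary path to the next step by appending `FF=<path>` to the file named by `$GITHUB_ENV`. The same mechanism sketched in Python; `last_line` and `export_env` are hypothetical helpers, and it is assumed, as the workflow does, that `fetch` prints the path as its final line.

```python
def last_line(output):
    """Return the last non-empty line (like `tail -1`, but skipping trailing blanks)."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    return lines[-1] if lines else ""


def export_env(name, value, env_file):
    """Append NAME=value to the file GitHub Actions reads job env vars from."""
    with open(env_file, "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")
```

Inside a job this would be called as `export_env("FF", path, os.environ["GITHUB_ENV"])`. Values written this way become visible to later steps, not to the step that wrote them.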


@@ -0,0 +1,106 @@
name: firefox-launch-matrix
# Cross-Windows-edition smoke for the shipped firefox-N binary.
# Triggered by issue #22 (firefox-7 SxS mismatch on Win11 build 26200,
# reporter `jannusdorfer-create`).
#
# Runs the exact reporter snippet on every Windows runner GitHub offers,
# from a fresh checkout. If any matrix cell fails the same way, the bug
# is reproducible on at least one clean-ish environment and we ship a
# sidecar mozglue.manifest fix. If all cells pass, the bug is confined
# to the reporter's specific environment (Pro/Enterprise GPO, EDR, etc.).
on:
workflow_dispatch:
push:
branches: [main]
paths:
- '.github/workflows/firefox-launch-matrix.yml'
jobs:
smoke:
name: launch (${{ matrix.os }}, py${{ matrix.python }})
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [windows-2022, windows-2025, windows-latest]
python: ["3.11", "3.12", "3.13"]
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python }}
cache: pip
- name: Windows edition + build info
shell: pwsh
run: |
$os = Get-CimInstance Win32_OperatingSystem
Write-Host "Caption : $($os.Caption)"
Write-Host "BuildNumber: $($os.BuildNumber)"
Write-Host "OSArch : $($os.OSArchitecture)"
Write-Host "Edition : $($os.OperatingSystemSKU)"
Write-Host "---"
Write-Host "VC++ Redistributables installed:"
Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\*' `
-ErrorAction SilentlyContinue |
Where-Object { $_.DisplayName -like '*Visual C++*Redist*' } |
Select-Object DisplayName, DisplayVersion |
Format-Table -AutoSize
- name: Install package from this commit
run: |
python -m pip install --upgrade pip
pip install .
- name: Fetch firefox-7 binary
run: python -m invisible_playwright fetch
- name: Verify firefox.exe can launch standalone (the snippet that fails for issue #22)
shell: pwsh
run: |
# The platformdirs path has the duplicated `invisible-playwright` segment
# on Windows (user_cache_dir convention).
$ffPath = "$env:LOCALAPPDATA\invisible-playwright\invisible-playwright\Cache\firefox-7\firefox.exe"
if (-not (Test-Path $ffPath)) {
Write-Error "firefox.exe NOT FOUND at $ffPath"
exit 1
}
Write-Host "Launching: $ffPath --version"
# NOTE: firefox.exe --version on Windows prints the version but may
# return non-zero exit code (sub-process fork quirk). Check stdout.
$output = & $ffPath --version 2>&1 | Out-String
Write-Host "Output: $output"
if ($output -notmatch 'Mozilla Firefox \d') {
Write-Error "firefox.exe --version did not print a Mozilla Firefox version. Output was: $output"
exit 1
}
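Write-Host "OK: firefox.exe runs and prints version."
# The Actions pwsh wrapper ends the step with $LASTEXITCODE, so a non-zero
# exit from `--version` (see NOTE above) would still fail it. Exit cleanly.
exit 0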
- name: Run reporter's exact InvisiblePlaywright snippet
run: |
python -c "
import asyncio
from invisible_playwright.async_api import InvisiblePlaywright
async def main():
    async with InvisiblePlaywright(seed=9128) as browser:
        page = await browser.new_page()
        await page.goto('about:blank')
        print('OK: page loaded, url =', page.url)
asyncio.run(main())
"
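# `env` only holds variables defined in the workflow, not the runner's own
# environment, so LOCALAPPDATA has to be exported before it can be used below.
- name: Resolve binary cache dir for diagnostics
if: failure()
shell: pwsh
run: |
"FF_CACHE=$env:LOCALAPPDATA\invisible-playwright\invisible-playwright\Cache\firefox-7" >> $env:GITHUB_ENV
- name: Upload diagnostics on failure
if: failure()
uses: actions/upload-artifact@v4
with:
name: launch-failure-${{ matrix.os }}-py${{ matrix.python }}
path: |
${{ env.FF_CACHE }}/firefox.exe
${{ env.FF_CACHE }}/mozglue.dll
if-no-files-found: warn
retention-days: 7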
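
The standalone launch check in this workflow ignores the exit code of `firefox.exe --version` and only inspects what was printed. The same idea sketched in Python; the function names are hypothetical, and the pattern is the one the PowerShell step uses.

```python
import re
import subprocess

_VERSION = re.compile(r"Mozilla Firefox \d")


def version_output(argv):
    """Run the command and return stdout+stderr; the exit code is deliberately ignored."""
    proc = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.stdout


def prints_firefox_version(output):
    """True if the output contains a 'Mozilla Firefox <digit>' banner."""
    return _VERSION.search(output) is not None
```

Typical use would be `prints_firefox_version(version_output([ff_path, "--version"]))`, where `ff_path` is the fetched binary.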

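The release workflow that follows gates the Windows and macOS packages on juggler being present inside `omni.ja`, using a `python3 -c` one-liner. Unrolled for readability, the check is roughly the sketch below; `archive_has_juggler` is a hypothetical name. It relies, as the workflow does, on `omni.ja` being readable with `zipfile`.

```python
import zipfile


def archive_has_juggler(path):
    """True if any member name in the zip-format archive contains 'juggler'."""
    with zipfile.ZipFile(path) as archive:
        return any("juggler" in name.lower() for name in archive.namelist())
```

This is a presence check on member names only. It says nothing about whether the juggler inside actually works, which is what the workflow's separate drive gate is for.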
.github/workflows/release.yml

@@ -0,0 +1,479 @@
# ─────────────────────────────────────────────────────────────────────────────
# release.yml — build all 5 patched-Firefox targets at $0 and publish them as
# DRAFT GitHub Release assets, named per the wrapper contract (constants.ARCHIVE_NAME).
# DRAFT on purpose: a human runs the realness gate and only THEN un-drafts + bumps
# BINARY_VERSION. Nothing auto-ships (issue #14 lesson).
#
# PACKAGING (issue #14: dangling symlinks broke 265 downloads — never again):
# Linux → cp -aL (dereference ALL symlinks into real files) + rm dev tools +
# strip + sanitize + tar at ROOT, then validate_release.py as a HARD
# in-pipeline gate (the exact battle-tested script from the source repo).
# Win → mach package; zip the CONTENTS of dist/firefox (clean tree, NOT
# dist/bin) so firefox.exe sits at the zip ROOT.
# macOS → mach package; ad-hoc codesign the .app; PRESERVE its internal relative
# symlinks (a .app legitimately has them — cp -aL would break it); verify
# every symlink is relative+internal; tar the bundle. --version self-gate.
#
# DRIVE GATE (the firefox-8 catcher): after build, every binary is DRIVEN by
# Playwright on its native runner (launch via juggler + real page + JS roundtrip,
# headless, no screenshot → GPU-free, zero proxy). A juggler-less binary renders
# a screenshot fine but is undrivable — only an actual drive catches that. The
# proxy realness gate (fppro/webrtc) stays LOCAL — it needs secrets.
#
# Trigger: push a tag `firefox-N`, or run manually. Hybrid runners, all free.
# ─────────────────────────────────────────────────────────────────────────────
name: release
on:
push:
tags: ['firefox-*']
workflow_dispatch:
inputs:
source_ref:
description: 'invisible_firefox ref to build'
default: 'stealth/150'
release_tag:
description: 'release tag to publish the draft under (e.g. firefox-9)'
required: true
env:
SOURCE_REPO: feder-cr/invisible_firefox
SOURCE_REF: ${{ github.event.inputs.source_ref || 'stealth/150' }}
jobs:
build:
name: build-${{ matrix.leg }}
runs-on: ${{ matrix.runner }}
timeout-minutes: 350
strategy:
fail-fast: false
matrix:
include:
- leg: linux-x86_64
runner: ubuntu-24.04
family: linux
target: ''
rust_target: x86_64-unknown-linux-gnu
win_disables: 'no'
extra_pkgs: ''
asset: firefox-150.0.1-stealth-linux-x86_64.tar.gz
- leg: linux-arm64
runner: ubuntu-24.04-arm
family: linux
target: ''
rust_target: aarch64-unknown-linux-gnu
win_disables: 'no'
extra_pkgs: ''
asset: firefox-150.0.1-stealth-linux-arm64.tar.gz
- leg: win-x86_64
runner: ubuntu-24.04
family: win
target: x86_64-pc-windows-msvc
rust_target: x86_64-pc-windows-msvc
win_disables: 'yes'
extra_pkgs: 'msitools p7zip-full zip'
asset: firefox-150.0.1-stealth-win-x86_64.zip
- leg: macos-arm64
runner: macos-15
family: mac
target: aarch64-apple-darwin
rust_target: aarch64-apple-darwin
win_disables: 'no'
extra_pkgs: ''
asset: firefox-150.0.1-stealth-macos-arm64.tar.gz
- leg: macos-x86_64
runner: macos-15-intel
family: mac
target: x86_64-apple-darwin
rust_target: x86_64-apple-darwin
win_disables: 'no'
extra_pkgs: ''
asset: firefox-150.0.1-stealth-macos-x86_64.tar.gz
steps:
- name: Free disk + 16G swap (Linux runners)
if: matrix.family != 'mac'
run: |
sudo rm -rf /usr/share/dotnet /opt/ghc /usr/local/lib/android \
/usr/local/share/boost "${AGENT_TOOLSDIRECTORY:-/opt/hostedtoolcache}" 2>/dev/null || true
sudo fallocate -l 16G /swapfile && sudo chmod 600 /swapfile && sudo mkswap /swapfile && sudo swapon /swapfile || true
- name: Checkout patched Firefox source
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
repository: ${{ env.SOURCE_REPO }}
ref: ${{ env.SOURCE_REF }}
fetch-depth: 1
# Record which invisible_firefox commit this build came from. The publish
# job turns the range previous-release..this commit into the release notes
# (scripts/gen_release_notes.py), and re-publishes it as a source-commit.txt
# asset so the NEXT release knows where to start the changelog. One leg is
# enough — all legs check out the same SOURCE_REF.
- name: Record source commit (for auto release notes)
if: matrix.leg == 'linux-x86_64'
shell: bash
run: git rev-parse HEAD > source-commit.txt && cat source-commit.txt
- name: Upload source-commit artifact
if: matrix.leg == 'linux-x86_64'
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
with:
name: source-commit
path: source-commit.txt
if-no-files-found: error
retention-days: 7
- name: Set up Python
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with: { python-version: '3.11' }
- name: Install Linux build tools
if: matrix.family != 'mac'
run: |
sudo apt-get update
sudo apt-get install -y util-linux binutils ${{ matrix.extra_pkgs }}
- name: Select Xcode 26.2 + export SDK path (macOS)
if: matrix.family == 'mac'
run: |
sudo xcode-select -s /Applications/Xcode_26.2.app
SDKP="$(xcrun --show-sdk-path)"
echo "SDK_PATH=$SDKP" >> "$GITHUB_ENV"
echo "macOS SDK $(xcrun --sdk macosx --show-sdk-version) at $SDKP"
- name: Add Rust target
run: rustup target add ${{ matrix.rust_target }} || true
- name: Extend the repo .mozconfig (NO mold; +target/SDK as needed)
run: |
test -f .mozconfig || { echo "ERROR: no .mozconfig in source"; exit 1; }
rm -f mozconfig
{
echo ""
echo "# --- release CI levers for ${{ matrix.leg }} (mold intentionally OFF — it segfaults libxul) ---"
echo "ac_add_options --disable-debug-symbols"
} >> .mozconfig
if [ -n "${{ matrix.target }}" ]; then echo "ac_add_options --target=${{ matrix.target }}" >> .mozconfig; fi
if [ "${{ matrix.family }}" = "mac" ]; then echo "ac_add_options --with-macos-sdk=$SDK_PATH" >> .mozconfig; fi
if [ "${{ matrix.win_disables }}" = "yes" ]; then
{ echo "ac_add_options --disable-default-browser-agent";
echo "ac_add_options --disable-maintenance-service";
echo "ac_add_options --disable-update-agent"; } >> .mozconfig
fi
if [ "${{ matrix.family }}" = "mac" ]; then NCPU=$(sysctl -n hw.ncpu); else NCPU=4; fi
{ echo "mk_add_options MOZ_PARALLEL_BUILD=$NCPU";
echo "mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-rel"; } >> .mozconfig
echo "----- final .mozconfig -----"; cat .mozconfig
- name: Build
run: ./mach build
# ── LINUX: dereference symlinks (issue #14) + strip + sanitize + tar@root + GATE
- name: Package + validate (Linux)
if: matrix.family == 'linux'
run: |
set -e
DIST=obj-rel/dist/bin
STAGING=staging
rm -rf "$STAGING"; mkdir -p "$STAGING" out
cp -aL "$DIST/." "$STAGING/" # -L: dereference ALL symlinks into real files
N=$(find "$STAGING" -type l | wc -l)
[ "$N" -eq 0 ] || { echo "ERROR: $N symlinks remain after cp -aL"; exit 1; }
for t in xpcshell certutil pk12util rapl; do rm -f "$STAGING/$t"; done
# JUGGLER GATE: the binary is undrivable by Playwright without it (see 70-known-bugs)
{ [ -e "$STAGING/chrome/juggler.manifest" ] && [ -d "$STAGING/chrome/juggler" ]; } \
|| { echo "ERROR: juggler missing from package (chrome/juggler) — Playwright can't drive it"; exit 1; }
echo "juggler GATE OK (loose chrome/juggler present)"
find "$STAGING" -type f \
\( -name '*.so' -o -name firefox -o -name firefox-bin -o -name plugin-container \
-o -name pingsender -o -name glxtest -o -name vaapitest -o -name updater \) \
-exec strip --strip-debug {} + 2>/dev/null || true
STAGING="$STAGING" python3 scripts/linux_sanitize.py || true # no-op in CI (no /home/feder), defensive
tar --owner=0 --group=0 --numeric-owner --mtime="2026-01-01 00:00:00 UTC" \
-czf "out/${{ matrix.asset }}" -C "$STAGING" . # firefox at ROOT
echo "=== HARD GATE: scripts/validate_release.py (the issue-#14 protector) ==="
python3 scripts/validate_release.py --linux "out/${{ matrix.asset }}" --linux-only
ls -la out/
# ── WINDOWS (cross): zip the CLEAN dist/firefox tree, firefox.exe at root
- name: Package (Windows cross)
if: matrix.family == 'win'
run: |
set -e
# Do NOT swallow a mach failure: `./mach package || echo` lets set -e pass
# and would fall through to a stale tree. A release MUST come from the clean
# dist/firefox; dist/bin is the dev tree (cruft + loose juggler that masked
# the firefox-7/8 packaging bugs), never acceptable for a release.
./mach package
[ -f obj-rel/dist/firefox/firefox.exe ] \
|| { echo "ERROR: mach package did not produce a clean dist/firefox tree"; exit 1; }
WIN_APP=obj-rel/dist/firefox
echo "packaging from: $WIN_APP"
# JUGGLER GATE: omni.ja must carry juggler (else Playwright can't drive it)
[ -f "$WIN_APP/omni.ja" ] || { echo "ERROR: no omni.ja in $WIN_APP"; exit 1; }
python3 -c "import zipfile,sys; sys.exit(0 if any('juggler' in n.lower() for n in zipfile.ZipFile('$WIN_APP/omni.ja').namelist()) else 1)" \
|| { echo "ERROR: juggler missing from $WIN_APP/omni.ja — Playwright can't drive it"; exit 1; }
echo "juggler GATE OK (win)"
mkdir -p out
( cd "$WIN_APP" && zip -qr "$GITHUB_WORKSPACE/out/${{ matrix.asset }}" . ) # firefox.exe at zip ROOT
ls -la out/
# ── macOS: package .app, ad-hoc sign, verify relative-internal symlinks, --version gate, tar
- name: Package + validate (macOS)
if: matrix.family == 'mac'
run: |
set -e
./mach package
APP="$(find obj-rel/dist -maxdepth 2 -name '*.app' -type d | head -1)"
[ -n "$APP" ] || { echo "ERROR: no .app produced"; exit 1; }
echo "built app: $APP"
# JUGGLER GATE: the .app's omni.ja must carry juggler (else Playwright can't drive it)
python3 -c "import zipfile,sys,glob; jas=glob.glob('$APP/Contents/Resources/omni.ja')+glob.glob('$APP/Contents/Resources/browser/omni.ja'); sys.exit(0 if jas and any(any('juggler' in n.lower() for n in zipfile.ZipFile(j).namelist()) for j in jas) else 1)" \
|| { echo "ERROR: juggler missing from .app omni.ja — Playwright can't drive it"; exit 1; }
echo "juggler GATE OK (mac)"
codesign --force --deep --sign - --timestamp=none "$APP"
codesign --verify --deep --strict --verbose=2 "$APP"
echo "=== --version GATE ==="
"$APP/Contents/MacOS/firefox" --version
echo "=== critical files present ==="
for need in "Contents/MacOS/firefox" "Contents/Info.plist"; do
[ -e "$APP/$need" ] || { echo "ERROR: missing $need"; exit 1; }
done
echo "=== Info.plist well-formed + required keys (a malformed plist → Finder 'damaged') ==="
plutil -lint "$APP/Contents/Info.plist"
for key in CFBundleExecutable CFBundleIdentifier CFBundleShortVersionString; do
plutil -extract "$key" raw -o - "$APP/Contents/Info.plist" >/dev/null \
|| { echo "ERROR: Info.plist missing $key"; exit 1; }
done
EXEC="$(plutil -extract CFBundleExecutable raw -o - "$APP/Contents/Info.plist")"
[ -e "$APP/Contents/MacOS/$EXEC" ] \
|| { echo "ERROR: CFBundleExecutable '$EXEC' has no matching binary in Contents/MacOS"; exit 1; }
echo "=== verify NO absolute symlinks in the .app (relative-internal ones are fine) ==="
BAD="$(find "$APP" -type l -print0 | xargs -0 -I{} sh -c 't=$(readlink "$1"); case "$t" in /*) echo "$1 -> $t";; esac' _ {})"
[ -z "$BAD" ] || { echo "ERROR: absolute symlinks in .app (break on user machines):"; echo "$BAD" | head -5; exit 1; }
echo "mac .app OK: critical files present, no absolute symlinks"
STABLE="$(dirname "$APP")/Firefox.app"
[ "$APP" = "$STABLE" ] || mv "$APP" "$STABLE"
mkdir -p out
tar -czf "out/${{ matrix.asset }}" -C "$(dirname "$STABLE")" Firefox.app # preserves internal symlinks
ls -la out/
- name: Upload build artifact
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
with:
name: asset-${{ matrix.leg }}
path: out/${{ matrix.asset }}
if-no-files-found: error
retention-days: 7
# DRIVE GATE — the firefox-8 catcher. A raw `firefox --screenshot` proves
# nothing about automation: a juggler-less binary renders fine and ships
# broken (firefox-8 did exactly that). So we DRIVE every binary the way users
# will: Playwright launches it over the juggler pipe, loads a real page, and
# round-trips JS. A binary missing/broken juggler throws TargetClosedError
# here and the release never publishes. Headless, NO screenshot → GPU-free,
# so it can't false-fail on the GPU-less hosted runners. Zero proxy / zero
# secrets → safe in public CI (the proxy realness gate stays local, by design).
# Each leg runs on its NATIVE runner so we test the real artifact, not a cross
# surrogate. Playwright is pinned to a version validated against this build's
# juggler; bump it in lockstep when the juggler is re-synced from upstream.
gate:
name: gate-${{ matrix.leg }}
needs: build
runs-on: ${{ matrix.runner }}
timeout-minutes: 25
strategy:
fail-fast: false
matrix:
include:
# `extra: --full` adds the mouse/keyboard/canvas/navsurface interaction
# checks. Only on linux-x86_64 (historically the most reliable hosted
# runner): the interaction code is platform-identical JS (omni.ja), so
# one reliable full run catches a firefox-2-class regression for all
# platforms. The other legs run SMOKE (launch+http+UA+webdriver) — the
# firefox-8/juggler catcher — which is robust even on the flaky
# windows-latest runner. See scripts/ci_drive_gate.py.
- leg: linux-x86_64
runner: ubuntu-24.04
kind: linux
asset: firefox-150.0.1-stealth-linux-x86_64.tar.gz
extra: '--full'
- leg: linux-arm64
runner: ubuntu-24.04-arm
kind: linux
asset: firefox-150.0.1-stealth-linux-arm64.tar.gz
extra: ''
- leg: win-x86_64
runner: windows-latest
kind: win
asset: firefox-150.0.1-stealth-win-x86_64.zip
extra: ''
- leg: macos-arm64
runner: macos-15
kind: mac
asset: firefox-150.0.1-stealth-macos-arm64.tar.gz
extra: ''
- leg: macos-x86_64
runner: macos-15-intel
kind: mac
asset: firefox-150.0.1-stealth-macos-x86_64.tar.gz
extra: ''
steps:
- name: Checkout wrapper (for scripts/ci_drive_gate.py)
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with: { fetch-depth: 1 }
- name: Download asset
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
name: asset-${{ matrix.leg }}
path: art
- name: Set up Python
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with: { python-version: '3.11' }
- name: Install Playwright driver (no bundled browser — we override executable_path)
# Pin from a SINGLE source (scripts/playwright_pin.txt) so release.yml and
# verify-assets.yml can't drift to different versions. The drive gate then
# ENFORCES playwright↔juggler compatibility: an incompatible pin fails the
# launch/drive (TargetClosedError / protocol error) and nothing publishes.
# Bump the pin file in lockstep when the juggler is re-synced from upstream.
shell: bash
run: python -m pip install --quiet "playwright==$(cat scripts/playwright_pin.txt)"
- name: Linux system deps for headless firefox
if: matrix.kind == 'linux'
run: sudo "$(which python)" -m playwright install-deps firefox
- name: Extract + locate firefox binary
shell: bash
run: |
set -e
mkdir -p ff
A="art/${{ matrix.asset }}"
case "${{ matrix.kind }}" in
win) python -c "import zipfile; zipfile.ZipFile('$A').extractall('ff')"; EXE="ff/firefox.exe";;
linux) tar xzf "$A" -C ff; EXE="ff/firefox";;
mac) tar xzf "$A" -C ff; EXE="ff/Firefox.app/Contents/MacOS/firefox";;
esac
[ -e "$EXE" ] || { echo "ERROR: firefox binary not found at $EXE"; exit 1; }
chmod +x "$EXE" 2>/dev/null || true
echo "FF_EXE=$EXE" >> "$GITHUB_ENV"
echo "located: $EXE"
- name: DRIVE GATE — Playwright launch via juggler + real page (+ interaction on --full)
shell: bash
run: python scripts/ci_drive_gate.py "$FF_EXE" ${{ matrix.extra }}
# CLOAK + WEBGL-MASKING GUARDS — run the wrapper's e2e cloak/gamma checks
# against THIS leg's freshly-built artifact, on its native runner. The
# wrapper's headless=True is headed+hidden (cloak on Win/macOS, its own
# Xvfb on Linux). Linux (Xvfb + llvmpipe) and Windows (WARP) give a
# software WebGL context on the GPU-less hosts, so the WebGL-dependent
# assertions run there. macOS GitHub runners expose NO WebGL in the CI
# session at all (even vanilla Firefox; macOS has no software-GL fallback),
# so on the mac legs the WebGL checks self-skip and the cloak is validated
# via its non-blank screenshot + CGWindowAlpha == 0. test_cloak asserts the
# window is hidden (Windows DWMWA_CLOAKED / macOS CGWindowAlpha) AND still
# renders — the macOS leg is the only place the cocoa cloak patch gets RUN.
# The webgl guard catches a regression of the gamma readPixels noise back to
# the pixelscan-maskable ±1 spike form (covered on Linux + Windows).
- name: Install pyobjc Quartz (macOS — to read the cloak window alpha)
if: matrix.kind == 'mac'
run: python -m pip install --quiet pyobjc-framework-Quartz
- name: Cloak + WebGL-masking guards (headed)
shell: bash
run: |
python -m pip install --quiet ".[dev]"
INVPW_BINARY_PATH="$FF_EXE" python -m pytest \
tests/test_cloak.py \
"tests/test_fingerprint_surface.py::test_webgl_readpixels_no_masking_signature" \
-m e2e -o addopts='' -q
publish:
name: publish-draft-release
needs: [build, gate]
runs-on: ubuntu-24.04
permissions:
contents: write
steps:
- name: Checkout wrapper (for scripts/gen_release_notes.py)
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with: { fetch-depth: 1 }
- name: Set up Python
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with: { python-version: '3.11' }
- name: Download all build assets
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with: { pattern: asset-*, path: dl, merge-multiple: true }
- name: Download source-commit metadata
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with: { name: source-commit, path: src-meta }
- name: Assert all 5 target archives present (no silent partial release)
run: |
cd dl
EXPECTED="
firefox-150.0.1-stealth-linux-x86_64.tar.gz
firefox-150.0.1-stealth-linux-arm64.tar.gz
firefox-150.0.1-stealth-win-x86_64.zip
firefox-150.0.1-stealth-macos-arm64.tar.gz
firefox-150.0.1-stealth-macos-x86_64.tar.gz
"
for a in $EXPECTED; do
[ -s "$a" ] || { echo "ERROR: missing/empty release asset: $a (a build leg silently dropped out?)"; exit 1; }
done
echo "all 5 target archives present"
- name: Generate checksums.txt
run: |
cd dl; ls -la
# explicit glob — never include checksums.txt itself (the `*`-includes-itself trap)
sha256sum firefox-150.0.1-stealth-* > checksums.txt
echo "----- checksums.txt -----"; cat checksums.txt
- name: Resolve release tag
id: tag
run: |
TAG="${{ github.event.inputs.release_tag }}"
[ -z "$TAG" ] && TAG="${GITHUB_REF_NAME}"
echo "tag=$TAG" >> "$GITHUB_OUTPUT"
# bare revision number for the release title: firefox-10 -> 10
N="${TAG#firefox-}"
echo "num=$N" >> "$GITHUB_OUTPUT"
# previous release tag, for the changelog range (firefox-10 -> firefox-9)
case "$N" in (*[!0-9]*|'') echo "prevtag=" >> "$GITHUB_OUTPUT";;
(*) echo "prevtag=firefox-$((N-1))" >> "$GITHUB_OUTPUT";; esac
echo "publishing DRAFT release for tag: $TAG"
- name: Build release notes from the source commits
id: notes
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
set -e
CUR="$(cat src-meta/source-commit.txt 2>/dev/null | tr -d '[:space:]')"
echo "this build's source commit: ${CUR:-<none>}"
# previous release's recorded source commit — gives the changelog range.
# Missing (first automated notes / firefox-0) -> notes omit the changelog.
PREV=""
PREVTAG="${{ steps.tag.outputs.prevtag }}"
if [ -n "$PREVTAG" ] && gh release download "$PREVTAG" -R "${{ github.repository }}" \
--pattern source-commit.txt --dir prev 2>/dev/null; then
PREV="$(cat prev/source-commit.txt | tr -d '[:space:]')"
echo "previous ($PREVTAG) source commit: $PREV"
else
echo "no previous source-commit.txt — changelog section omitted this time"
fi
python scripts/gen_release_notes.py --tag "${{ steps.tag.outputs.tag }}" \
--current "$CUR" --prev-sha "$PREV" --source-repo "${{ env.SOURCE_REPO }}" > body.md
echo "----- generated body.md -----"; cat body.md
# publish THIS build's source commit so the next release can diff from it
cp src-meta/source-commit.txt dl/source-commit.txt
- name: Create DRAFT release with all assets
uses: softprops/action-gh-release@3bb12739c298aeb8a4eeaf626c5b8d85266b0e65 # v2
with:
tag_name: ${{ steps.tag.outputs.tag }}
name: invisible_firefox (150.0.1) rev ${{ steps.tag.outputs.num }}
draft: true
prerelease: false
fail_on_unmatched_files: true
files: |
dl/*.tar.gz
dl/*.zip
dl/checksums.txt
dl/source-commit.txt
body_path: body.md
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

35
.github/workflows/tests.yml vendored Normal file

@ -0,0 +1,35 @@
name: tests
on:
push:
branches: [main]
pull_request:
branches: [main]
workflow_dispatch:
jobs:
unit:
name: pytest (${{ matrix.os }}, py${{ matrix.python }})
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-latest]
python: ["3.11", "3.12"]
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python }}
cache: pip
- name: Install package + dev extras
run: |
python -m pip install --upgrade pip
pip install -e ".[dev]"
- name: Run pytest
run: pytest tests/ -v --tb=short

111
.github/workflows/verify-assets.yml vendored Normal file

@ -0,0 +1,111 @@
# ─────────────────────────────────────────────────────────────────────────────
# verify-assets.yml — re-runnable DRIVE GATE for an EXISTING release's assets.
#
# release.yml drive-gates every binary it builds. This does the same drive test
# WITHOUT rebuilding: it downloads a release's already-published assets (works on
# DRAFT releases too via GITHUB_TOKEN) and drives each one on its native runner.
#
# Use it to:
# • drive-test a release that was built before the in-pipeline gate existed
# (e.g. firefox-9, built on the old release.yml), or
# • re-verify any shipped release on demand (regression check).
#
# Same single-source-of-truth drive logic as release.yml: scripts/ci_drive_gate.py.
# Headless, no screenshot → GPU-free. Zero proxy / zero secrets.
# ─────────────────────────────────────────────────────────────────────────────
name: verify-assets
on:
workflow_dispatch:
inputs:
release_tag:
description: 'release tag whose assets to drive-test (e.g. firefox-9)'
required: true
permissions:
# write (not read) is required: GitHub only exposes DRAFT releases to tokens
# with push access. With contents:read, `gh release download` on a draft tag
# 404s ("release not found"). This workflow only READS assets — the elevated
# scope is solely to make draft releases visible to GITHUB_TOKEN.
contents: write
jobs:
drive:
name: drive-${{ matrix.leg }}
runs-on: ${{ matrix.runner }}
timeout-minutes: 25
strategy:
fail-fast: false
matrix:
include:
# --full (interaction) only on the reliable linux-x86_64 leg; others run
# the robust SMOKE drive. Same rationale as release.yml's gate.
- leg: linux-x86_64
runner: ubuntu-24.04
kind: linux
asset: firefox-150.0.1-stealth-linux-x86_64.tar.gz
extra: '--full'
- leg: linux-arm64
runner: ubuntu-24.04-arm
kind: linux
asset: firefox-150.0.1-stealth-linux-arm64.tar.gz
extra: ''
- leg: win-x86_64
runner: windows-latest
kind: win
asset: firefox-150.0.1-stealth-win-x86_64.zip
extra: ''
- leg: macos-arm64
runner: macos-15
kind: mac
asset: firefox-150.0.1-stealth-macos-arm64.tar.gz
extra: ''
- leg: macos-x86_64
runner: macos-15-intel
kind: mac
asset: firefox-150.0.1-stealth-macos-x86_64.tar.gz
extra: ''
steps:
- name: Checkout wrapper (for scripts/ci_drive_gate.py)
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with: { fetch-depth: 1 }
- name: Download the release asset (draft releases included)
shell: bash
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
set -e
mkdir -p art
gh release download "${{ github.event.inputs.release_tag }}" \
--repo "${{ github.repository }}" \
--pattern "${{ matrix.asset }}" \
--dir art
ls -la art/
- name: Set up Python
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with: { python-version: '3.11' }
- name: Install Playwright driver (no bundled browser — we override executable_path)
# Single-source pin (see release.yml); the drive gate enforces juggler compat.
shell: bash
run: python -m pip install --quiet "playwright==$(cat scripts/playwright_pin.txt)"
- name: Linux system deps for headless firefox
if: matrix.kind == 'linux'
run: sudo "$(which python)" -m playwright install-deps firefox
- name: Extract + locate firefox binary
shell: bash
run: |
set -e
mkdir -p ff
A="art/${{ matrix.asset }}"
case "${{ matrix.kind }}" in
win) python -c "import zipfile; zipfile.ZipFile('$A').extractall('ff')"; EXE="ff/firefox.exe";;
linux) tar xzf "$A" -C ff; EXE="ff/firefox";;
mac) tar xzf "$A" -C ff; EXE="ff/Firefox.app/Contents/MacOS/firefox";;
esac
[ -e "$EXE" ] || { echo "ERROR: firefox binary not found at $EXE"; exit 1; }
chmod +x "$EXE" 2>/dev/null || true
echo "FF_EXE=$EXE" >> "$GITHUB_ENV"
echo "located: $EXE"
- name: DRIVE GATE — Playwright launch via juggler + real page (+ interaction on --full)
shell: bash
run: python scripts/ci_drive_gate.py "$FF_EXE" ${{ matrix.extra }}

103
.github/workflows/verify-cloak.yml vendored Normal file

@ -0,0 +1,103 @@
# ─────────────────────────────────────────────────────────────────────────────
# verify-cloak.yml — re-runnable CLOAK + WEBGL-MASKING GUARDS for an EXISTING
# build run's artifacts, WITHOUT rebuilding Firefox (~3h on the mac legs).
#
# release.yml runs these same guards in its `gate` job against each freshly-built
# artifact. This re-runs them against the artifacts of a PRIOR build run (input
# `run_id`) using the CURRENT wrapper code on the default branch — so a test-only
# fix (e.g. making the macOS leg tolerant of the runner's missing WebGL) can be
# validated against the real binaries in ~10 min instead of paying a full rebuild.
#
# Same guard command as release.yml's gate. Headed-but-cloaked; zero proxy / zero
# secrets. The macOS legs are the only place the cocoa cloak patch actually RUNS.
# ─────────────────────────────────────────────────────────────────────────────
name: verify-cloak
on:
workflow_dispatch:
inputs:
run_id:
description: 'build run id whose asset-* artifacts to re-gate (e.g. 27346856197)'
required: true
permissions:
contents: read
actions: read # download-artifact needs this to read another run's artifacts
jobs:
guard:
name: guard-${{ matrix.leg }}
runs-on: ${{ matrix.runner }}
timeout-minutes: 25
strategy:
fail-fast: false
matrix:
# Same legs/runners/assets as release.yml's gate matrix.
include:
- leg: linux-x86_64
runner: ubuntu-24.04
kind: linux
asset: firefox-150.0.1-stealth-linux-x86_64.tar.gz
- leg: linux-arm64
runner: ubuntu-24.04-arm
kind: linux
asset: firefox-150.0.1-stealth-linux-arm64.tar.gz
- leg: win-x86_64
runner: windows-latest
kind: win
asset: firefox-150.0.1-stealth-win-x86_64.zip
- leg: macos-arm64
runner: macos-15
kind: mac
asset: firefox-150.0.1-stealth-macos-arm64.tar.gz
- leg: macos-x86_64
runner: macos-15-intel
kind: mac
asset: firefox-150.0.1-stealth-macos-x86_64.tar.gz
steps:
- name: Checkout wrapper (current default branch — the FIXED tests)
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with: { fetch-depth: 1 }
- name: Download build asset from the prior run (no rebuild)
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
name: asset-${{ matrix.leg }}
path: art
run-id: ${{ github.event.inputs.run_id }}
github-token: ${{ secrets.GITHUB_TOKEN }}
- name: Set up Python
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with: { python-version: '3.11' }
- name: Install Playwright driver (no bundled browser — we override executable_path)
# Single-source pin (see release.yml); the wrapper enforces juggler compat.
shell: bash
run: python -m pip install --quiet "playwright==$(cat scripts/playwright_pin.txt)"
- name: Linux system deps for headless firefox
if: matrix.kind == 'linux'
run: sudo "$(which python)" -m playwright install-deps firefox
- name: Extract + locate firefox binary
shell: bash
run: |
set -e
mkdir -p ff
A="art/${{ matrix.asset }}"
case "${{ matrix.kind }}" in
win) python -c "import zipfile; zipfile.ZipFile('$A').extractall('ff')"; EXE="ff/firefox.exe";;
linux) tar xzf "$A" -C ff; EXE="ff/firefox";;
mac) tar xzf "$A" -C ff; EXE="ff/Firefox.app/Contents/MacOS/firefox";;
esac
[ -e "$EXE" ] || { echo "ERROR: firefox binary not found at $EXE"; exit 1; }
chmod +x "$EXE" 2>/dev/null || true
echo "FF_EXE=$EXE" >> "$GITHUB_ENV"
echo "located: $EXE"
- name: Install pyobjc Quartz (macOS — to read the cloak window alpha)
if: matrix.kind == 'mac'
run: python -m pip install --quiet pyobjc-framework-Quartz
- name: Cloak + WebGL-masking guards (headed)
shell: bash
run: |
python -m pip install --quiet ".[dev]"
INVPW_BINARY_PATH="$FF_EXE" python -m pytest \
tests/test_cloak.py \
"tests/test_fingerprint_surface.py::test_webgl_readpixels_no_masking_signature" \
-m e2e -o addopts='' -q

47
.github/workflows/webrtc-e2e.yml vendored Normal file

@ -0,0 +1,47 @@
name: webrtc-e2e
# Live WebRTC realness check against the shipped patched binary.
#
# Manual (workflow_dispatch) on purpose: it needs a firefox-N binary that
# carries the WebRTC fixes (synthetic srflx in genuine nICEr form + the
# default-route fallback behind a proxy). Run it after publishing such a
# binary — it is the release gate for "WebRTC looks real behind a proxy".
# Until that binary ships, test_not_blocked_behind_tcp_only_socks is EXPECTED
# to fail (the old binary is fully blocked behind a SOCKS proxy), which is the
# whole point of the gate.
#
# No smartproxy / credentials: the "behind a proxy" condition is faked by an
# in-process TCP-only SOCKS5 server (refuses UDP ASSOCIATE) and the egress IP
# is injected as an RFC 5737 TEST-NET address. Fully self-contained.
on:
workflow_dispatch:
jobs:
webrtc-e2e:
name: webrtc realness (ubuntu, py3.12)
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python 3.12
uses: actions/setup-python@v5
with:
python-version: "3.12"
cache: pip
- name: Install package + dev extras
run: |
python -m pip install --upgrade pip
pip install -e ".[dev]"
- name: Fetch the patched Firefox binary
run: python -m invisible_playwright fetch
- name: Resolve binary path
run: echo "STEALTHFOX_E2E_BINARY=$(python -m invisible_playwright path)" >> "$GITHUB_ENV"
- name: Run WebRTC realness e2e (xvfb for the headless Firefox)
run: |
sudo apt-get update && sudo apt-get install -y xvfb
xvfb-run -a pytest tests/test_webrtc_realness.py -m e2e -o addopts="" -v -rs

132
CHANGELOG.md Normal file

@ -0,0 +1,132 @@
# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]
### Added
- `timezone="auto"`: the browser timezone is derived automatically from the egress IP. By default (no explicit timezone) it ALWAYS resolves, from the proxy egress when a proxy is set and otherwise from the host's own public IP, so the zone cannot disagree with the IP (the classic `timezone_mismatch` signal). An explicit `"Area/City"` is the only way to force a specific zone. On lookup failure: with a proxy the launch raises (no silent host-TZ fallback behind a foreign proxy); without a proxy it falls back to the host TZ so a transient lookup failure can't break the launch.
- The egress IP is mapped to its IANA zone with an offline mmdb (`daijro/geoip-all-in-one`). It auto-updates against the upstream weekly rebuild: cached locally, re-checked after `GEOIP_REFRESH_DAYS` (7), older copies pruned, and a stale cache is reused when offline. `STEALTHFOX_GEOIP_MMDB` points at your own `.mmdb` to skip the download.
- `resolve_session_timezone(timezone, proxy)` and `ensure_geoip_mmdb()` re-exported at the package root (plus `GeoTimezoneError`) so integrations that own their launch can reproduce the resolution.
- `tests/test_geo.py` (37) + `tests/test_geoip_update.py` (freshness / auto-update / offline fallback) unit tests.
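
The timezone resolution policy in this section can be sketched as below. This is a minimal illustration under stated assumptions, not the package's implementation: `resolve_timezone_sketch`, `lookup_zone`, `host_zone` and `GeoLookupError` are hypothetical names introduced here, and the real `resolve_session_timezone(timezone, proxy)` may be structured differently.

```python
from typing import Callable, Optional


class GeoLookupError(RuntimeError):
    """Hypothetical stand-in for the package's GeoTimezoneError."""


def resolve_timezone_sketch(
    timezone: Optional[str],
    proxy: Optional[str],
    lookup_zone: Callable[[Optional[str]], str],
    host_zone: str,
) -> str:
    """Return the IANA zone a session should use (illustrative only)."""
    # An explicit "Area/City" always wins; nothing is looked up.
    if timezone not in (None, "auto"):
        return timezone
    try:
        # lookup_zone maps the egress IP to a zone: through the proxy when
        # one is set, otherwise from the host's own public IP.
        return lookup_zone(proxy)
    except GeoLookupError:
        if proxy is not None:
            # Behind a proxy, never fall back silently to the host zone.
            raise
        # Without a proxy, a transient failure must not break the launch.
        return host_zone
```

Passing the lookup in as a callable keeps the sketch free of network access, so each branch (explicit zone, proxy failure, host fallback) can be exercised in isolation.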
### Changed
- New runtime dependencies: `requests[socks]` (SOCKS egress lookup), `maxminddb` (mmdb reader), `tzdata` (IANA database for `zoneinfo`, which Windows lacks).
## [0.2.0] - 2026-05-28
### Added
- Public config helpers in `invisible_playwright.config`: `get_default_stealth_prefs(seed, *, pin, locale, timezone, extra_prefs, humanize, virtual_display)` returns a complete `firefox_user_prefs` dict; `get_default_args()` returns the baseline CLI args list (currently empty). Both also re-exported at the package root.
- `invisible_playwright.ensure_binary` re-exported at the package root for parity with the `cloakbrowser.download.ensure_binary` integration pattern that downstream projects (Skyvern, Crawlee, agno) already expect.
- These helpers let third-party fetchers (changedetection.io plugins, Crawlee `BrowserPool` subclasses, agno toolkits) drive `playwright.firefox.launch(executable_path=..., firefox_user_prefs=...)` themselves without depending on the `InvisiblePlaywright` context manager owning the lifecycle.
- `tests/unit/test_config_public.py`: 14 unit tests covering deterministic seed, locale / timezone / pin / extra_prefs / humanize variations, and round-trip via the public namespace.
### Unchanged
- `InvisiblePlaywright` context manager surface is identical (backwards compatible).
- `BINARY_VERSION` stays at `firefox-7`. Python-only release; no new Firefox build.
## [0.1.8] - 2026-05-23
### Fixed
- [#20](https://github.com/feder-cr/invisible_playwright/issues/20): cross-origin iframes were unreachable from Playwright. `element_handle.content_frame()` returned `None`, `frame.evaluate()` threw cross-origin SOP errors, and `frame_locator(...).click()` timed out even with `force=True`. Root cause: FF150 defaults `fission.webContentIsolationStrategy=1` (`IsolateEverything`), which site-isolates every cross-origin iframe into a separate `webIsolated` content process even when `fission.autostart=False`. The parent's Juggler FrameTree then has a Frame placeholder with no docShell and no URL — every protocol op that needs to enter the iframe fails. Fix: pin `fission.webContentIsolationStrategy=0` (`IsolateNothing`) in the baseline prefs. The setting can be flipped back per session via `extra_prefs={"fission.webContentIsolationStrategy": 1}`.
### Added
- `tests/test_cross_origin_iframe.py`: 4 unit + 5 e2e regression sentinels for cross-origin iframe interaction. The e2e layer runs entirely offline against two local HTTP servers on `127.0.0.1` (two ports = two SOP origins) and covers `page.frames` URL tracking, `content_frame()`, `frame.evaluate()`, `frame_locator(...).locator(...)`, and end-to-end `dispatch_event("click")` for plain, sandboxed and titled iframes. A future FF upgrade or fingerprint A/B that flips the pref back to `1` will fail the suite before shipping.
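
The "two ports = two SOP origins" setup used by the e2e layer can be sketched with the standard library alone. This is an assumption-laden illustration rather than the code in `tests/test_cross_origin_iframe.py`: `start_local_server` and `origin` are hypothetical helper names, and the real fixtures may serve different content.

```python
import http.server
import threading
from urllib.parse import urlsplit


def start_local_server() -> tuple[http.server.ThreadingHTTPServer, str]:
    """Start an HTTP server on 127.0.0.1 with an OS-assigned free port."""
    server = http.server.ThreadingHTTPServer(
        ("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler
    )
    # Serve in a daemon thread so the caller stays free to drive a browser.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_address[1]}"


def origin(url: str) -> tuple[str, str, int]:
    """Return the (scheme, host, port) triple the same-origin policy compares."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)
```

Two servers started this way share a scheme and host but differ in port, so a page on one that embeds an iframe from the other is cross-origin without any network access. Call `server.shutdown()` and `server.server_close()` on each when done.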
### Unchanged
- `BINARY_VERSION` stays at `firefox-7`. Python-only release; no new Firefox build was needed.
## [0.1.7] - 2026-05-21
### Fixed
- [#18](https://github.com/feder-cr/invisible_playwright/issues/18): tab crash when running with `headless=True` on Windows on pages that trigger cross-process navigation. Two separate bugs only manifested together. (1) The Chromium content sandbox at its default level 6 puts content processes on `kAlternateWinstation`, but the wrapper hides the browser window on its own alt-desktop (`CreateDesktop` for headless on Windows). With mismatched desktops, cross-process navigations could not reparent windows, so the content process exited cleanly and Playwright fired `page.on('crash')`. (2) The canvas2d `getImageData` stealth spoof wrote to a read-only mapped `DataSourceSurface`; on GPU-backed canvases that memory is write-protected, which caused a segfault during the final `getImageData` at page unload. The wrapper now sets `security.sandbox.content.level=4` in the alt-desktop workaround set, and `firefox-7` ships the source fix that moves the noise to the JS array's writable backing buffer.
### Changed
- `BINARY_VERSION` bumped from `firefox-5` to `firefox-7`. `firefox-6` was rolled back when its partial fix turned out to be wrong (the iframe-burst hypothesis was a dead end; bisection in the evening found the real two-bug cause documented above).
## [0.1.6] - 2026-05-21
### Added
- `profile_dir=` kwarg on `InvisiblePlaywright` (sync + async). When set, the session uses `firefox.launch_persistent_context()` so cookies, localStorage, sessionStorage, extensions, cache and prefs are kept on disk between runs. `__enter__` returns a `BrowserContext` directly: `with InvisiblePlaywright(profile_dir=p) as ctx: ctx.new_page()`. Pair with a stable `seed=` to also pin the fingerprint identity across runs. First run creates the dir; subsequent runs reuse it.
### Fixed
- `launch_persistent_context(timezone_id="…")` no longer times out at 180s. Root cause: `juggler/content/main.js` calls `docShell.overrideTimezone(...)` on every navigation; the patched Firefox up to firefox-4 didn't expose that IDL method on `nsIDocShell`, so the call threw `TypeError: docShell.overrideTimezone is not a function`. On the non-persistent path the error fired *after* launch and was harmless; on the persistent path it blocked the launch handshake. `firefox-5` ships the C++ method (see `patch.md` section 19); this release removes the firefox-4 era Python workaround that was filtering `locale`/`timezone_id` out of the persistent context kwargs.
### Changed
- `BINARY_VERSION` bumped from `firefox-4` to `firefox-5`. The Python source delta is JS/Python only; the new Firefox build adds 50 lines of C++ in `docshell/base/nsIDocShell.idl` + `nsDocShell.cpp`.
## [0.1.5] - 2026-05-20
### Fixed
- [#15](https://github.com/feder-cr/invisible_playwright/pull/15): `python -m invisible_playwright fetch` raised `RuntimeError: no SHA256 for firefox-150.0.1-stealth-linux-x86_64.tar.gz in checksums.txt` for every user because the parser kept the `*` binary-mode prefix that `sha256sum` writes in front of filenames. Now `.lstrip("*")` is applied to the key. Reporter + patch: [@LostBoxArt](https://github.com/LostBoxArt). Unrelated to the `firefox-N` binary; existing caches still work, only first-time fetches were broken.
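
The parsing fix can be illustrated with a small self-contained sketch. `parse_checksums` is a hypothetical name used here for illustration; the package's actual parser may differ in shape, but the key step is the same `.lstrip("*")` on the filename.

```python
def parse_checksums(text: str) -> dict[str, str]:
    """Map filename -> hex digest from sha256sum-style output (sketch)."""
    table: dict[str, str] = {}
    for line in text.splitlines():
        # Each valid line is "<digest> <name>"; split once on whitespace.
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, name = parts
        # sha256sum prefixes the filename with "*" in binary mode;
        # drop it so lookups by the plain filename succeed.
        table[name.strip().lstrip("*")] = digest.lower()
    return table
```

With the marker stripped, a line such as `<digest> *firefox-150.0.1-stealth-linux-x86_64.tar.gz` is keyed by the plain archive name, which is what the fetch step looks up.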
## [0.1.4] - 2026-05-20
### Fixed
- [#13](https://github.com/feder-cr/invisible_playwright/issues/13): every page that threw an uncaught JS error (e.g. bunny.net) crashed the Playwright client with `TypeError: Cannot read properties of undefined (reading 'url')`. Root cause: upstream Playwright Juggler added a required `location` field to the `Page.uncaughtError` event in the 2026-05-07 roll ([microsoft/playwright@c8604ec](https://github.com/microsoft/playwright/commit/c8604ecd97)); our fork was carrying the pre-roll schema in every `firefox-N` build. Fix matches upstream — Runtime.js builds the `errorLocation`, PageAgent.js forwards it on both worker and runtime error paths, Protocol.js declares the schema field. Reporter: [@dionorgua](https://github.com/dionorgua).
### Changed
- `BINARY_VERSION` bumped from `firefox-3` to `firefox-4`. JS-only change inside `chrome/juggler/`; `xul.dll` and `firefox.exe` are byte-identical to `firefox-3`.
## [0.1.3] - 2026-05-19
### Changed
- `BINARY_VERSION` bumped from `firefox-2` to `firefox-3`. The new archives on both Windows and Linux are built from a clean clone of [feder-cr/invisible_firefox#stealth/150](https://github.com/feder-cr/invisible_firefox/tree/stealth/150) — the consolidated source-of-truth fork (renamed from `feder-cr/firefox`; the companion `feder-cr/firefox-stealth` patches repo was deleted, all patches now live as commits on top of `mozilla-firefox/firefox`).
- The patched Firefox archive now ships the **proper C++ implementation** of `windowUtils.jugglerSendMouseEvent`, replacing the JS shim from 0.1.2.
### C++ fixes landed in this release
- **C1+C2**: `setDownloadInterceptor` IDL + cpp (re-landed for FF150).
- **C4**: 5 `nsIDocShell` stealth attributes (`fileInputInterceptionEnabled`, `overrideHasFocus`, `bypassCSPEnabled`, `forceActiveState`, `disallowBFCache`).
- **C5**: `LauncherProcessWin.cpp` + `nsWindowsWMain.cpp` juggler-pipe handle inheritance — without this, the Playwright pipe disconnects immediately on launch.
- **C6**: `juggler-navigation-started-renderer` / `-browser` observer notifications in `nsDocShell.cpp` and `CanonicalBrowsingContext.cpp` — without these, `Page.ready` never fires and `ctx.new_page()` hangs.
- **C7 (partial)**: storage stub for `nsIDocShell.languageOverride`. Workaround `InvisiblePlaywright(locale="")` recommended until full BC FIELD port lands.
### Verified
- Both archives built from same source: feder-cr/invisible_firefox commit `68906f1f9c55`.
- Windows + Linux smoke suite green: launch, `ctx.new_page()`, `page.mouse.{move,down,up,click,wheel}`, `navigator.webdriver=false`, sannysoft 32/33 PASS.
- SHA256 published in `checksums.txt` on the `firefox-3` release.
### Notes
- This is the first release with a native Linux build of the patched binary. An earlier draft of the `firefox-3` notes mentioned shipping the Linux firefox-2 archive byte-for-byte; that no longer applies, since Linux now has the full C++ patch series.
## [0.1.2] - 2026-05-18
### Changed
- `BINARY_VERSION` bumped from `firefox-1` to `firefox-2`. The patched Firefox archive on GitHub Releases now contains the JS fix from 0.1.1 (every `page.mouse.*` / `page.click()` / `locator.click()` / `mouse.wheel()` failure on the FF150 binary). Users on 0.1.1 must run `python -m invisible_playwright clear-cache && python -m invisible_playwright fetch` to pick up the new archive.
### Verified
- Archive integrity tests on both platforms: Windows zip extracted + booted via Playwright (`mouse.move + click + page.click(selector)` all succeed end-to-end), Linux tarball file-level checks (firefox/libxul.so sizes, byte-identity of patched JS files against Windows source). 21/21 assertions pass.
- SHA256 published in `checksums.txt` on the `firefox-2` release.
## [0.1.1] - 2026-05-18
### Fixed
- **Critical**: every `page.mouse.*`, `page.click(selector)`, `locator.click()`, `page.hover()`, `mouse.wheel()` failed on the patched Firefox 150 binary with `win.windowUtils.jugglerSendMouseEvent is not a function`. The Juggler JS was calling a Playwright-specific C++ method that never landed in the FF146→FF150 port; the calls now use the Mozilla chrome-scope `win.synthesizeMouseEvent` helper, which is present in FF150. Six call sites patched across `juggler/protocol/PageHandler.js` and `juggler/content/PageAgent.js`. Reporter: [@trob9](https://github.com/trob9) ([#9](https://github.com/feder-cr/invisible_playwright/issues/9)).
- `_linkedBrowser.scrollRectIntoViewIfNeeded()` is now guarded at both call sites in `PageHandler.js` (`dispatchMouseEvent` and `dispatchWheelEvent`) — the method is not present on the shipped FF150 `<browser>` element, so the unguarded call threw before the mouse event was dispatched.
### Added
- `tests/test_mouse.py`: 12-case regression suite covering every patched code path (mouse.move/click/dblclick/right-click, modifiers, locator.click/hover, wheel, manual mousedown+up, off-viewport move, humanize intermediate moves, scroll-and-click on offscreen element). Test cases inspired by `microsoft/playwright-python/tests/async/test_click.py`.
- Community standards: `CODE_OF_CONDUCT.md`, `CONTRIBUTING.md`, `SECURITY.md`, `.github/ISSUE_TEMPLATE/*`, `.github/PULL_REQUEST_TEMPLATE.md`.
### Notes
- The StealthFox humanize Bezier expansion continues to fire intermediate `mousemove` events; the swap to `synthesizeMouseEvent` does not change the human-trajectory behavior (verified by test).
- The reCAPTCHA v3 score (0.90) and FingerprintPro / CreepJS results documented in the README are unaffected — `synthesizeMouseEvent` is a legitimate Mozilla helper that does not increase the anti-detect surface.
- A binary refresh of the patched Firefox archive on GitHub Releases is required for users to receive this fix (the Juggler JS is shipped inside the archive). The `BINARY_VERSION` will be bumped to `firefox-2` in that release.
## [0.1.0] - 2026-05-13
### Added
- Initial public release.
- `InvisiblePlaywright` sync and async context managers — drop-in replacement for `playwright.sync_api.Browser` / `async_api.Browser`.
- StealthFox humanize hook: Bezier-curve mouse trajectories enabled by default.
- `_fpforge` Bayesian fingerprint sampler with ~400 fields per session.
- CLI: `invisible-playwright fetch | path | version | clear-cache`.
- Pinnable fingerprint fields via `pin={...}` (see `docs/pinning.md`).
- SOCKS5 / SOCKS4 / HTTP / HTTPS proxy support with auth.
- Linux x86_64 and Windows x86_64 binary support.
[Unreleased]: https://github.com/feder-cr/invisible_playwright/compare/v0.1.1...HEAD
[0.1.1]: https://github.com/feder-cr/invisible_playwright/compare/v0.1.0...v0.1.1
[0.1.0]: https://github.com/feder-cr/invisible_playwright/releases/tag/v0.1.0

33
CODE_OF_CONDUCT.md Normal file

@ -0,0 +1,33 @@
# Code of Conduct
This project follows the [Contributor Covenant, v2.1](https://www.contributor-covenant.org/version/2/1/code_of_conduct/).
## Our Pledge
We pledge to make participation in our community a harassment-free experience for everyone.
## Standards
Examples of behavior that contributes to a positive environment:
- Using welcoming and inclusive language
- Being respectful of differing viewpoints and experiences
- Gracefully accepting constructive criticism
- Focusing on what is best for the community
Examples of unacceptable behavior:
- The use of sexualized language or imagery
- Trolling, insulting or derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information without explicit permission
## Enforcement
Instances of unacceptable behavior may be reported by contacting the maintainer at **federico.elia.majo@gmail.com**. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances.
The maintainer is obligated to maintain confidentiality with regard to the reporter of an incident.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 2.1, available at https://www.contributor-covenant.org/version/2/1/code_of_conduct.html.

79
CONTRIBUTING.md Normal file

@ -0,0 +1,79 @@
# Contributing to invisible_playwright
Thanks for your interest in improving this project. Contributions are welcome via issues and pull requests.
## Quick links
- **Bug?** Open a [bug report](https://github.com/feder-cr/invisible_playwright/issues/new?template=bug_report.yml).
- **Idea?** Open a [feature request](https://github.com/feder-cr/invisible_playwright/issues/new?template=feature_request.yml).
- **Security issue?** Do **not** open a public issue — see [SECURITY.md](SECURITY.md).
- **The C++ patches** live in the companion repo [feder-cr/invisible_firefox](https://github.com/feder-cr/invisible_firefox) (branch `stealth/150`). Bugs in fingerprint spoofing usually belong there.
## Scope
This repository ships the **Python wrapper** (`invisible_playwright`) around a pre-built patched Firefox. In scope:
- The `InvisiblePlaywright` sync/async API and launcher
- The fingerprint sampler (`_fpforge`)
- Binary download/caching, CLI, proxy plumbing
- Tests, docs, examples, packaging
Out of scope (belongs in `invisible_firefox`):
- Changes to the Firefox C++ source
- New preferences exposed by the patched binary
- Canvas / WebGL / WebRTC / font spoofing logic
## Development setup
```bash
git clone https://github.com/feder-cr/invisible_playwright.git
cd invisible_playwright
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e ".[dev]"
python -m invisible_playwright fetch # download the patched Firefox binary
```
Requires Python 3.11+ and one of: Windows x86_64, Linux x86_64.
## Running tests
```bash
pytest # unit + integration (default — fast)
pytest -m e2e # end-to-end, requires the patched binary
pytest -m slow # wheel-build regression tests
```
Markers are defined in `pyproject.toml`. The default run excludes `slow` and `e2e`.
## Pull requests
1. Fork and create a topic branch (`fix/...`, `feat/...`, `docs/...`).
2. Keep PRs focused — one logical change per PR.
3. Add or update tests for any behavior change.
4. Make sure the default `pytest` run is green.
5. Follow [Conventional Commits](https://www.conventionalcommits.org/) for commit messages (e.g. `fix(launcher): handle missing profile dir`).
6. Update `README.md` or `docs/` when changing user-visible behavior.
7. Open the PR against `main`, fill in the PR template, and link any related issue.
CI must be green before merge.
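
For contributors who want to check a commit header locally before pushing, here is a minimal sketch of a Conventional Commits header check. It is not part of this repository; the helper name `is_conventional` and the list of accepted types are assumptions based on the commonly used type set, and the project may accept others.

```python
import re

# Hypothetical type list: the commonly used Conventional Commits types.
_TYPES = (
    "build", "chore", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)

# type, optional (scope), optional "!" for breaking changes, then ": subject"
_HEADER = re.compile(
    r"^(?P<type>" + "|".join(_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<subject>\S.*)$"
)


def is_conventional(header: str) -> bool:
    """Return True if the first line of a commit message matches the pattern."""
    return _HEADER.match(header) is not None


print(is_conventional("fix(launcher): handle missing profile dir"))
```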
## Reporting bugs
Before opening, please:
- Search [existing issues](https://github.com/feder-cr/invisible_playwright/issues) — the bug may already be tracked.
- Reproduce on the **latest release** if possible.
- Confirm the issue is in the Python wrapper, not the patched Firefox itself. If a fingerprint is leaking or a detector flags the browser, open the issue at `feder-cr/invisible_firefox` instead.
Include:
- OS and version, Python version, `invisible_playwright` version (`python -m invisible_playwright version`)
- A minimal reproduction
- Expected vs actual behavior
- Relevant logs / stack traces
## License
By contributing, you agree that your contributions will be licensed under the MIT License (see [LICENSE](LICENSE)).

166
README.md

@ -1,81 +1,31 @@
# stealthfox # invisible_playwright
[![tests](https://github.com/feder-cr/invisible_playwright/actions/workflows/tests.yml/badge.svg)](https://github.com/feder-cr/invisible_playwright/actions/workflows/tests.yml)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/) [![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![Firefox 150.0.1](https://img.shields.io/badge/firefox-150.0.1-orange.svg)](https://www.mozilla.org/firefox/) [![Firefox 150.0.1](https://img.shields.io/badge/firefox-150.0.1-orange.svg)](https://www.mozilla.org/firefox/)
[![GitHub release](https://img.shields.io/github/v/release/feder-cr/stealthfox.svg)](https://github.com/feder-cr/stealthfox/releases) [![GitHub release](https://img.shields.io/github/v/release/feder-cr/invisible_playwright.svg)](https://github.com/feder-cr/invisible_playwright/releases)
[![GitHub stars](https://img.shields.io/github/stars/feder-cr/stealthfox.svg?style=social)](https://github.com/feder-cr/stealthfox/stargazers) [![GitHub stars](https://img.shields.io/github/stars/feder-cr/invisible_playwright.svg?style=social)](https://github.com/feder-cr/invisible_playwright/stargazers)
[![browser launches](https://img.shields.io/github/downloads/feder-cr/invisible_firefox/usage-counter/total?label=browser%20launches&color=blue)](https://github.com/feder-cr/invisible_firefox/releases/tag/usage-counter)
A patched Firefox **100% Playwright-compatible** that passes the hardest browser-fingerprint detectors in the wild. [![LinkedIn](https://img.shields.io/badge/LinkedIn-Federico%20Elia-0A66C2?logo=linkedin&logoColor=white)](https://it.linkedin.com/in/federico-elia-5199951b6)
**Stealth Firefox that passes every bot detection test. Drop-in Playwright replacement, fingerprint patched at the C++ level, not a JavaScript shim.**
## Results ![invisible_playwright - 5/5 detection suites passed](docs/screenshots/hero.gif)
These are the "best" outcomes observed across independent runs on residential proxies.
### Google reCAPTCHA v3 - **0.90 / 1.0**
Top-tier score. Google classifies the session as "very likely a human". Most anti-detect stacks plateau around 0.3-0.7.
![reCAPTCHA score 0.90](docs/screenshots/recaptcha_score.png)
### Fingerprint Pro - **bot: not detected, VPN: false, tampering: false, dev tools: not detected**
FingerprintJS Pro's full Smart Signals battery flips every flag to "Not detected". Browser correctly identified as Firefox 150 on Windows 10. Confidence score 0.9.
![FingerprintPro not detected](docs/screenshots/fingerprintpro.png)
### CreepJS - **0 lies**, fingerprint is internally coherent
No contradictions between headless hints, spoofed values, and real rendering output. That "0 lies" is what kills most anti-detect browsers: one inconsistency (e.g. Chrome UA + Firefox WebGL) and the trust score collapses.
![CreepJS 0 lies](docs/screenshots/creepjs.png)
### BrowserLeaks WebRTC - **no public IP leak**
WebRTC srflx address is the proxy egress IP; host candidates are private LAN. The real public IP never leaks via STUN, even on pages that configure their own ICE servers. Stock Firefox leaks the real local IP via WebRTC mDNS - stealthfox doesn't.
![WebRTC no leaks](docs/screenshots/webrtc.png)
### bot.sannysoft.com - **all checks pass**
Every row green: WebDriver not present, Chrome-only properties absent, plugin/mime/languages arrays coherent, permissions API correct, iframe/source window checks pass.
![Sannysoft all green](docs/screenshots/sannysoft.png)
---
## Why it's powerful ## Why it's powerful
**Most anti-detect browsers patch Chromium at the JavaScript level** - they override `navigator`, `WebGLRenderingContext.getParameter`, canvas APIs, and so on via injected scripts. This has two fatal problems:
**Most other anti-detect browsers patch Chromium at the JavaScript level** - they override `navigator`, `WebGLRenderingContext.getParameter`, canvas APIs, and so on via injected scripts. This has two fatal problems:
1. **JS patches are detectable.** Anti-bots enumerate native function `.toString()`, check descriptor configurability, compare property enumeration order, watch for prototype mutations. Every patch leaves a fingerprint of its own. CreepJS has an entire battery of "lies detectors" built around this. 1. **JS patches are detectable.** Anti-bots enumerate native function `.toString()`, check descriptor configurability, compare property enumeration order, watch for prototype mutations. Every patch leaves a fingerprint of its own. CreepJS has an entire battery of "lies detectors" built around this.
2. **Chromium itself is now suspect.** Residential-proxy bot traffic is overwhelmingly Chromium-based, so detectors weight anything Chromium-shaped as risky by default. And the parts that matter (TLS stack, renderer process) are not fully open-source in Chrome proper - forks either inherit all Chromium tells or drift in visible ways. 2. **Chromium itself is now suspect.** Residential-proxy bot traffic is overwhelmingly Chromium-based, so detectors weight anything Chromium-shaped as risky by default. Chromium-based forks inherit Chrome's open-source layers (BoringSSL, Blink, V8, ANGLE) cleanly, but they still cannot fully match Chrome in practice: Chrome ships closed-source components on top (Widevine, proprietary codecs, Google Update / Safe Browsing endpoints) that flip detectable JS feature flags and network signals, and forks lag Chrome's release cadence by days to weeks, leaving telltale version-specific behaviours that detectors lock onto.
**stealthfox patches Firefox at the C++ level.** The spoofed values come back out through the normal Gecko paths - there is no JS shim, no override, no `Object.defineProperty`. **From the page's point of view, the browser is just telling the truth.** Anti-bot lie-detectors have nothing to latch onto. **invisible_playwright patches Firefox at the C++ level.** The spoofed values come back out through the normal Gecko paths - there is no JS shim, no override, no `Object.defineProperty`. **From the page's point of view, the browser is just telling the truth.** Anti-bot lie-detectors have nothing to latch onto.
stealthfox spoofs **all the layers that matter, together, coherently**: invisible_playwright spoofs **all the layers that matter, together, coherently**: Navigator, screen, GPU/WebGL, Canvas, fonts, audio, WebRTC, timezone, DevTools detection, SOCKS5 auth, and the rest. See [feder-cr/invisible_firefox](https://github.com/feder-cr/invisible_firefox) for the full per-layer breakdown of which C++ files are patched and why.
| Layer | What we do | Why it matters |
|-------|-----------|-----------------|
| Navigator / hardware | C++ overrides: UA, oscpu, languages, hardwareConcurrency, deviceMemory, storage quota | Self-description coherent across every API |
| Screen / window / pointer | C++ patch: screen WxH, outerSize bound, media-query device-size, pointer/hover/touch capabilities | `screen.*`, `window.outer*`, CSS `@media (pointer: fine)` all coherent |
| CSS system colors | 40 `ui.*` Win32 palette overrides | `getComputedStyle()` on system colors matches real Windows |
| GPU / WebGL | C++ patch: vendor, renderer, extensions whitelist, integer/float params, shader precisions, readPixels noise | Matches real Windows ANGLE down to enum values |
| Canvas 2D | C++ patch: per-pixel substitution + geometry skip-mask noise + TextMetrics variance | Defeats canvas hashing and text-metrics fingerprinting |
| Fonts / DirectWrite | C++ patch: family whitelist + fabricated authoritative list + per-family width scale + DWrite settings | Font enumeration matches real Win10; canvas text hash stable |
| Audio | C++ patch: sampleRate + output latency + max channels + AnalyserNode/DynamicsCompressor noise | AudioContext fingerprints bucket users very tightly |
| Speech synthesis | C++ patch: fabricated voices list | `navigator.speechSynthesis.getVoices()` matches the spoofed OS |
| WebRTC | C++ patch (nICEr): srflx address swap + synthetic srflx fallback + private-LAN host candidates | Real public IP never leaks via STUN |
| Timezone | C++ patch: per-Realm TZ via BrowsingContext (no IPC pref races) | `Date.getTimezoneOffset()`, `Intl.DateTimeFormat` match the spoofed location |
| DevTools detection | C++ patch: `Debugger.stealthMode` + Juggler `Runtime.js` + thread actor | FP Pro `developer_tools` = Not detected even with debugger attached |
| SOCKS5 auth | C++ patch | Stock Playwright+Firefox cannot negotiate it at all |
| DNS | Routed through SOCKS proxy by default | No DNS leak when using a residential gateway |
| Mouse motion | Bezier curves inside Juggler `PageHandler.js`, ~10 ms per waypoint | Even `page.click(selector)` moves like a human |
| GPU on virtual desktop | Pref-driven workaround for FF150 alt-desktop sandbox regression | WebGL renderer populated even in headless / multi-worker mass tests |
| Fission navigation | C++ patch: `nsDocShell` + `CanonicalBrowsingContext` Juggler navigation fix | `page.goto()` reliable on FF150 across proxy edge cases |
| about:newtab race | Async wrapper sleep around `new_page()` | No "Navigation interrupted by about:newtab" on FF150 |
| Proxy reliability | Juggler `PageHandler.equalsExceptRef` split try/catch | No spurious "Invalid url" with proxies like Evomi |
Everything is driven by preferences - no hardcoded values in the binary. You change one pref, you change the spoofed value. Everything is driven by preferences - no hardcoded values in the binary. You change one pref, you change the spoofed value.
@ -83,32 +33,32 @@ Everything is driven by preferences - no hardcoded values in the binary. You cha
## How it compares ## How it compares
Commercial anti-detect browsers (Multilogin, GoLogin, AdsPower, Kameleo, Dolphin Anty, Browserbase) ship a patched Chromium and override fingerprints at the JavaScript layer. That's the ceiling - and it's a low one. **CloakBrowser** ships a similar pitch for Chromium, but its binary is **closed source** (the source-level patches are not published, you only get the compiled output), and it still hits the Chromium reCAPTCHA ceiling. The commercial anti-detect browsers (**Multilogin**, **GoLogin**, AdsPower, Dolphin, Kameleo) are paid SaaS that overlay JS-layer spoofing on a patched Chromium. Managed profiles are nice but raw detection bypass sits below both Camoufox and us.
| | stealthfox | Multilogin / GoLogin | AdsPower / Dolphin | Browserbase | | | invisible_playwright | Camoufox | CloakBrowser | Multilogin |
|---|---|---|---|---| |---|---|---|---|---|
| Engine | Firefox (open source) | Chromium fork | Chromium fork | Chromium | | Engine | Firefox 150 | Firefox (~1 year old base) | Chromium | Chromium fork |
| Patch depth | C++ source | JS overrides | JS overrides | JS overrides | | Patch depth | C++ source | C++ source | C++ source | JS overrides |
| `.toString()` clean | ✅ Native Gecko path | ❌ Detectable shims | ❌ Detectable shims | ❌ Detectable shims | | Maintenance | Active | Gap (~1 year) | Active | Active SaaS |
| Canvas / WebGL | ✅ C++ level | ⚠️ JS override | ⚠️ JS override | ⚠️ JS override | | Open source | ✅ MIT | ✅ MPL | ❌ Closed source | ❌ Closed source |
| SOCKS5 auth | ✅ Patched | ⚠️ Varies | ⚠️ Varies | ❌ | | `.toString()` clean | ✅ | ✅ | ✅ | ❌ Detectable shims |
| Self-hosted | ✅ | ❌ SaaS | ❌ SaaS | ❌ Cloud | | Canvas / WebGL / Audio | ✅ C++ | ⚠️ Drift vs current FF | ✅ C++ | ⚠️ JS override |
| reCAPTCHA v3 score | **0.90** | ~0.3-0.6 | ~0.3-0.5 | ~0.3-0.5 | | SOCKS5 auth | ✅ Patched | ❌ | ⚠️ Playwright proxy | ⚠️ Varies |
| **reCAPTCHA v3 score** | **0.90** | ~0.3-0.5 | ~0.3-0.5 | ~0.3-0.6 |
| FP Pro - bot detected | ✅ Not detected | ❌ Detected | ❌ Detected | ❌ Detected | | FP Pro - bot detected | ✅ Not detected | ❌ Detected | ❌ Detected | ❌ Detected |
| FP Pro - tampering | ✅ Not detected | ❌ Detected | ❌ Detected | ❌ Detected | | CreepJS lies | ✅ 0 | ❌ Multiple | ✅ 0 | ❌ Multiple |
| FP Pro - VPN flag | ✅ false | ❌ true | ❌ true | ❌ true | | Cost | Free | Free | Free | From $99/mo |
| CreepJS lies | ✅ 0 | ❌ multiple | ❌ multiple | ❌ multiple |
--- ---
## Install ## Install
```bash ```bash
pip install stealthfox pip install git+https://github.com/feder-cr/invisible_playwright.git
python -m stealthfox fetch # one-time ~100 MB download, SHA256-verified python -m invisible_playwright fetch # one-time ~100 MB download, SHA256-verified
``` ```
Supported platforms: **Windows x86_64**, **Linux x86_64**. Supported platforms: **Windows x86_64**, **Linux x86_64 / arm64**, **macOS arm64 / x86_64**. On macOS the app is ad-hoc signed (not notarized): if Gatekeeper complains, clear the quarantine flag once with `xattr -dr com.apple.quarantine` on the cached `Firefox.app`.
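
The `fetch` step above is described as SHA256-verified. As a rough sketch of what such a check usually involves (this is not the wrapper's actual code; `sha256_of`, `verify_archive`, and the in-memory archive are assumptions made for illustration):

```python
import hashlib
import io


def sha256_of(stream, chunk_size=1 << 20):
    """Hash a binary file-like object in chunks and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_archive(stream, expected_sha256):
    """Raise ValueError if the stream's digest differs from the pinned one."""
    actual = sha256_of(stream)
    if actual != expected_sha256.lower():
        raise ValueError(f"SHA256 mismatch: expected {expected_sha256}, got {actual}")
    return actual


# Stand-in for a downloaded archive; a real check would open the file in "rb" mode.
fake_archive = io.BytesIO(b"abc")
pinned = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
verify_archive(fake_archive, pinned)
```

Reading in chunks keeps memory use flat even for an archive of roughly 100 MB.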
--- ---
@ -120,17 +70,17 @@ Supported platforms: **Windows x86_64**, **Linux x86_64**.
- from playwright.sync_api import sync_playwright - from playwright.sync_api import sync_playwright
- with sync_playwright() as p: - with sync_playwright() as p:
- browser = p.firefox.launch() - browser = p.firefox.launch()
+ from stealthfox import Stealthfox + from invisible_playwright import InvisiblePlaywright
+ with Stealthfox() as browser: + with InvisiblePlaywright() as browser:
``` ```
Every session gets a unique, coherent fingerprint drawn from real-world Firefox telemetry (GPU / audio / fonts / ~400 other fields) and Bezier-curve mouse motion baked into the browser itself. Every session gets a unique, coherent fingerprint drawn from real-world Firefox telemetry (GPU / audio / fonts / ~400 other fields) and Bezier-curve mouse motion baked into the browser itself.
**Sync** **Sync**
```python ```python
from stealthfox import Stealthfox from invisible_playwright import InvisiblePlaywright
with Stealthfox(proxy={"server": "socks5://...", "username": "u", "password": "p"}) as browser: with InvisiblePlaywright(proxy={"server": "socks5://...", "username": "u", "password": "p"}) as browser:
page = browser.new_page() page = browser.new_page()
page.goto("https://example.com") page.goto("https://example.com")
page.click("#submit") # mouse arcs to the button on a Bezier curve page.click("#submit") # mouse arcs to the button on a Bezier curve
@ -138,9 +88,9 @@ with Stealthfox(proxy={"server": "socks5://...", "username": "u", "password": "p
**Async** **Async**
```python ```python
from stealthfox.async_api import Stealthfox from invisible_playwright.async_api import InvisiblePlaywright
async with Stealthfox(proxy={"server": "socks5://...", "username": "u", "password": "p"}) as browser: async with InvisiblePlaywright(proxy={"server": "socks5://...", "username": "u", "password": "p"}) as browser:
page = await browser.new_page() page = await browser.new_page()
await page.goto("https://example.com") await page.goto("https://example.com")
await page.click("#submit") await page.click("#submit")
@ -153,9 +103,9 @@ The `browser` object is a `playwright.sync_api.Browser` / `playwright.async_api.
### Random fingerprint per session ### Random fingerprint per session
```python ```python
from stealthfox import Stealthfox from invisible_playwright import InvisiblePlaywright
with Stealthfox() as browser: with InvisiblePlaywright() as browser:
page = browser.new_page() page = browser.new_page()
page.goto("https://creepjs-api.web.app") page.goto("https://creepjs-api.web.app")
``` ```
@ -163,7 +113,7 @@ with Stealthfox() as browser:
Every call samples a new coherent profile. Log the seed to reproduce interesting runs: Every call samples a new coherent profile. Log the seed to reproduce interesting runs:
```python ```python
sf = Stealthfox() sf = InvisiblePlaywright()
with sf as browser: with sf as browser:
print("seed =", sf.seed) print("seed =", sf.seed)
# ... # ...
@ -172,7 +122,7 @@ with sf as browser:
### Reproducible fingerprint ### Reproducible fingerprint
```python ```python
with Stealthfox(seed=42) as browser: with InvisiblePlaywright(seed=42) as browser:
... # same GPU, same canvas hash, same audio context, every run ... # same GPU, same canvas hash, same audio context, every run
``` ```
@ -184,18 +134,33 @@ proxy = {
"username": "user", "username": "user",
"password": "pass", "password": "pass",
} }
with Stealthfox(proxy=proxy) as browser: with InvisiblePlaywright(proxy=proxy) as browser:
... ...
``` ```
Schemes supported: `socks5`, `socks4`, `http`, `https`. Auth works on all of them (SOCKS5 via patched `nsProtocolProxyService.cpp`, HTTP/HTTPS via Playwright). DNS is routed through the proxy by default, no local leak. Schemes supported: `socks5`, `socks4`, `http`, `https`. Auth works on all of them (SOCKS5 via patched `nsProtocolProxyService.cpp`, HTTP/HTTPS via Playwright). DNS is routed through the proxy by default, no local leak.
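
A proxy dictionary can be sanity-checked before it is handed to the launcher. The sketch below is a hypothetical pre-flight helper, not something the package is known to expose; `check_proxy` and the example hostname are assumptions, while the scheme list comes from the sentence above.

```python
from urllib.parse import urlsplit

# Schemes listed as supported in the README.
SUPPORTED_SCHEMES = {"socks5", "socks4", "http", "https"}


def check_proxy(proxy: dict) -> dict:
    """Validate the 'server' URL of a Playwright-style proxy dict."""
    parts = urlsplit(proxy.get("server", ""))
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("proxy server URL has no host")
    return proxy


proxy = check_proxy({
    "server": "socks5://gw.example.com:1080",  # placeholder host
    "username": "user",
    "password": "pass",
})
```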
### Timezone
The browser timezone follows `timezone=`:
```python
# default: timezone is auto-derived from the egress IP (proxy egress if a
# proxy is set, otherwise the host's own public IP)
with InvisiblePlaywright(proxy=proxy) as browser:
...
# explicit IANA zone always wins — the only way to force a specific zone
with InvisiblePlaywright(proxy=proxy, timezone="America/New_York") as browser:
...
```
### Pinning specific fingerprint fields ### Pinning specific fingerprint fields
By default everything comes from `seed`. To force specific values while the rest stays seed-derived: By default everything comes from `seed`. To force specific values while the rest stays seed-derived:
```python ```python
with Stealthfox( with InvisiblePlaywright(
seed=42, seed=42,
pin={ pin={
"gpu.renderer": "ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11)", "gpu.renderer": "ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11)",
@ -215,12 +180,23 @@ Full list of pinnable keys, how pinning interacts with the Bayesian sampler, and
## CLI ## CLI
```bash ```bash
stealthfox fetch # download the binary if missing invisible_playwright fetch # download the binary if missing
stealthfox path # print the absolute path to the cached binary invisible_playwright fetch --force # re-download even if cached
stealthfox version # wrapper and binary versions invisible_playwright path # print the absolute path to the cached binary
stealthfox clear-cache # remove all cached binaries invisible_playwright version # wrapper and binary versions
invisible_playwright clear-cache # remove all cached binaries
``` ```
## Related projects
invisible_playwright takes a different angle than the major Firefox-hardening projects but stands on their shoulders:
- **[arkenfox/user.js](https://github.com/arkenfox/user.js)** - the canonical Firefox configuration for privacy/security hardening via prefs. Reading arkenfox is how you understand which `user.js` knobs matter; invisible_playwright goes further by patching the C++ source where prefs alone are insufficient (Canvas noise, WebGL parameter overrides, font whitelisting, WebRTC IP swap, DevTools detection bypass).
- **[LibreWolf](https://librewolf.net)** - a Firefox fork bundled with sensible privacy defaults. Same audience, different distribution model: LibreWolf ships a configured Firefox binary, invisible_playwright ships source patches + a wrapper for automation.
- **[Camoufox](https://github.com/daijro/camoufox)** - the most well-known open-source anti-detect Firefox project. We share design goals on the fingerprint-spoofing side; the implementation approach differs (Camoufox patches a wider surface and ships its own fingerprint database, while invisible_playwright sticks closer to vanilla and drives spoofing from a Bayesian sampler).
---
## License ## License
MIT - see [LICENSE](LICENSE). The patched Firefox binary is distributed under the MPL-2.0 (Firefox upstream license). The C++ patches against mozilla-central that produce that binary are at [feder-cr/firefox-stealth](https://github.com/feder-cr/firefox-stealth). MIT - see [LICENSE](LICENSE). The patched Firefox binary is distributed under the MPL-2.0 (Firefox upstream license). The C++ patches against mozilla-central that produce that binary are at [feder-cr/invisible_firefox](https://github.com/feder-cr/invisible_firefox).

54
SECURITY.md Normal file

@ -0,0 +1,54 @@
# Security Policy
## Supported versions
Only the latest release on `main` receives security fixes.
| Version | Supported |
|---------|-----------|
| latest | ✅ |
| older | ❌ |
## Reporting a vulnerability
**Please do not report security issues via public GitHub issues, discussions, or pull requests.**
Use one of the following private channels:
1. **GitHub Private Vulnerability Reporting** (preferred): open an advisory at https://github.com/feder-cr/invisible_playwright/security/advisories/new
2. **Email**: `federico.elia.majo@gmail.com` with subject prefix `[security][invisible_playwright]`
Please include:
- A clear description of the issue and impact
- Steps to reproduce (minimal repro preferred)
- The version of `invisible_playwright` and OS where it was observed
- Whether you have a suggested fix
## What to expect
- Acknowledgement of your report within **7 days**
- An initial assessment and tracking issue (private) within **14 days**
- Coordinated disclosure: a fix and public advisory are released together; reporters are credited unless they prefer to remain anonymous
## Scope
In scope:
- The Python wrapper `invisible_playwright` (this repo)
- The binary download/verification flow (SHA256 pinning, fetch endpoints)
- The CLI
Out of scope here (report to the relevant project):
- Vulnerabilities in the patched Firefox C++ source — open a private report at [feder-cr/invisible_firefox](https://github.com/feder-cr/invisible_firefox/security/advisories/new)
- Vulnerabilities in upstream Firefox / mozilla-central — report to Mozilla per https://www.mozilla.org/security/
- Vulnerabilities in third-party dependencies (`playwright`, `requests`, etc.) — report to those projects directly
## Out of scope
- Reports that the browser is detected by a specific anti-bot service belong in a regular GitHub issue; this is a product-quality concern, not a security one
- Social engineering of maintainers
- Denial of service requiring physical access or local privileged access
Thank you for helping keep the project and its users safe.


@ -5,9 +5,9 @@ By default, every field of the fingerprint is sampled from a Bayesian network of
`pin` lets you **force specific fields** while letting the rest stay seed-derived. Useful when you need to replicate a known device (e.g. an NVIDIA 1080p laptop), test a specific GPU/screen combo, or pin just one noisy signal that a target site weighs heavily. `pin` lets you **force specific fields** while letting the rest stay seed-derived. Useful when you need to replicate a known device (e.g. an NVIDIA 1080p laptop), test a specific GPU/screen combo, or pin just one noisy signal that a target site weighs heavily.
```python ```python
from stealthfox import Stealthfox from invisible_playwright import InvisiblePlaywright
with Stealthfox( with InvisiblePlaywright(
seed=42, seed=42,
pin={ pin={
"gpu.renderer": "ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11)", "gpu.renderer": "ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11)",
@ -27,7 +27,7 @@ The generator is a Bayesian network: every field has a probability distribution
When you pin a field: When you pin a field:
1. The pinned value is written directly, bypassing the sampler. 1. The pinned value is written directly, bypassing the sampler.
2. **Unpinned children are still sampled from their conditionals** using the parent's original posterior, not the pinned value. 2. **Unpinned children are still sampled from their conditionals** - using the parent's original posterior, not the pinned value.
That last point is the subtle one: pinning breaks the conditional chain. If you pin `gpu.renderer` to an RTX 4090 string but leave `screen` unpinned, the sampler will pick `screen` from the seed-derived tier (which might be `low_end`), producing a physically implausible "RTX 4090 + 1366x768" pairing. That last point is the subtle one: pinning breaks the conditional chain. If you pin `gpu.renderer` to an RTX 4090 string but leave `screen` unpinned, the sampler will pick `screen` from the seed-derived tier (which might be `low_end`), producing a physically implausible "RTX 4090 + 1366x768" pairing.
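
The effect can be reproduced with a toy three-node network. This is a sketch under stated assumptions, not the `_fpforge` sampler: the tables, the function `sample_profile`, and the non-4090 renderer strings are invented for illustration, and the handling of a pinned `gpu.class_tier` reflects one reading of the rules in this document.

```python
import random

# Toy conditional tables (hypothetical values, far smaller than the real network).
TIERS = ["low_end", "mid_range", "high_end"]
RENDERERS = {
    "low_end": ["ANGLE (Intel, Intel(R) HD Graphics 4000 Direct3D11)"],
    "mid_range": ["ANGLE (NVIDIA, NVIDIA GeForce GTX 1650 Direct3D11)"],
    "high_end": ["ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11)"],
}
SCREENS = {
    "low_end": ["1366x768"],
    "mid_range": ["1920x1080"],
    "high_end": ["2560x1440", "3840x2160"],
}


def sample_profile(seed, pin=None):
    pin = pin or {}
    rng = random.Random(seed)
    # The root is drawn from the seed unless it is pinned itself.
    sampled_tier = rng.choice(TIERS)
    tier = pin.get("gpu.class_tier", sampled_tier)
    # Children are conditioned on the tier, never on a pinned sibling.
    profile = {
        "gpu.class_tier": tier,
        "gpu.renderer": rng.choice(RENDERERS[tier]),
        "screen": rng.choice(SCREENS[tier]),
    }
    # Pinned values are written directly, bypassing the sampler.
    profile.update(pin)
    return profile


# Find a seed whose sampled tier is low_end, then pin only the renderer.
seed = next(s for s in range(1000) if sample_profile(s)["gpu.class_tier"] == "low_end")
mismatched = sample_profile(
    seed, pin={"gpu.renderer": "ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11)"}
)
```

In this toy, `mismatched` carries the RTX 4090 renderer string while `screen` and `gpu.class_tier` still come from the seed-derived low-end tier, which is the implausible pairing described above.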
@ -35,20 +35,20 @@ That last point is the subtle one: pinning breaks the conditional chain. If you
## Full list of pinnable keys ## Full list of pinnable keys
Keys are dotted paths. All values are optional omitted keys fall back to the sampler. Keys are dotted paths. All values are optional - omitted keys fall back to the sampler.
### `gpu.*` ### `gpu.*`
| Key | Type | Example | Notes | | Key | Type | Example | Notes |
|-----|------|---------|-------| |-----|------|---------|-------|
| `gpu.class_tier` | str | `"high_end"` | The **root** of the Bayesian network. One of `"low_end"`, `"mid_range"`, `"high_end"`, `"integrated_old"`, `"integrated_modern"`. Pin this alone to steer the whole profile (screen, concurrency, MSAA, ) toward a coherent tier without having to name each sub-field. | | `gpu.class_tier` | str | `"high_end"` | The **root** of the Bayesian network. One of `"low_end"`, `"mid_range"`, `"high_end"`, `"integrated_old"`, `"integrated_modern"`. Pin this alone to steer the whole profile (screen, concurrency, MSAA, ...) toward a coherent tier without having to name each sub-field. |
| `gpu.vendor` | str | `"Google Inc. (NVIDIA)"` | Must exactly match the renderer vendor prefix, otherwise detectors catch the mismatch. | | `gpu.vendor` | str | `"Google Inc. (NVIDIA)"` | Must exactly match the renderer vendor prefix, otherwise detectors catch the mismatch. |
| `gpu.renderer` | str | `"ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11)"` | Windows ANGLE string. Used by WebGL `UNMASKED_RENDERER_WEBGL`. | | `gpu.renderer` | str | `"ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11)"` | Windows ANGLE string. Used by WebGL `UNMASKED_RENDERER_WEBGL`. |
**Why `class_tier` is pinnable separately from `renderer`.** They live at different levels of abstraction: **Why `class_tier` is pinnable separately from `renderer`.** They live at different levels of abstraction:
- `class_tier` is a **coarse handle** over the whole Bayesian graph. It gates the distribution of `screen`, `hardware.concurrency`, `webgl.msaa_samples`, and storage quota. Pin `{"gpu.class_tier": "low_end"}` and the sampler returns a *coherent* low-end machine — small screen, 2-4 cores, 4x MSAA — without you having to specify each field. - `class_tier` is a **coarse handle** over the whole Bayesian graph. It gates the distribution of `screen`, `hardware.concurrency`, `webgl.msaa_samples`, and storage quota. Pin `{"gpu.class_tier": "low_end"}` and the sampler returns a *coherent* low-end machine - small screen, 2-4 cores, 4x MSAA - without you having to specify each field.
- `renderer` is an **exact string** that lands verbatim in WebGL's `UNMASKED_RENDERER_WEBGL`. Useful when you want to imitate a specific GPU the target site has seen before. Does **not** condition other fields – if you pin `renderer` to an RTX 4090 but leave `class_tier` unpinned, `class_tier` is re-sampled from scratch and might disagree with the renderer string (see [How sampling + pinning interact](#how-sampling--pinning-interact)). - `renderer` is an **exact string** that lands verbatim in WebGL's `UNMASKED_RENDERER_WEBGL`. Useful when you want to imitate a specific GPU the target site has seen before. Does **not** condition other fields - if you pin `renderer` to an RTX 4090 but leave `class_tier` unpinned, `class_tier` is re-sampled from scratch and might disagree with the renderer string (see [How sampling + pinning interact](#how-sampling--pinning-interact)).
In practice most users should pin `class_tier` alone, or pin `renderer`+`vendor`+`class_tier` together if they want full control. In practice most users should pin `class_tier` alone, or pin `renderer`+`vendor`+`class_tier` together if they want full control.
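When pinning `renderer`, `vendor`, and `class_tier` together, a quick sanity check can catch a vendor/renderer mismatch before launch. The sketch below assumes the ANGLE string format shown in the table (`ANGLE (<vendor>, ...)`) and the `Google Inc. (<vendor>)` vendor format; `vendor_matches_renderer` is a hypothetical helper, not part of the package.

```python
import re


def vendor_matches_renderer(vendor: str, renderer: str) -> bool:
    """Hypothetical check: the vendor token must agree in both strings."""
    # "Google Inc. (NVIDIA)" -> "NVIDIA"
    v = re.search(r"\(([^)]+)\)", vendor)
    # "ANGLE (NVIDIA, NVIDIA GeForce ...)" -> "NVIDIA"
    r = re.match(r"ANGLE \(([^,]+),", renderer)
    return bool(v and r and v.group(1).strip() == r.group(1).strip())


# Pin dict using the keys and example values from the table above.
pin = {
    "gpu.class_tier": "high_end",
    "gpu.vendor": "Google Inc. (NVIDIA)",
    "gpu.renderer": "ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11)",
}

if not vendor_matches_renderer(pin["gpu.vendor"], pin["gpu.renderer"]):
    raise ValueError("gpu.vendor does not match the gpu.renderer prefix")
```

The resulting dict would then be passed as `pin=` alongside `seed=` when constructing the launcher.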
@ -82,7 +82,7 @@ In practice most users should pin `class_tier` alone, or pin `renderer`+`vendor`
| Key | Effect | | Key | Effect |
|-----|--------| |-----|--------|
| `codec.av1_enabled` | `true` → `canPlayType('video/av01')` returns `"probably"`. | | `codec.av1_enabled` | `true` -> `canPlayType('video/av01')` returns `"probably"`. |
| `codec.webm_encoder_enabled` | `MediaRecorder` advertises WebM support. | | `codec.webm_encoder_enabled` | `MediaRecorder` advertises WebM support. |
| `codec.mediasource_webm` | `MediaSource.isTypeSupported('video/webm')`. | | `codec.mediasource_webm` | `MediaSource.isTypeSupported('video/webm')`. |
| `codec.mediasource_mp4` | `MediaSource.isTypeSupported('video/mp4')`. | | `codec.mediasource_mp4` | `MediaSource.isTypeSupported('video/mp4')`. |
@ -98,17 +98,17 @@ In practice most users should pin `class_tier` alone, or pin `renderer`+`vendor`
| Key | Type | Example | Notes | | Key | Type | Example | Notes |
|-----|------|---------|-------| |-----|------|---------|-------|
| `fonts` | list[str] | `["Arial", "Segoe UI", ...]` | Complete font allowlist. **Every other font is hidden**. The sampler usually picks 14–24 system fonts. | | `fonts` | list[str] | `["Arial", "Segoe UI", ...]` | Complete font allowlist. **Every other font is hidden**. The sampler usually picks 14-24 system fonts. |
| `dark_theme` | bool | `False` | `prefers-color-scheme: dark`. Real traffic is ~85% light, 15% dark. | | `dark_theme` | bool | `False` | `prefers-color-scheme: dark`. Real traffic is ~85% light, 15% dark. |
## Reading the chosen values back ## Reading the chosen values back
Every sampled (or pinned) value lands in a `zoom.stealth.*` pref inside the browser. Open `about:config` in a launched stealthfox session and filter for `zoom.stealth` to see the exact values in effect. Every sampled (or pinned) value lands in a `zoom.stealth.*` pref inside the browser. Open `about:config` in a launched invisible_playwright session and filter for `zoom.stealth` to see the exact values in effect.
Alternatively, inspect the instance before the `with` block exits: Alternatively, inspect the instance before the `with` block exits:
```python ```python
sf = Stealthfox(seed=42) sf = InvisiblePlaywright(seed=42)
with sf as browser: with sf as browser:
# sf.seed is set; the full profile is in browser's prefs # sf.seed is set; the full profile is in browser's prefs
... ...
@ -118,7 +118,7 @@ with sf as browser:
### Mimic a specific real device ### Mimic a specific real device
Pin the whole visible tuple – GPU, screen, concurrency, fonts, audio: Pin the whole visible tuple - GPU, screen, concurrency, fonts, audio:
```python ```python
pin = { pin = {

BIN
docs/screenshots/hero.gif Normal file

Binary file not shown.

Size: 479 KiB

View file

@ -1,9 +1,9 @@
"""Launch a patched Firefox with a random stealth profile and load example.com.""" """Launch a patched Firefox with a random stealth profile and load example.com."""
from stealthfox import Stealthfox from invisible_playwright import InvisiblePlaywright
def main() -> None: def main() -> None:
with Stealthfox() as browser: with InvisiblePlaywright() as browser:
page = browser.new_page() page = browser.new_page()
page.goto("https://example.com") page.goto("https://example.com")
print("title:", page.title()) print("title:", page.title())

View file

@ -1,7 +1,7 @@
"""Same as basic.py but route through a SOCKS5 proxy.""" """Same as basic.py but route through a SOCKS5 proxy."""
import os import os
from stealthfox import Stealthfox from invisible_playwright import InvisiblePlaywright
def main() -> None: def main() -> None:
@ -14,7 +14,7 @@ def main() -> None:
proxy["username"] = user proxy["username"] = user
proxy["password"] = password proxy["password"] = password
with Stealthfox(proxy=proxy) as browser: with InvisiblePlaywright(proxy=proxy) as browser:
page = browser.new_page() page = browser.new_page()
page.goto("https://httpbin.org/ip") page.goto("https://httpbin.org/ip")
print(page.content()[:500]) print(page.content()[:500])

View file

@ -3,8 +3,8 @@ requires = ["hatchling"]
build-backend = "hatchling.build" build-backend = "hatchling.build"
[project] [project]
name = "stealthfox" name = "invisible-playwright"
version = "0.1.0" version = "0.2.0"
description = "Playwright wrapper for a patched Firefox with deterministic stealth profile." description = "Playwright wrapper for a patched Firefox with deterministic stealth profile."
readme = "README.md" readme = "README.md"
requires-python = ">=3.11" requires-python = ">=3.11"
@ -22,24 +22,41 @@ classifiers = [
dependencies = [ dependencies = [
"playwright>=1.40", "playwright>=1.40",
"platformdirs>=4", "platformdirs>=4",
"requests>=2.31", "requests[socks]>=2.31",
"maxminddb>=2.2",
"tzdata>=2024.1",
"tqdm>=4.66", "tqdm>=4.66",
"pywin32>=306; sys_platform == 'win32'", "pywin32>=306; sys_platform == 'win32'",
] ]
[project.optional-dependencies] [project.optional-dependencies]
dev = ["pytest>=7", "pytest-mock>=3", "responses>=0.24"] dev = ["pytest>=7", "pytest-mock>=3", "responses>=0.24", "build>=1", "pytest-rerunfailures>=14", "playwright>=1.40"]
[tool.pytest.ini_options]
markers = [
"unit: pure-logic tests, no I/O or external deps",
"integration: multi-module tests, no browser",
"e2e: requires patched Firefox binary and display",
"slow: tests that build the wheel — opt-in only",
"linux_only: tests that require Linux platform",
]
addopts = "-m 'not slow and not e2e'"
# tests/playwright-upstream/ is a vendored Microsoft Playwright test suite
# used for compatibility verification on demand. It has its own deps
# (pixelmatch with API not matching our version) and a conftest that fails
# collection in our env. Run it explicitly with --override-ini for compat
# audits, not on every push.
norecursedirs = ["playwright-upstream"]
[project.scripts] [project.scripts]
stealthfox = "stealthfox.cli:main" invisible-playwright = "invisible_playwright.cli:main"
[project.urls] [project.urls]
Homepage = "https://github.com/feder-cr/stealthfox" Homepage = "https://github.com/feder-cr/invisible_playwright"
Issues = "https://github.com/feder-cr/stealthfox/issues" Issues = "https://github.com/feder-cr/invisible_playwright/issues"
[tool.hatch.build.targets.wheel] [tool.hatch.build.targets.wheel]
packages = ["src/stealthfox"] packages = ["src/invisible_playwright"]
[tool.hatch.build.targets.wheel.force-include] [tool.hatch.build.targets.sdist]
"src/stealthfox/data" = "stealthfox/data" include = ["src/invisible_playwright", "tests", "README.md", "LICENSE", "pyproject.toml"]
"src/stealthfox/_fpforge/data" = "stealthfox/_fpforge/data"

View file

@ -13,7 +13,7 @@ import sys
OUT = os.path.join( OUT = os.path.join(
os.path.dirname(os.path.dirname(os.path.abspath(__file__))), os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
"src", "stealthfox", "_fpforge", "data", "src", "invisible_playwright", "_fpforge", "data",
) )

172
scripts/ci_drive_gate.py Normal file
View file

@ -0,0 +1,172 @@
#!/usr/bin/env python3
"""CI drive gate: the firefox-N catcher.

A raw `firefox --screenshot` proves nothing about automation: a juggler-less
binary renders a screenshot just fine and ships broken (firefox-8 did exactly
that). This DRIVES the binary the way users will: Playwright launches it over
the juggler pipe and exercises real paths.

Two levels (see `--full`):

  SMOKE (default; run on ALL 5 legs, on every binary's native runner):
    launch over juggler-pipe, navigate a real http://127.0.0.1 page, then assert
    a response, the Firefox UA, navigator.webdriver falsy, and a DOM read. This is
    the firefox-8 catcher (a juggler-less binary throws TargetClosedError on
    launch) plus a base stealth + drivability check. It is intentionally LIGHT:
    the free hosted runners (windows-latest especially) are content-process
    unstable under a heavy headless interaction sequence (clicks/moves cascade
    into "context destroyed" / selector-timeout / eval-CSP), so the gate that
    must be GREEN on every leg stays minimal and reliable.

  FULL (`--full`; run on the historically-reliable Linux leg):
    SMOKE plus mouse + keyboard input (firefox-2 / issue #9:
    jugglerSendMouseEvent/synthesizeMouseEvent), canvas determinism (stealth
    seed must be per-session), and navigator-surface tells. The interaction code
    is platform-identical JS (it lives in omni.ja), so exercising it on one
    reliable leg catches a regression for ALL platforms; win interaction is
    additionally covered by local pre-release testing.

NOT covered here: WebGL determinism (needs SWGL, false-fails headless) and the
faithful cross-origin iframe test (issue #20); both live in the local realness
gate. All checks here are headless, no screenshot (GPU-free), loopback-only
(no external network / proxy / secrets), so they are safe in public CI.

Robustness: a real loopback HTTP page (NOT data: / about:blank, which get
re-normalized / carry an eval-blocking CSP), arrow-function evaluates (never
eval'd), and up to 2 retries on transient context-destroyed/detached/timeout.
A genuinely broken binary fails ALL attempts, so the gate fails.

Usage: python ci_drive_gate.py <firefox-binary> [--full]
Exit 0 + "DRIVE GATE OK ..." on success; non-zero with a reason on failure.
"""
from __future__ import annotations

import http.server
import socketserver
import sys
import threading

HTML = (
    "<!doctype html><html><head><title>dt</title></head><body>"
    "<h1 id=x>hello-drive</h1>"
    "<button id=b>go</button>"
    "<input id=inp>"
    "<script>"
    "window.__clicked=0;window.__moves=0;"
    "document.getElementById('b').addEventListener('click',function(){window.__clicked=1;});"
    "window.addEventListener('mousemove',function(){window.__moves++;});"
    "</script>"
    "</body></html>"
).encode()

CANVAS_DRAW = (
    "() => {const c=document.createElement('canvas');c.width=c.height=16;"
    "const g=c.getContext('2d');g.fillStyle='#08f';g.fillRect(0,0,16,16);"
    "g.fillStyle='#f40';g.fillText('s',2,12);return c.toDataURL();}"
)

_TRANSIENT = ("context was destroyed", "frame was detached", "target closed",
              "because of a navigation", "timeout", "blocked by csp")


class _Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):  # noqa: N802
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(HTML)))
        self.end_headers()
        self.wfile.write(HTML)

    def log_message(self, *a):  # silence per-request stderr noise
        pass


def _start_server():
    srv = socketserver.TCPServer(("127.0.0.1", 0), _Handler)
    threading.Thread(target=srv.serve_forever, daemon=True).start()
    return srv, srv.server_address[1]


def _drive(exe: str, url: str, full: bool) -> str:
    """One full drive attempt. Returns the UA on success; raises on failure."""
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.firefox.launch(executable_path=exe, headless=True)
        try:
            page = browser.new_page()
            resp = page.goto(url, wait_until="load")
            assert resp and resp.ok, f"navigation to {url} failed: {resp.status if resp else 'no response'}"
            ua = page.evaluate("() => navigator.userAgent")
            webdriver = page.evaluate("() => navigator.webdriver")
            text = page.evaluate("() => document.getElementById('x').textContent")

            inter = {}
            if full:
                # firefox-2 / issue-#9 catcher: real mouse + keyboard over juggler.
                page.wait_for_selector("#b")
                page.mouse.move(20, 20)
                page.mouse.move(120, 90)  # synthesizeMouseEvent path
                page.click("#b")  # mousedown/up/click → listener fires
                page.click("#inp")
                page.keyboard.type("ok")
                inter["clicked"] = page.evaluate("() => window.__clicked")
                inter["moves"] = page.evaluate("() => window.__moves")
                inter["typed"] = page.evaluate("() => document.getElementById('inp').value")
                inter["canvas_a"] = page.evaluate(CANVAS_DRAW)
                inter["canvas_b"] = page.evaluate(CANVAS_DRAW)
                inter["langs"] = page.evaluate("() => navigator.languages.length")
                inter["plugins"] = page.evaluate("() => navigator.plugins instanceof PluginArray")
        finally:
            browser.close()

    # SMOKE asserts (always).
    assert "Firefox" in ua, f"unexpected UA (binary not driving correctly): {ua!r}"
    assert text == "hello-drive", f"DOM/JS roundtrip failed: {text!r}"
    assert not webdriver, f"navigator.webdriver leaked True (stealth regression): {webdriver!r}"
    if full:
        assert inter["clicked"] == 1, "page.click() did not fire the click listener: mouse-event synthesis broken (firefox-2 class)"
        assert inter["moves"] >= 1, "page.mouse.move() produced no mousemove: jugglerSendMouseEvent regression"
        assert inter["typed"] == "ok", f"page.keyboard.type() failed: {inter['typed']!r}"
        assert inter["canvas_a"] == inter["canvas_b"], "canvas non-deterministic across identical draws (stealth seed broken → bot tell)"
        assert inter["langs"] and inter["langs"] > 0, "navigator.languages empty (headless tell)"
        assert inter["plugins"], "navigator.plugins is not a PluginArray (headless tell)"
    return ua


def main(exe: str, full: bool) -> int:
    srv, port = _start_server()
    url = f"http://127.0.0.1:{port}/"
    level = "full" if full else "smoke"
    extras = "http+click+mousemove+keyboard+canvas-determinism+navsurface" if full else "http+ua+webdriver+dom"
    last = None
    try:
        for attempt in (1, 2, 3):
            try:
                ua = _drive(exe, url, full)
                if attempt > 1:
                    print(f"(note: drive succeeded on attempt {attempt} after a transient error)")
                print(f"DRIVE GATE OK [{level}] | UA={ua} | {extras}=ok")
                return 0
            except Exception as e:  # noqa: BLE001 (gate: any failure must surface)
                last = e
                msg = str(e).lower()
                if attempt < 3 and any(t in msg for t in _TRANSIENT):
                    print(f"(transient error on attempt {attempt}, retrying): {e}", file=sys.stderr)
                    continue
                break
    finally:
        srv.shutdown()
    print(f"DRIVE GATE FAILED [{level}]: {last}", file=sys.stderr)
    return 1


if __name__ == "__main__":
    args = sys.argv[1:]
    full = "--full" in args
    positional = [a for a in args if not a.startswith("--")]
    if len(positional) != 1:
        print("usage: ci_drive_gate.py <path-to-firefox-binary> [--full]", file=sys.stderr)
        sys.exit(2)
    sys.exit(main(positional[0], full))
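
The retry policy in `main` (retry only when the error text matches a known transient signature, fail immediately otherwise) can be isolated and exercised without a browser. This is a minimal sketch of that pattern, not the gate's actual code: `run_with_retries` and the shortened `TRANSIENT` tuple are illustrative names.

```python
from typing import Callable

# Shortened, illustrative list of transient error signatures (lowercase).
TRANSIENT = ("context was destroyed", "frame was detached", "timeout")


def run_with_retries(fn: Callable[[], str], attempts: int = 3) -> str:
    """Call fn; retry only on transient-looking errors, up to `attempts` tries."""
    last = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as e:  # any failure is recorded and classified below
            last = e
            transient = any(t in str(e).lower() for t in TRANSIENT)
            if attempt < attempts and transient:
                continue  # transient and tries remain: go again
            break  # non-transient, or out of tries
    raise RuntimeError(f"gave up: {last}") from last
```

A non-transient error (for example, a launch failure from a juggler-less binary) stops after the first attempt, which is what keeps a genuinely broken binary from being masked by retries.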

View file

@ -0,0 +1,114 @@
#!/usr/bin/env python3
"""Generate the GitHub release body for a firefox-N build from the actual
invisible_firefox commits that went into it.

The release tag (firefox-N) lives on the wrapper, but the binary's changes live
on the SOURCE repo (feder-cr/invisible_firefox). We never deep-clone that history
(it's a full Firefox fork); instead we use GitHub's compare API to list the
commits between the PREVIOUS release's source commit and this one, and turn their
subject lines into a short human-readable "What changed" list.

- The previous release's source commit comes from its ``source-commit.txt``
  asset (this script's own output uploads one for the next run to read).
- If there's no previous source commit (first automated release) or the compare
  fails, we fall back to a body WITHOUT the changelog section: publishing must
  never break on note generation.

This is NOT an LLM and NOT a raw ``git log`` dump: it filters out the
non-user-facing commits (docs/chore/ci/test/style/build) and prints the remaining
subjects as plain bullets. Quality rides on writing good commit subjects.

Usage:
    python scripts/gen_release_notes.py --tag firefox-10 --current <sha> \
        [--prev-sha <sha>] [--source-repo feder-cr/invisible_firefox]
    # reads GITHUB_TOKEN from the env for the compare API (optional for public).
"""
from __future__ import annotations

import argparse
import json
import os
import re
import sys
import urllib.request
import urllib.error

# Conventional-commit prefixes that never belong in user-facing release notes.
_SKIP = re.compile(r"^(docs|chore|ci|test|style|build)(\(|:)", re.I)


def _api(url: str, token: str | None) -> dict:
    headers = {"Accept": "application/vnd.github+json",
               "User-Agent": "invisible-playwright-release-notes"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=30) as r:
        return json.load(r)


def changelog_bullets(source_repo: str, prev_sha: str, current_sha: str,
                      token: str | None) -> list[str]:
    """Return the user-facing commit subjects in prev_sha..current_sha, or []."""
    if not prev_sha or not current_sha or prev_sha == current_sha:
        return []
    url = f"https://api.github.com/repos/{source_repo}/compare/{prev_sha}...{current_sha}"
    try:
        data = _api(url, token)
    except (urllib.error.URLError, urllib.error.HTTPError, ValueError) as e:
        print(f"[gen_release_notes] compare API failed ({e}); no changelog section",
              file=sys.stderr)
        return []
    bullets: list[str] = []
    for c in data.get("commits", []):
        # Guard against an empty message: "".splitlines() is [], so indexing
        # [0] directly would raise IndexError and break publishing.
        lines = (c.get("commit", {}).get("message") or "").splitlines()
        subject = lines[0].strip() if lines else ""
        if not subject or _SKIP.match(subject):
            continue
        bullets.append(subject.rstrip("."))
    return bullets


def build_body(tag: str, current_sha: str, bullets: list[str]) -> str:
    m = re.search(r"(\d+)", tag)
    n = int(m.group(1)) if m else None
    prev_label = f"firefox-{n - 1}" if n else "the previous build"
    short = (current_sha or "")[:8]

    parts = ["Patched Firefox 150.0.1, the stealth build invisible_playwright drives.", ""]
    if bullets:
        parts.append(f"What changed since {prev_label}:")
        parts += [f"- {b}" for b in bullets]
        parts.append("")
    parts += [
        "Builds: Linux x86_64, Linux arm64, Windows x86_64, macOS arm64, macOS x86_64.",
        "",
        "Most people won't grab these by hand. The wrapper fetches the right one for "
        "your platform on first run:",
        "",
        "    pip install git+https://github.com/feder-cr/invisible_playwright",
        "",
        "If you do download manually, `checksums.txt` has the SHA256s. The macOS builds "
        "are ad-hoc signed (not notarized), so clear the quarantine flag: "
        "`xattr -dr com.apple.quarantine Firefox.app`",
    ]
    if short:
        parts += ["", f"Built from invisible_firefox @{short}."]
    return "\n".join(parts)


def main() -> int:
    ap = argparse.ArgumentParser()
    ap.add_argument("--tag", required=True, help="release tag, e.g. firefox-10")
    ap.add_argument("--current", required=True, help="invisible_firefox SHA this build was built from")
    ap.add_argument("--prev-sha", default="", help="previous release's source SHA (omit for none)")
    ap.add_argument("--source-repo", default="feder-cr/invisible_firefox")
    args = ap.parse_args()

    token = os.environ.get("GITHUB_TOKEN") or os.environ.get("GH_TOKEN")
    bullets = changelog_bullets(args.source_repo, args.prev_sha, args.current, token)
    sys.stdout.write(build_body(args.tag, args.current, bullets))
    return 0


if __name__ == "__main__":
    sys.exit(main())
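
The filtering step in `changelog_bullets` can be tried offline, without the compare API. The sketch below reuses the `_SKIP` pattern from the script; `user_facing_subjects` is a hypothetical helper, and the sample commit messages are invented for illustration.

```python
import re

# Same pattern as the script: conventional-commit prefixes to drop.
_SKIP = re.compile(r"^(docs|chore|ci|test|style|build)(\(|:)", re.I)


def user_facing_subjects(messages):
    """Keep the first line of each message unless it has a skipped prefix."""
    subjects = []
    for message in messages:
        lines = (message or "").splitlines()
        subject = lines[0].strip() if lines else ""  # tolerate empty messages
        if subject and not _SKIP.match(subject):
            subjects.append(subject.rstrip("."))
    return subjects


# Invented sample messages, not real commits.
sample = [
    "fix(juggler): restore mouse event synthesis.\n\nLonger body text.",
    "docs: update README",
    "CI(release): bump runner image",
    "feat: per-session canvas seed",
    "",
]
print(user_facing_subjects(sample))
```

Only the `fix(...)` and `feat:` subjects survive; the match is case-insensitive, so `CI(release):` is dropped as well.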

View file

@ -0,0 +1 @@
1.55.0

67
scripts/run_e2e.py Normal file
View file

@ -0,0 +1,67 @@
#!/usr/bin/env python3
"""Run the FULL e2e suite (every test that opens the browser) against a binary.

The 127 ``@pytest.mark.e2e`` tests are excluded from the default `pytest` run
(`addopts = -m 'not slow and not e2e'`) because they need a real Firefox binary
and a display, and they skip themselves when no binary is available. That makes
them easy to forget, and "we can't afford for something to not work". This is
the gate that runs them all, deliberately, against a chosen binary.

It is the MANDATORY pre-release e2e gate: run it green against the freshly-built
release binary BEFORE un-drafting a firefox-N (alongside the fppro + WebRTC
realness gates). It is NOT in the public CI drive-gate: the hosted runners are
content-process unstable under a heavy headless interaction sequence (see
70-known-bugs / 60-ci-release-pipeline); this runs locally on reliable hardware.

Flake-resilience: under full-suite load a couple of interaction tests (dblclick,
hover/mouseenter) can flake even though they pass 3/3 in isolation, so failures
are rerun up to twice on the known transient signatures. A genuinely broken
binary fails all attempts. The webrtc e2e tests fake a TCP-only SOCKS locally (no
proxy/secrets), so the whole suite is offline.

Usage:
    python scripts/run_e2e.py <firefox-binary>
    python scripts/run_e2e.py          # uses $INVPW_BINARY_PATH
"""
from __future__ import annotations

import os
import subprocess
import sys
from pathlib import Path

_RERUN_SIGNATURES = "Timeout|context was destroyed|was detached|not visible|because of a navigation|TargetClosed"


def main() -> int:
    binary = sys.argv[1] if len(sys.argv) > 1 else os.environ.get("INVPW_BINARY_PATH")
    if not binary:
        print("usage: run_e2e.py <firefox-binary> (or set INVPW_BINARY_PATH)", file=sys.stderr)
        return 2
    if not Path(binary).exists():
        print(f"ERROR: binary not found: {binary}", file=sys.stderr)
        return 2

    env = dict(os.environ)
    # One setting drives the whole suite: conftest's firefox_binary fixture and
    # the webrtc e2e both resolve from these.
    env["INVPW_BINARY_PATH"] = binary
    env["STEALTHFOX_E2E_BINARY"] = binary

    repo = Path(__file__).resolve().parent.parent
    cmd = [
        sys.executable, "-m", "pytest",
        "-m", "e2e",
        "-o", "addopts=",  # override the default 'not e2e' deselection
        "--reruns", "2", "--reruns-delay", "1",
        "--only-rerun", _RERUN_SIGNATURES,
        "-p", "no:cacheprovider",
        "-q", "--tb=short",
    ] + sys.argv[2:]
    print(f"[run_e2e] binary={binary}")
    print(f"[run_e2e] {' '.join(cmd)}")
    return subprocess.run(cmd, cwd=repo, env=env).returncode


if __name__ == "__main__":
    sys.exit(main())

View file

@ -0,0 +1,44 @@
"""invisible_playwright — Playwright wrapper for a patched Firefox with stealth profile.

Quickstart:

    from invisible_playwright import InvisiblePlaywright

    with InvisiblePlaywright() as browser:  # random seed
        page = browser.new_page()
        page.goto("https://example.com")

    with InvisiblePlaywright(seed=42) as browser:  # deterministic
        ...

    with InvisiblePlaywright(humanize=True) as browser:  # human-like cursor motion
        page = browser.new_page()
        page.click("#submit")  # expanded into a Bezier trajectory
"""
from .config import get_default_args, get_default_stealth_prefs
from .constants import BINARY_VERSION, FIREFOX_UPSTREAM_VERSION
from ._geo import GeoTimezoneError, resolve_session_timezone
from .download import ensure_binary, ensure_geoip_mmdb
from .launcher import InvisiblePlaywright

from importlib.metadata import PackageNotFoundError, version as _pkg_version

try:
    __version__ = _pkg_version("invisible-playwright")
except PackageNotFoundError:
    # Editable / source checkout without an install record: fall back to a
    # marker rather than risk shipping a stale hardcoded string.
    __version__ = "0.0.0+unknown"

__all__ = [
    "InvisiblePlaywright",
    "ensure_binary",
    "ensure_geoip_mmdb",
    "get_default_stealth_prefs",
    "get_default_args",
    "resolve_session_timezone",
    "GeoTimezoneError",
    "BINARY_VERSION",
    "FIREFOX_UPSTREAM_VERSION",
    "__version__",
]

View file

@ -1,7 +1,7 @@
"""Internal Bayesian fingerprint generator used by stealthfox. """Internal Bayesian fingerprint generator used by invisible_playwright.
Private module; do not import from user code. Use Private module; do not import from user code. Use
stealthfox.Stealthfox(seed=..., pin=...) instead. invisible_playwright.InvisiblePlaywright(seed=..., pin=...) instead.
""" """
from .profile import ( from .profile import (
AudioProfile, AudioProfile,

View file

@ -84,6 +84,12 @@ _FONT_POOL = _load("font_pool.json")
_FONT_CORE: list = _FONT_POOL["core"] _FONT_CORE: list = _FONT_POOL["core"]
_FONT_OPTIONAL: list = _FONT_POOL["optional"] _FONT_OPTIONAL: list = _FONT_POOL["optional"]
_CPT_FONTS_OPT = _load("cpt_fonts_optional_given_class.json")["table"] _CPT_FONTS_OPT = _load("cpt_fonts_optional_given_class.json")["table"]
# Browsing-history pool + CPT (per-class probabilities for visited sites).
# Drives _recaptcha_seed's cookie pre-seed: each persona ends up with a
# coherent list of ~15-30 visited sites whose categories correlate with
# gpu_class (workstation → dev-heavy, integrated_old → shop+news-heavy).
_BROWSING_POOL: list = _load("browsing_pool.json")["entries"]
_CPT_BROWSING = _load("cpt_browsing_given_class.json")["table"]
# ═══════════════════════════════════════════════════════════════════════ # ═══════════════════════════════════════════════════════════════════════
@ -282,6 +288,33 @@ def derive_font_whitelist(gpu_class: str, rng) -> str:
return derive_font_prefs(gpu_class, rng)["whitelist"] return derive_font_prefs(gpu_class, rng)["whitelist"]
# ═══════════════════════════════════════════════════════════════════════
# BROWSING HISTORY (Bayesian: per-site P(visited|gpu_class))
# ═══════════════════════════════════════════════════════════════════════
def derive_browsing_history(gpu_class: str, rng) -> list:
    """Sample which sites this persona has visited recently.

    Each site in the pool has a per-class probability (CPT). We sample
    independently per-site, producing a list of dicts:
        [{"name": "github.com", "category": "dev", "cookie_profile": "ga_cf"}, ...]

    Sum of CPT probabilities per class is tuned to land ~15-30 visited sites
    on average, an established-user signature. Sorted by name for stable
    output across runs of the same seed.
    """
    cpt = _CPT_BROWSING.get(gpu_class)
    if cpt is None:
        cpt = _CPT_BROWSING["mid_range"]
    visited: list = []
    for entry in _BROWSING_POOL:
        name = entry["name"]
        p = cpt.get(name, 0.3)  # default 0.3 for missing CPT row
        if rng.random() < p:
            visited.append(dict(entry))  # copy to avoid mutating pool
    visited.sort(key=lambda e: e["name"])
    return visited
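
The per-site sampling above is one independent Bernoulli draw per pool entry, so the expected number of visited sites is simply the sum of the per-site probabilities. The sketch below illustrates both points with a tiny invented pool and CPT (not the project's data); `sample_visited` and `expected_count` are hypothetical names, and `random.Random` stands in for whatever seeded generator the Forge passes as `rng`.

```python
import random

# Tiny illustrative pool and CPT (invented values, not the project's data).
POOL = [
    {"name": "github.com", "category": "dev"},
    {"name": "amazon.com", "category": "shop"},
    {"name": "bbc.com", "category": "news"},
]
CPT = {"mid_range": {"github.com": 1.0, "amazon.com": 0.0}}  # bbc.com has no row


def sample_visited(gpu_class, rng, default_p=0.3):
    """One independent Bernoulli draw per site, in pool order."""
    row = CPT.get(gpu_class, CPT["mid_range"])  # unknown class falls back to mid_range
    visited = [dict(e) for e in POOL if rng.random() < row.get(e["name"], default_p)]
    return sorted(visited, key=lambda e: e["name"])


def expected_count(gpu_class, default_p=0.3):
    """Mean number of visited sites: the sum of the per-site probabilities."""
    row = CPT.get(gpu_class, CPT["mid_range"])
    return sum(row.get(e["name"], default_p) for e in POOL)


a = sample_visited("mid_range", random.Random(42))
b = sample_visited("mid_range", random.Random(42))
print(a == b)  # the same seed yields the same visited list
print(expected_count("mid_range"))
```

Because every site consumes exactly one draw in a fixed order, adding or reordering pool entries changes which sites a given seed produces; that is worth keeping in mind when extending the pool.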
# ═══════════════════════════════════════════════════════════════════════ # ═══════════════════════════════════════════════════════════════════════
# PUBLIC API: Forge # PUBLIC API: Forge
# ═══════════════════════════════════════════════════════════════════════ # ═══════════════════════════════════════════════════════════════════════
@ -350,6 +383,12 @@ class Forge:
bundle["gpu_class"], self._rng bundle["gpu_class"], self._rng
).items() ).items()
}, },
# Bayesian browsing history (per-class P(visited|gpu_class)).
# Consumed by _recaptcha_seed.py to seed coherent cookie history
# when invisible_playwright is launched with prep_recaptcha=True.
"browsing_history": derive_browsing_history(
bundle["gpu_class"], self._rng
),
} }

View file

@ -0,0 +1,64 @@
{
"_comment": [
"Pool of everyday websites used by the browsing_history node.",
"Each entry: { name, category, cookie_profile }.",
"- name: bare domain (no scheme, no leading dot).",
"- category: dev / shop / news / reference / media / community / misc.",
"- cookie_profile: short tag pointing to a cookie-template recipe used by",
" _recaptcha_seed.py to generate concrete cookies (so heavy-analytics sites",
" get _ga+_gid+OneTrust, simple sites get just _ga, dev tools get GH-style).",
"Add new entries here + add per-class probabilities in cpt_browsing_given_class.json."
],
"entries": [
{"name": "youtube.com", "category": "media", "cookie_profile": "ga_only"},
{"name": "wikipedia.org", "category": "reference", "cookie_profile": "minimal"},
{"name": "mozilla.org", "category": "reference", "cookie_profile": "ga_consent"},
{"name": "w3schools.com", "category": "dev", "cookie_profile": "ga_consent_clarity"},
{"name": "mdn.io", "category": "dev", "cookie_profile": "minimal"},
{"name": "duckduckgo.com", "category": "reference", "cookie_profile": "minimal"},
{"name": "github.com", "category": "dev", "cookie_profile": "ga_cf"},
{"name": "stackoverflow.com", "category": "dev", "cookie_profile": "ga_consent_clarity"},
{"name": "npmjs.com", "category": "dev", "cookie_profile": "ga_consent"},
{"name": "gitlab.com", "category": "dev", "cookie_profile": "ga_cf"},
{"name": "pypi.org", "category": "dev", "cookie_profile": "minimal"},
{"name": "docs.python.org", "category": "dev", "cookie_profile": "minimal"},
{"name": "rust-lang.org", "category": "dev", "cookie_profile": "ga_consent"},
{"name": "go.dev", "category": "dev", "cookie_profile": "ga_consent"},
{"name": "amazon.com", "category": "shop", "cookie_profile": "ga_consent_clarity"},
{"name": "ebay.com", "category": "shop", "cookie_profile": "ga_consent"},
{"name": "etsy.com", "category": "shop", "cookie_profile": "ga_consent_clarity"},
{"name": "bestbuy.com", "category": "shop", "cookie_profile": "ga_consent_clarity"},
{"name": "target.com", "category": "shop", "cookie_profile": "ga_consent_clarity"},
{"name": "nytimes.com", "category": "news", "cookie_profile": "ga_consent_clarity"},
{"name": "cnn.com", "category": "news", "cookie_profile": "ga_consent"},
{"name": "bbc.com", "category": "news", "cookie_profile": "ga_consent"},
{"name": "theguardian.com", "category": "news", "cookie_profile": "ga_consent_clarity"},
{"name": "reuters.com", "category": "news", "cookie_profile": "ga_consent"},
{"name": "apnews.com", "category": "news", "cookie_profile": "ga_consent"},
{"name": "washingtonpost.com", "category": "news", "cookie_profile": "ga_consent"},
{"name": "techcrunch.com", "category": "news", "cookie_profile": "ga_consent_clarity"},
{"name": "theverge.com", "category": "news", "cookie_profile": "ga_consent"},
{"name": "arstechnica.com", "category": "news", "cookie_profile": "ga_consent"},
{"name": "wired.com", "category": "news", "cookie_profile": "ga_consent_clarity"},
{"name": "engadget.com", "category": "news", "cookie_profile": "ga_consent"},
{"name": "9to5mac.com", "category": "news", "cookie_profile": "ga_consent"},
{"name": "medium.com", "category": "community", "cookie_profile": "ga_consent"},
{"name": "dev.to", "category": "community", "cookie_profile": "ga_consent"},
{"name": "reddit.com", "category": "community", "cookie_profile": "ga_cf"},
{"name": "news.ycombinator.com", "category": "community", "cookie_profile": "minimal"},
{"name": "quora.com", "category": "community", "cookie_profile": "ga_consent_clarity"},
{"name": "stackexchange.com", "category": "community", "cookie_profile": "ga_consent_clarity"},
{"name": "imdb.com", "category": "media", "cookie_profile": "ga_consent_clarity"},
{"name": "rottentomatoes.com", "category": "media", "cookie_profile": "ga_consent"},
{"name": "metacritic.com", "category": "media", "cookie_profile": "ga_consent"},
{"name": "allrecipes.com", "category": "misc", "cookie_profile": "ga_consent_clarity"},
{"name": "epicurious.com", "category": "misc", "cookie_profile": "ga_consent"},
{"name": "tripadvisor.com", "category": "misc", "cookie_profile": "ga_consent_clarity"},
{"name": "weather.com", "category": "reference", "cookie_profile": "ga_consent"},
{"name": "timeanddate.com", "category": "reference", "cookie_profile": "ga_consent"},
{"name": "thesaurus.com", "category": "reference", "cookie_profile": "ga_consent_clarity"},
{"name": "kayak.com", "category": "shop", "cookie_profile": "ga_consent_clarity"},
{"name": "booking.com", "category": "shop", "cookie_profile": "ga_consent_clarity"},
{"name": "airbnb.com", "category": "shop", "cookie_profile": "ga_consent"}
]
}
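
Since new pool entries also need per-class probabilities in `cpt_browsing_given_class.json`, a small consistency check can flag entries that were added to one file but not the other. This is a sketch under assumptions: `validate_pool` is a hypothetical helper, and it treats a missing CPT row as a problem even though the sampler itself would fall back to a default probability.

```python
REQUIRED_KEYS = {"name", "category", "cookie_profile"}


def validate_pool(entries, cpt_table):
    """Return a list of human-readable problems (empty means consistent)."""
    problems = []
    names = set()
    for entry in entries:
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"{entry.get('name', '?')}: missing {sorted(missing)}")
            continue
        name = entry["name"]
        if "://" in name or name.startswith("."):
            problems.append(f"{name}: not a bare domain")
        if name in names:
            problems.append(f"{name}: duplicate entry")
        names.add(name)
    for gpu_class, row in cpt_table.items():
        for name in sorted(names - set(row)):
            problems.append(f"{gpu_class}: no probability for {name}")
        for name, p in row.items():
            if not 0.0 <= p <= 1.0:
                problems.append(f"{gpu_class}/{name}: probability {p} out of range")
    return problems


# Invented two-entry example; the second name deliberately includes a scheme.
entries = [
    {"name": "github.com", "category": "dev", "cookie_profile": "ga_cf"},
    {"name": "https://bad.example", "category": "misc", "cookie_profile": "minimal"},
]
cpt = {"mid_range": {"github.com": 0.6}}
for problem in validate_pool(entries, cpt):
    print(problem)
```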

View file

@ -0,0 +1,138 @@
{
"_comment": [
"Per-class probability that a persona of a given gpu_class has visited each",
"site in the pool. Used by the browsing_history node to derive a coherent",
"visited-domain list per persona.",
"",
"Probabilities are tuned so each class samples ~15-30 sites on average",
"(sum across all 50 entries falls in that range), giving an established-user",
"look. Categories are biased by class:",
" - workstation/high_end: higher P(dev) + high P(news/media)",
" - mid_range: balanced",
" - low_end/integrated_*: lower P(dev), higher P(shop/news/reference)",
"",
"Missing class falls back to mid_range via Node CPT pool fallback."
],
"table": {
"workstation": {
"youtube.com": 0.80, "wikipedia.org": 0.85, "mozilla.org": 0.70,
"w3schools.com": 0.40, "mdn.io": 0.55, "duckduckgo.com": 0.45,
"github.com": 0.95, "stackoverflow.com": 0.90, "npmjs.com": 0.65,
"gitlab.com": 0.50, "pypi.org": 0.55, "docs.python.org": 0.60,
"rust-lang.org": 0.35, "go.dev": 0.30,
"amazon.com": 0.70, "ebay.com": 0.25, "etsy.com": 0.15,
"bestbuy.com": 0.45, "target.com": 0.30,
"nytimes.com": 0.55, "cnn.com": 0.40, "bbc.com": 0.55,
"theguardian.com": 0.45, "reuters.com": 0.40, "apnews.com": 0.30,
"washingtonpost.com": 0.40,
"techcrunch.com": 0.65, "theverge.com": 0.60, "arstechnica.com": 0.65,
"wired.com": 0.50, "engadget.com": 0.35, "9to5mac.com": 0.30,
"medium.com": 0.55, "dev.to": 0.40, "reddit.com": 0.70,
"news.ycombinator.com": 0.65, "quora.com": 0.20, "stackexchange.com": 0.60,
"imdb.com": 0.45, "rottentomatoes.com": 0.25, "metacritic.com": 0.20,
"allrecipes.com": 0.20, "epicurious.com": 0.15, "tripadvisor.com": 0.30,
"weather.com": 0.55, "timeanddate.com": 0.30, "thesaurus.com": 0.25,
"kayak.com": 0.30, "booking.com": 0.35, "airbnb.com": 0.30
},
"high_end": {
"youtube.com": 0.85, "wikipedia.org": 0.80, "mozilla.org": 0.60,
"w3schools.com": 0.45, "mdn.io": 0.45, "duckduckgo.com": 0.40,
"github.com": 0.85, "stackoverflow.com": 0.80, "npmjs.com": 0.50,
"gitlab.com": 0.40, "pypi.org": 0.45, "docs.python.org": 0.50,
"rust-lang.org": 0.30, "go.dev": 0.25,
"amazon.com": 0.75, "ebay.com": 0.30, "etsy.com": 0.20,
"bestbuy.com": 0.50, "target.com": 0.35,
"nytimes.com": 0.50, "cnn.com": 0.50, "bbc.com": 0.50,
"theguardian.com": 0.40, "reuters.com": 0.35, "apnews.com": 0.30,
"washingtonpost.com": 0.35,
"techcrunch.com": 0.60, "theverge.com": 0.65, "arstechnica.com": 0.60,
"wired.com": 0.50, "engadget.com": 0.40, "9to5mac.com": 0.35,
"medium.com": 0.50, "dev.to": 0.35, "reddit.com": 0.75,
"news.ycombinator.com": 0.55, "quora.com": 0.25, "stackexchange.com": 0.55,
"imdb.com": 0.55, "rottentomatoes.com": 0.35, "metacritic.com": 0.30,
"allrecipes.com": 0.25, "epicurious.com": 0.20, "tripadvisor.com": 0.30,
"weather.com": 0.55, "timeanddate.com": 0.30, "thesaurus.com": 0.25,
"kayak.com": 0.30, "booking.com": 0.40, "airbnb.com": 0.30
},
"mid_range": {
"youtube.com": 0.85, "wikipedia.org": 0.75, "mozilla.org": 0.45,
"w3schools.com": 0.40, "mdn.io": 0.30, "duckduckgo.com": 0.35,
"github.com": 0.55, "stackoverflow.com": 0.55, "npmjs.com": 0.30,
"gitlab.com": 0.25, "pypi.org": 0.25, "docs.python.org": 0.30,
"rust-lang.org": 0.15, "go.dev": 0.15,
"amazon.com": 0.80, "ebay.com": 0.40, "etsy.com": 0.30,
"bestbuy.com": 0.55, "target.com": 0.40,
"nytimes.com": 0.45, "cnn.com": 0.55, "bbc.com": 0.45,
"theguardian.com": 0.35, "reuters.com": 0.30, "apnews.com": 0.30,
"washingtonpost.com": 0.30,
"techcrunch.com": 0.45, "theverge.com": 0.50, "arstechnica.com": 0.40,
"wired.com": 0.45, "engadget.com": 0.35, "9to5mac.com": 0.30,
"medium.com": 0.45, "dev.to": 0.25, "reddit.com": 0.70,
"news.ycombinator.com": 0.30, "quora.com": 0.35, "stackexchange.com": 0.40,
"imdb.com": 0.60, "rottentomatoes.com": 0.40, "metacritic.com": 0.35,
"allrecipes.com": 0.35, "epicurious.com": 0.25, "tripadvisor.com": 0.40,
"weather.com": 0.60, "timeanddate.com": 0.25, "thesaurus.com": 0.30,
"kayak.com": 0.35, "booking.com": 0.45, "airbnb.com": 0.40
},
"low_end": {
"youtube.com": 0.85, "wikipedia.org": 0.70, "mozilla.org": 0.35,
"w3schools.com": 0.30, "mdn.io": 0.20, "duckduckgo.com": 0.30,
"github.com": 0.30, "stackoverflow.com": 0.30, "npmjs.com": 0.15,
"gitlab.com": 0.10, "pypi.org": 0.10, "docs.python.org": 0.15,
"rust-lang.org": 0.05, "go.dev": 0.05,
"amazon.com": 0.85, "ebay.com": 0.50, "etsy.com": 0.40,
"bestbuy.com": 0.55, "target.com": 0.45,
"nytimes.com": 0.40, "cnn.com": 0.60, "bbc.com": 0.40,
"theguardian.com": 0.30, "reuters.com": 0.25, "apnews.com": 0.30,
"washingtonpost.com": 0.25,
"techcrunch.com": 0.30, "theverge.com": 0.35, "arstechnica.com": 0.25,
"wired.com": 0.40, "engadget.com": 0.30, "9to5mac.com": 0.25,
"medium.com": 0.35, "dev.to": 0.15, "reddit.com": 0.65,
"news.ycombinator.com": 0.15, "quora.com": 0.45, "stackexchange.com": 0.25,
"imdb.com": 0.65, "rottentomatoes.com": 0.45, "metacritic.com": 0.35,
"allrecipes.com": 0.45, "epicurious.com": 0.30, "tripadvisor.com": 0.45,
"weather.com": 0.65, "timeanddate.com": 0.25, "thesaurus.com": 0.35,
"kayak.com": 0.35, "booking.com": 0.50, "airbnb.com": 0.40
},
"integrated_modern": {
"youtube.com": 0.85, "wikipedia.org": 0.70, "mozilla.org": 0.40,
"w3schools.com": 0.35, "mdn.io": 0.25, "duckduckgo.com": 0.35,
"github.com": 0.40, "stackoverflow.com": 0.40, "npmjs.com": 0.20,
"gitlab.com": 0.15, "pypi.org": 0.20, "docs.python.org": 0.20,
"rust-lang.org": 0.10, "go.dev": 0.10,
"amazon.com": 0.80, "ebay.com": 0.40, "etsy.com": 0.30,
"bestbuy.com": 0.50, "target.com": 0.40,
"nytimes.com": 0.40, "cnn.com": 0.55, "bbc.com": 0.45,
"theguardian.com": 0.35, "reuters.com": 0.30, "apnews.com": 0.30,
"washingtonpost.com": 0.30,
"techcrunch.com": 0.40, "theverge.com": 0.45, "arstechnica.com": 0.30,
"wired.com": 0.40, "engadget.com": 0.30, "9to5mac.com": 0.25,
"medium.com": 0.40, "dev.to": 0.20, "reddit.com": 0.65,
"news.ycombinator.com": 0.25, "quora.com": 0.40, "stackexchange.com": 0.35,
"imdb.com": 0.60, "rottentomatoes.com": 0.40, "metacritic.com": 0.30,
"allrecipes.com": 0.40, "epicurious.com": 0.25, "tripadvisor.com": 0.40,
"weather.com": 0.60, "timeanddate.com": 0.25, "thesaurus.com": 0.30,
"kayak.com": 0.35, "booking.com": 0.45, "airbnb.com": 0.40
},
"integrated_old": {
"youtube.com": 0.75, "wikipedia.org": 0.65, "mozilla.org": 0.30,
"w3schools.com": 0.20, "mdn.io": 0.10, "duckduckgo.com": 0.25,
"github.com": 0.15, "stackoverflow.com": 0.20, "npmjs.com": 0.05,
"gitlab.com": 0.05, "pypi.org": 0.05, "docs.python.org": 0.10,
"rust-lang.org": 0.02, "go.dev": 0.02,
"amazon.com": 0.85, "ebay.com": 0.55, "etsy.com": 0.45,
"bestbuy.com": 0.55, "target.com": 0.50,
"nytimes.com": 0.45, "cnn.com": 0.65, "bbc.com": 0.40,
"theguardian.com": 0.30, "reuters.com": 0.25, "apnews.com": 0.35,
"washingtonpost.com": 0.30,
"techcrunch.com": 0.20, "theverge.com": 0.25, "arstechnica.com": 0.15,
"wired.com": 0.30, "engadget.com": 0.20, "9to5mac.com": 0.20,
"medium.com": 0.30, "dev.to": 0.05, "reddit.com": 0.55,
"news.ycombinator.com": 0.05, "quora.com": 0.55, "stackexchange.com": 0.15,
"imdb.com": 0.70, "rottentomatoes.com": 0.50, "metacritic.com": 0.35,
"allrecipes.com": 0.55, "epicurious.com": 0.35, "tripadvisor.com": 0.50,
"weather.com": 0.70, "timeanddate.com": 0.30, "thesaurus.com": 0.40,
"kayak.com": 0.40, "booking.com": 0.55, "airbnb.com": 0.40
}
}
}
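
The table above can be read as one independent Bernoulli draw per site. The sketch below illustrates that reading on a three-site miniature of the table. It is not the project's `derive_browsing_history` (which is not shown in this diff): `sample_visited`, `expected_count`, and the miniature `CPT` are hypothetical names, and independent per-site draws are an assumption.

```python
import random
from typing import Dict, List

# Hypothetical miniature of the table above: class -> {site: P(visited)}.
CPT: Dict[str, Dict[str, float]] = {
    "workstation": {"github.com": 0.95, "etsy.com": 0.15, "youtube.com": 0.80},
    "mid_range": {"github.com": 0.55, "etsy.com": 0.30, "youtube.com": 0.85},
}


def sample_visited(gpu_class: str, seed: int,
                   cpt: Dict[str, Dict[str, float]] = CPT) -> List[str]:
    """Draw one independent Bernoulli trial per site for the given class."""
    # Unknown classes fall back to mid_range, mirroring the note in _comment.
    row = cpt.get(gpu_class, cpt["mid_range"])
    rng = random.Random(seed)  # same seed -> same history
    # Iterate in sorted order so the result does not depend on dict ordering.
    return [site for site in sorted(row) if rng.random() < row[site]]


def expected_count(gpu_class: str,
                   cpt: Dict[str, Dict[str, float]] = CPT) -> float:
    """Expected number of sampled sites = sum of the per-site probabilities."""
    row = cpt.get(gpu_class, cpt["mid_range"])
    return sum(row.values())
```

Under this reading, the "~15-30 sites" target in `_comment` is simply the row sum for each class, which is what `expected_count` computes.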


@ -120,6 +120,11 @@ class Profile:
     webgl: WebGLProfile
     fonts: List[str]
     dark_theme: bool
+    # Bayesian browsing-history: list of {name, category, cookie_profile}
+    # dicts sampled from data/browsing_pool.json with per-class CPT. Used
+    # by _recaptcha_seed.py to build a coherent cookie pre-seed when the
+    # caller opts in via Stealthfox(prep_recaptcha=True).
+    browsing_history: List[Dict[str, str]] = field(default_factory=list)
     _raw: Dict[str, Any] = field(default_factory=dict, repr=False, compare=False)
     def to_prefs_dict(self) -> Dict[str, Any]:
@ -129,7 +134,7 @@ class Profile:
 # Mapping from flat pin key -> raw sampler dict key, so `to_prefs_dict()`
-# and `stealthfox.prefs.translate_profile_to_prefs` observe the pinned value.
+# and `invisible_playwright.prefs.translate_profile_to_prefs` observe the pinned value.
 _PIN_TO_RAW = {
     "gpu.vendor": "webgl_vendor",
     "gpu.renderer": "webgl_renderer",
@ -182,11 +187,11 @@ def generate_profile(seed: int, pin: Optional[Dict[str, Any]] = None) -> Profile
     same seed + same pin map always yields the same profile.
     Example: force a specific GPU and screen while letting everything
-    else vary with the seed (via the public stealthfox API):
+    else vary with the seed (via the public invisible_playwright API):
-        from stealthfox import Stealthfox
+        from invisible_playwright import InvisiblePlaywright
-        with Stealthfox(
+        with InvisiblePlaywright(
             seed=42,
             pin={
                 "gpu.renderer": "ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11)",
@ -255,5 +260,6 @@ def generate_profile(seed: int, pin: Optional[Dict[str, Any]] = None) -> Profile
         webgl=WebGLProfile(msaa_samples=int(raw["msaa_samples"])),
         fonts=fonts,
         dark_theme=bool(raw["dark_theme"]),
+        browsing_history=list(raw.get("browsing_history") or []),
         _raw=raw,
     )
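
The new `browsing_history` field uses `field(default_factory=list)`, and the constructor call normalises a missing or `None` value with `list(raw.get("browsing_history") or [])`. The sketch below shows both behaviours in isolation. `MiniProfile` and `history_from_raw` are hypothetical stand-ins rather than part of the diff, and the sample site entry is illustrative only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class MiniProfile:  # hypothetical stand-in for Profile
    dark_theme: bool
    # default_factory gives every instance its own fresh list.
    browsing_history: List[Dict[str, str]] = field(default_factory=list)


def history_from_raw(raw: Dict[str, Any]) -> List[Dict[str, str]]:
    """Same normalisation as the diff: missing key or None -> empty list."""
    return list(raw.get("browsing_history") or [])


a = MiniProfile(dark_theme=True)
b = MiniProfile(dark_theme=False)
# Illustrative entry; the field names follow the comment in the diff.
a.browsing_history.append(
    {"name": "github.com", "category": "dev", "cookie_profile": "ga_only"}
)
```

Because each instance gets its own list, mutating `a.browsing_history` leaves `b.browsing_history` empty. The `list(...)` copy in `history_from_raw` also keeps the profile from aliasing the sampler's raw dict.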


@ -0,0 +1,218 @@
"""Resolve the session timezone from the egress IP (``timezone="auto"``).
Approach B: discover the egress IP with one HTTP request (routed *through the
proxy* when one is set, otherwise a direct request that sees the host's own
public IP), then map IP -> IANA timezone with an offline mmdb
(``daijro/geoip-all-in-one``, downloaded + cached by ``download.py``).
Precedence (see ``resolve_session_timezone``):
    explicit IANA   -> unchanged   explicit always wins
    "" / "auto"     -> egress      ALWAYS resolve. With a proxy, from the proxy
                                   egress IP; without a proxy, from the host's
                                   own public IP. This is the default.
On failure:
    with a proxy    -> raise       a foreign proxy paired with the host TZ is
                                   the precise ``timezone_mismatch`` signal, so
                                   we fail loudly rather than fall back silently.
    without a proxy -> "" (host)   the host TZ is a safe default, so a transient
                                   lookup failure must not break the launch.
"""
from __future__ import annotations
import ipaddress
from typing import Any, Dict, NamedTuple, Optional
from urllib.parse import quote
import requests
class GeoTimezoneError(RuntimeError):
"""Raised when ``timezone="auto"`` cannot resolve a valid IANA zone."""
# Plain-text IP echo endpoints (each returns just the caller's public IP).
_IP_ECHO_ENDPOINTS = (
"https://api.ipify.org",
"https://icanhazip.com",
"https://checkip.amazonaws.com",
)
_SOCKS_SCHEMES = ("socks5://", "socks4://", "socks://")
def _proxy_is_set(proxy: Optional[Dict[str, str]]) -> bool:
if not proxy:
return False
server = (proxy.get("server") or "").strip()
return bool(server) and server.lower() != "direct://"
def _proxies_for_requests(proxy: Dict[str, str]) -> Dict[str, str]:
"""Translate our proxy dict into a ``requests`` proxies mapping.
SOCKS5 uses the ``socks5h`` scheme so DNS is resolved proxy-side (matches
``network.proxy.socks_remote_dns=True`` in the Firefox path). HTTP/HTTPS
pass through unchanged. Credentials are URL-encoded.
"""
server = (proxy.get("server") or "").strip()
low = server.lower()
if low.startswith("socks5://") or low.startswith("socks://"):
scheme = "socks5h"
elif low.startswith("socks4://"):
scheme = "socks4"
elif low.startswith("https://"):
scheme = "https"
else:
scheme = "http"
host_port = server.split("://", 1)[1] if "://" in server else server
user = proxy.get("username") or ""
pwd = proxy.get("password") or ""
if user:
auth = f"{quote(user, safe='')}:{quote(pwd, safe='')}@"
else:
auth = ""
url = f"{scheme}://{auth}{host_port}"
return {"http": url, "https": url}
def discover_egress_ip(
proxy: Optional[Dict[str, str]] = None, *, timeout: float = 10.0
) -> str:
"""Return the public egress IP.
Routes the request through ``proxy`` when given (SOCKS support requires
``requests[socks]`` / PySocks); with ``proxy=None`` it makes a direct
request that sees the host's own public IP. Tries each echo endpoint in
turn; raises :class:`GeoTimezoneError` if none return a valid IP.
"""
proxies = _proxies_for_requests(proxy) if proxy else None
last_err: Optional[Exception] = None
for url in _IP_ECHO_ENDPOINTS:
try:
resp = requests.get(url, proxies=proxies, timeout=timeout)
resp.raise_for_status()
ip = resp.text.strip()
ipaddress.ip_address(ip) # validate (raises ValueError if not an IP)
return ip
except Exception as exc: # noqa: BLE001 - try the next endpoint
last_err = exc
continue
raise GeoTimezoneError(
f"could not discover the egress IP via {len(_IP_ECHO_ENDPOINTS)} "
f"endpoints (last error: {last_err!r}). For SOCKS proxies make sure "
f"requests[socks] / PySocks is installed."
)
def ip_to_timezone(ip: str, mmdb_path: Any) -> str:
"""Map ``ip`` to its IANA timezone using the offline mmdb.
Reads the standard MaxMind ``location.time_zone`` field and validates it
against the system tz database. Raises :class:`GeoTimezoneError` if the IP
is absent from the DB or the zone is missing / not a valid IANA name.
"""
import maxminddb
with maxminddb.open_database(str(mmdb_path)) as reader:
record = reader.get(ip)
if not record:
raise GeoTimezoneError(f"egress IP {ip} not present in the geoip database")
tz = ((record.get("location") or {}) if isinstance(record, dict) else {}).get(
"time_zone"
)
if not tz:
raise GeoTimezoneError(f"no timezone for egress IP {ip} in the geoip database")
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError
try:
ZoneInfo(tz)
except (ZoneInfoNotFoundError, ValueError) as exc:
raise GeoTimezoneError(
f"geoip returned an invalid IANA zone {tz!r} for {ip}: {exc}"
) from exc
return tz
class SessionGeo(NamedTuple):
"""Geo facts resolved once per session from a single egress round-trip.
``timezone`` follows the precedence in the module docstring.
``egress_ip`` is the proxy egress IP (the IP the *outside world* sees) when
a proxy is set, else ``None``. It feeds the WebRTC srflx override, which is
only meaningful behind a proxy (a direct connection's real STUN already
reports the truthful public IP, so we leave it alone).
"""
timezone: str
egress_ip: Optional[str]
def prepare_session_geo(
timezone: str, proxy: Optional[Dict[str, str]]
) -> SessionGeo:
"""Resolve the session timezone AND the proxy egress IP in ONE round-trip.
The egress IP is discovered once and reused for both the timezone mapping
(when ``timezone`` is ``""``/``"auto"``) and the WebRTC public-IP override.
Timezone precedence is identical to :func:`resolve_session_timezone`; the
egress IP is best-effort for the WebRTC side (a discovery failure that the
timezone path doesn't need won't break the launch, but if the timezone
path *does* need it behind a proxy, that path still fails loudly).
"""
from .download import ensure_geoip_mmdb
tz = (timezone or "").strip()
proxy_set = _proxy_is_set(proxy)
# One discovery, reused below. Behind a proxy we always want the egress IP
# (for WebRTC) regardless of the timezone setting.
egress_ip: Optional[str] = None
egress_err: Optional[Exception] = None
if proxy_set:
try:
egress_ip = discover_egress_ip(proxy)
except Exception as exc: # noqa: BLE001
egress_err = exc
# Timezone resolution — same precedence as resolve_session_timezone.
if tz and tz.lower() != "auto":
return SessionGeo(tz, egress_ip) # explicit IANA wins
try:
ip = egress_ip if proxy_set else discover_egress_ip(None)
if ip is None: # proxy set but discovery failed above
raise egress_err or GeoTimezoneError("egress IP discovery failed")
return SessionGeo(ip_to_timezone(ip, ensure_geoip_mmdb()), egress_ip)
except Exception:
if proxy_set:
raise # fail-early behind a proxy (timezone_mismatch trap)
return SessionGeo("", None) # no proxy: host TZ is a safe fallback
def resolve_session_timezone(
timezone: str, proxy: Optional[Dict[str, str]]
) -> str:
"""Map the user's ``timezone`` setting to a concrete IANA zone (or ``""``).
Timezone-only path (no WebRTC side effects): an explicit IANA zone wins and
triggers NO network call; ``""``/``"auto"`` resolve from the egress IP. The
launch path uses :func:`prepare_session_geo` instead (which additionally
returns the egress IP for WebRTC); this standalone resolver is kept for
third-party integrations that only want the zone. See the module docstring
for the precedence table.
"""
tz = (timezone or "").strip()
if tz and tz.lower() != "auto":
return tz # explicit IANA wins — no egress lookup
from .download import ensure_geoip_mmdb
proxy_set = _proxy_is_set(proxy)
try:
ip = discover_egress_ip(proxy if proxy_set else None)
return ip_to_timezone(ip, ensure_geoip_mmdb())
except Exception:
if proxy_set:
raise # fail-early behind a proxy (timezone_mismatch trap)
return "" # no proxy: host TZ is a safe fallback
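
The precedence and failure rules in the module docstring can be exercised without any network access. The sketch below is a toy re-statement of that table, not the module's code: `resolve_tz_sketch` and `failing_lookup` are hypothetical names, and the injected `lookup` callable stands in for `discover_egress_ip` plus `ip_to_timezone`.

```python
from typing import Callable, Dict, Optional

Proxy = Optional[Dict[str, str]]


def resolve_tz_sketch(timezone: str, proxy: Proxy,
                      lookup: Callable[[Proxy], str]) -> str:
    """Toy version of the precedence table; `lookup` may raise."""
    tz = (timezone or "").strip()
    if tz and tz.lower() != "auto":
        return tz  # explicit IANA wins, no lookup at all
    server = ((proxy or {}).get("server") or "").strip()
    proxy_set = bool(server) and server.lower() != "direct://"
    try:
        return lookup(proxy if proxy_set else None)
    except Exception:
        if proxy_set:
            raise  # behind a proxy, a silent host-TZ fallback would mismatch
        return ""  # direct connection: host TZ is a safe default


def failing_lookup(_proxy: Proxy) -> str:
    """Simulates every echo endpoint being unreachable."""
    raise RuntimeError("echo endpoints unreachable")
```

With `failing_lookup`, an explicit zone is returned untouched, a direct `"auto"` session degrades to `""`, and a proxied `"auto"` session propagates the error.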


@ -2,18 +2,23 @@
 Playwright's ``headless=True`` flips Firefox onto a different code path
 (no widget tree, software-only rendering, distinct timing), and anti-bot
-systems can spot the divergence. Running the browser *headed* on a
-virtual display gives us the real rendering pipeline while keeping the
-windows off the user's screen.
-Linux: spawns its own ``Xvfb`` instance, points ``DISPLAY`` at it.
-Windows: creates a hidden desktop via ``CreateDesktop`` and binds the
-calling thread to it, so Playwright's child processes inherit it.
+systems can spot the divergence. Running the browser *headed* but hidden
+gives us the real rendering pipeline while keeping the windows off screen.
+Two mechanisms, by platform:
+- **Windows & macOS**: the patched binary cloaks its OWN chrome windows
+  when ``zoom.stealth.cloak_windows`` is set: ``DWMWA_CLOAK`` (Windows)
+  / ``NSWindow`` alpha-0 + pinned occlusion-ignore (macOS). The window
+  renders on the real GPU but never appears on screen, in the taskbar or
+  the Dock. The launcher injects the pref; nothing host-side is spawned.
+- **Linux**: spawns its own ``Xvfb`` instance and points ``DISPLAY`` at
+  it (X11/Wayland have no per-window cloak that keeps the GPU rendering).
 """
 from __future__ import annotations
 import os
-import secrets
 import subprocess
 import sys
 import time
@ -33,7 +38,7 @@ _WAYLAND_LEAK_VARS = (
 class _LinuxVirtualDisplay:
-    """Standalone Xvfb instance owned by this Stealthfox session."""
+    """Standalone Xvfb instance owned by this InvisiblePlaywright session."""
     def __init__(self, width: int = 1920, height: int = 1080) -> None:
         self._geometry = f"{width}x{height}x24"
@ -44,7 +49,7 @@ class _LinuxVirtualDisplay:
     def start(self) -> None:
         if not _binary_on_path("Xvfb"):
             raise RuntimeError(
-                "stealthfox headless=True requires Xvfb. "
+                "invisible_playwright headless=True requires Xvfb. "
                 "Install it: sudo apt install xvfb"
             )
         # Retry: when many workers start in parallel they can pick the same
@ -131,95 +136,40 @@ class _LinuxVirtualDisplay:
         self._display = None
-class _WindowsVirtualDesktop:
-    """A hidden Windows desktop the calling thread is bound to.
+# Windows & macOS: the patched Firefox cloaks its own chrome windows when this
+# pref is set (DWMWA_CLOAK / NSWindow alpha-0 + pinned occlusion-ignore), so the
+# window renders on the real GPU but never shows on screen / in the taskbar or
+# Dock. window_occlusion_tracking is disabled so a hidden window keeps painting.
+CLOAK_PREFS = {
+    "zoom.stealth.cloak_windows": True,
+    "widget.windows.window_occlusion_tracking.enabled": False,
+}
-    Playwright's child processes (node driver → firefox.exe) inherit the
-    desktop because their ``STARTUPINFO.lpDesktop`` is NULL; Windows uses
-    the calling thread's desktop in that case.
-    pywin32 ships ``CreateDesktop`` in ``win32service`` but doesn't expose
-    ``SetThreadDesktop`` / ``GetThreadDesktop`` as module functions. We
-    call them directly via ctypes against ``user32.dll``.
+def cloak_prefs() -> dict:
+    """Prefs that make the patched binary self-cloak its chrome windows.
+    Used on Windows & macOS, where hiding is done inside the binary rather than
+    with a host-side virtual display.
     """
+    return dict(CLOAK_PREFS)
-    def __init__(self) -> None:
-        self._desktop = None  # PyHDESK from win32service.CreateDesktop
-        self._original_handle = 0  # raw HDESK int of the previous desktop
-    def start(self) -> None:
-        try:
-            import win32con  # type: ignore
-            import win32service  # type: ignore
-        except ImportError as e:
-            raise RuntimeError(
-                "stealthfox headless=True on Windows requires pywin32. "
-                "Install it: pip install pywin32"
-            ) from e
-        import ctypes
-        from ctypes import wintypes
-        user32 = ctypes.windll.user32
-        kernel32 = ctypes.windll.kernel32
-        # Save the current desktop handle so we can restore it on stop().
-        get_thread_desktop = user32.GetThreadDesktop
-        get_thread_desktop.argtypes = [wintypes.DWORD]
-        get_thread_desktop.restype = wintypes.HANDLE
-        self._original_handle = get_thread_desktop(kernel32.GetCurrentThreadId())
-        name = f"sf_{secrets.token_hex(4)}"
-        self._desktop = win32service.CreateDesktop(
-            name, 0, win32con.GENERIC_ALL, None
-        )
-        # Bind the calling thread to the new desktop. Children spawned
-        # afterwards (Playwright driver → firefox.exe) inherit it because
-        # their STARTUPINFO.lpDesktop is NULL.
-        set_thread_desktop = user32.SetThreadDesktop
-        set_thread_desktop.argtypes = [wintypes.HANDLE]
-        set_thread_desktop.restype = wintypes.BOOL
-        if not set_thread_desktop(int(self._desktop)):
-            err = ctypes.get_last_error()
-            raise RuntimeError(
-                f"SetThreadDesktop failed (GetLastError={err}). "
-                "The thread cannot have any windows or hooks; close them first."
-            )
-    def stop(self) -> None:
-        import ctypes
-        from ctypes import wintypes
-        user32 = ctypes.windll.user32
-        if self._original_handle:
-            try:
-                set_thread_desktop = user32.SetThreadDesktop
-                set_thread_desktop.argtypes = [wintypes.HANDLE]
-                set_thread_desktop.restype = wintypes.BOOL
-                set_thread_desktop(self._original_handle)
-            except Exception:
-                pass
-            self._original_handle = 0
-        if self._desktop is not None:
-            try:
-                self._desktop.CloseDesktop()
-            except Exception:
-                pass
-            self._desktop = None
 def make_virtual_display():
-    """Return a started/stoppable virtual-display object for this platform.
+    """Return a start()/stop()-able virtual display, or ``None`` when the
+    platform hides windows via the in-binary cloak pref instead.
-    Stealthfox supports Windows x86_64 and Linux x86_64 only.
+    - Linux: a fresh ``Xvfb`` (the launcher start()s/stop()s it).
+    - Windows / macOS: ``None`` (the binary self-cloaks via ``cloak_prefs()``,
+      injected by the launcher; nothing host-side needs spawning).
     """
-    if sys.platform == "win32":
-        return _WindowsVirtualDesktop()
     if sys.platform.startswith("linux"):
         return _LinuxVirtualDisplay()
+    if sys.platform in ("win32", "darwin"):
+        return None
     raise RuntimeError(
-        f"stealthfox supports Windows and Linux only (got {sys.platform!r})"
+        f"invisible_playwright supports Windows, macOS and Linux "
+        f"(got {sys.platform!r})"
     )
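
A launcher consuming this module has to branch on whether `make_virtual_display()` returned an object or `None`. The sketch below mirrors only that branching and starts no display. `hiding_plan` is a hypothetical helper that is not in the diff, and returning no extra prefs on Linux is an assumption drawn from the `cloak_prefs()` docstring.

```python
from typing import Dict, Tuple

# Copied from CLOAK_PREFS in the diff above.
CLOAK_PREFS: Dict[str, bool] = {
    "zoom.stealth.cloak_windows": True,
    "widget.windows.window_occlusion_tracking.enabled": False,
}


def hiding_plan(platform: str) -> Tuple[bool, Dict[str, bool]]:
    """Return (needs_xvfb, extra_prefs) for a sys.platform-style string.

    Hypothetical helper: it mirrors the branching of make_virtual_display()
    but does not spawn or stop anything.
    """
    if platform.startswith("linux"):
        return True, {}  # host-side Xvfb; no cloak pref (assumed)
    if platform in ("win32", "darwin"):
        return False, dict(CLOAK_PREFS)  # binary self-cloaks via prefs
    raise RuntimeError(f"unsupported platform {platform!r}")
```

Returning a copy of the pref dict, as `cloak_prefs()` does, lets a caller merge in further prefs without mutating the module-level constant.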


@ -0,0 +1,340 @@
"""Deterministic reCAPTCHA cookie pre-seed.
Consumes the Bayesian-sampled `browsing_history` from the persona Profile
(see `_fpforge/_sampler.py:derive_browsing_history`). For each visited
site, builds 1-5 realistic cookies whose composition is chosen by the
site's `cookie_profile` tag (analytics-only / consent / cloudflare-bot-
management / etc.). All values seeded deterministically from the persona
seed, so a given persona always presents the SAME cookies across sessions.
In addition, always seeds 5 cookies on .google.com (NID, CONSENT, SOCS,
_GRECAPTCHA, ENID). Excludes 1P_JAR, which was deprecated by Google in 2022;
including it now is an anachronism flag.
Public API:
await seed_recaptcha_cookies_async(context, profile, timezone=None)
seed_recaptcha_cookies_sync(context, profile, timezone=None)
`profile` is an `_fpforge.Profile`; `timezone` is the IANA tz (e.g.
"Europe/Rome") used to derive the CONSENT cookie's language token, so a
European-tz persona gets CONSENT in their language, not en+FX.
"""
from __future__ import annotations
import datetime
import random
import time
from typing import Any, List, Optional
# URL-safe base64 alphabet (no padding chars).
_B64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
_HEX_ALPHABET = "0123456789abcdef"
def _sub_seed(seed: int, tag: str) -> int:
"""FNV-1a mix → independent PRNG streams per logical bucket from one seed."""
h = 0xcbf29ce484222325 ^ (seed & 0xFFFFFFFF)
for c in tag.encode("ascii"):
h ^= c
h = (h * 0x100000001b3) & 0xFFFFFFFFFFFFFFFF
return h or 0xdeadbeef
def _b64_rand(rng: random.Random, length: int) -> str:
return "".join(rng.choice(_B64_ALPHABET) for _ in range(length))
def _hex_rand(rng: random.Random, length: int) -> str:
return "".join(rng.choice(_HEX_ALPHABET) for _ in range(length))
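def _yyyymmdd_utc(ts: int) -> str:
    # utcfromtimestamp() is deprecated since Python 3.12; use an aware UTC datetime.
    return datetime.datetime.fromtimestamp(ts, datetime.timezone.utc).strftime("%Y%m%d")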
# IANA timezone -> (region_token, lang) for CONSENT cookie coherence.
# European zones map to `<lang>+<COUNTRY>+NNN`; the Asian zones below keep the
# `FX` region token with a local language (`<lang>+FX+NNN`).
# Any tz not in this map falls back to `en+FX+NNN`.
_TZ_TO_REGION = {
"Europe/Rome": ("IT", "it"),
"Europe/Berlin": ("DE", "de"),
"Europe/Paris": ("FR", "fr"),
"Europe/Madrid": ("ES", "es"),
"Europe/London": ("GB", "en"),
"Europe/Amsterdam": ("NL", "nl"),
"Europe/Brussels": ("BE", "fr"),
"Europe/Vienna": ("AT", "de"),
"Europe/Zurich": ("CH", "de"),
"Europe/Dublin": ("IE", "en"),
"Europe/Lisbon": ("PT", "pt"),
"Europe/Stockholm": ("SE", "sv"),
"Europe/Oslo": ("NO", "no"),
"Europe/Copenhagen": ("DK", "da"),
"Europe/Helsinki": ("FI", "fi"),
"Europe/Warsaw": ("PL", "pl"),
"Europe/Prague": ("CZ", "cs"),
"Europe/Athens": ("GR", "el"),
"Asia/Tokyo": ("FX", "ja"),
"Asia/Shanghai": ("FX", "zh"),
"Asia/Hong_Kong": ("FX", "zh"),
"Asia/Seoul": ("FX", "ko"),
}
def _consent_region_lang(timezone: Optional[str]) -> tuple:
"""Map IANA tz → (region_token, lang_2char) for CONSENT cookie.
Default `("FX", "en")` for US/unknown."""
if timezone and timezone in _TZ_TO_REGION:
return _TZ_TO_REGION[timezone]
return ("FX", "en")
# ---------------------------------------------------------------------------
# .google.com cookie batch (always present, regardless of browsing history)
# ---------------------------------------------------------------------------
def _google_cookies(rng: random.Random, now: int,
timezone: Optional[str] = None) -> List[dict]:
consent_age = rng.randint(60, 720) * 86400
region, lang = _consent_region_lang(timezone)
# NID 3-digit prefix range broadened to 100-540 to cover historical NID
# versions (137, 105, 511, 525 etc. observed in real captures).
return [
{"name": "NID",
"value": f"{rng.randint(100, 540)}={_b64_rand(rng, 178)}",
"domain": ".google.com", "path": "/",
"expires": now + 180 * 86400,
"httpOnly": True, "secure": True, "sameSite": "None"},
{"name": "CONSENT",
"value": f"YES+cb.{_yyyymmdd_utc(now - consent_age)}-"
f"{rng.randint(10, 19):02d}-p{rng.randint(0, 9)}."
f"{lang}+{region}+{rng.randint(100, 999)}",
"domain": ".google.com", "path": "/",
"expires": now + 395 * 86400,
"secure": True, "sameSite": "Lax"},
# 1P_JAR removed: Google deprecated it in 2022. Including it now is
# an anachronism flag for fingerprinters that look at cookie freshness.
{"name": "SOCS",
"value": f"CAES{_b64_rand(rng, 56)}",
"domain": ".google.com", "path": "/",
"expires": now + 395 * 86400,
"secure": True, "sameSite": "Lax"},
{"name": "_GRECAPTCHA",
"value": _b64_rand(rng, 124),
"domain": ".google.com", "path": "/",
"expires": now + 180 * 86400,
"secure": True, "sameSite": "None"},
{"name": "ENID",
"value": _b64_rand(rng, 252),
"domain": ".google.com", "path": "/",
"expires": now + 395 * 86400,
"httpOnly": True, "secure": True, "sameSite": "Lax"},
]
# ---------------------------------------------------------------------------
# Per-site cookie generators (recipes keyed by site["cookie_profile"])
# ---------------------------------------------------------------------------
def _norm_domain(domain: str) -> str:
return domain if domain.startswith(".") else "." + domain
def _ga_cookie(rng: random.Random, now: int, domain: str) -> dict:
first_age = rng.randint(7, 395) * 86400
return {"name": "_ga",
"value": f"GA1.2.{rng.randint(100000000, 999999999)}.{now - first_age}",
"domain": domain, "path": "/",
"expires": now + 395 * 86400,
"secure": True, "sameSite": "Lax"}
def _gid_cookie(rng: random.Random, now: int, domain: str) -> dict:
return {"name": "_gid",
"value": f"GA1.2.{rng.randint(100000000, 999999999)}.{now - rng.randint(60, 86400)}",
"domain": domain, "path": "/",
"expires": now + 86400,
"secure": True, "sameSite": "Lax"}
def _cf_bm_cookie(rng: random.Random, now: int, domain: str) -> dict:
return {"name": "__cf_bm",
"value": f"{_b64_rand(rng, 43)}.{rng.randint(1700000000, now)}-1-1-1-1",
"domain": domain, "path": "/",
"expires": now + 1800,
"secure": True, "sameSite": "None"}
def _onetrust_cookie(rng: random.Random, now: int, domain: str) -> dict:
age_d = rng.randint(7, 365)
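# utcfromtimestamp() is deprecated since Python 3.12; use an aware UTC datetime.
iso = datetime.datetime.fromtimestamp(
now - age_d * 86400, datetime.timezone.utc
).strftime("%Y-%m-%dT%H:%M:%S.000Z")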
return {"name": "OptanonAlertBoxClosed",
"value": iso,
"domain": domain, "path": "/",
"expires": now + 395 * 86400,
"secure": True, "sameSite": "Lax"}
def _cookieyes_cookie(rng: random.Random, now: int, domain: str) -> dict:
return {"name": "cookieyes-consent",
"value": "consentid:" + _b64_rand(rng, 28) +
",consent:yes,action:yes,necessary:yes,functional:yes,analytics:yes",
"domain": domain, "path": "/",
"expires": now + 395 * 86400,
"secure": True, "sameSite": "Lax"}
def _clarity_cookie(rng: random.Random, now: int, domain: str) -> dict:
return {"name": "_clck",
"value": f"{_hex_rand(rng, 8)}|2|f{rng.randint(10, 99)}|0|"
f"{now - rng.randint(60, 180) * 86400}",
"domain": domain, "path": "/",
"expires": now + 365 * 86400,
"secure": True, "sameSite": "Lax"}
def _fbp_cookie(rng: random.Random, now: int, domain: str) -> dict:
"""Facebook Pixel _fbp = fb.<subdomain_index>.<unix_ms>.<random_int>"""
return {"name": "_fbp",
"value": f"fb.1.{(now - rng.randint(60, 30*86400)) * 1000}."
f"{rng.randint(100000000, 9999999999)}",
"domain": domain, "path": "/",
"expires": now + 90 * 86400,
"secure": True, "sameSite": "Lax"}
def _gtm_cookie(rng: random.Random, now: int, domain: str) -> dict:
"""_dc_gtm_<container_id>=1 — Google Tag Manager throttle flag."""
container = f"UA-{rng.randint(10000000, 99999999)}-{rng.randint(1, 9)}"
return {"name": f"_dc_gtm_{container}",
"value": "1",
"domain": domain, "path": "/",
"expires": now + 60,
"secure": True, "sameSite": "Lax"}
def _hssrc_cookie(rng: random.Random, now: int, domain: str) -> dict:
"""HubSpot referrer flag — small int."""
return {"name": "__hssrc",
"value": str(rng.randint(1, 5)),
"domain": domain, "path": "/",
"expires": now + 1800,
"secure": True, "sameSite": "Lax"}
def _cookies_for_profile(profile: str, rng: random.Random,
now: int, domain: str) -> List[dict]:
"""Map cookie_profile tag (from browsing_pool.json) → concrete cookies.
Each recipe is a realistic combination observed on real production sites
in that category. Cookie age and sub-recipe variance (e.g., OneTrust vs
CookieYes for consent banner) are deterministic from rng.
"""
domain = _norm_domain(domain)
if profile == "minimal":
return [_ga_cookie(rng, now, domain)]
if profile == "ga_only":
out = [_ga_cookie(rng, now, domain), _gid_cookie(rng, now, domain)]
# 30% chance of GTM helper paired with GA
if rng.random() < 0.3:
out.append(_gtm_cookie(rng, now, domain))
return out
if profile == "ga_cf":
return [_ga_cookie(rng, now, domain), _cf_bm_cookie(rng, now, domain)]
if profile == "ga_consent":
out = [_ga_cookie(rng, now, domain), _gid_cookie(rng, now, domain)]
out.append(_onetrust_cookie(rng, now, domain) if rng.random() < 0.5
else _cookieyes_cookie(rng, now, domain))
if rng.random() < 0.4:
out.append(_gtm_cookie(rng, now, domain))
return out
if profile == "ga_consent_clarity":
# Heavy-tracking site profile: GA + Clarity + consent + often FB pixel
out = [_ga_cookie(rng, now, domain), _gid_cookie(rng, now, domain),
_clarity_cookie(rng, now, domain)]
out.append(_onetrust_cookie(rng, now, domain) if rng.random() < 0.5
else _cookieyes_cookie(rng, now, domain))
if rng.random() < 0.5:
out.append(_fbp_cookie(rng, now, domain))
if rng.random() < 0.4:
out.append(_gtm_cookie(rng, now, domain))
if rng.random() < 0.25:
out.append(_hssrc_cookie(rng, now, domain))
return out
# Unknown profile → safe fallback
return [_ga_cookie(rng, now, domain)]
# ---------------------------------------------------------------------------
# Public builder
# ---------------------------------------------------------------------------
def build_cookies(seed: int,
browsing_history: Optional[List[dict]] = None,
now: Optional[int] = None,
timezone: Optional[str] = None) -> List[dict]:
"""Build the full cookie list for a persona.
Args:
seed: persona integer seed (from `Profile.seed`)
browsing_history: list of {name, category, cookie_profile} dicts as
sampled by `_fpforge.derive_browsing_history`. None → empty list
(only the 5 google cookies are returned).
now: unix-seconds timestamp; defaults to current time. Pin for tests.
timezone: IANA tz used to derive CONSENT cookie's `lang+region` token
(e.g. "Europe/Rome" → "it+IT", "America/New_York" → "en+FX").
"""
ts = now if now is not None else int(time.time())
cookies: List[dict] = []
# 5 .google.com cookies (always) — CONSENT lang derived from tz
rng_g = random.Random(_sub_seed(int(seed), "google"))
cookies.extend(_google_cookies(rng_g, ts, timezone=timezone))
# Per-site cookies (deterministic from seed × domain)
for site in (browsing_history or []):
rng_d = random.Random(_sub_seed(int(seed), f"dom:{site['name']}"))
cookies.extend(_cookies_for_profile(
site.get("cookie_profile", "minimal"), rng_d, ts, site["name"]
))
return cookies
def _extract_seed_and_history(profile: Any) -> tuple:
"""Accept a Profile object or a bare int seed; return (seed, history)."""
if isinstance(profile, int):
return int(profile), []
seed = int(getattr(profile, "seed"))
history = list(getattr(profile, "browsing_history", []) or [])
return seed, history
async def seed_recaptcha_cookies_async(context: Any, profile: Any,
timezone: Optional[str] = None) -> None:
"""Async: inject deterministic persona cookies into the context."""
seed, history = _extract_seed_and_history(profile)
cookies = build_cookies(seed, history, timezone=timezone)
try:
await context.add_cookies(cookies)
except Exception:
pass
def seed_recaptcha_cookies_sync(context: Any, profile: Any,
timezone: Optional[str] = None) -> None:
"""Sync: inject deterministic persona cookies into the context."""
seed, history = _extract_seed_and_history(profile)
cookies = build_cookies(seed, history, timezone=timezone)
try:
context.add_cookies(cookies)
except Exception:
pass
__all__ = [
"build_cookies",
"seed_recaptcha_cookies_async",
"seed_recaptcha_cookies_sync",
]
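
The per-site loop in `build_cookies` gives each domain its own `random.Random`, seeded from the persona seed plus a label. The body of `_sub_seed` is not shown in this diff, so the sketch below uses a hypothetical `sub_seed` built on SHA-256 to illustrate why that layout keeps one site's cookies stable regardless of which other sites are in the history. The function names and the hashing scheme are assumptions, not the project's implementation.

```python
import hashlib
import random


def sub_seed(seed: int, label: str) -> int:
    """Hypothetical stand-in for `_sub_seed`: hash (seed, label) to a stable int.

    Python's built-in hash() of a str is salted per process, so a real digest
    is used here to keep the result reproducible across runs.
    """
    digest = hashlib.sha256(f"{seed}:{label}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def fake_cookie_value(seed: int, domain: str) -> str:
    """Derive a reproducible pseudo-random token for one (seed, domain) pair."""
    # A fresh generator per domain: draws made for other domains cannot
    # shift the sequence this domain sees.
    rng = random.Random(sub_seed(seed, f"dom:{domain}"))
    return str(rng.randint(100000000, 9999999999))
```

Because every domain draws from its own generator, adding or removing an entry in `browsing_history` should leave the cookies of the remaining domains unchanged for a given seed.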

View file

@ -3,12 +3,14 @@ from __future__ import annotations
import asyncio import asyncio
import secrets import secrets
from pathlib import Path
from typing import Any, Dict, Optional, Union from typing import Any, Dict, Optional, Union
from playwright.async_api import Browser, Playwright, async_playwright from playwright.async_api import Browser, BrowserContext, Playwright, async_playwright
from ._fpforge import Profile, generate_profile from ._fpforge import Profile, generate_profile
from ._headless import make_virtual_display from ._geo import prepare_session_geo
from ._headless import cloak_prefs, make_virtual_display
from ._proxy import configure_proxy as _configure_proxy_shared from ._proxy import configure_proxy as _configure_proxy_shared
from .download import ensure_binary from .download import ensure_binary
from .launcher import _CHROME_H, _CHROME_W, _TASKBAR_H, _tz_env from .launcher import _CHROME_H, _CHROME_W, _TASKBAR_H, _tz_env
@ -33,8 +35,8 @@ def _patch_new_page_sleep(ctx: Any) -> None:
ctx.new_page = patched_new_page # type: ignore[assignment] ctx.new_page = patched_new_page # type: ignore[assignment]
class Stealthfox: class InvisiblePlaywright:
"""Async context manager — see stealthfox.Stealthfox for the sync variant.""" """Async context manager — see invisible_playwright.InvisiblePlaywright for the sync variant."""
def __init__( def __init__(
self, self,
@ -49,6 +51,8 @@ class Stealthfox:
timezone: str = "", timezone: str = "",
extra_prefs: Optional[Dict[str, Any]] = None, extra_prefs: Optional[Dict[str, Any]] = None,
binary_path: Optional[str] = None, binary_path: Optional[str] = None,
profile_dir: Optional[Union[str, Path]] = None,
prep_recaptcha: bool = False,
) -> None: ) -> None:
# See sync launcher: `zoom.stealth.fpp.hw_seed` is int32_t — clamp. # See sync launcher: `zoom.stealth.fpp.hw_seed` is int32_t — clamp.
self.seed: int = int(seed) if seed is not None else secrets.randbits(31) self.seed: int = int(seed) if seed is not None else secrets.randbits(31)
@ -61,13 +65,28 @@ class Stealthfox:
self._timezone = timezone self._timezone = timezone
self._extra_prefs = extra_prefs self._extra_prefs = extra_prefs
self._binary_path = binary_path self._binary_path = binary_path
self._profile_dir: Optional[Path] = Path(profile_dir) if profile_dir else None
# reCAPTCHA pre-seed gated server-side; respect persistent profile.
self._prep_recaptcha = bool(prep_recaptcha) and self._profile_dir is None
self._profile: Profile = generate_profile(self.seed, pin=self._pin) self._profile: Profile = generate_profile(self.seed, pin=self._pin)
self._pw: Optional[Playwright] = None self._pw: Optional[Playwright] = None
self._browser: Optional[Browser] = None self._browser: Optional[Browser] = None
self._persistent_context: Optional[BrowserContext] = None
self._virtual_display: Any = None self._virtual_display: Any = None
# Proxy egress IP (WebRTC srflx override); discovered in __aenter__.
self._webrtc_egress_ip: Optional[str] = None
async def __aenter__(self) -> Browser: async def __aenter__(self) -> Union[Browser, BrowserContext]:
import sys as _sys import sys as _sys
# Resolve timezone="auto" AND discover the proxy egress IP in one
# round-trip, off the event loop, before anything reads self._timezone
# or builds prefs/env. Fail-early if a proxy is set but the egress
# can't be resolved.
_geo = await asyncio.to_thread(
prepare_session_geo, self._timezone, self._proxy
)
self._timezone = _geo.timezone
self._webrtc_egress_ip = _geo.egress_ip
executable = self._binary_path or ensure_binary() executable = self._binary_path or ensure_binary()
prefs = translate_profile_to_prefs( prefs = translate_profile_to_prefs(
self._profile, self._profile,
@ -76,6 +95,15 @@ class Stealthfox:
extra_prefs=self._extra_prefs, extra_prefs=self._extra_prefs,
virtual_display=bool(self._headless and _sys.platform == "win32"), virtual_display=bool(self._headless and _sys.platform == "win32"),
) )
# Windows & macOS hide the headless window via the binary's own cloak
# (DWMWA_CLOAK / NSWindow alpha) — inject the pref so the patched build
# cloaks its chrome windows. setdefault: an explicit user override wins.
# (Mirrors launcher._build_prefs; the sync path always did this, async
# didn't — so async headless=True never cloaked AND crashed below.)
if self._headless and _sys.platform in ("win32", "darwin"):
for _k, _v in cloak_prefs().items():
prefs.setdefault(_k, _v)
# stealthfox.* is the namespace the binary's Juggler reads (see launcher.py note).
prefs["stealthfox.humanize"] = bool(self._humanize) prefs["stealthfox.humanize"] = bool(self._humanize)
if self._humanize: if self._humanize:
cap = 1.5 if self._humanize is True else float(self._humanize) cap = 1.5 if self._humanize is True else float(self._humanize)
@ -85,6 +113,24 @@ class Stealthfox:
env = self._build_env() env = self._build_env()
try: try:
self._pw = await async_playwright().start() self._pw = await async_playwright().start()
if self._profile_dir is not None:
# See sync launcher for the persistent-context rationale.
self._profile_dir.mkdir(parents=True, exist_ok=True)
# firefox-5 ships the C++ overrideTimezone IDL method (C7
# closure), so locale + timezone_id now propagate cleanly
# to the persistent context without hanging the launch.
self._persistent_context = await self._pw.firefox.launch_persistent_context(
user_data_dir=str(self._profile_dir),
executable_path=str(executable),
headless=pw_headless,
firefox_user_prefs=prefs,
proxy=playwright_proxy,
args=self._extra_args,
env=env,
**self._default_context_kwargs(),
)
_patch_new_page_sleep(self._persistent_context)
return self._persistent_context
self._browser = await self._pw.firefox.launch( self._browser = await self._pw.firefox.launch(
executable_path=str(executable), executable_path=str(executable),
headless=pw_headless, headless=pw_headless,
@ -102,12 +148,18 @@ class Stealthfox:
def _patch_new_context_defaults(self, browser: Browser) -> None: def _patch_new_context_defaults(self, browser: Browser) -> None:
original = browser.new_context original = browser.new_context
defaults = self._default_context_kwargs() defaults = self._default_context_kwargs()
prep = self._prep_recaptcha
profile = self._profile # pass the whole Profile (seed + browsing_history)
tz = self._timezone # used by _recaptcha_seed for CONSENT lang+region
async def patched(**kw): async def patched(**kw):
merged = dict(defaults) merged = dict(defaults)
merged.update(kw) merged.update(kw)
ctx = await original(**merged) ctx = await original(**merged)
_patch_new_page_sleep(ctx) _patch_new_page_sleep(ctx)
if prep:
from ._recaptcha_seed import seed_recaptcha_cookies_async
await seed_recaptcha_cookies_async(ctx, profile, timezone=tz)
return ctx return ctx
browser.new_context = patched # type: ignore[assignment] browser.new_context = patched # type: ignore[assignment]
@ -134,6 +186,12 @@ class Stealthfox:
await self._teardown() await self._teardown()
async def _teardown(self) -> None: async def _teardown(self) -> None:
if self._persistent_context is not None:
try:
await self._persistent_context.close()
except Exception:
pass
self._persistent_context = None
if self._browser is not None: if self._browser is not None:
try: try:
await self._browser.close() await self._browser.close()
@ -158,21 +216,30 @@ class Stealthfox:
env = _os.environ.copy() env = _os.environ.copy()
if self._timezone: if self._timezone:
env["TZ"] = _tz_env(self._timezone) env["TZ"] = _tz_env(self._timezone)
# Propagate STEALTHFOX_WEBRTC_PUBLIC_IP if the process set it — read # WebRTC srflx override: feed nICEr's nr_stealth_bridge the proxy egress
# by nICEr's nr_stealth_bridge to inject a synthetic srflx candidate # IP (caller's explicit env var wins, else the IP auto-discovered in
# matching the proxy egress IP. This avoids the StaticPref IPC # __aenter__) and drop IPv6 from gathering behind a proxy.
# propagation timing issue between parent and socket processes. webrtc_ip = (
if _os.environ.get("STEALTHFOX_WEBRTC_PUBLIC_IP"): _os.environ.get("STEALTHFOX_WEBRTC_PUBLIC_IP")
env["STEALTHFOX_WEBRTC_PUBLIC_IP"] = _os.environ["STEALTHFOX_WEBRTC_PUBLIC_IP"] or self._webrtc_egress_ip
)
if webrtc_ip:
env["STEALTHFOX_WEBRTC_PUBLIC_IP"] = webrtc_ip
env["STEALTHFOX_WEBRTC_DISABLE_IPV6"] = "1"
return env return env
def _resolve_headless(self) -> bool: def _resolve_headless(self) -> bool:
if not self._headless: if not self._headless:
return False return False
vd = make_virtual_display() vd = make_virtual_display()
# Linux: Xvfb to start. Windows/macOS: make_virtual_display() returns
# None (the binary self-cloaks via cloak_prefs injected in __aenter__),
# so there is nothing to start — guarding the None was the missing piece
# that made async headless=True crash with AttributeError on Windows.
if vd is not None:
vd.start() vd.start()
self._virtual_display = vd self._virtual_display = vd
return False return False
__all__ = ["Stealthfox"] __all__ = ["InvisiblePlaywright"]
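
The new `_build_env` logic picks the WebRTC public IP with a simple precedence: an explicit `STEALTHFOX_WEBRTC_PUBLIC_IP` set by the caller wins, otherwise the egress IP discovered in `__aenter__` is used, and IPv6 gathering is disabled whenever either is present. A minimal standalone sketch of that precedence, assuming a hypothetical helper `webrtc_env` that takes the base environment as a mapping rather than reading `os.environ`:

```python
from typing import Dict, Mapping, Optional


def webrtc_env(base_env: Mapping[str, str],
               discovered_ip: Optional[str]) -> Dict[str, str]:
    """Return a copy of base_env with the WebRTC override variables applied."""
    env = dict(base_env)
    # Caller's explicit variable wins; fall back to the auto-discovered IP.
    ip = base_env.get("STEALTHFOX_WEBRTC_PUBLIC_IP") or discovered_ip
    if ip:
        env["STEALTHFOX_WEBRTC_PUBLIC_IP"] = ip
        env["STEALTHFOX_WEBRTC_DISABLE_IPV6"] = "1"
    return env
```

With no explicit variable and no discovered IP, the environment is returned unchanged, which matches the direct (no proxy) case in the diff.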

View file

@ -0,0 +1,92 @@
"""Command-line interface for invisible_playwright."""
from __future__ import annotations
import argparse
import shutil
import sys
from . import __version__
from .constants import BINARY_VERSION, FIREFOX_UPSTREAM_VERSION
from .download import cache_root, ensure_binary
def _cmd_fetch(args: argparse.Namespace) -> int:
# --force: re-download even if already cached (drop the cached version dir,
# then let ensure_binary fetch it fresh). Useful to recover a corrupted cache
# or re-pull after a re-published release.
if getattr(args, "force", False):
from .download import cache_dir_for_version
d = cache_dir_for_version()
if d.exists():
shutil.rmtree(d, ignore_errors=True)
path = ensure_binary()
print(path)
return 0
def _cmd_path(_args: argparse.Namespace) -> int:
try:
path = ensure_binary()
except Exception as e:
print(f"error: {e}", file=sys.stderr)
return 1
print(path)
return 0
def _cmd_version(_args: argparse.Namespace) -> int:
print(f"invisible_playwright {__version__}")
print(f"BINARY_VERSION={BINARY_VERSION} (Firefox {FIREFOX_UPSTREAM_VERSION})")
return 0
def _cmd_clear_cache(_args: argparse.Namespace) -> int:
root = cache_root()
if root.exists():
shutil.rmtree(root)
print(f"removed: {root}")
else:
print(f"nothing to remove: {root}")
return 0
def build_parser() -> argparse.ArgumentParser:
p = argparse.ArgumentParser(prog="invisible-playwright", description="invisible_playwright CLI")
# Top-level `--version` / `-V` flag so `python -m invisible_playwright --version`
# works (Python convention), in addition to the existing `version` subcommand.
p.add_argument(
"-V", "--version", action="version",
version=f"invisible_playwright {__version__} (BINARY_VERSION={BINARY_VERSION}, Firefox {FIREFOX_UPSTREAM_VERSION})",
)
sub = p.add_subparsers(dest="cmd")
fetch_p = sub.add_parser("fetch", help="download the patched Firefox binary")
fetch_p.add_argument("--force", action="store_true",
help="re-download even if already cached")
sub.add_parser("path", help="print the absolute path to the cached binary")
sub.add_parser("version", help="print wrapper and binary versions")
sub.add_parser("clear-cache", help="remove all cached binaries")
return p
def main(argv: list[str] | None = None) -> int:
parser = build_parser()
args = parser.parse_args(argv)
if args.cmd is None:
# argparse-conventional: print usage + error message to stderr, exit 2.
# We can't keep `required=True` on the subparsers because that breaks
# the top-level `--version` flag (argparse demands a subcommand even
# when --version is the only token). parser.error() preserves the
# original "no subcommand" exit semantics tests expect.
parser.error("a subcommand is required (try --help, --version, or one of: fetch, path, version, clear-cache)")
dispatch = {
"fetch": _cmd_fetch,
"path": _cmd_path,
"version": _cmd_version,
"clear-cache": _cmd_clear_cache,
}
return dispatch[args.cmd](args)
if __name__ == "__main__":
sys.exit(main())
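
The CLI combines a top-level `--version` flag with optional subparsers and calls `parser.error()` when no subcommand is given. The sketch below reproduces that pattern in isolation; `demo-cli`, `build_demo_parser`, and `demo_main` are illustrative names and not part of the package.

```python
import argparse
from typing import List


def build_demo_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo-cli")
    # action="version" prints the string and exits with status 0.
    parser.add_argument("-V", "--version", action="version",
                        version="demo-cli 0.0.0")
    # No required=True here: the missing-subcommand case is handled by hand.
    sub = parser.add_subparsers(dest="cmd")
    fetch = sub.add_parser("fetch")
    fetch.add_argument("--force", action="store_true")
    sub.add_parser("path")
    return parser


def demo_main(argv: List[str]) -> int:
    parser = build_demo_parser()
    args = parser.parse_args(argv)
    if args.cmd is None:
        # parser.error() prints usage to stderr and exits with status 2.
        parser.error("a subcommand is required")
    print(f"would run {args.cmd!r} (force={getattr(args, 'force', False)})")
    return 0
```

Calling `demo_main(["fetch", "--force"])` returns 0, while `demo_main([])` raises `SystemExit` with code 2, the same exit status argparse uses for its own usage errors.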

View file

@ -0,0 +1,111 @@
"""Public helpers for building Firefox launch config without using ``InvisiblePlaywright``.
Use these when you need to call ``playwright.firefox.launch()`` (or
``firefox.launch_persistent_context()``) directly with our patched binary
and stealth prefs, instead of using the ``InvisiblePlaywright`` context
manager.
Typical caller is an external integration that owns its own browser
lifecycle (a Crawlee/Skyvern/changedetection-style fetcher, a Playwright
Server wrapper, a multi-language harness) and just wants the building
blocks::
from playwright.async_api import async_playwright
from invisible_playwright import ensure_binary, get_default_stealth_prefs
async with async_playwright() as p:
browser = await p.firefox.launch(
executable_path=str(ensure_binary()),
firefox_user_prefs=get_default_stealth_prefs(seed=42),
)
For everyday Python usage the ``InvisiblePlaywright`` context manager is
still the recommended entry point; these helpers expose the same internals
without the lifecycle ownership.
.. note::
When calling ``firefox.launch()`` yourself, pass ``headless=False`` and
manage the display hiding (Xvfb on Linux, hidden desktop on Windows)
externally. Passing ``headless=True`` directly to Playwright puts
Firefox in true headless mode, which skips the real rendering pipeline
and breaks canvas / audio / WebGL fingerprint coherence. The
``InvisiblePlaywright`` context manager does this translation
automatically; the public helpers leave it to the caller.
"""
from __future__ import annotations
import secrets
from typing import Any, Dict, List, Optional, Union
from ._fpforge import generate_profile
from .prefs import translate_profile_to_prefs
def get_default_stealth_prefs(
seed: Optional[int] = None,
*,
pin: Optional[Dict[str, Any]] = None,
locale: str = "en-US",
timezone: str = "",
extra_prefs: Optional[Dict[str, Any]] = None,
humanize: Union[bool, float] = True,
virtual_display: bool = False,
) -> Dict[str, Any]:
"""Build a complete ``firefox_user_prefs`` dict for ``firefox.launch()``.
Same prefs that ``InvisiblePlaywright(seed=..., locale=..., timezone=...,
extra_prefs=..., humanize=...)`` would inject. Use this when you need to
drive ``playwright.firefox.launch()`` yourself.
Args:
seed: Integer seed for the Bayesian fingerprint sampler. Same seed
produces the same fingerprint. ``None`` generates a fresh
random int31 (matches ``InvisiblePlaywright`` default).
pin: Optional dict forcing specific fingerprint fields while the
rest stays seed-derived. See ``docs/pinning.md``.
locale: BCP-47 tag (e.g. ``"en-US"``). Drives ``Accept-Language``
and ``navigator.language``.
timezone: IANA timezone (e.g. ``"America/New_York"``). Empty means
use the host TZ. This pure pref builder does NOT resolve
``"auto"`` (that needs the proxy + a network lookup at launch
time); pass a concrete zone here, or use ``InvisiblePlaywright``
/ ``resolve_session_timezone(timezone, proxy)`` for ``"auto"``.
extra_prefs: Optional dict overlaid LAST onto the generated prefs.
humanize: When True (default), every mouse move is expanded into
a Bezier trajectory by the patched Juggler. A float caps the
motion in seconds. False disables the behavior.
virtual_display: When True on Windows, apply GPU-disabling prefs
to prevent GPU process crashes on virtual desktops without
D3D11 backend.
Returns:
Dict ready to pass as ``firefox_user_prefs=`` to
``playwright.firefox.launch()`` or ``launch_persistent_context()``.
"""
resolved_seed = int(seed) if seed is not None else secrets.randbits(31)
profile = generate_profile(resolved_seed, pin=pin)
prefs = translate_profile_to_prefs(
profile,
locale=locale,
timezone=timezone,
extra_prefs=extra_prefs,
virtual_display=virtual_display,
)
# stealthfox.* is the namespace the binary's Juggler reads (see launcher.py note).
prefs["stealthfox.humanize"] = bool(humanize)
if humanize:
max_seconds = float(humanize) if not isinstance(humanize, bool) else 1.5
prefs["stealthfox.humanize.maxTime"] = str(max_seconds)
return prefs
def get_default_args() -> List[str]:
"""Return the default Firefox CLI args to pass via ``args=``.
Currently an empty list, since all our stealth configuration is delivered
via ``firefox_user_prefs`` rather than CLI flags. Exposed for parity
with the ``cloakbrowser.config.get_default_stealth_args`` pattern and
to future-proof integrations that already wire ``args=[*existing,
*get_default_args()]``.
"""
return []
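
The `humanize` argument accepts either a bool or a float cap, and `get_default_stealth_prefs` turns it into two `stealthfox.humanize*` prefs. The following sketch isolates just that mapping so the edge cases are easy to see; `humanize_prefs` is a hypothetical helper name, and the pref keys and the 1.5 second default are copied from the code above.

```python
from typing import Any, Dict, Union


def humanize_prefs(humanize: Union[bool, float] = True) -> Dict[str, Any]:
    """Map the humanize option to the two prefs the patched build reads."""
    prefs: Dict[str, Any] = {"stealthfox.humanize": bool(humanize)}
    if humanize:
        # bool is a subclass of int, so test for it before treating the
        # value as a numeric cap in seconds.
        cap = 1.5 if isinstance(humanize, bool) else float(humanize)
        prefs["stealthfox.humanize.maxTime"] = str(cap)
    return prefs
```

Note that a cap of `0.0` is falsy, so under this mapping it disables humanization rather than setting a zero-second limit.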

View file

@ -0,0 +1,80 @@
"""Compile-time constants that pin the wrapper to a specific Firefox build.
BINARY_VERSION is bumped every time new Firefox patches are released. It is
deliberately decoupled from the Python package version so that pure-Python
bugfixes don't force a multi-hour Firefox rebuild.
"""
from __future__ import annotations
# Bump this when a new patched Firefox build is released on GitHub.
BINARY_VERSION: str = "firefox-10"
# Releases known to be broken — ensure_binary() refuses them with a clear error
# instead of handing the user an unusable binary. firefox-8 was packaged without
# the juggler automation layer, so Playwright cannot drive it (TargetClosedError);
# fixed in firefox-9 (package-manifest.in now ships chrome/juggler). A cached
# firefox-8 from before the bump would otherwise keep being used silently.
BROKEN_VERSIONS: frozenset[str] = frozenset({"firefox-8"})
# Underlying Firefox version (for display only; does not drive downloads).
FIREFOX_UPSTREAM_VERSION: str = "150.0.1"
# The base filename prefix used inside archives.
BINARY_BASENAME: str = f"firefox-{FIREFOX_UPSTREAM_VERSION}-stealth"
def ARCHIVE_NAME(platform_key: str, machine: str) -> str:
"""Return the platform-specific archive filename.
platform_key: sys.platform ("win32", "linux", "darwin")
machine: platform.machine() ("AMD64", "x86_64", "arm64", "aarch64", ...)
"""
pk = platform_key.lower()
m = machine.lower()
if m in {"amd64", "x86_64"}:
arch = "x86_64"
elif m in {"arm64", "aarch64"}:
arch = "arm64"
else:
raise NotImplementedError(f"unsupported arch: {machine}")
if pk == "win32":
return f"{BINARY_BASENAME}-win-{arch}.zip"
if pk == "linux":
return f"{BINARY_BASENAME}-linux-{arch}.tar.gz"
if pk == "darwin":
return f"{BINARY_BASENAME}-macos-{arch}.tar.gz"
raise NotImplementedError(f"unsupported platform: {platform_key}")
# Binary entry point relative path inside the extracted archive root.
# macOS ships the .app bundle (renamed to a stable "Firefox.app" by release.yml);
# the wrapper execs the inner binary directly, which sidesteps Gatekeeper.
BINARY_ENTRY_REL = {
"win32": "firefox.exe",
"linux": "firefox",
"darwin": "Firefox.app/Contents/MacOS/firefox",
}
# GitHub release URL template for the patched Firefox builds ({tag} is the
# binary version, {asset} the platform archive name).
RELEASE_URL_TEMPLATE = (
"https://github.com/feder-cr/invisible_playwright/releases/download/{tag}/{asset}"
)
# ─────────────────────────────────────────────────────────────────────────
# GeoIP database (timezone="auto" → resolve IANA zone from proxy egress IP)
# ─────────────────────────────────────────────────────────────────────────
# daijro/geoip-all-in-one merges IP2Location LITE + GeoLite2 + DB-IP into a
# single mmdb (country ISO + coordinates + IANA timezone via tzfpy), rebuilt
# weekly. GPL-3.0, so we DOWNLOAD it at runtime into the user cache (like the
# Firefox binary) rather than bundling it into this MIT package. The `-all`
# variant covers IPv4+IPv6. download.py tracks the LATEST release and refreshes
# weekly; GEOIP_MMDB_VERSION is only the cold-cache fallback when the GitHub
# API is unreachable on a machine that has never downloaded the DB.
GEOIP_REPO: str = "daijro/geoip-all-in-one"
GEOIP_MMDB_VERSION: str = "2026.06.03"
GEOIP_ASSET: str = "geoip-aio-all.mmdb.zip"
GEOIP_MMDB_NAME: str = "geoip-aio-all.mmdb"
GEOIP_RELEASE_URL_TEMPLATE: str = (
"https://github.com/daijro/geoip-all-in-one/releases/download/{tag}/{asset}"
)

View file

@ -0,0 +1,328 @@
"""Download and cache the patched Firefox binary from GitHub Releases."""
from __future__ import annotations
import hashlib
import os
import platform
import re
import shutil
import subprocess
import sys
import tarfile
import tempfile
import time
import zipfile
from pathlib import Path
import platformdirs
import requests
from .constants import (
ARCHIVE_NAME,
BINARY_ENTRY_REL,
BINARY_VERSION,
BROKEN_VERSIONS,
GEOIP_ASSET,
GEOIP_MMDB_NAME,
GEOIP_MMDB_VERSION,
GEOIP_RELEASE_URL_TEMPLATE,
RELEASE_URL_TEMPLATE,
)
def _github_token() -> str | None:
return os.environ.get("STEALTHFOX_GITHUB_TOKEN") or os.environ.get("GITHUB_TOKEN")
def _parse_owner_repo(template: str) -> tuple[str, str]:
"""Extract (owner, repo) from RELEASE_URL_TEMPLATE."""
m = re.match(r"https://github\.com/([^/]+)/([^/]+)/releases/", template)
if not m:
raise RuntimeError(f"cannot parse owner/repo from {template!r}")
return m.group(1), m.group(2)
def cache_root() -> Path:
"""Directory where all cached binaries live."""
return Path(platformdirs.user_cache_dir("invisible-playwright"))
def cache_dir_for_version(version: str = BINARY_VERSION) -> Path:
return cache_root() / version
def _resolve_asset_url(tag: str, asset_name: str) -> str:
"""Return a downloadable URL for the asset.
For private repos the direct `releases/download/<tag>/<asset>` URL returns
404 even with a token, so we resolve via the API: list assets for the
release tag, find the one matching `asset_name`, and use its API URL with
`Accept: application/octet-stream` (which 302-redirects to a signed URL).
For public repos the direct URL still works without a token.
"""
token = _github_token()
if not token:
return RELEASE_URL_TEMPLATE.format(tag=tag, asset=asset_name)
owner, repo = _parse_owner_repo(RELEASE_URL_TEMPLATE)
api = f"https://api.github.com/repos/{owner}/{repo}/releases/tags/{tag}"
r = requests.get(api, headers={"Authorization": f"token {token}"}, timeout=30)
r.raise_for_status()
for a in r.json().get("assets", []):
if a.get("name") == asset_name:
return a["url"]
raise RuntimeError(f"asset {asset_name!r} not found in release {tag!r}")
def _download_file(url: str, dst: Path, chunk_size: int = 1 << 16) -> None:
dst.parent.mkdir(parents=True, exist_ok=True)
headers: dict[str, str] = {}
token = _github_token()
if token and url.startswith("https://api.github.com/"):
headers["Authorization"] = f"token {token}"
headers["Accept"] = "application/octet-stream"
with requests.get(url, stream=True, timeout=60, headers=headers) as r:
r.raise_for_status()
with open(dst, "wb") as f:
for chunk in r.iter_content(chunk_size):
if chunk:
f.write(chunk)
def _sha256_file(path: Path) -> str:
h = hashlib.sha256()
with open(path, "rb") as f:
for chunk in iter(lambda: f.read(1 << 16), b""):
h.update(chunk)
return h.hexdigest()
def _parse_checksums(text: str) -> dict[str, str]:
out: dict[str, str] = {}
for line in text.splitlines():
line = line.strip()
if not line or line.startswith("#"):
continue
parts = line.split()
if len(parts) >= 2:
# sha256sum uses ' *' or ' ' prefix for binary vs text mode
key = parts[-1].lstrip("*")
out[key] = parts[0]
return out
def _extract(archive: Path, dst: Path) -> None:
dst.mkdir(parents=True, exist_ok=True)
if archive.suffix == ".zip":
with zipfile.ZipFile(archive) as zf:
zf.extractall(dst)
elif archive.name.endswith(".tar.gz") or archive.suffix in {".tgz", ".gz"}:
with tarfile.open(archive, "r:gz") as tf:
tf.extractall(dst)
else:
raise RuntimeError(f"unknown archive format: {archive}")
def _post_extract_darwin(app_root: Path, entry: Path) -> None:
"""Make an ad-hoc-signed .app launchable on macOS.
The .app is downloaded via requests (no Finder quarantine attached), but we
strip com.apple.quarantine defensively and ensure the inner binary is
executable. We exec the inner binary directly (not via LaunchServices), so
Gatekeeper's first-launch prompt does not apply; the ad-hoc signature
(applied in release.yml) is what lets the arm64 Mach-O run at all.
"""
app = app_root
# walk up to the .app bundle dir if entry points inside it
for parent in entry.parents:
if parent.name.endswith(".app"):
app = parent
break
try:
subprocess.run(["xattr", "-dr", "com.apple.quarantine", str(app)], check=False)
except FileNotFoundError:
pass
try:
entry.chmod(0o755)
except OSError:
pass
def ensure_binary(version: str = BINARY_VERSION) -> Path:
"""Return a path to a runnable Firefox executable. Download if needed."""
if version in BROKEN_VERSIONS:
raise RuntimeError(
f"{version} is a known-broken release (the juggler automation layer is "
f"missing, so Playwright cannot drive it). Upgrade invisible_playwright "
f"(current BINARY_VERSION={BINARY_VERSION}) or pass a newer version."
)
plat = sys.platform
mach = platform.machine()
asset = ARCHIVE_NAME(plat, mach)
entry_rel = BINARY_ENTRY_REL.get(plat)
if entry_rel is None:
raise NotImplementedError(f"no binary entry for platform {plat}")
version_dir = cache_dir_for_version(version)
entry = version_dir / entry_rel
if entry.exists():
return entry
url_archive = _resolve_asset_url(version, asset)
url_sums = _resolve_asset_url(version, "checksums.txt")
with tempfile.TemporaryDirectory() as td:
tmp = Path(td)
archive_path = tmp / asset
_download_file(url_archive, archive_path)
sums_path = tmp / "checksums.txt"
_download_file(url_sums, sums_path)
sums = _parse_checksums(sums_path.read_text())
expected = sums.get(asset)
if expected is None:
raise RuntimeError(f"no SHA256 for {asset} in checksums.txt")
actual = _sha256_file(archive_path)
if actual.lower() != expected.lower():
raise RuntimeError(
f"SHA256 mismatch for {asset}: got {actual}, expected {expected}"
)
_extract(archive_path, version_dir)
if plat == "darwin":
_post_extract_darwin(version_dir, entry)
if not entry.exists():
raise RuntimeError(f"binary not found after extraction: {entry}")
return entry
# ─────────────────────────────────────────────────────────────────────────
# GeoIP mmdb (timezone="auto" → map egress IP → IANA zone)
#
# daijro/geoip-all-in-one is rebuilt WEEKLY, so we don't pin a tag. We cache
# the latest mmdb and, once it's older than GEOIP_REFRESH_DAYS, re-check the
# latest release and pull a newer build if one exists. Net effect: no download
# (not even an API call) on a launch within the window; auto-refresh after it;
# a stale cache is reused when offline rather than breaking the launch.
# ─────────────────────────────────────────────────────────────────────────
GEOIP_REFRESH_DAYS = 7 # matches daijro's weekly rebuild cadence
def _geoip_root() -> Path:
return cache_root() / "geoip"
def _geoip_check_marker() -> Path:
return _geoip_root() / ".last_check"
def _cached_geoip_mmdb() -> Path | None:
"""Newest cached mmdb across tag dirs, or None. Tag dirs are date strings
(e.g. ``2026.06.03``) so a lexical sort is chronological."""
root = _geoip_root()
if not root.exists():
return None
cands = sorted(root.glob("*/*.mmdb"))
return cands[-1] if cands else None
def _geoip_cache_fresh(max_age_days: int) -> bool:
marker = _geoip_check_marker()
if not marker.exists():
return False
return (time.time() - marker.stat().st_mtime) < max_age_days * 86400
def _touch_geoip_marker() -> None:
m = _geoip_check_marker()
m.parent.mkdir(parents=True, exist_ok=True)
m.touch()
def _latest_geoip_tag() -> str:
"""Latest ``daijro/geoip-all-in-one`` release tag via the GitHub API."""
headers = {"Accept": "application/vnd.github+json"}
token = _github_token()
if token:
headers["Authorization"] = f"token {token}"
r = requests.get(
f"https://api.github.com/repos/{GEOIP_REPO}/releases/latest",
headers=headers, timeout=15,
)
r.raise_for_status()
tag = r.json().get("tag_name")
if not tag:
raise RuntimeError("no tag_name in geoip-all-in-one latest release")
return tag
def _download_geoip_tag(tag: str) -> Path:
"""Download + extract a specific tag's mmdb if not already cached."""
dst_dir = _geoip_root() / tag
target = dst_dir / GEOIP_MMDB_NAME
if not target.exists():
url = GEOIP_RELEASE_URL_TEMPLATE.format(tag=tag, asset=GEOIP_ASSET)
dst_dir.mkdir(parents=True, exist_ok=True)
with tempfile.TemporaryDirectory() as td:
archive = Path(td) / GEOIP_ASSET
_download_file(url, archive)
_extract(archive, dst_dir)
if target.exists():
return target
# asset name inside the zip may differ from GEOIP_MMDB_NAME
found = sorted(dst_dir.glob("*.mmdb"))
if found:
return found[0]
raise RuntimeError(f"geoip mmdb not found after extraction in {dst_dir}")
def _prune_old_geoip_tags(keep: str) -> None:
"""Drop every cached tag dir except ``keep`` to bound disk usage."""
root = _geoip_root()
if not root.exists():
return
for d in root.iterdir():
if d.is_dir() and d.name != keep:
shutil.rmtree(d, ignore_errors=True)
def geoip_mmdb_path() -> Path | None:
"""Path to the currently-cached mmdb (newest tag), or None if none cached."""
return _cached_geoip_mmdb()
def ensure_geoip_mmdb(max_age_days: int = GEOIP_REFRESH_DAYS) -> Path:
"""Return a geoip mmdb, kept fresh against daijro's weekly rebuild.
Resolution order:
1. ``STEALTHFOX_GEOIP_MMDB`` env → use that file (user-supplied / test).
2. A cached mmdb younger than ``max_age_days`` → use it (no network).
3. Else ask GitHub for the latest tag, download it if not already cached,
prune older tags, and reset the freshness timer.
4. If the API is unreachable but a cached mmdb exists → use it
(and reset the timer so we don't hammer the API while offline).
5. Cold cache + API unreachable → fall back to the pinned ``GEOIP_MMDB_VERSION``;
if that download also fails, raise.
"""
override = os.environ.get("STEALTHFOX_GEOIP_MMDB")
if override:
p = Path(override)
if not p.exists():
raise RuntimeError(f"STEALTHFOX_GEOIP_MMDB points to a missing file: {p}")
return p
cached = _cached_geoip_mmdb()
if cached and _geoip_cache_fresh(max_age_days):
return cached
try:
tag = _latest_geoip_tag()
except Exception:
if cached:
_touch_geoip_marker() # recheck after the window; don't hammer
return cached
tag = GEOIP_MMDB_VERSION # cold cache + API down → pinned fallback
mmdb = _download_geoip_tag(tag)
_prune_old_geoip_tags(mmdb.parent.name)
_touch_geoip_marker()
return mmdb
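
The resolution order documented in `ensure_geoip_mmdb` can be exercised without touching disk or network. The following is a minimal sketch, not the package's implementation: `resolve_mmdb` and all of its parameters are hypothetical stand-ins, with the GitHub lookup and the download injected as callables. It follows the docstring's step 4, so a failed download also falls back to a stale cache.

```python
from pathlib import Path
from typing import Callable, Optional


def resolve_mmdb(
    override: Optional[Path],
    cached: Optional[Path],
    cache_is_fresh: bool,
    latest_tag: Callable[[], str],
    download: Callable[[str], Path],
    pinned_tag: str,
) -> Path:
    """Hypothetical stand-in for the documented resolution order."""
    if override is not None:
        return override                  # 1. explicit override wins
    if cached is not None and cache_is_fresh:
        return cached                    # 2. fresh cache, no network
    try:
        tag = latest_tag()               # 3. ask for the newest tag
    except Exception:
        if cached is not None:
            return cached                # 4. API down, serve stale cache
        tag = pinned_tag                 # 5. cold cache, pinned fallback
    try:
        return download(tag)
    except Exception:
        if cached is not None:
            return cached                # 4. download down, serve stale cache
        raise                            # 5. nothing cached, nothing reachable


def offline() -> str:
    """Simulates an unreachable release API."""
    raise RuntimeError("offline")


# Cold cache with the API down: the pinned tag is what gets downloaded.
print(resolve_mmdb(None, None, False, offline, lambda t: Path(t) / "geo.mmdb", "pinned"))
```

Injecting the two network operations keeps each branch testable in isolation, which is the main reason to structure a sketch this way.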

View file

@ -1,13 +1,15 @@
"""Sync Playwright launcher for stealthfox.""" """Sync Playwright launcher for invisible_playwright."""
from __future__ import annotations from __future__ import annotations
import secrets import secrets
from pathlib import Path
from typing import Any, Dict, Optional, Union from typing import Any, Dict, Optional, Union
from playwright.sync_api import Browser, Playwright, sync_playwright from playwright.sync_api import Browser, BrowserContext, Playwright, sync_playwright
from ._fpforge import Profile, generate_profile from ._fpforge import Profile, generate_profile
from ._headless import make_virtual_display from ._geo import prepare_session_geo
from ._headless import cloak_prefs, make_virtual_display
from ._proxy import configure_proxy as _configure_proxy_shared from ._proxy import configure_proxy as _configure_proxy_shared
from .download import ensure_binary from .download import ensure_binary
from .prefs import translate_profile_to_prefs from .prefs import translate_profile_to_prefs
@ -68,30 +70,30 @@ def _tz_env(timezone: str) -> str:
return _IANA_TO_POSIX_TZ.get(timezone, timezone) return _IANA_TO_POSIX_TZ.get(timezone, timezone)
class Stealthfox: class InvisiblePlaywright:
"""Context manager launching a patched Firefox with a deterministic profile. """Context manager launching a patched Firefox with a deterministic profile.
Usage: Usage:
from stealthfox import Stealthfox from invisible_playwright import InvisiblePlaywright
# random seed (different fingerprint each call) # random seed (different fingerprint each call)
with Stealthfox() as browser: with InvisiblePlaywright() as browser:
page = browser.new_page() page = browser.new_page()
page.goto("https://example.com") page.goto("https://example.com")
# explicit seed → same profile every time # explicit seed → same profile every time
with Stealthfox(seed=42) as browser: with InvisiblePlaywright(seed=42) as browser:
... ...
# human-like cursor motion (Bezier trajectory on every mousemove) # human-like cursor motion (Bezier trajectory on every mousemove)
with Stealthfox(humanize=True) as browser: with InvisiblePlaywright(humanize=True) as browser:
... ...
Optional ``pin`` forces specific fingerprint fields while the rest still Optional ``pin`` forces specific fingerprint fields while the rest still
varies with ``seed``:: varies with ``seed``::
with Stealthfox(seed=42, pin={"screen.width": 2560}) as browser: with InvisiblePlaywright(seed=42, pin={"screen.width": 2560}) as browser:
... ...
After construction, the chosen seed is available as ``self.seed``, useful After construction, the chosen seed is available as ``self.seed``, useful
@ -111,6 +113,8 @@ class Stealthfox:
timezone: str = "", timezone: str = "",
extra_prefs: Optional[Dict[str, Any]] = None, extra_prefs: Optional[Dict[str, Any]] = None,
binary_path: Optional[str] = None, binary_path: Optional[str] = None,
profile_dir: Optional[Union[str, Path]] = None,
prep_recaptcha: bool = False,
) -> None: ) -> None:
""" """
Args: Args:
@ -132,11 +136,26 @@ class Stealthfox:
a float caps the motion in seconds. a float caps the motion in seconds.
locale: BCP-47 tag (e.g. ``"en-US"``). Drives the locale: BCP-47 tag (e.g. ``"en-US"``). Drives the
``Accept-Language`` header and ``navigator.language``. ``Accept-Language`` header and ``navigator.language``.
timezone: IANA timezone (e.g. ``"America/New_York"``). Empty timezone: IANA zone (e.g. ``"America/New_York"``) used as-is
means use the host TZ. when set, the only way to force a specific zone. ``""``
(default) or ``"auto"`` ALWAYS resolves from the egress IP:
through the proxy when one is set, otherwise from the host's
own public IP (one lookup + an offline mmdb). On failure: with
a proxy it raises (a foreign proxy on the host TZ is the
``timezone_mismatch`` signal); without a proxy it falls back to
the host TZ so a transient lookup failure can't break launch.
extra_prefs: Optional dict of Firefox prefs overlayed on top extra_prefs: Optional dict of Firefox prefs overlayed on top
of the generated profile; useful for niche tweaks of the generated profile; useful for niche tweaks
without monkey-patching the package. without monkey-patching the package.
profile_dir: Path to a persistent Firefox profile directory.
When set, the session uses ``launch_persistent_context()``
so cookies, localStorage, sessionStorage, extensions, cache
and prefs are kept on disk between runs. ``__enter__``
returns a ``BrowserContext`` (not a ``Browser``); use it
directly: ``with InvisiblePlaywright(profile_dir=p) as ctx:
page = ctx.new_page()``. First run creates the dir;
subsequent runs reuse it. Pair with a stable ``seed=`` to
also pin the fingerprint identity across runs.
""" """
# Constrain to int31 — Firefox's `zoom.stealth.fpp.hw_seed` and # Constrain to int31 — Firefox's `zoom.stealth.fpp.hw_seed` and
# related stealth prefs are declared as ``int32_t`` in # related stealth prefs are declared as ``int32_t`` in
@ -154,12 +173,29 @@ class Stealthfox:
self._timezone = timezone self._timezone = timezone
self._extra_prefs = extra_prefs self._extra_prefs = extra_prefs
self._binary_path = binary_path self._binary_path = binary_path
self._profile_dir: Optional[Path] = Path(profile_dir) if profile_dir else None
# reCAPTCHA cookie pre-seed (opt-in). Gated here in the constructor: if a
# persistent profile_dir is in use, respect its existing cookies
# and DON'T enable pre-seed (the profile owns its own state).
self._prep_recaptcha = bool(prep_recaptcha) and self._profile_dir is None
self._profile: Profile = generate_profile(self.seed, pin=self._pin) self._profile: Profile = generate_profile(self.seed, pin=self._pin)
self._pw: Optional[Playwright] = None self._pw: Optional[Playwright] = None
self._browser: Optional[Browser] = None self._browser: Optional[Browser] = None
self._persistent_context: Optional[BrowserContext] = None
self._virtual_display: Any = None self._virtual_display: Any = None
# Proxy egress IP, discovered at launch (see __enter__). Feeds the
# WebRTC srflx override so the candidate matches the proxy IP, not the
# real host IP. None when no proxy is set.
self._webrtc_egress_ip: Optional[str] = None
def __enter__(self) -> Browser: def __enter__(self) -> Union[Browser, BrowserContext]:
# Resolve timezone="auto" (and the proxy-set-but-unset default) to a
# concrete IANA zone AND discover the proxy egress IP — one round-trip,
# before anything reads self._timezone or builds prefs/env. Fail-early
# if a proxy is set but the egress can't be resolved.
_geo = prepare_session_geo(self._timezone, self._proxy)
self._timezone = _geo.timezone
self._webrtc_egress_ip = _geo.egress_ip
executable = self._binary_path or ensure_binary() executable = self._binary_path or ensure_binary()
prefs = self._build_prefs() prefs = self._build_prefs()
playwright_proxy = _configure_proxy_shared(self._proxy, prefs) playwright_proxy = _configure_proxy_shared(self._proxy, prefs)
@ -168,6 +204,25 @@ class Stealthfox:
try: try:
self._pw = sync_playwright().start() self._pw = sync_playwright().start()
if self._profile_dir is not None:
# Persistent context — cookies / localStorage / extensions /
# prefs all live on disk between runs. Stealth prefs are
# re-injected via firefox_user_prefs on every launch (Playwright
# writes them to user.js, which overrides anything in
# prefs.js inside the persistent dir).
self._profile_dir.mkdir(parents=True, exist_ok=True)
self._persistent_context = self._pw.firefox.launch_persistent_context(
user_data_dir=str(self._profile_dir),
executable_path=str(executable),
headless=pw_headless,
firefox_user_prefs=prefs,
proxy=playwright_proxy,
args=self._extra_args,
env=env,
**self._persistent_context_kwargs(),
)
_patch_sync_new_page_sleep(self._persistent_context)
return self._persistent_context
self._browser = self._pw.firefox.launch( self._browser = self._pw.firefox.launch(
executable_path=str(executable), executable_path=str(executable),
headless=pw_headless, headless=pw_headless,
@ -185,6 +240,22 @@ class Stealthfox:
self._patch_new_context_defaults(self._browser) self._patch_new_context_defaults(self._browser)
return self._browser return self._browser
def _persistent_context_kwargs(self) -> Dict[str, Any]:
"""Context-level kwargs accepted by launch_persistent_context.
Identical to ``_default_context_kwargs``: viewport / screen / DPR /
color-scheme / locale / timezone_id. Up to firefox-4 we had to drop
locale and timezone_id because Playwright's per-realm overrides
called IDL methods (``docShell.languageOverride``,
``docShell.overrideTimezone``) that weren't exposed by our patched
build, causing launch_persistent_context to hang for 180s. From
firefox-5 (C7 closure), the C++ ``overrideTimezone`` method is
present and ``languageOverride`` was already there, so the
per-realm overrides land and the persistent context starts in
~20s like the non-persistent path.
"""
return self._default_context_kwargs()
def _patch_new_context_defaults(self, browser: Browser) -> None: def _patch_new_context_defaults(self, browser: Browser) -> None:
"""Wrap ``browser.new_context`` so its defaults derive from the """Wrap ``browser.new_context`` so its defaults derive from the
profile (viewport, screen, DPR, color-scheme). Users get a profile (viewport, screen, DPR, color-scheme). Users get a
@ -192,12 +263,18 @@ class Stealthfox:
""" """
original = browser.new_context original = browser.new_context
defaults = self._default_context_kwargs() defaults = self._default_context_kwargs()
prep = self._prep_recaptcha
profile = self._profile # pass the whole Profile (seed + browsing_history)
tz = self._timezone # used by _recaptcha_seed for CONSENT lang+region
def patched(**kw): def patched(**kw):
merged = dict(defaults) merged = dict(defaults)
merged.update(kw) # user-supplied wins merged.update(kw) # user-supplied wins
ctx = original(**merged) ctx = original(**merged)
_patch_sync_new_page_sleep(ctx) _patch_sync_new_page_sleep(ctx)
if prep:
from ._recaptcha_seed import seed_recaptcha_cookies_sync
seed_recaptcha_cookies_sync(ctx, profile, timezone=tz)
return ctx return ctx
browser.new_context = patched # type: ignore[assignment] browser.new_context = patched # type: ignore[assignment]
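
The wrapping pattern used by `_patch_new_context_defaults` (profile-derived defaults, with caller kwargs taking precedence) can be shown in isolation. This is a minimal sketch under stated assumptions: `with_defaults` and `fake_new_context` are hypothetical names, and the fake factory simply echoes its kwargs instead of creating a Playwright context.

```python
from typing import Any, Callable, Dict


def with_defaults(
    factory: Callable[..., Any], defaults: Dict[str, Any]
) -> Callable[..., Any]:
    """Wrap ``factory`` so ``defaults`` apply unless the caller overrides them."""

    def patched(**kw: Any) -> Any:
        merged = dict(defaults)  # copy so the shared defaults are never mutated
        merged.update(kw)        # caller-supplied keys win
        return factory(**merged)

    return patched


def fake_new_context(**kw: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for ``browser.new_context``: echoes its kwargs."""
    return kw


new_context = with_defaults(
    fake_new_context, {"locale": "en-US", "color_scheme": "light"}
)
print(new_context(locale="de-DE"))
```

Copying the defaults on every call matters: mutating the shared dict would leak one caller's overrides into the next context.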
@ -226,6 +303,12 @@ class Stealthfox:
self._teardown() self._teardown()
def _teardown(self) -> None: def _teardown(self) -> None:
if self._persistent_context is not None:
try:
self._persistent_context.close()
except Exception:
pass
self._persistent_context = None
if self._browser is not None: if self._browser is not None:
try: try:
self._browser.close() self._browser.close()
@ -257,6 +340,16 @@ class Stealthfox:
extra_prefs=self._extra_prefs, extra_prefs=self._extra_prefs,
virtual_display=bool(self._headless and _sys.platform == "win32"), virtual_display=bool(self._headless and _sys.platform == "win32"),
) )
# Windows & macOS hide the headless window via the binary's own cloak
# (DWMWA_CLOAK / NSWindow alpha) — inject the pref so the patched build
# cloaks its chrome windows. setdefault: an explicit user override wins.
if self._headless and _sys.platform in ("win32", "darwin"):
for _k, _v in cloak_prefs().items():
prefs.setdefault(_k, _v)
# Pref namespace MUST be stealthfox.* — that's what the binary's Juggler
# reads (PageHandler.js gates the Bezier mouse path on `stealthfox.humanize`).
# The old `invisible_playwright.*` name was a dead no-op (nothing read it), so
# humanize silently never fired and every click teleported the cursor.
prefs["stealthfox.humanize"] = bool(self._humanize) prefs["stealthfox.humanize"] = bool(self._humanize)
if self._humanize: if self._humanize:
prefs["stealthfox.humanize.maxTime"] = str(self._humanize_max_seconds()) prefs["stealthfox.humanize.maxTime"] = str(self._humanize_max_seconds())
@ -278,24 +371,34 @@ class Stealthfox:
env = _os.environ.copy() env = _os.environ.copy()
if self._timezone: if self._timezone:
env["TZ"] = _tz_env(self._timezone) env["TZ"] = _tz_env(self._timezone)
# Propagate STEALTHFOX_WEBRTC_PUBLIC_IP if the process set it — read # WebRTC srflx override: feed nICEr's nr_stealth_bridge the proxy egress
# by nICEr's nr_stealth_bridge to inject a synthetic srflx candidate # IP so the srflx candidate matches the proxy (not the real host the
# matching the proxy egress IP. This avoids the StaticPref IPC # UDP STUN would otherwise leak). An explicit env var set by the caller
# propagation timing issue between parent and socket processes. # wins; otherwise we use the egress IP auto-discovered in __enter__.
if _os.environ.get("STEALTHFOX_WEBRTC_PUBLIC_IP"): # Behind a proxy we also drop IPv6 from gathering (the disableIPv6 pref
env["STEALTHFOX_WEBRTC_PUBLIC_IP"] = _os.environ["STEALTHFOX_WEBRTC_PUBLIC_IP"] # is dead on FF150 — the bridge filter is the real switch).
webrtc_ip = (
_os.environ.get("STEALTHFOX_WEBRTC_PUBLIC_IP")
or self._webrtc_egress_ip
)
if webrtc_ip:
env["STEALTHFOX_WEBRTC_PUBLIC_IP"] = webrtc_ip
env["STEALTHFOX_WEBRTC_DISABLE_IPV6"] = "1"
return env return env
def _resolve_headless(self) -> bool: def _resolve_headless(self) -> bool:
"""Translate the user's ``headless`` flag. """Translate the user's ``headless`` flag.
When ``True``, we keep Firefox in headed mode (real rendering When ``True``, Firefox stays in headed mode (real rendering pipeline
pipeline coherent fingerprint) and hide the windows on a fresh coherent fingerprint) and the window is hidden: on Linux via a fresh
Xvfb (Linux) or hidden Windows desktop. Xvfb spawned here; on Windows/macOS via the binary's own window cloak
(the ``zoom.stealth.cloak_windows`` pref added in ``_build_prefs``), so
``make_virtual_display()`` returns ``None`` and nothing is spawned.
""" """
if not self._headless: if not self._headless:
return False return False
vd = make_virtual_display() vd = make_virtual_display()
if vd is not None:
vd.start() vd.start()
self._virtual_display = vd self._virtual_display = vd
return False return False
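
The WebRTC precedence described in `_build_env` (an explicit `STEALTHFOX_WEBRTC_PUBLIC_IP` from the caller wins, otherwise the auto-discovered egress IP is used, and IPv6 gathering is disabled whenever either is present) reduces to a few lines. The sketch below is illustrative only: `build_webrtc_env` is a hypothetical helper, and the addresses come from the documentation-reserved ranges.

```python
from typing import Dict, Mapping, Optional


def build_webrtc_env(
    base: Mapping[str, str], discovered_ip: Optional[str]
) -> Dict[str, str]:
    """Hypothetical helper mirroring the documented precedence."""
    env = dict(base)
    # An explicit value set by the caller beats the discovered egress IP.
    ip = base.get("STEALTHFOX_WEBRTC_PUBLIC_IP") or discovered_ip
    if ip:
        env["STEALTHFOX_WEBRTC_PUBLIC_IP"] = ip
        env["STEALTHFOX_WEBRTC_DISABLE_IPV6"] = "1"
    return env


# No proxy and no override: the environment is left untouched.
print(build_webrtc_env({}, None))
# Override present: it wins over the discovered address.
print(build_webrtc_env({"STEALTHFOX_WEBRTC_PUBLIC_IP": "198.51.100.9"}, "203.0.113.7"))
```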

View file

@ -208,15 +208,21 @@ _BASELINE: Dict[str, Any] = {
"privacy.fingerprintingProtection.pbmode": False, "privacy.fingerprintingProtection.pbmode": False,
"privacy.fingerprintingProtection.remoteOverrides.enabled": False, "privacy.fingerprintingProtection.remoteOverrides.enabled": False,
# WebRTC: enabled, no public IP leak. # WebRTC: enabled, looks like a real Firefox behind NAT, no real-IP leak.
# obfuscate_host_addresses=false: our C++ injection handles candidate # obfuscate_host_addresses=true → host candidate is `<uuid>.local` mDNS,
# selection; mDNS causes mDNS-IPC to hang in sandboxed content processes. # exactly like vanilla Firefox (BrowserLeaks "No Leak", Local IP "-").
# disableIPv6=true keeps IPv6 out of gathering (less entropy, no IPv6 leak). # The mDNS-IPC hang feared on older builds does NOT reproduce on FF150.
# The proxy-egress srflx is injected by our C++ (srflx swap §17 + fallback
# §17.B), fed the egress IP via STEALTHFOX_WEBRTC_PUBLIC_IP from
# launcher._build_env (auto-discovered from the proxy).
# IPv6: media.peerconnection.ice.disableIPv6 is DEAD on FF150 (read by no
# ICE-gathering code). The real switch is our zoom.stealth.webrtc.disable_ipv6
# (nICEr addrs.cpp filter) + the STEALTHFOX_WEBRTC_DISABLE_IPV6 env.
"media.peerconnection.enabled": True, "media.peerconnection.enabled": True,
"media.peerconnection.ice.no_host": False, "media.peerconnection.ice.no_host": False,
"media.peerconnection.ice.default_address_only": False, "media.peerconnection.ice.default_address_only": False,
"media.peerconnection.ice.obfuscate_host_addresses": False, "media.peerconnection.ice.obfuscate_host_addresses": True,
"media.peerconnection.ice.disableIPv6": True, "zoom.stealth.webrtc.disable_ipv6": True,
"media.peerconnection.ice.proxy_only": False, "media.peerconnection.ice.proxy_only": False,
"media.peerconnection.ice.relay_only": False, "media.peerconnection.ice.relay_only": False,
"media.peerconnection.use_document_iceservers": True, "media.peerconnection.use_document_iceservers": True,
@ -225,6 +231,17 @@ _BASELINE: Dict[str, Any] = {
"network.proxy.socks_remote_dns": True, "network.proxy.socks_remote_dns": True,
"network.proxy.failover_direct": False, "network.proxy.failover_direct": False,
# TLS ClientHello fingerprint — match stock Firefox byte-for-byte.
# The Playwright/Juggler Firefox build this binary derives from re-enables
# cipher 0xC009 (TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA), which retail Firefox
# 150 does NOT offer. That extra (17th) cipher shifts our JA3/JA4 away from
# any real Firefox (ja4 t13d1717h2 vs stock t13d1617h2). A ClientHello that
# matches no real browser is itself a consistency tell. Disabling it makes
# JA3/JA4/peetprint byte-identical to retail FF150 (verified on tls.peet.ws).
# Stock Firefox ships without 0xC009 and works on the whole web, so this only
# improves fingerprint consistency — it cannot break connectivity.
"security.ssl3.ecdhe_ecdsa_aes_128_sha": False,
# Safebrowsing — chatty and fingerprintable. # Safebrowsing — chatty and fingerprintable.
"browser.safebrowsing.malware.enabled": False, "browser.safebrowsing.malware.enabled": False,
"browser.safebrowsing.phishing.enabled": False, "browser.safebrowsing.phishing.enabled": False,
@ -289,13 +306,29 @@ _BASELINE: Dict[str, Any] = {
"network.dns.echconfig.enabled": False, "network.dns.echconfig.enabled": False,
"network.dns.use_https_rr_as_altsvc": False, "network.dns.use_https_rr_as_altsvc": False,
# === A/B VARIANT B: Fission disabled === # === Fission / site-isolation disabled (FF146 Playwright parity) ===
# Force single content-process model (e10s only, no BC outer/inner split). # Force a single content-process model. Three knobs are required in FF150:
# Diagnostic for the FF150 BC-swap theory: if peet_ws/fppro/sannysoft # upstream Playwright Firefox (FF146-based) only needed fission.autostart=False
# work with this off, the Juggler FF146 baseline breaks specifically on # because FF146's default isolation strategy was looser. FF150 ships with
# cross-process navigation tracking. # fission.webContentIsolationStrategy=1 (IsolateEverything) which still
# site-isolates cross-origin iframes into separate `webIsolated` content
# processes EVEN WHEN fission.autostart is False. From the parent process's
# point of view, those iframes get a Juggler Frame placeholder with no
# docShell, no URL, and an execution context that wraps the wrong global,
# so frame.evaluate() fails with cross-origin SOP errors and
# element_handle.content_frame() returns None.
#
# Pinning the strategy to 0 keeps every cross-origin web iframe in the
# parent's content process, where the Juggler code paths from the FF146
# era expect them. processCount.webIsolated=1 is kept as belt-and-suspenders
# in case some path still classifies an origin as webIsolated despite the
# strategy change. It costs nothing to leave.
#
# See issue #20 + tests/test_cross_origin_iframe.py for the regression
# sentinel that catches a future A/B flipping these back.
"fission.autostart": False, "fission.autostart": False,
"fission.autostart.session": False, "fission.autostart.session": False,
"fission.webContentIsolationStrategy": 0, # IsolateNothing
"dom.ipc.processCount.webIsolated": 1, "dom.ipc.processCount.webIsolated": 1,
@ -384,6 +417,21 @@ _WIN_VIRT_DESKTOP_WORKAROUNDS: Dict[str, Any] = {
# Bugzilla refs: 1798091, 1524591, 1229829. Lowering the GPU sandbox to 0 # Bugzilla refs: 1798091, 1524591, 1229829. Lowering the GPU sandbox to 0
# restores hardware compositor + functional WebGL on alt desktops. # restores hardware compositor + functional WebGL on alt desktops.
"security.sandbox.gpu.level": 0, "security.sandbox.gpu.level": 0,
# Same root cause as above, content process side. Wrapper repo issue #18
# (tab crash on cross-process navigation under headless=True). Sandbox
# content level > 4 puts content processes on the sandbox's own
# kAlternateWinstation (see security/sandbox/win/src/sandboxbroker/
# sandboxBroker.cpp line 1113-1114:
# `if (aSandboxLevel > 4) config->SetDesktop(kAlternateWinstation)`).
# Combined with our CreateDesktop alt-desktop, that puts browser process
# and content processes on DIFFERENT desktops. Cross-process navigation
# then fails window parenting between parent and child, the content
# process exits cleanly (exitCode=0, signal=null) and Playwright fires
# page.on('crash') ~10s after page load. Lowering content sandbox to 4
# keeps content processes on the same desktop as the browser process,
# which is what we want here (still tight enough — level 4 blocks
# file/registry write, network calls, hardware access).
"security.sandbox.content.level": 4,
} }
@ -520,12 +568,17 @@ def translate_profile_to_prefs(
prefs["privacy.spoof_english"] = 0 prefs["privacy.spoof_english"] = 0
if timezone: if timezone:
prefs["zoom.stealth.timezone"] = timezone # juggler.timezone.override is the SOLE source of truth read by the C++
# timezone chain (BrowsingContext::Attach/DidSet, ContentChild). The old
# zoom.stealth.timezone pref was declared in the yaml but read by NO
# code — dropped here on 2026-06-10 (see 20-our-patches.md §8).
prefs["juggler.timezone.override"] = timezone prefs["juggler.timezone.override"] = timezone
# Cross-process seed (canvas noise + DWrite gamma share this). # Cross-process seed (canvas noise + DWrite gamma share this). Only
# zoom.stealth.fpp.hw_seed is read by the C++; the old zoom.stealth.seed
# alias was never declared in the yaml and read by nothing — dropped
# 2026-06-10.
prefs["zoom.stealth.fpp.hw_seed"] = profile.seed prefs["zoom.stealth.fpp.hw_seed"] = profile.seed
prefs["zoom.stealth.seed"] = profile.seed
# Synthetic host ICE candidate — injected by C++ when addr_ct==0 (SOCKS5 # Synthetic host ICE candidate — injected by C++ when addr_ct==0 (SOCKS5
# proxy suppresses all local addresses so Firefox can't gather host cands). # proxy suppresses all local addresses so Firefox can't gather host cands).
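
The Fission comment above mentions a regression sentinel for the isolation prefs. That test is not shown in this diff, so the following is only a sketch of what such a check could look like: `EXPECTED_ISOLATION_PREFS` repeats the four values from `_BASELINE`, and `isolation_pref_drift` is a hypothetical helper. The type comparison is deliberate, because in Python `False == 0`, so a plain equality check would not notice a boolean swapped for an integer pref.

```python
from typing import Any, Dict, List

# Values copied from the _BASELINE hunk above.
EXPECTED_ISOLATION_PREFS: Dict[str, Any] = {
    "fission.autostart": False,
    "fission.autostart.session": False,
    "fission.webContentIsolationStrategy": 0,
    "dom.ipc.processCount.webIsolated": 1,
}


def isolation_pref_drift(prefs: Dict[str, Any]) -> List[str]:
    """Hypothetical sentinel: list every isolation pref that drifted."""
    problems: List[str] = []
    for key, expected in EXPECTED_ISOLATION_PREFS.items():
        actual = prefs.get(key)
        # Compare types too: False == 0 in Python, but the prefs differ.
        if type(actual) is not type(expected) or actual != expected:
            problems.append(f"{key}: expected {expected!r}, got {actual!r}")
    return problems


flipped = dict(EXPECTED_ISOLATION_PREFS)
flipped["fission.webContentIsolationStrategy"] = 1
print(isolation_pref_drift(flipped))
```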

View file

@ -0,0 +1,4 @@
"""Synchronous API — re-exports InvisiblePlaywright for parity with async_api."""
from .launcher import InvisiblePlaywright
__all__ = ["InvisiblePlaywright"]

View file

@ -1,22 +0,0 @@
"""stealthfox — Playwright wrapper for a patched Firefox with stealth profile.
Quickstart:
from stealthfox import Stealthfox
with Stealthfox() as browser: # random seed
page = browser.new_page()
page.goto("https://example.com")
with Stealthfox(seed=42) as browser: # deterministic
...
with Stealthfox(humanize=True) as browser: # human-like cursor motion
page = browser.new_page()
page.click("#submit") # expanded into a Bezier trajectory
"""
from .launcher import Stealthfox
from .constants import BINARY_VERSION, FIREFOX_UPSTREAM_VERSION
__version__ = "0.1.0"
__all__ = ["Stealthfox", "BINARY_VERSION", "FIREFOX_UPSTREAM_VERSION", "__version__"]

View file

@ -1,68 +0,0 @@
"""Command-line interface for stealthfox."""
from __future__ import annotations
import argparse
import shutil
import sys
from . import __version__
from .constants import BINARY_VERSION, FIREFOX_UPSTREAM_VERSION
from .download import cache_root, ensure_binary
def _cmd_fetch(_args: argparse.Namespace) -> int:
path = ensure_binary()
print(path)
return 0
def _cmd_path(_args: argparse.Namespace) -> int:
try:
path = ensure_binary()
except Exception as e:
print(f"error: {e}", file=sys.stderr)
return 1
print(path)
return 0
def _cmd_version(_args: argparse.Namespace) -> int:
print(f"stealthfox {__version__}")
print(f"BINARY_VERSION={BINARY_VERSION} (Firefox {FIREFOX_UPSTREAM_VERSION})")
return 0
def _cmd_clear_cache(_args: argparse.Namespace) -> int:
root = cache_root()
if root.exists():
shutil.rmtree(root)
print(f"removed: {root}")
else:
print(f"nothing to remove: {root}")
return 0
def build_parser() -> argparse.ArgumentParser:
p = argparse.ArgumentParser(prog="stealthfox", description="stealthfox CLI")
sub = p.add_subparsers(dest="cmd", required=True)
sub.add_parser("fetch", help="download the patched Firefox binary")
sub.add_parser("path", help="print the absolute path to the cached binary")
sub.add_parser("version", help="print wrapper and binary versions")
sub.add_parser("clear-cache", help="remove all cached binaries")
return p
def main(argv: list[str] | None = None) -> int:
args = build_parser().parse_args(argv)
dispatch = {
"fetch": _cmd_fetch,
"path": _cmd_path,
"version": _cmd_version,
"clear-cache": _cmd_clear_cache,
}
return dispatch[args.cmd](args)
if __name__ == "__main__":
sys.exit(main())

View file

@ -1,48 +0,0 @@
"""Compile-time constants that pin the wrapper to a specific Firefox build.
BINARY_VERSION is bumped every time new Firefox patches are released. It is
deliberately decoupled from the Python package version so that pure-Python
bugfixes don't force a multi-hour Firefox rebuild.
"""
from __future__ import annotations
# Bump this when a new patched Firefox build is released on GitHub.
BINARY_VERSION: str = "firefox-1"
# Underlying Firefox version (for display only; does not drive downloads).
FIREFOX_UPSTREAM_VERSION: str = "150.0.1"
# The base filename prefix used inside archives.
BINARY_BASENAME: str = f"firefox-{FIREFOX_UPSTREAM_VERSION}-stealth"
def ARCHIVE_NAME(platform_key: str, machine: str) -> str:
"""Return the platform-specific archive filename.
platform_key: sys.platform ("win32", "linux")
machine: platform.machine() ("AMD64", "x86_64", ...)
"""
pk = platform_key.lower()
m = machine.lower()
if m in {"amd64", "x86_64"}:
arch = "x86_64"
else:
raise NotImplementedError(f"unsupported arch: {machine}")
if pk == "win32":
return f"{BINARY_BASENAME}-win-{arch}.zip"
if pk == "linux":
return f"{BINARY_BASENAME}-linux-{arch}.tar.gz"
raise NotImplementedError(f"unsupported platform: {platform_key}")
# Binary entry point relative path inside the extracted archive root.
BINARY_ENTRY_REL = {
"win32": "firefox.exe",
"linux": "firefox",
}
# GitHub release URL template. The "TODO" owner is resolved at publication time.
RELEASE_URL_TEMPLATE = (
"https://github.com/feder-cr/stealthfox/releases/download/{tag}/{asset}"
)

View file

@ -1,151 +0,0 @@
"""Download and cache the patched Firefox binary from GitHub Releases."""
from __future__ import annotations
import hashlib
import os
import platform
import re
import sys
import tarfile
import tempfile
import zipfile
from pathlib import Path
import platformdirs
import requests
from .constants import (
ARCHIVE_NAME,
BINARY_ENTRY_REL,
BINARY_VERSION,
RELEASE_URL_TEMPLATE,
)
def _github_token() -> str | None:
return os.environ.get("STEALTHFOX_GITHUB_TOKEN") or os.environ.get("GITHUB_TOKEN")
def _parse_owner_repo(template: str) -> tuple[str, str]:
"""Extract (owner, repo) from RELEASE_URL_TEMPLATE."""
m = re.match(r"https://github\.com/([^/]+)/([^/]+)/releases/", template)
if not m:
raise RuntimeError(f"cannot parse owner/repo from {template!r}")
return m.group(1), m.group(2)
def cache_root() -> Path:
"""Directory where all cached binaries live."""
return Path(platformdirs.user_cache_dir("stealthfox"))
def cache_dir_for_version(version: str = BINARY_VERSION) -> Path:
return cache_root() / version
def _resolve_asset_url(tag: str, asset_name: str) -> str:
"""Return a downloadable URL for the asset.
For private repos the direct `releases/download/<tag>/<asset>` URL returns
404 even with a token, so we resolve via the API: list assets for the
release tag, find the one matching `asset_name`, and use its API URL with
`Accept: application/octet-stream` (which 302-redirects to a signed URL).
For public repos the direct URL still works without a token.
"""
token = _github_token()
if not token:
return RELEASE_URL_TEMPLATE.format(tag=tag, asset=asset_name)
owner, repo = _parse_owner_repo(RELEASE_URL_TEMPLATE)
api = f"https://api.github.com/repos/{owner}/{repo}/releases/tags/{tag}"
r = requests.get(api, headers={"Authorization": f"token {token}"}, timeout=30)
r.raise_for_status()
for a in r.json().get("assets", []):
if a.get("name") == asset_name:
return a["url"]
raise RuntimeError(f"asset {asset_name!r} not found in release {tag!r}")
def _download_file(url: str, dst: Path, chunk_size: int = 1 << 16) -> None:
dst.parent.mkdir(parents=True, exist_ok=True)
headers: dict[str, str] = {}
token = _github_token()
if token and url.startswith("https://api.github.com/"):
headers["Authorization"] = f"token {token}"
headers["Accept"] = "application/octet-stream"
with requests.get(url, stream=True, timeout=60, headers=headers) as r:
r.raise_for_status()
with open(dst, "wb") as f:
for chunk in r.iter_content(chunk_size):
if chunk:
f.write(chunk)
def _sha256_file(path: Path) -> str:
h = hashlib.sha256()
with open(path, "rb") as f:
for chunk in iter(lambda: f.read(1 << 16), b""):
h.update(chunk)
return h.hexdigest()
def _parse_checksums(text: str) -> dict[str, str]:
out: dict[str, str] = {}
for line in text.splitlines():
line = line.strip()
if not line or line.startswith("#"):
continue
parts = line.split()
if len(parts) >= 2:
out[parts[-1]] = parts[0]
return out
def _extract(archive: Path, dst: Path) -> None:
dst.mkdir(parents=True, exist_ok=True)
if archive.suffix == ".zip":
with zipfile.ZipFile(archive) as zf:
zf.extractall(dst)
elif archive.name.endswith(".tar.gz") or archive.suffix in {".tgz", ".gz"}:
with tarfile.open(archive, "r:gz") as tf:
tf.extractall(dst)
else:
raise RuntimeError(f"unknown archive format: {archive}")
def ensure_binary(version: str = BINARY_VERSION) -> Path:
"""Return a path to a runnable Firefox executable. Download if needed."""
plat = sys.platform
mach = platform.machine()
asset = ARCHIVE_NAME(plat, mach)
entry_rel = BINARY_ENTRY_REL.get(plat)
if entry_rel is None:
raise NotImplementedError(f"no binary entry for platform {plat}")
version_dir = cache_dir_for_version(version)
entry = version_dir / entry_rel
if entry.exists():
return entry
url_archive = _resolve_asset_url(version, asset)
url_sums = _resolve_asset_url(version, "checksums.txt")
with tempfile.TemporaryDirectory() as td:
tmp = Path(td)
archive_path = tmp / asset
_download_file(url_archive, archive_path)
sums_path = tmp / "checksums.txt"
_download_file(url_sums, sums_path)
sums = _parse_checksums(sums_path.read_text())
expected = sums.get(asset)
if expected is None:
raise RuntimeError(f"no SHA256 for {asset} in checksums.txt")
actual = _sha256_file(archive_path)
if actual.lower() != expected.lower():
raise RuntimeError(
f"SHA256 mismatch for {asset}: got {actual}, expected {expected}"
)
_extract(archive_path, version_dir)
if not entry.exists():
raise RuntimeError(f"binary not found after extraction: {entry}")
return entry

View file

@ -1,4 +0,0 @@
"""Synchronous API — re-exports Stealthfox for parity with async_api."""
from .launcher import Stealthfox
__all__ = ["Stealthfox"]

54
tests/conftest.py Normal file
View file

@ -0,0 +1,54 @@
import os
import random
import sys
from pathlib import Path
import pytest
from invisible_playwright._fpforge import generate_profile
from invisible_playwright.constants import BINARY_ENTRY_REL
@pytest.fixture
def deterministic_rng():
    """Seeded RNG for reproducible tests."""
    return random.Random(42)


@pytest.fixture
def sample_profile():
    """A Profile generated from seed=42 for reuse across tests."""
    return generate_profile(seed=42)


@pytest.fixture(scope="session")
def firefox_binary():
    """Locate the patched Firefox binary for E2E tests, or skip cleanly.

    Single source of truth for every E2E test (previously each test file had its
    own copy and three of them silently ignored INVPW_BINARY_PATH, so they kept
    testing whatever was in the cache even when you pointed the suite at a
    specific build: a false-confidence trap). Lookup order:

    1. ``INVPW_BINARY_PATH`` env var: point the whole suite at a local build
       or a freshly-extracted release (this is how the full-suite gate runs).
    2. Cached binary under ``cache_dir_for_version()`` (post ``fetch``).
    3. Skip: we never trigger an implicit multi-hundred-MB network download
       inside a test run.
    """
    env_path = os.environ.get("INVPW_BINARY_PATH")
    if env_path:
        if Path(env_path).exists():
            return env_path
        pytest.skip(f"INVPW_BINARY_PATH={env_path!r} does not exist")
    if sys.platform not in BINARY_ENTRY_REL:
        pytest.skip(f"unsupported platform: {sys.platform}")
    from invisible_playwright.download import cache_dir_for_version
    entry = cache_dir_for_version() / BINARY_ENTRY_REL[sys.platform]
    if not entry.exists():
        pytest.skip(
            "patched Firefox binary not cached and INVPW_BINARY_PATH unset; "
            "set INVPW_BINARY_PATH=<firefox binary> or run `invisible-playwright fetch`"
        )
    return str(entry)
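
The lookup order in the `firefox_binary` docstring can be sketched as a pure function, which makes the "explicit override wins, never download" rule easy to check without pytest or a browser. `resolve_binary` is a hypothetical helper written for illustration; it is not part of the repository, and it returns `None` where the fixture would call `pytest.skip`.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_binary(env: Mapping[str, str], cached_entry: Path) -> Optional[str]:
    """Return a binary path, or None when the caller should skip.

    Hypothetical helper: same lookup order as the fixture, but the
    environment and the cached path are passed in as arguments.
    """
    env_path = env.get("INVPW_BINARY_PATH")
    if env_path:
        # An explicit override wins. A bad override is never silently
        # replaced by the cache, because that would test the wrong build.
        return env_path if Path(env_path).exists() else None
    if cached_entry.exists():
        return str(cached_entry)
    return None  # no implicit network download
```

The point of returning `None` for a missing override, rather than falling through to the cache, is the false-confidence trap the docstring describes.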

83
tests/test_async_api.py Normal file

@ -0,0 +1,83 @@
"""Constructor-parity tests for the async ``InvisiblePlaywright``.
The async API mirrors the sync launcher (same prefs pipeline, same
profile generation, same proxy handling). The only async-specific
surface is ``__aenter__`` / ``__aexit__`` and an awaitable ``new_page``
patch; both require a real Firefox binary to exercise meaningfully and
are covered by the sync E2E tests via parity arguments.
What we test here without launching a browser: the constructor builds
the same eager Profile, clamps the seed identically, and surfaces pin
validation errors at construction time. These guards keep the async
class from silently drifting away from the sync class as features land.
"""
from __future__ import annotations
import pytest
from invisible_playwright.async_api import InvisiblePlaywright as AsyncIP
from invisible_playwright.launcher import InvisiblePlaywright as SyncIP
@pytest.mark.unit
def test_async_explicit_seed_is_stored():
ip = AsyncIP(seed=42)
assert ip.seed == 42
@pytest.mark.unit
def test_async_random_seed_is_positive_int31():
"""Same int31 contract as sync: the C++ side rejects ``seed <= 0`` and
a 32-bit value risks the high bit looking negative."""
ip = AsyncIP()
assert isinstance(ip.seed, int)
assert 0 < ip.seed < 2**31
@pytest.mark.unit
def test_async_random_seed_varies_across_instances():
seeds = {AsyncIP().seed for _ in range(5)}
assert len(seeds) > 1
@pytest.mark.unit
def test_async_profile_built_eagerly_in_constructor():
"""Pin validation must fire before ``__aenter__`` — otherwise a user
only learns their pin is wrong when the browser launch starts."""
ip = AsyncIP(seed=42)
assert ip._profile is not None
assert ip._profile.seed == 42
@pytest.mark.unit
def test_async_invalid_pin_raises_in_constructor():
with pytest.raises(ValueError):
AsyncIP(seed=42, pin={"not_a_real_field": 1})
@pytest.mark.unit
def test_async_and_sync_share_seed_for_same_input():
"""Same seed → identical Profile across the two APIs. Both lean on
``generate_profile(seed)``; if they diverge it means one of them
started doing extra sampling."""
seed = 12345
a = AsyncIP(seed=seed)
s = SyncIP(seed=seed)
assert a._profile == s._profile
@pytest.mark.unit
def test_async_seed_coerced_from_float():
"""``int(seed)`` truncation — matches sync clamping behaviour."""
ip = AsyncIP(seed=42.9)
assert ip.seed == 42
@pytest.mark.unit
def test_async_default_context_kwargs_match_sync():
"""The two ``_default_context_kwargs`` implementations must produce
the same dict for the same inputs. Guards against the async copy
drifting away when sync adds new keys."""
a = AsyncIP(seed=42, timezone="America/New_York", locale="de-DE")
s = SyncIP(seed=42, timezone="America/New_York", locale="de-DE")
assert a._default_context_kwargs() == s._default_context_kwargs()
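
The seed contract these tests pin down (a random default in the positive int31 range, and `int()` truncation of explicit values) can be sketched as follows. `pick_seed` is a hypothetical function for illustration only; how the real constructors treat out-of-range explicit seeds is not visible in this diff, so the sketch leaves such values untouched.

```python
import random
from typing import Optional, Union

INT31_LIMIT = 2**31  # exclusive upper bound, keeps the high bit clear


def pick_seed(seed: Optional[Union[int, float]] = None) -> int:
    """Hypothetical seed normaliser matching the tested contract."""
    if seed is None:
        # randrange(1, N) yields 1 <= value < N, so the result is never
        # zero or negative and always fits in a signed 32-bit integer.
        return random.randrange(1, INT31_LIMIT)
    # int() truncates toward zero: 42.9 becomes 42.
    return int(seed)
```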

42
tests/test_build.py Normal file

@ -0,0 +1,42 @@
"""Regression: the produced wheel must not contain duplicate zip entries.
The old pyproject.toml had a ``[tool.hatch.build.targets.wheel.force-include]``
section that re-included `data/` and `_fpforge/data/` already covered by
``packages = ["src/invisible_playwright"]``. Hatchling wrote every JSON twice
into the zip; PyPI rejects wheels with duplicate names.
"""
from __future__ import annotations
import subprocess
import sys
import zipfile
from collections import Counter
from pathlib import Path
import pytest
@pytest.mark.slow
def test_built_wheel_has_no_duplicate_entries(tmp_path):
"""Build the wheel in a clean dir and assert no duplicate zip names."""
root = Path(__file__).resolve().parent.parent
out = tmp_path / "dist"
r = subprocess.run(
[sys.executable, "-m", "build", "--wheel", "--outdir", str(out)],
cwd=root,
capture_output=True,
text=True,
)
assert r.returncode == 0, f"build failed:\n{r.stderr}"
wheels = list(out.glob("*.whl"))
assert len(wheels) == 1, f"expected exactly one wheel, got {wheels}"
with zipfile.ZipFile(wheels[0]) as zf:
names = zf.namelist()
dupes = {n: c for n, c in Counter(names).items() if c > 1}
assert not dupes, f"wheel has duplicate entries (PyPI will reject): {dupes}"
# Sanity: the Bayesian data files must still be packaged.
json_files = [n for n in names if n.endswith(".json")]
assert json_files, "no .json data files in wheel — packaging broken"
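
The duplicate check in this test can be exercised without building a wheel. The sketch below builds a zip in memory with a repeated entry name and runs the same `Counter` logic over `namelist()`. The helpers `make_zip` and `duplicate_entries` are illustrative names, not part of the test suite.

```python
import io
import warnings
import zipfile
from collections import Counter


def make_zip(names: list[str]) -> bytes:
    """Build an in-memory zip containing one small entry per given name."""
    buf = io.BytesIO()
    with warnings.catch_warnings():
        # zipfile warns ("Duplicate name") but still writes the second entry.
        warnings.simplefilter("ignore", UserWarning)
        with zipfile.ZipFile(buf, "w") as zf:
            for name in names:
                zf.writestr(name, "{}")
    return buf.getvalue()


def duplicate_entries(zip_bytes: bytes) -> dict[str, int]:
    """Return {name: count} for every entry name that appears more than once."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {n: c for n, c in Counter(zf.namelist()).items() if c > 1}
```

`namelist()` reports every central-directory record, so a name written twice shows up twice, which is what makes the `Counter` approach work.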


@ -1,22 +1,122 @@
import subprocess
import sys
from pathlib import Path

import pytest

from invisible_playwright import cli


@pytest.mark.unit
def test_version_subcommand():
    r = subprocess.run(
        [sys.executable, "-m", "invisible_playwright", "version"],
        capture_output=True, text=True, check=True,
    )
    assert "firefox-" in r.stdout
    assert "invisible_playwright" in r.stdout.lower()


@pytest.mark.unit
def test_help_subcommand():
    r = subprocess.run(
        [sys.executable, "-m", "invisible_playwright", "--help"],
        capture_output=True, text=True,
    )
    assert r.returncode == 0
    assert "fetch" in r.stdout
    assert "path" in r.stdout
    assert "clear-cache" in r.stdout
# CL1: clear-cache with existing cache prints "removed:" + path
@pytest.mark.unit
def test_clear_cache_with_existing_cache(tmp_path, monkeypatch, capsys):
cache = tmp_path / "existing-cache"
cache.mkdir()
(cache / "marker").write_text("x")
monkeypatch.setattr("invisible_playwright.cli.cache_root", lambda: cache)
rc = cli.main(["clear-cache"])
captured = capsys.readouterr()
assert rc == 0
assert captured.out.startswith("removed:")
assert str(cache) in captured.out
assert not cache.exists()
# CL2: clear-cache with no cache prints "nothing to remove:"
@pytest.mark.unit
def test_clear_cache_with_no_cache(tmp_path, monkeypatch, capsys):
cache = tmp_path / "missing-cache"
assert not cache.exists()
monkeypatch.setattr("invisible_playwright.cli.cache_root", lambda: cache)
rc = cli.main(["clear-cache"])
captured = capsys.readouterr()
assert rc == 0
assert captured.out.startswith("nothing to remove:")
assert str(cache) in captured.out
# CL3: path when binary exists prints path, exit 0
@pytest.mark.unit
def test_path_subcommand_when_binary_exists(tmp_path, monkeypatch, capsys):
fake_binary = tmp_path / "firefox.exe"
fake_binary.write_text("x")
monkeypatch.setattr("invisible_playwright.cli.ensure_binary", lambda: fake_binary)
rc = cli.main(["path"])
captured = capsys.readouterr()
assert rc == 0
assert str(fake_binary) in captured.out
assert captured.err == ""
# CL4: path when binary missing prints to stderr, exit 1
@pytest.mark.unit
def test_path_subcommand_when_binary_missing(monkeypatch, capsys):
def boom():
raise RuntimeError("download failed")
monkeypatch.setattr("invisible_playwright.cli.ensure_binary", boom)
rc = cli.main(["path"])
captured = capsys.readouterr()
assert rc == 1
assert "error:" in captured.err
assert "download failed" in captured.err
assert captured.out == ""
# CL5: no subcommand → argparse error, exit != 0
@pytest.mark.unit
def test_no_subcommand_errors():
with pytest.raises(SystemExit) as exc_info:
cli.main([])
assert exc_info.value.code != 0
# CL6: unknown subcommand → argparse error
@pytest.mark.unit
def test_unknown_subcommand_errors():
with pytest.raises(SystemExit) as exc_info:
cli.main(["bogus"])
assert exc_info.value.code != 0
# Extra: fetch happy path with mocked ensure_binary
@pytest.mark.unit
def test_fetch_subcommand_prints_path(tmp_path, monkeypatch, capsys):
fake_binary = tmp_path / "firefox.exe"
fake_binary.write_text("x")
monkeypatch.setattr("invisible_playwright.cli.ensure_binary", lambda: fake_binary)
rc = cli.main(["fetch"])
captured = capsys.readouterr()
assert rc == 0
assert str(fake_binary) in captured.out
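
The CLI behaviour asserted above (a required subcommand, `error:` on stderr with exit code 1 when the binary lookup fails, nothing on stdout in that case) could be produced by an `argparse` layout along these lines. This is a sketch under assumptions: the real `cli.main` is not shown in this diff, only the `path` subcommand is modelled, and the injectable `ensure_binary` parameter is a hypothetical convenience for testing.

```python
import argparse
import sys
from typing import Callable, Optional, Sequence


def _default_binary() -> str:
    return "/hypothetical/firefox"  # placeholder, not a real lookup


def main(argv: Optional[Sequence[str]] = None,
         ensure_binary: Callable[[], object] = _default_binary) -> int:
    parser = argparse.ArgumentParser(prog="invisible-playwright")
    # required=True makes a missing subcommand an argparse error (SystemExit).
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("path", help="print the path to the binary")
    args = parser.parse_args(argv)
    if args.command == "path":
        try:
            print(ensure_binary())
        except Exception as exc:
            # Failures go to stderr so stdout stays clean for scripting.
            print(f"error: {exc}", file=sys.stderr)
            return 1
    return 0
```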

115
tests/test_cloak.py Normal file

@ -0,0 +1,115 @@
"""Cloak guard (e2e) — verifies the source-level "invisible headless" cloak:
the chrome window is hidden from the screen YET keeps rendering on the real GPU
(not Playwright's native headless, which has no WebGL). Runs per-platform in CI:
- Windows: the DWMWA_CLOAK attribute (queried via DWMWA_CLOAKED).
- macOS: the NSWindow alpha (queried via Quartz CGWindowListCopyWindowInfo).
- Linux: skipped; there the wrapper hides via Xvfb, not a source-level cloak.
This is the CI validation for the macOS cocoa cloak patch, which can't be built
or run on the Windows/Linux dev boxes.
"""
from __future__ import annotations
import sys
import time
import pytest
from invisible_playwright import InvisiblePlaywright
CLOAK_PREFS = {
"zoom.stealth.cloak_windows": True,
"widget.windows.window_occlusion_tracking.enabled": False,
}
_WEBGL_RENDERER = """() => {
const g = document.createElement('canvas').getContext('webgl');
if (!g) return 'NO-WEBGL';
const d = g.getExtension('WEBGL_debug_renderer_info');
return d ? g.getParameter(d.UNMASKED_RENDERER_WEBGL) : (g.getParameter(g.RENDERER) || '');
}"""
def _windows_moz_window_cloaked() -> bool:
"""True if at least one MozillaWindowClass top-level window is DWMWA_CLOAKED."""
import ctypes
from ctypes import wintypes
user32 = ctypes.windll.user32
dwm = ctypes.windll.dwmapi
DWMWA_CLOAKED = 14
ENUM = ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.HWND, wintypes.LPARAM)
found = []
def cb(hwnd, _):
c = ctypes.create_unicode_buffer(256)
user32.GetClassNameW(hwnd, c, 256)
if c.value == "MozillaWindowClass":
v = wintypes.DWORD(0)
dwm.DwmGetWindowAttribute(wintypes.HWND(hwnd), DWMWA_CLOAKED,
ctypes.byref(v), 4)
found.append(v.value)
return True
user32.EnumWindows(ENUM(cb), 0)
return any(state != 0 for state in found)
def _macos_firefox_window_alpha_zero() -> bool:
"""True if a Firefox on-screen window reports ~0 alpha (cloaked)."""
from Quartz import ( # type: ignore
CGWindowListCopyWindowInfo,
kCGWindowListOptionOnScreenOnly,
kCGNullWindowID,
)
infos = CGWindowListCopyWindowInfo(kCGWindowListOptionOnScreenOnly, kCGNullWindowID)
alphas = []
for w in infos or []:
owner = (w.get("kCGWindowOwnerName") or "")
if "firefox" in owner.lower() or "nightly" in owner.lower():
alphas.append(float(w.get("kCGWindowAlpha", 1.0)))
# cloaked windows are alpha 0; if Firefox has any window it must be ~0.
return bool(alphas) and all(a < 0.05 for a in alphas)
@pytest.mark.e2e
@pytest.mark.skipif(
sys.platform.startswith("linux"),
reason="source-level cloak is Windows/macOS only; Linux hides via Xvfb",
)
def test_cloak_hides_window_but_keeps_rendering(firefox_binary):
with InvisiblePlaywright(
seed=42, binary_path=firefox_binary, headless=False, extra_prefs=CLOAK_PREFS
) as browser:
page = browser.new_context().new_page()
page.goto("https://example.com", timeout=30_000)
time.sleep(2)
# 1) still renders on the real GPU pipeline (a non-blank screenshot proves
# the compositor is alive despite the window being hidden).
shot = page.screenshot()
assert len(shot) > 3000, "cloaked window produced a blank screenshot (rendering paused)"
# 2) headed pipeline intact: a real WebGL context (Playwright's native
# headless has none). Linux (Xvfb + llvmpipe) and Windows (WARP) give a
# software context on the GPU-less runners, so a missing context there
# is a real regression -> hard fail. macOS GitHub runners expose NO
# WebGL in the CI session at all (even vanilla Firefox), and macOS has
# no software-GL fallback; the cloak's "still rendering" property is
# already proven by the non-blank screenshot above, so we don't also
# require a live WebGL context there.
renderer = page.evaluate(_WEBGL_RENDERER)
webgl_ok = bool(renderer) and renderer != "NO-WEBGL"
if not (sys.platform == "darwin" and not webgl_ok):
assert webgl_ok, f"no real WebGL under cloak: {renderer!r}"
# 3) the window is actually hidden (per-platform).
if sys.platform == "win32":
assert _windows_moz_window_cloaked(), "Firefox window is not DWMWA_CLOAKED"
elif sys.platform == "darwin":
try:
hidden = _macos_firefox_window_alpha_zero()
except ImportError:
pytest.skip("pyobjc Quartz not available to verify macOS cloak alpha")
assert hidden, "Firefox macOS window is not alpha-cloaked"


@ -1,29 +1,203 @@
import pytest

from invisible_playwright.constants import (
    ARCHIVE_NAME,
    BINARY_BASENAME,
    BINARY_ENTRY_REL,
    BINARY_VERSION,
    BROKEN_VERSIONS,
    FIREFOX_UPSTREAM_VERSION,
    RELEASE_URL_TEMPLATE,
)


@pytest.mark.unit
def test_broken_versions_excludes_current():
    """The current BINARY_VERSION must NEVER be in BROKEN_VERSIONS; otherwise
    every default ensure_binary() call would raise and the wrapper is unusable."""
    assert BINARY_VERSION not in BROKEN_VERSIONS


@pytest.mark.unit
def test_firefox_8_is_marked_broken():
    """firefox-8 shipped without the juggler layer (undrivable by Playwright);
    it must stay flagged so a stale cache can't silently hand it to a user."""
    assert "firefox-8" in BROKEN_VERSIONS


@pytest.mark.unit
def test_binary_version_format():
    assert BINARY_VERSION.startswith("firefox-")
    assert BINARY_VERSION.split("-", 1)[1].isdigit()


@pytest.mark.unit
def test_archive_name_windows():
    name = ARCHIVE_NAME("win32", "AMD64")
    assert name.endswith(".zip")
    assert "win-x86_64" in name


@pytest.mark.unit
def test_archive_name_linux():
    name = ARCHIVE_NAME("linux", "x86_64")
    assert name.endswith(".tar.gz")
    assert "linux-x86_64" in name


@pytest.mark.unit
def test_archive_name_macos_arm64():
    name = ARCHIVE_NAME("darwin", "arm64")
    assert name.endswith(".tar.gz")
    assert "macos-arm64" in name


@pytest.mark.unit
def test_archive_name_truly_unsupported_raises():
    with pytest.raises(NotImplementedError):
        ARCHIVE_NAME("plan9", "x86_64")


@pytest.mark.unit
def test_binary_basename_format():
    assert "firefox" in BINARY_BASENAME.lower()
    assert "stealth" in BINARY_BASENAME.lower()
# ---- Comprehensive ARCHIVE_NAME edge cases -------------------------------- #
# Same risk shape as bug #15: a missed format assumption (sha256sum binary
# mode) silently produced wrong output. The same class of bug here would be
# an uppercase platform string or an odd machine value passing through to a
# wrong-named asset on the CDN and 404-ing.
@pytest.mark.unit
@pytest.mark.parametrize("platform_key,machine,expected_substring", [
("win32", "AMD64", "win-x86_64.zip"), # Windows reports AMD64
("win32", "amd64", "win-x86_64.zip"), # lowercase variant
("win32", "x86_64", "win-x86_64.zip"), # mingw-style
("linux", "x86_64", "linux-x86_64.tar.gz"), # standard Linux
("linux", "AMD64", "linux-x86_64.tar.gz"), # odd but plausible
("Linux", "x86_64", "linux-x86_64.tar.gz"), # case-insensitive platform
("WIN32", "AMD64", "win-x86_64.zip"), # ALL CAPS platform
])
def test_archive_name_accepts_case_variations(platform_key, machine, expected_substring):
"""sys.platform / platform.machine() return inconsistent casing across
OS versions and Python versions. The asset filename must be stable
regardless; otherwise the CDN 404s."""
assert ARCHIVE_NAME(platform_key, machine).endswith(expected_substring)
@pytest.mark.unit
@pytest.mark.parametrize("machine", ["i386", "i686", "ppc64le", "armv7l", "riscv64"])
def test_archive_name_rejects_unsupported_arches(machine):
"""Unsupported arches must raise NotImplementedError with the bad value
in the message; a silent fallback to a default arch would download the
wrong binary, run, and fingerprint differently."""
with pytest.raises(NotImplementedError, match=machine):
ARCHIVE_NAME("linux", machine)
@pytest.mark.unit
@pytest.mark.parametrize("machine", ["arm64", "aarch64"])
def test_archive_name_arm64_supported(machine):
"""ARM64 is shipped now (issue #6): both Linux aarch64 and macOS arm64.
ARCHIVE_NAME must map both machine spellings to the canonical -arm64 asset."""
assert ARCHIVE_NAME("linux", machine) == "firefox-150.0.1-stealth-linux-arm64.tar.gz"
assert ARCHIVE_NAME("darwin", machine) == "firefox-150.0.1-stealth-macos-arm64.tar.gz"
@pytest.mark.unit
@pytest.mark.parametrize("platform_key", ["freebsd", "cygwin", "openbsd"])
def test_archive_name_rejects_unsupported_platforms(platform_key):
"""win32/linux/darwin are supported; other platforms must raise, not
silently pick one of the three."""
with pytest.raises(NotImplementedError, match=platform_key):
ARCHIVE_NAME(platform_key, "x86_64")
# ---- ARCHIVE_NAME ↔ BINARY_ENTRY_REL invariant ---------------------------- #
# For every supported platform there MUST be an entry in BINARY_ENTRY_REL,
# otherwise ensure_binary() will raise NotImplementedError AFTER having
# already downloaded a 110 MB tarball — terrible UX.
@pytest.mark.unit
def test_binary_entry_rel_covers_every_supported_platform():
"""If ARCHIVE_NAME accepts a platform key, BINARY_ENTRY_REL must declare
where the executable lives inside the archive for it."""
for plat in ["win32", "linux", "darwin"]:
ARCHIVE_NAME(plat, "x86_64") # must not raise
assert plat in BINARY_ENTRY_REL, (
f"ARCHIVE_NAME accepts {plat!r} but BINARY_ENTRY_REL has no entry "
f"— ensure_binary() will fail late after a 110 MB download."
)
@pytest.mark.unit
def test_binary_entry_rel_extension_matches_platform():
"""firefox.exe on Windows, plain `firefox` on Linux."""
assert BINARY_ENTRY_REL["win32"].endswith(".exe")
assert not BINARY_ENTRY_REL["linux"].endswith(".exe")
assert BINARY_ENTRY_REL["linux"] == "firefox"
assert BINARY_ENTRY_REL["darwin"].endswith(".app/Contents/MacOS/firefox")
# ---- RELEASE_URL_TEMPLATE shape ------------------------------------------- #
@pytest.mark.unit
def test_release_url_template_is_https():
"""No http://. GitHub redirects http but we never accept the redirect."""
assert RELEASE_URL_TEMPLATE.startswith("https://github.com/")
@pytest.mark.unit
def test_release_url_template_has_required_placeholders():
"""{tag} and {asset} must both be present, otherwise _resolve_asset_url
won't format a usable URL and downloads fail with confusing 404s."""
assert "{tag}" in RELEASE_URL_TEMPLATE
assert "{asset}" in RELEASE_URL_TEMPLATE
@pytest.mark.unit
def test_release_url_template_formats_cleanly():
"""Confirm .format() actually substitutes — catches typos like {tags}."""
url = RELEASE_URL_TEMPLATE.format(tag="firefox-99", asset="thing.zip")
assert "{" not in url and "}" not in url
assert "firefox-99" in url
assert "thing.zip" in url
@pytest.mark.unit
def test_release_url_points_at_owned_repo():
"""The template MUST point at an owner/repo the maintainer actually
controls. A typo here would direct everyone's downloads at a stranger's
GitHub account: a silent supply-chain risk."""
assert "/feder-cr/invisible_playwright/" in RELEASE_URL_TEMPLATE, (
f"RELEASE_URL_TEMPLATE was changed to point elsewhere: "
f"{RELEASE_URL_TEMPLATE!r}. Update this test only if the move is intentional."
)
# ---- Firefox upstream version sanity -------------------------------------- #
@pytest.mark.unit
def test_firefox_upstream_version_is_three_part_semver():
parts = FIREFOX_UPSTREAM_VERSION.split(".")
assert len(parts) >= 2, f"version too short: {FIREFOX_UPSTREAM_VERSION!r}"
for p in parts:
assert p.isdigit(), f"non-numeric segment in {FIREFOX_UPSTREAM_VERSION!r}"
@pytest.mark.unit
def test_binary_basename_includes_upstream_version():
"""The basename references the upstream version, so the asset filename
on the CDN encodes which Firefox was patched. Bumping FIREFOX_UPSTREAM_VERSION
without rebuilding would leave stale binaries; this guards against
accidentally desyncing the two."""
assert FIREFOX_UPSTREAM_VERSION in BINARY_BASENAME
@pytest.mark.unit
@pytest.mark.parametrize("plat", ["win32", "linux"])
def test_archive_name_includes_upstream_version(plat):
"""Same desync guard, from the other direction."""
assert FIREFOX_UPSTREAM_VERSION in ARCHIVE_NAME(plat, "x86_64")
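
Taken together, the tests above constrain the mapping from `(platform, machine)` to an asset filename fairly tightly. A minimal sketch that satisfies the asserted cases is shown below. It is a reconstruction from the test expectations, not the project's `ARCHIVE_NAME`: the tables, the function name `archive_name`, and the exact error wording are assumptions, and the sketch accepts every platform/arch pair in its tables even though the tests do not say whether all of those assets are actually published.

```python
UPSTREAM_VERSION = "150.0.1"  # taken from the asset names asserted in the tests

# platform key -> (OS tag in the filename, archive extension)
_PLATFORMS = {
    "win32": ("win", "zip"),
    "linux": ("linux", "tar.gz"),
    "darwin": ("macos", "tar.gz"),
}
# machine spellings -> canonical arch tag
_ARCHES = {"amd64": "x86_64", "x86_64": "x86_64", "arm64": "arm64", "aarch64": "arm64"}


def archive_name(platform_key: str, machine: str) -> str:
    """Hypothetical stand-in for ARCHIVE_NAME, derived from the tests."""
    plat = platform_key.lower()  # tolerate "Linux", "WIN32"
    arch = machine.lower()       # tolerate "AMD64"
    if plat not in _PLATFORMS:
        # Include the offending value so the failure is diagnosable.
        raise NotImplementedError(f"unsupported platform: {platform_key}")
    if arch not in _ARCHES:
        raise NotImplementedError(f"unsupported architecture: {machine}")
    os_tag, ext = _PLATFORMS[plat]
    return f"firefox-{UPSTREAM_VERSION}-stealth-{os_tag}-{_ARCHES[arch]}.{ext}"
```

Raising on unknown values, instead of falling back to a default, matches the intent stated in the tests: a wrong-but-plausible filename is worse than an immediate error.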


@ -0,0 +1,278 @@
"""Regression tests for cross-origin / cross-process iframe interaction.
History: wrapper repo issue #20 reported that a third-party cookie
consent iframe was completely unreachable from Playwright in 0.1.7:
``element_handle.content_frame()`` returned ``None``, ``frame.evaluate()``
threw cross-origin SOP errors, and ``frame_locator().click()`` timed
out.
Root cause was a missing pref. FF150 ships with
``fission.webContentIsolationStrategy=1`` (IsolateEverything), which
site-isolates cross-origin iframes into separate webIsolated content
processes even when ``fission.autostart=False``. The Juggler code paths
inherited from the FF146 era assume same-process iframes. The wrapper's
``_BASELINE`` now pins the pref to 0 (IsolateNothing).
These tests exist so a future Firefox upgrade or a fingerprint A/B
that flips this pref by accident cannot ship without a red CI signal.
Layers:
* ``unit``: ``_BASELINE`` contains the pref with the right value. No browser.
* ``e2e``: launch the real binary against a LOCAL HTTP harness on
  ``127.0.0.1`` (two ports = two SOP origins) and verify the
  four protocol operations that regressed: frame URL tracking,
  ``handle.content_frame()``, ``frame.evaluate()``, and
  ``frame_locator(...).locator(...)`` element resolution.
The e2e tests run entirely offline. They never call out to a real site;
the cross-origin shape is reproduced with two local HTTP servers on
random free ports.
"""
from __future__ import annotations
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
import pytest
from invisible_playwright._fpforge import generate_profile
from invisible_playwright.prefs import _BASELINE, translate_profile_to_prefs
# ────────────────────────────────────────────────────────────────────
# Unit layer — fast, no browser, runs on every CI
# ────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_baseline_pins_web_content_isolation_strategy_to_zero():
"""Regression sentinel.
``fission.webContentIsolationStrategy`` MUST be 0 (IsolateNothing).
The FF150 default is 1 (IsolateEverything), which site-isolates
cross-origin iframes into separate webIsolated content processes
and breaks Playwright frame tracking from the parent process.
"""
assert _BASELINE["fission.webContentIsolationStrategy"] == 0, (
"fission.webContentIsolationStrategy must be 0 (IsolateNothing). "
"If you bumped it for an A/B, cross-origin iframes will appear "
"in page.frames with empty URLs and content_frame() will return "
"None — see the changelog entry that introduced this test."
)
@pytest.mark.unit
def test_baseline_keeps_fission_autostart_off():
"""Belt for the suspenders above. All three prefs are required."""
assert _BASELINE["fission.autostart"] is False
assert _BASELINE["fission.autostart.session"] is False
assert _BASELINE["dom.ipc.processCount.webIsolated"] == 1
@pytest.mark.unit
def test_translated_profile_propagates_isolation_strategy():
"""The fix must survive translate_profile_to_prefs, not just live in _BASELINE."""
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p)
assert prefs["fission.webContentIsolationStrategy"] == 0
@pytest.mark.unit
def test_extra_prefs_override_can_break_isolation_only_explicitly():
"""If a caller wants to A/B isolation, they have to set it explicitly.
The wrapper does not silently flip it back on.
"""
p = generate_profile(seed=42)
prefs_default = translate_profile_to_prefs(p)
assert prefs_default["fission.webContentIsolationStrategy"] == 0
prefs_ab = translate_profile_to_prefs(
p, extra_prefs={"fission.webContentIsolationStrategy": 1}
)
assert prefs_ab["fission.webContentIsolationStrategy"] == 1
# ────────────────────────────────────────────────────────────────────
# E2E layer — needs cached binary + bind to localhost ports
# ────────────────────────────────────────────────────────────────────
def _free_port() -> int:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.0.1", 0))
port = s.getsockname()[1]
s.close()
return port
class _SilentHandler(BaseHTTPRequestHandler):
"""Suppress per-request access logging so pytest output stays clean."""
PAYLOAD = b"" # set per-instance via subclassing
def log_message(self, *_a):
pass
def do_GET(self):
self.send_response(200)
self.send_header("Content-Type", "text/html; charset=utf-8")
self.send_header("Cache-Control", "no-store")
self.end_headers()
self.wfile.write(self.PAYLOAD)
def _serve(payload: bytes, port: int) -> HTTPServer:
"""Start an HTTP server on 127.0.0.1:port serving ``payload`` on every GET."""
handler_cls = type(
"_H", (_SilentHandler,), {"PAYLOAD": payload}
)
srv = HTTPServer(("127.0.0.1", port), handler_cls)
t = threading.Thread(target=srv.serve_forever, daemon=True)
t.start()
return srv
@pytest.fixture
def cross_origin_harness():
"""Spin up TWO local HTTP servers on different localhost ports.
Two ports = two distinct origins under SOP (same host, different port
means a different origin). The parent page on port A embeds an iframe with
src pointing at port B. Same cross-origin browsing-context shape as
a parent-page-plus-third-party-iframe layout, fully offline.
"""
pa, pb = _free_port(), _free_port()
parent_html = f"""<!doctype html><html><head><title>parent</title></head><body>
<h1>parent</h1>
<iframe id="ifr_plain" src="http://127.0.0.1:{pb}/child" width="300" height="120"></iframe>
<iframe id="ifr_sandbox" src="http://127.0.0.1:{pb}/child" width="300" height="120"
sandbox="allow-scripts allow-same-origin"></iframe>
<iframe id="ifr_titled" src="http://127.0.0.1:{pb}/child" width="300" height="120"
title="cross-origin titled iframe"></iframe>
</body></html>""".encode("utf-8")
child_html = b"""<!doctype html><html><body>
<button id="ok">confirm</button>
<button class="btn-primary">primary</button>
<script>document.getElementById('ok').addEventListener('click', () => document.title = 'clicked')</script>
</body></html>"""
sa = _serve(parent_html, pa)
sb = _serve(child_html, pb)
try:
yield {"parent_url": f"http://127.0.0.1:{pa}/", "child_origin": f"http://127.0.0.1:{pb}"}
finally:
sa.shutdown()
sb.shutdown()
@pytest.mark.e2e
def test_cross_origin_iframe_url_appears_in_page_frames(firefox_binary, cross_origin_harness):
"""``page.frames`` must list the cross-origin iframe with its real URL.
Before the pref fix, the URL came back as '' because the navigation
observer for the iframe fired in a different content process than
the parent's FrameTree was registered in.
"""
from invisible_playwright import InvisiblePlaywright
with InvisiblePlaywright(seed=42, binary_path=firefox_binary, humanize=False) as browser:
ctx = browser.new_context()
page = ctx.new_page()
page.goto(cross_origin_harness["parent_url"], wait_until="domcontentloaded", timeout=30_000)
page.wait_for_selector("iframe#ifr_plain", timeout=10_000)
page.wait_for_timeout(500)
urls = [f.url for f in page.frames]
assert any(cross_origin_harness["child_origin"] in (u or "") for u in urls), (
f"no frame had the child origin in its URL; page.frames urls = {urls!r}"
)
@pytest.mark.e2e
def test_cross_origin_iframe_content_frame_resolves(firefox_binary, cross_origin_harness):
"""``handle.content_frame()`` must return a Frame (not None) for every
cross-origin iframe shape we care about: plain, sandboxed, titled.
"""
from invisible_playwright import InvisiblePlaywright
with InvisiblePlaywright(seed=42, binary_path=firefox_binary, humanize=False) as browser:
ctx = browser.new_context()
page = ctx.new_page()
page.goto(cross_origin_harness["parent_url"], wait_until="domcontentloaded", timeout=30_000)
page.wait_for_selector("iframe#ifr_plain", timeout=10_000)
page.wait_for_timeout(500)
for sel in ("iframe#ifr_plain", "iframe#ifr_sandbox", "iframe#ifr_titled"):
handle = page.query_selector(sel)
assert handle is not None, f"{sel!r} not found in DOM"
cf = handle.content_frame()
assert cf is not None, f"{sel!r}: content_frame() returned None"
assert cross_origin_harness["child_origin"] in (cf.url or ""), (
f"{sel!r}: content_frame().url = {cf.url!r}, "
f"expected child origin {cross_origin_harness['child_origin']!r}"
)
@pytest.mark.e2e
def test_cross_origin_iframe_evaluate_returns_real_values(firefox_binary, cross_origin_harness):
"""``frame.evaluate()`` inside the cross-origin iframe must work.
Pre-fix: every evaluate failed with a cross-origin SOP error because
the iframe ended up with a stale/wrong execution context.
"""
from invisible_playwright import InvisiblePlaywright
with InvisiblePlaywright(seed=42, binary_path=firefox_binary, humanize=False) as browser:
ctx = browser.new_context()
page = ctx.new_page()
page.goto(cross_origin_harness["parent_url"], wait_until="domcontentloaded", timeout=30_000)
page.wait_for_selector("iframe#ifr_plain", timeout=10_000)
page.wait_for_timeout(500)
cf = page.query_selector("iframe#ifr_plain").content_frame()
assert cf is not None
href = cf.evaluate("() => location.href")
assert cross_origin_harness["child_origin"] in href
title = cf.evaluate("() => document.title")
assert isinstance(title, str)
n_buttons = cf.evaluate("() => document.querySelectorAll('button').length")
assert n_buttons == 2
@pytest.mark.e2e
def test_cross_origin_iframe_frame_locator_resolves_button(firefox_binary, cross_origin_harness):
"""``frame_locator(...).locator(...)`` must reach the button inside the iframe."""
from invisible_playwright import InvisiblePlaywright
with InvisiblePlaywright(seed=42, binary_path=firefox_binary, humanize=False) as browser:
ctx = browser.new_context()
page = ctx.new_page()
page.goto(cross_origin_harness["parent_url"], wait_until="domcontentloaded", timeout=30_000)
page.wait_for_selector("iframe#ifr_plain", timeout=10_000)
for selector in ("button#ok", "button.btn-primary"):
cnt = page.frame_locator("iframe#ifr_plain").locator(selector).count()
assert cnt == 1, f"locator({selector!r}) found {cnt} elements (expected 1)"
@pytest.mark.e2e
def test_cross_origin_iframe_dispatch_event_click_works(firefox_binary, cross_origin_harness):
"""End-to-end interaction via ``dispatch_event`` must succeed.
Plain ``.click()`` can trip Playwright's actionability heuristic on
some third-party UIs (the same happens on vanilla Playwright Firefox, so
it is not our regression), but ``dispatch_event('click')`` always works
once the iframe is reachable.
"""
from invisible_playwright import InvisiblePlaywright
with InvisiblePlaywright(seed=42, binary_path=firefox_binary, humanize=False) as browser:
ctx = browser.new_context()
page = ctx.new_page()
page.goto(cross_origin_harness["parent_url"], wait_until="domcontentloaded", timeout=30_000)
page.wait_for_selector("iframe#ifr_plain", timeout=10_000)
page.frame_locator("iframe#ifr_plain").locator("button#ok").dispatch_event(
"click", timeout=4_000
)
cf = page.query_selector("iframe#ifr_plain").content_frame()
assert cf.evaluate("() => document.title") == "clicked"

294
tests/test_detectors_e2e.py Normal file

@ -0,0 +1,294 @@
"""E2E: run the REAL open-source detectors against the patched binary, on CI.
Instead of our own hand-rolled signal checks, this loads the actual detection
libraries and uses their FULL API surface:
* BotD (@fingerprintjs/botd, MIT): the client-side bot detector that
  FingerprintJS Pro itself uses. We assert the aggregate verdict
  (``detect().bot == False``) AND that every one of its ~18 individual detectors
  (``getDetections()``) returns ``bot == False``.
* FingerprintJS open-source (MIT): ``get()`` must return a ``visitorId``
  that is STABLE across two fresh launches with the same seed, and a RICH
  component set (the fingerprint surface is real, not a stub).
* fpscanner (antoinevastel/fpscanner 1.0.6, MIT): ``collectFingerprint()``
  runs ~21 bot-detection rules in the browser. We assert that the **engine-agnostic**
  subset (webdriver / selenium / bot-UA / platform / timezone / language) is
  clean. We deliberately do NOT assert the Chrome/GPU-only rules (hasCDP,
  hasPlaywright, hasSwiftshaderRenderer, hasMissingChromeObject, ...): they're
  trivially clean on Firefox, and the GPU ones can legitimately fire on a
  software-WebGL CI host (Xvfb/llvmpipe), so asserting them would false-red.
* CreepJS (abrahamjuliot/creepjs, MIT, pinned): the gold-standard Firefox-aware
  headless/stealth/lie detector. It exposes its result on ``window.Fingerprint``.
  We assert ``headlessRating == 0`` (webdriver + headless-UA tells) and that the
  JS-proxy stealth tells are absent. ``stealthRating`` / ``totalLies`` /
  ``likeHeadlessRating`` are LOGGED, not hard-asserted, because some of their
  sub-signals (hasBadWebGL, prefers-light-color) are GPU/theme-sensitive and
  differ on a GPU-less CI host.
Everything is hermetic: the libraries are vendored (tests/vendor/) and served
from a localhost HTTP server, with no external CDN call. For CreepJS, every non-local
request is aborted, so its optional crowd-comparison POST never runs and the
verdict is computed purely locally. Runs identically on a dev box and a GH runner.
NOT covered: FingerprintJS *Pro* (commercial, server-side), which stays the local
realness gate.
"""
from __future__ import annotations
import http.server
import socketserver
import threading
from pathlib import Path
import pytest
from invisible_playwright import InvisiblePlaywright
_VENDOR = Path(__file__).parent / "vendor"
_BOTD = "botd-2.0.0.esm.js"
_FPJS = "fingerprintjs-5.2.0.umd.min.js"
_FPSCANNER = "fpscanner-1.0.6.es.js"
_CREEPJS = "creepjs-10aa672.js" # pinned abrahamjuliot/creepjs@10aa6724
# fpscanner rules that are MEANINGFUL on Firefox and GPU-independent — these must
# stay clean. The omitted rules are Chrome-only (hasCDP/hasPlaywright/
# hasMissingChromeObject/hasHighCPUCount/hasImpossibleDeviceMemory/
# headlessChromeScreenResolution) or GPU-sensitive on a software-WebGL CI host
# (hasSwiftshaderRenderer/hasGPUMismatch/hasMismatchWebGLInWorker).
_FPSCANNER_AGNOSTIC = [
"hasWebdriver", "hasWebdriverIframe", "hasWebdriverWorker", "hasWebdriverWritable",
"hasSeleniumProperty", "hasBotUserAgent", "hasPlatformMismatch",
"hasMismatchLanguages", "hasUTCTimezone", "hasMismatchPlatformIframe",
"hasMismatchPlatformWorker", "hasInconsistentEtsl",
]
_PAGE = f"""<!doctype html><html><head><meta charset="utf-8">
<title>detectors</title>
<script src="/{_FPJS}"></script>
</head><body><h1 id="state">loading</h1>
<script type="module">
window.__botd = null; window.__fp = null; window.__fps = null; window.__err = "";
(async () => {{
try {{
const Botd = await import("/{_BOTD}");
const botd = await Botd.load();
const verdict = botd.detect();
const raw = botd.getDetections() || {{}};
const detections = {{}};
for (const k in raw) detections[k] = {{ bot: raw[k].bot, botKind: raw[k].botKind || null }};
window.__botd = {{ bot: verdict.bot, botKind: verdict.botKind || null, detections }};
}} catch (e) {{ window.__err += " botd:" + e; }}
try {{
const fp = await FingerprintJS.load();
const r = await fp.get();
const keys = Object.keys(r.components || {{}});
const errored = keys.filter(k => r.components[k] && "error" in r.components[k]);
window.__fp = {{ visitorId: r.visitorId, componentKeys: keys, erroredComponents: errored }};
}} catch (e) {{ window.__err += " fp:" + e; }}
try {{
const M = await import("/{_FPSCANNER}");
const scanner = new M.default();
const fp = await scanner.collectFingerprint({{ encrypt: false }});
window.__fps = {{ fastBotDetection: fp.fastBotDetection, details: fp.fastBotDetectionDetails }};
}} catch (e) {{ window.__err += " fps:" + e; }}
document.getElementById("state").textContent = "done";
}})();
</script></body></html>"""
# CreepJS gets its own page: creep.js is a plain `defer` script that runs on load
# and populates window.Fingerprint. A minimal DOM is enough (the rich report DOM
# is only for the visual page, not the computation).
_CREEP_PAGE = f"""<!doctype html><html><head><meta charset="utf-8"><title>creep</title></head>
<body><div id="fingerprint-data"></div><script src="/{_CREEPJS}" defer></script></body></html>"""
class _DetectorSite:
"""Localhost server: `/` → BotD+FPJS+fpscanner page, `/creepjs` → CreepJS page,
`/<file>` → the vendored bundle."""
def __init__(self):
page = _PAGE.encode()
creep_page = _CREEP_PAGE.encode()
vendor = _VENDOR
class H(http.server.BaseHTTPRequestHandler):
def do_GET(self): # noqa: N802
p = self.path.split("?")[0]
if p == "/":
body, ctype = page, "text/html; charset=utf-8"
elif p == "/creepjs":
body, ctype = creep_page, "text/html; charset=utf-8"
else:
f = vendor / Path(p.lstrip("/")).name
if not f.is_file():
self.send_error(404); return
body = f.read_bytes()
ctype = "text/javascript; charset=utf-8"
self.send_response(200)
self.send_header("Content-Type", ctype)
self.send_header("Content-Length", str(len(body)))
self.end_headers()
self.wfile.write(body)
def log_message(self, *a):
pass
self._srv = socketserver.TCPServer(("127.0.0.1", 0), H)
self.port = self._srv.server_address[1]
threading.Thread(target=self._srv.serve_forever, daemon=True).start()
@property
def url(self):
return f"http://127.0.0.1:{self.port}/"
@property
def creep_url(self):
return f"http://127.0.0.1:{self.port}/creepjs"
def close(self):
self._srv.shutdown()
@pytest.fixture(scope="module")
def detector_site():
s = _DetectorSite()
yield s
s.close()
def _run_detectors(firefox_binary, url):
"""Launch the binary, load the page, return (botd, fp, fps, err)."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
page.goto(url, wait_until="load", timeout=45000)
page.wait_for_function(
"() => document.getElementById('state').textContent === 'done'",
timeout=45000,
)
botd = page.evaluate("() => window.__botd")
fp = page.evaluate("() => window.__fp")
fps = page.evaluate("() => window.__fps")
err = page.evaluate("() => window.__err")
return botd, fp, fps, err
def _run_creepjs(firefox_binary, creep_url):
"""Launch the binary, run CreepJS fully offline, return its headless result."""
_EV = """() => {
const f = window.Fingerprint;
if (!f || !f.headless) return { ready: false };
const h = f.headless;
return {
ready: true,
headlessRating: h.headlessRating,
stealthRating: h.stealthRating,
likeHeadlessRating: h.likeHeadlessRating,
headless: h.headless || {},
stealth: h.stealth || {},
totalLies: (f.lies && f.lies.totalLies) || 0,
};
}"""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
# truly offline: abort every non-loopback request (CreepJS's optional
# crowd-comparison POST to arh.antoinevastel.com never runs).
page.route(
"**/*",
lambda r: r.abort() if "127.0.0.1" not in r.request.url else r.continue_(),
)
page.goto(creep_url, wait_until="domcontentloaded", timeout=45000)
page.wait_for_function(
"() => !!(window.Fingerprint && window.Fingerprint.headless)",
timeout=60000,
)
return page.evaluate(_EV)
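The route handler above decides what to abort with a substring test on the request URL, which would also let through a remote URL that merely contains `127.0.0.1` in its path or query. A stricter check can be sketched as follows. This is an illustration under assumptions, not the project's code: `is_loopback_url` is a hypothetical helper name, and treating non-HTTP URLs without a host (such as `about:blank`) as non-local is a design choice made here for simplicity.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """Return True only when the URL's host is 'localhost' or a loopback IP."""
    host = urlsplit(url).hostname
    if host is None:
        # No network host at all (e.g. about:blank): not a loopback request.
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (a DNS name): treat it as non-local.
        return False
```

With such a helper, the route callback could read `lambda r: r.continue_() if is_loopback_url(r.request.url) else r.abort()`, using the same `abort()` / `continue_()` calls as the test above.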
@pytest.mark.e2e
def test_botd_no_detector_flags_automation(firefox_binary, detector_site):
"""The real BotD must not flag the build — aggregate AND every one of its
individual detectors (webDriver/userAgent/appVersion/plugins/process/...)."""
botd, _fp, _fps, err = _run_detectors(firefox_binary, detector_site.url)
assert botd is not None, f"BotD produced no result (err:{err!r})"
assert botd.get("bot") is False, (
f"BotD aggregate flagged a bot: botKind={botd.get('botKind')!r}"
)
detections = botd.get("detections") or {}
assert detections, f"BotD getDetections() returned nothing (err:{err!r})"
flagged = {k: v.get("botKind") for k, v in detections.items() if v.get("bot")}
assert not flagged, f"BotD individual detectors flagged automation: {flagged}"
@pytest.mark.e2e
def test_fingerprintjs_visitorid_stable_across_launches(firefox_binary, detector_site):
"""FingerprintJS visitorId must be present and identical across two fresh
launches with the same seed: a real browser is stable; an over-randomized
spoof drifts (and a drifting fingerprint is itself a bot tell)."""
_b1, fp1, _f1, err1 = _run_detectors(firefox_binary, detector_site.url)
_b2, fp2, _f2, err2 = _run_detectors(firefox_binary, detector_site.url)
assert fp1 and fp1.get("visitorId"), f"no visitorId on run 1 (err:{err1!r})"
assert fp2 and fp2.get("visitorId"), f"no visitorId on run 2 (err:{err2!r})"
assert fp1["visitorId"] == fp2["visitorId"], (
f"FingerprintJS visitorId drifted across launches: "
f"{fp1['visitorId']!r} != {fp2['visitorId']!r} (per-session entropy = bot tell)"
)
@pytest.mark.e2e
def test_fingerprintjs_collects_rich_fingerprint(firefox_binary, detector_site):
"""FingerprintJS must collect a RICH component surface (a real browser
exposes many signals; a stripped/blocked surface is itself suspicious)."""
_b, fp, _f, err = _run_detectors(firefox_binary, detector_site.url)
assert fp and fp.get("visitorId"), f"FingerprintJS produced no id (err:{err!r})"
keys = fp.get("componentKeys") or []
assert len(keys) >= 15, (
f"FingerprintJS collected only {len(keys)} components — surface too thin "
f"(suppressed signals are themselves a tell): {keys}"
)
@pytest.mark.e2e
def test_fpscanner_no_automation_rules(firefox_binary, detector_site):
"""fpscanner's engine-agnostic bot rules (webdriver/selenium/bot-UA/platform/
timezone/language) must all be clean. The Chrome/GPU-only rules are ignored
on purpose (see module docstring): they false-red on a software-WebGL host."""
_b, _fp, fps, err = _run_detectors(firefox_binary, detector_site.url)
assert fps is not None, f"fpscanner produced no result (err:{err!r})"
details = fps.get("details") or {}
assert details, f"fpscanner returned no detection details (err:{err!r})"
flagged = [
k for k in _FPSCANNER_AGNOSTIC
if details.get(k) and details[k].get("detected")
]
assert not flagged, (
f"fpscanner flagged automation on engine-agnostic rules: {flagged} "
f"(full details: { {k: v for k, v in details.items() if v.get('detected')} })"
)
@pytest.mark.e2e
def test_creepjs_headless_and_proxy_clean(firefox_binary, detector_site):
"""CreepJS (Firefox-aware) must see no headless tell and no JS-proxy stealth
tell. ``headlessRating`` aggregates webDriverIsOn + headless-UA checks (all
GPU-independent). The proxy/runtime stealth sub-signals (hasIframeProxy,
hasToStringProxy, hasBadChromeRuntime) must be false: a spoof implemented
with a JS Proxy is exactly what CreepJS catches. stealthRating/totalLies/
likeHeadlessRating are GPU/theme-sensitive, so we log them, not assert."""
r = _run_creepjs(firefox_binary, detector_site.creep_url)
assert r and r.get("ready"), f"CreepJS never populated window.Fingerprint: {r!r}"
print(
f"[creepjs] headlessRating={r['headlessRating']} stealthRating={r['stealthRating']} "
f"likeHeadlessRating={r['likeHeadlessRating']} totalLies={r['totalLies']} "
f"headless={r['headless']} stealth={r['stealth']}"
)
assert r["headlessRating"] == 0, (
f"CreepJS headless tells fired: headless={r['headless']} "
f"(headlessRating={r['headlessRating']})"
)
stealth = r.get("stealth") or {}
proxy_tells = {
k: stealth.get(k)
for k in ("hasIframeProxy", "hasToStringProxy", "hasBadChromeRuntime")
if stealth.get(k)
}
assert not proxy_tells, f"CreepJS JS-proxy stealth tells fired: {proxy_tells}"


@ -1,15 +1,28 @@
 import hashlib
+import io
+import tarfile
 from pathlib import Path
 import pytest
+import requests
 import responses
-from stealthfox.download import ensure_binary
-from stealthfox.constants import BINARY_VERSION
+from invisible_playwright.constants import BINARY_VERSION, RELEASE_URL_TEMPLATE
+from invisible_playwright.download import (
+    _download_file,
+    _extract,
+    _github_token,
+    _parse_checksums,
+    _parse_owner_repo,
+    _resolve_asset_url,
+    _sha256_file,
+    cache_dir_for_version,
+    cache_root,
+    ensure_binary,
+)
 def _make_zip(path: Path, inner_name: str, payload: bytes) -> bytes:
-    import io
     import zipfile
     buf = io.BytesIO()
     with zipfile.ZipFile(buf, "w") as zf:
@ -19,20 +32,32 @@ def _make_zip(path: Path, inner_name: str, payload: bytes) -> bytes:
     return data
+def _make_targz(path: Path, inner_name: str, payload: bytes) -> bytes:
+    buf = io.BytesIO()
+    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
+        info = tarfile.TarInfo(name=inner_name)
+        info.size = len(payload)
+        tf.addfile(info, io.BytesIO(payload))
+    data = buf.getvalue()
+    path.write_bytes(data)
+    return data
+@pytest.mark.unit
 @responses.activate
 def test_ensure_binary_downloads_and_verifies(tmp_path, monkeypatch):
     """Full path: cache miss -> HTTP GET -> SHA256 check -> extract -> return path."""
     cache = tmp_path / "cache"
-    monkeypatch.setattr("stealthfox.download.cache_root", lambda: cache)
+    monkeypatch.setattr("invisible_playwright.download.cache_root", lambda: cache)
     archive_path = tmp_path / "archive.zip"
     archive_bytes = _make_zip(archive_path, "firefox.exe", b"PEX!")
     archive_sha = hashlib.sha256(archive_bytes).hexdigest()
-    from stealthfox.constants import ARCHIVE_NAME
+    from invisible_playwright.constants import ARCHIVE_NAME
     asset = ARCHIVE_NAME("win32", "AMD64")
-    url_archive = f"https://github.com/feder-cr/stealthfox/releases/download/{BINARY_VERSION}/{asset}"
-    url_sums = f"https://github.com/feder-cr/stealthfox/releases/download/{BINARY_VERSION}/checksums.txt"
+    url_archive = f"https://github.com/feder-cr/invisible_playwright/releases/download/{BINARY_VERSION}/{asset}"
+    url_sums = f"https://github.com/feder-cr/invisible_playwright/releases/download/{BINARY_VERSION}/checksums.txt"
     responses.add(responses.GET, url_archive, body=archive_bytes, status=200,
                   content_type="application/zip")
@ -48,18 +73,19 @@ def test_ensure_binary_downloads_and_verifies(tmp_path, monkeypatch):
     assert Path(path).name == "firefox.exe"
+@pytest.mark.unit
 @responses.activate
 def test_ensure_binary_rejects_sha_mismatch(tmp_path, monkeypatch):
     cache = tmp_path / "cache"
-    monkeypatch.setattr("stealthfox.download.cache_root", lambda: cache)
+    monkeypatch.setattr("invisible_playwright.download.cache_root", lambda: cache)
     archive_path = tmp_path / "archive.zip"
     archive_bytes = _make_zip(archive_path, "firefox.exe", b"PEX!")
     wrong_sha = "0" * 64
-    from stealthfox.constants import ARCHIVE_NAME
+    from invisible_playwright.constants import ARCHIVE_NAME
     asset = ARCHIVE_NAME("win32", "AMD64")
-    url_archive = f"https://github.com/feder-cr/stealthfox/releases/download/{BINARY_VERSION}/{asset}"
-    url_sums = f"https://github.com/feder-cr/stealthfox/releases/download/{BINARY_VERSION}/checksums.txt"
+    url_archive = f"https://github.com/feder-cr/invisible_playwright/releases/download/{BINARY_VERSION}/{asset}"
+    url_sums = f"https://github.com/feder-cr/invisible_playwright/releases/download/{BINARY_VERSION}/checksums.txt"
     responses.add(responses.GET, url_archive, body=archive_bytes, status=200)
     responses.add(responses.GET, url_sums, body=f"{wrong_sha} {asset}\n", status=200)
@ -69,3 +95,748 @@ def test_ensure_binary_rejects_sha_mismatch(tmp_path, monkeypatch):
     with pytest.raises(RuntimeError, match="SHA256"):
         ensure_binary()
# DL1: cache hit returns cached path without HTTP call
@pytest.mark.unit
def test_ensure_binary_cache_hit_skips_http(tmp_path, monkeypatch):
"""When the binary already exists in cache, ensure_binary returns immediately
without issuing any HTTP request."""
cache = tmp_path / "cache"
version_dir = cache / BINARY_VERSION
version_dir.mkdir(parents=True)
pre_cached = version_dir / "firefox.exe"
pre_cached.write_text("cached-content")
monkeypatch.setattr("invisible_playwright.download.cache_root", lambda: cache)
monkeypatch.setattr("sys.platform", "win32")
import platform
monkeypatch.setattr(platform, "machine", lambda: "AMD64")
def _fail_get(*args, **kwargs):
raise AssertionError("HTTP must not be called on cache hit")
monkeypatch.setattr("invisible_playwright.download.requests.get", _fail_get)
path = ensure_binary()
assert path == pre_cached
assert path.read_text() == "cached-content"
# DL2: .tar.gz extraction works
@pytest.mark.unit
def test_extract_tar_gz(tmp_path):
"""_extract handles .tar.gz archives and unpacks the inner files."""
archive = tmp_path / "bundle.tar.gz"
_make_targz(archive, "firefox", b"ELF!")
dst = tmp_path / "out"
_extract(archive, dst)
assert (dst / "firefox").exists()
assert (dst / "firefox").read_bytes() == b"ELF!"
# DL3: checksum line with comment (#) is skipped
@pytest.mark.unit
def test_parse_checksums_skips_comments_and_blanks():
text = (
"# this is a comment\n"
"\n"
" # indented comment\n"
"abc123 file1.zip\n"
"def456 file2.tar.gz\n"
)
out = _parse_checksums(text)
assert out == {"file1.zip": "abc123", "file2.tar.gz": "def456"}
# DL3 sibling: malformed lines (fewer than 2 fields) are silently ignored
@pytest.mark.unit
def test_parse_checksums_ignores_single_field_lines():
text = "loner\nabc123 file.zip\n"
out = _parse_checksums(text)
assert out == {"file.zip": "abc123"}
# DL3 sibling: last field is treated as filename (supports trailing whitespace tokens)
@pytest.mark.unit
def test_parse_checksums_uses_last_token_as_filename():
text = "abc123 some/nested/file.zip\n"
out = _parse_checksums(text)
assert "some/nested/file.zip" in out
# DL3 regression — issue #15 (LostBoxArt).
# In binary mode, GNU coreutils `sha256sum -b` (and `shasum -b`) prints each
# filename with a leading `*`: "hash *filename". The parser used parts[-1]
# verbatim, so the key became "*filename" and lookups by bare filename returned
# None, raising `RuntimeError: no SHA256 for {asset}` on every first-time fetch.
@pytest.mark.unit
def test_parse_checksums_strips_star_prefix_binary_mode():
"""`sha256sum -b` format (binary mode, which marks each filename with `*`)."""
text = "abc123 *firefox.tar.gz\n"
out = _parse_checksums(text)
assert out == {"firefox.tar.gz": "abc123"}, (
"binary-mode '*' prefix must be stripped from the filename key"
)
@pytest.mark.unit
def test_parse_checksums_handles_mixed_binary_and_text_mode():
"""A single checksums.txt with one binary-mode line and one text-mode line.
Both keys must be normalized (no `*` prefix) so consumers can use the bare
filename as the lookup key regardless of how each line was produced."""
text = (
"aaa111 *firefox-win.zip\n"
"bbb222 firefox-linux.tar.gz\n"
)
out = _parse_checksums(text)
assert out == {"firefox-win.zip": "aaa111", "firefox-linux.tar.gz": "bbb222"}
@pytest.mark.unit
def test_parse_checksums_handles_multiple_leading_stars():
"""`.lstrip("*")` strips any run of leading asterisks. Not a real sha256sum
format, but a defensive check that guarantees no `*` survives in any key."""
text = "abc123 **doubled.zip\n"
out = _parse_checksums(text)
assert "doubled.zip" in out
assert "**doubled.zip" not in out
@pytest.mark.unit
def test_parse_checksums_handles_crlf_line_endings():
"""sha256sum.exe on Windows writes CRLF. The .strip() on each line should
consume the \\r so the key doesn't end up as 'firefox.zip\\r'."""
text = "abc123 *firefox.zip\r\ndef456 other.tar.gz\r\n"
out = _parse_checksums(text)
assert out == {"firefox.zip": "abc123", "other.tar.gz": "def456"}
@pytest.mark.unit
def test_parse_checksums_handles_utf8_bom_at_start():
"""Some Windows tools prepend a UTF-8 BOM. The first line shouldn't be lost."""
text = "\ufeffabc123 *firefox.zip\n"
out = _parse_checksums(text)
# The BOM stays attached to the hash field as a non-fatal artifact;
# what matters is that the FILENAME key is parsed and normalized.
keys = list(out.keys())
assert "firefox.zip" in keys, f"BOM caused first line to be lost: keys={keys}"
@pytest.mark.unit
def test_parse_checksums_handles_indented_lines():
"""Leading whitespace on a data line must not break parsing."""
text = " abc123 *indented.zip\n"
out = _parse_checksums(text)
assert out == {"indented.zip": "abc123"}
@pytest.mark.unit
def test_parse_checksums_handles_trailing_whitespace():
"""Trailing spaces on a line shouldn't end up in the key."""
text = "abc123 *trailing.zip \n"
out = _parse_checksums(text)
# After .strip() the trailing spaces are gone, so the key is clean
assert out == {"trailing.zip": "abc123"}
@pytest.mark.unit
def test_parse_checksums_real_world_sha256sum_b_output(tmp_path):
"""End-to-end: invoke the actual `sha256sum` (or its Python equivalent)
on a real file and verify the parser handles that output verbatim.
We can't depend on sha256sum being on PATH on Windows, so we synthesize
the exact byte sequence that GNU coreutils 9.x produces."""
fake_archive = tmp_path / "release.tar.gz"
fake_archive.write_bytes(b"some content")
sha = hashlib.sha256(fake_archive.read_bytes()).hexdigest()
# Exact format coreutils prints in binary mode (`-b`):
# "<hash><SP>*<filename>\n"
coreutils_output = f"{sha} *{fake_archive.name}\n"
out = _parse_checksums(coreutils_output)
assert out == {"release.tar.gz": sha}
@pytest.mark.unit
def test_parse_checksums_text_mode_two_space_separator():
"""`sha256sum --text` format uses two spaces. Must also parse cleanly
and the key must be identical to the binary-mode case."""
text = "abc123  textmode.zip\n"
out = _parse_checksums(text)
assert out == {"textmode.zip": "abc123"}
@pytest.mark.unit
def test_parse_checksums_empty_file_returns_empty_dict():
assert _parse_checksums("") == {}
assert _parse_checksums("\n\n\n") == {}
assert _parse_checksums(" \n\t\n") == {}
@pytest.mark.unit
def test_parse_checksums_all_comment_file_returns_empty_dict():
"""A file with only comments shouldn't crash and shouldn't produce keys."""
text = "# generated by release script\n# 2026-05-20\n"
assert _parse_checksums(text) == {}
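Taken together, the tests above pin down a fairly small contract for the checksum parser: skip blanks and comments, ignore single-field lines, take the first token as the digest and the last as the filename, and strip any leading `*`. A minimal sketch that satisfies that contract is shown below. It is reconstructed from the assertions only; the real `_parse_checksums` in `invisible_playwright.download` may be written differently, and `parse_checksums` here is a hypothetical stand-in name.

```python
from typing import Dict


def parse_checksums(text: str) -> Dict[str, str]:
    """Map filename -> hex digest for sha256sum-style text (illustrative sketch)."""
    out: Dict[str, str] = {}
    for raw in text.splitlines():      # splitlines() also handles CRLF endings
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                   # blank line or comment
        parts = line.split()
        if len(parts) < 2:
            continue                   # malformed: no filename field
        # Binary-mode output prefixes the filename with '*'; drop it so the
        # bare filename can be used as the lookup key.
        out[parts[-1].lstrip("*")] = parts[0]
    return out
```

Because `str.strip()` does not treat a UTF-8 BOM as whitespace, a BOM at the start of the file stays attached to the first digest in this sketch, which matches the behaviour the BOM test describes.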
# DL3 regression — full integration via ensure_binary: confirm the parser
# bug from #15 cannot regress when the live release format is mimicked exactly.
@pytest.mark.unit
@responses.activate
def test_ensure_binary_accepts_binary_mode_checksums(tmp_path, monkeypatch):
"""Reproduce the EXACT format the GitHub release ships:
<sha> *<filename>
Before the #15 fix this raised
RuntimeError: no SHA256 for {asset} in checksums.txt
even though the asset and SHA were both present."""
cache = tmp_path / "cache"
monkeypatch.setattr("invisible_playwright.download.cache_root", lambda: cache)
archive_path = tmp_path / "archive.zip"
archive_bytes = _make_zip(archive_path, "firefox.exe", b"PEX!")
archive_sha = hashlib.sha256(archive_bytes).hexdigest()
from invisible_playwright.constants import ARCHIVE_NAME
asset = ARCHIVE_NAME("win32", "AMD64")
url_archive = (
f"https://github.com/feder-cr/invisible_playwright/releases/download/"
f"{BINARY_VERSION}/{asset}"
)
url_sums = (
f"https://github.com/feder-cr/invisible_playwright/releases/download/"
f"{BINARY_VERSION}/checksums.txt"
)
responses.add(responses.GET, url_archive, body=archive_bytes, status=200,
content_type="application/zip")
# Binary-mode format (note the `*`): regression sentinel for #15.
responses.add(
responses.GET, url_sums,
body=f"{archive_sha} *{asset}\n",
status=200,
)
# Force the platform branch the test mocks:
monkeypatch.setattr("sys.platform", "win32")
out = ensure_binary()
# No RuntimeError means the parser accepted the `*`-prefixed key.
assert out.exists()
# DL4: unknown archive format (.rar) raises RuntimeError
@pytest.mark.unit
def test_extract_unknown_format_raises(tmp_path):
archive = tmp_path / "thing.rar"
archive.write_bytes(b"not-a-real-rar")
dst = tmp_path / "out"
with pytest.raises(RuntimeError, match="unknown archive format"):
_extract(archive, dst)
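The two extraction tests (DL2 and DL4) imply a dispatch on the archive's file extension, with a `RuntimeError` for anything unrecognised. One way such a dispatcher could look is sketched here. `extract_archive` is a hypothetical name, the `.tgz` alias is an assumption, and the `filter="data"` argument is applied only where the running Python's `tarfile` module supports extraction filters.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_archive(archive: Path, dst: Path) -> None:
    """Unpack a .zip or .tar.gz archive into dst; reject other formats."""
    name = archive.name.lower()
    if name.endswith(".zip"):
        dst.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dst)
    elif name.endswith((".tar.gz", ".tgz")):
        dst.mkdir(parents=True, exist_ok=True)
        with tarfile.open(archive, mode="r:gz") as tf:
            if hasattr(tarfile, "data_filter"):
                # Newer Pythons: refuse absolute paths, links outside dst, etc.
                tf.extractall(dst, filter="data")
            else:
                tf.extractall(dst)
    else:
        raise RuntimeError(f"unknown archive format: {archive.name}")
```

In the flow the tests describe, extraction runs only after the SHA256 check has passed, so the archive contents are already tied to the published checksum by the time this code sees them.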
# DL5: binary not found after extraction raises RuntimeError
@pytest.mark.unit
@responses.activate
def test_ensure_binary_missing_entry_after_extract_raises(tmp_path, monkeypatch):
"""If the archive extracts cleanly but the expected entry isn't present,
ensure_binary raises RuntimeError."""
cache = tmp_path / "cache"
monkeypatch.setattr("invisible_playwright.download.cache_root", lambda: cache)
archive_path = tmp_path / "archive.zip"
# zip without firefox.exe inside
archive_bytes = _make_zip(archive_path, "other.bin", b"X")
archive_sha = hashlib.sha256(archive_bytes).hexdigest()
from invisible_playwright.constants import ARCHIVE_NAME
asset = ARCHIVE_NAME("win32", "AMD64")
url_archive = f"https://github.com/feder-cr/invisible_playwright/releases/download/{BINARY_VERSION}/{asset}"
url_sums = f"https://github.com/feder-cr/invisible_playwright/releases/download/{BINARY_VERSION}/checksums.txt"
responses.add(responses.GET, url_archive, body=archive_bytes, status=200)
responses.add(responses.GET, url_sums, body=f"{archive_sha} {asset}\n", status=200)
monkeypatch.setattr("sys.platform", "win32")
import platform
monkeypatch.setattr(platform, "machine", lambda: "AMD64")
with pytest.raises(RuntimeError, match="binary not found after extraction"):
ensure_binary()
# Pure helper: _parse_owner_repo
@pytest.mark.unit
def test_parse_owner_repo_valid():
owner, repo = _parse_owner_repo(
"https://github.com/feder-cr/invisible_playwright/releases/download/x/y"
)
assert owner == "feder-cr"
assert repo == "invisible_playwright"
@pytest.mark.unit
def test_parse_owner_repo_invalid_raises():
with pytest.raises(RuntimeError, match="cannot parse owner/repo"):
_parse_owner_repo("not-a-github-url")
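For reference, the owner/repo extraction that these two tests exercise can be expressed with a single anchored regular expression. The sketch below is an assumption about the approach, not the library's implementation; `parse_owner_repo` is a hypothetical name and the error text simply mirrors the `match=` string used in the test.

```python
import re
from typing import Tuple

# Anchored at the start so arbitrary strings containing "github.com" do not match.
_GITHUB_URL = re.compile(r"^https://github\.com/([^/]+)/([^/]+)(?:/|$)")


def parse_owner_repo(url: str) -> Tuple[str, str]:
    """Return (owner, repo) from a https://github.com/<owner>/<repo>/... URL."""
    match = _GITHUB_URL.match(url)
    if match is None:
        raise RuntimeError(f"cannot parse owner/repo from {url!r}")
    return match.group(1), match.group(2)
```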
# Pure helper: _sha256_file matches hashlib output
@pytest.mark.unit
def test_sha256_file_matches_hashlib(tmp_path):
payload = b"hello world"
f = tmp_path / "file.bin"
f.write_bytes(payload)
expected = hashlib.sha256(payload).hexdigest()
assert _sha256_file(f) == expected
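A file hasher that agrees with `hashlib.sha256(payload).hexdigest()` without loading a large archive into memory usually reads in fixed-size chunks. The following is a small sketch of that technique; `sha256_file` and the 1 MiB default chunk size are illustrative choices, not taken from the project.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file incrementally and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter(callable, sentinel) keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```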
# _github_token precedence: STEALTHFOX_GITHUB_TOKEN beats GITHUB_TOKEN
@pytest.mark.unit
def test_github_token_stealthfox_wins(monkeypatch):
monkeypatch.setenv("STEALTHFOX_GITHUB_TOKEN", "stealth")
monkeypatch.setenv("GITHUB_TOKEN", "generic")
assert _github_token() == "stealth"
@pytest.mark.unit
def test_github_token_falls_back_to_github_token(monkeypatch):
monkeypatch.delenv("STEALTHFOX_GITHUB_TOKEN", raising=False)
monkeypatch.setenv("GITHUB_TOKEN", "generic")
assert _github_token() == "generic"
@pytest.mark.unit
def test_github_token_none_when_unset(monkeypatch):
monkeypatch.delenv("STEALTHFOX_GITHUB_TOKEN", raising=False)
monkeypatch.delenv("GITHUB_TOKEN", raising=False)
assert _github_token() is None
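The three token tests describe a simple precedence rule: `STEALTHFOX_GITHUB_TOKEN` first, then `GITHUB_TOKEN`, otherwise `None`. A sketch of that lookup follows. The `env` parameter is added here only so the function can be exercised without touching the real environment, and treating an empty string as unset is an assumption the tests above do not cover.

```python
import os
from typing import Mapping, Optional


def github_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty token, preferring the project-specific variable."""
    source = os.environ if env is None else env
    for name in ("STEALTHFOX_GITHUB_TOKEN", "GITHUB_TOKEN"):
        value = source.get(name)
        if value:
            return value
    return None
```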
# Bonus coverage: unsupported platform raises NotImplementedError before any HTTP
@pytest.mark.unit
def test_ensure_binary_unsupported_platform_raises(monkeypatch):
monkeypatch.setattr("sys.platform", "freebsd") # win32/linux/darwin are supported
import platform
monkeypatch.setattr(platform, "machine", lambda: "AMD64")
with pytest.raises(NotImplementedError, match="unsupported platform"):
ensure_binary()
# ──────────────────────────────────────────────────────────────────────
# Linux platform tests — exercise the tar.gz extraction path. Mirrors
# the Windows .zip tests above so both archive formats are covered.
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
@responses.activate
def test_ensure_binary_downloads_and_verifies_linux(tmp_path, monkeypatch):
"""Linux happy path: tar.gz download → SHA256 check → extract → return path."""
cache = tmp_path / "cache"
monkeypatch.setattr("invisible_playwright.download.cache_root", lambda: cache)
archive_path = tmp_path / "archive.tar.gz"
archive_bytes = _make_targz(archive_path, "firefox", b"ELF!")
archive_sha = hashlib.sha256(archive_bytes).hexdigest()
from invisible_playwright.constants import ARCHIVE_NAME
asset = ARCHIVE_NAME("linux", "x86_64")
url_archive = f"https://github.com/feder-cr/invisible_playwright/releases/download/{BINARY_VERSION}/{asset}"
url_sums = f"https://github.com/feder-cr/invisible_playwright/releases/download/{BINARY_VERSION}/checksums.txt"
responses.add(responses.GET, url_archive, body=archive_bytes, status=200,
content_type="application/gzip")
responses.add(responses.GET, url_sums,
body=f"{archive_sha} {asset}\n", status=200)
monkeypatch.setattr("sys.platform", "linux")
import platform
monkeypatch.setattr(platform, "machine", lambda: "x86_64")
path = ensure_binary()
assert Path(path).exists()
assert Path(path).name == "firefox"
@pytest.mark.unit
@responses.activate
def test_ensure_binary_rejects_sha_mismatch_linux(tmp_path, monkeypatch):
"""Linux SHA mismatch must raise — the tar.gz path runs the same
verifier as the .zip path, so a corrupted archive is rejected before
extraction regardless of platform."""
cache = tmp_path / "cache"
monkeypatch.setattr("invisible_playwright.download.cache_root", lambda: cache)
archive_path = tmp_path / "archive.tar.gz"
archive_bytes = _make_targz(archive_path, "firefox", b"ELF!")
wrong_sha = "0" * 64
from invisible_playwright.constants import ARCHIVE_NAME
asset = ARCHIVE_NAME("linux", "x86_64")
url_archive = f"https://github.com/feder-cr/invisible_playwright/releases/download/{BINARY_VERSION}/{asset}"
url_sums = f"https://github.com/feder-cr/invisible_playwright/releases/download/{BINARY_VERSION}/checksums.txt"
responses.add(responses.GET, url_archive, body=archive_bytes, status=200)
responses.add(responses.GET, url_sums, body=f"{wrong_sha} {asset}\n", status=200)
monkeypatch.setattr("sys.platform", "linux")
import platform
monkeypatch.setattr(platform, "machine", lambda: "x86_64")
with pytest.raises(RuntimeError, match="SHA256"):
ensure_binary()
@pytest.mark.unit
def test_ensure_binary_cache_hit_skips_http_linux(tmp_path, monkeypatch):
"""Linux cache hit short-circuits before any HTTP. Looks for the
``firefox`` entry (not ``firefox.exe``) per ``BINARY_ENTRY_REL``."""
cache = tmp_path / "cache"
version_dir = cache / BINARY_VERSION
version_dir.mkdir(parents=True)
pre_cached = version_dir / "firefox"
pre_cached.write_text("cached-content")
monkeypatch.setattr("invisible_playwright.download.cache_root", lambda: cache)
monkeypatch.setattr("sys.platform", "linux")
import platform
monkeypatch.setattr(platform, "machine", lambda: "x86_64")
def _fail_get(*args, **kwargs):
raise AssertionError("HTTP must not be called on cache hit")
monkeypatch.setattr("invisible_playwright.download.requests.get", _fail_get)
path = ensure_binary()
assert path == pre_cached
assert path.read_text() == "cached-content"
@pytest.mark.unit
@responses.activate
def test_ensure_binary_missing_entry_after_extract_raises_linux(tmp_path, monkeypatch):
"""Linux post-extract sanity check: if the tar.gz lacks a ``firefox``
entry, raise rather than returning a non-existent path. Mirrors the
Windows test and guards against an upstream release artifact regression."""
cache = tmp_path / "cache"
monkeypatch.setattr("invisible_playwright.download.cache_root", lambda: cache)
archive_path = tmp_path / "archive.tar.gz"
# tar.gz without ``firefox`` inside
archive_bytes = _make_targz(archive_path, "other.bin", b"X")
archive_sha = hashlib.sha256(archive_bytes).hexdigest()
from invisible_playwright.constants import ARCHIVE_NAME
asset = ARCHIVE_NAME("linux", "x86_64")
url_archive = f"https://github.com/feder-cr/invisible_playwright/releases/download/{BINARY_VERSION}/{asset}"
url_sums = f"https://github.com/feder-cr/invisible_playwright/releases/download/{BINARY_VERSION}/checksums.txt"
responses.add(responses.GET, url_archive, body=archive_bytes, status=200)
responses.add(responses.GET, url_sums, body=f"{archive_sha} {asset}\n", status=200)
monkeypatch.setattr("sys.platform", "linux")
import platform
monkeypatch.setattr(platform, "machine", lambda: "x86_64")
with pytest.raises(RuntimeError, match="binary not found after extraction"):
ensure_binary()
# ========================================================================== #
# _resolve_asset_url — public-repo direct URL vs private-repo API resolution
# ========================================================================== #
# This function chooses between two code paths based on whether a GitHub
# token is set. Both paths produce a downloadable URL but via different
# mechanisms, and a regression here would surface as 404 / 403 / wrong
# binary downloaded.
@pytest.mark.unit
def test_resolve_asset_url_public_returns_direct_url(monkeypatch):
"""No token → return the direct releases/download URL verbatim."""
monkeypatch.delenv("STEALTHFOX_GITHUB_TOKEN", raising=False)
monkeypatch.delenv("GITHUB_TOKEN", raising=False)
url = _resolve_asset_url("firefox-4", "thing.zip")
assert url == RELEASE_URL_TEMPLATE.format(tag="firefox-4", asset="thing.zip")
assert "api.github.com" not in url # public path must skip the API
@pytest.mark.unit
def test_resolve_asset_url_public_url_format_is_stable(monkeypatch):
"""The exact URL shape is what GitHub clients have learned to cache.
Changing it without bumping BINARY_VERSION would 404 on first fetch
for every existing user, so guard against accidental drift."""
monkeypatch.delenv("STEALTHFOX_GITHUB_TOKEN", raising=False)
monkeypatch.delenv("GITHUB_TOKEN", raising=False)
url = _resolve_asset_url("firefox-4", "abc.tar.gz")
assert url == (
"https://github.com/feder-cr/invisible_playwright/releases/"
"download/firefox-4/abc.tar.gz"
)
@pytest.mark.unit
@responses.activate
def test_resolve_asset_url_private_uses_api_with_token(monkeypatch):
"""Token set → hit the API and return the asset.url (which 302s with
Accept: application/octet-stream). The direct release URL would 404
for a private repo even with the token in headers."""
monkeypatch.setenv("STEALTHFOX_GITHUB_TOKEN", "ghp_fake")
monkeypatch.delenv("GITHUB_TOKEN", raising=False)
api_url = (
"https://api.github.com/repos/feder-cr/invisible_playwright"
"/releases/tags/firefox-4"
)
responses.add(
responses.GET, api_url,
json={"assets": [
{"name": "other.zip", "url": "https://api.github.com/.../1"},
{"name": "wanted.zip", "url": "https://api.github.com/.../2"},
]},
status=200,
)
url = _resolve_asset_url("firefox-4", "wanted.zip")
assert url == "https://api.github.com/.../2"
@pytest.mark.unit
@responses.activate
def test_resolve_asset_url_private_raises_when_asset_missing(monkeypatch):
"""If the asset name isn't on the release, raise — better to fail fast
with the asset name in the message than to download something else."""
monkeypatch.setenv("STEALTHFOX_GITHUB_TOKEN", "ghp_fake")
api_url = (
"https://api.github.com/repos/feder-cr/invisible_playwright"
"/releases/tags/firefox-4"
)
responses.add(
responses.GET, api_url,
json={"assets": [{"name": "other.zip", "url": "x"}]},
status=200,
)
with pytest.raises(RuntimeError, match="not-here.zip"):
_resolve_asset_url("firefox-4", "not-here.zip")
@pytest.mark.unit
@responses.activate
def test_resolve_asset_url_private_propagates_api_4xx(monkeypatch):
"""If the API returns 404 (release doesn't exist) or 401 (bad token),
don't swallow it silently — raise so the user sees the real reason."""
monkeypatch.setenv("STEALTHFOX_GITHUB_TOKEN", "ghp_fake")
api_url = (
"https://api.github.com/repos/feder-cr/invisible_playwright"
"/releases/tags/firefox-99"
)
responses.add(responses.GET, api_url, status=404)
with pytest.raises(requests.HTTPError):
_resolve_asset_url("firefox-99", "thing.zip")
@pytest.mark.unit
@responses.activate
def test_resolve_asset_url_private_sends_auth_header(monkeypatch):
"""The API call MUST include `Authorization: token <ghp_...>`, otherwise
a private repo returns 404 and the user thinks the release is missing."""
monkeypatch.setenv("STEALTHFOX_GITHUB_TOKEN", "ghp_secret")
api_url = (
"https://api.github.com/repos/feder-cr/invisible_playwright"
"/releases/tags/firefox-4"
)
captured = {}
def callback(request):
captured["auth"] = request.headers.get("Authorization")
return (200, {}, '{"assets":[{"name":"x.zip","url":"https://x/y"}]}')
responses.add_callback(responses.GET, api_url, callback=callback,
content_type="application/json")
_resolve_asset_url("firefox-4", "x.zip")
assert captured["auth"] == "token ghp_secret"
# ========================================================================== #
# _download_file — file streaming + error propagation
# ========================================================================== #
@pytest.mark.unit
@responses.activate
def test_download_file_writes_full_payload_to_disk(tmp_path):
"""A 200 OK returns the full body; the file on disk matches byte-for-byte."""
url = "https://example.com/some-large.bin"
payload = bytes(range(256)) * 1024 # 256 KB, varied bytes
responses.add(responses.GET, url, body=payload, status=200)
dst = tmp_path / "downloaded.bin"
_download_file(url, dst)
assert dst.exists()
assert dst.read_bytes() == payload
@pytest.mark.unit
@responses.activate
def test_download_file_creates_parent_directories(tmp_path):
"""The dst's parent may not exist yet — _download_file is expected to
mkdir -p before writing. Without this, the first fetch on a clean
machine raises FileNotFoundError because the cache dir doesn't exist."""
url = "https://example.com/x.bin"
responses.add(responses.GET, url, body=b"data", status=200)
deep = tmp_path / "a" / "b" / "c" / "x.bin"
_download_file(url, deep)
assert deep.exists()
assert deep.read_bytes() == b"data"
@pytest.mark.unit
@responses.activate
def test_download_file_propagates_http_404(tmp_path):
"""404s from the CDN must raise — silent 404 → empty file → SHA mismatch
is a much worse failure mode."""
url = "https://example.com/missing.bin"
responses.add(responses.GET, url, status=404)
with pytest.raises(requests.HTTPError):
_download_file(url, tmp_path / "out.bin")
@pytest.mark.unit
@responses.activate
def test_download_file_propagates_http_500(tmp_path):
"""Server errors must surface, not be swallowed as 'empty download'."""
url = "https://example.com/broken.bin"
responses.add(responses.GET, url, status=500)
with pytest.raises(requests.HTTPError):
_download_file(url, tmp_path / "out.bin")
@pytest.mark.unit
@responses.activate
def test_download_file_adds_auth_for_api_urls(monkeypatch, tmp_path):
"""When downloading from api.github.com (private-repo flow), the
request MUST include `Authorization: token <...>` and
`Accept: application/octet-stream`; otherwise the API returns the
asset JSON instead of the binary."""
monkeypatch.setenv("STEALTHFOX_GITHUB_TOKEN", "ghp_secret")
url = "https://api.github.com/repos/x/y/releases/assets/123"
captured = {}
def callback(request):
captured["auth"] = request.headers.get("Authorization")
captured["accept"] = request.headers.get("Accept")
return (200, {}, b"BIN!")
responses.add_callback(responses.GET, url, callback=callback)
_download_file(url, tmp_path / "out.bin")
assert captured["auth"] == "token ghp_secret"
assert captured["accept"] == "application/octet-stream"
@pytest.mark.unit
@responses.activate
def test_download_file_does_not_send_auth_for_non_api_urls(monkeypatch, tmp_path):
"""Public-repo flow hits github.com/.../releases/download/... directly.
Sending an auth header to that URL is unnecessary and would leak the
token in CDN access logs."""
monkeypatch.setenv("STEALTHFOX_GITHUB_TOKEN", "ghp_secret")
url = "https://github.com/feder-cr/invisible_playwright/releases/download/firefox-4/x.zip"
captured = {}
def callback(request):
captured["auth"] = request.headers.get("Authorization")
return (200, {}, b"BIN!")
responses.add_callback(responses.GET, url, callback=callback)
_download_file(url, tmp_path / "out.bin")
assert captured["auth"] is None, (
"Auth header leaked to a public CDN URL — would expose the token "
"in GitHub's access logs."
)
# ========================================================================== #
# cache_root + cache_dir_for_version — path resolution
# ========================================================================== #
@pytest.mark.unit
def test_cache_root_returns_path():
"""Must return a Path, not a string — downstream code uses .mkdir() etc."""
p = cache_root()
assert isinstance(p, Path)
@pytest.mark.unit
def test_cache_root_contains_package_name():
"""The cache dir should be identifiable as ours so users can `rm -rf`
it without nuking other tools' caches."""
p = cache_root()
assert "invisible-playwright" in str(p).lower()
@pytest.mark.unit
def test_cache_dir_for_version_appends_version_segment():
"""Each binary version gets its own subdir so multiple versions can
coexist (useful for downgrade / A-B testing)."""
p = cache_dir_for_version("firefox-99")
assert p.name == "firefox-99"
assert p.parent == cache_root()
@pytest.mark.unit
def test_cache_dir_for_version_defaults_to_current_binary_version():
"""No-arg call uses the pinned BINARY_VERSION."""
p = cache_dir_for_version()
assert p.name == BINARY_VERSION
@pytest.mark.unit
def test_cache_dir_isolation_between_versions():
"""firefox-3 and firefox-4 must NEVER share a directory — extraction
would clobber one with the other and break downgrade."""
a = cache_dir_for_version("firefox-3")
b = cache_dir_for_version("firefox-4")
assert a != b
assert a.parent == b.parent # but they share the same root
# ========================================================================== #
# _parse_owner_repo — more edge cases
# ========================================================================== #
@pytest.mark.unit
def test_parse_owner_repo_extracts_from_canonical_template():
"""Must work against the exact template stored in constants.py."""
owner, repo = _parse_owner_repo(RELEASE_URL_TEMPLATE)
assert owner and repo # something extracted
assert "/" not in owner and "/" not in repo # no slashes in either segment
@pytest.mark.unit
@pytest.mark.parametrize("bad_template", [
"http://github.com/x/y/releases/", # http, not https
"https://gitlab.com/x/y/releases/", # wrong host
"https://github.com/onlyone/releases/", # missing repo segment
"", # empty
"github.com/x/y/releases/", # missing scheme
])
def test_parse_owner_repo_rejects_malformed_urls(bad_template):
"""Any URL that doesn't match the canonical shape must raise — silent
None/empty extraction would build broken API URLs and confuse the user."""
with pytest.raises(RuntimeError, match="cannot parse"):
_parse_owner_repo(bad_template)
@pytest.mark.unit
def test_parse_owner_repo_handles_repos_with_dashes_and_underscores():
"""Repo names with -, _, . are valid on GitHub; the regex must accept them."""
owner, repo = _parse_owner_repo(
"https://github.com/my-org/my_cool.repo/releases/download/x/y.zip"
)
assert owner == "my-org"
assert repo == "my_cool.repo"
@pytest.mark.unit
def test_ensure_binary_refuses_known_broken_version():
"""A known-broken release (firefox-8, no juggler) must be refused with a
clear error BEFORE any download, never silently handed to the user."""
with pytest.raises(RuntimeError, match="known-broken"):
ensure_binary("firefox-8")
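Review note: the SHA-mismatch tests in this file serve a `checksums.txt` body of the form `<sha256> <asset>` and expect a `RuntimeError` mentioning "SHA256" before extraction. A minimal sketch of that verification step follows. The helper names `parse_checksums` and `verify_sha256` are hypothetical and are not taken from `invisible_playwright.download`; the real module may structure this differently.

```python
import hashlib


def parse_checksums(text: str) -> dict[str, str]:
    """Map asset name -> lowercase hex digest from `<sha256> <name>` lines."""
    table: dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, name = parts
        # sha256sum marks binary-mode entries with a leading '*'
        table[name.lstrip("*")] = digest.lower()
    return table


def verify_sha256(payload: bytes, asset: str, checksums_text: str) -> None:
    """Raise RuntimeError when the payload digest differs from the listed one."""
    expected = parse_checksums(checksums_text).get(asset)
    if expected is None:
        raise RuntimeError(f"no checksum listed for {asset}")
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected:
        raise RuntimeError(
            f"SHA256 mismatch for {asset}: expected {expected}, got {actual}"
        )
```

Verifying the in-memory bytes (or the downloaded file) before unpacking is what lets the `.zip` and `.tar.gz` paths share one check, as the Linux mismatch test's docstring describes.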

219
tests/test_e2e.py Normal file
View file

@ -0,0 +1,219 @@
"""E2E tests for the launcher lifecycle.
Tests requiring the patched Firefox binary are gated behind the
``firefox_binary`` fixture, which skips the test cleanly when the
binary is not cached locally and cannot be downloaded (e.g. no
network or no release token). The constructor-only tests (seed
handling) do not need a binary and always run.
"""
from __future__ import annotations
import pytest
from invisible_playwright import InvisiblePlaywright
# ────────────────────────────────────────────────────────────────────
# Constructor-only tests (no browser launch required)
# ────────────────────────────────────────────────────────────────────
@pytest.mark.e2e
def test_e3_seed_is_accessible():
"""E3: explicit seed is stored on the instance after construction."""
ip = InvisiblePlaywright(seed=42)
assert ip.seed == 42
@pytest.mark.e2e
def test_e4_random_seed_when_none():
"""E4: omitting seed → a fresh positive int31 is chosen."""
ip = InvisiblePlaywright()
assert isinstance(ip.seed, int)
assert ip.seed > 0
assert ip.seed < 2**31
@pytest.mark.e2e
def test_e4b_random_seed_varies_across_instances():
"""E4 extension: two no-seed instances pick different seeds with
overwhelming probability. ``secrets.randbits(31)`` collisions are
~1 in 2 billion, so we accept the negligible flake risk."""
seeds = {InvisiblePlaywright().seed for _ in range(5)}
assert len(seeds) > 1
@pytest.mark.e2e
def test_e6_profile_built_eagerly():
"""The constructor materializes the Profile up front so seed-driven
fields are accessible without launching a browser. Guards against
a regression where Profile generation is deferred into ``__enter__``
and an invalid pin therefore raises only at launch time.
"""
ip = InvisiblePlaywright(seed=42)
assert ip._profile is not None
assert ip._profile.seed == 42
@pytest.mark.e2e
def test_e7_invalid_pin_raises_in_constructor():
"""Invalid pin keys fail fast at construction, not at __enter__."""
with pytest.raises(ValueError):
InvisiblePlaywright(seed=42, pin={"not_a_real_field": 1})
# ────────────────────────────────────────────────────────────────────
# Lifecycle tests (require Firefox binary)
# ────────────────────────────────────────────────────────────────────
@pytest.mark.e2e
def test_e1_sync_context_manager_lifecycle(firefox_binary):
"""E1: ``with InvisiblePlaywright(...) as browser`` yields a real
Playwright Browser object that exposes ``new_context``."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
assert browser is not None
assert hasattr(browser, "new_context")
assert callable(browser.new_context)
@pytest.mark.e2e
def test_e2_create_context_and_page(firefox_binary):
"""E2: a context spawned from the patched browser can create a page."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
ctx = browser.new_context()
try:
page = ctx.new_page()
assert page is not None
assert hasattr(page, "goto")
finally:
ctx.close()
@pytest.mark.e2e
def test_e5_teardown_does_not_raise(firefox_binary):
"""E5: ``__exit__`` cleans up Playwright + virtual display without raising."""
ip = InvisiblePlaywright(seed=42, binary_path=firefox_binary)
browser = ip.__enter__()
try:
assert browser is not None
finally:
ip.__exit__(None, None, None)
# second teardown is idempotent
ip.__exit__(None, None, None)
@pytest.mark.e2e
def test_e8_new_context_defaults_from_profile(firefox_binary):
"""new_context() without kwargs should inherit profile-derived
viewport/screen. Guards the monkey-patch installed in __enter__."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
ctx = browser.new_context()
try:
page = ctx.new_page()
vp = page.viewport_size
assert vp is not None
assert vp["width"] > 0
assert vp["height"] > 0
finally:
ctx.close()
# ────────────────────────────────────────────────────────────────────
# Linux-specific lifecycle tests (no Firefox binary required).
#
# These exercise the launcher's Linux code paths without spawning real
# Firefox or Xvfb. They monkeypatch ``sys.platform`` and (where needed)
# the ``make_virtual_display`` dispatcher so the tests run on any host,
# including Windows hosts that run the production CI for this repo.
# ────────────────────────────────────────────────────────────────────
@pytest.mark.e2e
def test_e9_linux_build_prefs_omits_windows_sandbox_key(monkeypatch):
"""E9: ``_build_prefs(headless=True)`` on Linux must pass
``virtual_display=False`` to the prefs translator. The Win32-only
``security.sandbox.gpu.level`` workaround targets the alt-desktop
GPU sandbox bug and MUST NOT leak into Linux prefs, where Xvfb
handles window hiding instead."""
import sys as _sys
monkeypatch.setattr(_sys, "platform", "linux")
ip = InvisiblePlaywright(seed=42, headless=True)
prefs = ip._build_prefs()
assert "security.sandbox.gpu.level" not in prefs
@pytest.mark.e2e
def test_e10_linux_resolve_headless_invokes_xvfb_dispatcher(monkeypatch):
"""E10: ``_resolve_headless`` with ``headless=True`` on Linux must
call ``make_virtual_display().start()`` and store the result on
``self._virtual_display``. We stub the dispatcher so no real Xvfb
is spawned; the dispatcher's platform routing is covered separately
in ``test_headless.py``."""
import sys as _sys
monkeypatch.setattr(_sys, "platform", "linux")
events: list[str] = []
class _FakeDisplay:
def start(self) -> None:
events.append("start")
def stop(self) -> None:
events.append("stop")
from invisible_playwright import launcher as _l
monkeypatch.setattr(_l, "make_virtual_display", lambda: _FakeDisplay())
ip = InvisiblePlaywright(seed=42, headless=True)
result = ip._resolve_headless()
assert result is False
assert events == ["start"]
assert ip._virtual_display is not None
@pytest.mark.e2e
def test_e11_linux_teardown_stops_virtual_display_and_is_idempotent(monkeypatch):
"""E11: ``_teardown`` stops the Linux virtual display, clears the
reference, and a second invocation is a no-op. Guards the cleanup
path used by ``__exit__`` so a failed ``__enter__`` cannot leak Xvfb."""
import sys as _sys
monkeypatch.setattr(_sys, "platform", "linux")
stops: list[bool] = []
class _FakeDisplay:
def start(self) -> None:
pass
def stop(self) -> None:
stops.append(True)
from invisible_playwright import launcher as _l
monkeypatch.setattr(_l, "make_virtual_display", lambda: _FakeDisplay())
ip = InvisiblePlaywright(seed=42, headless=True)
ip._resolve_headless()
ip._teardown()
assert stops == [True]
assert ip._virtual_display is None
ip._teardown()
assert stops == [True]
@pytest.mark.e2e
def test_e12_linux_resolve_headless_without_xvfb_raises_clear_error(monkeypatch):
"""E12: On Linux with ``headless=True`` and ``Xvfb`` missing from
``PATH``, ``_resolve_headless`` must surface a clear, actionable
``RuntimeError`` instead of a cryptic FileNotFoundError. Verifies
the early-check path in ``_LinuxVirtualDisplay.start``."""
import sys as _sys
monkeypatch.setattr(_sys, "platform", "linux")
from invisible_playwright import _headless as _h
monkeypatch.setattr(_h, "_binary_on_path", lambda name: False)
ip = InvisiblePlaywright(seed=42, headless=True)
with pytest.raises(RuntimeError, match="Xvfb"):
ip._resolve_headless()
assert ip._virtual_display is None
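Review note: E5 and E11 both depend on teardown being idempotent, meaning the display is stopped once, the reference is cleared, and a second call does nothing. A minimal sketch of that pattern is below. `SketchLauncher` and `RecordingDisplay` are hypothetical stand-ins written for illustration; they are not the launcher's actual classes.

```python
class RecordingDisplay:
    """Stand-in for a virtual display that counts stop() calls."""

    def __init__(self) -> None:
        self.stops = 0

    def start(self) -> None:
        pass

    def stop(self) -> None:
        self.stops += 1


class SketchLauncher:
    """Hypothetical launcher that owns an optional virtual display."""

    def __init__(self) -> None:
        self._virtual_display = None

    def start_display(self, display) -> None:
        display.start()
        self._virtual_display = display

    def teardown(self) -> None:
        # Clear the reference first so a repeated or re-entrant call
        # sees None and becomes a no-op.
        display, self._virtual_display = self._virtual_display, None
        if display is not None:
            display.stop()
```

Swapping the reference out before calling `stop()` also means an exception raised by `stop()` cannot leave a stale display attached to the launcher.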

View file

@ -0,0 +1,510 @@
"""Fingerprint consistency / lie-detection tests.
Complementary to test_fingerprint_surface.py: those tests ask "do you
look like a real browser?" — these ask "are your fingerprint surfaces
INTERNALLY CONSISTENT?"
Anti-bot systems catch spoofers not by checking each signal in
isolation but by cross-checking related signals. If you spoof UA to
"Windows" but leave navigator.platform as "Linux x86_64", or you spoof
WebGL renderer in the main thread but not in a Web Worker, the
inconsistency proves the spoof is fake.
Sources studied (all FOSS, MIT-licensed):
- creepjs/src/lies/index.ts: the canonical lie detector
- creepjs/src/worker/index.ts: main-vs-worker scope cross-check
- creepjs/src/math/index.ts: Math.x(p) deterministic equality
- creepjs/src/navigator/index.ts: UA/platform/oscpu invariants
- niespodd/browser-fingerprinting README: worker hwConcurrency,
plugin chain, perf.timeOrigin
Everything runs against `about:blank` with NO network and NO proxy.
Run only this file:
pytest tests/test_fingerprint_consistency.py -m e2e -v
"""
from __future__ import annotations
import pytest
from invisible_playwright import InvisiblePlaywright
PIN = {
"screen.width": 1920,
"screen.height": 1080,
"screen.avail_width": 1920,
"screen.avail_height": 1040,
"screen.dpr": 1.0,
"hardware.concurrency": 8,
"audio.sample_rate": 48000,
"audio.max_channel_count": 2,
}
@pytest.fixture(scope="module")
def page(firefox_binary):
with InvisiblePlaywright(
seed=42,
pin=PIN,
binary_path=firefox_binary,
headless=True,
) as browser:
ctx = browser.new_context()
p = ctx.new_page()
p.goto("about:blank", timeout=30_000)
yield p
def _ev(page, expr):
return page.evaluate(expr)
# ===========================================================================
# 1. Math determinism — same input MUST yield same output
# Source: creepjs/src/math/index.ts
# A wrapper that adds noise to Math.* (canvas-spoofing prefs) exposes
# itself here: two consecutive calls with the same input must be
# byte-identical.
# ===========================================================================
@pytest.mark.e2e
@pytest.mark.parametrize("fn,arg", [
("cos", "1e308"),
("acos", "0.5"),
("asin", "0.5"),
("atan", "Math.PI"),
("atanh", "0.5"),
("cbrt", "Math.PI"),
("cosh", "Math.PI"),
("exp", "Math.PI"),
("expm1", "Math.PI"),
("log", "Math.PI"),
("log1p", "Math.PI"),
("log10", "Math.PI"),
("sin", "Math.PI"),
("sinh", "Math.PI"),
("sqrt", "Math.PI"),
("tan", "Math.PI"),
("tanh", "Math.PI"),
])
def test_math_determinism(page, fn, arg):
"""Math.<fn>(<arg>) must return the same value across 100 calls."""
first, last, all_equal = _ev(page, f"""() => {{
const r = [];
for (let i = 0; i < 100; i++) r.push(Math.{fn}({arg}));
return [r[0], r[99], r.every(x => Object.is(x, r[0]))];
}}""")
assert all_equal, (
f"Math.{fn}({arg}) drifts across calls: first={first}, last={last}"
)
@pytest.mark.e2e
def test_math_pow_two_arg_determinism(page):
ok = _ev(page, """() => {
const a = Math.pow(Math.PI, 2);
for (let i = 0; i < 50; i++) {
if (!Object.is(Math.pow(Math.PI, 2), a)) return false;
}
return true;
}""")
assert ok
# ===========================================================================
# 2. Worker scope vs main thread — navigator properties MUST agree
# Source: creepjs/src/worker/index.ts
# ===========================================================================
def _worker_navigator_dict(page, props):
expr = """async (props) => {
const code = `
self.onmessage = (e) => {
const out = {};
for (const p of e.data) {
try { out[p] = self.navigator[p]; }
catch (err) { out[p] = '<error: ' + err.message + '>'; }
}
if (out.languages && Array.isArray(out.languages)) {
out.languages = [...out.languages];
}
self.postMessage(out);
};
`;
const blob = new Blob([code], { type: 'application/javascript' });
const url = URL.createObjectURL(blob);
const worker = new Worker(url);
try {
const result = await new Promise((resolve, reject) => {
worker.onmessage = (e) => resolve(e.data);
worker.onerror = (e) => reject(new Error(e.message));
worker.postMessage(props);
setTimeout(() => reject(new Error('worker timeout')), 5000);
});
return result;
} finally {
worker.terminate();
URL.revokeObjectURL(url);
}
}"""
return page.evaluate(expr, list(props))
@pytest.mark.e2e
def test_worker_userAgent_matches_main(page):
main = _ev(page, "navigator.userAgent")
worker = _worker_navigator_dict(page, ("userAgent",))
assert worker["userAgent"] == main, (
f"UA drift main vs worker:\n main: {main!r}\n worker: {worker['userAgent']!r}"
)
@pytest.mark.e2e
def test_worker_hardwareConcurrency_matches_main(page):
main = _ev(page, "navigator.hardwareConcurrency")
worker = _worker_navigator_dict(page, ("hardwareConcurrency",))
assert worker["hardwareConcurrency"] == main
@pytest.mark.e2e
def test_worker_language_matches_main(page):
main = _ev(page, "navigator.language")
worker = _worker_navigator_dict(page, ("language",))
assert worker["language"] == main
@pytest.mark.e2e
def test_worker_languages_matches_main(page):
main = _ev(page, "[...navigator.languages]")
worker = _worker_navigator_dict(page, ("languages",))
assert list(worker["languages"]) == list(main)
@pytest.mark.e2e
def test_worker_platform_matches_main(page):
main = _ev(page, "navigator.platform")
worker = _worker_navigator_dict(page, ("platform",))
assert worker["platform"] == main
# ===========================================================================
# 3. Iframe scope vs window scope
# Source: creepjs/src/lies/index.ts (getBehemothIframe pattern)
# ===========================================================================
def _iframe_navigator_dict(page, props):
expr = """(props) => {
const iframe = document.createElement('iframe');
iframe.style.display = 'none';
document.body.appendChild(iframe);
const out = {};
for (const p of props) {
try { out[p] = iframe.contentWindow.navigator[p]; }
catch (e) { out[p] = '<error: ' + e.message + '>'; }
}
if (Array.isArray(out.languages)) out.languages = [...out.languages];
document.body.removeChild(iframe);
return out;
}"""
return page.evaluate(expr, list(props))
@pytest.mark.e2e
def test_iframe_userAgent_matches_window(page):
main = _ev(page, "navigator.userAgent")
iframe = _iframe_navigator_dict(page, ("userAgent",))
assert iframe["userAgent"] == main
@pytest.mark.e2e
def test_iframe_language_matches_window(page):
main = _ev(page, "navigator.language")
iframe = _iframe_navigator_dict(page, ("language",))
assert iframe["language"] == main
@pytest.mark.e2e
def test_iframe_hardwareConcurrency_matches_window(page):
main = _ev(page, "navigator.hardwareConcurrency")
iframe = _iframe_navigator_dict(page, ("hardwareConcurrency",))
assert iframe["hardwareConcurrency"] == main
@pytest.mark.e2e
def test_iframe_screen_matches_window(page):
main = _ev(page, "[screen.width, screen.height]")
iframe = _ev(page, """() => {
const f = document.createElement('iframe');
f.style.display = 'none';
document.body.appendChild(f);
const v = [f.contentWindow.screen.width, f.contentWindow.screen.height];
document.body.removeChild(f);
return v;
}""")
assert iframe == main
# ===========================================================================
# 4. UA self-consistency (creepjs/src/navigator/index.ts)
# ===========================================================================
@pytest.mark.e2e
def test_navigator_platform_matches_userAgent_OS(page):
ua = _ev(page, "navigator.userAgent")
platform = _ev(page, "navigator.platform")
if "Windows" in ua:
assert "Win" in platform
elif "Mac" in ua:
assert "Mac" in platform
elif "Linux" in ua or "X11" in ua:
assert "Linux" in platform or "X11" in platform
@pytest.mark.e2e
def test_navigator_oscpu_matches_userAgent(page):
"""Firefox-only: navigator.oscpu must correlate with UA OS."""
ua = _ev(page, "navigator.userAgent")
oscpu = _ev(page, "navigator.oscpu || ''")
if not oscpu:
pytest.skip("navigator.oscpu not exposed")
if "Windows" in ua:
assert "Windows" in oscpu
elif "Linux" in ua:
assert "Linux" in oscpu
elif "Mac" in ua:
assert "Mac" in oscpu
# ===========================================================================
# 5. Native function self-toString (creepjs/src/lies/index.ts hasKnownToString)
# ===========================================================================
def _is_native_toString(text, fn_name):
"""Mirror of CreepJS hasKnownToString — accept the engine-specific
native patterns (single-line on V8, multi-line on SpiderMonkey)."""
import re as _re
name = _re.escape(fn_name)
patterns = [
rf"^function {name}\(\) \{{ \[native code\] \}}$",
rf"^function get {name}\(\) \{{ \[native code\] \}}$",
rf"^function {name}\(\) \{{[\s\S]*\[native code\][\s\S]*\}}$",
rf"^function get {name}\(\) \{{[\s\S]*\[native code\][\s\S]*\}}$",
]
return any(_re.match(p, text) for p in patterns)
@pytest.mark.e2e
@pytest.mark.parametrize("native_fn,name", [
("Function.prototype.toString", "toString"),
("Function.prototype.bind", "bind"),
("Function.prototype.call", "call"),
("Function.prototype.apply", "apply"),
("Object.getOwnPropertyDescriptor", "getOwnPropertyDescriptor"),
("Object.defineProperty", "defineProperty"),
("Array.prototype.slice", "slice"),
("JSON.stringify", "stringify"),
])
def test_native_function_self_toString_matches(page, native_fn, name):
"""Each native function's `.toString()` must match its engine's
native pattern. A Proxy wrapper or function-rewrite leaks here."""
text = _ev(page, f"{native_fn}.toString()")
assert _is_native_toString(text, name), (
f"{native_fn}.toString() not native-shape: {text!r}"
)
# ===========================================================================
# 6. AudioContext / WebGL determinism
# ===========================================================================
@pytest.mark.e2e
def test_audio_offline_context_deterministic(page):
"""OfflineAudioContext: same graph → byte-identical output."""
ok = _ev(page, """async () => {
async function render() {
const ctx = new (window.OfflineAudioContext ||
window.webkitOfflineAudioContext)(1, 5000, 44100);
const osc = ctx.createOscillator();
osc.connect(ctx.destination);
osc.start(0);
const buf = await ctx.startRendering();
return Array.from(buf.getChannelData(0).slice(0, 50));
}
const a = await render();
const b = await render();
return JSON.stringify(a) === JSON.stringify(b);
}""")
assert ok
@pytest.mark.e2e
def test_webgl_getParameter_deterministic(page):
"""WebGL parameters must not drift across reads."""
ok = _ev(page, """() => {
const c = document.createElement('canvas');
const gl = c.getContext('webgl');
if (!gl) return false;
const params = [gl.MAX_TEXTURE_SIZE, gl.MAX_VIEWPORT_DIMS,
gl.MAX_RENDERBUFFER_SIZE, gl.MAX_VERTEX_ATTRIBS];
const ref = JSON.stringify(params.map(p => gl.getParameter(p)));
for (let i = 0; i < 50; i++) {
if (JSON.stringify(params.map(p => gl.getParameter(p))) !== ref) {
return false;
}
}
return true;
}""")
assert ok
# ===========================================================================
# 7. Locale ↔ Intl cross-consistency
# ===========================================================================
@pytest.mark.e2e
def test_navigator_language_matches_Intl_locale(page):
"""navigator.language base must agree with Intl.DateTimeFormat locale."""
nav = _ev(page, "navigator.language").split("-")[0]
intl = _ev(page,
"Intl.DateTimeFormat().resolvedOptions().locale").split("-")[0]
assert nav == intl, (
f"navigator.language base={nav!r} vs Intl={intl!r}"
)
@pytest.mark.e2e
def test_navigator_language_matches_Intl_NumberFormat(page):
nav = _ev(page, "navigator.language").split("-")[0]
num = _ev(page,
"Intl.NumberFormat().resolvedOptions().locale").split("-")[0]
assert nav == num
@pytest.mark.e2e
def test_navigator_language_matches_Intl_Collator(page):
nav = _ev(page, "navigator.language").split("-")[0]
col = _ev(page,
"(new Intl.Collator()).resolvedOptions().locale").split("-")[0]
assert nav == col
# ===========================================================================
# 8. Property descriptor shape lies
# Spoofers using Object.defineProperty(navigator, prop, {value: ...})
# leave a 'value' field on the descriptor — real native props use a getter.
# ===========================================================================
_DESCRIPTOR_NATIVE_PROPS = [
"userAgent", "platform", "hardwareConcurrency", "language", "languages",
"vendor", "appVersion", "appName", "appCodeName", "doNotTrack",
"cookieEnabled", "onLine", "product", "productSub", "buildID", "oscpu",
]
@pytest.mark.e2e
@pytest.mark.parametrize("prop", _DESCRIPTOR_NATIVE_PROPS)
def test_navigator_property_descriptor_is_getter_not_value(page, prop):
"""Each spoofable navigator.* property must be defined via a native
getter, NOT Object.defineProperty(..., {value: x}). The value-field
descriptor is the lazy-spoof leak that CreepJS catches."""
has_lie = _ev(page, f"""() => {{
let proto = navigator;
let descriptor = null;
while (proto && !descriptor) {{
descriptor = Object.getOwnPropertyDescriptor(proto, {prop!r});
proto = Object.getPrototypeOf(proto);
}}
if (!descriptor) return null;
return 'value' in descriptor;
}}""")
if has_lie is None:
pytest.skip(f"navigator.{prop} not exposed")
assert has_lie is False, (
f"navigator.{prop} descriptor exposes 'value' field — lazy spoof"
)
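The probe above interpolates the property name into JavaScript with `{prop!r}`, which relies on Python's `repr` happening to produce a valid JS string literal. That holds for the plain ASCII names in `_DESCRIPTOR_NATIVE_PROPS`. As a hedged alternative, `json.dumps` produces a literal that JavaScript accepts for any string. `descriptor_probe_js` below is a hypothetical helper, not the suite's code.

```python
import json


def descriptor_probe_js(prop: str) -> str:
    """Build the descriptor probe with the property name as a JSON string literal."""
    name = json.dumps(prop)  # quoting and escaping are handled by the JSON encoder
    return (
        "() => {\n"
        "  let proto = navigator, d = null;\n"
        "  while (proto && !d) {\n"
        f"    d = Object.getOwnPropertyDescriptor(proto, {name});\n"
        "    proto = Object.getPrototypeOf(proto);\n"
        "  }\n"
        "  return d ? 'value' in d : null;\n"
        "}"
    )


print(descriptor_probe_js("userAgent"))
```

For the sixteen names currently in the list the two approaches yield equivalent JavaScript, so this only matters if the list ever gains a name containing quotes or backslashes.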
# ===========================================================================
# 9. performance.timeOrigin + monotonic
# ===========================================================================
@pytest.mark.e2e
def test_performance_timeOrigin_stable(page):
assert _ev(page,
"performance.timeOrigin === performance.timeOrigin")
@pytest.mark.e2e
def test_performance_now_monotonic(page):
ok = _ev(page, """() => {
let prev = performance.now();
for (let i = 0; i < 100; i++) {
const cur = performance.now();
if (cur < prev) return false;
prev = cur;
}
return true;
}""")
assert ok
# ===========================================================================
# 10. Window dimension invariants
# ===========================================================================
@pytest.mark.e2e
def test_window_inner_not_larger_than_outer(page):
inner, outer = _ev(page, "[window.innerWidth, window.outerWidth]")
assert inner <= outer
@pytest.mark.e2e
def test_screen_avail_not_larger_than_screen(page):
aw, w = _ev(page, "[screen.availWidth, screen.width]")
ah, h = _ev(page, "[screen.availHeight, screen.height]")
assert aw <= w and ah <= h
# ===========================================================================
# 11. Firefox UA invariants
# ===========================================================================
@pytest.mark.e2e
def test_firefox_UA_implies_empty_vendor(page):
"""Firefox: navigator.vendor === ''"""
if "Firefox" not in _ev(page, "navigator.userAgent"):
pytest.skip("Firefox-only invariant")
if "Chrome" in _ev(page, "navigator.userAgent"):
pytest.skip("Chrome+Firefox UA — likely synthetic")
assert _ev(page, "navigator.vendor") == ""
@pytest.mark.e2e
def test_firefox_appVersion_short_form(page):
"""Real Firefox's appVersion is '5.0 (Windows)' form, not the full UA."""
if "Firefox" not in _ev(page, "navigator.userAgent"):
pytest.skip("Firefox-only invariant")
av = _ev(page, "navigator.appVersion")
ua = _ev(page, "navigator.userAgent")
assert av.startswith("5.0 (")
assert len(av) < len(ua)
@pytest.mark.e2e
def test_firefox_UA_implies_appName_Netscape(page):
"""navigator.appName === 'Netscape' (historical invariant)."""
if "Firefox" not in _ev(page, "navigator.userAgent"):
pytest.skip("Firefox-only invariant")
assert _ev(page, "navigator.appName") == "Netscape"


@ -0,0 +1,311 @@
"""Fingerprint surface tests — replicate the checks performed by the canonical
anti-bot detection libraries against an OFFLINE browser session.
Each test asserts the SAME thing the upstream detector would flag. A pass
here means our patched build appears human to that detector; a fail
means a real stealth hole that anti-bot kits would exploit in production.
Detector libraries studied (all FOSS, MIT-licensed):
- github.com/fingerprintjs/BotD: 19 detectors, the most widely deployed
  client-side bot detector
- github.com/abrahamjuliot/creepjs: headless / stealth / lies modules
- github.com/fingerprintjs/fingerprintjs: canvas / audio / color / touch
  consistency
- github.com/antoinevastel/fpscanner: UA / platform / oscpu cross-checks
- bot.sannysoft.com: classic Puppeteer harness
Everything runs against `about:blank` with NO network and NO proxy. The
suite is intended to be part of the release gate: the pre-push hook runs the
default pytest selection, which skips e2e, so release runs must invoke
`pytest -m e2e` explicitly, and these tests must be green on every release.
Run only this file:
pytest tests/test_fingerprint_surface.py -m e2e -v
"""
from __future__ import annotations
import re
import sys
import pytest
from invisible_playwright import InvisiblePlaywright
# ────────────────────────────────────────────────────────────────────
# Inline PIN — a coherent mid-range Windows desktop. Not user-config:
# these specific values are what the surface tests assert against.
# Keep PIN small (only fields that JS exposes) and stable across runs.
# ────────────────────────────────────────────────────────────────────
PIN = {
"screen.width": 1920,
"screen.height": 1080,
"screen.avail_width": 1920,
"screen.avail_height": 1040,
"screen.dpr": 1.0,
"hardware.concurrency": 8,
"audio.sample_rate": 48000,
"audio.max_channel_count": 2,
}
@pytest.fixture(scope="module")
def page(firefox_binary):
"""One headless browser shared across the whole module.
~20s startup paid once, then every test runs in ~50ms."""
with InvisiblePlaywright(
seed=42,
pin=PIN,
binary_path=firefox_binary,
headless=True,
) as browser:
ctx = browser.new_context()
p = ctx.new_page()
p.goto("about:blank", timeout=30_000)
yield p
def _ev(page, expr):
return page.evaluate(expr)
# ===========================================================================
# sannysoft.com — classic Puppeteer detection harness
# ===========================================================================
@pytest.mark.e2e
def test_sannysoft_chrome_object_consistency(page):
"""Firefox UA + window.chrome present = bot-framework leak."""
if "Firefox" in _ev(page, "navigator.userAgent"):
assert not _ev(page, "typeof window.chrome !== 'undefined'")
@pytest.mark.e2e
def test_sannysoft_permissions_query_works(page):
"""navigator.permissions.query() must return a proper PermissionStatus."""
ok = _ev(page, """async () => {
if (!navigator.permissions || !navigator.permissions.query) return false;
try {
const r = await navigator.permissions.query({name: 'notifications'});
return r && typeof r.state === 'string';
} catch (e) { return false; }
}""")
assert ok
@pytest.mark.e2e
def test_sannysoft_iframe_chrome_not_leaked(page):
"""iframe.contentWindow.chrome must not leak on Firefox UA."""
if "Firefox" not in _ev(page, "navigator.userAgent"):
pytest.skip("Firefox-only invariant")
leaks = _ev(page, """() => {
const iframe = document.createElement('iframe');
iframe.style.display = 'none';
document.body.appendChild(iframe);
const is = typeof iframe.contentWindow.chrome !== 'undefined';
document.body.removeChild(iframe);
return is;
}""")
assert not leaks
@pytest.mark.e2e
def test_sannysoft_iframe_languages_not_empty(page):
"""Iframe-scope navigator.languages must have ≥1 entry."""
n = _ev(page, """() => {
const f = document.createElement('iframe');
f.style.display = 'none';
document.body.appendChild(f);
const len = f.contentWindow.navigator.languages.length;
document.body.removeChild(f);
return len;
}""")
assert n > 0
# ===========================================================================
# FingerprintJS — fingerprint surface coherence
# ===========================================================================
@pytest.mark.e2e
def test_fpjs_canvas_2d_context_returns_valid(page):
ok = _ev(page, """() => {
const c = document.createElement('canvas');
c.width = 100; c.height = 100;
const ctx = c.getContext('2d');
if (!ctx) return false;
ctx.fillText('test', 10, 10);
const data = c.toDataURL();
return data.length > 100 && data.startsWith('data:image/png;base64');
}""")
assert ok
@pytest.mark.e2e
def test_fpjs_audio_context_works(page):
ok = _ev(page, """async () => {
try {
const ctx = new (window.OfflineAudioContext ||
window.webkitOfflineAudioContext)(1, 5000, 44100);
const osc = ctx.createOscillator();
osc.connect(ctx.destination);
osc.start(0);
const buf = await ctx.startRendering();
return buf && buf.length > 0;
} catch (e) { return false; }
}""")
assert ok
@pytest.mark.e2e
def test_fpjs_color_gamut_query_works(page):
"""matchMedia('(color-gamut: ...)') must match at least srgb."""
ok = _ev(page, """matchMedia('(color-gamut: srgb)').matches ||
matchMedia('(color-gamut: p3)').matches ||
matchMedia('(color-gamut: rec2020)').matches""")
assert ok
@pytest.mark.e2e
def test_fpjs_screen_color_depth_realistic(page):
"""Atypical color depths are headless-distinctive."""
cd = _ev(page, "screen.colorDepth")
assert cd in (24, 30, 32)
# ===========================================================================
# PIN-locked surfaces (the values declared in PIN above)
# ===========================================================================
@pytest.mark.e2e
def test_pin_screen_width_lands_in_screen_object(page):
assert _ev(page, "screen.width") == PIN["screen.width"]
@pytest.mark.e2e
def test_pin_screen_height_lands_in_screen_object(page):
assert _ev(page, "screen.height") == PIN["screen.height"]
@pytest.mark.e2e
def test_pin_hardware_concurrency_lands_in_navigator(page):
assert (_ev(page, "navigator.hardwareConcurrency")
== PIN["hardware.concurrency"])
@pytest.mark.e2e
def test_pin_audio_sample_rate_lands_in_AudioContext(page):
assert _ev(page,
"(new (window.AudioContext||window.webkitAudioContext)()).sampleRate"
) == PIN["audio.sample_rate"]
@pytest.mark.e2e
def test_pin_audio_max_channels_lands_in_destination(page):
assert _ev(page,
"(new (window.AudioContext||window.webkitAudioContext)())"
".destination.maxChannelCount"
) == PIN["audio.max_channel_count"]
# ===========================================================================
# fpscanner-style cross-checks
# ===========================================================================
@pytest.mark.e2e
def test_fpscanner_ua_vs_platform_consistent(page):
"""UA OS substring must agree with navigator.platform OS substring."""
ua = _ev(page, "navigator.userAgent")
platform = _ev(page, "navigator.platform")
if "Windows" in ua:
assert "Win" in platform, f"UA Win but platform={platform!r}"
elif "Mac" in ua:
assert "Mac" in platform
elif "Linux" in ua:
assert "Linux" in platform or "X11" in platform
@pytest.mark.e2e
def test_fpscanner_no_userAgentData_on_firefox(page):
"""navigator.userAgentData is Chromium-only. Presence on Firefox UA = bot."""
if "Firefox" in _ev(page, "navigator.userAgent"):
assert not _ev(page, "'userAgentData' in navigator")
# ===========================================================================
# WebGL masking-detector guard (pixelscan getFixedRedBox / webglHash)
#
# pixelscan flags "fingerprint masking" on the WebGL readPixels output. We
# reproduce ITS probe locally (the fingerprintjs gradient triangle) and check
# the structural signature it keys on: our stealth readPixels noise MUST be a
# coherent, monotonic gamma remap (smooth, ~0 spikes), NOT isolated +-1 flips
# (which read as unnatural high-frequency noise and were flagged as masking).
# This is the CI-safe local stand-in for pixelscan's server-side check; it
# guards the gamma fix from ever silently regressing to the +-1 algorithm.
# ===========================================================================
_WEBGL_MASKING_PROBE = """() => {
const c = document.createElement('canvas');
const gl = c.getContext('webgl') || c.getContext('experimental-webgl');
if (!gl) return { error: 'no-webgl' };
const vs = 'attribute vec2 a;uniform vec2 o;varying vec2 v;' +
'void main(){v=a+o;gl_Position=vec4(a,0,1);}';
const fs = 'precision mediump float;varying vec2 v;' +
'void main(){gl_FragColor=vec4(v,0,1);}';
const buf = gl.createBuffer(); gl.bindBuffer(gl.ARRAY_BUFFER, buf);
gl.bufferData(gl.ARRAY_BUFFER,
new Float32Array([-0.2,-0.9,0, 0.4,-0.26,0, 0,0.732134444,0]), gl.STATIC_DRAW);
const p = gl.createProgram();
const s1 = gl.createShader(gl.VERTEX_SHADER); gl.shaderSource(s1, vs); gl.compileShader(s1);
const s2 = gl.createShader(gl.FRAGMENT_SHADER); gl.shaderSource(s2, fs); gl.compileShader(s2);
gl.attachShader(p, s1); gl.attachShader(p, s2); gl.linkProgram(p); gl.useProgram(p);
const loc = gl.getAttribLocation(p, 'a'); gl.enableVertexAttribArray(loc);
gl.vertexAttribPointer(loc, 3, gl.FLOAT, false, 0, 0);
const off = gl.getUniformLocation(p, 'o'); gl.uniform2f(off, 1, 1);
gl.drawArrays(gl.TRIANGLE_STRIP, 0, 3);
const w = gl.drawingBufferWidth, h = gl.drawingBufferHeight;
const px = new Uint8Array(w * h * 4);
gl.readPixels(0, 0, w, h, gl.RGBA, gl.UNSIGNED_BYTE, px);
// count small local extrema (|delta|<=3 to both horizontal neighbours, same
// sign): the +-1-noise signature; a smooth/monotonic render has ~none.
let spikes = 0;
for (let y = 0; y < h; y++) {
for (let x = 1; x < w - 1; x++) {
for (let ch = 0; ch < 3; ch++) {
const i = (y * w + x) * 4 + ch; const val = px[i];
if (val === 0) continue;
const dl = val - px[i - 4], dr = val - px[i + 4];
if (dl * dr > 0 && Math.abs(dl) <= 3 && Math.abs(dr) <= 3) spikes++;
}
}
}
return { spikes: spikes, dims: w + 'x' + h };
}"""
@pytest.mark.e2e
def test_webgl_readpixels_no_masking_signature(page):
"""Stealth WebGL readPixels noise must be a coherent gamma remap (smooth),
not isolated +-1 flips. +-1 noise on the smooth gradient triangle produced
~300+ 'spikes' and pixelscan flagged it as masking; the gamma remap leaves
the gradient smooth (~0 spikes). Regression guard for the gamma fix."""
res = _ev(page, _WEBGL_MASKING_PROBE)
if res.get("error") == "no-webgl" and sys.platform == "darwin":
pytest.skip(
"macOS CI runners expose no WebGL (no software-GL fallback); the gamma "
"readPixels remap is platform-agnostic C++ and is exercised by the Linux "
"(Xvfb/llvmpipe) and Windows (WARP) gates."
)
assert "error" not in res, f"WebGL probe failed: {res}"
# genuine / gamma -> ~0; the rejected +-1 algorithm produced ~320.
assert res["spikes"] < 30, (
f"WebGL readPixels shows {res['spikes']} high-frequency noise spikes "
f"(pixelscan-maskable); the stealth noise must be a smooth gamma remap."
)
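The probe's spike counter is easier to reason about outside the browser. The following is a pure-Python port of the same loop, run on synthetic rows rather than real `readPixels` output. `count_spikes` and `gray_row` are illustrative names, and the two sample rows are invented to show the two regimes the test distinguishes.

```python
def count_spikes(px, w, h, max_delta=3):
    """Count small local extrema along each row, per colour channel.

    px is a flat RGBA byte sequence of length w * h * 4, as readPixels returns.
    """
    spikes = 0
    for y in range(h):
        for x in range(1, w - 1):
            for ch in range(3):  # R, G, B; alpha is ignored
                i = (y * w + x) * 4 + ch
                val = px[i]
                if val == 0:
                    continue  # zero channel (e.g. background), skipped like the JS probe
                dl = val - px[i - 4]  # difference to the left neighbour
                dr = val - px[i + 4]  # difference to the right neighbour
                if dl * dr > 0 and abs(dl) <= max_delta and abs(dr) <= max_delta:
                    spikes += 1
    return spikes


def gray_row(values):
    """Pack one row of gray levels into flat RGBA bytes (alpha fixed at 255)."""
    out = []
    for v in values:
        out.extend((v, v, v, 255))
    return out


smooth = gray_row([100 + x for x in range(16)])        # monotonic ramp
noisy = gray_row([100 + (x % 2) for x in range(16)])   # alternating +-1 flips

print(count_spikes(smooth, 16, 1))  # 0
print(count_spikes(noisy, 16, 1))   # 42 = 14 interior pixels x 3 channels
```

On a monotonic ramp the left and right differences always have opposite signs, so nothing is counted; alternating flips make every interior pixel a local extremum, which is the signature the `< 30` threshold rejects.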

77
tests/test_fpforge.py Normal file

@ -0,0 +1,77 @@
"""Profile generator — seed reproducibility and basic shape."""
import pytest
from invisible_playwright._fpforge import (
Profile,
GPUProfile,
ScreenProfile,
HardwareProfile,
AudioProfile,
generate_profile,
)
def test_profile_has_expected_fields():
p = generate_profile(seed=42)
assert isinstance(p.gpu, GPUProfile)
assert isinstance(p.screen, ScreenProfile)
assert isinstance(p.hardware, HardwareProfile)
assert isinstance(p.audio, AudioProfile)
def test_same_seed_reproduces_profile():
a = generate_profile(seed=1234)
b = generate_profile(seed=1234)
assert a.gpu.renderer == b.gpu.renderer
assert a.gpu.vendor == b.gpu.vendor
assert a.screen.width == b.screen.width
assert a.screen.height == b.screen.height
assert a.hardware.concurrency == b.hardware.concurrency
def test_different_seeds_produce_different_profiles():
a = generate_profile(seed=1)
b = generate_profile(seed=999)
# Not every field needs to differ, but at least one should
diffs = [
a.gpu.renderer != b.gpu.renderer,
a.screen.width != b.screen.width,
a.hardware.concurrency != b.hardware.concurrency,
a.audio.sample_rate != b.audio.sample_rate,
]
assert any(diffs), "seeds 1 and 999 produced identical profiles across all sampled fields"
def test_screen_dimensions_are_positive_integers():
p = generate_profile(seed=42)
assert isinstance(p.screen.width, int) and p.screen.width > 0
assert isinstance(p.screen.height, int) and p.screen.height > 0
# Sanity: no larger than 8K (7680x4320), no smaller than 1024x600
assert 1024 <= p.screen.width <= 7680
assert 600 <= p.screen.height <= 4320
def test_hardware_concurrency_in_realistic_range():
p = generate_profile(seed=42)
# Real consumer hardware: 2-32 logical CPUs. Anything outside is a sampler bug.
assert 2 <= p.hardware.concurrency <= 32
def test_audio_sample_rate_is_standard():
p = generate_profile(seed=42)
# Real audio devices report one of these standard rates
assert p.audio.sample_rate in (44100, 48000, 96000)
def test_gpu_renderer_is_non_empty_string():
p = generate_profile(seed=42)
assert isinstance(p.gpu.renderer, str) and p.gpu.renderer.strip()
assert isinstance(p.gpu.vendor, str) and p.gpu.vendor.strip()
@pytest.mark.parametrize("seed", [1, 42, 100, 9999, 2**31 - 1])
def test_generation_is_stable_across_seed_range(seed):
"""No exceptions on a representative seed range."""
p = generate_profile(seed=seed)
assert p.gpu.renderer
assert p.screen.width > 0
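The reproducibility tests above depend on the seed fully determining every sampled field. One common way to get that property is to draw from a private `random.Random(seed)` instance. The sketch below assumes that approach; the real `generate_profile` body is not shown in this diff, and `ToyProfile`, `toy_profile` and the value pools are all invented for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyProfile:
    screen_width: int
    screen_height: int
    concurrency: int
    sample_rate: int


# Illustrative value pools only; not the real sampler's tables.
_SCREENS = [(1366, 768), (1920, 1080), (2560, 1440)]
_CORES = [4, 8, 12, 16]
_RATES = [44100, 48000]


def toy_profile(seed: int) -> ToyProfile:
    """Draw every field from a private RNG so the seed alone fixes the result."""
    rng = random.Random(seed)  # not the module-level RNG, which other code may advance
    width, height = rng.choice(_SCREENS)
    return ToyProfile(width, height, rng.choice(_CORES), rng.choice(_RATES))


print(toy_profile(1234) == toy_profile(1234))  # True: same seed, same profile
```

Because the dataclass is frozen and compared by value, a single equality check covers every field, where the test above compares fields one by one.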

325
tests/test_geo.py Normal file

@ -0,0 +1,325 @@
"""Unit tests for `invisible_playwright._geo` (timezone="auto" resolution).
Covers: the precedence policy (resolve_session_timezone), proxy→requests
translation, egress IP discovery (mocked HTTP), and IP→IANA mapping (mocked
mmdb). No real network or mmdb is touched.
"""
import sys
import types
import pytest
from invisible_playwright import _geo
from invisible_playwright._geo import (
GeoTimezoneError,
_proxies_for_requests,
_proxy_is_set,
discover_egress_ip,
ip_to_timezone,
prepare_session_geo,
resolve_session_timezone,
)
SOCKS = {"server": "socks5://gw.example:1080", "username": "u", "password": "p"}
HTTP = {"server": "http://gw.example:8080", "username": "u", "password": "p"}
# ──────────────────────────────────────────────────────────────────────
# _proxy_is_set
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
@pytest.mark.parametrize(
"proxy,expected",
[
(None, False),
({}, False),
({"server": ""}, False),
({"server": " "}, False),
({"server": "direct://"}, False),
({"server": "DIRECT://"}, False),
({"server": "socks5://h:1"}, True),
({"server": "http://h:8080"}, True),
],
)
def test_proxy_is_set(proxy, expected):
assert _proxy_is_set(proxy) is expected
# ──────────────────────────────────────────────────────────────────────
# _proxies_for_requests — scheme + credential translation
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_proxies_socks5_uses_socks5h_remote_dns():
out = _proxies_for_requests(SOCKS)
assert out["http"] == "socks5h://u:p@gw.example:1080"
assert out["https"] == out["http"]
@pytest.mark.unit
def test_proxies_socks4_scheme():
out = _proxies_for_requests({"server": "socks4://gw:1080"})
assert out["http"] == "socks4://gw:1080"
@pytest.mark.unit
def test_proxies_http_and_https_schemes():
assert _proxies_for_requests(HTTP)["http"] == "http://u:p@gw.example:8080"
out = _proxies_for_requests({"server": "https://gw:8443"})
assert out["https"] == "https://gw:8443"
@pytest.mark.unit
def test_proxies_no_scheme_defaults_to_http():
out = _proxies_for_requests({"server": "gw.example:3128"})
assert out["http"] == "http://gw.example:3128"
@pytest.mark.unit
def test_proxies_credentials_are_url_encoded():
out = _proxies_for_requests(
{"server": "socks5://gw:1080", "username": "user@x", "password": "p:w/d"}
)
# '@', ':' and '/' in creds must be percent-encoded so they don't break
# the proxy URL parsing.
assert "user%40x:p%3Aw%2Fd@gw:1080" in out["http"]
@pytest.mark.unit
def test_proxies_no_credentials_has_no_auth_prefix():
out = _proxies_for_requests({"server": "socks5://gw:1080"})
assert out["http"] == "socks5h://gw:1080"
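The assertions above pin down the translation rules: `socks5` becomes `socks5h` (remote DNS), a missing scheme defaults to `http`, and credentials are percent-encoded. The real `_proxies_for_requests` body is not part of this diff, so the following is only a guess that satisfies the same assertions; `proxies_for_requests_sketch` is a hypothetical name.

```python
from urllib.parse import quote


def proxies_for_requests_sketch(proxy):
    """Turn a Playwright-style proxy dict into a requests-style proxies mapping."""
    server = proxy["server"].strip()
    scheme, sep, hostport = server.partition("://")
    if not sep:  # bare host:port, no scheme given
        scheme, hostport = "http", server
    scheme = scheme.lower()
    if scheme == "socks5":
        scheme = "socks5h"  # the trailing h asks the proxy to resolve DNS
    auth = ""
    username = proxy.get("username")
    if username:
        auth = quote(username, safe="")
        password = proxy.get("password")
        if password:
            auth += ":" + quote(password, safe="")
        auth += "@"
    url = f"{scheme}://{auth}{hostport}"
    return {"http": url, "https": url}


out = proxies_for_requests_sketch(
    {"server": "socks5://gw:1080", "username": "user@x", "password": "p:w/d"}
)
print(out["http"])  # socks5h://user%40x:p%3Aw%2Fd@gw:1080
```

Passing `safe=""` to `quote` matters here: the default keeps `/` unescaped, which would leave the password `p:w/d` able to break URL parsing.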
# ──────────────────────────────────────────────────────────────────────
# discover_egress_ip — mocked requests
# ──────────────────────────────────────────────────────────────────────
class _FakeResp:
def __init__(self, text, status=200):
self.text = text
self._status = status
def raise_for_status(self):
if self._status >= 400:
raise RuntimeError(f"HTTP {self._status}")
@pytest.mark.unit
def test_discover_egress_ip_first_endpoint_wins(monkeypatch):
calls = []
def fake_get(url, **kw):
calls.append(url)
return _FakeResp("203.0.113.7\n")
monkeypatch.setattr(_geo.requests, "get", fake_get)
assert discover_egress_ip(SOCKS) == "203.0.113.7"
assert len(calls) == 1 # stopped at the first success
@pytest.mark.unit
def test_discover_egress_ip_falls_through_to_next_on_error(monkeypatch):
seq = iter([_FakeResp("junk-not-an-ip"), _FakeResp("198.51.100.42")])
def fake_get(url, **kw):
return next(seq)
monkeypatch.setattr(_geo.requests, "get", fake_get)
assert discover_egress_ip(HTTP) == "198.51.100.42"
@pytest.mark.unit
def test_discover_egress_ip_all_fail_raises(monkeypatch):
def fake_get(url, **kw):
raise OSError("connection refused")
monkeypatch.setattr(_geo.requests, "get", fake_get)
with pytest.raises(GeoTimezoneError):
discover_egress_ip(SOCKS)
@pytest.mark.unit
def test_discover_egress_ip_no_proxy_is_direct(monkeypatch):
# proxy=None → direct request, requests.get must get proxies=None.
seen = {}
def fake_get(url, **kw):
seen["proxies"] = kw.get("proxies", "MISSING")
return _FakeResp("192.0.2.55")
monkeypatch.setattr(_geo.requests, "get", fake_get)
assert discover_egress_ip(None) == "192.0.2.55"
assert seen["proxies"] is None
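The fall-through behaviour these tests exercise (skip an endpoint that errors or returns junk, stop at the first valid IP, raise if all fail) can be sketched with an injected fetch function. This is an illustration under assumptions, not the body of `discover_egress_ip`: `first_valid_ip`, the `fetch` callable and the example URLs are hypothetical, and `RuntimeError` stands in for `GeoTimezoneError`.

```python
import ipaddress


def first_valid_ip(endpoints, fetch):
    """Try each endpoint in order; return the first body that parses as an IP."""
    errors = []
    for url in endpoints:
        try:
            body = fetch(url).strip()
            ipaddress.ip_address(body)  # raises ValueError on junk
            return body
        except Exception as exc:  # network error or unparsable body: try the next one
            errors.append(f"{url}: {exc}")
    raise RuntimeError("no endpoint returned a valid IP: " + "; ".join(errors))


# Canned replies keyed by (made-up) endpoint URL; the first one returns junk.
replies = {
    "https://a.example/ip": "junk-not-an-ip",
    "https://b.example/ip": "198.51.100.42\n",
}
print(first_valid_ip(list(replies), replies.__getitem__))  # 198.51.100.42
```

Injecting `fetch` keeps the sketch offline, in the same spirit as the `monkeypatch.setattr(_geo.requests, "get", ...)` calls above.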
# ──────────────────────────────────────────────────────────────────────
# ip_to_timezone — mocked mmdb reader
# ──────────────────────────────────────────────────────────────────────
class _FakeReader:
def __init__(self, record):
self._record = record
def __enter__(self):
return self
def __exit__(self, *a):
return False
def get(self, ip):
return self._record
def _install_fake_maxminddb(monkeypatch, record):
mod = types.ModuleType("maxminddb")
mod.open_database = lambda path: _FakeReader(record)
monkeypatch.setitem(sys.modules, "maxminddb", mod)
@pytest.mark.unit
def test_ip_to_timezone_reads_location_time_zone(monkeypatch):
_install_fake_maxminddb(monkeypatch, {"location": {"time_zone": "Europe/Rome"}})
assert ip_to_timezone("1.2.3.4", "x.mmdb") == "Europe/Rome"
@pytest.mark.unit
def test_ip_to_timezone_ip_absent_raises(monkeypatch):
_install_fake_maxminddb(monkeypatch, None)
with pytest.raises(GeoTimezoneError):
ip_to_timezone("1.2.3.4", "x.mmdb")
@pytest.mark.unit
def test_ip_to_timezone_missing_zone_raises(monkeypatch):
_install_fake_maxminddb(monkeypatch, {"location": {}})
with pytest.raises(GeoTimezoneError):
ip_to_timezone("1.2.3.4", "x.mmdb")
@pytest.mark.unit
def test_ip_to_timezone_invalid_iana_raises(monkeypatch):
_install_fake_maxminddb(monkeypatch, {"location": {"time_zone": "Not/AZone"}})
with pytest.raises(GeoTimezoneError):
ip_to_timezone("1.2.3.4", "x.mmdb")
# ──────────────────────────────────────────────────────────────────────
# resolve_session_timezone — the precedence policy
# ──────────────────────────────────────────────────────────────────────
@pytest.fixture
def stub_egress(monkeypatch):
"""Make egress resolution deterministic + offline; record if it ran."""
state = {"called": False}
def fake_discover(proxy=None, **kw):
state["called"] = True
state["proxy_arg"] = proxy
return "203.0.113.7"
monkeypatch.setattr(_geo, "discover_egress_ip", fake_discover)
monkeypatch.setattr(_geo, "ip_to_timezone", lambda ip, mmdb: "America/New_York")
# ensure_geoip_mmdb is imported from .download at call time
import invisible_playwright.download as dl
monkeypatch.setattr(dl, "ensure_geoip_mmdb", lambda *a, **k: "fake.mmdb")
return state
@pytest.mark.unit
def test_resolve_explicit_iana_wins(stub_egress):
# An explicit zone wins and never triggers resolution (proxy or not).
assert resolve_session_timezone("Asia/Tokyo", SOCKS) == "Asia/Tokyo"
assert resolve_session_timezone("Asia/Tokyo", None) == "Asia/Tokyo"
assert stub_egress["called"] is False
@pytest.mark.unit
def test_resolve_empty_with_proxy_resolves_from_proxy(stub_egress):
assert resolve_session_timezone("", SOCKS) == "America/New_York"
assert stub_egress["called"] is True
assert stub_egress["proxy_arg"] == SOCKS # routed through the proxy
@pytest.mark.unit
def test_resolve_auto_with_proxy_resolves_from_proxy(stub_egress):
assert resolve_session_timezone("auto", HTTP) == "America/New_York"
assert stub_egress["proxy_arg"] == HTTP
@pytest.mark.unit
def test_resolve_empty_no_proxy_resolves_from_host(stub_egress):
# auto ALWAYS resolves — without a proxy, from the host's own public IP.
assert resolve_session_timezone("", None) == "America/New_York"
assert stub_egress["called"] is True
assert stub_egress["proxy_arg"] is None # direct request, no proxy
@pytest.mark.unit
def test_resolve_auto_no_proxy_resolves_from_host(stub_egress):
assert resolve_session_timezone("auto", None) == "America/New_York"
assert stub_egress["proxy_arg"] is None
@pytest.mark.unit
def test_resolve_direct_proxy_resolves_via_host(stub_egress):
# direct:// counts as "no proxy" → resolve from the host IP, don't skip.
assert resolve_session_timezone("auto", {"server": "direct://"}) == "America/New_York"
assert stub_egress["proxy_arg"] is None
@pytest.mark.unit
def test_resolve_no_proxy_failure_falls_back_to_host(monkeypatch):
# Without a proxy, a lookup failure must NOT break the launch → host TZ ("").
def boom(proxy=None, **kw):
raise GeoTimezoneError("offline")
monkeypatch.setattr(_geo, "discover_egress_ip", boom)
assert resolve_session_timezone("auto", None) == ""
assert resolve_session_timezone("", None) == ""
@pytest.mark.unit
def test_resolve_proxy_failure_raises(monkeypatch):
# With a proxy set, a failure must raise — never a silent host-TZ fallback.
def boom(proxy=None, **kw):
raise GeoTimezoneError("no egress")
monkeypatch.setattr(_geo, "discover_egress_ip", boom)
with pytest.raises(GeoTimezoneError):
resolve_session_timezone("auto", SOCKS)
with pytest.raises(GeoTimezoneError):
resolve_session_timezone("", SOCKS)
# ──────────────────────────────────────────────────────────────────────
# prepare_session_geo — one round-trip for BOTH timezone + the WebRTC
# egress IP. The egress feeds the srflx override (only behind a proxy).
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_prepare_geo_egress_present_behind_proxy(stub_egress):
geo = prepare_session_geo("auto", SOCKS)
assert geo.timezone == "America/New_York"
assert geo.egress_ip == "203.0.113.7" # discovered for WebRTC
assert stub_egress["proxy_arg"] == SOCKS
@pytest.mark.unit
def test_prepare_geo_egress_present_even_with_explicit_tz(stub_egress):
# explicit IANA zone still needs the egress for WebRTC behind a proxy.
geo = prepare_session_geo("Asia/Tokyo", SOCKS)
assert geo.timezone == "Asia/Tokyo"
assert geo.egress_ip == "203.0.113.7"
assert stub_egress["called"] is True
@pytest.mark.unit
def test_prepare_geo_no_egress_without_proxy(stub_egress):
# no proxy → no WebRTC override (real STUN already tells the truth).
geo = prepare_session_geo("auto", None)
assert geo.timezone == "America/New_York"
assert geo.egress_ip is None
@pytest.mark.unit
def test_prepare_geo_timezone_matches_resolve_session_timezone(stub_egress):
# the thin tz wrapper must stay equivalent to prepare_session_geo().timezone
for tz, proxy in [("Asia/Tokyo", SOCKS), ("auto", HTTP), ("", None)]:
assert prepare_session_geo(tz, proxy).timezone == resolve_session_timezone(tz, proxy)
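Taken together, the `resolve_session_timezone` tests describe a small precedence policy: an explicit zone wins without any lookup, `""` and `"auto"` always attempt a lookup, and a failed lookup is fatal only behind a proxy. The sketch below restates that policy with the lookup injected as a callable. `resolve_timezone` is a hypothetical name, and `LookupError` stands in for `GeoTimezoneError`.

```python
def resolve_timezone(requested, proxy_set, lookup):
    """Sketch of the precedence the tests above pin down.

    requested: "" / "auto" / an explicit IANA name
    proxy_set: whether a real (non-direct) proxy is configured
    lookup:    zero-argument callable doing the egress-IP to zone resolution
    """
    if requested and requested != "auto":
        return requested      # explicit zone wins, no lookup at all
    try:
        return lookup()       # auto: resolve via proxy egress or host IP
    except LookupError:
        if proxy_set:
            raise             # behind a proxy, never fall back silently
        return ""             # no proxy: keep the host timezone


def failing():
    raise LookupError("offline")


print(resolve_timezone("Asia/Tokyo", True, failing))               # Asia/Tokyo
print(repr(resolve_timezone("auto", False, failing)))              # ''
print(resolve_timezone("", True, lambda: "America/New_York"))      # America/New_York
```

The asymmetry is deliberate in the tests: a silent host-timezone fallback behind a proxy would expose the real location, while without a proxy the host timezone is already the truth.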

131
tests/test_geoip_update.py Normal file

@ -0,0 +1,131 @@
"""Unit tests for the intelligent geoip mmdb auto-update in `download.py`.
daijro/geoip-all-in-one rebuilds weekly; `ensure_geoip_mmdb` keeps the cache
fresh without a download (or API call) on every launch. These tests mock the
cache root, the latest-tag API, and the per-tag download so nothing touches the
network.
"""
import os
import time
import pytest
import invisible_playwright.download as dl
@pytest.fixture
def cache(tmp_path, monkeypatch):
"""Point the cache at tmp_path and clear the env override."""
monkeypatch.setattr(dl, "cache_root", lambda: tmp_path)
monkeypatch.delenv("STEALTHFOX_GEOIP_MMDB", raising=False)
return tmp_path
def _make_cached(root, tag, name=dl.GEOIP_MMDB_NAME):
d = root / "geoip" / tag
d.mkdir(parents=True, exist_ok=True)
f = d / name
f.write_bytes(b"FAKE-MMDB")
return f
def _set_marker_age(root, days):
m = root / "geoip" / ".last_check"
m.parent.mkdir(parents=True, exist_ok=True)
m.touch()
old = time.time() - days * 86400
os.utime(m, (old, old))
# ──────────────────────────────────────────────────────────────────────
# env override
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_env_override_returns_file(tmp_path, monkeypatch):
f = tmp_path / "mine.mmdb"
f.write_bytes(b"X")
monkeypatch.setenv("STEALTHFOX_GEOIP_MMDB", str(f))
assert dl.ensure_geoip_mmdb() == f
@pytest.mark.unit
def test_env_override_missing_raises(tmp_path, monkeypatch):
monkeypatch.setenv("STEALTHFOX_GEOIP_MMDB", str(tmp_path / "nope.mmdb"))
with pytest.raises(RuntimeError):
dl.ensure_geoip_mmdb()
# ──────────────────────────────────────────────────────────────────────
# freshness window
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_fresh_cache_no_network(cache, monkeypatch):
f = _make_cached(cache, "2026.06.03")
_set_marker_age(cache, 0) # just checked
def boom():
raise AssertionError("latest-tag API must NOT be called within the window")
monkeypatch.setattr(dl, "_latest_geoip_tag", boom)
assert dl.ensure_geoip_mmdb(max_age_days=7) == f
@pytest.mark.unit
def test_stale_same_tag_no_download(cache, monkeypatch):
f = _make_cached(cache, "2026.06.03")
_set_marker_age(cache, 30) # stale → will re-check
monkeypatch.setattr(dl, "_latest_geoip_tag", lambda: "2026.06.03")
# real _download_geoip_tag runs but target exists, so no actual download:
monkeypatch.setattr(dl, "_download_file", lambda *a, **k: (_ for _ in ()).throw(
AssertionError("must not download when tag already cached")))
assert dl.ensure_geoip_mmdb(max_age_days=7) == f
@pytest.mark.unit
def test_stale_new_tag_downloads_and_prunes(cache, monkeypatch):
old = _make_cached(cache, "2026.06.03")
_set_marker_age(cache, 30)
monkeypatch.setattr(dl, "_latest_geoip_tag", lambda: "2026.06.10")
def fake_download(tag):
return _make_cached(cache, tag) # simulate fetch+extract of the new tag
monkeypatch.setattr(dl, "_download_geoip_tag", fake_download)
got = dl.ensure_geoip_mmdb(max_age_days=7)
assert got.parent.name == "2026.06.10"
assert not old.parent.exists() # old tag pruned
assert got.exists()
# ──────────────────────────────────────────────────────────────────────
# offline resilience
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_api_down_with_cache_uses_cache(cache, monkeypatch):
f = _make_cached(cache, "2026.06.03")
_set_marker_age(cache, 30)
def boom():
raise OSError("offline")
monkeypatch.setattr(dl, "_latest_geoip_tag", boom)
assert dl.ensure_geoip_mmdb(max_age_days=7) == f # stale cache reused, no raise
@pytest.mark.unit
def test_cold_cache_api_down_falls_back_to_pinned(cache, monkeypatch):
# no cache at all + API unreachable → pinned GEOIP_MMDB_VERSION fallback.
def boom():
raise OSError("offline")
monkeypatch.setattr(dl, "_latest_geoip_tag", boom)
captured = {}
def fake_download(tag):
captured["tag"] = tag
return _make_cached(cache, tag)
monkeypatch.setattr(dl, "_download_geoip_tag", fake_download)
got = dl.ensure_geoip_mmdb(max_age_days=7)
assert captured["tag"] == dl.GEOIP_MMDB_VERSION
assert got.exists()
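The freshness-window tests rely on a `.last_check` marker whose modification time decides whether the latest-tag API is consulted. A minimal sketch of such a check follows, assuming the window is measured from the marker's mtime; `needs_recheck` is a hypothetical helper and not the logic inside `ensure_geoip_mmdb`, which this diff does not show.

```python
import os
import tempfile
import time
from pathlib import Path


def needs_recheck(marker, max_age_days, now=None):
    """True when the marker file is missing or older than the freshness window."""
    if not marker.exists():
        return True
    now = time.time() if now is None else now
    return (now - marker.stat().st_mtime) > max_age_days * 86400


with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / ".last_check"
    print(needs_recheck(marker, 7))   # True: never checked
    marker.touch()
    print(needs_recheck(marker, 7))   # False: just checked
    old = time.time() - 30 * 86400
    os.utime(marker, (old, old))      # backdate, like _set_marker_age above
    print(needs_recheck(marker, 7))   # True: 30 days old
```

Using a file's mtime as the timestamp avoids parsing any stored date and makes the state trivial to fake in tests with `os.utime`.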

124
tests/test_headless.py Normal file

@ -0,0 +1,124 @@
"""Unit tests for the ``_headless`` window-hider dispatcher.
``make_virtual_display`` is pure platform routing:
- Linux: a ``_LinuxVirtualDisplay`` (Xvfb) object the launcher start()s/stop()s.
- Windows / macOS: ``None``; the patched binary self-cloaks its chrome windows
via ``cloak_prefs()`` (injected by the launcher), so nothing host-side spawns.
- Anything else: a clear ``RuntimeError`` naming the platform.
``_LinuxVirtualDisplay`` construction does no I/O (Xvfb is only spawned in
``start()``), so it's safe to exercise on any host.
"""
from __future__ import annotations
import pytest
import invisible_playwright._headless as headless
from invisible_playwright._headless import (
CLOAK_PREFS,
_LinuxVirtualDisplay,
cloak_prefs,
make_virtual_display,
)
@pytest.mark.unit
def test_make_virtual_display_returns_none_on_win32(monkeypatch):
"""Windows hides via the in-binary cloak pref, not a host-side display."""
monkeypatch.setattr(headless.sys, "platform", "win32")
assert make_virtual_display() is None
@pytest.mark.unit
def test_make_virtual_display_returns_none_on_darwin(monkeypatch):
"""macOS is now supported — it hides via the same in-binary cloak pref."""
monkeypatch.setattr(headless.sys, "platform", "darwin")
assert make_virtual_display() is None
@pytest.mark.unit
def test_make_virtual_display_returns_linux_xvfb_on_linux(monkeypatch):
"""``__init__`` of ``_LinuxVirtualDisplay`` does no I/O — only ``start()``
spawns Xvfb. Exercising the dispatcher here is safe on any host."""
monkeypatch.setattr(headless.sys, "platform", "linux")
assert isinstance(make_virtual_display(), _LinuxVirtualDisplay)
@pytest.mark.unit
def test_make_virtual_display_accepts_linux_variants(monkeypatch):
    """``sys.platform`` was ``linux2`` on Python versions before 3.3.
    The dispatcher uses ``startswith("linux")`` to accept all variants."""
    monkeypatch.setattr(headless.sys, "platform", "linux2")
    assert isinstance(make_virtual_display(), _LinuxVirtualDisplay)
@pytest.mark.unit
def test_make_virtual_display_raises_on_unsupported_platform(monkeypatch):
monkeypatch.setattr(headless.sys, "platform", "freebsd14")
with pytest.raises(RuntimeError, match="Windows, macOS and Linux"):
make_virtual_display()
@pytest.mark.unit
def test_make_virtual_display_error_mentions_offending_platform(monkeypatch):
"""Error message should include the actual ``sys.platform`` so the
user can diagnose why their CI / weird container is being rejected."""
monkeypatch.setattr(headless.sys, "platform", "sunos5")
with pytest.raises(RuntimeError, match="sunos5"):
make_virtual_display()
@pytest.mark.unit
def test_cloak_prefs_enables_cloak_and_disables_occlusion():
"""The cloak prefs must turn on the in-binary cloak and turn OFF Windows
occlusion tracking (so a hidden window keeps painting). Returns a copy."""
p = cloak_prefs()
assert p["zoom.stealth.cloak_windows"] is True
assert p["widget.windows.window_occlusion_tracking.enabled"] is False
assert p == CLOAK_PREFS and p is not CLOAK_PREFS
# ──────────────────────────────────────────────────────────────────────
# _LinuxVirtualDisplay — construction-only smoke tests. ``start()`` is
# E2E because it spawns Xvfb; ``stop()`` is safe to call when no Xvfb
# was ever started, so we exercise that path explicitly.
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_linux_virtual_display_initial_state_is_clean():
"""Construction must not spawn Xvfb or mutate the environment — only
``start()`` does. Mirrors the Windows construction-state test."""
vd = _LinuxVirtualDisplay()
assert vd._proc is None
assert vd._display is None
assert vd._saved_env == {}
@pytest.mark.unit
def test_linux_virtual_display_geometry_default():
"""Default geometry is 1920x1080x24 — matches the profile sampler's
default screen and avoids the Xvfb default of 1280x1024 which the
fingerprint pipeline never produces."""
vd = _LinuxVirtualDisplay()
assert vd._geometry == "1920x1080x24"
@pytest.mark.unit
def test_linux_virtual_display_custom_geometry():
"""Caller-supplied width/height feed straight into the Xvfb geometry
spec; the depth is always 24 (Firefox/ANGLE assume true-color)."""
vd = _LinuxVirtualDisplay(width=2560, height=1440)
assert vd._geometry == "2560x1440x24"
@pytest.mark.unit
def test_linux_virtual_display_stop_without_start_is_safe():
"""``stop()`` before ``start()`` must be a no-op — supports the
``__exit__`` path on a launcher that failed before Xvfb was spawned.
Verifies no AttributeError on env restore (saved_env is empty)."""
vd = _LinuxVirtualDisplay()
vd.stop()
vd.stop()
assert vd._proc is None
assert vd._display is None
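
The routing that these tests describe can be sketched in a few lines. This is an illustration only, not the code in `_headless.py`: `LinuxDisplaySketch` and `make_display_sketch` are hypothetical names, the platform is passed as an argument instead of being read from `sys.platform`, and no Xvfb process is ever spawned.

```python
class LinuxDisplaySketch:
    """Hypothetical stand-in for ``_LinuxVirtualDisplay``; no I/O on construction."""

    def __init__(self, width: int = 1920, height: int = 1080) -> None:
        self._geometry = f"{width}x{height}x24"  # colour depth fixed at 24
        self._proc = None      # would hold the Xvfb process after start()
        self._display = None   # would hold the display name after start()

    def stop(self) -> None:
        # Safe before start(): there is nothing to tear down yet.
        if self._proc is not None:
            self._proc.terminate()
            self._proc = None
        self._display = None


def make_display_sketch(platform: str):
    """Route on a ``sys.platform``-style string, as the tests describe."""
    if platform.startswith("linux"):
        return LinuxDisplaySketch()
    if platform in ("win32", "darwin"):
        return None  # the binary hides its own windows via prefs
    raise RuntimeError(
        f"supported platforms are Windows, macOS and Linux; got {platform!r}"
    )
```

Taking the platform as a parameter is a design choice for the sketch; the tests above reach the same branches by monkeypatching `headless.sys.platform`.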

56
tests/test_imports.py Normal file

@ -0,0 +1,56 @@
"""Public API surface — what users actually import."""
import importlib
import pytest
def test_top_level_import():
import invisible_playwright as ip
assert hasattr(ip, "InvisiblePlaywright")
assert hasattr(ip, "BINARY_VERSION")
assert hasattr(ip, "FIREFOX_UPSTREAM_VERSION")
assert hasattr(ip, "__version__")
def test_version_string():
    from invisible_playwright import __version__
    parts = __version__.split(".")
    assert len(parts) >= 2
    # Every dot-separated component must contain at least one digit
    # (covers plain numbers as well as suffixed parts such as "0rc1").
    assert all(any(c.isdigit() for c in p) for p in parts)
def test_sync_api_module():
from invisible_playwright.sync_api import InvisiblePlaywright as SyncCls
from invisible_playwright import InvisiblePlaywright as TopCls
assert SyncCls is TopCls
def test_async_api_module_importable():
mod = importlib.import_module("invisible_playwright.async_api")
assert hasattr(mod, "InvisiblePlaywright")
def test_async_class_is_distinct_from_sync():
from invisible_playwright import InvisiblePlaywright as Sync
from invisible_playwright.async_api import InvisiblePlaywright as Async
assert Sync is not Async
@pytest.mark.parametrize("name", [
"constants",
"download",
"prefs",
"launcher",
"cli",
"_proxy",
"_fpforge",
])
def test_submodule_importable(name):
importlib.import_module(f"invisible_playwright.{name}")
def test_dunder_all_is_complete():
import invisible_playwright as ip
for name in ip.__all__:
assert hasattr(ip, name), f"{name} declared in __all__ but missing"

371
tests/test_integration.py Normal file

@ -0,0 +1,371 @@
"""Integration tests — multi-module pipelines without a real browser.
These tests verify that the fingerprint sampler, Profile dataclass, prefs
translation and proxy translation compose correctly. They do NOT launch
Firefox. Browser-lifecycle tests live in ``test_e2e.py``.
Scope: Windows, Linux, and platform-agnostic. Platform-specific tests
monkeypatch ``sys.platform`` so the same suite exercises both branches
regardless of the host OS.
"""
from __future__ import annotations
import random
import sys
import pytest
from invisible_playwright._fpforge import generate_profile
from invisible_playwright._proxy import configure_proxy
from invisible_playwright.prefs import (
_WIN_LIGHT_COLORS,
translate_profile_to_prefs,
)
# Keys every Profile-derived prefs dict MUST carry. Sourced from
# ``translate_profile_to_prefs`` direct writes (not from _BASELINE) plus
# a couple of baseline keys that callers commonly read.
_REQUIRED_PREFS_KEYS = (
"zoom.stealth.screen.width",
"zoom.stealth.screen.height",
"zoom.stealth.screen.avail_width",
"zoom.stealth.screen.avail_height",
"zoom.stealth.screen.dpr",
"layout.css.devPixelsPerPx",
"zoom.stealth.hw_concurrency",
"zoom.stealth.storage.quota_mb",
"zoom.stealth.audio.sample_rate",
"zoom.stealth.audio.output_latency_ms",
"zoom.stealth.audio.max_channel_count",
"media.av1.enabled",
"media.encoder.webm.enabled",
"media.mediasource.webm.enabled",
"media.mediasource.mp4.enabled",
"zoom.stealth.font.whitelist",
"zoom.stealth.font.metrics",
"ui.systemUsesDarkTheme",
"intl.accept_languages",
"general.useragent.locale",
"intl.locale.requested",
"zoom.stealth.fpp.hw_seed",
"zoom.stealth.webrtc.host_ip",
"zoom.stealth.webgl.renderer",
"zoom.stealth.webgl.vendor",
"zoom.stealth.webgl.msaa",
"zoom.stealth.canvas.noise_skip_mask",
# baseline sanity
"privacy.resistFingerprinting",
"media.peerconnection.enabled",
"general.useragent.override",
)
# ──────────────────────────────────────────────────────────────────────
# IT1: profile → prefs pipeline yields a complete prefs dict
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.integration
def test_generate_profile_then_translate_has_all_required_keys():
"""IT1 — generate_profile → translate_profile_to_prefs succeeds and the
returned dict contains every key downstream code (Playwright, the C++
patches) needs to find."""
profile = generate_profile(seed=42)
prefs = translate_profile_to_prefs(profile)
missing = [k for k in _REQUIRED_PREFS_KEYS if k not in prefs]
assert not missing, f"prefs dict missing required keys: {missing}"
# ──────────────────────────────────────────────────────────────────────
# IT2: SOCKS proxy + prefs — mutates prefs in place, returns None
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.integration
def test_socks5_proxy_mutates_prefs_then_pipeline_still_valid():
"""IT2 — configure_proxy writes SOCKS auth keys to the profile-derived
prefs dict; the result is still a valid prefs dict (all required keys
intact) and the proxy return is ``None`` so Playwright sees no proxy."""
profile = generate_profile(seed=42)
prefs = translate_profile_to_prefs(profile)
pw_proxy = configure_proxy(
{
"server": "socks5://proxy.example.com:1080",
"username": "alice",
"password": "s3cret",
},
prefs,
)
assert pw_proxy is None # Firefox handles SOCKS internally.
assert prefs["network.proxy.type"] == 1
assert prefs["network.proxy.socks"] == "proxy.example.com"
assert prefs["network.proxy.socks_port"] == 1080
assert prefs["network.proxy.socks_version"] == 5
assert prefs["network.proxy.socks_username"] == "alice"
assert prefs["network.proxy.socks_password"] == "s3cret"
assert prefs["network.proxy.socks_remote_dns"] is True
# Profile-derived keys must still be present after proxy mutation.
for k in _REQUIRED_PREFS_KEYS:
assert k in prefs, f"proxy mutation dropped required key {k!r}"
# ──────────────────────────────────────────────────────────────────────
# IT3: pin overrides propagate end-to-end into the prefs dict
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.integration
def test_pin_screen_width_propagates_through_pipeline():
"""IT3 — a pinned ``screen.width`` shows up in the final prefs dict
under ``zoom.stealth.screen.width``."""
profile = generate_profile(seed=42, pin={"screen.width": 2560})
prefs = translate_profile_to_prefs(profile)
assert profile.screen.width == 2560
assert prefs["zoom.stealth.screen.width"] == 2560
@pytest.mark.integration
def test_multiple_pins_all_visible_in_prefs():
"""IT3.b — pinning several unrelated fields at once still routes every
one through to the prefs dict."""
pin = {
"screen.width": 3840,
"screen.height": 2160,
"hardware.concurrency": 16,
"audio.sample_rate": 48000,
}
profile = generate_profile(seed=42, pin=pin)
prefs = translate_profile_to_prefs(profile)
assert prefs["zoom.stealth.screen.width"] == 3840
assert prefs["zoom.stealth.screen.height"] == 2160
assert prefs["zoom.stealth.hw_concurrency"] == 16
assert prefs["zoom.stealth.audio.sample_rate"] == 48000
# ──────────────────────────────────────────────────────────────────────
# IT4 / IT5: end-to-end determinism + variation
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.integration
def test_pipeline_deterministic_for_same_seed():
"""IT4 — running the full pipeline twice with the same seed produces
identical prefs dicts."""
a = translate_profile_to_prefs(generate_profile(seed=1234))
b = translate_profile_to_prefs(generate_profile(seed=1234))
assert a == b
@pytest.mark.integration
def test_pipeline_varies_across_seeds():
"""IT5 — different seeds produce different prefs dicts. Compare the
full dict, not just a sampled field, to catch regressions where a
single hot field accidentally becomes seed-invariant."""
a = translate_profile_to_prefs(generate_profile(seed=1))
b = translate_profile_to_prefs(generate_profile(seed=2))
assert a != b
# ──────────────────────────────────────────────────────────────────────
# IT6: HTTP proxy passthrough does NOT mutate SOCKS prefs
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.integration
def test_http_proxy_returned_unchanged_no_socks_mutations():
"""IT6 — an HTTP proxy is returned to Playwright unchanged and the
SOCKS prefs are never written. Verifies the two proxy paths don't
cross-pollute the prefs dict."""
profile = generate_profile(seed=42)
prefs = translate_profile_to_prefs(profile)
proxy_in = {"server": "http://proxy.example.com:8080", "username": "bob"}
pw_proxy = configure_proxy(proxy_in, prefs)
assert pw_proxy is proxy_in # returned unchanged (same object)
# No SOCKS prefs should have been written.
assert "network.proxy.type" not in prefs
assert "network.proxy.socks" not in prefs
assert "network.proxy.socks_port" not in prefs
# ──────────────────────────────────────────────────────────────────────
# IT7: profile.fonts reaches prefs as a comma-joined whitelist
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.integration
def test_profile_fonts_propagate_to_prefs_whitelist():
"""IT7 — every font in ``profile.fonts`` appears in the comma-joined
``zoom.stealth.font.whitelist`` pref, in order."""
profile = generate_profile(seed=42)
prefs = translate_profile_to_prefs(profile)
assert profile.fonts, "fixture seed=42 produced empty fonts list"
whitelist = prefs["zoom.stealth.font.whitelist"]
assert isinstance(whitelist, str)
assert whitelist == ",".join(profile.fonts)
for font in profile.fonts:
assert font in whitelist
# ──────────────────────────────────────────────────────────────────────
# IT8: dark_theme controls the Win10 light-palette overlay
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.integration
def test_dark_theme_pipeline_omits_light_palette():
"""IT8.a — dark_theme=True profile → no light-palette colors in prefs."""
profile = generate_profile(seed=42, pin={"dark_theme": True})
prefs = translate_profile_to_prefs(profile)
assert prefs["ui.systemUsesDarkTheme"] == 1
for key in _WIN_LIGHT_COLORS:
assert key not in prefs, f"dark theme leaked light color: {key}"
@pytest.mark.integration
def test_light_theme_pipeline_includes_light_palette():
"""IT8.b — dark_theme=False profile → full Win10 light palette is
overlaid onto the prefs dict."""
profile = generate_profile(seed=42, pin={"dark_theme": False})
prefs = translate_profile_to_prefs(profile)
assert prefs["ui.systemUsesDarkTheme"] == 0
for key, value in _WIN_LIGHT_COLORS.items():
assert prefs[key] == value
# ──────────────────────────────────────────────────────────────────────
# IT9: many seeds all produce valid prefs dicts
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.integration
def test_many_seeds_all_produce_valid_prefs():
"""IT9 — sweep 10 distinct seeds through the full pipeline. Every run
must succeed and yield a prefs dict containing every required key.
Catches regressions where a rare CPT branch produces a prefs key
missing/wrong-typed."""
rng = random.Random(2026)
seeds = [rng.randint(1, 2**31 - 1) for _ in range(10)]
for seed in seeds:
profile = generate_profile(seed=seed)
prefs = translate_profile_to_prefs(profile)
missing = [k for k in _REQUIRED_PREFS_KEYS if k not in prefs]
assert not missing, f"seed={seed} missing keys: {missing}"
# ──────────────────────────────────────────────────────────────────────
# IT10 (extra): Windows-specific pipeline — virtual display + SOCKS
#
# Combines two Windows-specific branches that real callers stack:
# headless mode (virtual_display=True) and a SOCKS5 proxy. Catches
# ordering bugs where one branch silently overwrites the other.
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.integration
def test_windows_virtual_display_with_socks_proxy(monkeypatch):
"""IT10 — Windows + virtual_display=True + SOCKS5 proxy: both branches
land their keys in the prefs dict and don't clobber each other."""
monkeypatch.setattr(sys, "platform", "win32")
profile = generate_profile(seed=42)
prefs = translate_profile_to_prefs(profile, virtual_display=True)
pw_proxy = configure_proxy(
{"server": "socks5://127.0.0.1:1080"}, prefs
)
assert pw_proxy is None
assert prefs["security.sandbox.gpu.level"] == 0 # virtual_display branch
assert prefs["network.proxy.type"] == 1 # SOCKS branch
assert prefs["network.proxy.socks"] == "127.0.0.1"
# Windows still has the renderer cleared.
assert prefs["zoom.stealth.webgl.renderer"] == ""
# ──────────────────────────────────────────────────────────────────────
# IT11 (extra): Linux-specific pipeline — Xvfb workarounds + GPU spoof
# + SOCKS5 proxy. The Linux equivalent of IT10. Verifies that the three
# Linux-only branches (renderer spoof, Xvfb webrender disable, MSAA
# from profile) coexist with proxy mutation in the same prefs dict.
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.integration
def test_linux_xvfb_workarounds_with_socks_proxy(monkeypatch):
    """IT11 — Linux + SOCKS5 proxy: Xvfb workarounds applied, GPU renderer
    spoofed from profile, SOCKS keys written. The GPU-sandbox pref is a
    Windows-only concept, so ``virtual_display=True`` is passed here on
    purpose: on Linux it must NOT set ``security.sandbox.gpu.level``
    (see also VD3)."""
    monkeypatch.setattr(sys, "platform", "linux")
    profile = generate_profile(seed=42)
    prefs = translate_profile_to_prefs(profile, virtual_display=True)
    pw_proxy = configure_proxy(
        {"server": "socks5://127.0.0.1:1080"}, prefs
    )
    assert pw_proxy is None
    # Xvfb workarounds present.
    assert prefs["gfx.webrender.all"] is False
    assert prefs["gfx.webrender.force-disabled"] is True
    assert prefs["webgl.force-enabled"] is True
    # Windows-only sandbox key absent on Linux even with virtual_display=True.
    assert "security.sandbox.gpu.level" not in prefs
    # GPU renderer is spoofed from the profile (not cleared like on Windows).
    assert prefs["zoom.stealth.webgl.renderer"] == profile.gpu.renderer
    assert prefs["zoom.stealth.webgl.renderer"]  # non-empty
    # SOCKS branch wrote its keys without clobbering the Linux prefs above.
    assert prefs["network.proxy.type"] == 1
    assert prefs["network.proxy.socks"] == "127.0.0.1"
# ──────────────────────────────────────────────────────────────────────
# IT12 (extra): Linux pipeline carries profile MSAA end-to-end. Windows
# pins MSAA to 4 regardless of the profile; Linux must let the sampled
# value through. Guards the platform branch in ``translate_profile_to_prefs``.
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.integration
def test_linux_msaa_pin_propagates_through_pipeline(monkeypatch):
"""IT12 — pinning MSAA on Linux survives the prefs translation; on
Windows the same pin is overwritten to 4 (covered by the unit tests)."""
monkeypatch.setattr(sys, "platform", "linux")
profile = generate_profile(seed=42, pin={"webgl.msaa_samples": 8})
prefs = translate_profile_to_prefs(profile)
assert prefs["zoom.stealth.webgl.msaa"] == 8
assert prefs["webgl.msaa-samples"] == 8
assert prefs["webgl.msaa-force"] is True
# ──────────────────────────────────────────────────────────────────────
# IT13 (extra): Linux font metrics receive the GTK/DejaVu compensation
# block. End-to-end check that ``_LINUX_GENERIC_FONT_FACTORS`` is
# prepended to the per-font metrics string sampled from the profile.
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.integration
def test_linux_font_metrics_include_generic_factors(monkeypatch):
"""IT13 — on Linux the font metrics pref starts with the generic
width-scale factors (GTK/DejaVu compensation) so glyph widths match
Windows. Without this, Linux sessions leak via metric drift."""
from invisible_playwright.prefs import _LINUX_GENERIC_FONT_FACTORS
monkeypatch.setattr(sys, "platform", "linux")
profile = generate_profile(seed=42)
prefs = translate_profile_to_prefs(profile)
metrics = prefs["zoom.stealth.font.metrics"]
assert metrics.startswith(_LINUX_GENERIC_FONT_FACTORS)
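
IT2 and IT6 together describe two proxy paths: a SOCKS proxy is written into the Firefox prefs and hidden from Playwright, while an HTTP proxy is returned untouched. A minimal sketch of that split follows, assuming a hypothetical `configure_proxy_sketch` helper. The pref names are the ones asserted in the tests; the `socks4` handling is an assumption, and the real `configure_proxy` in `_proxy.py` may differ.

```python
from typing import Optional
from urllib.parse import urlsplit


def configure_proxy_sketch(proxy: Optional[dict], prefs: dict) -> Optional[dict]:
    """Hypothetical re-creation of the behaviour IT2 and IT6 assert."""
    if proxy is None:
        return None
    parts = urlsplit(proxy["server"])
    if not parts.scheme.startswith("socks"):
        # HTTP(S) proxies are handed to Playwright untouched.
        return proxy
    # SOCKS is configured through Firefox prefs; mutate the dict in place.
    prefs["network.proxy.type"] = 1
    prefs["network.proxy.socks"] = parts.hostname
    prefs["network.proxy.socks_port"] = parts.port
    # Assumption: anything other than an explicit socks4 scheme means SOCKS5.
    prefs["network.proxy.socks_version"] = 4 if parts.scheme == "socks4" else 5
    prefs["network.proxy.socks_remote_dns"] = True
    if "username" in proxy:
        prefs["network.proxy.socks_username"] = proxy["username"]
    if "password" in proxy:
        prefs["network.proxy.socks_password"] = proxy["password"]
    return None  # Playwright sees no proxy
```

Returning `None` on the SOCKS path while mutating `prefs` is what lets the tests check both the return value and the dict from a single call.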


@ -0,0 +1,271 @@
"""Launcher helpers that don't require launching the actual browser."""
import pytest
from invisible_playwright.launcher import (
InvisiblePlaywright,
_IANA_TO_POSIX_TZ,
_tz_env,
_CHROME_W,
_CHROME_H,
_TASKBAR_H,
)
def test_tz_env_known_iana_returns_posix():
assert _tz_env("America/New_York") == "EST5EDT"
assert _tz_env("America/Chicago") == "CST6CDT"
assert _tz_env("America/Los_Angeles") == "PST8PDT"
def test_tz_env_arizona_no_dst():
"""America/Phoenix must NOT have a DST suffix — Arizona doesn't observe DST."""
assert _tz_env("America/Phoenix") == "MST7"
def test_tz_env_hawaii_no_dst():
assert _tz_env("Pacific/Honolulu") == "HST10"
def test_tz_env_unknown_iana_passes_through():
"""Linux glibc parses IANA names directly via /usr/share/zoneinfo,
so unknown zones should fall through unchanged."""
assert _tz_env("Europe/Berlin") == "Europe/Berlin"
assert _tz_env("Asia/Tokyo") == "Asia/Tokyo"
def test_iana_to_posix_table_well_formed():
for iana, posix in _IANA_TO_POSIX_TZ.items():
assert "/" in iana, f"{iana} is not an IANA zone identifier"
assert "/" not in posix, f"{posix} should be POSIX format, no slashes"
assert posix[0].isalpha(), f"{posix} should start with a letter"
def test_chrome_offsets_are_positive_ints():
"""These pad the spoofed viewport to fit inside the spoofed screen.
Any zero/negative value would let viewport bleed past screen bounds."""
assert _CHROME_W > 0
assert _CHROME_H > 0
assert _TASKBAR_H > 0
def test_invisible_playwright_constructs_without_launching():
"""The class should be instantiable for inspection without entering
the context manager (which would try to download the binary)."""
obj = InvisiblePlaywright(seed=42)
assert obj is not None
obj2 = InvisiblePlaywright(seed=42, headless=True)
assert obj2 is not None
# ─── profile_dir kwarg — persistent context support ─────────────────────── #
from pathlib import Path
@pytest.mark.unit
def test_profile_dir_none_by_default():
"""No persistent profile unless explicitly opted in. Prevents accidental
state-leak between scripts that share the same seed."""
obj = InvisiblePlaywright(seed=42)
assert obj._profile_dir is None
assert obj._persistent_context is None
@pytest.mark.unit
def test_profile_dir_string_is_coerced_to_path(tmp_path):
"""Accept str or Path. Always store as Path internally."""
obj = InvisiblePlaywright(seed=42, profile_dir=str(tmp_path))
assert isinstance(obj._profile_dir, Path)
assert obj._profile_dir == tmp_path
@pytest.mark.unit
def test_profile_dir_path_is_stored_as_is(tmp_path):
obj = InvisiblePlaywright(seed=42, profile_dir=tmp_path)
assert obj._profile_dir == tmp_path
@pytest.mark.unit
def test_profile_dir_does_not_create_dir_until_enter(tmp_path):
    """Construction must not touch the filesystem. Directory creation only
    happens when the user actually enters the context manager; otherwise
    a typo at instantiation would silently spawn dirs."""
    target = tmp_path / "nonexistent"
    assert not target.exists()
    InvisiblePlaywright(seed=42, profile_dir=target)
    assert not target.exists()
@pytest.mark.unit
def test_persistent_context_kwargs_match_default_exactly():
    """Persistent kwargs must be IDENTICAL to non-persistent default
    kwargs. From firefox-5 (C7 closure) the docShell.overrideTimezone
    method is present in the patched binary, so the per-realm overrides
    Playwright applies for `locale=`/`timezone_id=` land successfully and
    no longer hang the persistent context launch handshake.

    Before firefox-5 we had to filter these out (180s timeout otherwise).
    A future refactor that re-introduces that filter would silently lose
    timezone/locale isolation in persistent sessions; this test is the
    sentinel that catches the regression at the unit level."""
    obj = InvisiblePlaywright(seed=42, locale="en-GB", timezone="Europe/London",
                              profile_dir="/tmp/x")
    persistent = obj._persistent_context_kwargs()
    default = obj._default_context_kwargs()
    assert persistent == default, (
        "persistent_context kwargs must match default_context kwargs since "
        f"firefox-5.\n persistent: {persistent!r}\n default: {default!r}"
    )
@pytest.mark.unit
def test_persistent_context_kwargs_INCLUDES_locale_and_timezone():
    """Sentinel for the C7 closure: firefox-5 ships the C++ overrideTimezone
    IDL method, so locale + timezone_id MUST be passed through to
    launch_persistent_context. If they're not, the wrapper is silently
    dropping per-context isolation: two sessions with different
    `timezone=` would end up sharing whatever TZ the env var set.

    Regression-defense: do NOT re-add the firefox-4-era filter."""
    obj = InvisiblePlaywright(seed=42, locale="en-GB", timezone="Europe/London",
                              profile_dir="/tmp/x")
    kw = obj._persistent_context_kwargs()
    assert kw.get("locale") == "en-GB", (
        f"locale must be in persistent kwargs (firefox-5+ supports it via "
        f"docShell.languageOverride). Got: {kw.get('locale')!r}"
    )
    assert kw.get("timezone_id") == "Europe/London", (
        f"timezone_id must be in persistent kwargs (firefox-5+ supports it "
        f"via docShell.overrideTimezone IDL method, patch.md section 19). "
        f"Got: {kw.get('timezone_id')!r}"
    )
@pytest.mark.unit
def test_persistent_context_kwargs_omits_timezone_when_empty_string():
"""Empty timezone='' is the 'use host TZ' sentinel — must NOT pass
timezone_id to Playwright in that case (would pin to literal '' and
break Intl)."""
obj = InvisiblePlaywright(seed=42, timezone="", profile_dir="/tmp/x")
kw = obj._persistent_context_kwargs()
assert "timezone_id" not in kw
# ─── Mocked __enter__ flow — confirms the right Playwright call is made ── #
@pytest.mark.unit
def test_enter_with_profile_dir_calls_launch_persistent_context(tmp_path, monkeypatch):
    """When profile_dir is set, __enter__ must call
    `firefox.launch_persistent_context(user_data_dir=...)` and NOT
    `firefox.launch(...)`. This is the structural test that the persistent
    branch is wired correctly; without it, profile_dir would be silently
    accepted but ignored."""
    from unittest.mock import MagicMock

    # Mock ensure_binary so we don't hit the network
    monkeypatch.setattr("invisible_playwright.launcher.ensure_binary",
                        lambda: tmp_path / "firefox")
    # Mock sync_playwright().start() → fake playwright with our recording firefox
    fake_ctx = MagicMock(name="persistent_context")
    fake_firefox = MagicMock()
    fake_firefox.launch_persistent_context.return_value = fake_ctx
    fake_playwright = MagicMock()
    fake_playwright.firefox = fake_firefox
    fake_pw = MagicMock()
    fake_pw.start.return_value = fake_playwright
    monkeypatch.setattr("invisible_playwright.launcher.sync_playwright",
                        lambda: fake_pw)
    profile = tmp_path / "myprofile"
    obj = InvisiblePlaywright(seed=42, profile_dir=profile)
    returned = obj.__enter__()
    # The persistent branch was taken
    fake_firefox.launch_persistent_context.assert_called_once()
    fake_firefox.launch.assert_not_called()
    # The user_data_dir was passed verbatim
    call_kwargs = fake_firefox.launch_persistent_context.call_args.kwargs
    assert call_kwargs["user_data_dir"] == str(profile)
    # The directory was created on disk (Playwright fails otherwise)
    assert profile.exists() and profile.is_dir()
    # __enter__ returned the BrowserContext, not a Browser
    assert returned is fake_ctx
@pytest.mark.unit
def test_enter_without_profile_dir_calls_launch_not_persistent(tmp_path, monkeypatch):
"""Default path: profile_dir=None → firefox.launch, not
launch_persistent_context. Sentinel that the non-persistent flow
isn't accidentally rerouted."""
from unittest.mock import MagicMock
monkeypatch.setattr("invisible_playwright.launcher.ensure_binary",
lambda: tmp_path / "firefox")
fake_browser = MagicMock(name="browser")
fake_browser.new_context = MagicMock()
fake_firefox = MagicMock()
fake_firefox.launch.return_value = fake_browser
fake_playwright = MagicMock()
fake_playwright.firefox = fake_firefox
fake_pw = MagicMock()
fake_pw.start.return_value = fake_playwright
monkeypatch.setattr("invisible_playwright.launcher.sync_playwright",
lambda: fake_pw)
obj = InvisiblePlaywright(seed=42)
returned = obj.__enter__()
fake_firefox.launch.assert_called_once()
fake_firefox.launch_persistent_context.assert_not_called()
assert returned is fake_browser
@pytest.mark.unit
def test_persistent_context_user_data_dir_is_created_if_missing(tmp_path, monkeypatch):
"""First-run scenario: the directory the user names doesn't exist yet.
__enter__ must mkdir -p it (Playwright won't, and would crash with
'user_data_dir does not exist')."""
from unittest.mock import MagicMock
monkeypatch.setattr("invisible_playwright.launcher.ensure_binary",
lambda: tmp_path / "firefox")
fake_pw = MagicMock()
fake_pw.start.return_value = MagicMock()
fake_pw.start.return_value.firefox.launch_persistent_context = MagicMock(
return_value=MagicMock()
)
monkeypatch.setattr("invisible_playwright.launcher.sync_playwright",
lambda: fake_pw)
nested = tmp_path / "a" / "b" / "c" / "profile"
assert not nested.parent.exists() # parent doesn't exist either
obj = InvisiblePlaywright(seed=42, profile_dir=nested)
obj.__enter__()
assert nested.is_dir()
@pytest.mark.unit
def test_teardown_closes_persistent_context(tmp_path, monkeypatch):
"""The teardown must close the persistent context. Forgetting this
leaves Firefox + Playwright running until the parent process exits,
which on long-running tools (job orchestrators, MCP servers) leaks
handles indefinitely."""
from unittest.mock import MagicMock
monkeypatch.setattr("invisible_playwright.launcher.ensure_binary",
lambda: tmp_path / "firefox")
fake_ctx = MagicMock(name="persistent_context")
fake_pw = MagicMock()
fake_pw.start.return_value.firefox.launch_persistent_context.return_value = fake_ctx
monkeypatch.setattr("invisible_playwright.launcher.sync_playwright",
lambda: fake_pw)
obj = InvisiblePlaywright(seed=42, profile_dir=tmp_path / "p")
obj.__enter__()
obj.__exit__(None, None, None)
fake_ctx.close.assert_called_once()
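
The `_tz_env` tests at the top of this file describe a table lookup with passthrough: mapped IANA zones become POSIX `TZ` strings, and anything else is returned unchanged. A minimal sketch of that behaviour follows, assuming a hypothetical `tz_env_sketch` function and a five-entry subset of the table; the real `_IANA_TO_POSIX_TZ` mapping is not shown in this diff and is presumably larger.

```python
# Illustrative subset only; values are the ones asserted in the tests.
IANA_TO_POSIX_SKETCH = {
    "America/New_York": "EST5EDT",
    "America/Chicago": "CST6CDT",
    "America/Los_Angeles": "PST8PDT",
    "America/Phoenix": "MST7",     # Arizona: no DST segment
    "Pacific/Honolulu": "HST10",   # Hawaii: no DST segment
}


def tz_env_sketch(iana: str) -> str:
    """Map an IANA zone to a POSIX TZ string, passing unknown zones through."""
    return IANA_TO_POSIX_SKETCH.get(iana, iana)
```

Using `dict.get` with the input as its own default keeps unmapped zones such as `Europe/Berlin` intact, which matches the passthrough the tests expect.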


@ -0,0 +1,206 @@
"""Unit tests for pure helpers in ``launcher.py``.
These cover code paths that are not exercised by the E2E launcher tests
(`test_e2e.py`) because they live in private helpers below the Playwright
boundary. The tests instantiate ``InvisiblePlaywright`` for the methods
that read ``self._profile`` but never enter ``__enter__``, so no Firefox
binary or virtual display is required.
"""
from __future__ import annotations
import pytest
from invisible_playwright import InvisiblePlaywright
from invisible_playwright.launcher import (
_CHROME_H,
_CHROME_W,
_IANA_TO_POSIX_TZ,
_TASKBAR_H,
_tz_env,
)
# ── _tz_env (IANA → POSIX) ────────────────────────────────────────────
@pytest.mark.unit
def test_tz_env_eastern_us_maps_to_posix_with_dst():
"""Eastern US zones share the same POSIX form; spot-check a few."""
assert _tz_env("America/New_York") == "EST5EDT"
assert _tz_env("America/Detroit") == "EST5EDT"
assert _tz_env("America/Indiana/Indianapolis") == "EST5EDT"
@pytest.mark.unit
def test_tz_env_central_mountain_pacific_map_to_posix_with_dst():
assert _tz_env("America/Chicago") == "CST6CDT"
assert _tz_env("America/Denver") == "MST7MDT"
assert _tz_env("America/Los_Angeles") == "PST8PDT"
@pytest.mark.unit
def test_tz_env_phoenix_strips_dst():
    """Arizona (outside Navajo Nation) does NOT observe DST. The POSIX
    form must be ``MST7`` (no second segment); using ``MST7MDT`` caused
    FP Pro to deduce vpn_origin_timezone=America/Denver from a 60-minute
    offset error in summer. Guard against regression of that mapping.
    """
    assert _tz_env("America/Phoenix") == "MST7"
@pytest.mark.unit
def test_tz_env_honolulu_strips_dst():
"""Hawaii does not observe DST. POSIX form ``HST10`` (no DST segment)."""
assert _tz_env("Pacific/Honolulu") == "HST10"
@pytest.mark.unit
def test_tz_env_passthrough_for_unmapped_zone():
"""Zones outside the lookup table fall through to their IANA name —
glibc on Linux reads /usr/share/zoneinfo directly. Windows MSVCRT
won't understand them but that's accepted; the mapping covers the
common residential-proxy zones."""
assert _tz_env("Europe/Berlin") == "Europe/Berlin"
assert _tz_env("Asia/Tokyo") == "Asia/Tokyo"
@pytest.mark.unit
def test_tz_env_empty_string_passes_through():
    """Empty string is never set as ``TZ`` by the caller, but the helper
    is still defensive: return it unchanged rather than raising."""
    assert _tz_env("") == ""
@pytest.mark.unit
def test_iana_to_posix_phoenix_and_honolulu_present():
"""Sanity-check the no-DST entries are still in the mapping; deleting
them would silently revert the Phoenix DST bug."""
assert _IANA_TO_POSIX_TZ["America/Phoenix"] == "MST7"
assert _IANA_TO_POSIX_TZ["Pacific/Honolulu"] == "HST10"
# ── InvisiblePlaywright._humanize_max_seconds ─────────────────────────
@pytest.mark.unit
def test_humanize_true_defaults_to_one_and_a_half_seconds():
ip = InvisiblePlaywright(seed=42, humanize=True)
assert ip._humanize_max_seconds() == 1.5
@pytest.mark.unit
def test_humanize_float_passes_through_as_seconds():
ip = InvisiblePlaywright(seed=42, humanize=2.5)
assert ip._humanize_max_seconds() == 2.5
@pytest.mark.unit
def test_humanize_int_coerced_to_float():
"""``humanize=3`` is valid (truthy, not ``True``) → float coercion."""
ip = InvisiblePlaywright(seed=42, humanize=3)
out = ip._humanize_max_seconds()
assert out == 3.0
assert isinstance(out, float)
@pytest.mark.unit
def test_humanize_small_float_passes_through():
"""Below the default cap — the user's value wins."""
ip = InvisiblePlaywright(seed=42, humanize=0.4)
assert ip._humanize_max_seconds() == 0.4
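Reviewer note: the four humanize tests above describe a small coercion rule. The sketch below is a hypothetical free function, not the actual `InvisiblePlaywright._humanize_max_seconds` method. The handling of `False` is an assumption, since the tests do not cover it.

```python
from typing import Union

DEFAULT_MAX_SECONDS = 1.5  # default the tests assert for humanize=True


def humanize_max_seconds(humanize: Union[bool, int, float]) -> float:
    """Resolve the humanize argument to a cap in seconds (hypothetical helper)."""
    if humanize is True:
        return DEFAULT_MAX_SECONDS
    if humanize is False:
        return 0.0  # assumption: the disabled case is not covered by the tests
    # Any other number is taken literally; ints are coerced to float.
    return float(humanize)
```

The identity check `humanize is True` matters because `True == 1` in Python; an equality check would make `humanize=1` resolve to 1.5 seconds instead of 1.0.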
# ── InvisiblePlaywright._default_context_kwargs ───────────────────────
@pytest.mark.unit
def test_default_context_viewport_subtracts_window_chrome():
"""Viewport must fit inside the spoofed screen with the headed
window chrome subtracted. Otherwise Playwright complains about the
viewport being larger than the screen."""
ip = InvisiblePlaywright(seed=42)
kw = ip._default_context_kwargs()
p = ip._profile
assert kw["viewport"]["width"] == p.screen.width - _CHROME_W
assert kw["viewport"]["height"] == p.screen.height - _TASKBAR_H - _CHROME_H
@pytest.mark.unit
def test_default_context_screen_matches_profile():
ip = InvisiblePlaywright(seed=42)
kw = ip._default_context_kwargs()
p = ip._profile
assert kw["screen"] == {"width": p.screen.width, "height": p.screen.height}
assert kw["device_scale_factor"] == p.screen.dpr
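Reviewer note: the viewport arithmetic asserted above can be illustrated with a small sketch. The three constants are placeholder values chosen for the example; the real `_CHROME_W`, `_CHROME_H` and `_TASKBAR_H` are defined in the package and are not shown in this diff.

```python
# Placeholder values, NOT the package's real constants.
CHROME_W = 16    # horizontal window borders
CHROME_H = 88    # title bar, tab strip and toolbar
TASKBAR_H = 48   # OS taskbar


def viewport_for_screen(screen_width: int, screen_height: int) -> dict:
    """Shrink the spoofed screen by window chrome and taskbar to get a viewport."""
    return {
        "width": screen_width - CHROME_W,
        "height": screen_height - TASKBAR_H - CHROME_H,
    }


print(viewport_for_screen(1920, 1080))
```

With these placeholder values a 1920x1080 screen yields a 1904x944 viewport, which always fits inside the reported screen.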
@pytest.mark.unit
def test_default_context_color_scheme_follows_dark_theme():
"""``color_scheme`` must match ``profile.dark_theme`` so the Playwright
realm tells matchMedia the same thing the prefs tell the chrome."""
ip_dark = InvisiblePlaywright(seed=42, pin={"dark_theme": True})
ip_light = InvisiblePlaywright(seed=42, pin={"dark_theme": False})
assert ip_dark._default_context_kwargs()["color_scheme"] == "dark"
assert ip_light._default_context_kwargs()["color_scheme"] == "light"
@pytest.mark.unit
def test_default_context_includes_timezone_when_set():
ip = InvisiblePlaywright(seed=42, timezone="America/New_York")
assert ip._default_context_kwargs()["timezone_id"] == "America/New_York"
@pytest.mark.unit
def test_default_context_omits_timezone_when_empty():
    """Default ``timezone=""`` means "let the host TZ leak through".
    Playwright must not receive ``timezone_id`` at all in that case;
    otherwise it overrides to the literal empty string."""
    ip = InvisiblePlaywright(seed=42)
    assert "timezone_id" not in ip._default_context_kwargs()
@pytest.mark.unit
def test_default_context_includes_locale_when_set():
ip = InvisiblePlaywright(seed=42, locale="de-DE")
assert ip._default_context_kwargs()["locale"] == "de-DE"
@pytest.mark.unit
def test_default_context_omits_locale_when_empty():
ip = InvisiblePlaywright(seed=42, locale="")
assert "locale" not in ip._default_context_kwargs()
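Reviewer note: the timezone and locale tests above both check that an empty value omits the key instead of passing an empty string through. A minimal sketch of that pattern, using a hypothetical helper name; `timezone_id` and `locale` are the Playwright context option names used in the tests.

```python
def optional_context_kwargs(timezone: str = "", locale: str = "") -> dict:
    """Build only the context options that were actually set (hypothetical helper)."""
    kwargs = {}
    if timezone:
        kwargs["timezone_id"] = timezone
    if locale:
        kwargs["locale"] = locale
    return kwargs


print(optional_context_kwargs(timezone="America/New_York"))
print(optional_context_kwargs())
```

Omitting the key lets the host value show through, whereas passing `""` would be treated as an explicit override.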
# ── InvisiblePlaywright._build_env — WebRTC egress auto-derive ─────────
# Locks the 2026-06-10 fix: behind a proxy the launcher feeds the discovered
# egress IP to nICEr (srflx override) + drops IPv6. Without it, a proxied
# session's WebRTC silently fell back to leaking/blocking. Runs in tests.yml.
@pytest.mark.unit
def test_build_env_injects_webrtc_egress_when_discovered():
ip = InvisiblePlaywright(seed=42)
ip._webrtc_egress_ip = "203.0.113.9" # what __enter__ resolves behind a proxy
env = ip._build_env()
assert env["STEALTHFOX_WEBRTC_PUBLIC_IP"] == "203.0.113.9"
assert env["STEALTHFOX_WEBRTC_DISABLE_IPV6"] == "1"
@pytest.mark.unit
def test_build_env_no_webrtc_keys_without_proxy(monkeypatch):
monkeypatch.delenv("STEALTHFOX_WEBRTC_PUBLIC_IP", raising=False)
ip = InvisiblePlaywright(seed=42)
ip._webrtc_egress_ip = None # no proxy → real STUN already truthful
env = ip._build_env()
assert "STEALTHFOX_WEBRTC_PUBLIC_IP" not in env
assert "STEALTHFOX_WEBRTC_DISABLE_IPV6" not in env
@pytest.mark.unit
def test_build_env_caller_env_override_wins(monkeypatch):
monkeypatch.setenv("STEALTHFOX_WEBRTC_PUBLIC_IP", "198.51.100.5")
ip = InvisiblePlaywright(seed=42)
ip._webrtc_egress_ip = "203.0.113.9" # auto-discovered
env = ip._build_env()
assert env["STEALTHFOX_WEBRTC_PUBLIC_IP"] == "198.51.100.5" # caller wins
assert env["STEALTHFOX_WEBRTC_DISABLE_IPV6"] == "1"
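Reviewer note: the three `_build_env` tests above imply a precedence rule in which a caller-supplied variable beats the auto-discovered egress IP. The sketch below is one way to get that behaviour, assuming the environment is a plain dict. `build_env` is a hypothetical function, not the real method; only the two variable names come from the tests.

```python
import os
from typing import Dict, Optional


def build_env(egress_ip: Optional[str],
              base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and add WebRTC overrides when an egress IP is known."""
    env = dict(os.environ if base_env is None else base_env)
    if egress_ip:
        # setdefault keeps a value the caller already exported.
        env.setdefault("STEALTHFOX_WEBRTC_PUBLIC_IP", egress_ip)
        env["STEALTHFOX_WEBRTC_DISABLE_IPV6"] = "1"
    return env
```

Passing `base_env` explicitly keeps the sketch testable without touching the real process environment.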

255
tests/test_mouse.py Normal file

@ -0,0 +1,255 @@
"""Regression tests for issue #9: jugglerSendMouseEvent missing in FF150.
The Juggler JS in upstream Playwright calls ``win.windowUtils.jugglerSendMouseEvent``
at four sites, but the C++ side was never landed when the Juggler was ported
to FF150. Every Playwright mouse code path therefore fails on the patched
binary until the JS is swapped to ``win.synthesizeMouseEvent``.
The suite below was inspired by ``microsoft/playwright-python/tests/async/test_click.py``
and covers each patched call site:
- ``PageHandler.js::Page.dispatchMouseEvent::sendEvents``
- ``PageHandler.js`` off-viewport mousemove hack
- ``PageHandler.js`` stealthfox humanize hook
- ``PageHandler.js::Page.dispatchWheelEvent`` (scrollRectIntoViewIfNeeded guard)
- ``PageAgent.js::_dispatchDragEvent``
"""
from __future__ import annotations
import urllib.parse
import pytest
from invisible_playwright import InvisiblePlaywright
def _data_url(html: str) -> str:
return "data:text/html," + urllib.parse.quote(html)
# ────────────────────────────────────────────────────────────────────
# Page.dispatchMouseEvent::sendEvents — the main loop swapped in fix #9.
# ────────────────────────────────────────────────────────────────────
@pytest.mark.e2e
def test_mouse_move_does_not_raise(firefox_binary):
"""page.mouse.move was the canonical repro from issue #9."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
page.goto("about:blank")
page.mouse.move(100, 100)
page.mouse.move(200, 200)
@pytest.mark.e2e
def test_click_the_button(firefox_binary):
"""Inspired by Playwright test_click.py::test_click_the_button.
Verifies the full mousedown -> mouseup -> click sequence reaches the page."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
page.goto(_data_url(
"<button id=b onclick=\"window.__clicked=true;this.textContent='ok'\">x</button>"
))
page.click("#b")
assert page.evaluate("window.__clicked") is True
assert page.eval_on_selector("#b", "el => el.textContent") == "ok"
@pytest.mark.e2e
def test_double_click_fires_dblclick(firefox_binary):
"""Inspired by test_click.py::test_double_click_the_button."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
page.goto(_data_url(
"<button id=b ondblclick=\"window.__dbl=true\">x</button>"
))
page.dblclick("#b")
assert page.evaluate("window.__dbl") is True
@pytest.mark.e2e
def test_right_click_fires_contextmenu(firefox_binary):
"""Inspired by test_click.py::test_fire_contextmenu_event_on_right_click.
Right-click hits the special ``button === 2`` branch that dispatches
both ``mousedown`` and ``contextmenu`` through ``sendEvents``."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
page.goto(_data_url(
"<div id=d style='width:200px;height:100px;background:red' "
"oncontextmenu=\"event.preventDefault();window.__ctx=true\">x</div>"
))
page.click("#d", button="right")
assert page.evaluate("window.__ctx") is True
@pytest.mark.e2e
def test_click_with_modifier_keys(firefox_binary):
"""Inspired by test_click.py::test_update_modifiers_correctly.
Modifiers travel through the ``modifiers`` arg of the synthesized event."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
page.goto(_data_url(
"<button id=b style='width:200px;height:80px;font-size:24px' "
"onclick=\"window.__shift=event.shiftKey\">click</button>"
))
page.click("#b", modifiers=["Shift"])
assert page.evaluate("window.__shift") is True
@pytest.mark.e2e
def test_locator_click(firefox_binary):
"""Locator.click also goes through Page.dispatchMouseEvent."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
page.goto(_data_url(
"<button id=b onclick=\"this.textContent='clicked'\">x</button>"
))
page.locator("#b").click()
assert page.eval_on_selector("#b", "el => el.textContent") == "clicked"
# ────────────────────────────────────────────────────────────────────
# Off-viewport mousemove hack — the ``windowUtils.sendMouseEvent`` call
# at the old line 642 (also removed in FF150). The synthesizeMouseEvent
# replacement must not raise.
# ────────────────────────────────────────────────────────────────────
@pytest.mark.e2e
def test_mouse_move_outside_viewport_does_not_raise(firefox_binary):
"""Negative coordinates exercise the "move mouse off web content" path."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
page.goto("about:blank")
page.mouse.move(-50, -50)
# ────────────────────────────────────────────────────────────────────
# Stealthfox humanize hook — bezier expansion uses synthesizeMouseEvent
# inside a per-step loop. We verify the hook still fires intermediate
# moves between two faraway points.
# ────────────────────────────────────────────────────────────────────
def _humanize_move_count(firefox_binary, humanize):
"""Count page mousemove events fired by ONE long mouse.move."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary, humanize=humanize) as browser:
page = browser.new_page()
page.goto(_data_url(
"<div id=d style='width:600px;height:400px' "
"onmousemove=\"window.__n=(window.__n||0)+1\">x</div>"
))
page.mouse.move(10, 10)
page.evaluate("window.__n = 0")
page.mouse.move(500, 300)
return page.evaluate("window.__n")
@pytest.mark.e2e
def test_humanize_emits_intermediate_moves(firefox_binary):
    """A long mouse.move must expand into MANY intermediate mousemove events when
    humanize is on (Bezier), and ~1 (a teleport) when off. We assert the on/off
    CONTRAST: `moves >= 1` alone was a false-green (a teleport already fires 1),
    and that false-green hid a pref-namespace bug (wrapper wrote
    `invisible_playwright.humanize`, the binary's Juggler reads `stealthfox.humanize`)
    that left humanize silently dead in production. This test now fails if the
    pref ever stops reaching the binary."""
    on = _humanize_move_count(firefox_binary, True)
    off = _humanize_move_count(firefox_binary, False)
    assert off <= 2, f"humanize OFF should ~teleport (<=2 moves), got {off}"
    assert on >= 4, (
        f"humanize ON must expand into many intermediate moves (Bezier); got {on} "
        f"(off={off}). moves==1 means the cursor teleports, which is the exact "
        f"automation tell humanize exists to remove, and a sign the stealthfox.* "
        f"pref isn't reaching the binary's Juggler."
    )
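Reviewer note: to make the "many intermediate moves" expectation concrete, here is a sketch of how a single long move can be expanded along a quadratic Bezier curve. It is purely illustrative. The real expansion happens inside the patched binary's Juggler code, which this diff does not show, and the function name, control point and step count below are assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bezier_moves(start: Point, control: Point, end: Point, steps: int) -> List[Point]:
    """Return `steps` points along a quadratic Bezier curve, ending exactly at `end`."""
    points = []
    for i in range(1, steps + 1):
        t = i / steps
        u = 1.0 - t
        # Quadratic Bezier: B(t) = u^2 * P0 + 2*u*t * P1 + t^2 * P2
        x = u * u * start[0] + 2 * u * t * control[0] + t * t * end[0]
        y = u * u * start[1] + 2 * u * t * control[1] + t * t * end[1]
        points.append((x, y))
    return points


path = bezier_moves((10, 10), (200, 50), (500, 300), steps=8)
print(len(path), path[-1])
```

A teleport corresponds to `steps=1`, which is why the test compares the event count with humanize on against the count with it off.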
# ────────────────────────────────────────────────────────────────────
# Page.dispatchWheelEvent — the second scrollRectIntoViewIfNeeded site
# was guarded so wheel events do not crash before dispatch.
# ────────────────────────────────────────────────────────────────────
@pytest.mark.e2e
def test_mouse_wheel_does_not_raise(firefox_binary):
"""Wheel calls scrollRectIntoViewIfNeeded too; the guard must hold."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
page.goto(_data_url(
"<div style='height:3000px'>tall</div>"
))
page.mouse.wheel(0, 200)
# ────────────────────────────────────────────────────────────────────
# Hover — locator.hover sends a mousemove through the same sendEvents
# path; checked via mouseenter on the target element.
# ────────────────────────────────────────────────────────────────────
@pytest.mark.e2e
def test_hover_triggers_mouseenter(firefox_binary):
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
page.goto(_data_url(
"<div id=h style='width:200px;height:100px;background:red' "
"onmouseenter=\"window.__h=true\">x</div>"
))
page.locator("#h").hover()
# Wait for the event rather than reading immediately: under load / on a
# virtual display the mouseenter can land a beat after hover() returns,
# which made an instant read flaky. wait_for_function still fails (times
# out) if mouseenter genuinely never fires. Timeout is generous (10s) so a
# busy full-suite run — where browser startup + CPU contention can push
# the event past a tight 5s window — doesn't flake; the event itself fires
# in well under a second when run in isolation.
page.wait_for_function("() => window.__h === true", timeout=10_000)
# ────────────────────────────────────────────────────────────────────
# Manual mousedown/mouseup — exercises the same sendEvents path but
# splits the press/release across two API calls.
# ────────────────────────────────────────────────────────────────────
@pytest.mark.e2e
def test_manual_down_up_fires_full_sequence(firefox_binary):
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
page.goto(_data_url(
"<button id=b style='width:200px;height:100px' "
"onmousedown=\"window.__d=true\" "
"onmouseup=\"window.__u=true\" "
"onclick=\"window.__c=true\">x</button>"
))
box = page.locator("#b").bounding_box()
cx = box["x"] + box["width"] / 2
cy = box["y"] + box["height"] / 2
page.mouse.move(cx, cy)
page.mouse.down()
page.mouse.up()
assert page.evaluate("window.__d") is True
assert page.evaluate("window.__u") is True
assert page.evaluate("window.__c") is True
# ────────────────────────────────────────────────────────────────────
# Scroll-and-click — verifies the scrollRectIntoViewIfNeeded guard in
# Page.dispatchMouseEvent does not break the auto-scroll behavior on a
# button placed off-screen below the viewport.
# ────────────────────────────────────────────────────────────────────
@pytest.mark.e2e
def test_click_offscreen_button_after_scroll(firefox_binary):
"""Inspired by test_click.py::test_scroll_and_click_the_button."""
with InvisiblePlaywright(seed=42, binary_path=firefox_binary) as browser:
page = browser.new_page()
page.goto(_data_url(
"<div style='height:3000px'></div>"
"<button id=b onclick=\"window.__c=true\">deep</button>"
))
page.click("#b")
assert page.evaluate("window.__c") is True

260
tests/test_network.py Normal file

@ -0,0 +1,260 @@
"""Unit tests for invisible_playwright._fpforge._network.
Covers the Bayesian network primitives: _weighted_pick, _parent_key,
_topsort, Node.sample, Network.sample.
"""
import random
import pytest
from invisible_playwright._fpforge._network import (
Network,
Node,
_parent_key,
_topsort,
_weighted_pick,
)
# ── _weighted_pick ─────────────────────────────────────────────────────
@pytest.mark.unit
def test_weighted_pick_normal_weights_deterministic_per_seed():
"""WP1 [HAPPY]: returns one of the values; deterministic with seeded rng."""
table = [{"value": "A", "prob": 0.7}, {"value": "B", "prob": 0.3}]
rng = random.Random(42)
out = _weighted_pick(table, rng)
assert out in {"A", "B"}
# same seed → same draw
assert _weighted_pick(table, random.Random(42)) == out
@pytest.mark.unit
def test_weighted_pick_single_element_table():
"""WP2 [BVA]: single entry → always returns that value."""
table = [{"value": "X", "prob": 1.0}]
for seed in (0, 1, 999):
assert _weighted_pick(table, random.Random(seed)) == "X"
@pytest.mark.unit
def test_weighted_pick_empty_table_raises():
"""WP3 [NEG]: empty list → ValueError."""
with pytest.raises(ValueError, match="Empty CPT entry"):
_weighted_pick([], random.Random(0))
@pytest.mark.unit
def test_weighted_pick_all_zero_probs_uses_uniform_fallback():
"""WP4 [ECP]: total == 0 → falls back to rng.choice (uniform)."""
table = [{"value": "A", "prob": 0}, {"value": "B", "prob": 0}]
# Sample many times — both outcomes must be reachable under uniform choice.
rng = random.Random(123)
seen = {_weighted_pick(table, rng) for _ in range(50)}
assert seen == {"A", "B"}
@pytest.mark.unit
def test_weighted_pick_unnormalized_weights():
"""WP6 [ECP]: weights 3/7 normalize to 0.3/0.7; same seed → same result."""
table = [{"value": "A", "prob": 3}, {"value": "B", "prob": 7}]
rng_a = random.Random(42)
rng_b = random.Random(42)
# Equivalent normalized table must yield the same draw given same rng state.
table_norm = [{"value": "A", "prob": 0.3}, {"value": "B", "prob": 0.7}]
assert _weighted_pick(table, rng_a) == _weighted_pick(table_norm, rng_b)
@pytest.mark.unit
def test_weighted_pick_complex_value_types_returned_as_is():
"""WP7 [ECP]: values can be dicts; returned by reference."""
payload = {"w": 1920, "h": 1080}
table = [{"value": payload, "prob": 1.0}]
assert _weighted_pick(table, random.Random(0)) is payload
@pytest.mark.unit
def test_weighted_pick_total_exactly_zero_single_entry():
"""WP8 [BVA]: total = 0 with one value → uniform fallback returns it."""
table = [{"value": "A", "prob": 0}]
assert _weighted_pick(table, random.Random(0)) == "A"
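Reviewer note: the `_weighted_pick` tests above can be satisfied by a cumulative-weight scan. The sketch below is a hypothetical reimplementation for illustration, not the code in `_network.py`. The error message and the uniform fallback for an all-zero table are taken from the test docstrings.

```python
import random


def weighted_pick(table, rng):
    """Pick one entry's value with probability proportional to its 'prob' weight."""
    if not table:
        raise ValueError("Empty CPT entry")
    total = sum(entry["prob"] for entry in table)
    if total <= 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.choice(table)["value"]
    threshold = rng.random() * total  # scaling by total handles unnormalized weights
    running = 0.0
    for entry in table:
        running += entry["prob"]
        if threshold < running:
            return entry["value"]
    return table[-1]["value"]  # guard against floating-point round-off


rng = random.Random(42)
print(weighted_pick([{"value": "A", "prob": 3}, {"value": "B", "prob": 7}], rng))
```

Scaling the random draw by the total, instead of normalizing the table first, avoids building a second list.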
# ── _parent_key ─────────────────────────────────────────────────────────
@pytest.mark.unit
def test_parent_key_single_string_parent():
"""PK1 [ECP]: single string parent → value returned as-is."""
assert _parent_key(["gpu"], {"gpu": "Intel"}) == "Intel"
@pytest.mark.unit
def test_parent_key_single_non_string_parent_uses_json():
"""PK2 [ECP]: single non-string parent → json.dumps with sort_keys."""
assert _parent_key(["x"], {"x": 42}) == "42"
@pytest.mark.unit
def test_parent_key_multiple_parents_returns_json_array():
"""PK3 [ECP]: multiple parents → JSON array in declared order."""
assert _parent_key(["a", "b"], {"a": "X", "b": "Y"}) == '["X", "Y"]'
@pytest.mark.unit
def test_parent_key_single_dict_parent_sorted_keys():
"""PK4 [ECP]: dict value → JSON with sorted keys for stable lookup."""
out = _parent_key(["gpu"], {"gpu": {"renderer": "A", "vendor": "B"}})
assert out == '{"renderer": "A", "vendor": "B"}'
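Reviewer note: the four `_parent_key` cases above fit in a few lines built on `json.dumps`. This is a hypothetical reimplementation derived only from the assertions; the real function may differ in details not covered by the tests.

```python
import json


def parent_key(parents, context):
    """Build a stable CPT lookup key from the sampled values of a node's parents."""
    if len(parents) == 1:
        value = context[parents[0]]
        # A lone string parent is used verbatim; anything else is serialized.
        return value if isinstance(value, str) else json.dumps(value, sort_keys=True)
    # Several parents: JSON array in declared parent order.
    return json.dumps([context[name] for name in parents], sort_keys=True)


print(parent_key(["a", "b"], {"a": "X", "b": "Y"}))
```

`sort_keys=True` keeps the key stable when a parent value is a dict whose insertion order varies.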
# ── _topsort ────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_topsort_linear_chain():
"""TS1 [HAPPY]: A → B → C produces order [A, B, C]."""
a = Node("A")
b = Node("B", parents=["A"])
c = Node("C", parents=["B"])
order = [n.name for n in _topsort([c, b, a])]
assert order == ["A", "B", "C"]
@pytest.mark.unit
def test_topsort_diamond():
"""TS2 [HAPPY]: diamond A→{B,C}→D — A before B,C; B,C before D."""
a = Node("A")
b = Node("B", parents=["A"])
c = Node("C", parents=["A"])
d = Node("D", parents=["B", "C"])
order = [n.name for n in _topsort([d, c, b, a])]
assert order.index("A") < order.index("B")
assert order.index("A") < order.index("C")
assert order.index("B") < order.index("D")
assert order.index("C") < order.index("D")
@pytest.mark.unit
def test_topsort_direct_cycle_raises():
"""TS3 [NEG]: A↔B mutual parent → ValueError("Cycle at ...")."""
a = Node("A", parents=["B"])
b = Node("B", parents=["A"])
with pytest.raises(ValueError, match="Cycle"):
_topsort([a, b])
@pytest.mark.unit
def test_topsort_unknown_parent_raises():
"""TS4 [NEG]: parent name not in node list → ValueError."""
a = Node("A", parents=["ghost"])
with pytest.raises(ValueError, match="unknown parent"):
_topsort([a])
@pytest.mark.unit
def test_topsort_single_root_node():
"""TS5 [BVA]: one root node → returns it unchanged."""
a = Node("A")
assert [n.name for n in _topsort([a])] == ["A"]
@pytest.mark.unit
def test_topsort_empty_list():
"""TS6 [BVA]: empty → empty."""
assert _topsort([]) == []
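Reviewer note: the ordering and error cases above match a depth-first topological sort with cycle detection. The sketch below is one possible implementation, not necessarily the algorithm `_topsort` uses. `SketchNode` is a hypothetical stand-in for the real `Node` class, and the error messages are modelled on the patterns the tests match.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SketchNode:
    name: str
    parents: List[str] = field(default_factory=list)


def topsort(nodes: List[SketchNode]) -> List[SketchNode]:
    """Order nodes so that every parent appears before its children."""
    by_name: Dict[str, SketchNode] = {n.name: n for n in nodes}
    state: Dict[str, int] = {}  # 1 = on the current DFS path, 2 = finished
    order: List[SketchNode] = []

    def visit(node: SketchNode) -> None:
        if state.get(node.name) == 2:
            return
        if state.get(node.name) == 1:
            raise ValueError(f"Cycle at {node.name}")
        state[node.name] = 1
        for parent in node.parents:
            if parent not in by_name:
                raise ValueError(f"{node.name}: unknown parent {parent!r}")
            visit(by_name[parent])
        state[node.name] = 2
        order.append(node)

    for node in nodes:
        visit(node)
    return order
```

Recursion depth equals the longest parent chain, which is fine for small fingerprint networks but would need an explicit stack for very deep graphs.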
# ── Node.sample ─────────────────────────────────────────────────────────
@pytest.mark.unit
def test_node_sample_classifier_ignores_cpt():
"""NS1 [ECP]: classifier node returns classifier output, CPT unused."""
node = Node("c", parents=["x"], classifier=lambda ctx: "FIXED")
assert node.sample({"x": "anything"}, random.Random(0)) == "FIXED"
@pytest.mark.unit
def test_node_sample_marginal_root():
"""NS2 [ECP]: root with single-entry CPT → returns that value."""
node = Node("r", parents=[], cpt=[{"value": "A", "prob": 1.0}])
assert node.sample({}, random.Random(0)) == "A"
@pytest.mark.unit
def test_node_sample_conditional_key_exists():
"""NS3 [ECP]: parent value in CPT → samples from that distribution."""
cpt = {
"high_end": [{"value": "fast", "prob": 1.0}],
"low_end": [{"value": "slow", "prob": 1.0}],
}
node = Node("hw", parents=["gpu_class"], cpt=cpt)
assert node.sample({"gpu_class": "high_end"}, random.Random(0)) == "fast"
assert node.sample({"gpu_class": "low_end"}, random.Random(0)) == "slow"
@pytest.mark.unit
def test_node_sample_conditional_key_miss_falls_back_to_union():
    """NS4 [ECP]: unknown parent value → union of all CPT entries."""
    cpt = {
        "high_end": [{"value": "fast", "prob": 1.0}],
        "low_end": [{"value": "slow", "prob": 1.0}],
    }
    node = Node("hw", parents=["gpu_class"], cpt=cpt)
    rng = random.Random(0)
    seen = {node.sample({"gpu_class": "unknown_tier"}, rng) for _ in range(50)}
    # Every draw must come from the union of known values.
    assert seen <= {"fast", "slow"}
    # Weak lower bound: this only confirms that the fallback produced a value;
    # it does not require both outcomes to appear.
    assert len(seen) >= 1
@pytest.mark.unit
def test_node_sample_conditional_empty_cpt_raises():
"""NS5 [NEG]: CPT with all-empty value lists → ValueError."""
cpt = {"a": [], "b": []}
node = Node("x", parents=["p"], cpt=cpt)
with pytest.raises(ValueError, match="no CPT entries"):
node.sample({"p": "unknown"}, random.Random(0))
# ── Network.sample ──────────────────────────────────────────────────────
@pytest.mark.unit
def test_network_sample_basic_graph_returns_all_keys():
"""NW1 [HAPPY]: 3-node network → context dict has all node names."""
gpu = Node("gpu", parents=[], cpt=[{"value": "Intel", "prob": 1.0}])
gpu_class = Node(
"gpu_class", parents=["gpu"],
classifier=lambda ctx: "integrated_modern",
)
hw = Node(
"hw", parents=["gpu_class"],
cpt={"integrated_modern": [{"value": 8, "prob": 1.0}]},
)
net = Network([gpu, gpu_class, hw])
out = net.sample(random.Random(42))
assert set(out.keys()) == {"gpu", "gpu_class", "hw"}
assert out["gpu"] == "Intel"
assert out["gpu_class"] == "integrated_modern"
assert out["hw"] == 8
@pytest.mark.unit
def test_network_sample_deterministic_per_seed():
"""NW2 [ECP]: same rng seed → identical sample."""
gpu = Node("gpu", parents=[], cpt=[
{"value": "Intel", "prob": 0.5},
{"value": "NVIDIA", "prob": 0.5},
])
net = Network([gpu])
assert net.sample(random.Random(7)) == net.sample(random.Random(7))
@pytest.mark.unit
def test_network_sample_varies_across_seeds():
"""NW3 [ECP]: 32 distinct seeds over a 2-way root must see both outcomes."""
gpu = Node("gpu", parents=[], cpt=[
{"value": "Intel", "prob": 0.5},
{"value": "NVIDIA", "prob": 0.5},
])
net = Network([gpu])
seen = {net.sample(random.Random(s))["gpu"] for s in range(32)}
assert seen == {"Intel", "NVIDIA"}

83
tests/test_pin.py Normal file

@ -0,0 +1,83 @@
"""Pin parameter validation and propagation through the fingerprint generator."""
import pytest
from invisible_playwright._fpforge import generate_profile
from invisible_playwright.prefs import translate_profile_to_prefs
def test_pin_screen_width_propagates_to_prefs():
p = generate_profile(seed=42, pin={"screen.width": 2560, "screen.height": 1440})
prefs = translate_profile_to_prefs(p)
assert prefs["zoom.stealth.screen.width"] == 2560
assert prefs["zoom.stealth.screen.height"] == 1440
def test_pin_gpu_renderer_propagates():
target = "ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11)"
p = generate_profile(seed=42, pin={"gpu.renderer": target})
# The Profile carries the pinned value regardless of platform; the prefs
# translation may suppress it on Windows for hash-coherence reasons.
assert p.gpu.renderer == target
def test_pin_hardware_concurrency_propagates():
p = generate_profile(seed=42, pin={"hardware.concurrency": 16})
assert p.hardware.concurrency == 16
def test_pin_audio_sample_rate_propagates():
p = generate_profile(seed=42, pin={"audio.sample_rate": 48000})
assert p.audio.sample_rate == 48000
def test_pin_unknown_key_raises():
with pytest.raises(ValueError, match="not valid|unknown"):
generate_profile(seed=42, pin={"nonexistent.field": 123})
def test_pin_unknown_group_raises():
with pytest.raises(ValueError, match="unknown group"):
generate_profile(seed=42, pin={"madeup.field": "x"})
def test_pin_unknown_field_in_known_group_raises():
with pytest.raises(ValueError, match="unknown field"):
generate_profile(seed=42, pin={"screen.not_a_real_field": 100})
def test_pin_key_without_dot_raises():
"""Top-level keys must be in the allowlist; arbitrary flat keys reject."""
with pytest.raises(ValueError, match="not valid"):
generate_profile(seed=42, pin={"madeup": 1})
def test_pin_top_level_fonts_accepted():
p = generate_profile(seed=42, pin={"fonts": ["Arial", "Verdana", "Tahoma"]})
assert "Arial" in p.fonts
assert "Verdana" in p.fonts
def test_pin_top_level_dark_theme_accepted():
p = generate_profile(seed=42, pin={"dark_theme": True})
assert p.dark_theme is True
def test_pin_fonts_wrong_type_raises():
with pytest.raises(TypeError, match="list/tuple"):
generate_profile(seed=42, pin={"fonts": "Arial,Verdana"})
def test_pin_overrides_seed_value():
"""The same seed produces different output once a pin is applied."""
natural = generate_profile(seed=42)
pinned = generate_profile(seed=42, pin={"screen.width": natural.screen.width + 100})
assert pinned.screen.width == natural.screen.width + 100
assert pinned.screen.width != natural.screen.width
def test_pin_reproducibility_within_same_seed():
a = generate_profile(seed=42, pin={"screen.width": 1920, "audio.sample_rate": 48000})
b = generate_profile(seed=42, pin={"screen.width": 1920, "audio.sample_rate": 48000})
assert a.screen.width == b.screen.width
assert a.audio.sample_rate == b.audio.sample_rate
assert a.gpu.renderer == b.gpu.renderer
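Reviewer note: the pin tests above distinguish three failure modes: a flat key outside the allowlist, an unknown group, and an unknown field in a known group. The sketch below shows one way to structure that validation. The group and field tables are hypothetical and list only names that appear in the tests; the real validator lives in `_fpforge` and is not part of this hunk.

```python
# Hypothetical tables built from the keys used in the tests above.
TOP_LEVEL_KEYS = {"fonts", "dark_theme"}
GROUP_FIELDS = {
    "screen": {"width", "height"},
    "gpu": {"renderer"},
    "hardware": {"concurrency"},
    "audio": {"sample_rate"},
}


def validate_pin_key(key: str) -> tuple:
    """Split a pin key into its parts, raising ValueError for unknown keys."""
    if "." not in key:
        if key not in TOP_LEVEL_KEYS:
            raise ValueError(f"pin key {key!r} is not valid")
        return (key,)
    group, field_name = key.split(".", 1)
    if group not in GROUP_FIELDS:
        raise ValueError(f"unknown group {group!r} in pin key {key!r}")
    if field_name not in GROUP_FIELDS[group]:
        raise ValueError(f"unknown field {field_name!r} in group {group!r}")
    return (group, field_name)


print(validate_pin_key("screen.width"))
```

Checking the group before the field gives the caller the most specific error message available.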


@ -1,14 +1,29 @@
-from stealthfox._fpforge import generate_profile
-from stealthfox.prefs import translate_profile_to_prefs
+import re
+import sys
+import pytest
+from invisible_playwright._fpforge import generate_profile
+from invisible_playwright.prefs import (
+    _LINUX_GENERIC_FONT_FACTORS,
+    _accept_language,
+    _font_metrics_for_platform,
+    _WIN_LIGHT_COLORS,
+    translate_profile_to_prefs,
+)
-def test_translate_includes_gpu_renderer():
+@pytest.mark.unit
+def test_translate_includes_gpu_renderer_windows(monkeypatch):
+    """On Windows, renderer/vendor are cleared so ANGLE reports native hardware."""
+    monkeypatch.setattr(sys, "platform", "win32")
     p = generate_profile(seed=42)
     prefs = translate_profile_to_prefs(p)
-    assert prefs["zoom.stealth.webgl.renderer"] == p.gpu.renderer
-    assert prefs["zoom.stealth.webgl.vendor"] == p.gpu.vendor
+    assert prefs["zoom.stealth.webgl.renderer"] == ""
+    assert prefs["zoom.stealth.webgl.vendor"] == ""
+@pytest.mark.unit
 def test_translate_includes_screen():
     p = generate_profile(seed=42)
     prefs = translate_profile_to_prefs(p)
@ -16,20 +31,486 @@ def test_translate_includes_screen():
     assert prefs["zoom.stealth.screen.height"] == p.screen.height
+@pytest.mark.unit
 def test_translate_is_deterministic_per_seed():
     a = translate_profile_to_prefs(generate_profile(seed=42))
     b = translate_profile_to_prefs(generate_profile(seed=42))
     assert a == b
+@pytest.mark.unit
 def test_translate_varies_across_seeds():
     a = translate_profile_to_prefs(generate_profile(seed=1))
     b = translate_profile_to_prefs(generate_profile(seed=2))
     assert a != b
+@pytest.mark.unit
 def test_translate_has_stealth_baseline_constants():
     p = generate_profile(seed=42)
     prefs = translate_profile_to_prefs(p)
     assert prefs.get("privacy.resistFingerprinting") is False
     assert "media.peerconnection.enabled" in prefs
# ──────────────────────────────────────────────────────────────────────
# _accept_language (platform-agnostic)
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_accept_language_with_region():
# AL1
assert _accept_language("en-US") == "en-US, en"
@pytest.mark.unit
def test_accept_language_no_region():
# AL2
assert _accept_language("fr") == "fr"
@pytest.mark.unit
def test_accept_language_underscore_normalized():
# AL3
assert _accept_language("pt_BR") == "pt-BR, pt"
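Reviewer note: the three `_accept_language` cases above (AL1 to AL3) can be reproduced with a short string transformation. This is a hypothetical reimplementation that covers only the behaviour the tests assert; the real helper may handle more cases.

```python
def accept_language(locale: str) -> str:
    """Derive an Accept-Language value: the full tag, then the bare language."""
    tag = locale.replace("_", "-")     # normalize pt_BR to pt-BR
    language = tag.split("-", 1)[0]
    if language == tag:
        return tag                     # no region part, nothing to append
    return f"{tag}, {language}"


print(accept_language("pt_BR"))
```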
# ──────────────────────────────────────────────────────────────────────
# _font_metrics_for_platform
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_font_metrics_windows_returns_empty(monkeypatch):
# FM2: Windows never applies width-scale factors.
monkeypatch.setattr(sys, "platform", "win32")
assert _font_metrics_for_platform("Arial|1.0,Verdana|0.9,") == ""
@pytest.mark.unit
def test_font_metrics_empty_input_returns_empty():
# FM3: Empty input always returns "" regardless of platform.
assert _font_metrics_for_platform("") == ""
# ──────────────────────────────────────────────────────────────────────
# Platform-specific GPU / MSAA (Windows)
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_gpu_renderer_empty_on_windows(monkeypatch):
# PG2
monkeypatch.setattr(sys, "platform", "win32")
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p)
assert prefs["zoom.stealth.webgl.renderer"] == ""
assert prefs["zoom.stealth.webgl.vendor"] == ""
@pytest.mark.unit
def test_msaa_pinned_to_4_on_windows(monkeypatch):
# PG4: even when profile.webgl.msaa_samples differs, Windows pins to 4.
monkeypatch.setattr(sys, "platform", "win32")
p = generate_profile(seed=42, pin={"webgl.msaa_samples": 8})
prefs = translate_profile_to_prefs(p)
assert prefs["zoom.stealth.webgl.msaa"] == 4
assert prefs["webgl.msaa-samples"] == 4
assert prefs["webgl.msaa-force"] is True
# ──────────────────────────────────────────────────────────────────────
# Canvas noise skip mask (Windows always uses intel path)
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_canvas_noise_mask_windows_uses_intel_path(monkeypatch):
# CN3: on Windows _renderer_lo is hardcoded to "intel" → mask=15.
monkeypatch.setattr(sys, "platform", "win32")
p = generate_profile(
seed=42,
pin={"gpu.renderer": "ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11)"},
)
prefs = translate_profile_to_prefs(p)
assert prefs["zoom.stealth.canvas.noise_skip_mask"] == 15
# ──────────────────────────────────────────────────────────────────────
# WebGL extensions (Windows clears them)
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_webgl_extensions_cleared_on_windows(monkeypatch):
# WE2
monkeypatch.setattr(sys, "platform", "win32")
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p)
assert prefs["zoom.stealth.webgl.extensions"] == ""
assert prefs["zoom.stealth.webgl2.extensions"] == ""
# ──────────────────────────────────────────────────────────────────────
# Timezone (platform-agnostic)
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_timezone_set_uses_juggler_pref():
# TZ1 — juggler.timezone.override is the sole C++-read timezone pref;
# the old zoom.stealth.timezone alias (orphan) must NOT be reintroduced.
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p, timezone="America/New_York")
assert prefs["juggler.timezone.override"] == "America/New_York"
assert "zoom.stealth.timezone" not in prefs
@pytest.mark.unit
def test_timezone_empty_omits_the_key():
# TZ2
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p, timezone="")
assert "juggler.timezone.override" not in prefs
assert "zoom.stealth.timezone" not in prefs
# ──────────────────────────────────────────────────────────────────────
# extra_prefs overlay (platform-agnostic)
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_extra_prefs_adds_custom_key():
# EP1
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p, extra_prefs={"custom.pref": 42})
assert prefs["custom.pref"] == 42
@pytest.mark.unit
def test_extra_prefs_none_value_deletes_key():
# EP2
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(
p, extra_prefs={"privacy.resistFingerprinting": None}
)
assert "privacy.resistFingerprinting" not in prefs
@pytest.mark.unit
def test_extra_prefs_overrides_existing_key():
# EP3 — override a real baseline key (hw_seed is the live cross-process seed)
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p, extra_prefs={"zoom.stealth.fpp.hw_seed": 999})
assert prefs["zoom.stealth.fpp.hw_seed"] == 999
@pytest.mark.unit
def test_extra_prefs_none_is_no_op():
# EP4
p = generate_profile(seed=42)
base = translate_profile_to_prefs(p)
with_none = translate_profile_to_prefs(p, extra_prefs=None)
assert base == with_none
@pytest.mark.unit
def test_extra_prefs_empty_dict_is_no_op():
# EP5
p = generate_profile(seed=42)
base = translate_profile_to_prefs(p)
with_empty = translate_profile_to_prefs(p, extra_prefs={})
assert base == with_empty
# ──────────────────────────────────────────────────────────────────────
# System colors / dark theme (platform-agnostic — palette is Win10)
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_system_colors_present_when_light_theme():
# SC1
p = generate_profile(seed=42, pin={"dark_theme": False})
prefs = translate_profile_to_prefs(p)
assert prefs["ui.systemUsesDarkTheme"] == 0
# Spot-check a few keys from the Win10 light palette.
for key in _WIN_LIGHT_COLORS:
assert key in prefs
assert prefs[key] == _WIN_LIGHT_COLORS[key]
@pytest.mark.unit
def test_system_colors_absent_when_dark_theme():
# SC2
p = generate_profile(seed=42, pin={"dark_theme": True})
prefs = translate_profile_to_prefs(p)
assert prefs["ui.systemUsesDarkTheme"] == 1
for key in _WIN_LIGHT_COLORS:
assert key not in prefs
# ──────────────────────────────────────────────────────────────────────
# Locale prefs (platform-agnostic)
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_locale_en_us_accept_languages():
# LC1
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p, locale="en-US")
assert prefs["intl.accept_languages"] == "en-US, en"
@pytest.mark.unit
def test_locale_underscore_form_normalized():
# LC2
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p, locale="de_DE")
assert prefs["intl.accept_languages"] == "de-DE, de"
assert prefs["general.useragent.locale"] == "de-DE"
assert prefs["intl.locale.requested"] == "de-DE"
@pytest.mark.unit
def test_locale_empty_falls_back_to_en_us():
# LC3
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p, locale="")
assert prefs["intl.accept_languages"] == "en-US, en"
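# The locale cases LC1-LC3 above can be summarised in a small sketch. This is an
# illustration only: `sketch_locale_prefs` is a hypothetical helper, not the
# wrapper's real code, and its handling of a bare language tag such as "de" is an
# assumption that the tests in this file do not cover.

```python
def sketch_locale_prefs(locale: str) -> dict:
    """Hypothetical locale -> prefs mapping that mirrors LC1-LC3."""
    # Empty input falls back to en-US (LC3); underscores become hyphens (LC2).
    tag = (locale or "en-US").replace("_", "-")
    lang = tag.split("-", 1)[0]
    # Assumption: a bare tag like "de" is not repeated in the accept list.
    accept = tag if lang == tag else f"{tag}, {lang}"
    return {
        "intl.accept_languages": accept,
        "general.useragent.locale": tag,
        "intl.locale.requested": tag,
    }
```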
# ──────────────────────────────────────────────────────────────────────
# Xvfb workarounds (Windows must NOT set Linux-only keys)
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_xvfb_workarounds_absent_on_windows(monkeypatch):
# XW2
monkeypatch.setattr(sys, "platform", "win32")
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p)
assert "gfx.webrender.all" not in prefs
assert "gfx.webrender.force-disabled" not in prefs
assert "webgl.force-enabled" not in prefs
# ──────────────────────────────────────────────────────────────────────
# Windows virtual-desktop workarounds
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_virtual_display_workaround_applied_on_windows(monkeypatch):
# VD1
monkeypatch.setattr(sys, "platform", "win32")
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p, virtual_display=True)
assert prefs["security.sandbox.gpu.level"] == 0
@pytest.mark.unit
def test_virtual_display_workaround_absent_when_disabled(monkeypatch):
# VD2
monkeypatch.setattr(sys, "platform", "win32")
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p, virtual_display=False)
assert "security.sandbox.gpu.level" not in prefs
# ──────────────────────────────────────────────────────────────────────
# Seed-derived LAN IP (platform-agnostic)
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_lan_ip_matches_192_168_pattern():
# LI1
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p)
ip = prefs["zoom.stealth.webrtc.host_ip"]
m = re.match(r"^192\.168\.(\d+)\.(\d+)$", ip)
assert m, f"unexpected LAN IP format: {ip!r}"
o3, o4 = int(m.group(1)), int(m.group(2))
assert 1 <= o3 <= 254
assert 1 <= o4 <= 254
@pytest.mark.unit
def test_lan_ip_deterministic_per_seed():
# LI2
a = translate_profile_to_prefs(generate_profile(seed=42))["zoom.stealth.webrtc.host_ip"]
b = translate_profile_to_prefs(generate_profile(seed=42))["zoom.stealth.webrtc.host_ip"]
assert a == b
@pytest.mark.unit
def test_lan_ip_seed_zero_has_no_zero_octets():
# LI3: code adds +1 so neither dynamic octet should ever be 0.
p = generate_profile(seed=0)
prefs = translate_profile_to_prefs(p)
ip = prefs["zoom.stealth.webrtc.host_ip"]
octets = ip.split(".")
assert octets[0] == "192"
assert octets[1] == "168"
assert int(octets[2]) >= 1
assert int(octets[3]) >= 1
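# A minimal sketch of the invariant that LI1-LI3 pin down, assuming each dynamic
# octet is taken from the seed modulo 254 and then shifted by +1. The function
# name `sketch_lan_ip` and the bit-slicing scheme are hypothetical; the real
# derivation is not shown in this diff.

```python
def sketch_lan_ip(seed: int) -> str:
    """Hypothetical seed -> 192.168.x.y mapping (not the project's actual code)."""
    # `% 254` yields 0..253; the +1 shifts that to 1..254, so an octet is never
    # 0 (network address) or 255 (broadcast).
    third = (seed >> 8) % 254 + 1
    fourth = seed % 254 + 1
    return f"192.168.{third}.{fourth}"
```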
# ──────────────────────────────────────────────────────────────────────
# Linux-specific tests — exercise the branches that only fire when
# ``sys.platform.startswith("linux")``. Patched via ``monkeypatch`` so
# these run on any host CI environment.
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_font_metrics_linux_prepends_generic_factors(monkeypatch):
# FM1: Linux prepends the GTK/DejaVu compensation block to the
# per-font metrics string sampled from the profile.
monkeypatch.setattr(sys, "platform", "linux")
out = _font_metrics_for_platform("Arial|1.0,Verdana|0.9,")
assert out.startswith(_LINUX_GENERIC_FONT_FACTORS)
assert out.endswith("Arial|1.0,Verdana|0.9,")
@pytest.mark.unit
def test_font_metrics_linux_empty_input_returns_empty(monkeypatch):
# FM1b: even on Linux, empty profile metrics short-circuits before
# the prepend so we never emit a metrics pref containing only the
# generic block (which would surface as a tampering signal).
monkeypatch.setattr(sys, "platform", "linux")
assert _font_metrics_for_platform("") == ""
@pytest.mark.unit
def test_font_metrics_linux2_variant_uses_linux_branch(monkeypatch):
# FM1c: ``sys.platform`` was ``linux2`` on Python < 3.3; since 3.3 it is
# always ``linux``. ``startswith("linux")`` accepts both.
monkeypatch.setattr(sys, "platform", "linux2")
out = _font_metrics_for_platform("Verdana|0.9,")
assert out.startswith(_LINUX_GENERIC_FONT_FACTORS)
@pytest.mark.unit
def test_gpu_renderer_set_from_profile_on_linux(monkeypatch):
# PG1: on Linux we spoof to the profile's Windows-ANGLE renderer
# string so cross-platform sessions present a consistent Windows GPU.
monkeypatch.setattr(sys, "platform", "linux")
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p)
assert prefs["zoom.stealth.webgl.renderer"] == p.gpu.renderer
assert prefs["zoom.stealth.webgl.vendor"] == p.gpu.vendor
assert prefs["zoom.stealth.webgl.renderer"] # non-empty
@pytest.mark.unit
def test_msaa_from_profile_on_linux(monkeypatch):
# PG3: on Linux, MSAA comes from the profile's sampled value rather
# than being pinned to 4 (which is the Windows ANGLE default).
monkeypatch.setattr(sys, "platform", "linux")
p = generate_profile(seed=42, pin={"webgl.msaa_samples": 8})
prefs = translate_profile_to_prefs(p)
assert prefs["zoom.stealth.webgl.msaa"] == 8
assert prefs["webgl.msaa-samples"] == 8
assert prefs["webgl.msaa-force"] is True
@pytest.mark.unit
def test_msaa_zero_disables_force_on_linux(monkeypatch):
# PG3b: MSAA=0 means "no MSAA" so ``webgl.msaa-force`` must be False.
# Verifies the ``> 0`` guard on the force flag.
monkeypatch.setattr(sys, "platform", "linux")
p = generate_profile(seed=42, pin={"webgl.msaa_samples": 0})
prefs = translate_profile_to_prefs(p)
assert prefs["zoom.stealth.webgl.msaa"] == 0
assert prefs["webgl.msaa-force"] is False
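# The MSAA rules exercised by the Windows and Linux tests can be sketched as
# follows. `sketch_msaa_prefs` is a hypothetical helper written for illustration;
# only the pref names and the expected values come from the tests in this file.

```python
def sketch_msaa_prefs(sampled: int, platform: str) -> dict:
    """Hypothetical MSAA pref selection mirroring the tests above."""
    # Windows is pinned to 4 samples; other platforms use the profile's value.
    msaa = 4 if platform == "win32" else sampled
    return {
        "zoom.stealth.webgl.msaa": msaa,
        "webgl.msaa-samples": msaa,
        # The `> 0` guard: zero samples means "no MSAA", so do not force it.
        "webgl.msaa-force": msaa > 0,
    }
```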
@pytest.mark.unit
def test_canvas_noise_mask_intel_on_linux(monkeypatch):
# CN1: Intel renderer → 1/16 noise (mask=15). Pinning the renderer
# exercises the live ``_renderer_lo`` branch on Linux (where the
# value is read from the profile rather than hardcoded as on Windows).
monkeypatch.setattr(sys, "platform", "linux")
p = generate_profile(
seed=42,
pin={
"gpu.renderer": "ANGLE (Intel, Intel(R) UHD Graphics 630 Direct3D11 vs_5_0 ps_5_0, D3D11)",
"gpu.vendor": "Google Inc. (Intel)",
},
)
prefs = translate_profile_to_prefs(p)
assert prefs["zoom.stealth.canvas.noise_skip_mask"] == 15
@pytest.mark.unit
def test_canvas_noise_mask_nvidia_on_linux(monkeypatch):
# CN2: NVIDIA/AMD renderer → 1/8 noise (mask=7). The "intel" substring
# check must NOT match here.
monkeypatch.setattr(sys, "platform", "linux")
p = generate_profile(
seed=42,
pin={
"gpu.renderer": "ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11 vs_5_0 ps_5_0, D3D11)",
"gpu.vendor": "Google Inc. (NVIDIA)",
},
)
prefs = translate_profile_to_prefs(p)
assert prefs["zoom.stealth.canvas.noise_skip_mask"] == 7
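# CN1-CN3 together describe a two-way choice that can be sketched as below. The
# helper name `sketch_noise_skip_mask` is hypothetical and the body is a guess at
# the shape of the logic; only the substring check on "intel", the Windows
# hardcoding, and the 15/7 mask values are taken from the test comments.

```python
def sketch_noise_skip_mask(renderer: str, platform: str) -> int:
    """Hypothetical canvas-noise mask selection mirroring CN1-CN3."""
    # On Windows the tests expect the Intel path whatever renderer is pinned.
    renderer_lo = "intel" if platform == "win32" else renderer.lower()
    # Per the test comments: mask=15 is 1/16 noise, mask=7 is 1/8 noise.
    return 15 if "intel" in renderer_lo else 7
```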
@pytest.mark.unit
def test_webgl_extensions_preserved_on_linux(monkeypatch):
# WE1: on Linux the curated WebGL1/2 extension lists from _BASELINE
# remain in the prefs dict so the patched binary publishes them
# instead of native Mesa's set.
monkeypatch.setattr(sys, "platform", "linux")
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p)
assert prefs["zoom.stealth.webgl.extensions"]
assert prefs["zoom.stealth.webgl2.extensions"]
# Spot-check a canonical Windows ANGLE extension is in the list.
assert "ANGLE_instanced_arrays" in prefs["zoom.stealth.webgl.extensions"]
assert "OVR_multiview2" in prefs["zoom.stealth.webgl2.extensions"]
@pytest.mark.unit
def test_xvfb_workarounds_applied_on_linux(monkeypatch):
# XW1: Linux Firefox under Xvfb can't run WebRender, so we force the
# software path. These are added via ``setdefault`` so callers can
# still override them via ``extra_prefs``.
monkeypatch.setattr(sys, "platform", "linux")
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p)
assert prefs["gfx.webrender.all"] is False
assert prefs["gfx.webrender.force-disabled"] is True
assert prefs["webgl.force-enabled"] is True
@pytest.mark.unit
def test_xvfb_workarounds_caller_can_override(monkeypatch):
# XW1b: the workarounds are added with ``setdefault``, so a user-
# supplied ``extra_prefs`` value wins. Verifies the override path
# doesn't get clobbered by the platform branch.
monkeypatch.setattr(sys, "platform", "linux")
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(
p, extra_prefs={"webgl.force-enabled": False}
)
assert prefs["webgl.force-enabled"] is False
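# How the extra_prefs overlay (EP1-EP5) and the ``setdefault`` workarounds
# (XW1, XW1b, XW2) could fit together, as a sketch. The ordering shown here
# (overlay first, platform defaults second) is an assumption that happens to
# satisfy those tests; `sketch_overlay` is a hypothetical name and the real
# function may be organised differently.

```python
def sketch_overlay(baseline: dict, extra_prefs, platform: str) -> dict:
    """Hypothetical overlay logic; returns a new dict and leaves inputs alone."""
    prefs = dict(baseline)
    for key, value in (extra_prefs or {}).items():
        if value is None:
            prefs.pop(key, None)   # None deletes the key (EP2)
        else:
            prefs[key] = value     # anything else adds or overrides (EP1, EP3)
    if platform.startswith("linux"):
        # setdefault never clobbers a value the caller already supplied (XW1b).
        prefs.setdefault("gfx.webrender.all", False)
        prefs.setdefault("gfx.webrender.force-disabled", True)
        prefs.setdefault("webgl.force-enabled", True)
    return prefs
```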
@pytest.mark.unit
def test_virtual_display_no_op_on_linux(monkeypatch):
# VD3: ``virtual_display`` is a Windows-only concept (CreateDesktop
# alt-desktop GPU sandbox workaround). Even when True, Linux must
# not pick up ``security.sandbox.gpu.level``.
monkeypatch.setattr(sys, "platform", "linux")
p = generate_profile(seed=42)
prefs = translate_profile_to_prefs(p, virtual_display=True)
assert "security.sandbox.gpu.level" not in prefs

348
tests/test_profile.py Normal file

@ -0,0 +1,348 @@
"""Unit tests for `_fpforge/profile.py`.
Covers `_validate_pin_key`, `_apply_pins_to_raw`, and `generate_profile`.
Test cases derived via ECP/BVA/error guessing.
"""
from dataclasses import FrozenInstanceError
import pytest
from invisible_playwright._fpforge import generate_profile
from invisible_playwright._fpforge.profile import (
Profile,
_PIN_GROUPS,
_PIN_TO_RAW,
_apply_pins_to_raw,
_validate_pin_key,
)
# ─────────────────────────────────────────────────────────────────────
# _validate_pin_key
# ─────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_validate_pin_key_top_level_fonts():
"""VK1 — `fonts` is a known top-level key."""
_validate_pin_key("fonts")
@pytest.mark.unit
def test_validate_pin_key_top_level_dark_theme():
"""VK2 — `dark_theme` is a known top-level key."""
_validate_pin_key("dark_theme")
@pytest.mark.unit
def test_validate_pin_key_dotted_screen_width():
"""VK3 — valid dotted path `screen.width`."""
_validate_pin_key("screen.width")
@pytest.mark.unit
def test_validate_pin_key_dotted_gpu_renderer():
"""VK4 — valid dotted path `gpu.renderer`."""
_validate_pin_key("gpu.renderer")
@pytest.mark.unit
def test_validate_pin_key_dotted_webgl_msaa_samples():
"""VK5 — valid dotted path `webgl.msaa_samples`."""
_validate_pin_key("webgl.msaa_samples")
@pytest.mark.unit
def test_validate_pin_key_no_dot_not_top_level_raises():
"""VK6 — bare key not in top-level set raises with hint."""
with pytest.raises(ValueError, match="group.field"):
_validate_pin_key("bogus")
@pytest.mark.unit
def test_validate_pin_key_unknown_group_raises():
"""VK7 — unknown group prefix."""
with pytest.raises(ValueError, match="unknown group"):
_validate_pin_key("network.port")
@pytest.mark.unit
def test_validate_pin_key_unknown_field_in_valid_group_raises():
"""VK8 — known group, unknown field."""
with pytest.raises(ValueError, match="unknown field"):
_validate_pin_key("screen.brightness")
@pytest.mark.unit
def test_validate_pin_key_empty_string_raises():
"""VK9 — empty key fails the dotted-form check."""
with pytest.raises(ValueError):
_validate_pin_key("")
@pytest.mark.unit
@pytest.mark.parametrize("group,fields", sorted(_PIN_GROUPS.items()))
def test_validate_pin_key_all_groups_first_field(group, fields):
"""VK10 — every defined group accepts its sorted-first field."""
first = sorted(fields)[0]
_validate_pin_key(f"{group}.{first}")
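# VK1-VK9 describe a three-step check that can be sketched as below. Everything
# prefixed `sketch_` / `_SKETCH_` is hypothetical: the group table lists only the
# keys that appear in these tests (the real `_PIN_GROUPS` is larger), and the
# error wording is invented so that it contains the substrings the tests match.

```python
_SKETCH_TOP_LEVEL = {"fonts", "dark_theme"}
_SKETCH_GROUPS = {
    "screen": {"width"},
    "gpu": {"vendor", "renderer"},
    "webgl": {"msaa_samples"},
}


def sketch_validate_pin_key(key: str) -> None:
    """Hypothetical validator: returns None on success, raises ValueError otherwise."""
    if key in _SKETCH_TOP_LEVEL:
        return
    group, sep, field = key.partition(".")
    if not sep:
        # Bare keys (including "") that are not top-level fail here.
        raise ValueError(f"pin key {key!r} must be top-level or use group.field")
    if group not in _SKETCH_GROUPS:
        raise ValueError(f"unknown group {group!r} in pin key {key!r}")
    if field not in _SKETCH_GROUPS[group]:
        raise ValueError(f"unknown field {field!r} in group {group!r}")
```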
# ─────────────────────────────────────────────────────────────────────
# _apply_pins_to_raw
# ─────────────────────────────────────────────────────────────────────
def _raw_baseline():
"""A minimal raw dict for pin tests — only the keys we care about."""
return {
"screen_w": 1920,
"screen_h": 1080,
"webgl_vendor": "Google Inc. (Intel)",
"webgl_renderer": "ANGLE (Intel)",
"font_whitelist": "arial,calibri",
"dark_theme": 0,
}
@pytest.mark.unit
def test_apply_pins_to_raw_screen_width():
"""AP1 — `screen.width` rewrites `screen_w` in raw."""
out = _apply_pins_to_raw(_raw_baseline(), {"screen.width": 2560})
assert out["screen_w"] == 2560
@pytest.mark.unit
def test_apply_pins_to_raw_fonts_list():
"""AP2 — list pin joined into comma-separated whitelist."""
out = _apply_pins_to_raw(_raw_baseline(), {"fonts": ["Arial", "Verdana"]})
assert out["font_whitelist"] == "Arial,Verdana"
@pytest.mark.unit
def test_apply_pins_to_raw_fonts_tuple():
"""AP3 — tuple pin is also accepted."""
out = _apply_pins_to_raw(_raw_baseline(), {"fonts": ("Arial",)})
assert out["font_whitelist"] == "Arial"
@pytest.mark.unit
def test_apply_pins_to_raw_fonts_string_raises():
"""AP4 — bare string is not a list/tuple, must raise."""
with pytest.raises(TypeError, match="list/tuple"):
_apply_pins_to_raw(_raw_baseline(), {"fonts": "Arial"})
@pytest.mark.unit
def test_apply_pins_to_raw_fonts_int_raises():
"""AP5 — int is also rejected."""
with pytest.raises(TypeError):
_apply_pins_to_raw(_raw_baseline(), {"fonts": 42})
@pytest.mark.unit
def test_apply_pins_to_raw_multiple_pins():
"""AP6 — multiple pins all land in raw."""
pin = {"gpu.vendor": "X", "gpu.renderer": "Y"}
out = _apply_pins_to_raw(_raw_baseline(), pin)
assert out["webgl_vendor"] == "X"
assert out["webgl_renderer"] == "Y"
@pytest.mark.unit
def test_apply_pins_to_raw_returns_copy_not_mutation():
"""AP7 — input dict is not mutated."""
raw = _raw_baseline()
snapshot = dict(raw)
_apply_pins_to_raw(raw, {"screen.width": 9999})
assert raw == snapshot
@pytest.mark.unit
def test_apply_pins_to_raw_unknown_key_silent():
"""AP8 — key not in `_PIN_TO_RAW` (and not 'fonts') is ignored.
Validation happens upstream in `generate_profile`; the inner helper
guards defensively but does not raise.
"""
raw = _raw_baseline()
out = _apply_pins_to_raw(raw, {"some.unknown": 123})
# No change to known fields
assert out["screen_w"] == raw["screen_w"]
# No new key added
assert "some.unknown" not in out
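# The contract covered by AP1-AP8 can be sketched as follows. `sketch_apply_pins`
# and `_SKETCH_PIN_TO_RAW` are hypothetical stand-ins: the three dotted mappings
# are the ones the tests confirm, while mapping `dark_theme` to a raw key of the
# same name is an assumption.

```python
_SKETCH_PIN_TO_RAW = {
    "screen.width": "screen_w",
    "gpu.vendor": "webgl_vendor",
    "gpu.renderer": "webgl_renderer",
    "dark_theme": "dark_theme",  # assumed raw key name
}


def sketch_apply_pins(raw: dict, pin: dict) -> dict:
    """Hypothetical pin application: returns a copy, never mutates `raw`."""
    out = dict(raw)
    for key, value in pin.items():
        if key == "fonts":
            # A bare string would be joined character by character, so reject it.
            if not isinstance(value, (list, tuple)):
                raise TypeError("fonts pin must be a list/tuple of font names")
            out["font_whitelist"] = ",".join(value)
        elif key in _SKETCH_PIN_TO_RAW:
            out[_SKETCH_PIN_TO_RAW[key]] = value
        # Unknown keys are ignored here; validation is expected to run upstream.
    return out
```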
# ─────────────────────────────────────────────────────────────────────
# generate_profile
# ─────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_generate_profile_happy_path():
"""GP1 — returns a fully populated Profile."""
p = generate_profile(seed=42)
assert isinstance(p, Profile)
assert p.seed == 42
assert p.gpu.vendor
assert p.gpu.renderer
assert p.gpu.class_tier in _PIN_GROUPS["gpu"].union({"low_end", "mid_range",
"high_end", "integrated_old", "integrated_modern", "workstation"})
assert p.screen.width > 0
assert p.screen.height > 0
assert p.hardware.concurrency > 0
assert p.audio.sample_rate > 0
@pytest.mark.unit
def test_generate_profile_deterministic():
"""GP2 — same seed → identical Profile (equality on frozen dataclass)."""
a = generate_profile(seed=42)
b = generate_profile(seed=42)
assert a == b
@pytest.mark.unit
def test_generate_profile_seed_float_coerced():
"""GP3 — float seed is coerced to int (truncated)."""
a = generate_profile(seed=42.7)
b = generate_profile(seed=42)
assert a == b
@pytest.mark.unit
def test_generate_profile_seed_string_coerced():
"""GP4 — numeric string seed works via int() coercion."""
a = generate_profile(seed="42")
b = generate_profile(seed=42)
assert a == b
@pytest.mark.unit
def test_generate_profile_no_pin_samples_freely():
"""GP5 — no pin: every field is sampler-derived (sanity: 2 seeds differ)."""
a = generate_profile(seed=1)
b = generate_profile(seed=2)
assert a != b
@pytest.mark.unit
def test_generate_profile_pin_overrides_screen_width():
"""GP6 — pinned width visible on the Profile dataclass."""
p = generate_profile(seed=42, pin={"screen.width": 9999})
assert p.screen.width == 9999
@pytest.mark.unit
def test_generate_profile_pin_visible_in_prefs_dict():
"""GP7 — pinned values flow through to to_prefs_dict()."""
p = generate_profile(seed=42, pin={"screen.width": 9999})
assert p.to_prefs_dict()["screen_w"] == 9999
@pytest.mark.unit
def test_generate_profile_invalid_pin_raises():
"""GP8 — bad pin key surfaces ValueError from validation."""
with pytest.raises(ValueError):
generate_profile(seed=42, pin={"bogus": 1})
@pytest.mark.unit
def test_generate_profile_empty_pin_equals_no_pin():
"""GP9 — empty pin dict is a no-op."""
a = generate_profile(seed=42, pin={})
b = generate_profile(seed=42)
assert a == b
@pytest.mark.unit
def test_generate_profile_is_frozen():
"""GP10 — Profile dataclass is immutable."""
p = generate_profile(seed=42)
with pytest.raises(FrozenInstanceError):
p.seed = 99 # type: ignore[misc]
@pytest.mark.unit
def test_generate_profile_fonts_is_list_of_strings():
"""GP11 — fonts is a non-empty list of stripped strings."""
p = generate_profile(seed=42)
assert isinstance(p.fonts, list)
assert len(p.fonts) > 0
assert all(isinstance(f, str) and f.strip() == f for f in p.fonts)
@pytest.mark.unit
def test_generate_profile_to_prefs_dict_flat_and_matches_raw():
"""GP12 — to_prefs_dict() returns a flat dict containing core sampler keys."""
p = generate_profile(seed=42)
d = p.to_prefs_dict()
assert isinstance(d, dict)
for key in ("screen_w", "screen_h", "webgl_vendor", "webgl_renderer",
"hw_concurrency", "stealth_seed"):
assert key in d
@pytest.mark.unit
def test_generate_profile_seed_zero():
"""GP13 — seed=0 is a valid lowest-value boundary."""
p = generate_profile(seed=0)
assert p.seed == 0
@pytest.mark.unit
def test_generate_profile_seed_max_int31():
"""GP14 — seed at int31 upper bound works."""
seed = (1 << 31) - 1
p = generate_profile(seed=seed)
assert p.seed == seed
@pytest.mark.unit
def test_generate_profile_dark_theme_is_bool():
"""GP15 — dark_theme is coerced to bool on the dataclass."""
p = generate_profile(seed=42)
assert isinstance(p.dark_theme, bool)
# ─────────────────────────────────────────────────────────────────────
# Additional pin coverage (recheck pass)
# ─────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_generate_profile_pin_dark_theme_true():
"""Pinning dark_theme=True flows through coercion to bool."""
p = generate_profile(seed=42, pin={"dark_theme": True})
assert p.dark_theme is True
@pytest.mark.unit
def test_generate_profile_pin_dark_theme_false():
p = generate_profile(seed=42, pin={"dark_theme": False})
assert p.dark_theme is False
@pytest.mark.unit
def test_generate_profile_pin_fonts_list_visible_on_profile():
"""fonts pin: list → joined raw string → split back to list on Profile."""
p = generate_profile(seed=42, pin={"fonts": ["Arial", "Verdana"]})
assert p.fonts == ["Arial", "Verdana"]
@pytest.mark.unit
def test_generate_profile_pin_gpu_renderer_propagates():
p = generate_profile(seed=42, pin={"gpu.renderer": "FORCED_RENDERER"})
assert p.gpu.renderer == "FORCED_RENDERER"
assert p.to_prefs_dict()["webgl_renderer"] == "FORCED_RENDERER"
@pytest.mark.unit
def test_generate_profile_pin_to_raw_keymap_complete():
"""Every dotted pin key (besides 'fonts') has a `_PIN_TO_RAW` mapping.
Guards against silently-ignored pins if someone adds a key to `_PIN_GROUPS`
but forgets the raw-key mapping.
"""
dotted = {f"{group}.{field}" for group, fields in _PIN_GROUPS.items()
for field in fields}
# 'dark_theme' is top-level and present in _PIN_TO_RAW; 'fonts' is handled
# specially and intentionally absent.
missing = dotted - set(_PIN_TO_RAW.keys())
assert missing == set(), f"pin keys without raw mapping: {sorted(missing)}"

266
tests/test_proxy.py Normal file

@ -0,0 +1,266 @@
"""Unit tests for `invisible_playwright._proxy.configure_proxy`.
Decision-table coverage of every input partition: None/empty/direct,
SOCKS4/5/default, HTTP/HTTPS, case variants, malformed, mutation contract.
"""
import pytest
from invisible_playwright._proxy import configure_proxy
# ──────────────────────────────────────────────────────────────────────
# CP1-CP7: no-op cases — return None, do NOT mutate prefs
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_cp1_none_proxy_returns_none():
prefs = {}
assert configure_proxy(None, prefs) is None
assert prefs == {}
@pytest.mark.unit
def test_cp2_empty_dict_returns_none():
prefs = {}
assert configure_proxy({}, prefs) is None
assert prefs == {}
@pytest.mark.unit
def test_cp3_empty_server_returns_none():
prefs = {}
assert configure_proxy({"server": ""}, prefs) is None
assert prefs == {}
@pytest.mark.unit
def test_cp4_whitespace_server_returns_none():
prefs = {}
assert configure_proxy({"server": " "}, prefs) is None
assert prefs == {}
@pytest.mark.unit
def test_cp5_direct_scheme_returns_none():
prefs = {}
assert configure_proxy({"server": "direct://"}, prefs) is None
assert prefs == {}
@pytest.mark.unit
def test_cp6_direct_scheme_uppercase_returns_none():
prefs = {}
assert configure_proxy({"server": "DIRECT://"}, prefs) is None
assert prefs == {}
@pytest.mark.unit
def test_cp7_direct_scheme_mixed_case_returns_none():
prefs = {}
assert configure_proxy({"server": "DiReCt://"}, prefs) is None
assert prefs == {}
# ──────────────────────────────────────────────────────────────────────
# CP8-CP9: HTTP/HTTPS — passthrough (return proxy unchanged, no mutation)
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_cp8_http_proxy_passthrough():
prefs = {}
proxy = {"server": "http://proxy:8080"}
result = configure_proxy(proxy, prefs)
assert result == proxy
# No SOCKS-related mutations.
assert "network.proxy.type" not in prefs
assert "network.proxy.socks" not in prefs
@pytest.mark.unit
def test_cp9_https_proxy_passthrough():
prefs = {}
proxy = {"server": "https://proxy:8080"}
result = configure_proxy(proxy, prefs)
assert result == proxy
assert "network.proxy.type" not in prefs
@pytest.mark.unit
def test_cp8b_http_with_username_password_passthrough():
"""HTTP proxies preserve username/password for Playwright to consume."""
prefs = {}
proxy = {"server": "http://proxy:8080", "username": "user", "password": "pw"}
result = configure_proxy(proxy, prefs)
assert result == proxy
assert "network.proxy.type" not in prefs
# ──────────────────────────────────────────────────────────────────────
# CP10-CP13: SOCKS — mutate prefs, return None
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_cp10_socks5_with_credentials():
prefs = {}
proxy = {
"server": "socks5://host:1080",
"username": "u",
"password": "p",
}
result = configure_proxy(proxy, prefs)
assert result is None
assert prefs["network.proxy.type"] == 1
assert prefs["network.proxy.socks"] == "host"
assert prefs["network.proxy.socks_port"] == 1080
assert prefs["network.proxy.socks_version"] == 5
assert prefs["network.proxy.socks_username"] == "u"
assert prefs["network.proxy.socks_password"] == "p"
assert prefs["network.proxy.socks_remote_dns"] is True
@pytest.mark.unit
def test_cp11_socks4_sets_version_4():
prefs = {}
configure_proxy({"server": "socks4://host:1080"}, prefs)
assert prefs["network.proxy.socks_version"] == 4
@pytest.mark.unit
def test_cp12_bare_socks_defaults_to_v5():
prefs = {}
configure_proxy({"server": "socks://host:1080"}, prefs)
assert prefs["network.proxy.socks_version"] == 5
@pytest.mark.unit
def test_cp13_socks_scheme_is_case_insensitive():
prefs = {}
proxy = {"server": "SOCKS5://HOST:1080"}
result = configure_proxy(proxy, prefs)
assert result is None
assert prefs["network.proxy.type"] == 1
# Host preserves case (only the scheme is case-folded).
assert prefs["network.proxy.socks"] == "HOST"
assert prefs["network.proxy.socks_version"] == 5
# ──────────────────────────────────────────────────────────────────────
# CP14-CP15: edge SOCKS inputs
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_cp14_socks_without_port_dropped_silently():
prefs = {}
result = configure_proxy({"server": "socks5://hostonly"}, prefs)
assert result is None
# Malformed input drops silently — no mutations.
assert "network.proxy.type" not in prefs
assert "network.proxy.socks" not in prefs
@pytest.mark.unit
def test_cp15_socks_without_credentials_uses_empty_strings():
prefs = {}
configure_proxy({"server": "socks5://host:1080"}, prefs)
assert prefs["network.proxy.socks_username"] == ""
assert prefs["network.proxy.socks_password"] == ""
@pytest.mark.unit
def test_cp15b_socks_with_none_credentials_uses_empty_strings():
"""`proxy.get("username")` returning None should resolve to ""."""
prefs = {}
configure_proxy(
{"server": "socks5://host:1080", "username": None, "password": None},
prefs,
)
assert prefs["network.proxy.socks_username"] == ""
assert prefs["network.proxy.socks_password"] == ""
# ──────────────────────────────────────────────────────────────────────
# CP16: mutation contract — prefs dict mutated in-place
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_cp16_prefs_mutated_in_place():
"""Caller's prefs dict receives the SOCKS keys directly (not a copy)."""
prefs = {"existing.pref": "kept"}
sentinel = prefs
configure_proxy({"server": "socks5://host:1080"}, prefs)
# Same object identity — mutated, not replaced.
assert prefs is sentinel
# Existing pref preserved.
assert prefs["existing.pref"] == "kept"
# SOCKS keys added.
assert "network.proxy.type" in prefs
assert "network.proxy.socks" in prefs
# ──────────────────────────────────────────────────────────────────────
# CP17: boundary — IPv6-style host preserved via rsplit
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_cp17_ipv6_bracketed_host_preserved_via_rsplit():
"""rsplit(':', 1) keeps brackets intact for `[::1]:1080`-style hosts."""
prefs = {}
configure_proxy({"server": "socks5://[::1]:1080"}, prefs)
assert prefs["network.proxy.socks"] == "[::1]"
assert prefs["network.proxy.socks_port"] == 1080
# ──────────────────────────────────────────────────────────────────────
# Recheck additions — branches discovered while re-reading _proxy.py
# ──────────────────────────────────────────────────────────────────────
@pytest.mark.unit
def test_socks_with_surrounding_whitespace_in_server_stripped():
"""The implementation strips whitespace before scheme checks."""
prefs = {}
result = configure_proxy({"server": " socks5://host:1080 "}, prefs)
assert result is None
assert prefs["network.proxy.socks"] == "host"
assert prefs["network.proxy.socks_port"] == 1080
@pytest.mark.unit
def test_server_key_missing_returns_none():
"""No 'server' key → treated as empty → no-op."""
prefs = {}
result = configure_proxy({"username": "u"}, prefs)
assert result is None
assert prefs == {}
@pytest.mark.unit
def test_server_key_none_returns_none():
"""`server: None` is normalized to "" by the implementation."""
prefs = {}
result = configure_proxy({"server": None}, prefs)
assert result is None
assert prefs == {}
@pytest.mark.unit
def test_socks_port_coerced_to_int():
"""Port is parsed via int(), so the stored pref is an int rather than a numeric string."""
prefs = {}
configure_proxy({"server": "socks5://host:443"}, prefs)
assert prefs["network.proxy.socks_port"] == 443
assert isinstance(prefs["network.proxy.socks_port"], int)
@pytest.mark.unit
def test_socks_non_numeric_port_raises_value_error():
"""Non-numeric port is a programmer error — int() raises."""
prefs = {}
with pytest.raises(ValueError):
configure_proxy({"server": "socks5://host:notaport"}, prefs)
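# For orientation, here is one function body that would satisfy the decision
# table above. It is a sketch reconstructed from the tests alone, under the name
# `sketch_configure_proxy`; the real `configure_proxy` in `_proxy.py` is not shown
# in this diff and may differ in structure.

```python
def sketch_configure_proxy(proxy, prefs):
    """Hypothetical reimplementation: returns a proxy dict for Playwright or None."""
    if not proxy:
        return None
    server = (proxy.get("server") or "").strip()
    if not server:
        return None
    scheme, _, rest = server.partition("://")
    scheme = scheme.lower()  # only the scheme is case-folded
    if scheme == "direct":
        return None
    if not scheme.startswith("socks"):
        return proxy  # HTTP/HTTPS: hand the dict through untouched
    host, colon, port = rest.rpartition(":")  # split on the LAST colon (IPv6-safe)
    if not colon or not host:
        return None  # no port: dropped silently, prefs untouched
    port_number = int(port)  # raises ValueError before any mutation
    prefs.update({
        "network.proxy.type": 1,
        "network.proxy.socks": host,
        "network.proxy.socks_port": port_number,
        "network.proxy.socks_version": 4 if scheme == "socks4" else 5,
        "network.proxy.socks_username": proxy.get("username") or "",
        "network.proxy.socks_password": proxy.get("password") or "",
        "network.proxy.socks_remote_dns": True,
    })
    return None
```

# Converting the port before calling ``prefs.update`` keeps the caller's dict
# unchanged when the port is malformed; that ordering is a design choice of this
# sketch, not something the tests require.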


@ -0,0 +1,197 @@
"""E2E: the patched Firefox SENDS SOCKS5 username/password and routes through it.
Playwright's own ``proxy=`` ignores SOCKS auth; this is the patched
``nsProtocolProxyService`` feature (reads ``network.proxy.socks_username`` /
``socks_password``). ``test_proxy.py`` already unit-tests on CI that the wrapper
sets those prefs; this proves the binary actually performs the RFC1929 auth
handshake and relays traffic.
Fully hermetic: a local SOCKS5 server + a local HTTP target, with the localhost
target forced through the proxy via ``allow_hijacking_localhost`` so it runs
identically on a dev box and on a GitHub runner (no external site, no secrets).
"""
from __future__ import annotations
import http.server
import socket
import socketserver
import struct
import threading
import pytest
from invisible_playwright import InvisiblePlaywright
_USER = "ferd_socks_user"
_PASS = "ferd_socks_pw_42"
class _Socks5AuthRecorder:
    """SOCKS5 that REQUIRES RFC1929 user/pass auth, records the creds it saw,
    then relays CONNECT to the requested target."""

    def __init__(self):
        self._srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self._srv.bind(("127.0.0.1", 0))
        self._srv.listen(16)
        self.port = self._srv.getsockname()[1]
        self.seen_creds: list[tuple[str, str]] = []
        self._stop = False
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        while not self._stop:
            try:
                conn, _ = self._srv.accept()
            except OSError:
                break
            threading.Thread(target=self._handle, args=(conn,), daemon=True).start()

    def _recv(self, s, n):
        buf = b""
        while len(buf) < n:
            chunk = s.recv(n - len(buf))
            if not chunk:
                return None
            buf += chunk
        return buf

    def _handle(self, conn):
        try:
            head = self._recv(conn, 2)
            if not head or head[0] != 0x05:
                conn.close()
                return
            methods = self._recv(conn, head[1]) or b""
            if 0x02 not in methods:  # we REQUIRE user/pass
                conn.sendall(b"\x05\xff")
                conn.close()
                return
            conn.sendall(b"\x05\x02")  # select user/pass auth
            if not self._recv(conn, 1):  # RFC1929 version byte
                conn.close()
                return
            ulen = self._recv(conn, 1)[0]
            uname = (self._recv(conn, ulen) or b"").decode("utf-8", "ignore")
            plen = self._recv(conn, 1)[0]
            passwd = (self._recv(conn, plen) or b"").decode("utf-8", "ignore")
            self.seen_creds.append((uname, passwd))
            conn.sendall(b"\x01\x00")  # auth success
            req = self._recv(conn, 4)
            if not req:
                conn.close()
                return
            _, cmd, _, atyp = req
            if atyp == 0x01:
                addr = socket.inet_ntoa(self._recv(conn, 4))
            elif atyp == 0x03:
                addr = (self._recv(conn, self._recv(conn, 1)[0]) or b"").decode()
            elif atyp == 0x04:
                addr = socket.inet_ntop(socket.AF_INET6, self._recv(conn, 16))
            else:
                conn.close()
                return
            port = struct.unpack("!H", self._recv(conn, 2))[0]
            if cmd != 0x01:  # only CONNECT
                conn.sendall(b"\x05\x07\x00\x01\x00\x00\x00\x00\x00\x00")
                conn.close()
                return
            try:
                up = socket.create_connection((addr, port), timeout=15)
            except OSError:
                conn.sendall(b"\x05\x05\x00\x01\x00\x00\x00\x00\x00\x00")
                conn.close()
                return
            conn.sendall(b"\x05\x00\x00\x01\x00\x00\x00\x00\x00\x00")
            self._pipe(conn, up)
        except Exception:
            try:
                conn.close()
            except OSError:
                pass

    @staticmethod
    def _pipe(a, b):
        def fwd(src, dst):
            try:
                while True:
                    data = src.recv(65536)
                    if not data:
                        break
                    dst.sendall(data)
            except OSError:
                pass
            finally:
                try:
                    dst.shutdown(socket.SHUT_WR)
                except OSError:
                    pass

        threading.Thread(target=fwd, args=(a, b), daemon=True).start()
        fwd(b, a)

    def close(self):
        self._stop = True
        try:
            self._srv.close()
        except OSError:
            pass
class _LocalHTTP:
    """A tiny localhost HTTP server: the CONNECT target relayed by the proxy."""

    _HTML = b"<!doctype html><title>ok</title><h1 id=ok>socks-routed</h1>"

    def __init__(self):
        html = self._HTML

        class H(http.server.BaseHTTPRequestHandler):
            def do_GET(self):  # noqa: N802
                self.send_response(200)
                self.send_header("Content-Type", "text/html; charset=utf-8")
                self.send_header("Content-Length", str(len(html)))
                self.end_headers()
                self.wfile.write(html)

            def log_message(self, *a):
                pass

        self._srv = socketserver.TCPServer(("127.0.0.1", 0), H)
        self.port = self._srv.server_address[1]
        threading.Thread(target=self._srv.serve_forever, daemon=True).start()

    def close(self):
        self._srv.shutdown()
@pytest.fixture
def socks_auth():
    s = _Socks5AuthRecorder()
    yield s
    s.close()


@pytest.fixture
def local_http():
    h = _LocalHTTP()
    yield h
    h.close()


@pytest.mark.e2e
def test_socks5_auth_creds_sent_and_routed(firefox_binary, socks_auth, local_http):
    """The binary must perform SOCKS5 user/pass auth with the configured creds
    and relay the page through the proxy."""
    proxy = {
        "server": f"socks5://127.0.0.1:{socks_auth.port}",
        "username": _USER,
        "password": _PASS,
    }
    # Firefox bypasses the proxy for localhost by default; force it through.
    prefs = {
        "network.proxy.allow_hijacking_localhost": True,
        "network.proxy.no_proxies_on": "",
    }
    with InvisiblePlaywright(
        seed=42, binary_path=firefox_binary, proxy=proxy, extra_prefs=prefs
    ) as browser:
        page = browser.new_page()
        page.goto(f"http://127.0.0.1:{local_http.port}/", wait_until="load", timeout=30000)
        text = page.evaluate("() => document.getElementById('ok').textContent")
        assert text == "socks-routed", "page did not load through the SOCKS proxy"
    assert (_USER, _PASS) in socks_auth.seen_creds, (
        f"patched Firefox did not send the SOCKS5 auth creds from prefs; "
        f"proxy saw: {socks_auth.seen_creds!r}"
    )
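The recorder above expects three client frames in order: a method greeting offering username/password auth, the RFC 1929 credential frame, and a CONNECT request. As a rough illustration of the bytes the browser has to put on the wire, here is a minimal sketch of the client side. The helper names `greeting`, `auth_request`, and `connect_request` are hypothetical and are not part of invisible_playwright or Firefox.

```python
import socket
import struct


def greeting() -> bytes:
    # VER=5, NMETHODS=1, METHODS=[0x02 username/password]
    return b"\x05\x01\x02"


def auth_request(username: str, password: str) -> bytes:
    # RFC 1929: VER=1, ULEN, UNAME, PLEN, PASSWD (each field is 1..255 bytes)
    u = username.encode("utf-8")
    p = password.encode("utf-8")
    if not (1 <= len(u) <= 255 and 1 <= len(p) <= 255):
        raise ValueError("username and password must be 1..255 bytes")
    return b"\x01" + bytes([len(u)]) + u + bytes([len(p)]) + p


def connect_request(host: str, port: int) -> bytes:
    # VER=5, CMD=1 (CONNECT), RSV=0, then ATYP + address + big-endian port
    try:
        addr = b"\x01" + socket.inet_aton(host)  # ATYP 0x01: IPv4 literal
    except OSError:
        name = host.encode("idna")
        addr = b"\x03" + bytes([len(name)]) + name  # ATYP 0x03: domain name
    return b"\x05\x01\x00" + addr + struct.pack("!H", port)
```

A successful exchange would then see the server answer `b"\x05\x02"` to the greeting and `b"\x01\x00"` to the credential frame, which are the replies `_handle` sends.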

View file

@ -0,0 +1,349 @@
"""Unit tests for the deterministic reCAPTCHA cookie builder.
Validates the contract:
- 5 .google.com cookies always present (1P_JAR is no longer emitted)
- Per-site cookies built from a `browsing_history` list (sampled by the
  Bayesian network in _fpforge)
- Determinism: same (seed, history) yields identical content
- Chrome 400-day cookie cap respected
- Playwright add_cookies field requirements satisfied
"""
import pytest

from invisible_playwright._recaptcha_seed import (
    build_cookies,
    _sub_seed,
)

pytestmark = pytest.mark.unit

_FIXED_NOW = 1779600000  # 2026-05-24 05:20:00 UTC, frozen for determinism

# Sample browsing history for tests (mimics what _fpforge produces).
_SAMPLE_HISTORY = [
    {"name": "github.com", "category": "dev", "cookie_profile": "ga_cf"},
    {"name": "stackoverflow.com", "category": "dev", "cookie_profile": "ga_consent_clarity"},
    {"name": "amazon.com", "category": "shop", "cookie_profile": "ga_consent_clarity"},
    {"name": "wikipedia.org", "category": "reference", "cookie_profile": "minimal"},
    {"name": "youtube.com", "category": "media", "cookie_profile": "ga_only"},
]
# ===========================================================================
# 1. Set composition
# ===========================================================================


def test_only_google_cookies_when_no_history():
    """Empty/None history → only the 5 .google.com cookies (1P_JAR was removed
    in realism round 2; Google deprecated it in 2022)."""
    cookies = build_cookies(seed=42, browsing_history=None, now=_FIXED_NOW)
    names = sorted(c["name"] for c in cookies)
    assert names == sorted(["NID", "CONSENT", "SOCS",
                            "_GRECAPTCHA", "ENID"])
    assert all(c["domain"] == ".google.com" for c in cookies)


def test_browsing_history_adds_host_cookies():
    """Each history site contributes 1+ cookies on its domain."""
    cookies = build_cookies(seed=42, browsing_history=_SAMPLE_HISTORY, now=_FIXED_NOW)
    google = [c for c in cookies if c["domain"] == ".google.com"]
    assert len(google) == 5  # 1P_JAR removed
    domains = {c["domain"] for c in cookies if c["domain"] != ".google.com"}
    for site in _SAMPLE_HISTORY:
        assert f".{site['name']}" in domains


def test_domain_dot_prefix_normalized():
    """All host cookie domains have a leading dot for sub-domain coverage."""
    cookies = build_cookies(seed=42, browsing_history=_SAMPLE_HISTORY, now=_FIXED_NOW)
    for c in cookies:
        assert c["domain"].startswith("."), f"missing dot: {c['domain']}"
# ===========================================================================
# 2. Cookie profile recipes (each profile yields the expected cookie set)
# ===========================================================================


def test_profile_minimal_yields_ga_only():
    history = [{"name": "x.com", "cookie_profile": "minimal"}]
    cookies = build_cookies(seed=42, browsing_history=history, now=_FIXED_NOW)
    host = [c for c in cookies if c["domain"] == ".x.com"]
    names = [c["name"] for c in host]
    assert names == ["_ga"]


def test_profile_ga_only_yields_ga_and_gid():
    history = [{"name": "x.com", "cookie_profile": "ga_only"}]
    cookies = build_cookies(seed=42, browsing_history=history, now=_FIXED_NOW)
    host = [c for c in cookies if c["domain"] == ".x.com"]
    names = sorted(c["name"] for c in host)
    assert names == ["_ga", "_gid"]


def test_profile_ga_cf_yields_ga_and_cf_bm():
    history = [{"name": "x.com", "cookie_profile": "ga_cf"}]
    cookies = build_cookies(seed=42, browsing_history=history, now=_FIXED_NOW)
    host = [c for c in cookies if c["domain"] == ".x.com"]
    names = sorted(c["name"] for c in host)
    assert names == ["__cf_bm", "_ga"]


def test_profile_ga_consent_yields_three_cookies():
    history = [{"name": "x.com", "cookie_profile": "ga_consent"}]
    cookies = build_cookies(seed=42, browsing_history=history, now=_FIXED_NOW)
    host = [c for c in cookies if c["domain"] == ".x.com"]
    names = sorted(c["name"] for c in host)
    # Always _ga + _gid + one of OneTrust|CookieYes
    assert "_ga" in names and "_gid" in names
    assert any(n in names for n in ("OptanonAlertBoxClosed", "cookieyes-consent"))
    assert len(host) == 3


def test_profile_ga_consent_clarity_yields_at_least_four_cookies():
    """Always _ga + _gid + _clck + consent banner. Optionally _fbp, _dc_gtm_*,
    __hssrc (probabilistic per rng; see test_new_helper_cookies_*)."""
    history = [{"name": "x.com", "cookie_profile": "ga_consent_clarity"}]
    cookies = build_cookies(seed=42, browsing_history=history, now=_FIXED_NOW)
    host = [c for c in cookies if c["domain"] == ".x.com"]
    names = sorted(c["name"] for c in host)
    assert "_ga" in names and "_gid" in names and "_clck" in names
    assert any(n in names for n in ("OptanonAlertBoxClosed", "cookieyes-consent"))
    assert len(host) >= 4  # 4 baseline + 0-3 helpers


def test_unknown_profile_falls_back_to_ga():
    history = [{"name": "x.com", "cookie_profile": "nonexistent_profile"}]
    cookies = build_cookies(seed=42, browsing_history=history, now=_FIXED_NOW)
    host = [c for c in cookies if c["domain"] == ".x.com"]
    assert [c["name"] for c in host] == ["_ga"]
# ===========================================================================
# 3. Determinism
# ===========================================================================


def test_same_seed_and_history_same_content():
    a = build_cookies(seed=42, browsing_history=_SAMPLE_HISTORY, now=_FIXED_NOW)
    b = build_cookies(seed=42, browsing_history=_SAMPLE_HISTORY, now=_FIXED_NOW)
    assert a == b


def test_different_seed_different_content():
    a = build_cookies(seed=42, browsing_history=_SAMPLE_HISTORY, now=_FIXED_NOW)
    b = build_cookies(seed=99, browsing_history=_SAMPLE_HISTORY, now=_FIXED_NOW)
    a_nid = next(c for c in a if c["name"] == "NID")["value"]
    b_nid = next(c for c in b if c["name"] == "NID")["value"]
    assert a_nid != b_nid


def test_history_order_does_not_affect_domain_specific_cookies():
    """Sub-seed is keyed on domain name, not order in history list."""
    h1 = [_SAMPLE_HISTORY[0], _SAMPLE_HISTORY[1]]
    h2 = [_SAMPLE_HISTORY[1], _SAMPLE_HISTORY[0]]
    a = {(c["domain"], c["name"]): c["value"]
         for c in build_cookies(seed=42, browsing_history=h1, now=_FIXED_NOW)
         if c["domain"] != ".google.com"}
    b = {(c["domain"], c["name"]): c["value"]
         for c in build_cookies(seed=42, browsing_history=h2, now=_FIXED_NOW)
         if c["domain"] != ".google.com"}
    assert a == b


def test_sub_seed_distinct_tags_distinct_streams():
    assert _sub_seed(42, "google") != _sub_seed(42, "dom:github.com")
    assert _sub_seed(42, "dom:github.com") != _sub_seed(42, "dom:amazon.com")
    assert _sub_seed(0, "any") != 0  # seed=0 still produces non-zero sub-seed
# ===========================================================================
# 4. Format / structural correctness for the Google batch
# ===========================================================================


def test_nid_format():
    cookies = build_cookies(seed=42, now=_FIXED_NOW)
    nid = next(c for c in cookies if c["name"] == "NID")
    prefix, b64 = nid["value"].split("=", 1)
    assert prefix.isdigit() and len(prefix) == 3
    # Broadened to 100-540 in realism round 2 to cover historical NID versions
    assert 100 <= int(prefix) <= 540
    assert len(b64) == 178


def test_consent_format():
    cookies = build_cookies(seed=42, now=_FIXED_NOW)
    consent = next(c for c in cookies if c["name"] == "CONSENT")
    assert consent["value"].startswith("YES+cb.")
    assert "+FX+" in consent["value"]
# ===========================================================================
# 5. Chrome 400-day cookie cap compliance
# ===========================================================================


def test_all_expiries_within_400_day_cap():
    """Chrome 104+ caps cookie expiry to 400 days. Cookies with a longer
    lifetime are silently truncated or dropped. We tighten everything to
    <=395d (except __cf_bm, which is short-lived telemetry)."""
    cookies = build_cookies(seed=42, browsing_history=_SAMPLE_HISTORY, now=_FIXED_NOW)
    max_allowed = _FIXED_NOW + 400 * 86400
    for c in cookies:
        # Short-lived telemetry cookies are fine
        if c["name"] in ("__cf_bm", "1P_JAR", "_gid"):
            continue
        assert c["expires"] <= max_allowed, (
            f"Cookie {c['name']} expires {c['expires'] - _FIXED_NOW}s "
            f"(> 400d cap); would be silently dropped"
        )
# ===========================================================================
# 6. Playwright add_cookies field requirements
# ===========================================================================


def test_all_cookies_have_required_playwright_fields():
    cookies = build_cookies(seed=42, browsing_history=_SAMPLE_HISTORY, now=_FIXED_NOW)
    for c in cookies:
        assert c.get("name"), f"missing name: {c}"
        assert c.get("value") is not None, f"missing value: {c}"
        assert c.get("domain"), f"missing domain: {c}"
        assert c.get("path") == "/", f"path != / for {c['name']}"


def test_modern_cookies_marked_secure():
    """Cookies with sameSite=None require secure=True under Firefox/Chrome.
    Also generally needed for cookies set via Playwright add_cookies without
    a navigation context."""
    cookies = build_cookies(seed=42, browsing_history=_SAMPLE_HISTORY, now=_FIXED_NOW)
    for c in cookies:
        if c.get("sameSite") == "None":
            assert c.get("secure") is True, f"{c['name']} None+!secure invalid"


def test_httponly_on_signed_cookies():
    cookies = build_cookies(seed=42, now=_FIXED_NOW)
    nid = next(c for c in cookies if c["name"] == "NID")
    enid = next(c for c in cookies if c["name"] == "ENID")
    assert nid.get("httpOnly") is True
    assert enid.get("httpOnly") is True
# ===========================================================================
# 7. End-to-end with real fpforge Profile
# ===========================================================================


def test_with_real_fpforge_profile():
    """End-to-end: generate a real Profile, ensure browsing_history is populated
    and build_cookies works against it."""
    from invisible_playwright._fpforge import generate_profile
    prof = generate_profile(seed=42)
    assert isinstance(prof.browsing_history, list)
    # The Bayesian network samples ~15-30 sites per persona
    assert 5 <= len(prof.browsing_history) <= 50, \
        f"unexpected history length: {len(prof.browsing_history)}"
    # Each entry has the expected fields
    for site in prof.browsing_history:
        assert "name" in site and "category" in site and "cookie_profile" in site
    # build_cookies works against the real profile
    cookies = build_cookies(seed=prof.seed, browsing_history=prof.browsing_history,
                            now=_FIXED_NOW)
    # 5 google (1P_JAR removed) + at least 1 cookie per visited site
    assert len(cookies) >= 5 + len(prof.browsing_history)


def test_same_seed_same_browsing_history_via_fpforge():
    """Profile.browsing_history is deterministic from seed (Bayesian sampler)."""
    from invisible_playwright._fpforge import generate_profile
    a = generate_profile(seed=42).browsing_history
    b = generate_profile(seed=42).browsing_history
    assert a == b
# ===========================================================================
# 8. Realism improvements (2026-05-24 round 2)
# ===========================================================================


def test_no_1p_jar_cookie():
    """1P_JAR was deprecated by Google in 2022. Including it is an
    anachronism flag for fingerprinters that look at cookie freshness."""
    cookies = build_cookies(seed=42, browsing_history=_SAMPLE_HISTORY, now=_FIXED_NOW)
    names = {c["name"] for c in cookies}
    assert "1P_JAR" not in names


def test_nid_prefix_broadened_range():
    """NID 3-digit prefix should cover historical versions (137/105/511/525
    seen in real captures): range 100-540, not just 500-540."""
    seen_prefixes = set()
    for seed in range(200):
        cookies = build_cookies(seed=seed, now=_FIXED_NOW)
        nid = next(c for c in cookies if c["name"] == "NID")
        prefix = int(nid["value"].split("=", 1)[0])
        seen_prefixes.add(prefix)
    assert min(seen_prefixes) < 500, f"NID range never goes below 500 ({sorted(seen_prefixes)[:5]})"
    assert max(seen_prefixes) <= 540


def test_consent_lang_from_timezone_eu():
    """CONSENT cookie's `lang+region` token derived from IANA timezone."""
    cookies = build_cookies(seed=42, now=_FIXED_NOW, timezone="Europe/Rome")
    consent = next(c for c in cookies if c["name"] == "CONSENT")
    assert ".it+IT+" in consent["value"], f"expected it+IT in: {consent['value']}"


def test_consent_lang_default_fx():
    """Unknown / US timezone → default `en+FX` (non-EU fallback)."""
    cookies = build_cookies(seed=42, now=_FIXED_NOW, timezone="America/New_York")
    consent = next(c for c in cookies if c["name"] == "CONSENT")
    assert ".en+FX+" in consent["value"]


def test_consent_lang_de_for_berlin():
    cookies = build_cookies(seed=42, now=_FIXED_NOW, timezone="Europe/Berlin")
    consent = next(c for c in cookies if c["name"] == "CONSENT")
    assert ".de+DE+" in consent["value"]


def test_consent_lang_no_timezone_default():
    """timezone=None → default en+FX."""
    cookies = build_cookies(seed=42, now=_FIXED_NOW)
    consent = next(c for c in cookies if c["name"] == "CONSENT")
    assert ".en+FX+" in consent["value"]


def test_new_helper_cookies_appear_in_ga_consent_clarity():
    """ga_consent_clarity recipe should sometimes include _fbp, _dc_gtm_*, __hssrc
    (probabilistic per rng). Check across many seeds that they appear."""
    saw_fbp = False
    saw_gtm = False
    saw_hssrc = False
    history = [{"name": "site.com", "cookie_profile": "ga_consent_clarity"}]
    for seed in range(100):
        cookies = build_cookies(seed=seed, browsing_history=history, now=_FIXED_NOW)
        names = {c["name"] for c in cookies if c["domain"] == ".site.com"}
        if "_fbp" in names:
            saw_fbp = True
        if any(n.startswith("_dc_gtm_") for n in names):
            saw_gtm = True
        if "__hssrc" in names:
            saw_hssrc = True
    assert saw_fbp, "_fbp never appeared in 100 seeds (rng pick broken)"
    assert saw_gtm, "_dc_gtm_* never appeared in 100 seeds"
    assert saw_hssrc, "__hssrc never appeared in 100 seeds"


def test_fbp_format():
    """_fbp format: fb.<idx>.<unix_ms>.<random_int>"""
    history = [{"name": "x.com", "cookie_profile": "ga_consent_clarity"}]
    # Try multiple seeds until we hit a seed that includes _fbp (50% chance)
    for seed in range(20):
        cookies = build_cookies(seed=seed, browsing_history=history, now=_FIXED_NOW)
        fbp = next((c for c in cookies if c["name"] == "_fbp"), None)
        if fbp:
            parts = fbp["value"].split(".")
            assert parts[0] == "fb"
            assert parts[1].isdigit()
            assert parts[2].isdigit() and len(parts[2]) >= 13  # unix ms
            assert parts[3].isdigit()
            return
    raise AssertionError("never got _fbp across 20 seeds: distribution broken")
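The determinism tests above depend on `_sub_seed(seed, tag)` giving each domain its own random stream, independent of where the domain sits in the history list. Its implementation is not shown in this diff, so the following is only a sketch of one way such a helper could work. The SHA-256 derivation, the `sub_seed` stand-in, and the `domain_token` helper are all assumptions made for illustration, not the project's code.

```python
import hashlib
import random


def sub_seed(seed: int, tag: str) -> int:
    """Hypothetical stand-in for _sub_seed: hash (seed, tag) into a 64-bit int."""
    digest = hashlib.sha256(f"{seed}:{tag}".encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big")
    return value or 1  # never return 0, even for seed=0


def domain_token(seed: int, domain: str) -> str:
    """Illustrative per-domain value that depends on (seed, domain) only."""
    rng = random.Random(sub_seed(seed, f"dom:{domain}"))
    return f"{rng.getrandbits(32):08x}"


# Reordering the history does not change any domain's value, because each
# domain seeds its own generator instead of sharing one sequential stream.
h1 = ["github.com", "stackoverflow.com"]
h2 = list(reversed(h1))
tokens_1 = {d: domain_token(42, d) for d in h1}
tokens_2 = {d: domain_token(42, d) for d in h2}
```

Keying the stream on the domain name rather than on list position is the property `test_history_order_does_not_affect_domain_specific_cookies` checks.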

Some files were not shown because too many files have changed in this diff.