mirror of
https://github.com/MODSetter/SurfSense.git
synced 2026-06-24 21:38:09 +02:00
feat(proxy): integrate Scrapling for enhanced web scraping capabilities
- Replaced Playwright with Scrapling's fetchers in the web crawling and YouTube processing modules for improved performance and flexibility.
- Updated proxy configuration to support dynamic proxy selection via environment variables.
- Enhanced logging to track performance metrics during web scraping operations.
- Refactored related modules to utilize the new proxy utilities and streamline the scraping process.
parent 41a93ca8fb
commit 640ef5f15d
16 changed files with 5770 additions and 4886 deletions
@@ -277,9 +277,13 @@ TURNSTILE_ENABLED=FALSE
 TURNSTILE_SECRET_KEY=
 
 
+# Proxy provider selection. Selects a ProxyProvider implementation registered in
+# app/utils/proxy/registry.py. Default: "anonymous_proxies". Add new vendors there.
+# PROXY_PROVIDER=anonymous_proxies
+
 # Residential Proxy Configuration (anonymous-proxies.net)
 # Used for web crawling, link previews, and YouTube transcript fetching to avoid IP bans.
-# Leave commented out to disable proxying.
+# Consumed by the "anonymous_proxies" provider. Leave commented out to disable proxying.
 # RESIDENTIAL_PROXY_USERNAME=your_proxy_username
 # RESIDENTIAL_PROXY_PASSWORD=your_proxy_password
 # RESIDENTIAL_PROXY_HOSTNAME=rotating.dnsproxifier.com:31230
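The variables in this hunk suggest a lookup from a provider name to a function that builds a proxy URL. The following is a minimal sketch of that idea, not the code in app/utils/proxy/registry.py, which this page does not show. The names `PROVIDERS`, `anonymous_proxies_url`, and `resolve_proxy_url` are hypothetical, and the `http://` scheme and URL-encoding of credentials are assumptions.

```python
import os
from typing import Callable, Dict, Mapping, Optional
from urllib.parse import quote


def anonymous_proxies_url(env: Mapping[str, str]) -> Optional[str]:
    """Build a proxy URL from RESIDENTIAL_PROXY_*; return None if any is unset."""
    username = env.get("RESIDENTIAL_PROXY_USERNAME")
    password = env.get("RESIDENTIAL_PROXY_PASSWORD")
    hostname = env.get("RESIDENTIAL_PROXY_HOSTNAME")
    if not (username and password and hostname):
        return None  # variables left commented out: proxying is disabled
    # Assumption: an HTTP proxy URL with percent-encoded credentials.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"http://{user}:{secret}@{hostname}"


# Hypothetical registry: provider name -> function that builds the proxy URL.
PROVIDERS: Dict[str, Callable[[Mapping[str, str]], Optional[str]]] = {
    "anonymous_proxies": anonymous_proxies_url,
}


def resolve_proxy_url(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Pick the provider named by PROXY_PROVIDER and ask it for a proxy URL."""
    env = os.environ if env is None else env
    name = env.get("PROXY_PROVIDER", "anonymous_proxies")
    provider = PROVIDERS.get(name)
    if provider is None:
        raise ValueError(f"Unknown PROXY_PROVIDER: {name!r}")
    return provider(env)
```

Under these assumptions, adding a vendor means registering one more entry in `PROVIDERS`, and callers never read the vendor-specific variables directly.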