SurfSense/surfsense_backend/app/tasks/document_processors
DESKTOP-RTLN3BA\$punk 640ef5f15d feat(proxy): integrate Scrapling for enhanced web scraping capabilities
- Replaced Playwright with Scrapling's fetchers in the web crawling and YouTube processing modules for improved performance and flexibility.
- Updated proxy configuration to support dynamic proxy selection via environment variables.
- Enhanced logging to track performance metrics during web scraping operations.
- Refactored related modules to utilize the new proxy utilities and streamline the scraping process.
2026-06-09 00:15:10 -07:00
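The "dynamic proxy selection via environment variables" noted in the commit message above could look roughly like the following. This is a minimal sketch, not the code from the commit: the `PROXY_URLS` variable name, its comma-separated format, and the `pick_proxy` helper are all hypothetical, and only the Python standard library is used.

```python
import os
import random
from typing import Optional


def pick_proxy(env_var: str = "PROXY_URLS") -> Optional[str]:
    """Pick one proxy URL from a comma-separated environment variable.

    The variable name and format are assumptions for illustration only.
    Returns None when the variable is unset or contains no usable entries,
    so callers can fall back to a direct connection.
    """
    raw = os.environ.get(env_var, "")
    # Drop empty entries and surrounding whitespace.
    candidates = [p.strip() for p in raw.split(",") if p.strip()]
    return random.choice(candidates) if candidates else None
```

Reading the variable on each call, rather than once at import time, lets the proxy pool change without a restart. The returned URL would then be passed to whatever fetcher the processor uses.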
__init__.py refactor: consolidate document processing logic and remove unused files and ETL strategies 2026-04-05 17:29:24 +05:30
_direct_converters.py refactor: improve content extraction and encoding handling 2026-04-16 00:25:46 -07:00
_helpers.py refactor: consolidate document processing logic and remove unused files and ETL strategies 2026-04-05 17:29:24 +05:30
_save.py feat(backend): Remove LLM summaries from document indexing 2026-06-04 00:50:19 +05:30
base.py chore: ran linting 2026-03-17 04:40:46 +05:30
circleback_processor.py feat(backend): Remove LLM summaries from document indexing 2026-06-04 00:50:19 +05:30
extension_processor.py feat(backend): Remove LLM summaries from document indexing 2026-06-04 00:50:19 +05:30
file_processors.py Merge upstream/dev 2026-06-05 19:18:12 +02:00
markdown_processor.py feat(backend): Remove LLM summaries from document indexing 2026-06-04 00:50:19 +05:30
youtube_processor.py feat(proxy): integrate Scrapling for enhanced web scraping capabilities 2026-06-09 00:15:10 -07:00