DESKTOP-RTLN3BA\$punk
0e4285095c
fix: LlamaCloud v2 impl
2026-04-16 01:15:47 -07:00
DESKTOP-RTLN3BA\$punk
2f793e7a69
refactor: improve content extraction and encoding handling
- Enhanced Azure Document Intelligence parser to raise an error for empty or whitespace-only content.
- Updated LLMRouterService to log premium model strings more clearly.
- Added automatic encoding detection for file reading in document processors.
- Improved error handling for empty markdown content extraction in file processors.
- Refactored DocumentUploadTab component for better accessibility and user interaction.
2026-04-16 00:25:46 -07:00
DESKTOP-RTLN3BA\$punk
656e061f84
feat: add processing mode support for document uploads and ETL pipeline, improved error handling UX
- Introduced a `ProcessingMode` enum to differentiate between basic and premium processing modes.
- Updated `EtlRequest` to include a `processing_mode` field, defaulting to basic.
- Enhanced ETL pipeline services to utilize the selected processing mode for Azure Document Intelligence and LlamaCloud parsing.
- Modified various routes and services to handle processing mode, affecting document upload and indexing tasks.
- Improved error handling and logging to include processing mode details.
- Added tests to validate processing mode functionality and its impact on ETL operations.
2026-04-14 21:26:00 -07:00
CREDO23
c30cc08771
Merge upstream/dev into feat/kb-export-and-folder-upload
2026-04-11 10:28:40 +02:00
Anish Sarkar
bb5b90e5bd
fix: update Azure Document Intelligence parser to use prebuilt-layout model
2026-04-10 20:40:54 +05:30
CREDO23
4ccdd80e26
Harden vision LLM fallback, folder upload validation, and export memory
2026-04-09 16:14:53 +02:00
CREDO23
e164fe0612
Fix misleading log when vision LLM fails vs not provided
2026-04-09 15:29:39 +02:00
CREDO23
55661bcde6
Replace mimetypes fallback with explicit extension-to-MIME mapping
2026-04-09 15:21:32 +02:00
CREDO23
71db53fc55
Add 5MB file size guard before base64 encoding for vision LLM
2026-04-09 15:17:08 +02:00
CREDO23
d6c4fb8938
Add try/except fallback in _extract_image for vision LLM failures
2026-04-09 15:11:24 +02:00
CREDO23
caaec2e0a7
Simplify vision LLM image description prompt
2026-04-09 14:56:18 +02:00
CREDO23
7e90a8ed3c
Route uploaded images to vision LLM with document-parser fallback
2026-04-09 14:33:33 +02:00
Anish Sarkar
8455451ce1
chore: ran linting
2026-04-08 05:20:03 +05:30
Anish Sarkar
20fa93f0ba
refactor: make Azure Document Intelligence an internal LLAMACLOUD accelerator instead of a standalone ETL service
2026-04-08 03:26:24 +05:30
Anish Sarkar
1fa8d1220b
feat: add support for Azure Document Intelligence in ETL pipeline
2026-04-08 00:59:12 +05:30
Anish Sarkar
0a26a6c5bb
chore: ran linting
2026-04-07 05:55:39 +05:30
Anish Sarkar
e7beeb2a36
refactor: unify file skipping logic across Dropbox, Google Drive, and OneDrive connectors by replacing classification checks with a centralized service-based approach, enhancing maintainability and consistency in file handling
2026-04-07 02:19:31 +05:30
Anish Sarkar
dc7047f64d
refactor: implement file type classification for supported extensions across Dropbox, Google Drive, and OneDrive connectors, enhancing file handling and error management
2026-04-06 22:03:47 +05:30
Anish Sarkar
f40de6b695
feat: add parsers for Docling, LlamaCloud, and Unstructured to ETL pipeline
2026-04-05 17:27:24 +05:30
Anish Sarkar
2824410be2
feat: add plaintext parser to ETL pipeline for reading text files
2026-04-05 17:26:42 +05:30
Anish Sarkar
35582c9389
feat: add direct_convert module to ETL pipeline for file conversion
2026-04-05 17:26:29 +05:30
Anish Sarkar
02fc6f1d16
feat: add audio transcription functionality to ETL pipeline
2026-04-05 17:26:03 +05:30
Anish Sarkar
5d22349dc1
feat: implement ETL pipeline with file classification and extraction services
2026-04-05 17:25:25 +05:30