Commit graph

11 commits

Author | SHA1 | Message | Date
Anish Sarkar | 8455451ce1 | chore: ran linting | 2026-04-08 05:20:03 +05:30
Anish Sarkar | 20fa93f0ba | refactor: make Azure Document Intelligence an internal LLAMACLOUD accelerator instead of a standalone ETL service | 2026-04-08 03:26:24 +05:30
Anish Sarkar | 1fa8d1220b | feat: add support for Azure Document Intelligence in ETL pipeline | 2026-04-08 00:59:12 +05:30
Anish Sarkar | 0a26a6c5bb | chore: ran linting | 2026-04-07 05:55:39 +05:30
Anish Sarkar | e7beeb2a36 | refactor: unify file skipping logic across Dropbox, Google Drive, and OneDrive connectors by replacing classification checks with a centralized service-based approach, enhancing maintainability and consistency in file handling | 2026-04-07 02:19:31 +05:30
Anish Sarkar | dc7047f64d | refactor: implement file type classification for supported extensions across Dropbox, Google Drive, and OneDrive connectors, enhancing file handling and error management | 2026-04-06 22:03:47 +05:30
Anish Sarkar | f40de6b695 | feat: add parsers for Docling, LlamaCloud, and Unstructured to ETL pipeline | 2026-04-05 17:27:24 +05:30
Anish Sarkar | 2824410be2 | feat: add plaintext parser to ETL pipeline for reading text files | 2026-04-05 17:26:42 +05:30
Anish Sarkar | 35582c9389 | feat: add direct_convert module to ETL pipeline for file conversion | 2026-04-05 17:26:29 +05:30
Anish Sarkar | 02fc6f1d16 | feat: add audio transcription functionality to ETL pipeline | 2026-04-05 17:26:03 +05:30
Anish Sarkar | 5d22349dc1 | feat: implement ETL pipeline with file classification and extraction services | 2026-04-05 17:25:25 +05:30