Mirror of https://github.com/VectifyAI/PageIndex.git, synced 2026-06-09 19:45:15 +02:00
Default behavior unchanged. Users can opt in via pdf_parser="pypdfium2" for cleaner text extraction (no broken words, correct Unicode) and 3-5x faster parsing. PyPDF2 remains the only required dependency; pypdfium2 is lazy-imported.
7 lines
257 B
Text
litellm==1.83.7
# openai-agents # optional: required for examples/agentic_vectorless_rag_demo.py
pymupdf==1.26.4
# pypdfium2 # optional: enables pdf_parser="pypdfium2" (cleaner text, faster, Apache 2.0)
PyPDF2==3.0.1
python-dotenv==1.2.2
pyyaml==6.0.2