PageIndex/requirements.txt at 4dec4d66a98d249af7d324420ce568f8cecfb1be - apunkt/PageIndex

mirror of https://github.com/VectifyAI/PageIndex.git synced 2026-06-09 19:45:15 +02:00

Ray 9539fe7513 Add pypdfium2 as optional PDF parser

Default behavior unchanged. Users can opt in via pdf_parser="pypdfium2"
for cleaner text extraction (no broken words, correct Unicode) and
3-5x faster parsing. PyPDF2 remains the only required dependency;
pypdfium2 is lazy-imported.

2026-05-11 16:04:07 +08:00
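
The lazy import described in the commit message could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `load_parser_module` and `_PARSER_MODULES` are hypothetical names, and only the `pdf_parser` values "PyPDF2" and "pypdfium2" come from the commit message. The idea is that the optional backend is imported only when a caller asks for it, so a default install never needs pypdfium2.

```python
import importlib

# Hypothetical registry mapping a pdf_parser value to an importable module.
# The keys mirror the commit message; everything else is illustrative.
_PARSER_MODULES = {
    "PyPDF2": "PyPDF2",
    "pypdfium2": "pypdfium2",
}


def load_parser_module(pdf_parser="PyPDF2", registry=_PARSER_MODULES):
    """Import the backend module for `pdf_parser` only when it is requested."""
    if pdf_parser not in registry:
        raise ValueError(f"unknown pdf_parser: {pdf_parser!r}")
    module_name = registry[pdf_parser]
    try:
        # The import happens here, at call time, not at module load time.
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Turn a bare import failure into an actionable message.
        raise ImportError(
            f"pdf_parser={pdf_parser!r} requires the optional package "
            f"{module_name!r}; install it with: pip install {module_name}"
        ) from exc
```

With this shape, a user who never passes `pdf_parser="pypdfium2"` never triggers the optional import, and a user who opts in without installing the package gets an error that names the missing package.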

7 lines

257 B

litellm==1.83.7
# openai-agents  # optional: required for examples/agentic_vectorless_rag_demo.py
pymupdf==1.26.4
# pypdfium2      # optional: enables pdf_parser="pypdfium2" (cleaner text, faster, Apache 2.0)
PyPDF2==3.0.1
python-dotenv==1.2.2
pyyaml==6.0.2