PageIndex/pageindex
Ray 9539fe7513 Add pypdfium2 as optional PDF parser
Default behavior unchanged. Users can opt in via pdf_parser="pypdfium2"
for cleaner text extraction (no broken words, correct Unicode) and
3-5x faster parsing. PyPDF2 remains the only required dependency;
pypdfium2 is lazy-imported.
2026-05-11 16:04:07 +08:00
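
The opt-in, lazy-import behavior described in the commit message can be sketched roughly as follows. This is an illustration only, not the code from this commit: the function name `extract_page_texts` and its structure are hypothetical, and only the `pdf_parser` option name and the two library names come from the commit message. It assumes PyPDF2 3.x (`PdfReader`) and a recent pypdfium2 (`PdfDocument`, `get_textpage`, `get_text_range`).

```python
def extract_page_texts(pdf_path, pdf_parser="PyPDF2"):
    """Return one text string per page (hypothetical helper, for illustration).

    The backend module is imported inside the matching branch, so the
    optional pypdfium2 package is only needed when it is selected.
    """
    if pdf_parser == "PyPDF2":
        import PyPDF2  # required dependency, default backend

        reader = PyPDF2.PdfReader(pdf_path)
        # extract_text() may return None for pages without a text layer.
        return [page.extract_text() or "" for page in reader.pages]

    if pdf_parser == "pypdfium2":
        try:
            import pypdfium2 as pdfium  # optional, imported lazily
        except ImportError as exc:
            raise ImportError(
                'pdf_parser="pypdfium2" requires the optional package '
                "pypdfium2 (pip install pypdfium2)"
            ) from exc

        pdf = pdfium.PdfDocument(pdf_path)
        try:
            return [
                pdf[i].get_textpage().get_text_range() for i in range(len(pdf))
            ]
        finally:
            pdf.close()

    # Unknown names are rejected without importing any backend.
    raise ValueError(f"unknown pdf_parser: {pdf_parser!r}")
```

Under these assumptions, a caller who never passes `pdf_parser="pypdfium2"` never triggers the optional import, which matches the "default behavior unchanged" note above.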
__init__.py Add PageIndexClient with agent-based retrieval via OpenAI Agents SDK (#125) 2026-03-26 23:19:50 +08:00
client.py Add pypdfium2 as optional PDF parser 2026-05-11 16:04:07 +08:00
config.yaml Add pypdfium2 as optional PDF parser 2026-05-11 16:04:07 +08:00
page_index.py Add pypdfium2 as optional PDF parser 2026-05-11 16:04:07 +08:00
page_index_md.py Restructure examples directory and improve document storage (#189) 2026-03-28 04:28:59 +08:00
retrieve.py Add pypdfium2 as optional PDF parser 2026-05-11 16:04:07 +08:00
utils.py Add pypdfium2 as optional PDF parser 2026-05-11 16:04:07 +08:00