PageIndex

mirror of https://github.com/VectifyAI/PageIndex.git synced 2026-06-12 19:55:17 +02:00

Author	SHA1	Message	Date
Ray	108cb28518	Move pdf_parser off doc dict, pass via call args	2026-05-11 16:40:32 +08:00
Ray	ec1aaca4c9	Centralize default parser as DEFAULT_PDF_PARSER constant	2026-05-11 16:24:01 +08:00
Ray	1629ef4318	Take pdf_parser out of ConfigLoader, use plain function arg	2026-05-11 16:20:45 +08:00
Ray	9539fe7513	Add pypdfium2 as optional PDF parser. Default behavior unchanged. Users can opt in via pdf_parser="pypdfium2" for cleaner text extraction (no broken words, correct Unicode) and 3-5x faster parsing. PyPDF2 remains the only required dependency; pypdfium2 is lazy-imported.	2026-05-11 16:04:07 +08:00
Ray	a108c021ae	Disable agent tracing and auto-add litellm/ prefix for retrieve_model * Preserve supported retrieve_model prefixes * Remove temporary retrieve_model tests * Limit tracing disablement to demo execution	2026-03-29 00:55:57 +08:00
Ray	4002dc94de	Rename demo script and update README wording	2026-03-28 04:56:05 +08:00
Ray	77722838e1	Restructure examples directory and improve document storage (#189) * Consolidate tests/ into examples/documents/ * Add line_count and reorder structure keys * Lazy-load documents with _meta.json index * Update demo script and add pre-shipped workspace * Extract shared helpers for JSON reading and meta entry building	2026-03-28 04:28:59 +08:00
Kylin	5d4491f3bf	Add PageIndexClient with agent-based retrieval via OpenAI Agents SDK (#125) * Add PageIndexClient with retrieve, streaming support and litellm integration * Add OpenAI agents demo example * Update README with example agent demo section * Support separate retrieve_model configuration for index and retrieve	2026-03-26 23:19:50 +08:00

8 commits
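
The four most recent commits describe an opt-in parser selected by a plain `pdf_parser` function argument, a `DEFAULT_PDF_PARSER` constant, and a lazy import of pypdfium2. A minimal sketch of that pattern follows. It is not PageIndex's actual code: the constant's value, the `SUPPORTED_PDF_PARSERS` tuple, and the function names `load_pdf_backend` and `extract_page_texts` are assumptions made for illustration.

```python
import importlib

# Assumption: the log names DEFAULT_PDF_PARSER but not its value. "PyPDF2" is a
# guess based on "PyPDF2 remains the only required dependency".
DEFAULT_PDF_PARSER = "PyPDF2"
# Hypothetical list of accepted values; each is also the importable module name.
SUPPORTED_PDF_PARSERS = ("PyPDF2", "pypdfium2")


def load_pdf_backend(pdf_parser=DEFAULT_PDF_PARSER):
    """Import the requested backend only at call time (lazy import)."""
    if pdf_parser not in SUPPORTED_PDF_PARSERS:
        raise ValueError(
            f"unknown pdf_parser {pdf_parser!r}; expected one of {SUPPORTED_PDF_PARSERS}"
        )
    try:
        return importlib.import_module(pdf_parser)
    except ImportError as exc:
        # The optional dependency is missing: tell the user how to install it.
        raise ImportError(
            f"pdf_parser={pdf_parser!r} requires `pip install {pdf_parser}`"
        ) from exc


def extract_page_texts(pdf_path, pdf_parser=DEFAULT_PDF_PARSER):
    """Return one text string per page, using the parser passed as an argument."""
    backend = load_pdf_backend(pdf_parser)
    if pdf_parser == "pypdfium2":
        pdf = backend.PdfDocument(pdf_path)
        try:
            return [pdf[i].get_textpage().get_text_range() for i in range(len(pdf))]
        finally:
            pdf.close()
    # Default path: PyPDF2.
    reader = backend.PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]
```

Under these assumptions, a caller would opt in with `extract_page_texts("report.pdf", pdf_parser="pypdfium2")`, and omitting the argument keeps the PyPDF2 path. Because the import happens inside the function, pypdfium2 only needs to be installed by users who request it.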