Add PageIndexClient with agent-based retrieval via OpenAI Agents SDK (#125)

* Add PageIndexClient with retrieve, streaming support and litellm integration
* Add OpenAI agents demo example
* Update README with example agent demo section
* Support separate retrieve_model configuration for index and retrieve
2026-04-24 23:56:21 +02:00 · 2026-03-26 23:19:50 +08:00 · 5d4491f3bf
commit 5d4491f3bf
parent 2403be8f27
9 changed files with 501 additions and 7 deletions
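
Reviewer note: the README hunk in this commit documents `OPENAI_API_KEY` as the primary variable and keeps `CHATGPT_API_KEY` as a legacy alias. The resolution logic itself is not part of this excerpt, so the following is only a minimal sketch of how such a fallback could work. The helper name `resolve_api_key` and the precedence order (new name first, legacy second) are assumptions, not the project's confirmed behavior.

```python
import os


def resolve_api_key(env=None):
    """Return the first non-empty API key found in the environment.

    Hypothetical helper: checks OPENAI_API_KEY first, then the legacy
    CHATGPT_API_KEY. The precedence order is an assumption.
    """
    env = os.environ if env is None else env
    for name in ("OPENAI_API_KEY", "CHATGPT_API_KEY"):
        value = env.get(name, "").strip()
        if value:
            return value
    raise RuntimeError(
        "Set OPENAI_API_KEY (or the legacy CHATGPT_API_KEY) in your .env file"
    )
```

Passing a plain dict instead of `os.environ` keeps the lookup easy to test without touching the real environment.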
--- a/README.md
+++ b/README.md
@@ -147,15 +147,17 @@ You can follow these steps to generate a PageIndex tree from a PDF document.
 pip3 install --upgrade -r requirements.txt
 ```

-### 2. Set your OpenAI API key
+### 2. Set your LLM API key

-Create a `.env` file in the root directory and add your API key:
+Create a `.env` file in the root directory with your LLM API key:

 ```bash
-CHATGPT_API_KEY=your_openai_key_here
+OPENAI_API_KEY=your_openai_key_here
+# or
+CHATGPT_API_KEY=your_openai_key_here  # legacy, still supported
 ```

-### 3. Run PageIndex on your PDF
+### 3. Generate PageIndex structure for your PDF

 ```bash
 python3 run_pageindex.py --pdf_path /path/to/your/document.pdf
@@ -189,7 +191,21 @@ python3 run_pageindex.py --md_path /path/to/your/document.md
 > Note: in this function, we use "#" to determine node heading and their levels. For example, "##" is level 2, "###" is level 3, etc. Make sure your markdown file is formatted correctly. If your Markdown file was converted from a PDF or HTML, we don't recommend using this function, since most existing conversion tools cannot preserve the original hierarchy. Instead, use our [PageIndex OCR](https://pageindex.ai/blog/ocr), which is designed to preserve the original hierarchy, to convert the PDF to a markdown file and then use this function.
 </details>

-<!-- 
+### A Complete Agentic RAG Example
+
+For a complete agent-based QA example using the [OpenAI Agents SDK](https://github.com/openai/openai-agents-python), see [`examples/openai_agents_demo.py`](examples/openai_agents_demo.py).
+
+```bash
+# Install optional dependency
+pip3 install openai-agents
+
+# Run the demo
+python3 examples/openai_agents_demo.py
+```
+
+---
+
+<!--
 # ☁️ Improved Tree Generation with PageIndex OCR

 This repo is designed for generating PageIndex tree structure for simple PDFs, but many real-world use cases involve complex PDFs that are hard to parse by classic Python tools. However, extracting high-quality text from PDF documents remains a non-trivial challenge. Most OCR tools only extract page-level content, losing the broader document context and hierarchy.