Commit graph

42 commits

Author SHA1 Message Date
Ray
a108c021ae
Disable agent tracing and auto-add litellm/ prefix for retrieve_model
* Disable agent tracing and auto-add litellm/ prefix for retrieve_model

* Preserve supported retrieve_model prefixes

* Remove temporary retrieve_model tests

* Limit tracing disablement to demo execution
2026-03-29 00:55:57 +08:00
Ray
4002dc94de Rename demo script and update README wording 2026-03-28 04:56:05 +08:00
Ray
77722838e1
Restructure examples directory and improve document storage (#189)
* Consolidate tests/ into examples/documents/

* Add line_count and reorder structure keys

* Lazy-load documents with _meta.json index

* Update demo script and add pre-shipped workspace

* Extract shared helpers for JSON reading and meta entry building
2026-03-28 04:28:59 +08:00
Kylin
5d4491f3bf
Add PageIndexClient with agent-based retrieval via OpenAI Agents SDK (#125)
* Add PageIndexClient with retrieve, streaming support and litellm integration
* Add OpenAI agents demo example
* Update README with example agent demo section
* Support separate retrieve_model configuration for index and retrieve
2026-03-26 23:19:50 +08:00
Kylin
2403be8f27
Integrate LiteLLM for multi-provider LLM support (#168)
* Integrate litellm for multi-provider LLM support

* recover the default config yaml

* Use litellm.acompletion for native async support

* fix tob

* Rename llm_complete/allm_complete to llm_completion/llm_acompletion, remove unused llm_complete_stream

* Pin litellm to version 1.82.0

* resolve comments

* args from cli is used to overrides config.yaml

* Fix get_page_tokens hardcoded model default

Pass opt.model to get_page_tokens so tokenization respects the
configured model instead of always using gpt-4o-2024-11-20.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Remove explicit openai dependency from requirements.txt

openai is no longer directly imported; it comes in as a transitive
dependency of litellm. Pinning it explicitly risks version conflicts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Restore openai==1.101.0 pin in requirements.txt

litellm==1.82.0 and openai-agents have conflicting openai version
requirements, but openai==1.101.0 works at runtime for both.
The pin is necessary to prevent litellm from pulling in openai>=2.x
which would break openai-agents.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Remove explicit openai dependency from requirements.txt

openai is not directly used; it comes in as a transitive dependency
of litellm. No openai-agents in this branch so no pin needed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix an litellm error log

* resolve comments

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 18:47:07 +08:00
BukeLy
85f17f9955 Fix list_index variable shadowing in fix_incorrect_toc
The loop variable `list_index = page_index - start_index` was
overwriting the outer `list_index = incorrect_item['list_index']`,
causing results to be written back to wrong index positions.

Rename the loop variable to `page_list_idx` to avoid shadowing.

Closes #66
2026-03-16 14:19:51 +08:00
Bukely_
599d2ce497
Merge pull request #65 from luojiyin1987/fix/extract-toc-infinite-loop
fix: prevent infinite loop in extract_toc_content
2026-03-16 13:34:05 +08:00
Bukely_
b487f9d7c7
Merge pull request #63 from luojiyin1987/fix/api-error-return
fix: make ChatGPT_API_with_finish_reason return consistent tuple
2026-03-16 13:24:54 +08:00
Matias Insaurralde
cf52a678a3
fix: rename tob_extractor_prompt typo to toc_extractor_prompt (#109)
Signed-off-by: Matías Insaurralde <matias@insaurral.de>
2026-02-27 15:16:19 +08:00
luojiyin
ac9ceaf2ee
fix: prevent infinite loop in extract_toc_content
The while loop exit condition used len(chat_history), but chat_history
was rebuilt every iteration with exactly 2 elements, making the check
len(chat_history) > 5 never true.

Replace with explicit attempt counter and max_attempts limit.
2026-01-19 12:34:39 +08:00
luojiyin
87962b4d42
fix: make ChatGPT_API_with_finish_reason return consistent tuple
Signed-off-by: luojiyin <luojiyin@hotmail.com>
2026-01-19 12:27:35 +08:00
Ray
33ec9aca6e by default add node summary 2025-09-01 14:06:57 +08:00
zmtomorrow
d799a0cf67 firx print_toc 2025-08-31 11:52:59 +01:00
zmtomorrow
d4c910ad5a firx print_toc 2025-08-31 11:51:50 +01:00
Ray
987f91c3bf fix params 2025-08-28 13:22:43 +08:00
Ray
4b4ae4d51d fix model 2025-08-28 13:07:15 +08:00
Ray
6d1b505541 fix params 2025-08-28 12:45:39 +08:00
zmtomorrow
277bcb9512 filter code 2025-08-27 15:22:44 +01:00
zmtomorrow
fdcbed54f1 Merge branch 'feat/markdown-tree' of github.com:VectifyAI/PageIndex into feat/markdown-tree 2025-08-27 15:12:55 +01:00
zmtomorrow
edd1be353c filter code 2025-08-27 15:12:45 +01:00
Ray
ad6a926fce add markdown runner 2025-08-26 21:27:40 +08:00
zmtomorrow
82ad5e2651 add markdown_to_tree 2025-08-26 12:18:06 +01:00
zmtomorrow
fd70ef1acf add markdown_to_tree 2025-08-26 12:17:21 +01:00
zmtomorrow
78cce56b33 add markdown_to_tree 2025-08-26 12:17:05 +01:00
Ray
c22778f85d fix node summary 2025-08-26 16:30:12 +08:00
Ray
19faaad74f fix format 2025-08-26 16:17:05 +08:00
Ray
34ed3fbc60 fix structure 2025-08-26 16:14:29 +08:00
Ray
802f149dd1 add summary 2025-08-26 15:49:03 +08:00
zmtomorrow
04bbdae647 add markdown_to_tree 2025-08-25 20:57:44 +01:00
Ray
d9568ff4ed use async with for client calls 2025-06-26 01:37:20 +08:00
Ray
6d6e92d56e consolidate async calls 2025-06-25 15:41:29 +08:00
Ray
991324efed
Merge pull request #21 from clarenceluo78/main
Fix: handle TOC items exceeding document length
2025-06-01 14:56:10 +08:00
Ray
ad65e3f19c fix option for adding node text 2025-05-30 14:02:38 +08:00
clarenceluo78
1679600c9a fix: handle TOC items exceeding document length 2025-05-30 03:03:20 +01:00
Ray
57ea07da3d fix arg parsing 2025-04-23 17:10:54 +08:00
Ray
7b890a16fa fix index range 2025-04-20 07:59:20 +08:00
Ray
b588cd62a1 add async. various fixes. 2025-04-20 07:57:07 +08:00
zmtomorrow
5aef9b4a49 fix physical index 2025-04-18 17:01:02 +08:00
Ray
4ba12cf3ce fix pdf name 2025-04-09 19:37:00 +08:00
Ray
8a7f8132cf fix yaml 2025-04-09 17:30:36 +08:00
Ray
63024b1d7b fix typo 2025-04-07 14:56:18 +08:00
Ray
403a7a4f54 Restructure and update CLI entry point to run_pageindex.py 2025-04-06 21:49:16 +08:00