* feat:compatible with Pageindex SDK
* corner cases fixed
* fix: mock behavior of old SDK
* fix: close streaming response and warn on empty api_key
- LegacyCloudAPI: close response in `finally` for both _stream_chat_response
variants so abandoned iterators no longer leak the TCP connection.
- PageIndexClient: emit a warning instead of silently falling back to local
when api_key is the empty string, surfacing typical env-var-unset misconfig.
- FakeResponse: add close()/closed to match the real requests.Response API.
- Add unit coverage for stream close (both paths) and the empty-api_key warning.
- Add scripts/e2e_legacy_sdk.py to smoke-test the legacy SDK contract end-to-end
against api.pageindex.ai.
* chore: mark legacy SDK methods with @deprecated and docstring pointers
- Decorate the 12 PageIndexClient cloud-SDK compat methods with
@typing_extensions.deprecated(..., category=PendingDeprecationWarning):
- IDE/type-checkers render them with a strikethrough hint
- runtime warnings stay silent by default (no spam for existing callers),
surfaceable via `python -W default::PendingDeprecationWarning`
- Add a one-line docstring on each pointing to the Collection-based equivalent.
- Promote typing-extensions to a direct dependency (was transitive via litellm).
---------
Co-authored-by: XinyanZhou <xinyanzhou@XinyanZhoudeMacBook-Pro.local>
Co-authored-by: saccharin98 <xinyanzhou938@gmail.com>
Co-authored-by: mountain <kose2livs@gmail.com>
The cloud backend previously polled tree_resp["retrieval_ready"]
as the ready signal. Empirically this flag is not a reliable
indicator — docs can reach status=="completed" without
retrieval_ready flipping, causing col.add() to wait until the 10
min timeout before giving up on otherwise-successful uploads.
The cloud API's canonical ready signal is status=="completed";
switch the poll to check that instead.
* Consolidate tests/ into examples/documents/
* Add line_count and reorder structure keys
* Lazy-load documents with _meta.json index
* Update demo script and add pre-shipped workspace
* Extract shared helpers for JSON reading and meta entry building
* Add PageIndexClient with retrieve, streaming support and litellm integration
* Add OpenAI agents demo example
* Update README with example agent demo section
* Support separate retrieve_model configuration for index and retrieve
* Integrate litellm for multi-provider LLM support
* recover the default config yaml
* Use litellm.acompletion for native async support
* fix tob
* Rename llm_complete/allm_complete to llm_completion/llm_acompletion, remove unused llm_complete_stream
* Pin litellm to version 1.82.0
* resolve comments
* args from cli is used to overrides config.yaml
* Fix get_page_tokens hardcoded model default
Pass opt.model to get_page_tokens so tokenization respects the
configured model instead of always using gpt-4o-2024-11-20.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Remove explicit openai dependency from requirements.txt
openai is no longer directly imported; it comes in as a transitive
dependency of litellm. Pinning it explicitly risks version conflicts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Restore openai==1.101.0 pin in requirements.txt
litellm==1.82.0 and openai-agents have conflicting openai version
requirements, but openai==1.101.0 works at runtime for both.
The pin is necessary to prevent litellm from pulling in openai>=2.x
which would break openai-agents.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Remove explicit openai dependency from requirements.txt
openai is not directly used; it comes in as a transitive dependency
of litellm. No openai-agents in this branch so no pin needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix an litellm error log
* resolve comments
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The loop variable `list_index = page_index - start_index` was
overwriting the outer `list_index = incorrect_item['list_index']`,
causing results to be written back to wrong index positions.
Rename the loop variable to `page_list_idx` to avoid shadowing.
Closes#66
Issues are opened by external users who don't have write permissions.
Add allowed_non_write_users: "*" so claude-code-action runs for all
issue authors, not just repo collaborators.
Backfill workflow triggers issue-dedupe via gh workflow run, which
makes the actor github-actions. Add it to allowed_bots so
claude-code-action accepts the trigger.
- Only retry 403 when rate-limit headers indicate throttling, not permission errors
- Add fetchAllComments() with pagination for issues with 100+ comments
- Add pagination loop in backfill workflow to handle repos with 200+ open issues
Replace the copilot-generated inline search logic with a claude-code-action
based architecture inspired by anthropic/claude-code's approach:
- Add .claude/commands/dedupe.md with 5-parallel-search strategy
- Add scripts/comment-on-duplicates.sh with 3-day grace period warning
- Rewrite issue-dedupe.yml to use claude-code-action + /dedupe command
- Rewrite autoclose script to check bot comments, human activity, and thumbsdown
- Rewrite backfill to trigger dedupe workflow per issue with rate limiting
- Add concurrency control, timeout, input validation, and rate limit retry
- Remove gh.sh (unnecessary), backfill-dedupe.js (replaced by workflow trigger)
The while loop exit condition used len(chat_history), but chat_history
was rebuilt every iteration with exactly 2 elements, making the check
len(chat_history) > 5 never true.
Replace with explicit attempt counter and max_attempts limit.