fix: poll status=="completed" in cloud add_document

The cloud backend previously polled tree_resp["retrieval_ready"]
as the ready signal. In practice this flag is unreliable: documents
can reach status=="completed" without retrieval_ready ever flipping,
so col.add() kept polling until the 10 min timeout and then gave up
on uploads that had actually succeeded.

The cloud API's canonical ready signal is status=="completed";
switch the poll to check that instead.
Ray 2026-04-11 01:04:03 +08:00
parent f5de9c9dbb
commit 162b70d825

@@ -141,12 +141,13 @@ class CloudBackend:
         doc_id = resp["doc_id"]
-        # Poll until retrieval-ready
+        # Poll until indexing completes. The cloud API signals readiness via
+        # status == "completed"; retrieval_ready is not a reliable indicator.
         for _ in range(120):  # 10 min max
             tree_resp = self._request("GET", f"/doc/{self._enc(doc_id)}/", params={"type": "tree"})
-            if tree_resp.get("retrieval_ready"):
-                return doc_id
             status = tree_resp.get("status", "")
+            if status == "completed":
+                return doc_id
             if status == "failed":
                 raise CloudAPIError(f"Document {doc_id} indexing failed")
             time.sleep(5)
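
For reference, the polling behaviour after this change can be sketched as a standalone function. This is only an illustration, not the code in CloudBackend: wait_until_indexed, its fetch_status and sleep parameters, the local CloudAPIError stand-in, and the TimeoutError raised when the attempts run out are all assumptions made so the sketch is self-contained (the diff does not show what the real method does after the loop).

```python
import time
from typing import Any, Callable, Mapping


class CloudAPIError(Exception):
    """Stand-in for the backend's error type, defined only for this sketch."""


def wait_until_indexed(
    fetch_status: Callable[[], Mapping[str, Any]],
    doc_id: str,
    attempts: int = 120,          # 120 polls * 5 s = 10 min, as in the diff
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the document reports status == "completed".

    fetch_status is a hypothetical callable returning the parsed tree
    response; retrieval_ready is deliberately ignored.
    """
    for _ in range(attempts):
        resp = fetch_status()
        status = resp.get("status", "")
        if status == "completed":
            return doc_id
        if status == "failed":
            raise CloudAPIError(f"Document {doc_id} indexing failed")
        sleep(interval)
    # Assumption: the real method's behaviour on timeout is not shown in the diff.
    raise TimeoutError(f"Document {doc_id} not indexed after {attempts} polls")
```

Passing sleep as a parameter keeps the loop testable without real delays; a response such as {"status": "completed", "retrieval_ready": False} returns immediately, which is the case the old check waited out.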