feat: add automatic client retry logic with exponential backoff
All checks were successful
Publish to PyPI / publish (push) Successful in 16s
parent 5040d181d2
commit 93adb6c45c
7 changed files with 87 additions and 66 deletions
@@ -11,7 +11,8 @@ SecureChatCompletion(
     base_url: str = "https://api.nomyo.ai",
     allow_http: bool = False,
     api_key: Optional[str] = None,
-    secure_memory: bool = True
+    secure_memory: bool = True,
+    max_retries: int = 2
 )
 ```

@@ -21,6 +22,7 @@ SecureChatCompletion(
 - `allow_http` (bool): Allow HTTP connections (ONLY for local development, never in production)
 - `api_key` (Optional[str]): Optional API key for bearer authentication
 - `secure_memory` (bool): Enable secure memory protection (default: True)
+- `max_retries` (int): Number of retries on retryable errors (429, 500, 502, 503, 504, network errors). Uses exponential backoff. Default: 2

 ### Methods

@@ -92,13 +94,18 @@ The `SecureCompletionClient` class handles the underlying encryption, key manage
 ### Constructor

 ```python
-SecureCompletionClient(router_url: str = "https://api.nomyo.ai", allow_http: bool = False)
+SecureCompletionClient(
+    router_url: str = "https://api.nomyo.ai",
+    allow_http: bool = False,
+    max_retries: int = 2
+)
 ```

 **Parameters:**

 - `router_url` (str): Base URL of the NOMYO Router (must use HTTPS for production)
 - `allow_http` (bool): Allow HTTP connections (ONLY for local development, never in production)
+- `max_retries` (int): Number of retries on retryable errors (429, 500, 502, 503, 504, network errors). Uses exponential backoff. Default: 2

 ### Methods

@@ -48,20 +48,14 @@ HTTP/1.1 503 Service Unavailable
 - **Implement exponential backoff** when you receive a `429` response. Start with a short delay (e.g. 500 ms) and double it on each subsequent failure, up to a reasonable maximum.
 - **Monitor for `503` responses** — repeated occurrences indicate that your usage pattern is triggering the abuse threshold. Refactor your request logic before the cool-down expires.

-## Example: Exponential Backoff
+## Retry Behaviour

+The client retries automatically on `429`, `500`, `502`, `503`, `504`, and network errors using exponential backoff (1 s, 2 s, …). The default is **2 retries**. You can raise or disable this per client:
+
 ```python
-import asyncio
-import httpx
+# More retries for high-throughput workloads
+client = SecureChatCompletion(api_key="...", max_retries=5)

-async def request_with_backoff(client, *args, max_retries=5, **kwargs):
-    delay = 0.5
-    for attempt in range(max_retries):
-        response = await client.create(*args, **kwargs)
-        if response.status_code == 429:
-            await asyncio.sleep(delay)
-            delay = min(delay * 2, 30)
-            continue
-        return response
-    raise RuntimeError("Rate limit exceeded after maximum retries")
+# Disable retries entirely
+client = SecureChatCompletion(api_key="...", max_retries=0)
 ```
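
Note on the documented behaviour: the docs changed in this commit describe retries on `429`, `500`, `502`, `503`, `504`, and network errors with delays of 1 s, 2 s, and so on. The client code is not part of the hunks shown here, so the following is only a minimal sketch of that policy, not the library's implementation. `call_with_retries`, the `send` callable, the use of `ConnectionError` for network errors, and the continued doubling past 2 s are all assumptions made for illustration.

```python
import time

# Status codes the documentation lists as retryable.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_retries(send, max_retries=2, sleep=time.sleep):
    """Call send() and retry on retryable outcomes with exponential backoff.

    `send` is a hypothetical zero-argument callable that returns an HTTP
    status code or raises ConnectionError on a network failure.
    The wait before retry n (0-based) is 2**n seconds: 1 s, 2 s, 4 s, ...
    (doubling beyond 2 s is an assumption).
    """
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")

    for attempt in range(max_retries + 1):
        last_attempt = attempt == max_retries
        try:
            status = send()
        except ConnectionError:
            # Network error: give up only when no retries remain.
            if last_attempt:
                raise
        else:
            # Success, non-retryable error, or retries exhausted.
            if status not in RETRYABLE_STATUS or last_attempt:
                return status
        sleep(2.0 ** attempt)
```

With `max_retries=0` the loop runs exactly once and never sleeps, which matches the "disable retries entirely" case in the updated docs. Passing `sleep` as a parameter keeps the sketch testable without real waiting.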