Description
AI-written, but based on an error we observed in production logs and lightly edited by me.
The WorkOS Python SDK has no retry logic for transient failures. When a WorkOS API call encounters a timeout or transient server error, it fails immediately on the first attempt. This is especially problematic for operations like authenticate_with_refresh_token, which are inherently idempotent and safe to retry.
We hit this in production today. Our call to authenticate_with_refresh_token connected to api.workos.com successfully, but WorkOS never sent response headers. After 25 seconds (the SDK's DEFAULT_REQUEST_TIMEOUT), httpx.ReadTimeout was raised and our user's auth flow was broken.
A single automatic retry would have resolved this transparently.
Current behavior
AsyncHTTPClient.request() in workos/utils/http_client.py makes a single request with no retry:
workos-python/src/workos/utils/http_client.py, line 237 in 588850d:

    response = await self._client.request(**prepared_request_parameters)

Transient failures — httpx.TimeoutException, httpx.ConnectError, HTTP 429, HTTP 5xx — all fail immediately.
Additionally, these transport-level exceptions are not caught or wrapped in WorkOS exception types, so they bubble up as raw httpx errors. Consumers catching workos.exceptions.BaseRequestException won't catch timeouts.
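To illustrate the wrapping this implies, here is a minimal sketch rather than the SDK's actual code. `SDKRequestError` and `SDKTransportError` are hypothetical stand-ins (the first plays the role of a base class such as workos.exceptions.BaseRequestException), and the built-in `TimeoutError` stands in for httpx.TimeoutException so the example runs without third-party packages.

```python
import asyncio
from typing import Any, Awaitable, Callable, Tuple, Type


class SDKRequestError(Exception):
    """Hypothetical stand-in for an SDK-wide base exception."""


class SDKTransportError(SDKRequestError):
    """Hypothetical subclass covering timeouts and connection failures."""


async def call_wrapped(
    send: Callable[[], Awaitable[Any]],
    transport_errors: Tuple[Type[BaseException], ...],
) -> Any:
    """Await send() and translate transport-level errors into SDK errors."""
    try:
        return await send()
    except transport_errors as exc:
        # `from exc` keeps the original error reachable via __cause__.
        raise SDKTransportError(str(exc) or type(exc).__name__) from exc


async def _timing_out_call() -> None:
    # Simulates a request whose response headers never arrive.
    raise TimeoutError("read timed out")


if __name__ == "__main__":
    try:
        asyncio.run(call_wrapped(_timing_out_call, (TimeoutError,)))
    except SDKRequestError as err:
        print(type(err).__name__, "caused by", type(err.__cause__).__name__)
```

With this shape, a consumer that catches only the SDK base exception would also see timeouts, and the underlying transport error remains available on `__cause__` for logging.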
Expected behavior
Comparison to other auth/identity SDKs:
| SDK | Default retries | Retryable conditions |
|---|---|---|
| Auth0 Python | 2 | 408, 429, 5xx, connection errors |
| AWS SDKs | 3 | 429, 5xx, connection errors |
| WorkOS Python | 0 | None |
A reasonable default would be 2-3 retries with exponential backoff for:
- httpx.TimeoutException and httpx.ConnectError
- HTTP 429 (respecting the Retry-After header)
- HTTP 500, 502, 503, 504
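This could be implemented as a simple retry loop in request(). httpx's transport-level retries (the retries argument on httpx.AsyncHTTPTransport) only cover failures to establish a connection, so on their own they would not retry read timeouts or 429/5xx responses.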
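A retry loop along these lines might look like the sketch below. It is an illustration under assumptions, not a proposed patch: `request_with_retries`, `backoff_delay`, and their parameters are hypothetical names, the response object only needs `status_code` and `headers`, and only the delta-seconds form of Retry-After is parsed.

```python
import asyncio
import random
from typing import Any, Awaitable, Callable, Optional, Tuple, Type

RETRYABLE_STATUS = frozenset({429, 500, 502, 503, 504})


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 0.5, cap: float = 8.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based)."""
    if retry_after is not None:
        try:
            # Only delta-seconds is handled; an HTTP-date value falls
            # through to exponential backoff below.
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass
    # Exponential backoff with full jitter.
    return random.uniform(0.0, min(cap, base * 2 ** (attempt - 1)))


async def request_with_retries(
    send: Callable[[], Awaitable[Any]],
    retryable_exceptions: Tuple[Type[BaseException], ...],
    max_retries: int = 2,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Any:
    """Call send() until it succeeds or max_retries retries are used up."""
    for attempt in range(max_retries + 1):
        last_attempt = attempt == max_retries
        try:
            response = await send()
        except retryable_exceptions:
            if last_attempt:
                raise  # out of retries: surface the transport error
            await sleep(backoff_delay(attempt + 1))
            continue
        if response.status_code not in RETRYABLE_STATUS or last_attempt:
            return response  # success, non-retryable status, or out of retries
        await sleep(backoff_delay(attempt + 1, response.headers.get("Retry-After")))
```

With httpx, `send` could be something like `lambda: client.request("POST", url, json=body)` and `retryable_exceptions` could be `(httpx.TimeoutException, httpx.ConnectError)`. Retrying is only safe for calls that are idempotent, so a real implementation would likely make this opt-in per endpoint or per HTTP method.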
Workaround
We currently wrap SDK calls in our own try/except httpx.ReadTimeout, but this requires knowing about httpx internals. The SDK's exception hierarchy should abstract transport errors away.
Environment
- workos v5.45.0
- httpx v0.28.1
- Python 3.14
- Async client (AsyncHTTPClient)