fix: rotate auth on Anthropic "out of extra usage" 400 (relates to #2599) #3305
Anthropic upstream returns subscription-quota exhaustion as
HTTP 400 with body:
{"type":"error","error":{"type":"invalid_request_error",
"message":"You're out of extra usage. Add more at
claude.ai/settings/usage and keep going."}}
isRequestInvalidError currently treats every 400 with the
"invalid_request_error" substring as a non-retriable client
shape error. Result: the proxy returns the 400 to the client
without trying any other credential in the auth pool, even
though the failure is per-credential rather than per-request.
This patch:
* isRequestInvalidError returns false for 400 + "out of extra
usage" body so the retry/rotate path fires.
* MarkResult promotes that same 400 to statusCode 429 so the
credential gets cooled like a quota-exceeded one rather than
staying hot for the next request.
Relates to router-for-me#2599 - that issue tracks the cause-side workaround
(payload.default injection so requests count as included usage).
This PR addresses the effect-side rotation gap so a multi-account
pool degrades gracefully when one account hits the wall.
Tested locally with three OAuth pool entries against haiku-4-5
and 55KB chat-completion requests; previously failed with 400,
now succeeds via rotation.
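The two behaviours described in this message can be sketched in isolation as follows. This is a minimal illustration, not the code in conductor.go: the helper names isQuotaExhausted400 and effectiveStatus are hypothetical (the actual patch inlines the checks), and only the two body markers come from the PR.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// quotaMarkers are the two body substrings the patch keys on.
var quotaMarkers = []string{"out of extra usage", "claude.ai/settings/usage"}

// isQuotaExhausted400 reports whether a 400 response carries the
// subscription-quota message. Hypothetical helper for illustration only.
func isQuotaExhausted400(status int, body string) bool {
	if status != http.StatusBadRequest {
		return false
	}
	for _, m := range quotaMarkers {
		if strings.Contains(body, m) {
			return true
		}
	}
	return false
}

// effectiveStatus promotes the quota 400 to 429 so cooldown bookkeeping
// treats it like any other quota-exceeded response.
func effectiveStatus(status int, body string) int {
	if isQuotaExhausted400(status, body) {
		return http.StatusTooManyRequests
	}
	return status
}

func main() {
	quota := `{"type":"error","error":{"type":"invalid_request_error","message":"You're out of extra usage. Add more at claude.ai/settings/usage and keep going."}}`
	shape := `{"type":"error","error":{"type":"invalid_request_error","message":"messages: field required"}}`
	fmt.Println(effectiveStatus(http.StatusBadRequest, quota)) // 429
	fmt.Println(effectiveStatus(http.StatusBadRequest, shape)) // 400
}
```

Genuine client-shape 400s pass through unchanged, which matches the narrow scoping the patch aims for.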
Code Review
This pull request updates the error handling logic in sdk/cliproxy/auth/conductor.go to treat specific Anthropic 400 errors related to subscription quota exhaustion as 429 Too Many Requests. This change ensures that affected credentials are cooled down and rotated rather than returning a terminal error to the client. The review feedback suggests using case-insensitive string comparisons when checking error messages to ensure the logic remains robust against potential changes in upstream error formatting.
    if statusCode == http.StatusBadRequest && result.Error != nil &&
        (strings.Contains(result.Error.Message, "out of extra usage") ||
            strings.Contains(result.Error.Message, "claude.ai/settings/usage")) {
        statusCode = http.StatusTooManyRequests
    }
For robustness against potential changes in upstream error message casing, it is recommended to perform a case-insensitive check. This aligns with the pattern used in isModelSupportErrorMessage elsewhere in this file.
Suggested change:
-   if statusCode == http.StatusBadRequest && result.Error != nil &&
-       (strings.Contains(result.Error.Message, "out of extra usage") ||
-           strings.Contains(result.Error.Message, "claude.ai/settings/usage")) {
-       statusCode = http.StatusTooManyRequests
-   }
+   if statusCode == http.StatusBadRequest && result.Error != nil &&
+       (strings.Contains(strings.ToLower(result.Error.Message), "out of extra usage") ||
+           strings.Contains(strings.ToLower(result.Error.Message), "claude.ai/settings/usage")) {
+       statusCode = http.StatusTooManyRequests
+   }
    if strings.Contains(msg, "out of extra usage") ||
        strings.Contains(msg, "claude.ai/settings/usage") {
        return false
    }
As with the check in MarkResult, using strings.ToLower here ensures that the rotation logic remains effective even if the human-readable part of the error message from Anthropic changes its casing.
Suggested change:
-   if strings.Contains(msg, "out of extra usage") ||
-       strings.Contains(msg, "claude.ai/settings/usage") {
-       return false
-   }
+   if strings.Contains(strings.ToLower(msg), "out of extra usage") ||
+       strings.Contains(strings.ToLower(msg), "claude.ai/settings/usage") {
+       return false
+   }
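Both suggestions above lowercase the message once per marker. One possible way to share the check between isRequestInvalidError and MarkResult is a small helper that lowercases once; the name containsQuotaMarker is hypothetical and does not appear in the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// containsQuotaMarker does a case-insensitive match against the two body
// markers. Hypothetical helper; the PR inlines the checks at each call site.
func containsQuotaMarker(msg string) bool {
	lower := strings.ToLower(msg) // lowercase once, reuse for both markers
	return strings.Contains(lower, "out of extra usage") ||
		strings.Contains(lower, "claude.ai/settings/usage")
}

func main() {
	fmt.Println(containsQuotaMarker("You're Out Of Extra Usage.")) // true
	fmt.Println(containsQuotaMarker("messages: field required"))   // false
}
```

Keeping the markers in one place also means a future wording change upstream would need only one edit.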
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 86eaf39abf
    if statusCode == http.StatusBadRequest && result.Error != nil &&
        (strings.Contains(result.Error.Message, "out of extra usage") ||
            strings.Contains(result.Error.Message, "claude.ai/settings/usage")) {
        statusCode = http.StatusTooManyRequests
Cool exhausted Anthropic auths globally
When an Anthropic OAuth account hits this subscription-quota error, the status promotion is applied only to the current model state in this result.Model != "" branch. updateAggregatedAvailability keeps the auth selectable while any other model state is available, and isAuthBlockedForModel checks only the requested model, so an auth that supports multiple Claude models is cooled for just the model that failed; requests for another model, or model-pool fallbacks on the same auth, can keep selecting the same exhausted credential instead of rotating to another credential.
Pull request overview
This PR fixes an auth-rotation gap for Anthropic OAuth credential pools when a single credential hits the “out of extra usage” subscription-quota error (returned as HTTP 400). The change ensures this condition is treated as credential-scoped so the manager can rotate to other healthy credentials and cool down the depleted one.
Changes:
- Exempt Anthropic “out of extra usage” 400 responses from the isRequestInvalidError fast-fail path so rotation/retry can proceed.
- In MarkResult, treat the same 400 body shape like a 429 for cooldown/quota tracking so depleted credentials are temporarily skipped.
    // Anthropic 400 "out of extra usage" is per-credential
    // subscription quota; treat like 429 so the auth gets cooled
    // rather than returning the error to the client.
    if statusCode == http.StatusBadRequest && result.Error != nil &&
        (strings.Contains(result.Error.Message, "out of extra usage") ||
            strings.Contains(result.Error.Message, "claude.ai/settings/usage")) {
        statusCode = http.StatusTooManyRequests
    }
    // Anthropic OAuth subscription quota exhaustion arrives as 400
    // invalid_request_error but is per-credential, not per-request.
    // Let it fall through to rotation + cool-down (see MarkResult promotion).
    if strings.Contains(msg, "out of extra usage") ||
        strings.Contains(msg, "claude.ai/settings/usage") {
        return false
luispater left a comment
Summary:
- Fixes a rotation gap for Anthropic OAuth subscription-quota exhaustion that surfaces as HTTP 400 invalid_request_error with an "out of extra usage" message.
- Makes this error credential-scoped (eligible for rotation) and cools the depleted auth like a 429 so it won’t be re-selected immediately.
What changed:
- isRequestInvalidError: returns false for HTTP 400 errors whose message contains "out of extra usage" or "claude.ai/settings/usage", allowing the auth pool to try other credentials.
- Manager.MarkResult: treats the same 400 body markers as 429 for cooldown/quota bookkeeping.
Non-blocking suggestions:
- Add a small regression test to ensure this 400 pattern does not short-circuit rotation, and that the auth is marked quota-exceeded/cooldown as intended.
- Consider making the stored LastError status consistent with the internal 400→429 promotion to ease debugging.
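A regression check along the lines of the first suggestion could be table-driven. The sketch below is written as a standalone program so it runs on its own; in the repository it would more naturally live in a _test.go file using the testing package. The classify function is a hypothetical stand-in that models only the cases in the table, not the real isRequestInvalidError or MarkResult.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// classify is a hypothetical stand-in for the combined behaviour: it returns
// whether the pool should rotate and which status drives cooldown.
func classify(status int, msg string) (rotate bool, cooldownStatus int) {
	quota := status == http.StatusBadRequest &&
		(strings.Contains(msg, "out of extra usage") ||
			strings.Contains(msg, "claude.ai/settings/usage"))
	if quota || status == http.StatusTooManyRequests {
		return true, http.StatusTooManyRequests
	}
	// Everything else is treated as fast-fail in this simplified model.
	return false, status
}

// failures runs the table and returns the names of failing cases.
func failures() []string {
	cases := []struct {
		name       string
		status     int
		msg        string
		wantRotate bool
		wantStatus int
	}{
		{"quota 400 rotates and cools", 400, "You're out of extra usage.", true, 429},
		{"settings URL marker", 400, "Add more at claude.ai/settings/usage", true, 429},
		{"client-shape 400 fails fast", 400, "messages: field required", false, 400},
		{"plain 429 unchanged", 429, "rate limited", true, 429},
	}
	var failed []string
	for _, c := range cases {
		rotate, status := classify(c.status, c.msg)
		if rotate != c.wantRotate || status != c.wantStatus {
			failed = append(failed, c.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failing cases:", len(failures()))
}
```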
Test plan:
- go build -o test-output ./cmd/server && rm test-output
This is an automated Codex review result and still requires manual verification by a human reviewer.
Summary
Fixes a rotation gap where the proxy returns Anthropic's subscription-quota 400 directly to the client without trying any other credential in the auth pool, even though the failure is per-credential rather than per-request.
Symptom
When an OAuth-backed Claude account in the auth pool runs out of "extra usage", Anthropic returns:
{"type":"error","error":{"type":"invalid_request_error","message":"You're out of extra usage. Add more at claude.ai/settings/usage and keep going."}}
The current isRequestInvalidError classifier in sdk/cliproxy/auth/conductor.go treats every 400 containing the substring invalid_request_error as a client-shape error and short-circuits the retry/rotate loop.
Effects:
Quota.Exceeded is only set on 429 in MarkResult).
I observed this with three OAuth pool entries when one account hit the wall: every Hermes-style request (~55KB system prompt) returned a 400 with the message above, with the other two healthy accounts sitting idle.
Patch
Two hunks in sdk/cliproxy/auth/conductor.go:
- isRequestInvalidError returns false for 400 with body containing "out of extra usage" or "claude.ai/settings/usage". That lets the retry/rotate path take over instead of bailing to the client.
- MarkResult promotes that same 400 → http.StatusTooManyRequests so the credential is cooled like a normal 429 and the existing Quota.Exceeded machinery skips it on subsequent selections.
Both checks are scoped narrowly to the two body markers Anthropic uses for this specific error; other 400 invalid_request_error cases (genuine client-shape errors) keep the existing fast-fail behaviour.
Relationship to #2599
#2599 tracks the cause-side workaround: injecting a Claude Code system prompt via payload.default so requests count against included plan usage instead of extra usage, so this 400 is never returned in the first place.
This PR is the effect-side fix: when this 400 is returned (because the cause-side workaround isn't enabled, or the user really has exhausted included usage too), a multi-account pool now degrades gracefully instead of failing the request.
The two fixes are complementary, not exclusive.
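The effect-side behaviour can be pictured as a rotation loop that moves to the next credential on per-credential failures and stops immediately on per-request failures. This is a rough sketch under stated assumptions: upstreamError, perCredential, and execute are hypothetical names, and the real manager also handles cooldown, backoff, and model state, which are omitted here.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// upstreamError is a hypothetical error type carrying status and message.
type upstreamError struct {
	Status  int
	Message string
}

func (e *upstreamError) Error() string { return fmt.Sprintf("%d: %s", e.Status, e.Message) }

// perCredential reports whether the failure belongs to the credential
// (429, or the quota-exhaustion 400) rather than to the request itself.
func perCredential(e *upstreamError) bool {
	if e.Status == http.StatusTooManyRequests {
		return true
	}
	return e.Status == http.StatusBadRequest &&
		(strings.Contains(e.Message, "out of extra usage") ||
			strings.Contains(e.Message, "claude.ai/settings/usage"))
}

// execute tries credentials in order and returns the one that succeeded.
func execute(creds []string, call func(cred string) error) (string, error) {
	last := errors.New("no credentials available")
	for _, c := range creds {
		err := call(c)
		if err == nil {
			return c, nil
		}
		last = err
		var ue *upstreamError
		if errors.As(err, &ue) && perCredential(ue) {
			continue // rotate to the next credential
		}
		return "", err // per-request failure: fail fast
	}
	return "", last
}

func main() {
	exhausted := map[string]bool{"acct-1": true}
	call := func(cred string) error {
		if exhausted[cred] {
			return &upstreamError{Status: 400, Message: "You're out of extra usage."}
		}
		return nil
	}
	used, err := execute([]string{"acct-1", "acct-2", "acct-3"}, call)
	fmt.Println(used, err) // acct-2 <nil>
}
```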
Test plan
msg_* IDs.
invalid_request_error paths unchanged.
Notes
nextQuotaCooldown with the auth's current BackoffLevel.