fix: rotate auth on Anthropic "out of extra usage" 400 (relates to #2599) #3305

Open
anon2023-Halmoni wants to merge 1 commit into router-for-me:dev from anon2023-Halmoni:fix/rotate-on-anthropic-extra-usage-400

Conversation

@anon2023-Halmoni

Summary

Fixes a rotation gap where the proxy returns Anthropic's subscription-quota 400 directly to the client without trying any other credential in the auth pool, even though the failure is per-credential rather than per-request.

Symptom

When an OAuth-backed Claude account in the auth pool runs out of "extra usage", Anthropic returns:

HTTP/1.1 400 Bad Request
Content-Type: application/json

{"type":"error","error":{"type":"invalid_request_error","message":"You're out of extra usage. Add more at claude.ai/settings/usage and keep going."}}

The current isRequestInvalidError classifier in sdk/cliproxy/auth/conductor.go treats every 400 containing the substring invalid_request_error as a client-shape error and short-circuits the retry/rotate loop. Effects:

  1. The depleted credential is never marked exhausted (Quota.Exceeded is only set on 429 in MarkResult).
  2. The next request from the pool can pick the same depleted credential and fail again.
  3. Other healthy credentials in the pool are never tried.

I observed this with three OAuth pool entries when one account hit the wall: every Hermes-style request (~55KB system prompt) returned a 400 with the message above, with the other two healthy accounts sitting idle.
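
The effect of the short-circuit can be illustrated with a minimal rotation loop. This is only a sketch: `rotate`, `attempt`, and the `fatal` callback are hypothetical names, not the conductor's actual API, and the real loop in conductor.go is likely more involved.

```go
package main

import (
	"errors"
	"net/http"
)

// attempt is a hypothetical stand-in for one upstream call made with a
// given credential. It returns the HTTP status and the response body.
type attempt func(cred string) (status int, body string)

// rotate tries credentials in order. If fatal reports true for a failed
// response, the loop stops immediately and the remaining credentials are
// never tried, which mirrors the symptom described above.
func rotate(creds []string, call attempt, fatal func(int, string) bool) (string, []string, error) {
	var tried []string
	for _, cred := range creds {
		tried = append(tried, cred)
		status, body := call(cred)
		if status == http.StatusOK {
			return cred, tried, nil
		}
		if fatal(status, body) {
			return "", tried, errors.New("request classified as invalid; not rotating")
		}
	}
	return "", tried, errors.New("all credentials failed")
}
```

With a `fatal` callback that flags every 400, a pool whose first credential is depleted stops after one attempt. With a callback that lets the quota 400 through, the same pool succeeds on the second credential.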

Patch

Two hunks in sdk/cliproxy/auth/conductor.go:

  1. isRequestInvalidError returns false for 400 with body containing out of extra usage or claude.ai/settings/usage. That lets the retry/rotate path take over instead of bailing to the client.
  2. MarkResult promotes that same 400 → http.StatusTooManyRequests so the credential is cooled like a normal 429 and the existing Quota.Exceeded machinery skips it on subsequent selections.

Both checks are scoped narrowly to the two body markers Anthropic uses for this specific error; other 400 invalid_request_error cases (genuine client-shape errors) keep the existing fast-fail behaviour.
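
The two hunks can be sketched as follows. This is an illustration under assumptions, not the patched code: `isExtraUsageExhausted`, `isRequestInvalid`, and `effectiveStatus` are hypothetical names, and the real functions in conductor.go work on the project's own error and result types rather than a plain status code and string.

```go
package main

import (
	"net/http"
	"strings"
)

// extraUsageMarkers are the two body substrings named in this PR.
var extraUsageMarkers = []string{
	"out of extra usage",
	"claude.ai/settings/usage",
}

// isExtraUsageExhausted (hypothetical helper) reports whether a 400
// response carries either marker.
func isExtraUsageExhausted(statusCode int, msg string) bool {
	if statusCode != http.StatusBadRequest {
		return false
	}
	for _, m := range extraUsageMarkers {
		if strings.Contains(msg, m) {
			return true
		}
	}
	return false
}

// isRequestInvalid sketches the narrowed classifier: a 400 with
// invalid_request_error is still a fast-fail, unless it is the
// per-credential quota error.
func isRequestInvalid(statusCode int, msg string) bool {
	if statusCode != http.StatusBadRequest {
		return false
	}
	if isExtraUsageExhausted(statusCode, msg) {
		return false // per-credential failure: let rotation proceed
	}
	return strings.Contains(msg, "invalid_request_error")
}

// effectiveStatus sketches the promotion applied in MarkResult: the
// quota 400 is handled as a 429, every other status is unchanged.
func effectiveStatus(statusCode int, msg string) int {
	if isExtraUsageExhausted(statusCode, msg) {
		return http.StatusTooManyRequests
	}
	return statusCode
}
```

Keeping the marker check in one helper means the classifier and the promotion cannot drift apart if a marker is added or changed later.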

Relationship to #2599

#2599 tracks the cause-side workaround — injecting a Claude Code system prompt via payload.default so requests count against included plan usage instead of extra usage, so this 400 is never returned in the first place.

This PR is the effect-side fix — when this 400 is returned (because the cause-side workaround isn't enabled, or the user really has exhausted included usage too), a multi-account pool now degrades gracefully instead of failing the request.

The two fixes are complementary, not exclusive.

Test plan

  • Built locally with three OAuth pool entries.
  • Confirmed previous behaviour: 400 returned to client without rotation (depleted account, single-credential pool — request fails outright).
  • Confirmed patched behaviour: with one depleted + two healthy credentials in pool, requests succeed via rotation. Hermes-style 55KB chat-completion requests that previously failed with 400 now return 200 with valid msg_* IDs.
  • No tests broken — only narrowed an existing classifier; existing 400 invalid_request_error paths unchanged.

Notes

  • This affects OAuth-backed Claude accounts specifically (where the proxy is acting as a token broker for a Claude subscription). API-key-backed access doesn't return this body shape.
  • The cooldown duration on the promoted 429 follows the existing 429 path — nextQuotaCooldown with the auth's current BackoffLevel.

Anthropic upstream returns subscription-quota exhaustion as
HTTP 400 with body:

  {"type":"error","error":{"type":"invalid_request_error",
   "message":"You're out of extra usage. Add more at
   claude.ai/settings/usage and keep going."}}

isRequestInvalidError currently treats every 400 with the
"invalid_request_error" substring as a non-retriable client
shape error. Result: the proxy returns the 400 to the client
without trying any other credential in the auth pool, even
though the failure is per-credential rather than per-request.

This patch:
* isRequestInvalidError returns false for 400 + "out of extra
  usage" body so the retry/rotate path fires.
* MarkResult promotes that same 400 to statusCode 429 so the
  credential gets cooled like a quota-exceeded one rather than
  staying hot for the next request.

Relates to router-for-me#2599 - that issue tracks the cause-side workaround
(payload.default injection so requests count as included usage).
This PR addresses the effect-side rotation gap so a multi-account
pool degrades gracefully when one account hits the wall.

Tested locally with three OAuth pool entries against haiku-4-5
and 55KB chat-completion requests; previously failed with 400,
now succeeds via rotation.
Copilot AI review requested due to automatic review settings May 10, 2026 05:41
@github-actions

This pull request targeted main.

The base branch has been automatically changed to dev.

@github-actions github-actions Bot changed the base branch from main to dev May 10, 2026 05:41
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the error handling logic in sdk/cliproxy/auth/conductor.go to treat specific Anthropic 400 errors related to subscription quota exhaustion as 429 Too Many Requests. This change ensures that affected credentials are cooled down and rotated rather than returning a terminal error to the client. The review feedback suggests using case-insensitive string comparisons when checking error messages to ensure the logic remains robust against potential changes in upstream error formatting.

Comment on lines +2188 to +2192
if statusCode == http.StatusBadRequest && result.Error != nil &&
	(strings.Contains(result.Error.Message, "out of extra usage") ||
		strings.Contains(result.Error.Message, "claude.ai/settings/usage")) {
	statusCode = http.StatusTooManyRequests
}

medium

For robustness against potential changes in upstream error message casing, it is recommended to perform a case-insensitive check. This aligns with the pattern used in isModelSupportErrorMessage elsewhere in this file.

Suggested change
- if statusCode == http.StatusBadRequest && result.Error != nil &&
- 	(strings.Contains(result.Error.Message, "out of extra usage") ||
- 		strings.Contains(result.Error.Message, "claude.ai/settings/usage")) {
- 	statusCode = http.StatusTooManyRequests
- }
+ if statusCode == http.StatusBadRequest && result.Error != nil &&
+ 	(strings.Contains(strings.ToLower(result.Error.Message), "out of extra usage") ||
+ 		strings.Contains(strings.ToLower(result.Error.Message), "claude.ai/settings/usage")) {
+ 	statusCode = http.StatusTooManyRequests
+ }

Comment on lines +2600 to +2603
if strings.Contains(msg, "out of extra usage") ||
	strings.Contains(msg, "claude.ai/settings/usage") {
	return false
}

medium

As with the check in MarkResult, using strings.ToLower here ensures that the rotation logic remains effective even if the human-readable part of the error message from Anthropic changes its casing.

Suggested change
- if strings.Contains(msg, "out of extra usage") ||
- 	strings.Contains(msg, "claude.ai/settings/usage") {
- 	return false
- }
+ if strings.Contains(strings.ToLower(msg), "out of extra usage") ||
+ 	strings.Contains(strings.ToLower(msg), "claude.ai/settings/usage") {
+ 	return false
+ }
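
One way to apply both suggestions without lowercasing the message twice per check is a small shared helper. The sketch below assumes a hypothetical `hasExtraUsageMarker` function; nothing by that name exists in the PR.

```go
package main

import "strings"

// hasExtraUsageMarker (hypothetical helper) lowercases the message once
// and checks it against both markers, so the match is case-insensitive.
func hasExtraUsageMarker(msg string) bool {
	lower := strings.ToLower(msg)
	return strings.Contains(lower, "out of extra usage") ||
		strings.Contains(lower, "claude.ai/settings/usage")
}
```

Both call sites (the classifier and MarkResult) could then share the helper, which also keeps the marker strings defined in one place.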

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 86eaf39abf

Comment on lines +2188 to +2191
if statusCode == http.StatusBadRequest && result.Error != nil &&
	(strings.Contains(result.Error.Message, "out of extra usage") ||
		strings.Contains(result.Error.Message, "claude.ai/settings/usage")) {
	statusCode = http.StatusTooManyRequests

P2: Cool exhausted Anthropic auths globally

When an Anthropic OAuth account hits this subscription-quota error, the status promotion is applied only to the current model's state, inside this result.Model != "" branch. updateAggregatedAvailability keeps the auth selectable while any other model state is available, and isAuthBlockedForModel checks only the requested model. An auth that supports multiple Claude models is therefore cooled only for the model that failed: requests for another model, or model-pool fallbacks on the same auth, can keep selecting the same exhausted credential instead of rotating to another one.

Copilot AI left a comment

Pull request overview

This PR fixes an auth-rotation gap for Anthropic OAuth credential pools when a single credential hits the “out of extra usage” subscription-quota error (returned as HTTP 400). The change ensures this condition is treated as credential-scoped so the manager can rotate to other healthy credentials and cool down the depleted one.

Changes:

  • Exempt Anthropic “out of extra usage” 400 responses from the isRequestInvalidError fast-fail path so rotation/retry can proceed.
  • In MarkResult, treat the same 400 body shape like a 429 for cooldown/quota tracking so depleted credentials are temporarily skipped.

Comment on lines +2185 to +2187
// Anthropic 400 "out of extra usage" is per-credential
// subscription quota; treat like 429 so the auth gets cooled
// rather than returning the error to the client.
Comment on lines +2188 to +2192
if statusCode == http.StatusBadRequest && result.Error != nil &&
	(strings.Contains(result.Error.Message, "out of extra usage") ||
		strings.Contains(result.Error.Message, "claude.ai/settings/usage")) {
	statusCode = http.StatusTooManyRequests
}
Comment on lines +2597 to +2602
// Anthropic OAuth subscription quota exhaustion arrives as 400
// invalid_request_error but is per-credential, not per-request.
// Let it fall through to rotation + cool-down (see MarkResult promotion).
if strings.Contains(msg, "out of extra usage") ||
	strings.Contains(msg, "claude.ai/settings/usage") {
	return false

@luispater luispater left a comment

Summary:

  • Fixes a rotation gap for Anthropic OAuth subscription-quota exhaustion that surfaces as HTTP 400 invalid_request_error with an "out of extra usage" message.
  • Makes this error credential-scoped (eligible for rotation) and cools the depleted auth like a 429 so it won’t be re-selected immediately.

What changed:

  • isRequestInvalidError: returns false for HTTP 400 errors whose message contains out of extra usage or claude.ai/settings/usage, allowing the auth pool to try other credentials.
  • Manager.MarkResult: treats the same 400 body markers as 429 for cooldown/quota bookkeeping.

Non-blocking suggestions:

  • Add a small regression test to ensure this 400 pattern does not short-circuit rotation, and that the auth is marked quota-exceeded/cooldown as intended.
  • Consider making the stored LastError status consistent with the internal 400→429 promotion to ease debugging.
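
A regression test along the lines of the first suggestion could start from a table of cases like the one below. This is a sketch under assumptions: the classifier and the promotion are passed in as plain functions because the real signatures of isRequestInvalidError and Manager.MarkResult are not shown here, and `shapeErrorBody` is an illustrative body, not a captured response. In the repository this would normally live in a `_test.go` file inside a `TestXxx(t *testing.T)` function.

```go
package main

import (
	"fmt"
	"net/http"
)

const (
	// extraUsageBody is the body quoted in the PR description.
	extraUsageBody = `{"type":"error","error":{"type":"invalid_request_error","message":"You're out of extra usage. Add more at claude.ai/settings/usage and keep going."}}`
	// shapeErrorBody is an illustrative client-shape error (assumption).
	shapeErrorBody = `{"type":"error","error":{"type":"invalid_request_error","message":"example client-shape error"}}`
)

type regressionCase struct {
	name        string
	status      int
	body        string
	wantInvalid bool // expected classifier result
	wantStatus  int  // expected status after promotion
}

var regressionCases = []regressionCase{
	{"extra usage exhausted", http.StatusBadRequest, extraUsageBody, false, http.StatusTooManyRequests},
	{"genuine shape error", http.StatusBadRequest, shapeErrorBody, true, http.StatusBadRequest},
}

// regressionFailures runs every case against the supplied classifier and
// promotion functions and returns one message per mismatch.
func regressionFailures(invalid func(int, string) bool, promote func(int, string) int) []string {
	var failures []string
	for _, c := range regressionCases {
		if got := invalid(c.status, c.body); got != c.wantInvalid {
			failures = append(failures, fmt.Sprintf("%s: invalid=%v, want %v", c.name, got, c.wantInvalid))
		}
		if got := promote(c.status, c.body); got != c.wantStatus {
			failures = append(failures, fmt.Sprintf("%s: status=%d, want %d", c.name, got, c.wantStatus))
		}
	}
	return failures
}
```

Run against the pre-patch behaviour (every 400 is invalid, no promotion), the table reports two failures for the quota case. Run against the patched behaviour, it reports none.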

Test plan:

  • go build -o test-output ./cmd/server && rm test-output

This is an automated Codex review result and still requires manual verification by a human reviewer.
