feat(access): per-api-key model allowlist with glob matching (v2, review fixes) #2914
mtuckerb wants to merge 13 commits into
Conversation
Adds enforcement for per-client-key model ACLs at the request middleware
layer. The dashboard already stores allowedModels per UserApiKey and
attempts to push them to PUT /v0/management/api-keys, but upstream until
now had nowhere to land them — the structured payload was rejected and
the dashboard's fallback silently dropped the policy data, so no key was
ever actually restricted.
config:
- Add APIKeyPolicy{Key, AllowedModels} and SDKConfig.APIKeyPolicies
- Add SDKConfig.APIKeyDefaultPolicy ("allow-all" | "deny-all"; defaults
to allow-all so existing deployments are unaffected)
- IsModelAllowedForKey supports path.Match globs and tolerates a single
leading "<prefix>/" segment so policies stay portable across prefixed
credentials
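The glob semantics described above can be sketched in a few lines. This is a hedged approximation — the function name and the exact matching precedence are illustrative, not the actual IsModelAllowedForKey implementation:

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isModelAllowed sketches the policy check: each allowed pattern is a
// path.Match glob, and a single leading "<prefix>/" segment on the model
// is tolerated so "gemini-*" still matches "teamA/gemini-3-pro".
func isModelAllowed(model string, allowed []string) bool {
	candidates := []string{model}
	if i := strings.Index(model, "/"); i >= 0 {
		candidates = append(candidates, model[i+1:]) // strip one prefix segment
	}
	for _, pattern := range allowed {
		for _, c := range candidates {
			if ok, err := path.Match(pattern, c); err == nil && ok {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(isModelAllowed("teamA/gemini-3-pro", []string{"gemini-*"})) // true
	fmt.Println(isModelAllowed("gpt-4o", []string{"gemini-*"}))             // false
}
```

Note that path.Match's `*` never crosses a `/`, which is why the prefix segment has to be stripped before matching rather than widened into the pattern.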
management API:
- PutAPIKeys now accepts both the legacy string list and the structured
[{key, allowedModels}] form, with allowedModels / allowed-models /
allowed_models all recognized so existing dashboard payloads work
- GetAPIKeys exposes api-key-policies when any are configured
middleware:
- New ModelACLMiddleware extracts the targeted model from the JSON body
("model" field) or the Gemini-style /v1beta/models/<model>:<action>
path, then enforces the per-key policy. The body is read once and
restored via io.NopCloser so downstream handlers see the original
request bytes
- Wired into both /v1 and /v1beta groups directly after AuthMiddleware
- Bypasses any request that has no identified api key (preserving the
legacy "no auth provider configured => allow" behavior of
AuthMiddleware) and any request shape with no extractable model
(listings, websocket upgrades) so discovery endpoints keep working
tests:
- Unit tests for IsModelAllowedForKey covering allow-all, deny-all,
empty allowed list semantics, exact matches, glob matches, and the
prefixed-credential case
- Middleware tests covering the allow / deny / unknown-key / Gemini
path / list-endpoint / body-pass-through paths
- Parser tests for the management PutAPIKeys payload covering the
legacy plain list, the wrapped items form, and the structured form
in all three case styles
Three changes landed after PR router-for-me#2911 review:

middleware (DoS hardening):
- Cap the request body inspected by ModelACLMiddleware at 10 MiB via io.LimitReader; requests above the cap are rejected with HTTP 413 ("request_too_large") instead of being buffered into memory
- Short-circuit on Content-Length when the client declares an oversized body, so we bail before reading any bytes
- Any non-size read error now fails closed with HTTP 400 rather than silently allowing the request past the policy check
- New tests cover both the streamed-oversize and Content-Length paths

config (O(1) policy lookup + cache invalidation):
- SDKConfig.APIKeyPolicies is now backed by an atomic.Pointer cache (map[string]*APIKeyPolicy) so IsModelAllowedForKey is O(1) per call instead of an O(N) linear scan; the cache is lazily rebuilt via CompareAndSwap and safe under concurrent reads
- Add SetAPIKeyPolicies (atomic replace + invalidate) and InvalidatePolicyIndex (manual flush after in-place edits)
- Tests exercise mutation visibility, nil-safe receivers, concurrent access, and defensive-copy semantics

management handlers (policy rotation/deletion):
- PatchAPIKeys gains a dedicated handler that keeps APIKeyPolicies in sync on rename (via old/new) and rotation (via index/value); when the target key already has a policy, the source row is dropped so there are no duplicate entries
- DeleteAPIKeys removes the matching APIKeyPolicy rows so a revoked credential cannot leave its allowlist behind; the slice is nilled when empty
- PutAPIKeys routes through SetAPIKeyPolicies so the cache is refreshed atomically with the slice swap
- Tests cover rename-with-existing-target, append-new, delete-by-value, delete-by-index, last-row-clears, and PUT-rebuilds-policies

No behavior change for existing deployments that do not configure api-key-policies. All existing tests still pass.
Code Review
This pull request introduces per-API-key model access control (ACL) enforcement. It adds a new ModelACLMiddleware that validates requested models against configured policies for both OpenAI and Gemini-style endpoints. The configuration system is updated to support structured API key policies with glob-based model matching and a lazy-loaded, atomic cache for efficient lookups. Management handlers for API keys are enhanced to keep these policies in sync during creation, rotation, and deletion. Feedback highlights potential data races in the management handlers due to missing synchronization, unsafe slice reuse that could lead to stale cache reads, and memory pressure concerns regarding the buffering of large request bodies in the ACL middleware.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b792cc2c49
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
- config_lists.go: PutAPIKeys, PatchAPIKeys, and DeleteAPIKeys now acquire h.mu before mutating h.cfg and use persistLocked, so the mutation and the subsequent config write happen atomically from the perspective of readers (middleware goroutines). Previously the mutations raced with ModelACLMiddleware evaluating the same fields on concurrent requests. Matches the existing pattern in PutGeminiKeys / PatchGeminiKey. (HIGH review comment)
- config_lists.go: renameAPIKeyPolicy no longer reslices the existing APIKeyPolicies backing array via "[:0]". That was unsafe because SDKConfig.policyIndex caches *APIKeyPolicy into the same backing array; a concurrent reader resolving the cache mid-write would observe a partially mutated entry. Allocate a fresh slice so any still-cached pointers keep pointing at the pre-mutation entries until InvalidatePolicyIndex forces a refresh. (MEDIUM review comment)
- model_acl.go: extractRequestedModel now peeks at the first 16 KiB of the request body before committing to buffering the full cap. In realistic chat-completion payloads "model" is always near the top of the JSON, so the peek alone is usually sufficient and the middleware allocates ~16 KiB per request instead of up to modelACLMaxBodyBytes (10 MiB). When the peek does not contain "model", the middleware falls back to the prior bounded read and still enforces the cap. Under concurrency this is the difference between a bounded constant footprint and N*10 MiB. (MEDIUM review comment)
- model_acl_test.go: two new tests lock in the peek-first behavior — one covers a large body with "model" in the peek (must extract, enforce, and preserve the full downstream body byte-for-byte); the other covers "model" past the peek but within the cap (must still extract via the fallback read).
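The lazily rebuilt atomic policy index can be sketched as below. The type and method names are illustrative, not the real SDKConfig API; the point is the lock-free read path with a CompareAndSwap rebuild:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type APIKeyPolicy struct {
	Key           string
	AllowedModels []string
}

// policyStore sketches the index: readers load the map pointer without
// locks; the first lookup after an invalidation rebuilds it, and only one
// concurrent rebuilder wins the CompareAndSwap.
type policyStore struct {
	policies []APIKeyPolicy
	index    atomic.Pointer[map[string]*APIKeyPolicy]
}

func (s *policyStore) lookup(key string) *APIKeyPolicy {
	idx := s.index.Load()
	if idx == nil {
		m := make(map[string]*APIKeyPolicy, len(s.policies))
		for i := range s.policies {
			m[s.policies[i].Key] = &s.policies[i]
		}
		if s.index.CompareAndSwap(nil, &m) {
			idx = &m
		} else {
			idx = s.index.Load() // a concurrent rebuilder won; use its map
		}
	}
	return (*idx)[key]
}

// invalidate flushes the index after in-place edits to the policy slice.
func (s *policyStore) invalidate() { s.index.Store(nil) }

func main() {
	s := &policyStore{policies: []APIKeyPolicy{{Key: "sk-a", AllowedModels: []string{"gemini-*"}}}}
	fmt.Println(s.lookup("sk-a") != nil) // true
	fmt.Println(s.lookup("sk-b") == nil) // true
}
```

This also motivates the fresh-slice fix above: cached *APIKeyPolicy pointers stay valid for readers until invalidate() forces a rebuild, so the backing array must never be mutated in place.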
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6bbea94c1d
The previous extractor for /v1beta/models/<model>:<action> paths
truncated at the first '/' inside <model>. That broke deployments
using force-model-prefix, where the routed model identifier is
literally "<prefix>/<model>" (e.g. "teamA/gemini-3-pro"). The
GeminiHandler forwards the whole segment-before-":" as the model,
and IsModelAllowedForKey already strips a single leading
"<prefix>/" before glob matching, so the ACL extractor must
mirror that and forward the full identifier including embedded
slashes. Otherwise:
- allowed requests were rejected (the ACL saw "teamA" instead
of "teamA/gemini-3-pro")
- policies scoped to just the prefix segment could unintentionally
allow unrelated models sharing the prefix
Added TestModelACLMiddleware_GeminiPathPreservesPrefix which uses
the same AllowedModels pattern against both an allowed and a
disallowed model under the same prefix — a regression would
surface as a false-positive on either the allow path or the
deny path.
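The fixed extractor can be sketched like this; the function name is illustrative, but the rule matches the prose: keep everything between the models segment and the trailing ":<action>", including embedded slashes.

```go
package main

import (
	"fmt"
	"strings"
)

// modelFromGeminiPath extracts the model from a
// /v1beta/models/<model>:<action> path without truncating at '/'
// inside <model>, so "teamA/gemini-3-pro" survives intact.
func modelFromGeminiPath(p string) (string, bool) {
	const marker = "/v1beta/models/"
	i := strings.Index(p, marker)
	if i < 0 {
		return "", false
	}
	rest := p[i+len(marker):]
	// The action separator is the last ':', not the first '/'.
	if j := strings.LastIndex(rest, ":"); j >= 0 {
		rest = rest[:j]
	}
	if rest == "" {
		return "", false
	}
	return rest, true
}

func main() {
	m, _ := modelFromGeminiPath("/v1beta/models/teamA/gemini-3-pro:generateContent")
	fmt.Println(m) // teamA/gemini-3-pro
}
```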
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8d34ee9200
Two P1s from the Codex reviewer on the prior commit:
1. Websocket ACL bypass
/v1/responses accepts a websocket upgrade, and the model
identifier is only selected later in frames consumed by the
ResponsesWebsocket handler. extractRequestedModel returns
found=false for the upgrade request, and the middleware was
falling through as "no model here, allow", letting a restricted
api key use any model through the websocket path.
Fix: after identifying the key but before attempting body
extraction, detect Upgrade: websocket and reject the upgrade
when keyHasModelRestriction(cfg, apiKey) is true. A key is
considered restricted when either it has an APIKeyPolicy entry
with a non-empty AllowedModels list, or the deployment default
is deny-all and the key has no matching policy. Unrestricted
keys still pass through so the legacy websocket flow keeps
working for them.
2. Unbounded drain after 413
After detecting the body exceeded modelACLMaxBodyBytes, the
code called io.Copy(io.Discard, c.Request.Body), which for a
chunked/streamed request without a trustworthy Content-Length
can hold the handler goroutine indefinitely — an attacker just
keeps streaming bytes and the ACL check turns into a
request-slot exhaustion path.
Fix: drop the drain entirely. Return 413 and let net/http
close the connection. No keep-alive concern since we're
rejecting the request.
Tests added:
- WebsocketUpgradeRejectedForRestrictedKey: key with
AllowedModels attempting WS upgrade gets 403, handler never
runs.
- WebsocketUpgradeAllowedForUnrestrictedKey: key with no policy
under allow-all default still gets through.
- WebsocketUpgradeRejectedUnderDenyAllDefault: deny-all default
rejects WS upgrades for keys with no matching policy.
- OversizedBodyDoesNotDrainRemainder: uses a reader that blocks
after its prefix to prove the middleware returns 413 promptly
(within 2s) instead of reading to EOF.
Two P2s from the Codex reviewer, both rooted in the same legacy
edge case: APIKeys can hold duplicate values, and the policy
helpers (renameAPIKeyPolicy, removeAPIKeyPolicy) are global —
they rewrite or drop the policy row by value, not by index. That
opened two ways for a duplicate to survive a patch/delete without
its intended restriction:
1. PatchAPIKeys old/new path
The loop broke at the first match. If APIKeys was
["sk-dup", "sk-other", "sk-dup"] and the caller did
{"old":"sk-dup","new":"sk-new"}, one slot remained "sk-dup"
while the policy migrated to "sk-new", so the leftover
"sk-dup" fell back to the default policy (usually allow-all).
Fix: remove the break — rename every duplicate so nothing is
left behind that the global policy migration would orphan.
2. PatchAPIKeys index/value path
Similar shape: rewriting one slot while the old value still
appears elsewhere, then moving the policy to the new value,
left the surviving duplicate unrestricted.
Fix: only migrate the policy when the old value no longer
appears anywhere in APIKeys. When duplicates remain, the
policy stays attached to the old value; the new value falls
back to the default until the caller sets an explicit policy
via PUT /api-keys.
3. DeleteAPIKeys indexed path
Same pattern — removed one slot, then dropped the policy row
unconditionally, which orphaned surviving duplicates.
Fix: only drop the policy when no other entries still carry
the removed value.
Added apiKeyValuePresent() helper so the three call sites share
one membership check.
Tests added:
- PatchAPIKeys_RenameViaOldNewRenamesAllDuplicates: three-slot
fixture with two duplicates proves all matching slots are
renamed and the migrated policy still applies.
- PatchAPIKeys_IndexedReplaceLeavesPolicyWhenDuplicateRemains:
two-duplicate fixture proves the surviving slot keeps its
policy (asserts both positively — still allows the permitted
model — and negatively — still rejects models outside the
allowlist).
- DeleteAPIKeys_ByIndexKeepsPolicyWhenDuplicateRemains: same
shape for the delete path.
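The shared membership check is simple enough to sketch directly; the signature is assumed from the prose, not confirmed against the source:

```go
package main

import "fmt"

// apiKeyValuePresent reports whether any APIKeys slot still carries the
// value. Policy rows are only migrated or dropped when this returns false,
// so a surviving duplicate keeps its restriction.
func apiKeyValuePresent(keys []string, value string) bool {
	for _, k := range keys {
		if k == value {
			return true
		}
	}
	return false
}

func main() {
	keys := []string{"sk-dup", "sk-other", "sk-dup"}
	// After rewriting index 0 to "sk-new", the old value still survives,
	// so its policy must stay attached to "sk-dup".
	keys[0] = "sk-new"
	fmt.Println(apiKeyValuePresent(keys, "sk-dup")) // true
}
```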
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e48e63fb12
Codex/ChatGPT reasoning-model upstreams (the gpt-5.x family) emit assistant text only via response.output_text.delta events; the final response.completed event has no output item with content[].type == "output_text". The codex executor's non-streaming path filtered out every line except response.completed and passed it straight to the chat-completions translator, which then surfaced a message with content == null while completion_tokens was non-zero. Clients that read .choices[0].message.content saw the field as null and rendered fallback text such as "Something went wrong." even though the model had generated the requested response.

Fix: walk the SSE body once, accumulate text from response.output_text.delta events, and synthesize a message item containing the accumulated text into response.output[] before passing the event to the translator. The accumulator is a no-op for non-reasoning models that already populate the message item themselves (codexResponseHasOutputText guards against double-injection).

Tests cover codexResponseHasOutputText across the empty/missing/populated shapes, injectCodexOutputText for fresh and pre-populated output arrays, and a representative end-to-end gpt-5.x SSE body to verify the loop accumulates the deltas and produces the expected post-injection event.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
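The delta-accumulation step can be sketched as a single pass over the SSE lines. The event and field names follow the prose above; the function name and the minimal event shape are illustrative:

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// accumulateOutputText walks an SSE body once and concatenates the "delta"
// payloads of response.output_text.delta events, ignoring everything else.
func accumulateOutputText(sse string) string {
	var b strings.Builder
	sc := bufio.NewScanner(strings.NewReader(sse))
	for sc.Scan() {
		line := strings.TrimPrefix(sc.Text(), "data: ")
		var ev struct {
			Type  string `json:"type"`
			Delta string `json:"delta"`
		}
		if json.Unmarshal([]byte(line), &ev) != nil {
			continue // non-JSON keep-alives etc.
		}
		if ev.Type == "response.output_text.delta" {
			b.WriteString(ev.Delta)
		}
	}
	return b.String()
}

func main() {
	sse := `data: {"type":"response.output_text.delta","delta":"Hello"}
data: {"type":"response.output_text.delta","delta":", world"}
data: {"type":"response.completed","response":{"output":[]}}`
	fmt.Println(accumulateOutputText(sse)) // Hello, world
}
```

The real fix then injects this accumulated text into response.output[] of the response.completed event before translation; that part depends on the executor's event shapes and is not reproduced here.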
luispater
left a comment
Summary
Adds per-API-key model allowlisting (glob matching) with management API support for structured allowedModels payloads, and enforces it via a new ModelACLMiddleware on /v1 and /v1beta.
Blocking
- Alias bypass: the middleware is wired on /v1 and /v1beta only. The /backend-api/codex/* alias group also uses AuthMiddleware but is not protected, so restricted keys can bypass allowlists by calling the alias endpoints.
- Body buffering on non-JSON: extractRequestedModel reads and buffers the full request body for any POST/PUT/PATCH to look for "model". When ACLs are enabled, this will also buffer non-JSON/multipart requests like POST /v1/images/edits, which can be large. Suggest gating body reads to JSON content-types (or a small allowlist of JSON routes), and/or extracting model from multipart form fields for image edit requests.
- CI: retarget is failing because a separate PR (#2911) from the same head branch cannot be retargeted to dev (duplicate base/head). This likely needs the duplicate PR cleaned up so the check goes green.
Non-blocking
- GET /v0/management/api-keys returns api-key-policies using allowed-models (kebab-case). If the dashboard expects allowedModels, consider adding an alias or documenting the response shape.
- Minor nit: the test case name "empty plain array fails" expects success (empty clears keys).
Test plan
- CI build ✅
- No local checkout/tests run (review-only via gh as requested).
We actively need this feature. Per-key model allowlisting is exactly what we've been looking for to enforce user-tier access controls (e.g. restricting intern/contractor keys to deepseek-/glm-/qwen* while granting full-time employees broader access). The implementation looks solid — the middleware-based enforcement, glob matching, atomic policy cache, and backward-compatible payload formats all align well with how we'd want to consume this. That said, I'd echo the two blocking items from @luispater's review:
Both seem like small deltas on top of this PR. Hoping we can get these addressed and land this soon.
Three blockers + two nits from luispater's CHANGES_REQUESTED review on PR router-for-me#2914:

server.go (BLOCKING — alias bypass):
- /backend-api/codex/* uses AuthMiddleware but had no ModelACLMiddleware, so a key restricted by api-key-policies could bypass the allowlist by routing through the codex alias instead of /v1/responses. Added ModelACLMiddleware to the codexDirect group.

model_acl.go (BLOCKING — non-JSON body buffering):
- extractRequestedModel previously buffered any POST/PUT/PATCH body up to 10 MiB to look for a JSON "model" field. With ACLs enabled this also buffered non-JSON uploads (image edits, future binary endpoints) even when no JSON model field could exist.
- Dispatch on Content-Type:
  - application/json (and *+json, missing CT) → existing peek+buffer path with the 10 MiB cap.
  - multipart/form-data → ParseMultipartForm with a 1 MiB in-memory cap; large file parts spool to temp disk per the stdlib. This recovers enforcement on /v1/images/edits without ballooning resident memory.
  - everything else → no body inspection, allow through.
- ParseMultipartForm is idempotent, so downstream handlers using c.MultipartForm() / c.PostForm() still see the parsed form.

config (non-blocking nit — response shape):
- APIKeyPolicy.AllowedModels JSON tag: allowed-models → allowedModels to match what the dashboard sends on PUT. The input parser still accepts all three case styles (camel, kebab, snake) for backward compat.

tests (non-blocking nit + new coverage):
- api_keys_payload_test.go: rename "empty plain array fails" → "empty plain array clears keys" (the case asserts wantOK:true).
- model_acl_test.go: 8 new tests:
  - CodexAliasDisallowedModelRejected / AllowedModelPasses
  - MultipartFormDataDisallowedModelRejected / AllowedModelPasses
  - MultipartWithoutModelFieldAllowed
  - NonJSONContentTypeAllowedThrough (10 MiB octet-stream → 200, proving the middleware no longer buffers non-JSON bodies)
  - JSONWithCharsetParameterStillInspected
  - VendorJSONContentTypeInspected (application/vnd.api+json)

go test ./internal/api/... ./internal/config/... is green; the only failing test on this branch (TestCodexStaticModelsIncludeGPT55) also fails on pristine main and is unrelated.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9f805087c8
# Conflicts:
#	internal/runtime/executor/codex_executor.go
Addressed the remaining reviewer-facing gap around model-list UX. Restricted API keys now get policy-filtered model listings for both OpenAI/Claude-shaped /v1/models and Gemini-shaped /v1beta/models, while unrestricted keys retain the full catalog and deny-all/no-policy keys receive an empty filtered list. Also added focused coverage for:
Verification:
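The listing filter described above can be sketched by reusing the same glob semantics as the request check. This is a hedged sketch: the function name and the prefix handling are assumptions mirroring the PR prose, not the actual handler code.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// filterModelList keeps only catalog entries a restricted key's allowlist
// matches; a nil/empty allowlist (deny-all or no-policy under deny-all)
// yields an empty list, and unrestricted keys never reach this path.
func filterModelList(catalog, allowed []string) []string {
	out := make([]string, 0, len(catalog))
	for _, model := range catalog {
		candidate := model
		if i := strings.Index(model, "/"); i >= 0 {
			candidate = model[i+1:] // tolerate one "<prefix>/" segment
		}
		for _, pattern := range allowed {
			okFull, _ := path.Match(pattern, model)
			okBare, _ := path.Match(pattern, candidate)
			if okFull || okBare {
				out = append(out, model)
				break
			}
		}
	}
	return out
}

func main() {
	catalog := []string{"gemini-3-pro", "gpt-4o", "teamA/gemini-3-flash"}
	fmt.Println(filterModelList(catalog, []string{"gemini-*"})) // [gemini-3-pro teamA/gemini-3-flash]
	fmt.Println(filterModelList(catalog, nil))                  // []
}
```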
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1d8305cef2
Two new reviewer-flagged gaps after the prior round:
1. P1 — Content-Type-based bypass of model ACL.
Several model-executing routes (ChatCompletions, Completions, Responses,
ClaudeMessages) parse the request body as JSON regardless of
Content-Type. The previous default-allow branch in
extractRequestedModel let a restricted key send Content-Type: text/plain
with a JSON body containing a disallowed model and bypass enforcement.
Restructure the dispatch so only multipart/form-data takes the dedicated
form-parsing path; every other content type — including text/plain,
application/octet-stream, and any other non-JSON value — runs the same
peek+gjson inspection as application/json. Bodies that genuinely have
no JSON model field (true binary uploads etc.) still pass through, and
the existing oversize cap continues to limit per-request memory.
2. P2 — Duplicate-key silent-skip in PUT /api-keys structured payload.
parseAPIKeysPayload accepted [{key:'sk-x'},{key:'sk-x',allowedModels:[...]}]
and silently kept only the first row, dropping the restriction on the
duplicate. That is a quiet permission widening. Reject duplicate
structured keys with ok=false (HTTP 400 from PutAPIKeys) so callers
notice and fix the payload instead of getting an unintended fall-back
to the default policy.
Tests:
- TestModelACLMiddleware_NonJSONContentTypeWithJSONBodyStillEnforces:
Content-Type: text/plain + JSON body with disallowed model returns 403.
- TestModelACLMiddleware_OctetStreamWithoutModelAllowed: opaque
octet-stream body without a JSON model field still returns 200 from the
middleware.
- TestModelACLMiddleware_NonJSONContentTypeAllowedThrough updated to the
new contract — small opaque body returns 200; the prior 'large opaque
payload returns 200' assertion was the exact bypass this PR closes.
- TestParseAPIKeysPayload_StructuredDuplicateKeyRejected: duplicate
structured rows are rejected.
go test ./... passes.
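The restructured dispatch can be sketched in a few lines; the enum and function names are illustrative, but the contract matches the prose: only multipart/form-data takes the form-parsing path, and every other Content-Type (including empty or unparseable values) gets the peek+JSON inspection.

```go
package main

import (
	"fmt"
	"mime"
)

type inspection int

const (
	inspectJSONBody      inspection = iota // peek + JSON "model" lookup
	inspectMultipartForm                   // bounded ParseMultipartForm
)

// classify dispatches on the parsed media type. mime.ParseMediaType
// lowercases the type and strips parameters like "; charset=utf-8", so
// mixed-case and parameterized values dispatch correctly; parse failures
// fall through to JSON inspection, closing the Content-Type bypass.
func classify(contentType string) inspection {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err == nil && mediaType == "multipart/form-data" {
		return inspectMultipartForm
	}
	return inspectJSONBody
}

func main() {
	fmt.Println(classify("multipart/form-data; boundary=x") == inspectMultipartForm) // true
	fmt.Println(classify("text/plain") == inspectJSONBody)                           // true
	fmt.Println(classify("") == inspectJSONBody)                                     // true
	fmt.Println(classify("MULTIPART/FORM-DATA; boundary=x") == inspectMultipartForm) // true
}
```

A true binary body simply fails the JSON peek and passes through, so fail-open is limited to payloads that genuinely carry no JSON model field.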
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9369170b9f
Addressed the two new bot-flagged gaps from the latest pass (commit 9369170).
P1 — Content-Type bypass (codex, model_acl.go):
P2 — Silent duplicate-key dedupe in PUT /api-keys (codex, config_lists.go):
Tests:
Verification:
Round-4 fixes pushed at commit
P1 — Content-Type bypass of model ACL (model_acl.go)
P2 — Silent duplicate-key dedupe in PUT /api-keys
Tests
Verification:
Red-team review of commit 9369170 probed five edge cases the round-4 unit tests did not already cover. All pass against the current implementation; pinning them down so a future regression in either the Content-Type dispatch or the duplicate-key parser surfaces as a test failure rather than a silent ACL bypass.

model_acl_test.go: EmptyContentTypeWithDisallowedJSONStillEnforces, UnparseableContentTypeStillEnforces, MixedCaseContentTypeStillDispatches.
api_keys_payload_test.go: StructuredWrappedDuplicateKeyRejected, StructuredDuplicateKeyAfterTrimRejected.

go test ./internal/api/... ./internal/api/handlers/management/... passes.
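The duplicate rejection those parser tests exercise can be sketched as follows. The row type and function name are illustrative; the behavior follows the prose: a structured payload repeating the same key (even after trimming) is refused outright rather than silently keeping the first row.

```go
package main

import (
	"fmt"
	"strings"
)

type apiKeyRow struct {
	Key           string
	AllowedModels []string
}

// parsePolicies trims keys, skips empties, and rejects the whole payload
// (ok=false, surfaced as HTTP 400 by the handler) on any duplicate key,
// so a dropped restriction can never widen permissions silently.
func parsePolicies(rows []apiKeyRow) ([]apiKeyRow, bool) {
	seen := make(map[string]struct{}, len(rows))
	out := make([]apiKeyRow, 0, len(rows))
	for _, row := range rows {
		key := strings.TrimSpace(row.Key)
		if key == "" {
			continue
		}
		if _, dup := seen[key]; dup {
			return nil, false // duplicate structured key => reject payload
		}
		seen[key] = struct{}{}
		row.Key = key
		out = append(out, row)
	}
	return out, true
}

func main() {
	_, ok := parsePolicies([]apiKeyRow{
		{Key: "sk-x"},
		{Key: "sk-x ", AllowedModels: []string{"gpt-*"}}, // dup after trim
	})
	fmt.Println(ok) // false
}
```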
Red-team self-review of
Coverage added:
Pre-existing concerns I noticed that are out of scope for this PR — flagging for awareness:
Tests pass against the new commit; PR checks are green.
* fix(executor): fix OAuth extra usage detection by Anthropic API
  Three changes to avoid Anthropic's content-based system prompt validation:
  1. Fix identity prefix: use "You are Claude Code, Anthropic's official CLI for Claude." instead of the SDK agent prefix, matching real Claude Code.
  2. Move user system instructions to the user message: only the billing header + identity prefix stay in the system[] array. User system instructions are prepended to the first user message as <system-reminder> blocks.
  3. Enable cch signing for OAuth tokens by default: the xxHash64 cch integrity check was previously gated behind the experimentalCCHSigning config flag. Now automatically enabled when using OAuth tokens.
  Related: router-for-me#2599
* fix(executor): inject full Claude Code system prompt blocks with proper cache scopes
  The previous fix only injected the billing header + agent identifier (2 blocks). Anthropic's updated detection now validates system prompt content depth: block count (needs 4-6 blocks, not 2), cache control scopes (org for agent, global for core prompt), and the presence of known Claude Code instruction sections.
  Changes:
  - Add claude_system_prompt.go with extracted Claude Code v2.1.63 system prompt sections (intro, system instructions, doing tasks, tone & style, output efficiency)
  - Rewrite checkSystemInstructionsWithSigningMode to build 5 system blocks:
    [0] billing header (no cache_control)
    [1] agent identifier (cache_control: ephemeral, scope=org)
    [2] core intro prompt (cache_control: ephemeral, scope=global)
    [3] system instructions (no cache_control)
    [4] doing tasks (no cache_control)
  - Third-party client system instructions are still moved to the first user message
  Follow-up to 69b950d
* fix: use sjson to build system blocks, avoiding raw newlines in JSON
  The previous commit used fmt.Sprintf with %s to insert multi-line string constants into JSON strings. Go raw string literals contain actual newline bytes, which produce invalid JSON (control characters in string values). Replace with a buildTextBlock() helper that uses sjson.SetBytes to properly escape text content for JSON serialization.
* fix: buildTextBlock cache_control sjson path issue
  sjson treats 'cache_control.type' as a nested path, creating {ephemeral: {scope: org}} instead of {type: ephemeral, scope: org}. Pass the whole map to sjson.SetBytes as a single value.
* fix: build cache_control JSON manually to avoid sjson map marshaling
* fix: remove invalid org scope and match Claude Code block layout
* fix(claude): sanitize forwarded third-party prompts for OAuth cloaking
  Only for Claude OAuth requests, sanitize forwarded system-prompt context before it is prepended to the first user message. This preserves neutral task/tool instructions while removing OpenCode branding, docs links, environment banners, and product-specific workflow sections that still triggered Anthropic extra-usage classification after top-level system[] cloaking.
* fix(claude): remove invalid cache_control scope from static system block
  Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
  Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
* fix(claude): reduce forwarded OAuth prompt to minimal tool reminder
* fix(auth, executor): normalize Qwen base URL, adjust RefreshLead duration, and add tests
* fix(claude): remap OAuth tool names to Claude Code style to avoid third-party fingerprint detection
  A/B testing confirmed that Anthropic uses tool-name fingerprinting to detect third-party clients on OAuth traffic. OpenCode-style lowercase names like 'bash', 'read', 'todowrite' trigger extra-usage billing, while Claude Code TitleCase names like 'Bash', 'Read', 'TodoWrite' pass through normally.
  Changes:
  - Add oauthToolRenameMap: maps lowercase tool names to Claude Code equivalents
  - Add oauthToolsToRemove: removes 'question' and 'skill' (no Claude Code counterpart)
  - remapOAuthToolNames: renames tools, removes blacklisted ones, updates tool_choice and messages
  - reverseRemapOAuthToolNames / reverseRemapOAuthToolNamesFromStreamLine: reverse the mapping for responses
  - Applied in Execute(), ExecuteStream(), and CountTokens() for OAuth token requests
* fix(auth): preserve and restore ready-view cursors during index rebuilds
* fix(handlers): add base URL validation and improve API key deletion tests
* feat(antigravity): configurable signature cache with bypass-mode validation
  Antigravity's Claude thinking-signature handling gains dual cache/bypass modes, with bypass-mode signature validation implemented per SIGNATURE-CHANNEL-SPEC.md.
  New antigravity-signature-cache-enabled config option (default true):
  - cache mode (true): use the server-side cached signature; behavior identical to the original logic
  - bypass mode (false): use the client-supplied signature directly, after validation and normalization
  Hot reload is supported, so the mode can be switched at runtime.
  Validation flow:
  1. Strip the historical cache-mode 'modelGroup#' prefix (e.g. claude#Exxxx → Exxxx)
  2. The first character must be 'E' (single-layer encoding) or 'R' (double-layer encoding); otherwise reject
  3. 'R' prefix: base64-decode → the inner value must start with 'E' → continue with single-layer validation
  4. 'E' prefix: base64-decode → the first byte must be 0x12 (the Claude protobuf marker)
  5. All valid signatures are normalized to the 'R' form (double base64) before being sent to the Antigravity backend
  Invalid-signature policy:
  - Non-strict mode (default): the translator silently drops thinking blocks without a signature
  - Strict mode (antigravity-signature-bypass-strict: true): the executor returns HTTP 400 before the request is sent upstream
  Parses the full protobuf structure of Claude signatures per SIGNATURE-CHANNEL-SPEC.md:
  - Top-level Field 2 (container) → Field 1 (channel block)
  - Channel block extraction: channel_id (Field 1), infrastructure (Field 2), model_text (Field 6), field7 (Field 7)
  - Computes routing_class, infrastructure_class, schema_features
  - Uses google.golang.org/protobuf/encoding/protowire for parsing
  - resolveThinkingSignature split into resolveCacheModeSignature / resolveBypassModeSignature
  - hasResolvedThinkingSignature: mode-aware signature validity check (cache: len>=50 via HasValidSignature; bypass: non-empty)
  - validateAntigravityRequestSignatures: executor pre-check that rejects invalid signatures with 400 only in bypass + strict mode
  - Response-side signature caching stays integrated with cache mode
  - Cache-mode behavior is fully preserved: native signatures without a '#' prefix are silently dropped
* docs(antigravity): document signature validation spec alignment
  Add a package-level comment documenting the protobuf tree structure, base64 encoding equivalence proof, output dimensions, and spec section references. Remove unreachable legacy_vertex_group dead code.
* fix(antigravity): refine 429 handling and credits fallback
  Includes: restore SDK docs under docs/; update antigravity executor credits tests; gofmt.
* fix(claude): preserve OAuth tool renames when filtering tools
* fix(claude): map question/skill to TitleCase instead of removing them
* fix(claude): address PR review feedback for OAuth cloaking
  - Use buildTextBlock for the billing header to avoid raw JSON string interpolation
  - Fix the empty-array edge case in prependToFirstUserMessage
  - Allow remapOAuthToolNames to process messages even without a tools array
  - Move claude_system_prompt.go to helps/ per repo convention
  - Export prompt constants (ClaudeCode* prefix) for cross-package access
* fix(handlers): update listener to bind on all interfaces instead of localhost
  Fixed: router-for-me#2640
* feat(antigravity): prefer prod URL as first priority
  Promote cloudcode-pa.googleapis.com to the first position in the fallback order, with the daily and sandbox URLs as fallbacks.
* fix(executor): implement immediate retry with token refresh on 429 for Qwen and add associated tests
  Closes: router-for-me#2661
* fix(executor): handle OAuth tool name remapping with rename detection and add tests
  Closes: router-for-me#2656
* docs: refine CLIproxyAPI Quota Inspector description in all README locales
* docs: fix CLIProxyAPI Quota Inspector naming and link casing
* refactor(executor): remove immediate retry with token refresh on 429 for Qwen and update tests accordingly
* docs: clarify codex quota window wording in README locales
* fix(executor): handle 429 Retry-After header and default retry logic for quota exhaustion
  - Added proper parsing of `Retry-After` headers for 429 responses.
  - Set a default retry duration when "disable cooling" is active on quota exhaustion.
  - Updated tests to verify `Retry-After` handling and the default behavior.
* fix(antigravity): allow 32 MB bypass signatures
  Raise the local bypass-signature ceiling so long Claude thinking signatures are not rejected before request translation, and keep the oversized-signature test cheap to execute.
* fix(antigravity): reduce bypass-mode log noise
  Keep cache-disable visibility at info level while suppressing duplicate state-change logs and moving strict-mode chatter down to debug.
* feat(auth): implement auto-refresh loop for managing the auth token schedule
  - Introduced authAutoRefreshLoop to handle token refresh scheduling.
  - Replaced the semaphore-based refresh logic in Manager with the new loop.
  - Added unit tests to verify the refresh schedule logic and edge cases.
* fix(antigravity): drop redacted thinking blocks with empty text
  Antigravity wraps empty thinking text in a prompt-caching-scope object that omits the required inner "thinking" field, causing 400 "messages.N.content.0.thinking.thinking: Field required" when Claude Max requests are routed through Antigravity in bypass mode.
* fix(antigravity): skip full schema cleanup for empty tool requests
  Avoid whole-payload schema sanitization when translated Antigravity requests have no actual tool schemas, including missing and empty tools arrays. Add regression coverage so image-heavy no-tool requests keep bypassing the old memory-amplification path.
* refactor(auth): simplify auth directory scanning and improve JSON processing logic
  - Replaced filepath.Walk with os.ReadDir for cleaner directory traversal.
  - Fixed the isAuthJSON check to use filepath.Dir for the directory comparison.
  - Updated auth hash cache generation and file synthesis to improve readability and maintainability.
* feat(auth): add configurable worker pool size for the auto-refresh loop
  - Introduced the auth-auto-refresh-workers config option to override the default concurrency.
  - Updated authAutoRefreshLoop to support customizable worker counts.
  - Aligned the worker pool with runtime configuration for more flexible token refresh scheduling.
* fix(antigravity): strip thinking blocks with empty signatures instead of rejecting
  Thinking blocks with empty signatures come from proxy-generated responses (Antigravity/Gemini routed as Claude). These should be silently dropped from the request payload before forwarding, not rejected with 400. Fixes 10 "missing thinking signature" errors.
* fix(antigravity): discard thinking blocks with non-Claude-format signatures
  Proxy-generated thinking blocks may carry hex hashes or other non-Claude signatures (e.g. "d5cb9cd0823142109f451861") from Gemini responses. These are now discarded alongside empty-signature blocks during the strip phase, before validation runs. Valid Claude signatures always start with 'E' or 'R' (after stripping any cache prefix).
* fix(antigravity): use an E-prefixed fake signature in the strict bypass test
  The strict bypass test used testGeminiSignaturePayload(), which produces a base64 string starting with 'C'. Since StripInvalidSignatureThinkingBlocks now strips all non-E/R signatures unconditionally, the test payload was stripped before reaching ValidateClaudeBypassSignatures, causing the test to pass the request through instead of rejecting it with 400.
Replace with testFakeClaudeSignature() which produces a base64 string starting with 'E' (valid at the lightweight check) but with invalid protobuf content (no valid field 2), so strict mode correctly rejects it at the deep validation layer. * fix(antigravity): cap maxOutputTokens using registry max_completion_tokens Claude models on antigravity have a 64000 token output limit but max_tokens from downstream requests was passed through uncapped, causing 400 INVALID_ARGUMENT from Google when clients sent 128000. * chore: remove Qwen support from SDK and internal components - Deleted `QwenAuthenticator`, internal `qwen_auth`, and `qwen_executor` implementations. - Removed all Qwen-related OAuth flows, token handling, and execution logic. - Cleaned up dependencies and references to Qwen across the codebase. * chore(models): remove outdated GPT-5 and related model entries from registry JSON * feat(session-affinity): add session-sticky routing for multi-account load balancing When multiple auth credentials are configured, requests from the same session are now routed to the same credential, improving upstream prompt cache hit rates and maintaining context continuity. Core components: - SessionAffinitySelector: wraps RoundRobin/FillFirst selectors with session-to-auth binding; automatic failover when bound auth is unavailable, re-binding via the fallback selector for even distribution - SessionCache: TTL-based in-memory cache with background cleanup goroutine, supporting per-session and per-auth invalidation - StoppableSelector interface: lifecycle hook for selectors holding resources, called during Manager.StopAutoRefresh() Session ID extraction priority (extractSessionIDs): 1. metadata.user_id with Claude Code session format (old user_{hash}_session_{uuid} and new JSON {session_id} format) 2. X-Session-ID header (generic client support) 3. metadata.user_id (non-Claude format, used as-is) 4. conversation_id field 5. 
Stable FNV hash from system prompt + first user/assistant messages (fallback for clients with no explicit session ID); returns both a full hash (primaryID) and a short hash without assistant content (fallbackID) to inherit bindings from the first turn Multi-format message hash covers OpenAI messages, Claude system array, Gemini contents/systemInstruction, and OpenAI Responses API input items (including inline messages with role but no type field). Configuration (config.yaml routing section): - session-affinity: bool (default false) - session-affinity-ttl: duration string (default "1h") - claude-code-session-affinity: bool (deprecated, alias for above) All three fields trigger selector rebuild on config hot reload. Side effect: Idempotency-Key header is no longer auto-generated with a random UUID when absent — only forwarded when explicitly provided by the client, to avoid polluting session hash extraction. * fix(antigravity): strip billing header from system instruction before upstream call The x-anthropic-billing-header block in the Claude system array is client-internal metadata and should not be forwarded to the Gemini upstream as part of systemInstruction.parts. * fix(docker-build): improve argument handling and error messaging for usage option * fix(util): forward custom Host header to upstream Custom headers configured under openai-compatibility (and any other provider passing through applyCustomHeaders) were silently dropped for the Host key, because Go's net/http reads the wire Host from req.Host, not req.Header["Host"]. As a result, virtual-host routed upstreams (e.g. LiteLLM behind an ingress) saw the base-url's host instead of the user-configured override and returned 404. Detect the Host key with http.CanonicalHeaderKey and assign it to req.Host so it is actually written on the wire. Other headers continue to use Header.Set as before. 
Fixes router-for-me#2833 * fix(handlers): include execution session metadata and skip idempotency key when absent - Refactored `requestExecutionMetadata` to handle empty `Idempotency-Key` gracefully. - Added test to validate metadata inclusion of execution session without idempotency key. * feat(auth): add proxy URL override support to auth constructors and executors - Introduced `WithProxyURL` variants for `CodexAuth`, `ClaudeAuth`, `IFlowAuth`, and `DeviceFlowClient`. - Updated executors to use proxy-aware constructors for improved configurability. - Added unit tests to validate proxy override precedence and functionality. Closes: router-for-me#2823 * chore: remove iFlow-related modules and dependencies - Deleted `iflow` provider implementation, including thinking configuration (`apply.go`) and authentication modules. - Removed iFlow-specific tests, executors, and helpers across SDK and internal components. - Updated all references to exclude iFlow functionality. * feat(models): add Claude Opus 4.7 model entry to registry JSON * fix(tests): update model lookup references and enhance Claude executor tests * fix(tests): update Gemini family test case numbers for consistency * fix(util): also keep Host in header map for synthetic requests Addressing the P1 note from the Codex reviewer: applyCustomHeaders is also called with a synthetic &http.Request{Header: ...} from the websockets executors (aistudio_executor.go, codex_websockets_executor.go), which forward only the header map. The previous continue meant a custom Host was dropped from that map, regressing virtual-host overrides on those flows. Mirror the value to both r.Host (for real net/http) and r.Header (for header-map-only consumers). 
* feat(api): integrate auth index into key retrieval endpoints for Gemini, Claude, Codex, OpenAI, and Vertex * fix(management): stabilize auth-index mapping * fix(tests): remove obsolete config_auth_index_test file * feat(translator): add partial and full image generation support in Codex-GPT and Codex-Gemini flows - Introduced `LastImageHashByItemID` in Codex-GPT and `LastImageHashByID` in Codex-Gemini for deduplication of generated images. - Added support for handling `partial_image` and `image_generation_call` types, with inline data embedding for Gemini and URL payload conversion for GPT. - Extended unit tests to verify image handling in both streaming and non-streaming modes. * fix(executor): drop obsolete context-1m-2025-08-07 beta header (fixes router-for-me#2866) Anthropic has moved the 1M-context-window feature to General Availability, so the context-1m-2025-08-07 beta flag is no longer accepted and now causes 400 Bad Request errors when forwarded upstream. Remove the X-CPA-CLAUDE-1M detection and the corresponding injection of the now-invalid beta header. Also drop the unused net/textproto import that was only needed for the header-key lookup. * feat(docs): add VisionCoder sponsorship details and optimize external links - Added VisionCoder sponsorship information to `README.md`, `README_CN.md`, and `README_JA.md`. - Updated external links to include `target="_blank"` for improved user experience. - Added new logo asset `visioncoder.png` for README use. * feat(access): per-api-key model allowlist with glob matching Adds enforcement for per-client-key model ACLs at the request middleware layer. The dashboard already stores allowedModels per UserApiKey and attempts to push them to PUT /v0/management/api-keys, but upstream until now had nowhere to land them — the structured payload was rejected and the dashboard's fallback silently dropped the policy data, so no key was ever actually restricted. 
config:
- Add APIKeyPolicy{Key, AllowedModels} and SDKConfig.APIKeyPolicies
- Add SDKConfig.APIKeyDefaultPolicy ("allow-all" | "deny-all"; defaults
  to allow-all so existing deployments are unaffected)
- IsModelAllowedForKey supports path.Match globs and tolerates a single
  leading "<prefix>/" segment so policies stay portable across prefixed
  credentials

management API:
- PutAPIKeys now accepts both the legacy string list and the structured
  [{key, allowedModels}] form, with allowedModels / allowed-models /
  allowed_models all recognized so existing dashboard payloads work
- GetAPIKeys exposes api-key-policies when any are configured

middleware:
- New ModelACLMiddleware extracts the targeted model from the JSON body
  ("model" field) or the Gemini-style /v1beta/models/<model>:<action>
  path, then enforces the per-key policy. The body is read once and
  restored via io.NopCloser so downstream handlers see the original
  request bytes
- Wired into both /v1 and /v1beta groups directly after AuthMiddleware
- Bypasses any request that has no identified api key (preserving the
  legacy "no auth provider configured => allow" behavior of
  AuthMiddleware) and any request shape with no extractable model
  (listings, websocket upgrades) so discovery endpoints keep working

tests:
- Unit tests for IsModelAllowedForKey covering allow-all, deny-all,
  empty allowed list semantics, exact matches, glob matches, and the
  prefixed-credential case
- Middleware tests covering the allow / deny / unknown-key / Gemini
  path / list-endpoint / body-pass-through paths
- Parser tests for the management PutAPIKeys payload covering the
  legacy plain list, the wrapped items form, and the structured form
  in all three case styles

* fix(access): address review feedback on per-key model ACL

Three changes landed after PR router-for-me#2911 review:

middleware (DoS hardening):
- Cap request body inspected by ModelACLMiddleware at 10 MiB via
  io.LimitReader; requests above the cap are rejected with HTTP 413 
("request_too_large") instead of being buffered into memory - Short-circuit on Content-Length when the client declares an oversized body, so we bail before reading any bytes - Any non-size read error now fails closed with HTTP 400 rather than silently allowing the request past the policy check - New tests cover both the streamed-oversize and Content-Length paths config (O(1) policy lookup + cache invalidation): - SDKConfig.APIKeyPolicies is now backed by an atomic.Pointer cache (map[string]*APIKeyPolicy) so IsModelAllowedForKey is O(1) per call instead of O(N) linear scan; the cache is lazily rebuilt via CompareAndSwap and safe under concurrent reads - Add SetAPIKeyPolicies (atomic replace + invalidate) and InvalidatePolicyIndex (manual flush after in-place edits) - Tests exercise mutation visibility, nil-safe receivers, concurrent access, and defensive-copy semantics management handlers (policy rotation/deletion): - PatchAPIKeys gains a dedicated handler that keeps APIKeyPolicies in sync on rename (via old/new) and rotation (via index/value); when the target key already has a policy the source row is dropped so there are no duplicate entries - DeleteAPIKeys removes the matching APIKeyPolicy rows so a revoked credential cannot leave its allowlist behind; slice is nilled when empty - PutAPIKeys routes through SetAPIKeyPolicies so the cache is refreshed atomically with the slice swap - Tests cover rename-with-existing-target, append-new, delete-by-value, delete-by-index, last-row-clears, and PUT-rebuilds-policies No behavior change for existing deployments that do not configure api-key-policies. All existing tests still pass. * fix(access): address gemini-code-assist review round 2 - config_lists.go: PutAPIKeys, PatchAPIKeys, and DeleteAPIKeys now acquire h.mu before mutating h.cfg and use persistLocked so the mutation and the subsequent config write happen atomically from the perspective of readers (middleware goroutines). 
Previously the mutations raced with ModelACLMiddleware evaluating the same fields on concurrent requests. Matches the existing pattern in PutGeminiKeys / PatchGeminiKey. (HIGH review comment) - config_lists.go: renameAPIKeyPolicy no longer reslices the existing APIKeyPolicies backing array via "[:0]". That was unsafe because SDKConfig.policyIndex caches *APIKeyPolicy into the same backing array; a concurrent reader resolving the cache mid-write would observe a partially-mutated entry. Allocate a fresh slice so any still-cached pointers keep pointing at the pre-mutation entries until InvalidatePolicyIndex forces a refresh. (MEDIUM review comment) - model_acl.go: extractRequestedModel now peeks the first 16 KiB of the request body before committing to buffering the full cap. In realistic chat-completion payloads "model" is always near the top of the JSON, so the peek alone is usually sufficient and the middleware allocates ~16 KiB per request instead of up to modelACLMaxBodyBytes (10 MiB). When the peek does not contain "model", the middleware falls back to the prior bounded read and still enforces the cap. Under concurrency this is the difference between a bounded constant footprint and N*10 MiB. (MEDIUM review comment) - model_acl_test.go: two new tests lock in peek-first behavior — one covers a large body with "model" in the peek (must extract, enforce, and preserve the full downstream body byte-for-byte), the other covers "model" past the peek but within the cap (must still extract via the fallback read). * fix(access): preserve full Gemini model path in ACL extraction The previous extractor for /v1beta/models/<model>:<action> paths truncated at the first '/' inside <model>. That broke deployments using force-model-prefix, where the routed model identifier is literally "<prefix>/<model>" (e.g. "teamA/gemini-3-pro"). 
The GeminiHandler forwards the whole segment-before-":" as the model, and IsModelAllowedForKey already strips a single leading "<prefix>/" before glob matching, so the ACL extractor must mirror that and forward the full identifier including embedded slashes. Otherwise: - allowed requests were rejected (the ACL saw "teamA" instead of "teamA/gemini-3-pro") - policies scoped to just the prefix segment could unintentionally allow unrelated models sharing the prefix Added TestModelACLMiddleware_GeminiPathPreservesPrefix which uses the same AllowedModels pattern against both an allowed and a disallowed model under the same prefix — a regression would surface as a false-positive on either the allow path or the deny path. * fix(access): address codex review round 3 — WS bypass + drain DoS Two P1s from the Codex reviewer on the prior commit: 1. Websocket ACL bypass /v1/responses accepts a websocket upgrade, and the model identifier is only selected later in frames consumed by the ResponsesWebsocket handler. extractRequestedModel returns found=false for the upgrade request, and the middleware was falling through as "no model here, allow", letting a restricted api key use any model through the websocket path. Fix: after identifying the key but before attempting body extraction, detect Upgrade: websocket and reject the upgrade when keyHasModelRestriction(cfg, apiKey) is true. A key is considered restricted when either it has an APIKeyPolicy entry with a non-empty AllowedModels list, or the deployment default is deny-all and the key has no matching policy. Unrestricted keys still pass through so the legacy websocket flow keeps working for them. 2. 
Unbounded drain after 413 After detecting the body exceeded modelACLMaxBodyBytes, the code called io.Copy(io.Discard, c.Request.Body), which for a chunked/streamed request without a trustworthy Content-Length can hold the handler goroutine indefinitely — an attacker just keeps streaming bytes and the ACL check turns into a request-slot exhaustion path. Fix: drop the drain entirely. Return 413 and let net/http close the connection. No keep-alive concern since we're rejecting the request. Tests added: - WebsocketUpgradeRejectedForRestrictedKey: key with AllowedModels attempting WS upgrade gets 403, handler never runs. - WebsocketUpgradeAllowedForUnrestrictedKey: key with no policy under allow-all default still gets through. - WebsocketUpgradeRejectedUnderDenyAllDefault: deny-all default rejects WS upgrades for keys with no matching policy. - OversizedBodyDoesNotDrainRemainder: uses a reader that blocks after its prefix to prove the middleware returns 413 promptly (within 2s) instead of reading to EOF. * fix(access): keep policies consistent across duplicate key values Two P2s from the Codex reviewer, both rooted in the same legacy edge case: APIKeys can hold duplicate values, and the policy helpers (renameAPIKeyPolicy, removeAPIKeyPolicy) are global — they rewrite or drop the policy row by value, not by index. That opened two ways for a duplicate to survive a patch/delete without its intended restriction: 1. PatchAPIKeys old/new path The loop broke at the first match. If APIKeys was ["sk-dup", "sk-other", "sk-dup"] and the caller did {"old":"sk-dup","new":"sk-new"}, one slot remained "sk-dup" while the policy migrated to "sk-new", so the leftover "sk-dup" fell back to the default policy (usually allow-all). Fix: remove the break — rename every duplicate so nothing is left behind that the global policy migration would orphan. 2. 
PatchAPIKeys index/value path Similar shape: rewriting one slot while the old value still appears elsewhere, then moving the policy to the new value, left the surviving duplicate unrestricted. Fix: only migrate the policy when the old value no longer appears anywhere in APIKeys. When duplicates remain, the policy stays attached to the old value; the new value falls back to the default until the caller sets an explicit policy via PUT /api-keys. 3. DeleteAPIKeys indexed path Same pattern — removed one slot, then dropped the policy row unconditionally, which orphaned surviving duplicates. Fix: only drop the policy when no other entries still carry the removed value. Added apiKeyValuePresent() helper so the three call sites share one membership check. Tests added: - PatchAPIKeys_RenameViaOldNewRenamesAllDuplicates: three-slot fixture with two duplicates proves all matching slots are renamed and the migrated policy still applies. - PatchAPIKeys_IndexedReplaceLeavesPolicyWhenDuplicateRemains: two-duplicate fixture proves the surviving slot keeps its policy (asserts both positively — still allows the permitted model — and negatively — still rejects models outside the allowlist). - DeleteAPIKeys_ByIndexKeepsPolicyWhenDuplicateRemains: same shape for the delete path. * feat(auth): add refresh backoff for ineffective token updates - Introduced `refreshIneffectiveBackoff` to prevent tight-looping in auto-refresh when token refresh fails to update expiry. - Adjusted refresh logic to apply backoff when `shouldRefresh` evaluates true. Closes: router-for-me#2830 * fix(executor): recover gpt-5.x text content for non-streaming requests Codex/ChatGPT reasoning-model upstreams (the gpt-5.x family) emit assistant text only via response.output_text.delta events; the final response.completed event has no output item with content[].type == "output_text". 
The codex executor's non-streaming path filtered out every line except response.completed and passed it straight to the chat-completions translator, which then surfaced a message with content == null while completion_tokens was non-zero. Clients that read .choices[0].message.content saw the field as null and rendered fallback text such as "Something went wrong." even though the model had generated the requested response. Fix: walk the SSE body once, accumulate text from response.output_text.delta events, and synthesize a message item containing the accumulated text into response.output[] before passing the event to the translator. The accumulator is a no-op for non-reasoning models that already populate the message item themselves (codexResponseHasOutputText guards against double-injection). Tests cover codexResponseHasOutputText across the empty/missing/populated shapes, injectCodexOutputText for fresh and pre-populated output arrays, and a representative end-to-end gpt-5.x SSE body to verify the loop accumulates the deltas and produces the expected post-injection event. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(codex): backfill streaming response output * perf(codex): avoid repeated output patch writes * feat(api): add support for `HEAD` requests to `/healthz` endpoint - Refactored `/healthz` handler to support `HEAD` requests alongside `GET`. - Updated tests to include validation for `HEAD` requests with expected status and empty body. Closes: router-for-me#2929 * feat(models): add Kimi K2.6 model entry to registry JSON * feat(models): add hardcoded GPT-Image-2 model support in Codex - Added `GPT-Image-2` as a built-in model to avoid dependency on remote updates for Codex. - Updated model tier functions (`CodexFree`, `CodexTeam`, etc.) to include built-in models via `WithCodexBuiltins`. - Introduced new handlers for image generation and edit operations under `OpenAIAPIHandler`. 
- Extended tests to validate 503 response for unsupported image model requests. * fix(handlers): remove handling of unsupported `n` parameter in OpenAI image handlers * fix(handlers): remove references to unsupported `n` parameter in OpenAI image handlers * feat(codex): enable image generation for all Codex upstream requests Codex CLI gates the built-in image_generation tool behind AuthMode::Chatgpt (OAuth only). When clients connect via API key auth through CPA, the tool is absent from requests, making image generation unavailable through the reverse proxy. Changes: 1. Inject image_generation tool (codex_executor.go): Add ensureImageGenerationTool() that appends {"type":"image_generation","output_format":"png"} to the tools array if not already present. Applied to all three execution paths: Execute, executeCompact, and ExecuteStream. 2. Route aliases for Codex CLI direct access (server.go): Add /backend-api/codex/responses routes that map to the same OpenAI Responses API handlers as /v1/responses. This allows Codex CLI to connect via chatgpt_base_url config while keeping AuthMode::Chatgpt, which enables the built-in image_generation tool on the client side. 3. Unit tests (codex_executor_imagegen_test.go): Cover no-tools, existing tools, already-present, empty array, and mixed built-in tool scenarios. * feat(antigravity): conductor-level credits fallback for Claude models Move credits handling from executor-level retry to conductor-level orchestration. When all free-tier auths are exhausted (429/503), the conductor discovers auths with available Google One AI credits and retries with enabledCreditTypes injected via context flag. 
Key changes: - Add AntigravityCreditsHint system for tracking per-auth credits state - Conductor tries credits fallback after all auths fail (Execute/Stream/Count) - Executor injects enabledCreditTypes only when conductor sets context flag - Credits fallback respects provider scope (requires antigravity in providers) - Add context cancellation check in credits fallback to avoid wasted requests - Remove executor-level attemptCreditsFallback and preferCredits machinery - Restructure 429 decision logic (parse details first, keyword fallback) - Expand shouldAbort to cover INVALID_ARGUMENT/FAILED_PRECONDITION/500+UNKNOWN - Support human-readable retry delay parsing (e.g. "1h43m56s") * feat: support extracting X-Amp-Thread-Id header as session id for session affinity * fix(antigravity): remove credits fallback from CountTokens, fix gofmt CountTokens upstream API does not support enabledCreditTypes, so remove the dead credits fallback path from ExecuteCount and delete the unused tryAntigravityCreditsExecuteCount method. Fix gofmt on credits test file. * fix: forward HTTP headers to executor Options so session affinity can read X-Amp-Thread-Id * fix(antigravity): respect pinned auth in credits fallback, release deferred body on success - findAllAntigravityCreditsCandidateAuths now filters by PinnedAuthMetadataKey to prevent credential isolation violations during credits fallback - Release deferredBody reference on success path to avoid holding large payloads in memory for the lifetime of the gin context * refactor(logging): strip unrelated deferred body changes, keep credits-only logging Remove deferred body optimization and maxErrorLog constants that were unrelated to credits fallback. Keep only MarkCreditsUsed/CreditsUsed helpers for flagging requests that consumed AI credits. * fix(auth): break credits cold-start deadlock by keeping unknown-hint auths as fallback candidates Replace antigravityCreditsAvailableForModel with inline known/unknown split. 
Auths whose credit hints are not yet populated are kept as lower-priority candidates instead of being rejected, breaking the chicken-and-egg deadlock at cold start. * perf(antigravity): async credits hint refresh for warm tokens * feat(logging): add AI API path support for image routes - Included `/v1/images` in AI API path prefixes. - Introduced tests to validate `/v1/images/generations` and `/v1/images/edits` as AI API paths. * feat(models): add GPT-5.5 model entry to registry JSON * Add GPT-5.5 Codex model support * chore(models): remove GPT-5.5 model entry from registry JSON * feat(codex): pass base model to enable conditional image_generation tool injection - Modified `ensureImageGenerationTool` to accept `baseModel` for conditional logic. - Ensured `gpt-5.3-codex-spark` models bypass image_generation tool injection. - Updated relevant tests and executor logic to reflect changes. * fix antigravity credits stream fallback * fix(codex): classify known upstream failures Normalize Codex context, thinking-signature, previous-response, and auth failures to explicit error codes: context_too_large, thinking_signature_invalid, previous_response_not_found, auth_unavailable. Refs router-for-me#2596. * feat(auth): disallow free-tier Codex auth during selection process - Introduced `disallowFreeAuthFromMetadata` and `isFreeCodexAuth` to enforce skipping free-tier credentials. - Modified scheduler logic to honor `DisallowFreeAuthMetadataKey` during auth selection. - Updated `ensureImageGenerationTool` to skip tool injection for free-tier Codex auth. - Added context utility `WithDisallowFreeAuth` and integrated with image handlers. - Augmented relevant tests to cover free-tier exclusion scenarios. 
* Add CPA Usage Keeper to README ecosystem list
* docs: Add CPA Usage Keeper to README ecosystem list
* feat(api): implement protocol multiplexer and Redis queue for usage integration
  - Added `protocol_multiplexer.go`, enabling support for both HTTP and Redis protocols on a single listener.
  - Introduced `redis_queue_protocol.go` to handle Redis-compatible RESP commands for queue management.
  - Integrated the `redisqueue` package, supporting in-memory queuing with expiration pruning.
  - Updated server initialization to manage a shared listener and multiplex connections.
  - Adjusted `Handler` to adopt `AuthenticateManagementKey` for modular key validation, supporting both HTTP and Redis flows.
* feat(security): implement IP ban for repeated management-key and Redis AUTH failures
  - Added IP-ban logic to `AuthenticateManagementKey` and the Redis protocol handlers, blocking requests after multiple failed attempts.
  - Introduced unit tests to validate IP-ban behavior for localhost and remote clients.
  - Synchronized the Redis protocol's authentication policy with management-key validation.
* feat(models): add Codex Auto Review model entry to registry JSON
  Closes: router-for-me#2995
* feat(api): enhance model assignment logic in image handlers
  - Updated `buildImagesResponsesRequest` to derive `model` dynamically based on `toolJSON`.
  - Adjusted streaming execution to handle dynamic model resolution across multiple contexts.
  Closes: router-for-me#2965
* fix(test): remove free tier from GPT-5.5 inclusion test
  GPT-5.5 was correctly removed from the codex-free tier in 7b89583 (free accounts cannot access it), but the test was not updated to reflect this, so TestCodexStaticModelsIncludeGPT55 failed on the free subtest.
  Changes:
  - Remove the free tier from the GPT-5.5 inclusion test.
  - Add TestCodexFreeModelsExcludeGPT55 to explicitly verify that the free tier does NOT include GPT-5.5.
* feat(config): add support for disabling OpenAI compatibility providers
  - Introduced a `Disabled` flag in OpenAI compatibility configurations.
  - Updated routing, auth selection, and API handling logic to respect the `Disabled` state.
  - Extended the relevant APIs, YAML configurations, and data structures to include the `Disabled` field.
  - Adjusted all relevant loops and filters to skip disabled providers.
  Closes: router-for-me#3060 router-for-me#3059 router-for-me#2977
* feat(executor): add support for Codex image-generation tool usage tracking
  - Introduced `publishCodexImageToolUsage` to report image-generation tool metrics.
  - Updated executor logic to handle image-generation tool events and defaults.
  - Added parsing logic for `image_gen` tool-usage details in `helps/usage_helpers.go`.
  - Updated `UsageReporter` for additional model-specific usage publishing.
  - Refactored usage-detail normalization.
  Closes: router-for-me#3063
* fix(usage_helpers): skip zero-token usage in additional model records
  - Added `buildAdditionalModelRecord` to filter out zero-token usage details.
  - Introduced the `hasNonZeroTokenUsage` helper for token-usage validation.
  - Updated tests to cover zero- and non-zero-token usage scenarios.
* feat(codex): handle thinking-signature conversion for reasoning content
  - Implemented `appendReasoningContent` to process the `thinking` signature and text as reasoning input.
  - Added test cases to validate reasoning-content conversion with and without text.
* fix(codex): include `content` field in reasoning item initialization
* Preserve Codex reasoning signatures for Claude
* fix(antigravity): use real antigravity UA when polling credits balance
  The loadCodeAssist polling call hardcoded the User-Agent to google-api-nodejs-client/9.15.1. Google Cloud Code returns the paidTier object WITHOUT the availableCredits array for that UA, so updateAntigravityCreditsBalance always saw "no credits", set the hint to Available=false for every Google One AI Ultra account, and the conductor-level credits fallback could never find a candidate. Switching to resolveUserAgent(auth) (the same UA used for streamGenerateContent / generateContent) makes the response include availableCredits, so the credits hint is populated correctly and the fallback can actually inject enabledCreditTypes:["GOOGLE_ONE_AI"] when the free tier is exhausted.
* test(api): add validation for unsupported models in OpenAI image handlers
  - Introduced tests to ensure unsupported models are rejected in `/images/generations` and `/images/edits`.
  - Added `isSupportedImagesModel` and `rejectUnsupportedImagesModel` for consistent model validation.
  - Enhanced image-handler logic to apply model-compatibility checks.
* fix antigravity user agent handling
* fix antigravity client agent headers
* fix(access): address luispater review — alias bypass + multipart gating
  Three blockers + two nits from luispater's CHANGES_REQUESTED review on PR router-for-me#2914:
  server.go (BLOCKING — alias bypass):
  - /backend-api/codex/* uses AuthMiddleware but had no ModelACLMiddleware, so a key restricted by api-key-policies could bypass the allowlist by routing through the codex alias instead of /v1/responses. Added ModelACLMiddleware to the codexDirect group.
  model_acl.go (BLOCKING — non-JSON body buffering):
  - extractRequestedModel previously buffered any POST/PUT/PATCH body up to 10 MiB to look for a JSON "model" field. With ACLs enabled this also buffered non-JSON uploads (image edits, future binary endpoints) even when no JSON model field could exist.
  - Dispatch on Content-Type:
    - application/json (and *+json, missing CT) → existing peek+buffer path with the 10 MiB cap.
    - multipart/form-data → ParseMultipartForm with a 1 MiB in-memory cap; large file parts spool to temp disk per stdlib. This recovers enforcement on /v1/images/edits without ballooning resident memory.
    - everything else → no body inspection; allow through.
  - ParseMultipartForm is idempotent, so downstream handlers using c.MultipartForm() / c.PostForm() still see the parsed form.
  config (non-blocking nit — response shape):
  - APIKeyPolicy.AllowedModels JSON tag: allowed-models → allowedModels, to match what the dashboard sends on PUT. The input parser still accepts all three case styles (camel, kebab, snake) for backward compat.
  tests (non-blocking nit + new coverage):
  - api_keys_payload_test.go: rename "empty plain array fails" → "empty plain array clears keys" (the case asserts wantOK:true).
  - model_acl_test.go: 8 new tests:
    - CodexAliasDisallowedModelRejected / AllowedModelPasses
    - MultipartFormDataDisallowedModelRejected / AllowedModelPasses
    - MultipartWithoutModelFieldAllowed
    - NonJSONContentTypeAllowedThrough (10 MiB octet-stream → 200, proving the middleware no longer buffers non-JSON bodies)
    - JSONWithCharsetParameterStillInspected
    - VendorJSONContentTypeInspected (application/vnd.api+json)
  go test ./internal/api/... ./internal/config/... is green; the only failing test on this branch (TestCodexStaticModelsIncludeGPT55) also fails on pristine main and is unrelated.
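The Content-Type dispatch described in the fix(access) commit can be sketched roughly as follows. This is an illustration of the classification rule only; `bodyInspection` and its return labels are hypothetical names, not the upstream `extractRequestedModel` code, which also performs the actual buffered read or `ParseMultipartForm` call.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// bodyInspection classifies a request's Content-Type into the three
// dispatch paths: JSON-like bodies get the peek+buffer path, multipart
// forms get ParseMultipartForm, and everything else is passed through
// without any body inspection.
func bodyInspection(contentType string) string {
	mt, _, err := mime.ParseMediaType(contentType)
	if err != nil || mt == "" {
		// A missing or unparseable Content-Type falls back to the
		// JSON peek path, matching the "missing CT" case above.
		return "json-peek"
	}
	switch {
	case mt == "application/json" || strings.HasSuffix(mt, "+json"):
		return "json-peek" // buffered read, subject to the 10 MiB cap
	case mt == "multipart/form-data":
		return "multipart" // ParseMultipartForm, 1 MiB in-memory cap
	default:
		return "skip" // no body inspection; request allowed through
	}
}

func main() {
	for _, ct := range []string{
		"application/json; charset=utf-8",
		"application/vnd.api+json",
		"multipart/form-data; boundary=xyz",
		"application/octet-stream",
		"",
	} {
		fmt.Printf("%q -> %s\n", ct, bodyInspection(ct))
	}
}
```

Using `mime.ParseMediaType` rather than a raw string compare is what makes the charset-parameter and vendor-JSON test cases (JSONWithCharsetParameterStillInspected, VendorJSONContentTypeInspected) fall out naturally.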
* fix(access): filter model listings by API key policy
* Fix TUI hourly usage fallback

---------

Co-authored-by: wykk-12138 <z710959836@gmail.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Co-authored-by: Luis Pater <webmaster@idotorg.org>
Co-authored-by: sususu98 <suchangshan@foxmail.com>
Co-authored-by: ZTXBOSS666 <150424052+ZTXBOSS666@users.noreply.github.com>
Co-authored-by: Chén Mù <10558748+hkfires@users.noreply.github.com>
Co-authored-by: Allen Yi <yiwenyou_allen@outlook.com>
Co-authored-by: sususu98 <33882693+sususu98@users.noreply.github.com>
Co-authored-by: muzhi1991 <2101044+muzhi1991@users.noreply.github.com>
Co-authored-by: Supra4E8C <tan13318527080@163.com>
Co-authored-by: Supra4E8C <69194597+LTbinglingfeng@users.noreply.github.com>
Co-authored-by: octo-patch <octo-patch@users.noreply.github.com>
Co-authored-by: Lyra <lyra@tuckerbradford.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: stringer07 <1742292793@qq.com>
Co-authored-by: MoYeRanQianZhi <MoYeRanQianZhi@gmail.com>
Co-authored-by: XYenon <i@xyenon.bid>
Co-authored-by: Ben Vargas <ben@vargas.com>
Co-authored-by: Matthias319 <Matthias319@users.noreply.github.com>
Co-authored-by: Will <wangxking@outlook.com>
Co-authored-by: philipbankier <philipbankier@gmail.com>
Co-authored-by: xbang <xbang0222@gmail.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e0e8d1f90b
ready for another review!
@luispater anything else you need from me on this?
Quick status update: I re-checked the latest inline review state and found three newer codex comments after my prior red-team note that I hadn't yet accounted for.
I'm treating those as the remaining blockers and will push another fix/test pass before asking for re-review again.
Summary
Re-submission of #2911 against `dev` (auto-retarget CI closed the original).
Includes the three fixes requested in the review.
The dashboard already stores `allowedModels` per UserApiKey and tries to push them to `PUT /v0/management/api-keys`, but upstream had nowhere to land them — so no key was ever actually restricted. This PR wires enforcement at the request-middleware layer.
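The enforcement semantics can be sketched as a small predicate: patterns are `path.Match` globs, and one leading "<prefix>/" segment on the requested model is tolerated so policies stay portable across prefixed credentials. Note this is a hedged illustration, not the upstream code: the real helper is `IsModelAllowedForKey` in the config package, and the empty-list fallback to the default policy shown here is an assumption about its behavior.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isModelAllowed reports whether a requested model matches any glob in
// the key's allowlist. With no patterns configured, the global default
// policy decides ("allow-all" unless explicitly "deny-all").
func isModelAllowed(model string, allowed []string, defaultPolicy string) bool {
	if len(allowed) == 0 {
		return defaultPolicy != "deny-all"
	}
	// Try the model as-is, and with a single leading "<prefix>/"
	// segment stripped, since path.Match's '*' never crosses '/'.
	candidates := []string{model}
	if i := strings.IndexByte(model, '/'); i >= 0 {
		candidates = append(candidates, model[i+1:])
	}
	for _, pat := range allowed {
		for _, m := range candidates {
			if ok, _ := path.Match(pat, m); ok {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(isModelAllowed("gpt-4o", []string{"gpt-4*"}, "allow-all"))        // exact glob hit
	fmt.Println(isModelAllowed("openai/gpt-4o", []string{"gpt-4*"}, "allow-all")) // prefixed credential
	fmt.Println(isModelAllowed("claude-3-opus", []string{"gpt-4*"}, "allow-all")) // not in allowlist
	fmt.Println(isModelAllowed("anything", nil, "deny-all"))                      // default policy applies
}
```

The middleware then only has to extract the model (JSON body or Gemini-style path) and 403 when this predicate returns false.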
Feature (original commit)
Review fixes (second commit)
Test plan
Replaces #2911.