mmm#68
Conversation
| modelId = extractNativeModelId(family, subpath, parsedBody); | ||
| } | ||
|
|
||
| if (!family) { |
There was a problem hiding this comment.
Pull request overview
This PR enhances the /native/:family/* passthrough router—especially /native/auto/*—to better auto-route OpenAI-compatible requests across configured providers (including improved model/subpath heuristics and provider selection), and updates dev test scaffolding to validate the new behavior.
Changes:
- Add auto-routing normalization + provider-native-family resolution (including prioritizing exact model matches during provider selection).
- Expand OpenAI detection heuristics (model IDs + subpaths) and add multipart form-data model extraction for auto routing.
- Export models route helpers for reuse in native routing, and extend dev mock config + SDK test to cover generic auto-routed model IDs.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| apps/api/routes/nativeProviders.ts | Extends auto routing heuristics, provider selection, and adds multipart model extraction; refactors error handling for routing. |
| apps/api/routes/models.ts | Exports applyResponseHeaders and buildModelsPayload for reuse by native routing. |
| apps/api/modules/dataManager.ts | Extends LoadedProviderData typing to include optional provider identity + declared native protocol fields. |
| apps/api/dev/testSetup.ts | Adds an additional OpenAI-compatible mock provider + model to exercise generic auto routing. |
| apps/api/dev/testOpenAISdk.ts | Adds SDK integration coverage for auto-routed generic model IDs via /native/auto/v1. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const rawText = rawBody.toString('latin1'); | ||
| for (const part of rawText.split(`--${boundary}`)) { | ||
| const [rawHeaders, rawValue] = part.split(/\r?\n\r?\n/, 2); | ||
| if (!rawHeaders || !rawValue) continue; | ||
| if (!new RegExp(`name="${fieldName}"`, 'i').test(rawHeaders)) continue; | ||
| return rawValue.replace(/\r?\n--$/, '').trim() || null; | ||
| } |
There was a problem hiding this comment.
extractMultipartFormField() converts the entire multipart body buffer to a latin1 string and then split()s it by boundary. For large multipart requests (audio/images/uploads) this duplicates memory and can create many large intermediate strings/arrays, increasing GC pressure and risking OOM under load.
Consider reusing the existing buffer-based multipart parser (modules/mediaParsing.ts parseMultipartBody) or implementing a lightweight Buffer scan that stops once the 'model' field is found, to avoid allocating a full-body string.
| const rawText = rawBody.toString('latin1'); | |
| for (const part of rawText.split(`--${boundary}`)) { | |
| const [rawHeaders, rawValue] = part.split(/\r?\n\r?\n/, 2); | |
| if (!rawHeaders || !rawValue) continue; | |
| if (!new RegExp(`name="${fieldName}"`, 'i').test(rawHeaders)) continue; | |
| return rawValue.replace(/\r?\n--$/, '').trim() || null; | |
| } | |
| const initialBoundary = Buffer.from(`--${boundary}`, 'latin1'); | |
| const partBoundary = Buffer.from(`\r\n--${boundary}`, 'latin1'); | |
| const headerSeparator = Buffer.from('\r\n\r\n', 'latin1'); | |
| const altHeaderSeparator = Buffer.from('\n\n', 'latin1'); | |
| const fieldNamePattern = new RegExp(`name="${fieldName}"`, 'i'); | |
| let boundaryIndex = rawBody.indexOf(initialBoundary); | |
| while (boundaryIndex !== -1) { | |
| let partStart = | |
| boundaryIndex === 0 | |
| ? boundaryIndex + initialBoundary.length | |
| : boundaryIndex + partBoundary.length; | |
| if ( | |
| rawBody[partStart] === 0x2d && | |
| rawBody[partStart + 1] === 0x2d | |
| ) { | |
| return null; | |
| } | |
| if (rawBody[partStart] === 0x0d && rawBody[partStart + 1] === 0x0a) { | |
| partStart += 2; | |
| } else if (rawBody[partStart] === 0x0a) { | |
| partStart += 1; | |
| } | |
| let headersEnd = rawBody.indexOf(headerSeparator, partStart); | |
| let separatorLength = headerSeparator.length; | |
| if (headersEnd === -1) { | |
| headersEnd = rawBody.indexOf(altHeaderSeparator, partStart); | |
| separatorLength = altHeaderSeparator.length; | |
| } | |
| if (headersEnd === -1) return null; | |
| const rawHeaders = rawBody.subarray(partStart, headersEnd).toString('latin1'); | |
| const valueStart = headersEnd + separatorLength; | |
| const nextBoundaryIndex = rawBody.indexOf(partBoundary, valueStart); | |
| if (nextBoundaryIndex === -1) return null; | |
| if (fieldNamePattern.test(rawHeaders)) { | |
| let valueEnd = nextBoundaryIndex; | |
| if ( | |
| valueEnd >= 2 && | |
| rawBody[valueEnd - 2] === 0x0d && | |
| rawBody[valueEnd - 1] === 0x0a | |
| ) { | |
| valueEnd -= 2; | |
| } else if (valueEnd >= 1 && rawBody[valueEnd - 1] === 0x0a) { | |
| valueEnd -= 1; | |
| } | |
| return rawBody.subarray(valueStart, valueEnd).toString('latin1').trim() || null; | |
| } | |
| boundaryIndex = nextBoundaryIndex; | |
| } |
| function extractRoutingModelId( | ||
| parsedBody: any, | ||
| subpath: string, | ||
| rawBody?: Buffer, | ||
| contentType?: string | ||
| ): string | null { | ||
| if (typeof parsedBody?.model === 'string' && parsedBody.model.trim()) { | ||
| return parsedBody.model.trim(); | ||
| } | ||
|
|
||
| const multipartModelId = | ||
| rawBody && contentType | ||
| ? extractMultipartFormField(rawBody, contentType, 'model') | ||
| : null; | ||
| if (multipartModelId) { | ||
| return multipartModelId; | ||
| } |
There was a problem hiding this comment.
New auto-routing support for multipart/form-data model extraction (extractRoutingModelId → extractMultipartFormField) isn't covered by the existing SDK integration test additions, which currently only exercise JSON bodies. Adding a test that posts multipart/form-data with a 'model' field (e.g. to /native/auto/v1/chat/completions or an /audio/* endpoint) would help prevent regressions in this routing path.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 332d9c0c71
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| /^\/(?:v1\/)?fine_tuning(?:\/|$)/i.test(normalized) || | ||
| /^\/(?:v1\/)?moderations(?:\/|$)/i.test(normalized) || | ||
| /^\/(?:v1\/)?realtime(?:\/|$)/i.test(normalized) || | ||
| /^\/v1\/models(?:\/|$)/i.test(normalized) |
There was a problem hiding this comment.
Don’t force
/v1/models/* auto routes to OpenAI
The new ^/v1/models heuristic hard-codes these paths to the OpenAI family, and the later family-hint filtering in inferNativeFamilyFromModelAndProviders excludes Gemini candidates even when the model exists there. In auto mode this makes Gemini-style POST /v1/models/{id}:generateContent requests resolve to OpenAI and then fail with no provider, whereas model/provider-based inference previously allowed routing to Gemini for these model IDs.
Useful? React with 👍 / 👎.
| const rawText = rawBody.toString('latin1'); | ||
| for (const part of rawText.split(`--${boundary}`)) { |
There was a problem hiding this comment.
Avoid full-string parsing of multipart request bodies
This converts the entire multipart payload (including uploaded binary files) into a string and scans it for the model field. On /native/auto file uploads (audio/images/files), that adds a second full in-memory copy and expensive string processing before forwarding upstream, which can trigger memory-pressure failures for larger uploads because the body is already buffered once by intake.
Useful? React with 👍 / 👎.
No description provided.