fix: prevent OOM when loading multi-file OpenAPI specs #5
The document loader's cross-file $ref inlining created an exponentially growing document structure. Each inlined schema triggered a recursive walk that re-inlined all transitively referenced schemas under new pointer paths, causing the document to balloon until the process ran out of memory.

This was triggered in practice by OpenAPI specs with ~70 shared schemas in a models.yaml file referenced from multiple endpoints.

The fix:
- Track already-inlined external refs by canonical key to avoid re-inlining the same schema multiple times
- Deep-clone external schema nodes before inserting into the root document to prevent mutations to the cached source file
- Walk the cloned subtree directly instead of walking from a pointer path within the root document
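The three steps of the fix can be sketched as follows. This is a minimal, synchronous illustration and not the loader's actual code: `inlineExternalRefs`, `getPointer`, `ensureObject`, the `files` cache shape, and the choice of the last pointer segment as the component name are all assumptions made for the example.

```typescript
// Illustrative sketch only: every name below is hypothetical, not the loader's real API.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isObject(value: Json | undefined): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Resolve a JSON pointer such as "/components/schemas/Foo" inside one document.
function getPointer(doc: JsonObject, pointer: string): Json | undefined {
  let current: Json | undefined = doc;
  for (const raw of pointer.split("/").slice(1)) {
    if (!isObject(current)) return undefined;
    current = current[raw.replace(/~1/g, "/").replace(/~0/g, "~")];
  }
  return current;
}

// Return parent[key] if it is an object, otherwise create it.
function ensureObject(parent: JsonObject, key: string): JsonObject {
  const existing = parent[key];
  if (isObject(existing)) return existing;
  const created: JsonObject = {};
  parent[key] = created;
  return created;
}

function inlineExternalRefs(
  root: JsonObject,
  rootFile: string,
  files: Record<string, JsonObject>, // cached, already-parsed source files
): JsonObject {
  const inlinedRefs = new Set<string>(); // canonical "filename#pointer" keys
  const schemas = ensureObject(ensureObject(root, "components"), "schemas");

  const walk = (node: Json | undefined, currentFile: string): void => {
    if (Array.isArray(node)) {
      for (const item of node) walk(item, currentFile);
      return;
    }
    if (!isObject(node)) return;

    const ref = node["$ref"];
    if (typeof ref !== "string") {
      for (const key of Object.keys(node)) walk(node[key], currentFile);
      return;
    }

    const hashIndex = ref.indexOf("#");
    const file = hashIndex === -1 ? ref : ref.slice(0, hashIndex);
    const pointer = hashIndex === -1 ? "" : ref.slice(hashIndex + 1);
    const targetFile = file === "" ? currentFile : file;
    if (targetFile === rootFile) return; // already local to the root document

    // Assumption: the last pointer segment is a unique component name.
    const name = pointer.split("/").pop() ?? "";
    if (name === "") throw new Error(`Cannot name inlined schema for ${ref}`);

    const canonicalKey = `${targetFile}#${pointer}`;
    node["$ref"] = `#/components/schemas/${name}`;
    if (inlinedRefs.has(canonicalKey)) return; // copied and walked once already
    inlinedRefs.add(canonicalKey);

    const sourceDoc = files[targetFile];
    const source = sourceDoc === undefined ? undefined : getPointer(sourceDoc, pointer);
    if (source === undefined) throw new Error(`Unresolved $ref: ${canonicalKey}`);

    // Deep-clone so later rewrites never touch the cached source file.
    const cloned: Json = JSON.parse(JSON.stringify(source));
    schemas[name] = cloned;
    walk(cloned, targetFile); // walk only the cloned subtree
  };

  walk(root, rootFile);
  return root;
}
```

Because the canonical key is recorded before the clone is walked, mutually referencing schemas terminate: the second visit only rewrites the `$ref` and returns.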
Cursor Bugbot has reviewed your changes and found 1 potential issue.
    jsonPointer.set(doc, newRef.pointer, cloned);

    await AsyncJsonWalker.walk(cloned, walkFn, newRef.pointer);
    }
Deduplication skips inlining when target path differs
Low Severity
node.$ref = newRef.$ is set unconditionally (line 202), but jsonPointer.set(doc, newRef.pointer, cloned) only runs for the first encounter of a given canonicalKey. Since newRef depends on componentKeyForPointer(ptr), if the same external schema is referenced from contexts producing different component keys (e.g., one resolving to schemas, another to parameters), the second occurrence's $ref will point to a path that was never populated in doc, creating a dangling reference.
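One possible guard against the dangling reference described above is to deduplicate on the pair of canonical key and destination pointer rather than on the canonical key alone. The sketch below is not part of this PR: `RewrittenRef` and `markInlined` are hypothetical names, and it assumes destinations are flat paths such as /components/schemas/Foo, so the number of copies stays bounded.

```typescript
// Hypothetical shape: the review comment only implies a rewritten ref string and a target pointer.
interface RewrittenRef {
  ref: string; // e.g. "#/components/schemas/Foo"
  pointer: string; // e.g. "/components/schemas/Foo"
}

// Returns true the first time a (source schema, destination) pair is seen,
// so the caller knows it still has to populate that destination.
function markInlined(
  inlined: Set<string>,
  canonicalKey: string,
  target: RewrittenRef,
): boolean {
  const key = `${canonicalKey} -> ${target.pointer}`;
  if (inlined.has(key)) return false;
  inlined.add(key);
  return true;
}
```

With this key, the same external schema reached once as a schema and once as a parameter is copied to both destinations, while repeated references to the same destination are still skipped.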


Summary
- $ref references (e.g. models.yaml#/components/schemas/Foo)

Root cause
When the loader encounters a cross-file $ref, it copies the referenced schema into the root document at #/components/schemas/<title> and recursively walks the copy to resolve nested refs. However:
- The walked array tracked pointer paths in the root document, not canonical external schema identities. The same external schema was inlined and walked once per unique pointer path where it was referenced.
- refNode was inserted by reference (not cloned), so the cached external document was mutated by subsequent walks.
- AsyncJsonWalker.walk(doc, walkFn, newRef.pointer) could discover sibling schemas added by previous inlines, causing cascading re-processing.

With ~70 shared schemas referenced from ~35 endpoints, this created nested components/schemas trees (e.g. /components/schemas/CompanyStatus/components/schemas/ApproverList/components/schemas/...) growing without bound until OOM at ~4GB.

Fix
- An inlinedRefs Set tracks filename#pointer pairs that have already been inlined, so each external schema is copied and walked exactly once.
- JSON.parse(JSON.stringify(refNode)) deep-clones the node before inserting, preventing mutation of the cached source document.
- The cloned subtree is walked directly instead of walking from a pointer path within doc.

Testing
- load.spec.ts with 4 tests covering multi-file $ref resolution
- Fixtures (models.yaml + multi-file-api.yaml) exercising cross-file refs from multiple endpoints

Note
Medium Risk
Touches core OpenAPI document-loading/inlining logic; mistakes could break $ref rewriting or schema placement for multi-file specs, though changes are scoped and covered by new tests.

Overview
Prevents runaway growth/OOM when loading multi-file OpenAPI specs by deduplicating cross-file $ref inlining and avoiding repeated reinsertion of the same external schema.

load() now tracks already-inlined canonical refs, deep-clones referenced nodes before inserting into the root document, and walks only the cloned subtree; new fixtures and load.spec.ts tests assert schemas are inlined once, $refs are rewritten locally, and no nested components trees are produced.

Written by Cursor Bugbot for commit 9bb3be9. This will update automatically on new commits.
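The "no nested components trees" assertion could be backed by a small helper along these lines. This is a sketch, not the code in load.spec.ts: `findNestedComponents` is a hypothetical name, and the check is a heuristic that flags any object keyed "components" which itself contains a "schemas" key somewhere under /components/schemas.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Collect pointers to any "components" object that appears inside /components/schemas.
function findNestedComponents(doc: { [key: string]: Json }): string[] {
  const found: string[] = [];

  const visit = (node: Json | undefined, pointer: string): void => {
    if (typeof node !== "object" || node === null) return;
    if (Array.isArray(node)) {
      node.forEach((item, index) => visit(item, `${pointer}/${index}`));
      return;
    }
    for (const key of Object.keys(node)) {
      const child = node[key];
      const childPointer = `${pointer}/${key}`;
      // Heuristic: an object keyed "components" that itself holds "schemas".
      if (
        key === "components" &&
        typeof child === "object" &&
        child !== null &&
        !Array.isArray(child) &&
        "schemas" in child
      ) {
        found.push(childPointer);
      }
      visit(child, childPointer);
    }
  };

  const components = doc["components"];
  if (typeof components !== "object" || components === null || Array.isArray(components)) {
    return found;
  }
  visit(components["schemas"], "/components/schemas");
  return found;
}
```

A test can then assert that the returned array is empty for the loaded document; a non-empty result lists the pointer of each nested tree.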