fix(idl-v2): cycle/duplicate guards and FFI hardening#1345
fix(idl-v2): cycle/duplicate guards and FFI hardening#1345
Conversation
Tooling that emits v2 IDL from another source (codegen, schema-first generators, IDL linters) needs deterministic interface_ids up front. Today the only path is round-tripping through parse_idl with placeholder @0x0000... and copy-pasting the hex from the validation error one service at a time. Add compute_interface_ids in sails-idl-parser-v2 and a matching compute_interface_ids_to_json C ABI export in sails-idl-parser-wasm. Both run the full validation pipeline but clear explicit ids on non-@partial services first, so placeholder mismatches no longer error out. @partial services pass through verbatim — explicit id is preserved, missing id surfaces the same error as parse_idl. Closes part of #1319. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SailsIdlParser.computeInterfaceIds(idl) returns a Record<string, string> mapping each service name to its canonical interface_id. Wraps the new WASM export added in the previous commit. Refactor the WASM call plumbing into a private invoke<T>() helper shared with parse() so both entry points use the same memory-fill / error-decode / cleanup path. Tests cover: not-initialized error, basic shape, placeholder mismatches ignored, cross-check that computeInterfaceIds() agrees with parse() on the same IDL (regression pin against algorithm drift), @partial pass-through with explicit id, @partial without id errors, whitespace-only IDL returns {}, parse error propagation. Update docs/idl-v2-spec.md to clarify that @<interface_id> is optional on non-@partial services and point tooling at computeInterfaceIds. Closes #1319. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex review found six reachable bugs in the parser, four pre-existing
and exposed by the new compute_interface_ids API. All cause WASM traps
or panics that surface in JS as RuntimeError.
Rust:
- post_process: detect cyclic `extends` graphs (self-extends and
longer cycles A→B→A previously stack-overflowed in compute_service_id).
Track a `visiting` set and return a Validation error.
- post_process: reject duplicate service names at index-build time
instead of letting the BTreeMap silently collapse them and trapping
later via `expect(interface_id must be set)`.
- idl-parser-wasm: replace `String::from_utf8_unchecked` with safe
`str::from_utf8`. The previous code was UB on any non-UTF-8 input
from a raw FFI caller.
- idl-parser-wasm: sanitize NUL bytes before `CString::new` in
create_parse_result. Validation errors can quote user input
(e.g. `@entry_id: 1\0`), which previously panicked the Wasm module.
- idl-parser-wasm: accept zero-length input with a non-null pointer so
the FFI matches the safe-Rust API (compute_interface_ids("") → {}).
JS:
- SailsIdlParser.computeInterfaceIds returns an Object.create(null)
record. Service names like `__proto__`, `constructor`, or `toString`
are valid IDL idents (per the Pest grammar's ASCII_ALPHA | "_"),
and would otherwise pollute downstream dictionary consumers.
Tests: 3 new Rust tests and 5 new JS tests cover each bug.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request introduces a new computeInterfaceIds API to the Sails IDL parser, enabling deterministic computation of service identifiers. Key changes include updates to the IDL specification to support an optional @interface_id suffix, implementation of the new logic in both Rust and TypeScript, and enhanced validation to detect duplicate service definitions and cyclic extension graphs. Review feedback highlights a potential fragility in manual JSON construction within the WASM FFI and suggests an optimization to reduce redundant string allocations in the Rust post-processing phase.
| // Service idents are ASCII per the IDL grammar; no escaping needed. | ||
| json.push_str(name); |
There was a problem hiding this comment.
Manual JSON construction without escaping assumes that service identifiers are strictly restricted to characters that do not require JSON escaping (like double quotes or backslashes). While the IDL grammar likely enforces this, it's a fragile assumption for an FFI boundary. If the grammar is ever expanded to allow more characters in identifiers, this will produce invalid JSON.
There was a problem hiding this comment.
Good catch — switched to serde_json::to_string over a BTreeMap<&str, String> in 54e7055. serde_json is already in the WASM blob via sails-idl-ast's serde feature (used by parse_idl_to_json), so the ~13KB cost is for the new monomorphization, not a new dep. Future-proof against any grammar evolution.
| let name = s.name.name.to_string(); | ||
| if service_index.insert(name.clone(), idx).is_some() { |
There was a problem hiding this comment.
The service name is being cloned twice here: once via to_string() and then again via clone() for the insert call. Since s.name.name is already a String, you can avoid one allocation by cloning it once and reusing it for both the index insertion and the potential error message.
| let name = s.name.name.to_string(); | |
| if service_index.insert(name.clone(), idx).is_some() { | |
| let name = s.name.name.clone(); | |
| if service_index.insert(name.clone(), idx).is_some() { |
There was a problem hiding this comment.
Fixed in the same commit. Used the borrow form in format!() so we keep the single clone BTreeMap::insert actually needs and drop the redundant one.
Two suggestions from the review bot, both legitimate: - compute_interface_ids_to_json: replace the hand-rolled JSON map emitter with serde_json::to_string. Service idents are ASCII-safe per the current Pest grammar, but serde_json is already in the WASM blob (sails-idl-ast pulls it in via the `serde` feature for parse_idl_to_json), so using it here removes the "what if grammar changes" footgun for a ~13KB blob increase. - ServiceInterfaceId::new: the duplicate-name guard cloned the service name twice — once via to_string() for the index key, and again via .clone() into the error message. Use the borrow in format! and keep the single clone BTreeMap::insert needs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI's stable rustfmt (1.95.0) wanted CString::new placed on one line, not split with the `.expect(...)` on its own line. Local toolchain had formatted the original layout as acceptable but stable disagreed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
vobradovich
left a comment
There was a problem hiding this comment.
Remove computeInterfaceIds
parse_idl_to_json already returns the post-processed AST JSON with computed interface_ids: parse_idl runs validate_and_post_process, and the JS parse() path normalizes those ids into InterfaceId values. So SailsIdlParser.computeInterfaceIds() mostly duplicates existing surface area. Its only semantic difference is that it ignores explicit non-@partial placeholder/mismatched ids, which also bypasses the normal parse_idl validation behavior. I’d remove the new exported WASM function and JS method, and have callers read parse(idl).services[*].interface_id instead.
IDL text that already contains placeholder or stale explicit ids
parse_idl already computes the canonical id when the suffix is absent. So tooling can do:
- generate or load IDL,
- remove explicit service ids from non-
@partialservice declarations, - call
parse_idl/parse(), - read
service.interface_idfrom the returned AST, - optionally render the IDL again with the computed ids filled in.
The only exception is @partial: those must keep an explicit id, because their subset cannot safely recompute the original service id from the partial signature.
Keep cycle detection
The extends cycle detection is still worth keeping. I checked the merge base: old ServiceInterfaceId::compute_service_id only looked in computed and then recursively computed bases, so service A { extends { A } } and A -> B -> A would recurse indefinitely. There was no previous cyclic / visiting validation in the parser path.
Per @vobradovich's review: parse_idl already returns canonical interface_ids in the AST when callers omit explicit @0x... suffixes on non-@partial services, so the new computeInterfaceIds API mostly duplicates surface area while bypassing the standard validation mismatch. Tooling that needs ids deterministically can: 1. strip explicit @0x... from non-@partial service decls, 2. call parse_idl, 3. read service.interface_id from the returned AST. Remove: - compute_interface_ids in sails-idl-parser-v2 - compute_interface_ids_to_json WASM FFI export - SailsIdlParser.computeInterfaceIds() and its IIdlParser entry - the dedicated test file Keep (these benefit parse_idl too — they are real bug fixes the reviewer explicitly asked to retain): - post_process: cycle detection on `extends` graphs (`service A { extends { A } }` and longer cycles previously stack-overflowed in compute_service_id with no validation guard) - post_process: duplicate-service-name guard at index-build time (BTreeMap previously collapsed the second entry, then trapped later via `expect`) - idl-parser-wasm: replace `String::from_utf8_unchecked` with safe `str::from_utf8` (was UB on any non-UTF-8 input) - idl-parser-wasm: sanitize NUL bytes before `CString::new` (validation errors can quote user input containing \0) - idl-parser-wasm: accept zero-length input with a non-null pointer so empty/whitespace-only IDLs do not surface as NullPtr errors Tests: 3 Rust regression tests via parse_idl (cycle, self-cycle, duplicate names) and 3 matching JS regression tests via parse(). Update docs/idl-v2-spec.md to document the strip-and-parse pattern in place of the removed API. Closes #1319. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@vobradovich Done in 704032f — you were right to push back on the duplicate surface. Removed: Kept (per your request): cycle detection on Docs: rewrote the Diff against master is now 5 files, +194/-23. |
Closes #1319 (with documentation, not a new API — see thread for the discussion).
Background
The original report asked for a
computeInterfaceIdsAPI to avoid the "submit IDL with@0x0000..., paste hex from the validation error, repeat" loop when authoring v2 IDL fixtures. Per @vobradovich's review, the existingparse_idlalready returns canonical ids in the AST when callers omit explicit@0x...suffixes on non-@partialservices. So this PR closes the issue by documenting that pattern rather than adding a parallel API.What this PR ships
A handful of correctness fixes that surfaced while exploring the original API design — each one is a real bug or UB on the existing
parse_idl_to_jsonpath:rs/idl-parser-v2/src/post_process.rsextendsgraphs (service A { extends { A } }and longer cyclesA → B → Apreviously stack-overflowed incompute_service_idwith no validation guard)rs/idl-parser-v2/src/post_process.rsexpect(interface_id must be set))rs/idl-parser-wasm/src/lib.rsString::from_utf8_uncheckedwith safestr::from_utf8(was UB on any non-UTF-8 input from a raw FFI caller)rs/idl-parser-wasm/src/lib.rsCString::newincreate_parse_result(validation errors can quote user input containing\0, e.g.@entry_id: 1\0, which previously panicked the WASM module)rs/idl-parser-wasm/src/lib.rsNullPtrerrorsDocumentation
docs/idl-v2-spec.mdnow spells out the strip-and-parse pattern for tooling that needs ids deterministically up front:@0x...suffixes from non-@partialservice declarations.sails_idl_parser_v2::parse_idl(Rust) orSailsIdlParser.parse(idl)(sails-js).interface_idfrom the returned AST.@partialservices must keep their explicit id — the partial signature is a subset, so the parser cannot recompute the original service's id from it.Test plan
cargo test -p sails-idl-parser-v2 --locked— 35 tests pass (3 new regression tests exercise cycle / self-cycle / duplicate-name detection viaparse_idl)cargo clippy -p sails-idl-parser-v2 -p sails-idl-parser-wasm --all-targets --locked -- -D warnings— cleancargo fmt --all --check— cleanpnpm --filter sails-js-parser-idl-v2 build— clean rebuildjs/parser-idl-v2— 8 tests pass (3 new regression tests for cycle / self-cycle / duplicate names viaparse())🤖 Generated with Claude Code