bluejay-parser: performance optimizations #95

Open
swalkinshaw wants to merge 2 commits into main from parser-perf-optimizations

@swalkinshaw
Contributor

~16% faster schema parsing, ~12% faster executable parsing

Key optimizations:

  • Rewrite block string parser: direct string processing instead of sub-lexer + Vec<Vec<Token>> (~10%)
  • Compact Span: u32 start+len (8 bytes) instead of Range<usize> (16 bytes), add Copy (~3%)
  • Field: consume-then-check for alias instead of peek(1) (~2%)
  • Optimize next_if_* methods: peek+consume in single buffer operation (~1%)
  • Lazy depth_limiter.bump(): only bump when optional elements exist
  • Add Copy to DepthLimiter + preallocate Vec capacity in DefinitionDocument

Also adds benchmarks for ExecutableDocument parsing with a large fixture, and updates downstream crates to use Copy semantics on Span (clone → deref).
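
The alias optimization in the list above (consume-then-check instead of peek(1)) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Token`, `Field`, and `parse_field` are hypothetical stand-ins, and the real parser works on its own token buffer rather than a `Peekable` iterator. The idea is to consume the first name unconditionally and only then look at the next token, so no two-token lookahead is needed.

```rust
use std::iter::Peekable;
use std::vec::IntoIter;

// Hypothetical, simplified token type (assumption: the real tokens carry
// spans and many more variants).
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Name(String),
    Colon,
}

// Hypothetical, simplified field: an optional alias plus the field name.
#[derive(Debug, PartialEq)]
struct Field {
    alias: Option<String>,
    name: String,
}

// Consume the first name right away, then check whether a colon follows.
// If it does, the name already in hand was the alias and the real field
// name is the next token. No peek(1) style lookahead is required.
fn parse_field(tokens: &mut Peekable<IntoIter<Token>>) -> Option<Field> {
    let first = match tokens.next()? {
        Token::Name(name) => name,
        _ => return None,
    };
    if tokens.next_if_eq(&Token::Colon).is_some() {
        match tokens.next()? {
            Token::Name(name) => Some(Field { alias: Some(first), name }),
            _ => None,
        }
    } else {
        Some(Field { alias: None, name: first })
    }
}
```

With this shape, `a: b` yields a field named `b` with alias `a`, while a lone `b` yields a field with no alias, and in both cases each token is inspected once.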

Note: this is a manually curated (with the help of Claude) and cleaned-up version of an /autoresearch run.

…rsing

Key optimizations:
- Rewrite block string parser: direct string processing instead of sub-lexer + Vec<Vec<Token>> (~10%)
- Compact Span: u32 start+len (8 bytes) instead of Range<usize> (16 bytes), add Copy (~3%)
- Field: consume-then-check for alias instead of peek(1) (~2%)
- Optimize next_if_* methods: peek+consume in single buffer operation (~1%)
- Lazy depth_limiter.bump(): only bump when optional elements exist
- Add Copy to DepthLimiter + preallocate Vec capacity in DefinitionDocument

Also adds benchmarks for ExecutableDocument parsing with a large fixture,
and updates downstream crates to use Copy semantics on Span (clone → deref).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@swalkinshaw swalkinshaw requested a review from adampetro March 17, 2026 16:47
Comment on lines +35 to +42
let arguments = if VariableArguments::is_match(tokens) {
    Some(VariableArguments::from_tokens(
        tokens,
        depth_limiter.bump()?,
    )?)
} else {
    None
};
Contributor

Should we just change the return type of TryFromTokens::try_from_tokens to return the transposed type and have this be the implementation? I think we call transpose almost every time we call this
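
For readers unfamiliar with the transpose step mentioned here, the following is a small sketch using only the standard library. The function names are hypothetical and do not reflect the actual `TryFromTokens` signature; they only show how an `Option<Result<T, E>>` becomes a `Result<Option<T>, E>`.

```rust
use std::num::ParseIntError;

// Hypothetical parse step in the "untransposed" shape:
// None means the element is absent, Some(Err(..)) means it is present
// but malformed, Some(Ok(..)) means it parsed.
fn try_parse_number(input: &str) -> Option<Result<i32, ParseIntError>> {
    if input.is_empty() {
        None
    } else {
        Some(input.parse::<i32>())
    }
}

// The "transposed" shape that callers usually want: errors propagate with
// `?`, and absence is an ordinary Ok(None).
fn parse_optional_number(input: &str) -> Result<Option<i32>, ParseIntError> {
    try_parse_number(input).transpose()
}
```

Returning the transposed type directly would let callers drop the `.transpose()` call at each call site.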

Contributor Author

One caller in directives.rs uses Option<Result<...>> 😓, so we could do that, but we'd have to update the usage globally. I don't think this is so bad, because it makes it obvious that we're manually avoiding calling depth_limiter.bump() when the branch isn't taken?
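
The lazy bump described in this comment can be sketched as below. Everything here is an assumption made for illustration: the `DepthLimiter` fields, the `DepthExceeded` error, and `parse_optional_arguments` are hypothetical and are not taken from bluejay-parser. The point is only that `bump()` runs inside the branch that actually parses a nested element.

```rust
// Hypothetical stand-in for the parser's depth limiter. Deriving Copy
// lets it be passed by value without explicit clones.
#[derive(Debug, Clone, Copy, PartialEq)]
struct DepthLimiter {
    current: usize,
    max: usize,
}

// Hypothetical error for exceeding the maximum nesting depth.
#[derive(Debug, PartialEq)]
struct DepthExceeded;

impl DepthLimiter {
    fn new(max: usize) -> Self {
        Self { current: 0, max }
    }

    // Returns a limiter one level deeper, or an error once the maximum
    // depth has been reached.
    fn bump(self) -> Result<Self, DepthExceeded> {
        if self.current >= self.max {
            Err(DepthExceeded)
        } else {
            Ok(Self { current: self.current + 1, ..self })
        }
    }
}

// Hypothetical optional element: treated as present when the input
// starts with '('. The limiter is bumped only inside that branch.
fn parse_optional_arguments(
    input: &str,
    depth_limiter: DepthLimiter,
) -> Result<Option<String>, DepthExceeded> {
    if input.starts_with('(') {
        // A real parser would hand this deeper limiter to nested parsing.
        let _nested = depth_limiter.bump()?;
        Ok(Some(input.to_string()))
    } else {
        // Branch not taken: no bump and no depth check.
        Ok(None)
    }
}
```

In this sketch, a limiter that is already at its maximum still accepts input without the optional element, because the depth check is skipped along with the bump.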

Contributor

Could we change the implementation of TryFromTokens to do the bump internally?

Contributor Author

It becomes a bit messy since there are a few cases that don't bump, so it would simplify most of the callers but make a few more complex. By just changing the return type from Option -> Result, we can at least get rid of the manual transpose().

Contributor Author

Refactored this a bit

Comment on lines +7 to +8
start: u32,
len: u32,
Contributor

Why u32 instead of usize?

Contributor Author

Makes the struct smaller so all the .clone() calls become Copy and avoids heap allocations
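
A minimal sketch of the size and Copy trade-off, assuming a span shaped like the two fields quoted above. The constructor and accessor names are hypothetical. Note that `std::ops::Range<usize>` does not implement `Copy` and occupies two `usize` values (16 bytes on 64-bit targets), so what this sketch demonstrates is the smaller struct and implicit copies in place of explicit `.clone()` calls.

```rust
use std::convert::TryFrom;
use std::ops::Range;

// Compact span: two u32 fields, 8 bytes in total, and Copy.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: u32,
    len: u32,
}

impl Span {
    // Hypothetical constructor: returns None when the range is reversed
    // or does not fit in u32 (inputs of 4 GiB or more).
    fn new(range: Range<usize>) -> Option<Self> {
        let start = u32::try_from(range.start).ok()?;
        let len = u32::try_from(range.end.checked_sub(range.start)?).ok()?;
        Some(Self { start, len })
    }

    // Hypothetical accessor: rebuilds the byte range for slicing the
    // source text. Takes self by value, which is cheap because Span is Copy.
    fn byte_range(self) -> Range<usize> {
        let start = self.start as usize;
        start..start + self.len as usize
    }
}
```

The cost of the compact layout is the 4 GiB ceiling on offsets and lengths, which the sketch surfaces as `None` instead of silently truncating.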

@swalkinshaw swalkinshaw requested a review from adampetro March 23, 2026 15:34
@swalkinshaw swalkinshaw force-pushed the parser-perf-optimizations branch from 6c780f0 to 6e20145 Compare March 23, 2026 20:10
2 participants