Skip to content

match: Use an aggregate equality comparison for constant array/slice patterns when possible#155216

Open
jakubadamw wants to merge 6 commits intorust-lang:mainfrom
jakubadamw:issue-110870-103073
Open

match: Use an aggregate equality comparison for constant array/slice patterns when possible#155216
jakubadamw wants to merge 6 commits intorust-lang:mainfrom
jakubadamw:issue-110870-103073

Conversation

@jakubadamw
Copy link
Copy Markdown
Contributor

@jakubadamw jakubadamw commented Apr 12, 2026

When every element in an array or slice pattern is a constant and there is no .. subpattern, the match builder will now emit a single call to PartialEq::eq instead of comparing each element from the value one by one against the respective constant in the pattern.

This drastically reduces the number of MIR basic blocks for large constant-array matches – e.g. a 64-element [u8; 64] match previously generated 64 separate comparison blocks and now generates just one PartialEq::eq call that LLVM can lower to a memcmp(). The optimisation is gated on having at least two constant elements, meaning single-element arrays will still use a plain scalar comparison.

Example:

const FOO: [u8; 64] = *b"0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef";

pub fn foo(x: &[u8; 64]) -> bool {
    // Before: 64 basic blocks, one per byte.
    // After:  a single `PartialEq::eq()` call.
    matches!(x, &FOO)
}

Closes #103073.
Closes #110870.

…s when possible

When every element in an array or slice pattern is a constant and there
is no `..` subpattern, the match builder now emits a single call to
`PartialEq::eq` instead of comparing each element one by one.

This drastically reduces the number of MIR basic blocks for large
constant-array matches – e.g. a 64-element `[u8; 64]` match previously
generated 64 separate comparison blocks and now generates just one
`PartialEq::eq` call that LLVM can lower to a `memcmp()`

The optimisation is gated on having at least two constant elements.
Single-element arrays still use a plain scalar comparison.

Example:

```rust
const FOO: [u8; 64] = *b"0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef";

pub fn foo(x: &[u8; 64]) -> bool {
    // Before: 64 basic blocks, one per byte.
    // After:  a single `PartialEq::eq()` call.
    matches!(x, &FOO)
}
```
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 12, 2026

Some changes occurred in match lowering

cc @Nadrieril

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 12, 2026
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 12, 2026

r? @JonathanBrouwer

rustbot has assigned @JonathanBrouwer.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: compiler, mir
  • compiler, mir expanded to 69 candidates
  • Random selection from 12 candidates

@rust-log-analyzer

This comment has been minimized.

@jakubadamw jakubadamw marked this pull request as draft April 12, 2026 23:12
@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 12, 2026
…t contexts

This is unless the `const_cmp` feature is enabled, in which case `PartialEq`
becomes available in said contexts.
@jakubadamw jakubadamw marked this pull request as ready for review April 12, 2026 23:50
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 12, 2026
@JonathanBrouwer
Copy link
Copy Markdown
Contributor

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rust-bors

This comment has been minimized.

rust-bors Bot pushed a commit that referenced this pull request Apr 13, 2026
match: Use an aggregate equality comparison for constant array/slice patterns when possible
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 13, 2026
@rust-bors
Copy link
Copy Markdown
Contributor

rust-bors Bot commented Apr 13, 2026

☀️ Try build successful (CI)
Build commit: 5a6dba6 (5a6dba60b1150a8b57fc739b0829fa4a65c5b8b3, parent: 14196dbfa3eb7c30195251eac092b1b86c8a2d84)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Copy Markdown
Collaborator

Finished benchmarking commit (5a6dba6): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.2% [0.2%, 0.2%] 1
Improvements ✅
(primary)
-0.4% [-0.4%, -0.4%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.4% [-0.4%, -0.4%] 1

Max RSS (memory usage)

Results (primary -0.0%, secondary 0.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
1.9% [1.0%, 2.8%] 2
Regressions ❌
(secondary)
2.7% [1.2%, 5.0%] 3
Improvements ✅
(primary)
-3.9% [-3.9%, -3.9%] 1
Improvements ✅
(secondary)
-1.8% [-2.2%, -1.6%] 3
All ❌✅ (primary) -0.0% [-3.9%, 2.8%] 3

Cycles

Results (primary -2.3%, secondary 14.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
14.4% [2.4%, 26.4%] 2
Improvements ✅
(primary)
-2.3% [-3.1%, -1.7%] 4
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -2.3% [-3.1%, -1.7%] 4

Binary size

Results (primary -0.1%, secondary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
0.1% [0.1%, 0.1%] 1
Regressions ❌
(secondary)
0.1% [0.1%, 0.1%] 2
Improvements ✅
(primary)
-0.2% [-0.2%, -0.1%] 4
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.1% [-0.2%, 0.1%] 5

Bootstrap: 491.114s -> 490.988s (-0.03%)
Artifact size: 394.23 MiB -> 394.33 MiB (0.03%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 13, 2026
@JonathanBrouwer
Copy link
Copy Markdown
Contributor

@rustbot reroll
Not familiar enough with this code to comfortably review this, but from a quick glance this looks great, thanks <3

@rustbot rustbot assigned jackh726 and unassigned JonathanBrouwer Apr 15, 2026
impl<'a, 'tcx> Builder<'a, 'tcx> {
/// Check if we can use aggregate `PartialEq::eq` comparisons for constant array/slice patterns.
/// This is not possible in const contexts unless `#![feature(const_cmp, const_trait_impl)]` are enabled,
/// because`PartialEq` is not const-stable.
Copy link
Copy Markdown
Contributor

@dianne dianne Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// because`PartialEq` is not const-stable.
/// because `PartialEq` is not const-stable.

View changes since the review

Comment on lines +47 to +55
fn can_use_aggregate_eq(&self) -> bool {
let const_partial_eq_enabled = {
let features = self.tcx.features();
features.enabled(sym::const_trait_impl) && features.enabled(sym::const_cmp)
};
let in_const_context = self.tcx.is_const_fn(self.def_id.to_def_id())
|| !self.tcx.hir_body_owner_kind(self.def_id).is_fn_or_closure();
!in_const_context || const_partial_eq_enabled
}
Copy link
Copy Markdown
Contributor

@dianne dianne Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wary of changing MIR building based on whether features are enabled, especially with one being a library feature. That feels subtle.

It's also unfortunate that MIR building would be conditional on being in a const context, but I'm not sure what could be done about that other than waiting for const_cmp to stabilize. I'd assumed from the discussion on #110870 that that was blocking this approach to simplifying array/slice pattern tests.

View changes since the review

Comment on lines +32 to +36
if let PatKind::Constant { value } = pat.kind {
Some(ty::Const::new_value(tcx, value.valtree, value.ty))
} else {
None
}
Copy link
Copy Markdown
Contributor

@dianne dianne Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might also be worth reconstructing aggregate constants for arrays/slices of arrays of constants, etc.? I'm not a specialization expert, but it looks like arrays of bytewise-comparable things are also bytewise-comparable, at least for common array lengths1. Since array and slice equality are specialized based on their element types' bytewise-comparability, we should be able to get better codegen for nested array patterns too (as long as the inner arrays are of one of those common lengths), I think?

View changes since the review

Footnotes

  1. https://github.com/rust-lang/rust/blob/f29256dd1420dc681bf4956e3012ffe9eccdc7e7/library/core/src/cmp/bytewise.rs#L74-L85

Comment on lines +322 to +326
// When there is no `..`, all elements are constants, and
// there are at least two of them, collapse the individual
// element subpairs into a single aggregate comparison that
// is performed after the length check.
if slice.is_none()
Copy link
Copy Markdown
Contributor

@dianne dianne Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An additional possibility: even if there is a .., the comparisons for the sub-slices before and after the .. could be done via aggregate equality when applicable. Credit to #121540, which I think did this?

Even if only handling the case with no .., it might be worth moving the special-casing into prefix_slice_suffix to share it between PatKind::Slice and PatKind::Array, since that's where the commonalities live.

View changes since the review

@rust-bors
Copy link
Copy Markdown
Contributor

rust-bors Bot commented Apr 18, 2026

☔ The latest upstream changes (presumably #155472) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Copy Markdown
Member

@Nadrieril Nadrieril Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general approach feels unfortunate: we transformed constants into patterns in const_to_pat, and now we try to reverse that transformation. Have we tried keeping the original constant around in the output of const_to_pat and using it at runtime? Tho this has the same issue of const-dependent MIR lowering that @dianne pointed out.

View changes since the review

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do that, maybe we could also synthesize a constant during THIR building for hand-written array/slice patterns, like this PR currently does in MIR building? That way, we wouldn't end up worse codegen for hand-written array patterns than for const array items used as patterns.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dunno, small handwritten arrays behave pretty much like tuples, it could be better codegen sometimes not to make them into constants:

match foo {
  // compiles to three nested `if`s today, would become 8 sequential `if`s if turned into constants
  [false, false, false] => ...,
  [false, false, true] => ...,
  [false, true, false] => ...,
  ...
}

It's admittedly a stretch but I'm tempted to err on the side of respecting user intent especially before the MIR boundary.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right. Maybe we should have a test for that in mir-opt/building/match/sort_candidates.rs or such if we start optimizing array/slice comparisons? I think as-is this PR may also turn that into 8 sequential tests, each a TestKind::AggregateEq for a different constant.

@jackh726
Copy link
Copy Markdown
Member

@dianne do you want to take over review here?

Copy link
Copy Markdown
Contributor

@dianne dianne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have much context on the ctfe or const traits side of things to evaluate the approach. cc @oli-obk maybe? It feels like we have to make some sort of compromise here to avoid calling anything const-unstable. Possibly we could block on const_cmp stabilizing, or possibly we could have some workaround until then if we want to land this first.

I think I can handle the technical review, at least. Tentatively, r? me (though feel free to steal the assignment if that'd be easier ^^)

View changes since this review

Comment on lines +185 to +220
TestKind::AggregateEq { value } => {
let tcx = self.tcx;
let success_block = target_block(TestBranch::Success);
let fail_block = target_block(TestBranch::Failure);

let aggregate_ty = value.ty;
let ref_ty = Ty::new_imm_ref(tcx, tcx.lifetimes.re_erased, aggregate_ty);

// The constant has type `[T; N]` (or `[T]`), but calling
// `PartialEq::eq` requires `&[T; N]` (or `&[T]`) operands.
// Valtree representations are the same with or without the
// reference wrapper, so we can reinterpret by replacing the type.
let expected_value = ty::Value { ty: ref_ty, valtree: value.valtree };
let expected_operand =
self.literal_operand(test.span, Const::from_ty_value(tcx, expected_value));

// Create a reference to the scrutinee place.
let actual_ref_place = self.temp(ref_ty, test.span);
self.cfg.push_assign(
block,
self.source_info(test.span),
actual_ref_place,
Rvalue::Ref(tcx.lifetimes.re_erased, BorrowKind::Shared, place),
);

// Compare using `<T as PartialEq>::eq` where `T` is the array or slice type.
self.non_scalar_compare(
block,
success_block,
fail_block,
source_info,
aggregate_ty,
expected_operand,
Operand::Copy(actual_ref_place),
);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from missing the assertion that value.ty is str, this looks the same as the case for TestKind::StringEq. Could they be merged?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do that, maybe we could also synthesize a constant during THIR building for hand-written array/slice patterns, like this PR currently does in MIR building? That way, we wouldn't end up worse codegen for hand-written array patterns than for const array items used as patterns.

@rustbot rustbot assigned dianne and unassigned jackh726 Apr 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Large amount of generated code for match statements with large arrays Weird Match Statement Codegen With Byte Strings

8 participants