-
-
Notifications
You must be signed in to change notification settings - Fork 14.9k
match: Use an aggregate equality comparison for constant array/slice patterns when possible #155216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
c6676a2
05633c2
6049501
ba47e98
00fd1a1
85524ad
3da8657
5ba2ad7
e552045
a287bc9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,14 +4,56 @@ use rustc_abi::FieldIdx; | |
| use rustc_middle::mir::*; | ||
| use rustc_middle::span_bug; | ||
| use rustc_middle::thir::*; | ||
| use rustc_middle::ty::{self, Ty, TypeVisitableExt}; | ||
| use rustc_middle::ty::{self, Ty, TyCtxt, TypeVisitableExt}; | ||
|
|
||
| use crate::builder::Builder; | ||
| use crate::builder::expr::as_place::{PlaceBase, PlaceBuilder}; | ||
| use crate::builder::matches::{ | ||
| FlatPat, MatchPairTree, PatConstKind, PatternExtraData, SliceLenOp, TestableCase, | ||
| }; | ||
|
|
||
| /// Below this length, an array or slice pattern is compared element by element | ||
| /// rather than as a single aggregate, since the per-element comparisons are | ||
| /// unlikely to be more expensive than a `PartialEq::eq` call. | ||
| const AGGREGATE_EQ_MIN_LEN: usize = 4; | ||
|
|
||
| /// Checks whether every pattern in `elements` is a `PatKind::Constant` and, | ||
| /// if so, reconstructs a single aggregate `ty::Value` that represents the whole | ||
| /// array or slice. Returns `None` when any element is not a constant or the | ||
| /// sequence is too short to benefit from an aggregate comparison. | ||
| fn try_reconstruct_aggregate_constant<'tcx>( | ||
| tcx: TyCtxt<'tcx>, | ||
| aggregate_ty: Ty<'tcx>, | ||
| elements: &[Pat<'tcx>], | ||
| ) -> Option<ty::Value<'tcx>> { | ||
| // Short arrays are not worth an aggregate comparison. | ||
| if elements.len() < AGGREGATE_EQ_MIN_LEN { | ||
| return None; | ||
| } | ||
| let branches = elements | ||
| .iter() | ||
| .map(|pat| { | ||
| if let PatKind::Constant { value } = pat.kind { | ||
| Some(ty::Const::new_value(tcx, value.valtree, value.ty)) | ||
| } else { | ||
| None | ||
| } | ||
|
Comment on lines
+36
to
+40
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It might also be worth reconstructing aggregate constants for arrays/slices of arrays of constants, etc.? I'm not a specialization expert, but it looks like arrays of bytewise-comparable things are also bytewise-comparable, at least for common array lengths1. Since array and slice equality are specialized based on their element types' bytewise-comparability, we should be able to get better codegen for nested array patterns too (as long as the inner arrays are of one of those common lengths), I think? Footnotes
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @dianne, interesting. I’ll look into this next! 🙂 |
||
| }) | ||
| .collect::<Option<Vec<_>>>()?; | ||
| let valtree = ty::ValTree::from_branches(tcx, branches); | ||
| Some(ty::Value { ty: aggregate_ty, valtree }) | ||
| } | ||
|
|
||
| impl<'a, 'tcx> Builder<'a, 'tcx> { | ||
| /// Check if we can use aggregate `PartialEq::eq` comparisons for constant array/slice patterns. | ||
| /// This is not possible in const contexts, because `PartialEq` is not const-stable yet. | ||
| fn can_use_aggregate_eq(&self) -> bool { | ||
| let in_const_context = self.tcx.is_const_fn(self.def_id.to_def_id()) | ||
| || !self.tcx.hir_body_owner_kind(self.def_id).is_fn_or_closure(); | ||
| !in_const_context | ||
| } | ||
|
jakubadamw marked this conversation as resolved.
|
||
| } | ||
|
|
||
| /// For an array or slice pattern's subpatterns (prefix/slice/suffix), returns a list | ||
| /// of those subpatterns, each paired with a suitably-projected [`PlaceBuilder`]. | ||
| fn prefix_slice_suffix<'a, 'tcx>( | ||
|
|
@@ -220,10 +262,36 @@ impl<'tcx> MatchPairTree<'tcx> { | |
| _ => None, | ||
| }; | ||
| if let Some(array_len) = array_len { | ||
| for (subplace, subpat) in | ||
| prefix_slice_suffix(&place_builder, Some(array_len), prefix, slice, suffix) | ||
| // When all elements are constants and there is no `..` | ||
| // subpattern, compare the whole array at once via | ||
| // `PartialEq::eq` rather than element by element. | ||
| if slice.is_none() | ||
| && suffix.is_empty() | ||
| && cx.can_use_aggregate_eq() | ||
| && let Some(aggregate_value) = | ||
| try_reconstruct_aggregate_constant(cx.tcx, pattern.ty, prefix) | ||
| { | ||
| MatchPairTree::for_pattern(subplace, subpat, cx, &mut subpairs, extra_data); | ||
| Some(TestableCase::Constant { | ||
| value: aggregate_value, | ||
| kind: PatConstKind::Aggregate, | ||
| }) | ||
| } else { | ||
| for (subplace, subpat) in prefix_slice_suffix( | ||
| &place_builder, | ||
| Some(array_len), | ||
| prefix, | ||
| slice, | ||
| suffix, | ||
| ) { | ||
| MatchPairTree::for_pattern( | ||
| subplace, | ||
| subpat, | ||
| cx, | ||
| &mut subpairs, | ||
| extra_data, | ||
| ); | ||
| } | ||
| None | ||
| } | ||
| } else { | ||
| // If the array length couldn't be determined, ignore the | ||
|
|
@@ -235,33 +303,57 @@ impl<'tcx> MatchPairTree<'tcx> { | |
| pattern.ty | ||
| ), | ||
| ); | ||
| None | ||
| } | ||
|
|
||
| None | ||
| } | ||
| PatKind::Slice { ref prefix, ref slice, ref suffix } => { | ||
| for (subplace, subpat) in | ||
| prefix_slice_suffix(&place_builder, None, prefix, slice, suffix) | ||
| // When there is no `..`, all elements are constants, and | ||
| // there are at least two of them, collapse the individual | ||
| // element subpairs into a single aggregate comparison that | ||
| // is performed after the length check. | ||
| if slice.is_none() | ||
|
Comment on lines
+310
to
+314
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Edit: assuming prefixes and suffixes are typically small and hand-written, it's probably not worth the trouble to use aggregate equality for them.
Edit: after |
||
| && suffix.is_empty() | ||
| && cx.can_use_aggregate_eq() | ||
| && let Some(aggregate_value) = | ||
| try_reconstruct_aggregate_constant(cx.tcx, pattern.ty, prefix) | ||
| { | ||
| MatchPairTree::for_pattern(subplace, subpat, cx, &mut subpairs, extra_data); | ||
| } | ||
|
|
||
| if prefix.is_empty() && slice.is_some() && suffix.is_empty() { | ||
| // This pattern is shaped like `[..]`. It can match a slice | ||
| // of any length, so no length test is needed. | ||
| None | ||
| } else { | ||
| // Any other shape of slice pattern requires a length test. | ||
| // Slice patterns with a `..` subpattern require a minimum | ||
| // length; those without `..` require an exact length. | ||
| Some(TestableCase::Slice { | ||
| len: u64::try_from(prefix.len() + suffix.len()).unwrap(), | ||
| op: if slice.is_some() { | ||
| SliceLenOp::GreaterOrEqual | ||
| } else { | ||
| SliceLenOp::Equal | ||
| subpairs.push(MatchPairTree { | ||
| place, | ||
| testable_case: TestableCase::Constant { | ||
| value: aggregate_value, | ||
| kind: PatConstKind::Aggregate, | ||
| }, | ||
| subpairs: Vec::new(), | ||
| pattern_span: pattern.span, | ||
| }); | ||
| Some(TestableCase::Slice { | ||
| len: u64::try_from(prefix.len()).unwrap(), | ||
| op: SliceLenOp::Equal, | ||
| }) | ||
| } else { | ||
| for (subplace, subpat) in | ||
| prefix_slice_suffix(&place_builder, None, prefix, slice, suffix) | ||
| { | ||
| MatchPairTree::for_pattern(subplace, subpat, cx, &mut subpairs, extra_data); | ||
| } | ||
|
|
||
| if prefix.is_empty() && slice.is_some() && suffix.is_empty() { | ||
| // This pattern is shaped like `[..]`. It can match | ||
| // a slice of any length, so no length test is needed. | ||
| None | ||
| } else { | ||
| // Any other shape of slice pattern requires a length test. | ||
| // Slice patterns with a `..` subpattern require a minimum | ||
| // length; those without `..` require an exact length. | ||
| Some(TestableCase::Slice { | ||
| len: u64::try_from(prefix.len() + suffix.len()).unwrap(), | ||
| op: if slice.is_some() { | ||
| SliceLenOp::GreaterOrEqual | ||
| } else { | ||
| SliceLenOp::Equal | ||
| }, | ||
| }) | ||
| } | ||
| } | ||
| } | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,51 @@ | ||
| // MIR for `array_match` after built | ||
|
|
||
| fn array_match(_1: [u8; 4]) -> bool { | ||
| debug x => _1; | ||
| let mut _0: bool; | ||
| let mut _2: &[u8; 4]; | ||
| let mut _3: bool; | ||
| scope 1 { | ||
| } | ||
|
|
||
| bb0: { | ||
| PlaceMention(_1); | ||
| _2 = &_1; | ||
| _3 = <[u8; 4] as PartialEq>::eq(copy _2, const &*b"\x01\x02\x03\x04") -> [return: bb4, unwind: bb8]; | ||
| } | ||
|
|
||
| bb1: { | ||
| _0 = const false; | ||
| goto -> bb7; | ||
| } | ||
|
|
||
| bb2: { | ||
| falseEdge -> [real: bb6, imaginary: bb1]; | ||
| } | ||
|
|
||
| bb3: { | ||
| goto -> bb1; | ||
| } | ||
|
|
||
| bb4: { | ||
| switchInt(move _3) -> [0: bb1, otherwise: bb2]; | ||
| } | ||
|
|
||
| bb5: { | ||
| FakeRead(ForMatchedPlace(None), _1); | ||
| unreachable; | ||
| } | ||
|
|
||
| bb6: { | ||
| _0 = const true; | ||
| goto -> bb7; | ||
| } | ||
|
|
||
| bb7: { | ||
| return; | ||
| } | ||
|
|
||
| bb8 (cleanup): { | ||
| resume; | ||
| } | ||
| } |
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general approach feels unfortunate: we transformed constants into patterns in
const_to_pat, and now we try to reverse that transformation. Have we tried keeping the original constant around in the output ofconst_to_patand using it at runtime? Tho this has the same issue of const-dependent MIR lowering that @dianne pointed out.View changes since the review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do that, maybe we could also synthesize a constant during THIR building for hand-written array/slice patterns, like this PR currently does in MIR building? That way, we wouldn't end up worse codegen for hand-written array patterns than for const array items used as patterns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dunno, small handwritten arrays behave pretty much like tuples, it could be better codegen sometimes not to make them into constants:
It's admittedly a stretch but I'm tempted to err on the side of respecting user intent especially before the MIR boundary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right. Maybe we should have a test for that in
mir-opt/building/match/sort_candidates.rsor such if we start optimizing array/slice comparisons? I think as-is this PR may also turn that into 8 sequential tests, each aTestKind::AggregateEqfor a different constant.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Nadrieril, as an alternative to being guided solely by user “intent”, would it be sensible to raise the threshold on the number of elements in an array pattern where we would use the aggregate equality? Right now it’s
> 2. Perhaps 4 would work better? I suppose a quantitative comparison with benchmarks could be of use here, but sadly I can’t commit to that with my present schedule.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, yeah. I'm not really familiar with codegen yet, so I couldn't say what the best way to do that would be. Naïvely, I'm imagining adding bulk comparison to MIR; it'd contain the constant we want to bulk-compare with so that we can generate easily-optimizable LLVM IR ourselves. Unfortunately, needing to be non-panicking means we can't just model it as a call to a normal intrinsic, which means we can't provide a default implementation in
coreunless we got a panic effect so we could have non-panicking calls (..and we'd need powerful enough const generics to make the constant a generic arg, but that seems more realistic). As for how to lower it, judging by what worked for rustc 1.71.0, a straight-line sequence ofloads andicmps strung together withselects optimizes pretty well (it gets auto-vectorized), and if I understand LLVM's undef semantics, it should be correct too (as long as eachselectis switching on the result of the previous sequence of comparisons, not the new comparison). Adding to MIR looks like a pain maintenance-wise though (and potentially opsem-wise, though this should reduce to a bunch ofswitchInts so theoretically there's nothing new, I think?).. hopefully there's a lighter-weight option! Maybe I should actually try to figure out what regressed here; it'd be educational for me if nothing else.If we ever get a LIR for optimizations, maybe that could have a non-branching select in it so we could get rid of branches from these matches before passing them off to LLVM. But getting rid of branching does seem like the sort of thing LLVM should be doing... I wonder why it stopped working.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems like a lot of complexity. I'd rather change our semantics to lower "match on a constant" to an equality comparison, which is most likely what users expect anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I also don't think we should bake bulk comparisons into MIR ^^ that's way too much maintenance burden.
If we had LIR, I imagine simplifying control-flow could be useful for more than matches against constants; I don't think anything that complex should be done just for this.
I hadn't considered changing semantics. I'd want to look into what's going wrong first, but an equality comparison is probably what users expect yeah.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a thought: I think that if we define the semantics of matching on a byte slice constant to be
PartialEq::eq, then our current lowering is a valid "optimization" over that. Does that seem right to you?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's right. Of course, if we apply it at MIR building time, as we currently do, we'd need a false unwind edge and validity assertion to make borrowck and Miri not "optimization"-dependent. Maybe it'd also need something to account for the opsem consequences of taking a reference? I'm not really familiar with those semantics.
Only having the validity assertion for byte slices/arrays seems relatively fine to me since I think that matches how
PartialEq::eqworks on slices/arrays: if it specializes to acompare_bytes/raw_eqit asserts validity for the whole slices/arrays; otherwise it compares element-by-element with short-circuiting. I'm not really sure how to feel about matches against byte slice/array constants being more restrictively borrow-checked; maybe other structural matches against named/associated constants could have false unwind edges to pretend they're more like calls toPartialEq::eq? Idk whether that's something we'd want to hold space for, though.