gh-138912: Add fast path for match class patterns without sub-patterns #144820
cdce8p wants to merge 1 commit into python:main
|
Opcodes are a very limited resource, so we need to justify it carefully when adding new ones. Is pattern matching widely used enough yet to justify this? I don't know. An approach to optimizing pattern matching that I would be more enthusiastic about is optimizing the decision tree. For example,

    match m:
        case A(a, b):
            X
        case A():
            Y

can be transformed into the decision tree, like this:

    if m isa A:
        if tmp := unpack_attributes(m, 2):
            a, b = tmp
            X
        else:
            Y

which can be converted into code that has simpler operations and is more amenable to further specialization/jitting.
|
Inspired by Mark's comments I looked at the generated bytecode:
Output:
This can be simplified to (diff main...eendebakpt:pattern_match_class_v0)
A quick benchmark shows this is a bit faster. I tried changing the bytecode generation to use an
|
Is it widely used? Probably not. However, I'd guess that's mostly because support for Python 3.9 was only dropped at the end of last year, so most projects didn't have the option to start adopting it before then. Furthermore, it's unlikely that developers go through existing code bases just to look for good opportunities to replace existing if-statements with match. I did it for pylint, just because I was curious, e.g. pylint-dev/pylint#10529. Sure, this refactor is mostly unnecessary, but while doing so I noticed that the class pattern in particular can be quite useful. E.g. I replaced this

    # old
    if isinstance(var, A):
        ...
    elif isinstance(var, B):
        ...
    elif isinstance(var, C):
        ...

    # new
    match var:
        case A():
            ...
        case B():
            ...
        case C():
            ...

We ended up keeping it and I've seen a similar PR proposed for pytest. What I did realize, though, was that I had inadvertently introduced a noticeable performance regression, roughly 7-10% for the whole program, on Python 3.13. That's what led me down the path of exploring how it can be sped up, first with #138915 and now here.

Approach / Implementation

Since class patterns are so useful, I'd like to make them as fast as possible, ideally so that there isn't a real difference between equivalent if and match statements. To achieve that I'd like to eliminate the intermediary tuple completely, at least if only keyword patterns are used. This will change the evaluation from BFS to DFS, but I think that is an acceptable tradeoff. To do that, however, I'll need two new opcodes. First the
I briefly considered something like that while checking whether there are ways to cache the result of the isinstance check in class patterns. I didn't investigate it further, though.
|
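The BFS-to-DFS tradeoff mentioned in this comment can be made concrete with a small simulation. In the tuple-based scheme, all requested attributes are collected before any sub-pattern is tested; in a tuple-free scheme, each sub-pattern would be tested right after its attribute is looked up. The two helper functions below are hypothetical models written for this sketch, not CPython code, and they only handle equality sub-patterns on keyword attributes.

```python
class Point:
    """Logs every attribute read so the evaluation order becomes visible."""

    def __init__(self, x, y, log):
        self._x, self._y, self._log = x, y, log

    @property
    def x(self):
        self._log.append("x")
        return self._x

    @property
    def y(self):
        self._log.append("y")
        return self._y


_MISSING = object()


def match_breadth_first(subject, cls, expected):
    """Model of the tuple-based scheme: read all attributes, then compare."""
    if not isinstance(subject, cls):
        return False
    values = []
    for name in expected:
        value = getattr(subject, name, _MISSING)
        if value is _MISSING:
            return False
        values.append(value)
    return all(v == want for v, want in zip(values, expected.values()))


def match_depth_first(subject, cls, expected):
    """Model of the tuple-free scheme: compare right after each lookup."""
    if not isinstance(subject, cls):
        return False
    for name, want in expected.items():
        value = getattr(subject, name, _MISSING)
        if value is _MISSING or value != want:
            return False
    return True


bfs_log, dfs_log = [], []
pattern = {"x": 0, "y": 0}  # stands in for `case Point(x=0, y=0)`
bfs_result = match_breadth_first(Point(1, 0, bfs_log), Point, pattern)
dfs_result = match_depth_first(Point(1, 0, dfs_log), Point, pattern)
print(bfs_result, bfs_log)  # False ['x', 'y']
print(dfs_result, dfs_log)  # False ['x']
```

Both orders produce the same match result. They differ only in which attribute reads happen when an early sub-pattern fails, which is observable only for attributes with side effects.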
Extracted from #139080. This builds on the work merged with 75d4839.
Add a fast path for match class patterns without any sub-patterns. This avoids creating a new (empty) tuple, pushing it to the stack and subsequently unpacking the tuple result from the MATCH_CLASS opcode again.
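To make the saved work concrete, here is a rough pure-Python model of the two paths. It is only a sketch: `match_class_generic` and `match_class_bare` are made-up names, the real logic lives in C inside the interpreter, and details such as error handling and the special treatment of some built-in types are left out.

```python
def match_class_generic(subject, cls, nargs, kwargs):
    """Model of the general path: return a tuple of attributes, or None."""
    if not isinstance(subject, cls):
        return None
    names = list(getattr(cls, "__match_args__", ())[:nargs]) + list(kwargs)
    try:
        return tuple(getattr(subject, name) for name in names)
    except AttributeError:
        return None


def match_class_bare(subject, cls):
    """Model of the fast path for `case cls():` with no sub-patterns."""
    return isinstance(subject, cls)


def classify_generic(value):
    # An empty keyword-name tuple is passed in, and an empty result tuple
    # comes back that still has to be checked and unpacked.
    attrs = match_class_generic(value, int, 0, ())
    if attrs is not None:
        assert len(attrs) == 0  # nothing to unpack for a bare pattern
        return "int"
    return "other"


def classify_fast(value):
    # No tuples involved: the type check alone decides the match.
    if match_class_bare(value, int):
        return "int"
    return "other"


# The two models agree for every subject.
for value in (1, "a", 2.5, True):
    assert classify_generic(value) == classify_fast(value)
```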
Micro benchmarks with all specializations enabled.
Micro benchmark (bare class pattern)
With specialization disabled, the performance improvement is even larger, at -40.5%.