Can the compiler fold class patterns? #129
In Ivan's original proposal, you would not be able to make this assumption. The various competing proposals for custom matching do allow this sort of optimization because they are more restricted, which means that an optimizer can make more assumptions. There are two conditions which must hold true in order to make this optimization.
TL;DR: No; a global variable can be rebound while the match statement is executing.
Actually no, it's possible for users to implement a sequence type whose special methods run arbitrary code.
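As one illustration of that point, here is a minimal sketch of a misbehaving sequence type (the `SneakySeq` name and the choice of `__len__` are assumptions made for illustration):

```python
class SneakySeq:
    """A sequence whose length check has an observable side effect."""

    def __init__(self, items):
        self._items = list(items)
        self.len_calls = 0

    def __len__(self):
        self.len_calls += 1  # side effect: count every length check
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]
```

A matcher that re-checks the length for each alternative of a sequence pattern would be observably different from one that caches it.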
I think it’s fine to optimize the implementation with the assumption that certain things don’t change while we’re in pattern selection mode. Whether we also assume guards to have no side effects is less clear. The bottom line however must be that if an object lies or cheats, only runtime exceptions may be raised (or incorrect conclusions reached), but no segfaults or bad writes or reads from uninitialized memory can happen. (I think `.sort()` has a similar rule — if `<` lies, your data won’t be sorted, but you won’t lose it.)
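A quick sketch of that `.sort()` analogy, assuming CPython's list sort (the `Liar` class is invented for illustration): an inconsistent `__lt__` produces a meaningless order, but the list still contains exactly the original elements.

```python
import random

class Liar:
    def __lt__(self, other):
        return random.random() < 0.5  # inconsistent answers to <

data = [Liar() for _ in range(10)]
original = set(data)
data.sort()                    # the resulting "order" is meaningless...
assert set(data) == original   # ...but no element is lost or corrupted
```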
I think it's best to invalidate all known information when hitting a guard. Basically, I'm working on building a decision tree at compile-time, and it's become clear that it's only safe to do so when we stay in "pattern land", where the normal rules don't apply. For example:

```
match s:
    case <p0> | <p1>: ...
    case <p2> | <p3> if <g0>: ...
    case <p4> | <p5>: ...
```

It's safe to use the same decision tree for `<p0>` through `<p3>`, but once `<g0>` has been evaluated, everything we learned must be invalidated before trying `<p4>` or `<p5>`.
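A minimal sketch of why the guard is a hard barrier (the class name matches the snippet above, but the guard body is invented): a guard is arbitrary code, so it can rebind a name that a later pattern will load.

```python
class C:
    pass

def g0():
    global C
    C = int        # any later `case C(...)` now means `case int(...)`
    return False   # reject this case, so matching continues past the guard
```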
I realize that we're not doing custom matching yet, but there is a potential one-way door here - we have to make sure that any future custom matching protocol doesn't violate the assumptions made here by the optimizer. The best possible optimization can be obtained by putting a fairly strict restriction on custom matchers: they have to be pure functions with no side effects. So you aren't allowed to have a custom matcher that increments an internal counter and uses that to decide whether or not to match (I know that's a ridiculous example, but it's the best I could come up with in the moment.) Personally I think that custom matchers should be pure functions anyway, since it makes them easier to reason about. If we made a rule that said that any future custom match implementation had to be a pure function, and also that it was not permissible to re-bind the type name once it had been called, then the VM would only need to call the custom match function once for each input value, regardless of how many times that matcher appeared in a given match statement.
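A hypothetical sketch of the caching that rule would permit. No custom match protocol exists yet; the `__match__` name and the surrounding driver are assumptions, not a real API.

```python
def run_cases(subject, matchers):
    """Call each (assumed pure) matcher at most once per subject."""
    cache = {}
    for cls in matchers:
        if cls not in cache:
            cache[cls] = cls.__match__(subject)  # pure, so safe to reuse
        if cache[cls] is not None:               # None means "no match"
            return cls, cache[cls]
    return None
```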
But we don't have to say anything now. In the future, when we're ready to design a custom match protocol, we can take care to design the protocol and add constraints to the customization code to allow the optimizer to make reasonable assumptions.
I think this is fully pepped now that we have a section "Performance Considerations", right?
Yeah. While we don't explicitly come out and say "we can cache name lookups", it logically follows from the rest of the section.
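To spell out what that caching buys, a sketch in the final PEP 634 syntax (Python 3.10+; the `Point` class and subject are illustrative assumptions):

```python
from dataclasses import dataclass

@dataclass
class Point:
    x: int = 0
    y: int = 0

s = Point(0, 5)

match s:
    case Point(x=0, y=0):
        print("origin")
    case Point(x=0):
        print("on the y-axis")
    case Point():
        print("some other point")
# A conforming optimizer may load Point (and perform the corresponding
# isinstance check) once rather than three times here.
```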
Just to toss in my two cents here... My own proof-of-concept implementation used …

For Brandt's example, I basically agree with that. Consider, though:

```python
match s:
    case int(i) if i >= 0: ...
    case tuple((x, y)) if weird_magic(x, y): ...
    case str(): ...
```

In this example, the two guards could not possibly interfere with the subsequent patterns. At the time a guard expression is evaluated, we already know that none of the other cases could possibly match. The thing is, though, that looking into this kind of optimisation is clearly way beyond the scope of what we are doing. It merely demonstrates that the actual evaluation order might end up being quite complex and not strictly follow a left-to-right, top-to-bottom order.
I fully agree. It would be quite nice if we could restrict any future custom matching protocol to pure functions.

A final thought: patterns clearly have a tree structure. According to the usual left-to-right evaluation rules of Python, we might expect that pattern matching follows a depth-first approach. But we actually envision an algorithm that is free to choose a breadth-first approach instead, or any combination thereof. For instance, if you do not have a stack machine, you don't want to have intermediate values lying around if you can help it. Hence, in case of something like …

After all, we should keep in mind that patterns are a completely new thing in that load and store semantics (necessarily) mix and intermingle here. In particular: a pattern is not an expression, and hence the usual rules of expressions simply don't apply. Case in point: if you consider something like …
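To make the depth-first/breadth-first distinction concrete, here is a rough sketch of both strategies for a nested pattern along the lines of `Node(Leaf(x), Leaf(y))` (the `Node` and `Leaf` classes are invented for this illustration):

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    value: object

@dataclass
class Node:
    left: object
    right: object

def match_depth_first(s):
    # Fully match (and bind from) the left subtree before even
    # looking at the right one.
    if isinstance(s, Node) and isinstance(s.left, Leaf):
        x = s.left.value              # intermediate value held early
        if isinstance(s.right, Leaf):
            return x, s.right.value
    return None

def match_breadth_first(s):
    # Run all the cheap structural checks first; extract values only
    # once the whole pattern is known to match.
    if (isinstance(s, Node)
            and isinstance(s.left, Leaf)
            and isinstance(s.right, Leaf)):
        return s.left.value, s.right.value
    return None
```

Both functions accept and reject the same subjects; they differ only in when bindings and intermediate values come into existence, which is exactly the freedom an implementation would want to keep.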
I’m with you all the way, except for your last sentence. In that case the order is still prescribed by the language spec. But yes, we should resist the urge to prescribe how pattern matching is implemented. We should just constrain it (e.g. no slicing or negative indexing).
(Also no `__getitem__` without a `__contains__` check first, to handle defaultdict. That’s worth specifying.)
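The defaultdict hazard is easy to demonstrate: a bare `__getitem__` mutates the subject, while a `__contains__` check first leaves it untouched.

```python
from collections import defaultdict

d = defaultdict(list)
_ = d["missing"]          # __getitem__ *creates* the entry as a side effect
assert "missing" in d     # the subject changed just by being inspected

d2 = defaultdict(list)
if "key" in d2:           # __contains__ first: nothing is created
    value = d2["key"]
assert "key" not in d2    # the subject is left untouched
```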
I believe the decision tree should be opaque when executed in the match statement, regardless of the action of any guards or "interesting" classes. The analogy I'm thinking of is a `for` loop over a container that is mutated mid-iteration: CPython and other implementations don't allow this, by ensuring that the loaded iterator is opaque to Python code in a `for` loop.

Note this doesn't imply that there is some transactional view on these loads - all or nothing by the decision tree - when viewed from another thread. So this means that any lookups can be performed as needed. Obviously if the lookup is not cached, something interesting can be done, but hopefully docs would be enough here. Nor does that necessitate a specific bytecode architecture or helper objects - that's left to the implementation.
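For reference, the existing protection this analogy points at, runnable in CPython today: mutating a dict while a `for` loop iterates over it raises instead of corrupting the iterator.

```python
d = {"a": 1}
try:
    for key in d:
        d["b"] = 2   # mutate the dict mid-iteration
except RuntimeError as exc:
    print(exc)       # dictionary changed size during iteration
```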
@gvanrossum My intention with the last sentence was really to point out that we are mixing two different 'cultures' (rather than saying that it is without or against the specs), so any arguments along the lines of "but all expressions are evaluated like ..." are quite invalid. But you are right: it is certainly not the best example.

@jimbaker I like the idea of an "opaque decision tree" and think it is a good way to approach this. So, the decision tree is a black box where we feed in a subject, turn the cranks and see where the result is spit out. Then we apply any guards to check whether we are happy with the result. But here is the point: if we are not happy, we need to put the subject back into the black box and turn the cranks again. And in the middle, the guard might have tampered with our black box... Even by making the entire matching machinery opaque, there is still the issue that we might have to run it more than once during one matching process. And given Python's semantics, we might have to permit the guard to tinker with the machinery—even if we don't like it.
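A sketch of that feed-it-back-in loop (every name here is invented; this is a model, not a proposed implementation):

```python
def run_match(subject, tree, guards):
    # `tree` is the opaque black box; it yields candidate cases with
    # their bindings, in order.
    for case_index, bindings in tree.candidates(subject):
        guard = guards.get(case_index)
        if guard is None or guard(bindings):
            return case_index, bindings
        # The guard rejected this candidate, so we turn the cranks
        # again -- but by now the guard may have tampered with the box.
    return None
```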
Concretely, if we wrote … there could be a case where … I am fine with that.
Interesting: this is quite the other way round from what I was thinking about all along. Great example! And yes, I totally agree that our semantics should allow that.
@Tobias-Kohn The abstract semantics of the "black box" model is roughly …

I believe that @gvanrossum's model here is the practical/best-effort version, where we throw up our hands and say "undefined behavior" if the black box model's semantics is violated (so long as it is not possible to crash Python), because we didn't actually implement this as strictly as the black box model demands. (The black box model is really impractical! Just think of assignment expressions.)

Still, there is at least one possible optimization here with respect to all cases evaluated - this is effectively like a computed switch when doing this on, say, an enumeration. Whether that is really feasible or not, I don't know. I don't think it is in CPython, but it could be in implementations like PyPy, where the JIT observes that in this tight loop the same match statement is being executed repeatedly, as might be the case with a state machine or interpreter; and of course the enumeration is "stable".
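A sketch of the kind of loop that remark has in mind (the `Op` enumeration and the tiny interpreter are invented for illustration): a stable enumeration dispatched in a tight loop is the classic computed-switch candidate.

```python
from enum import Enum

class Op(Enum):
    ADD = 1
    SUB = 2
    HALT = 3

def run(program, stack):
    pc = 0
    while True:
        match program[pc]:          # the same match executed repeatedly...
            case Op.ADD:
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
            case Op.SUB:
                b, a = stack.pop(), stack.pop()
                stack.append(a - b)
            case Op.HALT:           # ...over a "stable" enumeration
                return stack
        pc += 1
```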
@jimbaker I think I understand now what you are saying. The black box was not really intended as describing the actual implementation. But even in the case of "assignment expressions" I would like to point out that while we borrow the syntax and idea of something that already exists in Python, its actual semantics might slightly differ. So, the walrus operator does not need to bind immediately.

Anyway, let me give an example for your model, just to make sure I get it right. Given the following code:

```python
match s:
    case int(x) if x >= 0: ...
    case int(x) if x < 0: ...
    case _: ...
```

this would be compiled to something roughly equivalent to:

```python
if isinstance(x := s, int):
    if x >= 0: ...
    elif x < 0: ...
else:
    ...
```

So, the `isinstance` check against `int` is performed only once.
Correct, and this is a good example of what can practically be folded.
As long as it’s also okay to do things strictly one at a time, repeating redundant lookups.

--Guido (mobile)
In short, should we be able to do simple optimizations like executing the pattern

```
C(<p0>) | C(<p1>)
```

as

```
C(<p0> | <p1>)
```

at runtime?

Wearing my implementer hat, my answer is "yes". We've made it clear that patterns don't always follow the same well-defined behavior as expressions, and explicitly noted that e.g. `__instancecheck__` calls may be cached. It seems like this could be a nice performance win for some common cases (e.g. `Node('+', lt, rt) | Node('-', lt, rt)` -> `Node('+' | '-', lt, rt)`).

But of course there are pathological cases where the act of matching a pattern could re-bind the class name using inspection or some other magic. I don't feel a need to accommodate those... in fact, the optimized behavior would be arguably more correct in these cases.
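For concreteness, the Node example in the final PEP 634 syntax (Python 3.10+; the dataclass definition is assumed for illustration). Both forms bind the same names, but the folded form needs only one class check.

```python
from dataclasses import dataclass

@dataclass
class Node:
    op: str
    lt: object
    rt: object

node = Node('+', 1, 2)

match node:
    case Node('+', lt, rt) | Node('-', lt, rt):  # Node checked twice
        print("unfolded:", lt, rt)

match node:
    case Node('+' | '-', lt, rt):                # Node checked once
        print("folded:", lt, rt)
```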
Thoughts?