-
-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added SymPEP for MatchPy dependency #3
base: main
Are you sure you want to change the base?
Conversation
Related PR on SymPy: sympy/sympy#20849 |
@asmeurer here a draft for a SymPEP to include MatchPy into SymPy. |
I guess that SymPEPs like this should be discussed on the mailing list. |
|
Benchmarks run on RUBI (Rule-based integration) have shown that even large sets | ||
of about 10 000 patterns can be run in extremely fast times, once all rules are | ||
loaded. In particular, MatchPy was able to beat Wolfram Mathematica in speed on | ||
large sets of pattern matching rules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you provide some actual timing comparisons (or point to them) so that we can see what there is to gain in a specific way?
If we add a hard dependency, we need to feel confident that SymPy developers can and will maintain the dependency if necessary. I think that is my only concern for adding any hard dependency that is a small project with few developers maintaining it. Symengine was added as an optional dependency to sympy.physics.vector/mechanics and it causes us a lot of difficulties because the sympy.physics.vector/mechanics developers are not going to develop symengine (outside of our current skillset and interest). So I think this pep needs to address how things will be handled if matchpy isn't developed or can't keep pace with SymPy. |
I don't think that the way that SymPy currently integrates with matchpy is quite right and this relates to my comments here: It would be good to use matchpy but to do it properly we need something different from Basic/Expr as they are currently implemented. |
If we use the "do it properly" as the bar to jump to get changes into SymPy, then it mostly just discourages contributions. If someone has the energy to make matchpy work and keep it working as a dependency then I think we should enable them to do it. |
Or maybe, to state that differently, if someone makes a sympep we deny, then we should state the reasons why and then close it, so the hopeful person can move on to other things and other proposals. @Upabjojr has been trying for this for years and I don't see what the real blocker is. |
Beware that the algorithms behind MatchPy are quite complicated. I wouldn't recommend modifying it too much, unless you have enough time to work available. |
The SymPEP here does not really explain what Francesco's actual intended plan is. I think that there are two parts to it:
Leaving RUBI aside for a moment I think that Francesco and I are in agreement about the desired end state here:
Loosely the above means moving SymPy towards something that more resembles Mathematica's evaluation engine. I think the difference though between Francesco's plan and mine is how we would expect to reach that end state:
There are many reasons that I think building a new system from the ground up is needed but being able to work better with things like matchpy is precisely one of those reasons. The existing design of Basic with all of its automatic evaluation and expensive subroutines is not really suitable for e.g. defining all of the rules that would be used to implement the matchpy-based evaluation scheme described above. Likewise you really need to be able to do things like conjure up new expression heads any time (as is done in Mathematica) but the design of Basic means that every expression head needs to be a class which makes everything a mess. I think that if this is Francesco's intended plan then he should spell out at least some part of it in the SymPEP so that a reader has some idea where this is intended to lead. Of course we can implement something now that exposes matchpy for use by users as is intended by this SymPEP but there needs to be something clearly tangible for end users that is achieved by adding matchpy as a hard dependency. Note that downstream libraries can just add the dependency on matchpy themselves if they want the features it enables so we really need to see some benefit for the end user in exchange for installing matchpy before adding it as a hard dependency. As it stands the SymPEP proposes to add matchpy as a hard dependency but does not offer any concrete benefit in exchange for doing so rather than keeping it as an optional dependency. I think that the reason Francesco wants it to be a hard dependency is because of the core evaluation scheme that I outlined above but the SymPEP does not say anything about that and personally I don't think that making those changes incrementally is the best route forwards (although I think we agree about the desired end state). As for RUBI: If RUBI is actually working now then I would be absolutely fine with adding it back to sympy and adding the necessary dependency on matchpy for it to work. Last time I checked RUBI did not actually work though or at least I couldn't get it to do anything useful when I tried. |
That has never been my understanding, but I may certainly have misunderstood. I thought that matchpy was simply going to be added so that it can be used in SymPy internally and for now only specifically for getting RUBI based integration to work. I did not know the intention is to rewrite the core based on matchpy. |
I might be reading too much into this myself. Francesco can clarify if either of us has misunderstood. If the intention is only for SymPy to use matchpy internally then we should clarify how this brings (or will bring) a concrete benefit to users before adding a hard dependency. If the intention is to make RUBI work then I don't think that having SymPy add a hard dependency on matchpy actually helps with any of the main problems. Perhaps I should try RUBI again but it barely worked when I tested it and I have not heard of anything since that would change that. I don't see how SymPy adding a dependency on matchpy would help to change that situation though. Often PEPs are expected to have a "reference implementation" which shows that the general idea for the code works and can be used to test what sort of improvements it might bring e.g. to compare timings. Here that could just be a pull request with something like "this makes I think I understand where Francesco is coming from based on reading between the lines in the PEP and based on other comments made in other contexts. The PEP as it stands does not explain these things as I understand them though. The blocker here (for me) is simple: the PEP does not give any reason for matchpy to be a hard dependency rather than continuing to be an optional dependency. Since that is precisely what the PEP proposes what I am saying is that the PEP does not give any justification for the actual change that it proposes. |
The SYMPEP is clearly not finished, so we probably shouldn't be debating it until it is. Maybe these should be marked as drafts until they are in a form that is ready for review. |
Regarding Rubi, we had two great GSoC years with good students who ported most of the work. The usage of MatchPy inside the Rubi module was not generalisable to the rest of SymPy. That's why I created Unfortunately it's been a while that no good GSoC candidates have shown up to continue the work on Rubi. Maybe we should find a way to better advertise the GSoC program for SymPy? Concerning the goals, they are mainly restricted by time. We could reimplement SymPy's core with pattern matching rules and then finish matchPyCpp to compile those rules into C++ decision trees and have a very fast, yet easy to extend SymPy core. Most of the hard algorithms are already implemented and this project would just require to fix and test the code. But will we find good GSoC students to do that? |
To sum it up: the goals are time constrained. Depending on how much dedicated programming time is dedicated to this project, we could implement either a small test or a full SymPy rewrite. I keep stressing this point: we need to find good GSoC students, we need to better advertise SymPy's GSoC participation to university students. I'm pretty sure there are good candidates out there, we just need to let them know they can be candidates. |
I suggest changing this PR to draft until someone is prepared to complete and move it forward. I don't think a GSoC student will have the background to be able to draft and champion a SYMPEP such as this (at least not a new GSOC students). |
The ones we had for Rubi were pretty good. It's only a matter of waiting for the right candidate. |
I agree. What is needed is really a way to break this down into incremental chunks of work somehow. I have a plan for this that I will explain shortly. My hope is that you and others will want to work together with me on this and that it will lead to many nicely packaged GSOC projects. Currently I think that making changes to the core is quite difficult which in turn makes it difficult to make any straight-forward GSOC project that could improve things. Let's leave the "reimplement the core" idea aside for the moment and focus on the proposal here to add matchpy as a hard dependency and what more immediate improvements it might bring. The question asked above was:
I don't think that there is any real blocker. Currently it is possible to make progress on the things that matchpy would be used for without adding it as a hard dependency yet. It is not clear to me what an immediate next step would be that would be enabled by adding matchpy as a hard dependency. If there is some immediate improvement that could be brought but that would only work with matchpy as a hard dependency then the PEP just needs to explain what that is and then I don't think that anyone would object. My only objection is to the idea that we add a hard dependency now without (yet) being able to offer users any immediate improvement that results from that. There is a nonzero cost to both end users and downstream libraries, packagers, Python distributions, packaging systems etc that results from SymPy adding matchpy as a hard dependency. I just don't think that we should add a hard dependency if we are not in an immediate position to say that it brings some benefit that would not be otherwise (easily) achievable. If the features it enables are optional and unlikely to be used my most SymPy users then for most people the hard dependency brings a small downside with no corresponding upside in return. |
Some ideas:
|
Or use it in manualintegrate. This is like RUBI (it's a pattern matching integrator), but it's much smaller. The current implementation uses hand-rolled pattern matching system that would be simplified greatly if it just used normal patterns as expressions. |
I think that manualintegrate is still quite complicated. The easiest target would be Laplace transforms. @hanspi42 has been working on converting the That is a prime target for using matchpy's simultaneous efficient matching of multiple patterns against a single target expression. Already Laplace transforms are faster, better more complete because of the rule-based approach but if we could make them extremely fast with matchpy then this would really be a nice demonstration of SymPy's use of matchpy as well as a very nice feature of SymPy. That would be the kind of "benefit for end users" that could justify adding matchpy as a dependency (if the intention would be to extend that sort of usage of matchpy further in future). I suspect that there would be some difficulty in transitioning to use matchpy in place of SymPy's
I'm not sure that matchpy would allow matching e.g. a missing term from an This example might not be a good one but in general there will be differences and that means that the rules would need to be defined differently. Most likely a larger set of rules would be needed with matchpy than with SymPy's existing |
I think actually that |
This is supported, although in a different manner. You can create optional wildcards that have a default value in case there are no matches (it's a bit clunkier because it's unaware of the neutral element of the operation, you have to specify it manually). Furthermore MatchPy supports the "one identity" property for nodes, which means that the nodes with only one argument tagged as "one identity" will be considered equivalent to just that single argument. Nodes like Add can be tagged that way. |
In [1]: from sympy.utilities.matchpy_connector import *
In [2]: a = WildDot("a", optional=1)
In [3]: b = WildDot("b", optional=0)
In [4]: from sympy.abc import t
In [5]: p = cos(a*t + b)
In [8]: replacer = Replacer()
In [11]: from sympy import Tuple
In [12]: replacer.add(p, Tuple(a, b))
In [13]: replacer.replace(cos(t))
Out[13]: (1, 0) |
No description provided.