Blr pedestal and performance #724
This PR is true to its advertised claims:
- Refactorises the blr function, moving the baseline calculation outside the cythonised functions. In this way alternate functions can be tested.
True, and tests are added.
- Also changes the cythonised PMT loop for a standard map to improve performance. Per-event time reduced from ~200 ms to ~25 ms on local machine.
Also true, and a demonstration is added.
The code is concise and to the point. Good job.
It does exactly what it claims.
IIUC, moving some code from Cython to pure-python-using-numpy resulted in a speedup by a factor of 8. Do you have some high-level explanation for why the Cython was that much slower, or some general lesson that can be learned from this?
Very empirical opinion maybe, but the cython function that's replaced used an The baseline calculation didn't really make any difference to performance (maybe even slightly slower, but within error), but moving it out gives the user more flexibility. My takeaway, without really having used cython, is that you don't seem to gain anything unless you're doing something novel that can't use the internal optimisations of numpy or python itself.
Yes, in general you want to let Numpy do any looping over its arrays, rather than writing your own loops over numpy arrays. (Well, I have seen Numba make hand-coded loops run faster than internal Numpy loops, but the gains were certainly not worth the hassle.)
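To make the rule of thumb concrete, here is a minimal sketch comparing a hand-written loop with the equivalent Numpy expression. The function names and the baseline-subtraction operation are hypothetical and chosen only for illustration; this is not the code touched by this PR.

```python
import numpy as np

# Hypothetical illustration only: neither function is taken from the PR.

def subtract_baseline_loop(waveforms, baselines):
    """Hand-written double loop over a (n_pmts, n_samples) array."""
    out = np.empty(waveforms.shape, dtype=float)
    for i in range(waveforms.shape[0]):
        for j in range(waveforms.shape[1]):
            out[i, j] = baselines[i] - waveforms[i, j]
    return out


def subtract_baseline_vectorised(waveforms, baselines):
    """Same result, but the looping is left to Numpy via broadcasting."""
    return baselines[:, np.newaxis] - waveforms


# Made-up input: 12 PMTs with 1000 samples each.
rng = np.random.default_rng(0)
waveforms = rng.integers(2400, 2600, size=(12, 1000))
baselines = waveforms.mean(axis=1)

# Both versions agree; only the place where the loop runs differs.
assert np.allclose(subtract_baseline_loop(waveforms, baselines),
                   subtract_baseline_vectorised(waveforms, baselines))
```

Timing the two with `timeit` on realistic waveform sizes should show whether the difference matters for a given workload.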
Indeed, this is the rule of thumb that everyone should follow (for Numpy; for plain Python it's a completely different story). If this is not obvious to everyone, then it's worth highlighting it here: Hand-coded Cython is unlikely to outperform things that Numpy can do with internal loops! So one might say that the problem with the original Cython function was that
Thanks for the summary ... and the PR!
Refactorises the blr function, moving the baseline calculation outside the cythonised functions. In this way alternate functions can be tested.
Also changes the cythonised PMT loop for a standard map to improve performance. Per-event time reduced from ~200 ms to ~25 ms on a local machine.
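The shape of the change described above can be sketched roughly as follows. All names here (`mean_baseline`, `median_baseline`, `correct_pmt`, `correct_event`) are hypothetical, and `correct_pmt` is a trivial stand-in that only subtracts the baseline rather than performing the actual baseline restoration; the point is only that the baseline is computed outside the per-PMT function and that a standard `map` runs over the PMTs.

```python
import numpy as np

# Hypothetical sketch: none of these names come from the PR itself.

def mean_baseline(waveform):
    """One possible baseline estimate: the mean of the waveform."""
    return float(np.mean(waveform))


def median_baseline(waveform):
    """An alternative estimate that can be swapped in for testing."""
    return float(np.median(waveform))


def correct_pmt(waveform, baseline):
    """Stand-in for the per-PMT function: it receives the baseline
    instead of computing it, and here only subtracts and inverts."""
    return baseline - waveform.astype(float)


def correct_event(waveforms, baseline_function=mean_baseline):
    """Compute baselines outside the per-PMT function, then map over PMTs."""
    baselines = [baseline_function(wf) for wf in waveforms]
    return np.array(list(map(correct_pmt, waveforms, baselines)))


# Made-up event with two PMTs and four samples each.
event = np.array([[10, 10, 10, 2],
                  [5, 5, 5, 5]])
corrected_mean = correct_event(event)
corrected_median = correct_event(event, baseline_function=median_baseline)
```

Because the baseline function is an argument, trying a different estimator needs no change to the per-PMT code.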