
NumPy Computation Efficiency #326

Closed
lkilcher opened this issue Jan 14, 2016 · 12 comments

@lkilcher

The last paragraph of the NumPy support doc page states:

This behaviour introduces some performance penalties and increased memory usage. Quantities that must be converted to other units require additional memory and CPU cycles. On top of this, all ufuncs are implemented in the Quantity class by overriding __array_wrap__, a NumPy hook that is executed after the calculation and before returning the value. To our knowledge, there is no way to signal back to NumPy that our code will take care of the calculation. For this reason the calculation is actually done twice: first in the original ndarray and then in the one that has been converted to the right units. Therefore, for numerically intensive code, you might want to convert the objects first and then use directly the magnitude.
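
For reference, the workaround recommended in the last sentence of that paragraph looks roughly like the sketch below (assuming a standard pint UnitRegistry; the variable names are made up for illustration):

import numpy as np
import pint

ureg = pint.UnitRegistry()
angles = np.linspace(0.0, 10.0, 1000000) * ureg.degree

# Slow path: every ufunc call pays the unit-handling overhead.
slow = np.cos(angles)

# Faster path for numerically intensive code: convert once,
# then operate on the plain ndarray magnitude.
rad = angles.to(ureg.radian).magnitude
fast = np.cos(rad)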

I'm curious whether anyone is thinking about resolving the 'calculation is actually done twice' issue using some of the recent changes to the NumPy subclassing machinery. In particular, it seems like either __array_prepare__ or the forthcoming __numpy_ufunc__ could be used to modify input arrays before NumPy computes values in the ufunc call. I know Quantity is not technically a numpy.ndarray subclass, but since you're using the __array_wrap__ functionality, I thought this might be something that could work.

I'd be happy to work on this unless you see a reason that it won't work.

I'm also interested in why pint doesn't have a Quantity-like subclass of numpy.ndarray, but that's not as critical to me at this point as this performance issue.

@hgrecco
Owner

hgrecco commented Jan 14, 2016

Thanks for your comments.

As far as I know, people have used three "solutions" for this:

  1. Wrapping: creating a numpy wrapper module. This forces users to import functions like cos from it, making it difficult to reuse code (a rough sketch of this approach is shown after the list).
  2. Black Magic: monkey patching numpy to do what is proposed in (1), but automatically. Works most of the time, but it can be dangerous.
  3. Ignoring it: taking the magnitude and returning the output. This is wrong, but there is a funny story related to it. For example, it was once said that Pint does not support numpy.cos because numpy.cos([1, 2, 3] * ureg.meter) raises an exception. Raising an exception is the right thing to do, as the argument is not dimensionless.
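
The sketch below illustrates approach (1). It is hypothetical and not part of pint: a thin module that converts Quantity arguments before handing the bare magnitudes to NumPy.

import numpy as np
import pint

ureg = pint.UnitRegistry()

def cos(x):
    # Accept either a plain array or a pint Quantity.
    if isinstance(x, ureg.Quantity):
        # Convert to radians once, then operate on the bare magnitude.
        return np.cos(x.to(ureg.radian).magnitude)
    return np.cos(x)

# Callers must import cos from this wrapper instead of from numpy,
# which is exactly the code-reuse problem described in item 1.
print(cos(np.array([0.0, 90.0, 180.0]) * ureg.degree))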

It would be great to tackle this. To my knowledge, nobody is trying, as the source of the problem is NumPy's API. Maybe the recent changes in NumPy allow us to do this. I would be happy to help you and discuss it with you if you want to work on it!

Regarding the other question, we have tried two directions. See the _ndarray branch for part of that work and this issue in numpy (numpy/numpy#4072) for a different approach.

@lkilcher
Author

OK. Here is a first attempt at trying to make this happen: lkilcher@b4e2b4b

Note that, according to the docs, __numpy_ufunc__ supersedes __array_wrap__ + __array_prepare__ when it is present. This appears to work as expected with py27 and a build of bleeding-edge numpy (git clone).

I ran the unit tests, and this passes all but 5 of them. Two of those failures are due to the non-standard version number of the current numpy-dev branch.

The other three come up in an __ipow__ test. I haven't exactly figured out what is going on here, but I'm beginning to suspect that this may be an upstream issue in numpy. I created a minimal example of the bug here: lkilcher@abb0aaa. In particular, note that the test.py script fails with TypeError: unsupported operand type(s) for ** or pow(): 'numpy.ndarray' and 'Quantity'. This failure occurs for stable numpy (1.10.4) and a dev build. However, if you don't define the __numpy_ufunc__ method, the script runs fine. Also note that this issue only seems to arise for **=. Other operators/statements (**, *=, etc.) seem to work fine. This may have something to do with the second note under the __numpy_ufunc__ docs, but I haven't exactly figured out why it's only an issue for **= and not *= or others.
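
For context, a minimal reproduction of the kind described above might look like the sketch below. The class and variable names are made up, and the dispatch behaviour in question only appears on a NumPy dev build of that era that implements the never-released __numpy_ufunc__ hook; on stable NumPy the attribute is simply ignored, so this sketch documents the shape of the reproduction rather than something that shows the bug on a released NumPy.

import numpy as np

class Quantity(object):
    # Stand-in for a unit-carrying wrapper around an ndarray.
    def __init__(self, value):
        self.value = np.asarray(value)

    # The mere presence of this hook changes how a dev-build NumPy
    # dispatches mixed operations such as ndarray **= Quantity.
    def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
        args = [x.value if isinstance(x, Quantity) else x for x in inputs]
        return getattr(ufunc, method)(*args, **kwargs)

    def __pow__(self, other):
        return Quantity(self.value ** other)

    def __rpow__(self, other):
        return Quantity(other ** self.value)

a = np.array([1.0, 2.0, 3.0])
two = Quantity(2.0)
b = a ** two   # on the dev build, dispatches to Quantity.__rpow__ and works
a **= two      # reported above to raise TypeError when __numpy_ufunc__ is defined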

Also, at this point I've mostly just copied code from your __array_wrap__ method. Assuming that will be left in place for backward compatibility, it may be worthwhile to move the duplicated code into shared helper methods. I didn't do anything like that for this demonstration code, partly because I couldn't quite decipher what you are doing with the long try: ... except: block in __array_wrap__.

Anyway, let me know what you think.

@hgrecco
Owner

hgrecco commented Jan 27, 2016

Sorry for the delay. Nice work! I would be very interested in some benchmarks. Can you try your branch with and without your new code (both with the same versions of Python and NumPy)?

Just be sure that pint is not installed (so you get your working version), move to the root pint folder (the one containing setup.py), and run on Linux/OSX:

PYTHONPATH=. python bench/bench.py

If you are working on Windows, just copy bench.py from inside the bench folder and run:

python bench.py

(I think I will put bench in the root just to make it easier to explain!)

@hgrecco
Owner

hgrecco commented Jan 27, 2016

Regarding your questions:

  1. I would leave __array_wrap__ + __array_prepare__ in place; maybe prepend a comment to the definitions saying so.
  2. Regarding Number ** Quantity, I agree that it might be because __rpow__ takes precedence. Notice that the implementation is conceptually different from __rmul__. Maybe you can just copy the definition of __rpow__ into your test script and try myrpow(l, two) as a way to start debugging in a simpler way (a rough sketch of this is shown after the list).
  3. I think the handling part (and related instance attributes) can go away.
  4. The try: ... except: block in __array_wrap__ was there to make sure that __handling would be properly cleaned up. It also helped to debug some problems coming from the lack of examples about defining __array_wrap__. It can go now!
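
To illustrate the suggestion in item 2: the idea is to pull the logic of Quantity.__rpow__ out into a standalone function so it can be called directly, outside of operator dispatch. The sketch below is a deliberately simplified stand-in, not pint's actual __rpow__ implementation; the names mirror the myrpow(l, two) call mentioned above.

import numpy as np
import pint

ureg = pint.UnitRegistry()

def myrpow(other, self_):
    # Simplified stand-in for Quantity.__rpow__(self_, other):
    # the exponent must be dimensionless.
    if not self_.dimensionless:
        raise pint.DimensionalityError(self_.units, 'dimensionless')
    return other ** self_.to('').magnitude

l = np.array([1.0, 2.0, 3.0])
two = ureg.Quantity(2.0, '')   # a dimensionless Quantity
print(myrpow(l, two))          # exercises the logic without any operator dispatch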

@lkilcher
Author

OK. I've run the bench tests. I assume you're only interested in the bench_numpy.yaml output, so that's the only piece I'm including. The results for bench_base.yaml are, unsurprisingly, all basically the same.

Using NumPy 1.10.4 (which predates the __numpy_ufunc__ functionality), the results of the bench tests are basically the same regardless of whether I use pint/master or my lkilcher/dev branch. This makes sense because NumPy 1.10.4 is unaware of __numpy_ufunc__, so it simply uses the old __array_wrap__ + __array_prepare__ functionality in both cases. The results for this case are:

pint master + NumPy 1.10.4

5.77e-07 NumPy cosine (base)
5.85e+01 NumPy cosine radian (normalized)
7.09e+01 NumPy cosine dimensionless (normalized)
1.05e+02 NumPy cosine degree (normalized)

When I switch to NumPy dev, the results change dramatically for both pint master and lkilcher/dev:

pint master + NumPy dev

5.82e-07 NumPy cosine (base)
6.77e+02 NumPy cosine radian (normalized)
6.99e+02 NumPy cosine dimensionless (normalized)
5.73e+02 NumPy cosine degree (normalized)

lkilcher/dev + NumPy dev

5.73e-07 NumPy cosine (base)
4.65e+01 NumPy cosine radian (normalized)
5.55e+01 NumPy cosine dimensionless (normalized)
8.71e+01 NumPy cosine degree (normalized)

What's interesting here is that pint master + NumPy dev is an order of magnitude slower than pint master + NumPy 1.10.4. Perhaps this is a result of my NumPy 1.10.4 build having been optimized in some way? In that case, though, why is the NumPy cosine (base) test the same in all three cases? The other option is that something else in the NumPy core has changed.

Anyway, on the plus side, the lkilcher/dev + NumPy dev results are an order of magnitude faster than pint master + NumPy dev, and about 20% faster than pint master + NumPy 1.10.4.

I haven't figured out the __ipow__ issue yet, but I'll keep looking at it.

@hgrecco
Owner

hgrecco commented Jan 29, 2016

The only explanation that I have is that NumPy dev has actually broken (slowed down) __array_wrap__ + __array_prepare__.

In any case, as you mentioned, your patch is definitely a good thing. I would suggest cleaning up the handling part and trying to fix the __rpow__ issue, and then we can merge it.

@lkilcher
Author

OK, I've boiled this down as much as I know how. I'm beginning to think this is an upstream bug where __rpow__ isn't triggered appropriately for **= when __numpy_ufunc__ is present. I created this gist in an attempt to document and describe the issue. Am I missing something, or does this look like a NumPy bug to you too?

@hgrecco
Owner

hgrecco commented Jan 30, 2016

I think your gist shows it quite clearly. Unless we are missing something obvious, I think it would be good to have some feedback from the NumPy Devs using your gist.

@lkilcher
Author

OK. Let's see what they say.

@lkilcher
Author

Well, it looks like this is all for naught. Unfortunately, __numpy_ufunc__ is not stable at all and has no release timeline. It sounds like even the name is likely to change. I'll go ahead and close this for now. If this ever gets resolved, perhaps this thread will be useful again. ;)

Should we stash these changes in a branch of your repo so that I don't accidentally delete them?

@hgrecco
Owner

hgrecco commented Jan 30, 2016

I will definitely pull your changes into a new branch. I am sure we will use them. The current __array_wrap__ and related hooks have some problems, and something like __numpy_ufunc__ is a step in the right direction. Thanks!

@hgrecco
Owner

hgrecco commented May 12, 2017

It seems that there might be a new way to deal with this: __array_ufunc__, coming in NumPy 1.13 (see the Release Notes and NEP).
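
For anyone picking this up later, the sketch below shows what the __array_ufunc__ protocol (NumPy >= 1.13) makes possible for the problem in this thread: intercept the ufunc call, convert units up front, and compute only once. The toy class here is illustrative only, not pint's actual implementation.

import numpy as np

class Q(object):
    # Toy quantity: a magnitude plus a conversion factor to radians.
    def __init__(self, magnitude, to_radian=1.0):
        self.magnitude = np.asarray(magnitude)
        self.to_radian = to_radian

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if ufunc is np.cos and method == '__call__':
            # Convert to radians once, then run the ufunc a single time
            # on the bare magnitude; no second calculation is needed.
            (x,) = inputs
            return ufunc(x.magnitude * x.to_radian, **kwargs)
        return NotImplemented

deg = Q([0.0, 90.0, 180.0], to_radian=np.pi / 180.0)
print(np.cos(deg))   # NumPy hands the call to Q.__array_ufunc__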
