Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster cumsum & cumprod. Fixes #7342. #7359

Merged
merged 1 commit into from
Jun 22, 2014
Merged

Faster cumsum & cumprod. Fixes #7342. #7359

merged 1 commit into from
Jun 22, 2014

Conversation

timholy
Copy link
Member

@timholy timholy commented Jun 21, 2014

Should we export cumsum! and cumprod!?

Note if you include the copy operation that would have been needed in #7342 (comment), this implementation is actually a little faster because it only needs to make one pass through the data rather than two.

@JeffBezanson
Copy link
Member

That was fast! 👏 🍸

@lindahua
Copy link
Contributor

@timholy You probably want a special path for cases where dim = 1.

@lindahua
Copy link
Contributor

I agree with exporting cumsum! and cumprod!.

@ViralBShah
Copy link
Member

Yes, we should probably export cumsum! and cumprod!.

@JeffBezanson
Copy link
Member

I'm never in favor of the ! functions, but we certainly have tons of them already so if you insist, I guess why not.

@ViralBShah
Copy link
Member

If only the code without ! was magically as fast...

@JeffBezanson
Copy link
Member

Well, this PR makes cumsum 10x faster. The ! does not add much on top of that. Exporting so many of these will make it harder to get rid of them in the future.

@timholy
Copy link
Member Author

timholy commented Jun 22, 2014

Right, once we have a faster gc the ! will only make a difference for small arrays. With our current gc, when testing in a loop even for size 5000x800, gc accounts for ~30% of the time!

EDIT: Of course, the bigger benefit in principle arises with SharedArrays or mmapped arrays---the ! form provides a natural mechanism to exploit parallelism or do things that might otherwise exhaust your memory. Though currently we lack a good framework for automatically generating SharedArray variants of functions.

@timholy
Copy link
Member Author

timholy commented Jun 22, 2014

You probably want a special path for cases where dim = 1

True enough. Not as important as for reductions, because you still have to access each element again to store the result, but it does save one array-access per element. See if this is better (~20% additional speedup for dim = 1).

@lindahua
Copy link
Contributor

I understand that we don't want too many !-functions. However, cumsum is the kind of functions that the inplace version is often needed.

@lindahua
Copy link
Contributor

@timholy I am happy with the latest commit. This fix comes amazingly fast!

@ViralBShah
Copy link
Member

@JeffBezanson I don't think that exporting matters. Even if something is not exported, but is widely used, it is just as difficult to take it away. I am hoping we will get things to be fast enough so that we can at least unexport the ! functions by 1.0.

@ViralBShah
Copy link
Member

Also, in a perverse way, now that this is 10x faster, it is all the more important to have the ! version. Before, cumsum! would have only been 5% faster and not worthwhile, but now it is much faster than cumsum.

timholy added a commit that referenced this pull request Jun 22, 2014
@timholy timholy merged commit b6ebf91 into master Jun 22, 2014
@timholy timholy deleted the teh/cumsumprod branch June 22, 2014 11:24
@tkelman
Copy link
Contributor

tkelman commented Jun 22, 2014

I'm confused, what's the motivation behind not liking the in-place versions of functions? Unless there are plans afoot to somehow make them unnecessary for performance, in which case I don't understand how that would be possible. A standardized mechanism for doing things in-place by default and automating the copying definitions would be good, of course.

@jakebolewski
Copy link
Member

Well if you switch to array view semantics instead of copy be default then the in place method performance difference will likely be less dramatic. It is still nice to have an escape hatch if you don't want the overhead of creating even a lightweight view object.

@lindahua
Copy link
Contributor

@jakebolewski I am afraid that I don't see how the array views are relevant to the discussions here.

Even with array views, you still need inplace functions to avoid creating temporary arrays.

@StefanKarpinski
Copy link
Member

I definitely think they should be exported – these are quite useful.

On Jun 21, 2014, at 6:47 PM, Dahua Lin notifications@github.com wrote:

I agree with exporting cumsum! and cumprod!.


Reply to this email directly or view it on GitHub.

@GunnarFarneback
Copy link
Contributor

I'm biased by Matlab to prefer a default of 1 but don't have a strong feeling about it for the multidimensional case. For the vector case, however, having to explicitly specify dim 1 is just annoying.

@timholy
Copy link
Member Author

timholy commented Sep 14, 2014

Yes, you're right about handling vectors. To me that's by far the strongest argument in favor of defaulting to 1 for all arrays.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants