Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix #31758: out of bounds write in sparse broadcast #31763

Merged
merged 1 commit into from
Apr 19, 2019
Merged

Conversation

fredrikekre
Copy link
Member

No description provided.

@fredrikekre fredrikekre added sparse Sparse arrays bugfix This change fixes an existing bug broadcast Applying a function over a collection labels Apr 18, 2019
@mbauman
Copy link
Member

mbauman commented Apr 18, 2019

Since it took a bit to figure out, here's Fredrik's and my understanding of what this code means to do: it's trying to get a good maximum bound for the number of remaining stored entries in the array. We're in the zero-preserving case, so anywhere that all the arguments have zeros, the function evaluation will return zero. Now computing the exact union across all the arguments is an expensive task. This (over) estimates it by simply looking at the total number of nonzeros in all arguments; that's _sumnnzs(As...) . We can further refine that estimate because we're partway through the algorithm — we know how many total stored values we've seen in the arguments and how many stored values they ended up creating. So we add in the total number of values we've stored so far (Ck) and subtract out the total number of stored values in all the arguments that led to that result (sum(ks) - N — the - N accounts for the 1-based indices across the N arrays). Of course this could be more than the total possible number of elements, so we ensure it's no larger than length(C) with a min.

Put that all together and you can see that the entire estimate algorithm should all go inside the same argument of that min call.

@mbauman
Copy link
Member

mbauman commented Apr 18, 2019

Given that this often over-allocates, I wonder how much more efficient (or not) a simple push! based algorithm would be. We also wondered how much @inbounds actually affects performance in such a complicated algorithm. But those can be investigations for another day — this patch is straightforward and lets not derail it by looking into other possible algorithms.

@fredrikekre fredrikekre merged commit c0c6f96 into master Apr 19, 2019
@fredrikekre fredrikekre deleted the fe/bc-sparse branch April 19, 2019 07:36
This was referenced Apr 19, 2019
KristofferC pushed a commit that referenced this pull request Apr 20, 2019
KristofferC pushed a commit that referenced this pull request Apr 20, 2019
KristofferC pushed a commit that referenced this pull request Apr 20, 2019
@Sacha0
Copy link
Member

Sacha0 commented Apr 20, 2019

Good catch! :)

We also wondered how much @inbounds actually affects performance in such a complicated algorithm.

Depends on which @inbounds you are thinking of; some of them impact performance substantially. I have a now-stale perf test suite for this code somewhere that I've long wanted to package up for nanosoldier in that mythical thing that is free time... 😄

@KristofferC
Copy link
Member

I'm surprised that the inbounds annotations have a large effect since there are almost always conditionals and non propagate_inbounds function calls in the loop body. Do you recall any such example?

@Sacha0
Copy link
Member

Sacha0 commented Apr 20, 2019

I recall perf measurements benefitting from certain inbounds annotations, but what with the many intervening months cannot recall precisely which measurements and annotations 🤷‍♀️. IIRC I started without the annotations, but found that I needed at least some subset of them to close the gap between the then-new generic implementations and the preexisting specialized implementations. Best!

KristofferC pushed a commit that referenced this pull request Feb 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
broadcast Applying a function over a collection bugfix This change fixes an existing bug sparse Sparse arrays
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants