The current routine for series addition/subtraction can be extremely slow in particular circumstances. For instance, when one subtracts two identical series, the complexity is quadratic in the number of terms, whereas it should normally be linear.

This is related to an implementation detail inside the `hash_set` class, but it can be worked around by re-coding the series addition/subtraction routines.
The problem arises when we get the `begin()` iterator on this line:

```cpp
m_container.begin();
```
What happens here is that when we subtract a series from itself, we first create a copy of it (to initialise the return value of the operation), and then we remove terms from the beginning of the copy until no terms are left (this is the general subtraction algorithm - it is implemented as the addition of a negated copy).
The key thing is that, for performance reasons, Piranha's hash set is implemented in such a way that the cost of asking for the `begin()` iterator is proportional to the number of empty buckets before the first non-empty one. So, as we remove more and more terms from the head of the series, each successive call to `begin()` costs more, as it must traverse an increasing number of empty buckets before finding the first non-empty one.
This is normally not a problem, as in normal operations the number of non-empty buckets in a series is roughly proportional to the total number of buckets. But it does become a problem in corner cases like this one, in which eventually there are zero non-empty buckets.
This is one of the key differences between Piranha's hash set and `std::unordered_set`. The latter provides a series of guarantees on the constant complexity of operations such as traversal, erasure, etc., which Piranha's hash set does not. See this post for more information:
I am not inclined to switch `hash_set` to an implementation that provides such guarantees, as they have a non-negligible cost, both in terms of performance and code complexity, and also with respect to thread safety.
In the next few days I'll experiment with a new add/sub implementation that avoids this problem in debug mode, and I will do a code review in order to understand whether corner cases like this can actually arise in practice.