Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extra::stats suffers from catastrophic cancellation #10851

Closed
huonw opened this issue Dec 7, 2013 · 11 comments
Closed

extra::stats suffers from catastrophic cancellation #10851

huonw opened this issue Dec 7, 2013 · 11 comments
Labels
E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion.

Comments

@huonw
Copy link
Member

huonw commented Dec 7, 2013

It is using the naive algorithm which is numerically unstable; it should be using something like Kahan summation.

extern mod extra;
use extra::stats::Stats;

fn main() {
    // prints 0, but should be 1
    println!("{}", [1e20, 1.0, -1e20].sum());
}
@huonw
Copy link
Member Author

huonw commented Dec 7, 2013

I'm happy to mentor someone through this as a relatively easy first bug. (Feel free to talk to me here, or ping me (nick: dbaupp) on IRC.)

@g3xzh
Copy link
Contributor

g3xzh commented Dec 9, 2013

If I've understood correctly, I have to implement Kahan summation here.
Probably to change this part |p,q| p+q*.
Right?

@huonw
Copy link
Member Author

huonw commented Dec 9, 2013

Yes, that's exactly correct; you can either convert the .fold into a for loop or just make the fold pass through a tuple with the two piece of state, i.e. .fold((0.0, 0.0), |(sum, c), q| { ... (new_sum, new_c) }).

Also, another neat thing to do would be to make the variance calculation more numerically stable, e.g. using this or this.

@huonw
Copy link
Member Author

huonw commented Dec 9, 2013

(Also, sorry for repeatedly missing you on IRC; I'm in Australia which means that our times seem to not line up. :( )

@g3xzh
Copy link
Contributor

g3xzh commented Dec 10, 2013

@huonw
Regarding the IRC, that's fine. :)
Anyway, if you don't mind, I take a shot and work on it this week.

@huonw
Copy link
Member Author

huonw commented Dec 10, 2013

I certainly don't mind, go for it! Feel free to attempt to catch me in IRC if you have any questions (or just throw them here, or email me (should be visible in my github profile)).

@g3xzh
Copy link
Contributor

g3xzh commented Dec 11, 2013

Hey @huonw,

I have implemented the compensated summation but it has not done any impact.
When [1e20, 1.0, -1e20].sum() the 1.0 is too small relative to 1e20, making it negligible, thus 1e20+1.0=1e20.
Another way to fix it, is to sort the vector by its elements' absolute values and then do the summation.
However, the sort will add significant overhead and I am not sure if it is efficient and worthwhile.

@huonw
Copy link
Member Author

huonw commented Dec 11, 2013

Hm, Python has math.fsum which works perfectly:

>>> import math
>>> math.fsum([1e20, 1, -1e20])
1.0

so we may need to use a different algorithm.

@g3xzh
Copy link
Contributor

g3xzh commented Dec 11, 2013

OK, I will take a look at this :D.
BTW: that's my implementation.
http://pastebin.com/QqjVWLR1

@huonw
Copy link
Member Author

huonw commented Dec 11, 2013

The implementation of fsum, which appears to be the msum algorithm from my previous link.

It would be good for the documentation on sum to mention (a) that this method sacrifices performance at the altar of accuracy, and (b) to cite the paper that it apparently comes from (although I have not read it).

(That implementation of Kahan summation looks good.)

bors added a commit that referenced this issue Dec 19, 2013
`[1e20, 1.0, -1e20].sum()` returns `0.0`. This happens because during
the summation, `1.0` is too small relative to `1e20`, making it
negligible.

I have tried Kahan summation but it hasn't fixed the problem.
Therefore, I've used Python's `fsum()` implementation.
For more details, read:
www.cs.cmu.edu/~quake-papers/robust-arithmetic.ps
#10851

Python's fsum (msum)
http://code.activestate.com/recipes/393090/

@huonw, your feedback is more than welcome.
It looks unpolished; Do you have suggestions how to make it more beautiful and elegant?

Thanks in advance,
@huonw
Copy link
Member Author

huonw commented Dec 19, 2013

Fixed by #10927.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion.
Projects
None yet
Development

No branches or pull requests

2 participants