Why does pandas dataframe.var() method return unbiased variance by default?? #27202
Comments
As far as I know, the |
@jorisvandenbossche As far as I know, this method return |
And this is the "sample variance" you are asking for (and which is the same as the default in R). (closing as this is not something to fix in pandas, but feel free to ask for further clarification. Or indicate what can be improved in the docs to make this clearer) |
@jorisvandenbossche |
Why does pandas dataframe.var() method return "unbiased variance" by default?
I think variance implies "sample variance" in general.
Of course, I know this method can return the "sample variance" if we pass the ddof=0 option.
However, this option is very confusing.
Why not add new methods sample_var() and unbiased_var(), or return the "sample variance" by default?
Other data analysis tools such as numpy and R return the "sample variance" by default.
I think pandas is an outstanding library for data analysis with Python! Therefore, I would like to understand the reasoning behind this confusing specification.
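
For reference, here is a minimal sketch of how the `ddof` argument changes the divisor. The four input values are made up for illustration. Both libraries divide the sum of squared deviations by `n - ddof`; pandas `var()` uses `ddof=1` by default, while `numpy.var` uses `ddof=0` by default, so the two defaults give different results on the same data.

```python
import numpy as np
import pandas as pd

# Illustrative data only; any small numeric sample would do.
values = [1.0, 2.0, 3.0, 4.0]
s = pd.Series(values)

# Compute both estimators by hand for comparison.
n = len(values)
mean = sum(values) / n
ss = sum((x - mean) ** 2 for x in values)  # sum of squared deviations

var_n_minus_1 = ss / (n - 1)  # divisor n - 1 (ddof=1)
var_n = ss / n                # divisor n     (ddof=0)

pandas_default = s.var()              # pandas default: ddof=1
pandas_ddof0 = s.var(ddof=0)          # explicit ddof=0
numpy_default = np.var(values)        # numpy default: ddof=0
numpy_ddof1 = np.var(values, ddof=1)  # explicit ddof=1

print("pandas default :", pandas_default)
print("pandas ddof=0  :", pandas_ddof0)
print("numpy default  :", numpy_default)
print("numpy ddof=1   :", numpy_ddof1)
```

Passing `ddof` explicitly in both libraries avoids depending on either default.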