DOC: clarify corr behaviour when using a callable #25732
@@ -7021,8 +7021,9 @@ def corr(self, method='pearson', min_periods=1):
 * kendall : Kendall Tau correlation coefficient
 * spearman : Spearman rank correlation
 * callable: callable with input two 1d ndarrays
- and returning a float
-
+ and returning a float. Note that the returned matrix from corr
I am not sure this needs to be said. By definition, a column's correlation with itself is 1.0.
If you want to add a link to a correlation matrix instead, that would be ok.
I wanted to use the method argument of corr to compute p-values for the significance of the correlation. Then, the resulting matrix is not a correlation matrix anymore. Currently, pandas assumes that the callable supplied to method computes actual correlations, but this assumption is not explicitly mentioned. At least for me, it would have been helpful to know this behaviour.
In the issue we agreed that explaining that the callable is not called for
the diagonal and the other half of the matrix would be welcome (since by definition those
are supposed to be 1 and symmetric).
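
To make the behaviour discussed here concrete, the following is a minimal sketch rather than anything taken from the pandas source or docs. The function `fake_pvalue`, the `calls` list, and the sample data are all hypothetical; the callable returns a constant so that it is easy to see which cells pandas fills in itself. It assumes a pandas version that behaves as described in this thread.

```python
import numpy as np
import pandas as pd

calls = []  # one entry per invocation of the callable


def fake_pvalue(a, b):
    """Hypothetical stand-in for a p-value function; always returns 0.05."""
    calls.append((len(a), len(b)))
    return 0.05


# Three numeric columns, so there are three unordered pairs of distinct columns.
df = pd.DataFrame(
    {
        "x": [1.0, 2.0, 3.0, 4.0],
        "y": [2.0, 1.0, 4.0, 3.0],
        "z": [4.0, 3.0, 2.0, 1.0],
    }
)

result = df.corr(method=fake_pvalue)

print(result)       # the diagonal is filled with 1.0, not with fake_pvalue's output
print(len(calls))   # number of times the callable actually ran
```

If the callable returns something other than a correlation (a p-value, for instance), the diagonal of the result is therefore not meaningful and needs to be overwritten or ignored by the caller.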
On Thu, Mar 14, 2019 at 10:59 AM Jeff Reback wrote:
> In pandas/core/frame.py:
> @@ -7021,8 +7021,9 @@ def corr(self, method='pearson', min_periods=1):
> * kendall : Kendall Tau correlation coefficient
> * spearman : Spearman rank correlation
> * callable: callable with input two 1d ndarrays
> - and returning a float
> -
> + and returning a float. Note that the returned matrix from corr
>
> if you want instead to add a link to a correlation matrix that would be ok
Codecov Report
@@ Coverage Diff @@
## master #25732 +/- ##
=======================================
Coverage 91.25% 91.25%
=======================================
Files 172 172
Lines 52973 52973
=======================================
Hits 48338 48338
Misses 4635 4635
lgtm as well @jreback
thanks
* origin/master:
  DOC: clean bug fix section in whatsnew (pandas-dev#25792)
  DOC: Fixed PeriodArray api ref (pandas-dev#25526)
  Move locale code out of tm, into _config (pandas-dev#25757)
  Unpin pycodestyle (pandas-dev#25789)
  Add test for rdivmod on EA array (GH23287) (pandas-dev#24047)
  ENH: Support datetime.timezone objects (pandas-dev#25065)
  Cython language level 3 (pandas-dev#24538)
  API: concat on sparse values (pandas-dev#25719)
  TST: assert_produces_warning works with filterwarnings (pandas-dev#25721)
  make core.config self-contained (pandas-dev#25613)
  CLN: replace %s syntax with .format in pandas.io.parsers (pandas-dev#24721)
  TST: Check pytables<3.5.1 when skipping (pandas-dev#25773)
  DOC: Fix typo in docstring of DataFrame.memory_usage (pandas-dev#25770)
  Replace dicts with OrderedDicts in groupby aggregation functions (pandas-dev#25693)
  TST: Fixturize tests/frame/test_missing.py (pandas-dev#25640)
  DOC: Improve the docsting of Series.iteritems (pandas-dev#24879)
  DOC: Fix function name. (pandas-dev#25751)
  Implementing iso_week_year support for to_datetime (pandas-dev#25541)
  DOC: clarify corr behaviour when using a callable (pandas-dev#25732)
  remove unnecessary check_output (pandas-dev#25755)

# Conflicts:
# doc/source/whatsnew/v0.25.0.rst
fixes #25726
git diff upstream/master -u -- "*.py" | flake8 --diff