Skip to content

DOC: clarify corr behaviour when using a callable #25732

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 18, 2019

Conversation

fbnrst
Copy link
Contributor

@fbnrst fbnrst commented Mar 14, 2019

fixes #25726

@@ -7021,8 +7021,9 @@ def corr(self, method='pearson', min_periods=1):
* kendall : Kendall Tau correlation coefficient
* spearman : Spearman rank correlation
* callable: callable with input two 1d ndarrays
and returning a float

and returning a float. Note that the returned matrix from corr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this is needed to said. This is by-definition that same-same are correlated to 1.0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you want instead to add a link to a correlation matrix that would be ok

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to use the method argument of corr to compute p-values for the significance of the correlation. Then, the resulting matrix is not a correlation matrix anymore. Currently, pandas assumes that the callable supplied to method computes actual correlations, but this assumption is not explicitly mentioned. At least for me, it would have been helpful to know this behaviour.

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Mar 14, 2019 via email

@codecov
Copy link

codecov bot commented Mar 14, 2019

Codecov Report

Merging #25732 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #25732   +/-   ##
=======================================
  Coverage   91.25%   91.25%           
=======================================
  Files         172      172           
  Lines       52973    52973           
=======================================
  Hits        48338    48338           
  Misses       4635     4635
Flag Coverage Δ
#multiple 89.82% <ø> (ø) ⬆️
#single 41.74% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/frame.py 96.79% <ø> (ø) ⬆️
pandas/core/series.py 93.68% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 64f5961...f79fc71. Read the comment docs.

@jreback jreback added the Docs label Mar 14, 2019
@TomAugspurger TomAugspurger added this to the 0.25.0 milestone Mar 15, 2019
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm as well @jreback

@jreback jreback merged commit a703313 into pandas-dev:master Mar 18, 2019
@jreback
Copy link
Contributor

jreback commented Mar 18, 2019

thanks

sighingnow added a commit to sighingnow/pandas that referenced this pull request Mar 20, 2019
* origin/master:
  DOC: clean bug fix section in whatsnew (pandas-dev#25792)
  DOC: Fixed PeriodArray api ref (pandas-dev#25526)
  Move locale code out of tm, into _config (pandas-dev#25757)
  Unpin pycodestyle (pandas-dev#25789)
  Add test for rdivmod on EA array (GH23287) (pandas-dev#24047)
  ENH: Support datetime.timezone objects (pandas-dev#25065)
  Cython language level 3 (pandas-dev#24538)
  API: concat on sparse values (pandas-dev#25719)
  TST: assert_produces_warning works with filterwarnings (pandas-dev#25721)
  make core.config self-contained (pandas-dev#25613)
  CLN: replace %s syntax with .format in pandas.io.parsers (pandas-dev#24721)
  TST: Check pytables<3.5.1 when skipping (pandas-dev#25773)
  DOC: Fix typo in docstring of DataFrame.memory_usage  (pandas-dev#25770)
  Replace dicts with OrderedDicts in groupby aggregation functions (pandas-dev#25693)
  TST: Fixturize tests/frame/test_missing.py (pandas-dev#25640)
  DOC: Improve the docsting of Series.iteritems (pandas-dev#24879)
  DOC: Fix function name. (pandas-dev#25751)
  Implementing iso_week_year support for to_datetime (pandas-dev#25541)
  DOC: clarify corr behaviour when using a callable (pandas-dev#25732)
  remove unnecessary check_output (pandas-dev#25755)

# Conflicts:
#	doc/source/whatsnew/v0.25.0.rst
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unexpected results for diagonal entries when using generic callable in corr
4 participants