Add new function for pairwise T-tests between columns of a dataframe (pingouin.ptests) #291
Codecov Report
@@ Coverage Diff @@
## master #291 +/- ##
==========================================
+ Coverage 98.75% 98.76% +0.01%
==========================================
Files 19 19
Lines 3298 3332 +34
Branches 529 536 +7
==========================================
+ Hits 3257 3291 +34
Misses 24 24
Partials 17 17
Looks great @raphaelvallat , zero problems here.
I was working on something yesterday and realized that I could really use this new feature! I wasn't sure if it was implemented yet, took a look and saw you were still waiting for a review. I hope you don't mind I jumped in 👋
Sidenote, I was hoping there was a bit more flexibility in the upper triangle. I'm sure you already considered that so you probably landed on the current structure for good reason. But just fyi, I was thinking the `stars` parameter could be replaced with something like `upper="pvals"` (or `"stars"`, `"effsize"`, ...). I don't wanna get too crazy, but offering a similar flexibility in the lower triangle too (an analogous `lower` parameter) would allow easy access to non-parametrics etc.
Passing custom parameters to the lower-level :py:func:`scipy.stats.ttest_ind` function

>>> df.ptests(alternative="greater", equal_var=True)
Heads up, I got an error thrown from `ttest_ind` when I initially ran this in an environment with scipy version `1.7.3`:

    ValueError: nan-containing/masked inputs with nan_policy='omit' are currently not supported by permutation tests, one-sided asymptotic tests, or trimmed tests.

I updated straight to `1.9.0` and it worked fine 👍
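For anyone stuck on an older scipy, one possible workaround is to drop the missing values before calling `ttest_ind`, so that `nan_policy="omit"` is not needed for a one-sided test. This is only a sketch: `ttest_ind_dropna` is a hypothetical helper, not part of Pingouin or scipy, and it applies to independent samples only (for paired data the NaNs would have to be removed row-wise instead).

```python
import numpy as np
from scipy.stats import ttest_ind


def ttest_ind_dropna(a, b, **kwargs):
    """Independent T-test after removing NaNs from each sample separately.

    Hypothetical helper for illustration; kwargs are forwarded to
    scipy.stats.ttest_ind (e.g. alternative="greater", equal_var=True).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Each sample is filtered on its own, which is only valid for unpaired data.
    return ttest_ind(a[~np.isnan(a)], b[~np.isnan(b)], **kwargs)


x = [5.1, 6.3, np.nan, 7.2, 6.8, 5.9]
y = [4.2, np.nan, 5.0, 4.8, 5.5, 4.1]
t, p = ttest_ind_dropna(x, y, alternative="greater", equal_var=True)
print(t, p)
```

Because the NaNs never reach scipy, the call should behave the same way regardless of how a given scipy release handles `nan_policy="omit"` for one-sided tests.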
Thanks for letting me know! I think I'll keep the requirements of scipy>=1.7 for now, and we'll bump it to 1.9 in a future Pingouin release.
Thanks so much @remrama, I was desperately waiting for a reviewer :-)

So about being more flexible in the output, I agree that this could be a nice addition in a future PR. My worry, and the reason I did not implement it, is that for increased speed we are using the lower-level scipy functions here and not a call to a higher-level function:

    if paired:
        func = ttest_rel
    else:
        func = ttest_ind
    t, p = func(self[a], self[b], **kwargs, nan_policy="omit")

Unfortunately however, scipy only returns the T and p-values, so we'd have to either recalculate the effsize / degrees of freedom manually, or, simpler but probably much slower, do a call to a higher-level function.

That said, I'll have to do some benchmarks on how much slower this is going to be. I tend to be very obsessed about code speed, but most of the time the differences are barely visible to the users in real-world data...

Thanks!
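To give a rough idea of what "recalculating manually" could involve, here is a minimal sketch for the independent, equal-variance case using only the standard library. `cohen_d_and_dof` is a hypothetical name, and the sketch assumes the samples contain no missing values; it is not how Pingouin computes effect sizes.

```python
import math
import statistics


def cohen_d_and_dof(x, y):
    """Cohen's d (pooled SD) and Student degrees of freedom for two independent samples.

    Hypothetical helper for illustration; assumes no NaNs and equal variances.
    """
    nx, ny = len(x), len(y)
    dof = nx + ny - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)) / dof
    d = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled_var)
    return d, dof


d, dof = cohen_d_and_dof([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(d, dof)
```

For the equal-variance Student test, the same `d` can also be recovered from the T-value that scipy already returns, since `d = t * sqrt(1/nx + 1/ny)`, so the extra cost may be small in that particular case.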
As discussed in #290, this PR adds the `ptests` (pairwise_ttest) method to `pandas.DataFrame` to calculate pairwise T-tests between columns of a pandas DataFrame. This can be used as an alternative to the `pingouin.pairwise_tests` function when the data is in wide-format instead of long-format. Unlike the `pairwise_tests` function, the `ptests` function only returns the T-values (lower triangle) and p-values (upper triangle). Please see examples below:

I'm looking for one reviewer to review the PR. Thanks!
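As a rough illustration of the layout described above (T-values in the lower triangle, p-values in the upper triangle), the following sketch builds such a matrix directly with scipy. `pairwise_t_matrix` is a hypothetical standalone function written for this example; it is not the actual `ptests` implementation and omits its formatting options.

```python
import itertools

import numpy as np
import pandas as pd
from scipy.stats import ttest_ind, ttest_rel


def pairwise_t_matrix(df, paired=False, **kwargs):
    """Square DataFrame with T-values below the diagonal and p-values above it.

    Hypothetical helper for illustration; kwargs are forwarded to the scipy test.
    """
    func = ttest_rel if paired else ttest_ind
    cols = list(df.columns)
    out = pd.DataFrame(np.nan, index=cols, columns=cols, dtype=float)
    for i, j in itertools.combinations(range(len(cols)), 2):
        t, p = func(df[cols[i]], df[cols[j]], nan_policy="omit", **kwargs)
        out.iloc[j, i] = t  # lower triangle: T-value
        out.iloc[i, j] = p  # upper triangle: p-value
    return out


rng = np.random.default_rng(42)
data = pd.DataFrame(rng.normal(size=(30, 3)), columns=["A", "B", "C"])
print(pairwise_t_matrix(data).round(3))
```

The diagonal is left as NaN here; each T-value is computed as the earlier column versus the later one, so its sign follows that ordering.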