-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
DOC: update pandas.core.resample.Resampler.nearest docstring #20381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/core/resample.py
Outdated
@@ -498,23 +498,142 @@ def pad(self, limit=None): | |||
|
|||
def nearest(self, limit=None): | |||
""" | |||
Fill values with nearest neighbor starting from center | |||
Fill the new missing values with their nearest neighbor value, based |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should fit on a single line. Can you try rephrasing?
pandas/core/resample.py
Outdated
2018-01-01 02:00:00 3 | ||
Freq: 20T, dtype: int64 | ||
|
||
Resample in the middle: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "in the middle?"
pandas/core/resample.py
Outdated
|
||
Limited fill: | ||
|
||
>>> s.resample('10min').nearest(limit=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a tad long. Can you change it to '20min'
. I think that'll still make the point as the first will be filled and the second will be NaN
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Codecov Report
@@ Coverage Diff @@
## master #20381 +/- ##
==========================================
+ Coverage 92.22% 92.23% +<.01%
==========================================
Files 161 161
Lines 51187 51197 +10
==========================================
+ Hits 47209 47220 +11
+ Misses 3978 3977 -1
Continue to review full report at Codecov.
|
pandas/core/resample.py
Outdated
2018-01-01 01:40:00 6.0 5 | ||
2018-01-01 02:00:00 6.0 5 | ||
|
||
Resampling a DataFrame with shuffled indexes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not fully sure this example is needed, as it is something general to resample
, and not specific to the nearest
method
pandas/core/resample.py
Outdated
2018-01-01 02:00:00 3.0 | ||
Freq: 10T, dtype: float64 | ||
|
||
Resampling a DataFrame that has missing values: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add as small explanation about that the initial NaN is preserved
Hello @thicorfon! Thanks for updating the PR.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebased, added couple of minor fixes, and made the examples shorter.
@jbrockmendel if you don't mind taking a look here too and merging on green if looks good. I'm trying to close all pending PRs from the docsting. I'm happy if what we merge if correct, future improvements can be addressed later in separate PRs. Thanks!
pandas/core/resample.py
Outdated
|
||
When resampling data, missing values may appear (e.g., when the | ||
resampling frequency is higher than the original frequency). | ||
The nearest fill will replace ``NaN`` values that appeared in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"nearest fill" refers to this method, right? Maybe make that explicit with "nearest
fill method"?
pandas/core/resample.py
Outdated
the resampled data with the value from the nearest member of the | ||
sequence, based on the index value. | ||
Missing values that existed in the original data will not be modified. | ||
If `limit` is given, fill only `limit` values in each direction for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the second "limit
" could be "this many"? Probably worthwhile to get a non-native speaker to weight in what is clearest.
|
||
.. versionadded:: 0.21.0 | ||
|
||
Returns | ||
------- | ||
an upsampled Series | ||
Series or DataFrame |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@datapythonista do we have a convention for saying same-type-as-input?
pandas/core/resample.py
Outdated
pad : Forward fill ``NaN`` values. | ||
pandas.Series.fillna : Fill ``NaN`` values in the Series using the | ||
specified method, which can be 'backfill'. | ||
pandas.DataFrame.fillna : Fill ``NaN`` values in the DataFrame using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thoroughness is good, but this seems excessive. @datapythonista what's the convention for how much to write in this section?
…also section shorter
Thanks for another great review @jbrockmendel. Made the changes, I think this should be ready to be merged on green. For the "same type as caller", we try to use only Python types in the types of the return and parameters. Ideally at some point we can parse those types and detect suspicious things or get statistics. In generic methods, so far we're using |
@jreback this should be ready to merge, if you want to have a look |
thanks @thicorfon |
…fixed * upstream/master: DOC: Removing rpy2 dependencies, and converting examples using it to regular code blocks (pandas-dev#23737) BUG: Fix dtype=str converts NaN to 'n' (pandas-dev#22564) DOC: update pandas.core.resample.Resampler.nearest docstring (pandas-dev#20381) REF/TST: Add more pytest idiom to parsers tests (pandas-dev#23810) Added support for Fraction and Number (PEP 3141) to pandas.api.types.is_scalar (pandas-dev#22952) DOC: Updating to_timedelta docstring (pandas-dev#23259)
Checklist for the pandas documentation sprint:
scripts/validate_docstrings.py pandas.core.resample.Resampler.nearest
git diff upstream/master -u -- "*.py" | flake8 --diff
python doc/make.py --single pandas.core.resample.Resampler.nearest
Please include the output of the validation script below between the "```" ticks:
The error is caused by
.. versionadded:: 0.21.0
, which has no period in the end.