ENH: Added stride/offset aliases in to_datetime #26631
Added support for stride and offset aliases to the 'unit' keyword in 'to_datetime'. Documented support for additional 'unit' intervals that were available.
Codecov Report
@@ Coverage Diff @@
## master #26631 +/- ##
==========================================
- Coverage 91.88% 91.87% -0.01%
==========================================
Files 174 174
Lines 50692 50692
==========================================
- Hits 46576 46572 -4
- Misses 4116 4120 +4
Continue to review full report at Codecov.
Does this close an existing issue? Also tests are critical to make sure this actually works
Yearly and monthly offsets were deprecated recently. #16344.
Week offsets are on the chopping block too so it's best not to advertise them as well. #14024 (comment)
I couldn't find an existing issue. I had an issue that I didn't report, however. I was trying to use After working with the code, I found it easy to add stride. It passed the current tests, so I didn't break old behaviors. Where should I put new tests?
There isn't a deprecation warning in
we have a bunch of tests for this already in pandas/tests/indexes/timedeltas/test_tools.py
so I suppose some of these were not caught if you actually need to do this?
also would need to do an asv run on appropriate benchmarks for this change
pandas/_libs/tslibs/timedeltas.pyx
Outdated
    unit = 'L'
elif unit == 'us':
    unit = 'U'
make this an else
I don't understand. I actually want "unit" to be unchanged if not one of the tested values so what would an "else" do? Wouldn't it force all "unit" values to be "U"? I have in the past used a dictionary instead to do this same thing if that is a better and more readable approach.
else:
    unit = unit.upper()
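For illustration, the dictionary approach mentioned above and the suggested else branch can be combined into one lookup. This is only a sketch: normalize_unit and _LOWERCASE_ALIASES are hypothetical names, and the 'ms' key is an assumption, since the diff excerpt shows only the 'us' test and the 'L' and 'U' targets.

```python
# Hypothetical helper, not the PR's actual code.
# Assumption: 'ms' is the code that the diff maps to 'L'.
_LOWERCASE_ALIASES = {"ms": "L", "us": "U"}


def normalize_unit(unit: str) -> str:
    """Map special-cased codes explicitly; upper-case everything else."""
    # dict.get falls back to unit.upper(), which plays the role of the
    # reviewer's suggested else branch.
    return _LOWERCASE_ALIASES.get(unit, unit.upper())


print(normalize_unit("ms"))  # L
print(normalize_unit("d"))   # D
```

Upper-casing is only safe when no lowercase code has an uppercase twin with a different meaning. As far as I know, pandas timedelta units treat 'm' as minutes while 'M' has meant months, so such codes would need their own entries in the table.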
pandas/core/tools/datetimes.py
Outdated
unit : string, default is 'N'
    The unit of the arg. Uses a subset of the pandas offset aliases.

    - 'Y', 'A' for yearly (long term average of 365.2425 days)
remove Y, A, M here I think we deprecated these anyhow
My working version already issues a deprecation warning for "Y", "A", "M" and "W". The "W" deprecation is discussed earlier in this conversation.
so don't mention the deprecated ones
There is no test for the added stride, and almost all of the existing tests cover only unit='D' and unit='s'.
I think it is reasonable to add tests for stride and for the other accepted units. Working on it.
I have not done this before. I will look around to find out how to make that happen.
pandas/_libs/tslibs/timedeltas.pyx
Outdated
unit, stride = _base_and_stride(unit)

# Normalize old or lowercase codes to standard offset aliases.
I am not 100% sure we are actually testing all of these, can you see where we are and validate
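To make the unit, stride = _base_and_stride(unit) line above concrete, here is a minimal sketch of splitting a strided unit such as '2D' into its base code and integer stride. split_stride is a hypothetical stand-in written with plain string stripping; it is not the pandas helper, and the error handling is an assumption.

```python
from typing import Tuple


def split_stride(unit: str) -> Tuple[str, int]:
    """Split a strided unit like '2D' into ('D', 2).

    Hypothetical stand-in for illustration only.
    """
    # Drop the leading digits; whatever remains is the base unit code.
    base = unit.lstrip("0123456789")
    digits = unit[: len(unit) - len(base)]
    if not base:
        raise ValueError(f"no unit code found in {unit!r}")
    # A missing stride means a stride of 1.
    return base, int(digits) if digits else 1


print(split_stride("2D"))     # ('D', 2)
print(split_stride("15min"))  # ('min', 15)
print(split_stride("ms"))     # ('ms', 1)
```

Under this reading, a value given with unit '2D' would presumably be multiplied by the stride before being interpreted in the base unit 'D'.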
can you merge master and respond to comments
Almost done. I ran into numpy/numpy#8325, which affects several of the tests, and for some time I thought my code was failing instead. I tried to fix it but just spun my wheels. I ran the benchmarks multiple times and got different answers each time. For example, see the attached files with the result summaries from two benchmark runs, both done overnight while my Chromebook was not used for anything else. Should I squash all commits into one?
we don't use numpy directly for nan operations (or only very little), and not for timedeltas, so not sure where you are hitting this.
The initial bug report is #26964, which shows the problem in pandas. I closed it when I found that numpy had the same bug and thought, maybe incorrectly, that pandas was using numpy and therefore getting the same behavior.
Added support for stride and offset aliases to the 'unit' keyword in 'to_datetime'. Documented support for additional 'unit' intervals that were available.
The 'origin' in to_datetime can be any resolution. New tests for origin resolution and strided unit codes. Deprecated warning for to_datetime units: 'A', 'Y', 'M', 'W'. Edited docstring to reflect strided unit codes and origin resolution. Replaced re test with simple lstrip. Moved pattern to only other file where it is used. Added 'd' to represent daily, but default should be 'D'. Changed 'd' to 'D' in tests.
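A deprecation warning for the unit codes listed in the commit message could look roughly like the sketch below. The helper name, the message text, and the choice of FutureWarning are all assumptions made for illustration; the PR's actual warning is not shown in this conversation.

```python
import warnings

# Unit codes named as deprecated in the commit message above.
_DEPRECATED_UNITS = frozenset({"A", "Y", "M", "W"})


def warn_if_deprecated(unit: str) -> None:
    """Emit a warning when a deprecated unit code is used (hypothetical helper)."""
    if unit in _DEPRECATED_UNITS:
        warnings.warn(
            f"unit={unit!r} is deprecated in to_datetime",  # assumed wording
            FutureWarning,  # assumed category
            stacklevel=2,
        )


warn_if_deprecated("Y")  # warns
warn_if_deprecated("D")  # silent
```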
Looks like a git merge issue. Let us know if you need help fixing.
Yep. I need help with git. I thought everything looked good here, but obviously it is a mess. I could send the log of what I have done if that is helpful.
In theory, a
should get you straightened out.
@timcera can you merge master again? Looks like old builds are no longer available
numpydev is failing. Does this PR introduce new warnings from NumPy? Running with
closing as stale; if you want to continue working, pls merge |
git diff upstream/master -u -- "*.py" | flake8 --diff