-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add functionality to parse datetimes according C standard format codes. #165
Add functionality to parse datetimes according C standard format codes. #165
Conversation
No f-string please |
@wimglenn done. |
Thanks. I just blackened the code in #166 which made some conflicts for you, sorry about that |
# Conflicts: # parse.py
@wimglenn fixed |
@wimglenn is merging this on the roadmap? Are there any more changes you would like to request? |
@bendichter This looks like a great idea to me, I'll give a careful review when I have a spare moment. |
Sounds good. I'd be happy to answer any questions |
I do have a question - I'd really like anything that works with strftime to work here, e.g. this should be able to roundtrip:
So on line 616 where there is |
Yes, good catch! I'll make that change and add a test |
parse.py
Outdated
"Y": "[0-9]{4}", | ||
"H": "[0-9]{1,2}", | ||
"B": "(?:January|February|March|April|May|June|July|August|September|October|November|December)", | ||
"b": "(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if these should be made locale-aware...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah good idea I'll look into it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be addressed at same time as #1
I gave it more flexibility for the %z directive. %Z gets complicated
(source) If using a timezone out of that range, datetime.strptime("2023-11-21 13:23:27 PST", "%Y-%m-%d %H:%M:%S %Z") ValueError: time data '2023-11-21 13:23:27 PST' does not match format '%Y-%m-%d %H:%M:%S %Z' If using a timezone within that range, datetime.strptime("2023-11-21 13:23:27 EST", "%Y-%m-%d %H:%M:%S %Z") datetime.datetime(2023, 11, 21, 13, 23, 27) This seems to me like a bug. And in any case things are getting messy so I'd rather not support this. |
Yes the zone stuff looks tricky due to Python itself (e.g. https://bugs.python.org/issue22377) |
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #165 +/- ##
==========================================
- Coverage 94.72% 94.59% -0.13%
==========================================
Files 1 1
Lines 512 537 +25
Branches 123 131 +8
==========================================
+ Hits 485 508 +23
- Misses 17 18 +1
- Partials 10 11 +1 ☔ View full report in Codecov by Sentry. |
parse.py
Outdated
@@ -271,6 +271,46 @@ def date_convert( | |||
return d | |||
|
|||
|
|||
dt_format_to_regex = {symbol: "[0-9]{2}" for symbol in "ymdIMSUW"} | |||
dt_format_to_regex.update({"-" + symbol: "[0-9]{1,2}" for symbol in "ymdIMS"}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason I get a problem with these:
>>> datetime(2023, 1, 1).strftime("%Y/%-m/%-d")
'2023/1/1'
But
>>> parse("{:%Y/%-m/%-d}", "2023/1/1")
ValueError: '-' is a bad directive in format '%Y/%-m/%-d'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm I'll take a look tomorrow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now fixed and tested. It turns out strptime does not use the negative sign (I'm not quite sure why I thought it did). In strftime, %d outputs a zero-padded number e.g. "01". For strptime, %d matches a zero-padded number and can also match a non-zero padded number e.g. "1".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now the following works:
>>> parse("{:%Y/%m/%d}", "2023/01/01")
<Result (datetime.datetime(2023, 1, 1, 0, 0),) {}>
>>> parse("{:%Y/%m/%d}", "2023/1/1")
<Result (datetime.datetime(2023, 1, 1, 0, 0),) {}>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's better. %j should get the same treatment, because it has the same issue (i.e. "%-j" doesn't work with strptime, and %j doesn't care if the number is zero-padded or not).
After removing the mapping for "-j"
, you can also remove the re.escape
since the remaining characters are all letters and don't need escaping.
…_flexible_dates # Conflicts: # parse.py
hold it, I think you want j to map to {1,3}. Let's also add a test for it |
I tried to see if I could replace some of the existing datetime logic, but most of them allow for variants of each format, e.g. using either slashes or dashes, so this approach does not work well. I was able to change "tc" to use this approach though. |
…_flexible_dates # Conflicts: # parse.py
Let's not change those, because there is little benefit, and I can't be sure it doesn't modify existing behaviour. I'm also not sure about the condition for returning a datetime vs a time. I think we should either remove that, and always return a datetime, or extend it to return a datetime, time, or date (depending on whether the directives are solely time-related or solely date-related). |
OK, I reverted to the old logic for "tc". The datetime vs. date vs. time thing is tricky. I would like to avoid situations like having Line 194 in 04314d8
but I still think that only makes sense when a month is present as it is in Lines 575 to 589 in 04314d8
I suppose the logic could be:
|
Maybe consider configuring black as a precommit hook, e.g. https://github.com/catalystneuro/neuroconv/blob/1d3d58d4b53570d20b5e81abde7ada7f9f02fa47/.pre-commit-config.yaml#L8-L12 |
Yes, that logic sounds good to me. |
If it contains day, month, or year, it's a date. Missing years default to the current year. If it contains hours, minutes, seconds, or milliseconds, its a time. If it is a date and a time it's a datetime.
@wimglenn Thanks for all the collaboration on this. I am happy with its current state. Feel free to merge when you are ready. |
Merged and released 1.20.0 to PyPI. Thank you for your contributions, Ben! |
elif is_time: | ||
return dt.time() | ||
else: | ||
ValueError("Datetime not a date nor a time?") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bendichter Somehow we both missed a raise
here (ref #196)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops. Thanks for catching that
fix #150