support partially known dates #36
Co-authored-by: Cole Crawford <16374762+ColeDCrawford@users.noreply.github.com>
Codecov Report
@@ Coverage Diff @@
## develop #36 +/- ##
==========================================
Coverage ? 98.65%
==========================================
Files ? 4
Lines ? 223
Branches ? 0
==========================================
Hits ? 220
Misses ? 3
Partials ? 0
@ColeDCrawford I think you and I worked on the early stages of this months ago. Thought maybe you or @jdamerow could look over the more recent changes so we could get this merged. I've added logic to track the known granularity of the date with the degrees of precision that we support now, and updated the string method and duration calculation. You'll see there are some TODOs; I think there are decisions still to be made that will be easier to make once we add support for more formats.
I've added an example notebook comparing duration calculation in undate and in my Shakespeare and Company Project based on the exported values (durations are calculated by the code), as I had proposed doing in #57. This was a helpful exercise because I found and corrected an error in the logic for dates with unknown years that wrap from one year to the next. I also found a difference in logic between the two implementations: whether you count both the start and end date or only one of them (or half of each). It seems OK to me to leave it as is for now, but it might be something that should be configurable. You can review the notebook in this branch. Update: I enabled reviewnb for this repository, so the notebook can be reviewed here: https://app.reviewnb.com/dh-tech/undate-python/pull/36/
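To make the counting difference concrete, here is a minimal sketch using only the standard library. The `duration` function and its `inclusive` flag are hypothetical names for illustration, not part of undate's API; the sketch only shows how a configurable option could switch between counting both endpoints and counting one.

```python
from datetime import date, timedelta


def duration(start: date, end: date, inclusive: bool = True) -> timedelta:
    """Length of the span from start to end.

    inclusive=True counts both the start and end dates;
    inclusive=False counts only one of them (plain subtraction).
    Hypothetical helper for illustration only.
    """
    delta = end - start
    if inclusive:
        # add one day so that the end date itself is counted
        delta += timedelta(days=1)
    return delta


# The same borrowing period measured both ways
both_endpoints = duration(date(2022, 3, 1), date(2022, 3, 5))
one_endpoint = duration(date(2022, 3, 1), date(2022, 3, 5), inclusive=False)
```

With these inputs the two conventions differ by exactly one day (5 versus 4), which is the kind of discrepancy the notebook comparison surfaced.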
This looks good to merge to me.
Should this be flagged as an issue for future follow-up?
> I also found there's a difference in logic between the two implementations, which is whether you count both the start and end date or only one of them (or half of each). It seems ok to me to leave it as is for now, but might be something that should be configurable.
Running through the notebook was great, thanks for adding that as an example!
sphinx_rtd_theme
m2r2
# pin sphinx because 7.0 currently not compatible with rtd theme
Looks like RTD is now: sphinx >=5,<8
so we could try to upgrade Sphinx at some point. Not a blocker for this though
@@ -136,4 +264,12 @@ def test_duration(self):
         month_noyear_duration = UndateInterval(
             Undate(None, 12, 1), Undate(None, 1, 1)
         ).duration()
-        assert month_noyear_duration.days == 31
+        assert month_noyear_duration.days == 32
+        # this seems wrong, but we currently count both start and end dates
This seems like something we should create an issue to address?
created #63
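For reference, a rough sketch of how a span between two dates with unknown years can wrap from December into January, and why the result is 31 or 32 days depending on the counting convention. This is not undate's implementation: `yearless_span_days` and `PLACEHOLDER_YEAR` are hypothetical names, and the placeholder year is an assumption chosen so that neither it nor the following year is a leap year.

```python
from datetime import date

# Assumption: an arbitrary non-leap year followed by another non-leap year,
# so spans crossing February are not inflated by a leap day.
PLACEHOLDER_YEAR = 2022


def yearless_span_days(start_md, end_md, inclusive=True):
    """Days between two (month, day) pairs whose year is unknown.

    If the end falls earlier in the calendar than the start, the span
    is assumed to wrap into the following year.
    """
    start = date(PLACEHOLDER_YEAR, *start_md)
    end_year = PLACEHOLDER_YEAR
    if end_md < start_md:
        # e.g. December 1 to January 1: the end belongs to the next year
        end_year += 1
    end = date(end_year, *end_md)
    days = (end - start).days
    # counting both endpoints adds one day to the plain difference
    return days + 1 if inclusive else days


dec_to_jan_inclusive = yearless_span_days((12, 1), (1, 1))
dec_to_jan_exclusive = yearless_span_days((12, 1), (1, 1), inclusive=False)
```

Under these assumptions the December 1 to January 1 span is 32 days when both endpoints are counted and 31 when only one is, matching the old and new values in the test above.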
# TODO: should not allow initializing year/day without month;
# should we infer unknown month? or raise an exception?
# assert str(Undate(2022, day="2X")) == "2022-XX-2X"  # currently returns 2022-2X
# assert str(Undate(2022, day=7)) == "2022-XX-07"  # currently returns 2022-07
I could see a situation with a corrupted text where we legitimately only know the year and day of the month, but we can't do much with the additional day info in this case. I don't think it should error; it's also unlikely we can infer the month in this situation. Is it possible to bump to the known granularity (year) for calculation purposes, and leave the day in a string rendering of the undate? Can't remember
created #64 based on your comment here
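One way the suggestion above could look, as a standalone sketch rather than a proposal for undate's actual API: render an unknown month as `XX` whenever a day is given, and separately report the finest contiguous known precision for calculations. The names `render` and `known_precision` are hypothetical, and string values such as `"2X"` are assumed to already have the right width.

```python
from typing import Optional, Union

Part = Optional[Union[int, str]]


def format_part(value: Part, width: int) -> str:
    """Render one date part; unknown parts become a run of X characters."""
    if value is None:
        return "X" * width
    if isinstance(value, int):
        return f"{value:0{width}d}"
    # partially known values like "2X" are assumed to be pre-formatted
    return value


def render(year: Part = None, month: Part = None, day: Part = None) -> str:
    """String form that keeps a known day even when the month is missing."""
    parts = [format_part(year, 4)]
    if month is not None or day is not None:
        parts.append(format_part(month, 2))
    if day is not None:
        parts.append(format_part(day, 2))
    return "-".join(parts)


def known_precision(year: Part, month: Part, day: Part) -> str:
    """Finest precision reachable without skipping an unknown part."""
    precision = "none"
    for name, value in (("year", year), ("month", month), ("day", day)):
        if not isinstance(value, int):
            break
        precision = name
    return precision


year_and_day = render(2022, day=7)
year_and_partial_day = render(2022, day="2X")
calc_precision = known_precision(2022, None, 7)
```

Here the day survives in the string (`2022-XX-07`, `2022-XX-2X`) while calculations fall back to year precision, which neither raises an exception nor guesses the month.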
@ColeDCrawford thank you for the review and comments! Glad the notebook was helpful. I think you're right, we have at least two new issues coming out of this: how to calculate duration, and how to handle missing information. I'll create them and ask for input on Slack, and then I'll go ahead and merge this.
ref #3 and #57