Correct time of day subsetting when using hours. #327
I only had time to take a quick look, but this looks very thorough! I will do a more comprehensive review this weekend. I did notice you added this test: I won't ask you to add comprehensive test coverage for existing functionality before accepting this PR (though I appreciate any tests you do add!). But it's best practice to write and run your unit tests before making any changes, to ensure they produce expected behavior (including errors you intend to fix). So I would appreciate if you ran your tests against
Regarding this, do you want to support
I would suggest requiring the colon. The ISO standard, as I recall, expects the colon, and forcing a model of ##:##:##.######## seems to leave the smallest number of ways in which this could be misinterpreted.
I just looked at the standard. The basic format does not use a colon, while the extended format does. Both are acceptable. Edit: also, the standard requires each element be zero-padded, so these are the only supported formats:
That said, we can still support
Remove support for non-zero-padded minutes and seconds. The ISO 8601 standard requires all time components be zero-padded. That said, we do support hours that aren't zero-padded, for convenience. Also make the validation regex stricter. The first digit for minutes and seconds cannot be > 5. Separate regex into multiple pieces, to make them easier to understand later. See joshuaulrich#326. See joshuaulrich#327.
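The validation rules described in this commit message can be sketched as a regex assembled from small named pieces. This is an illustration in Python, not the xts R code: the names `HOURS`, `MINSEC`, `SUBSEC`, and `is_valid_time_of_day` are hypothetical, and the sketch covers only the extended (colon-separated) format with an optional leading "T".

```python
import re

# Hypothetical pattern pieces; names are illustrative, not taken from xts.
HOURS = r"(?:[01]?\d|2[0-3])"   # 0-23, leading zero optional (a convenience)
MINSEC = r"[0-5]\d"             # 00-59, zero-padded; first digit cannot be > 5
SUBSEC = r"(?:\.\d+)?"          # optional fractional seconds

# Extended format only: hh[:mm[:ss[.sss]]], with an optional leading "T".
TIME_OF_DAY = re.compile(
    rf"T?{HOURS}(?::{MINSEC}(?::{MINSEC}{SUBSEC})?)?"
)

def is_valid_time_of_day(s: str) -> bool:
    """Return True if the whole string is a valid time-of-day string."""
    return TIME_OF_DAY.fullmatch(s) is not None
```

Under these assumptions, "T1:05" and "T01:30:15.250" are accepted, while "T01:5" (minutes not zero-padded), "T01:60", and "T24" are rejected.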
The initial change for this fix only supports the extended format (hh:mm:ss.sss). We supported the basic format (hhmmss.sss) previously. This restores the original functionality, making the colon optional. I noticed how lastof() was being used to get the last timestamp for the time string, and thought it would be good to find the first timestamp in a similar manner. The getTimeComponents() function extracts the hours, minutes, seconds, and sub-seconds into a list that we can use with firstof() and lastof(). Note that firstof() returns the first timestamp in 1970, and lastof() returns a timestamp at the end of 1970. The difference in years, months, and days does not matter because we're only concerned with the time of day components. See joshuaulrich#326. See joshuaulrich#327.
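The component-extraction idea in this commit message can be sketched as follows. This is a Python illustration, not the R implementation: `get_time_components`, `first_of`, and `last_of` are hypothetical stand-ins for getTimeComponents(), firstof(), and lastof(). The sketch works in microseconds since midnight rather than 1970 timestamps, and it assumes that only missing minutes and seconds are filled in (an explicitly given seconds value is used as-is).

```python
import re

# The colon is optional, so both hh:mm:ss.sss and hhmmss.sss are accepted.
_TIME_RE = re.compile(
    r"T?(?P<hour>\d{1,2})"
    r"(?::?(?P<min>[0-5]\d)"
    r"(?::?(?P<sec>[0-5]\d)(?P<subsec>\.\d+)?)?)?"
)

def get_time_components(s):
    """Split a time-of-day string into its parts; absent parts are None."""
    m = _TIME_RE.fullmatch(s)
    if m is None or int(m.group("hour")) > 23:
        raise ValueError(f"invalid time of day: {s!r}")
    p = m.groupdict()
    return {
        "hour": int(p["hour"]),
        "min": None if p["min"] is None else int(p["min"]),
        "sec": None if p["sec"] is None else int(p["sec"]),
        "subsec": None if p["subsec"] is None else float(p["subsec"]),
    }

def _to_us(h, m, s, sub):
    # Microseconds since midnight.
    return ((h * 60 + m) * 60 + s) * 1_000_000 + round(sub * 1_000_000)

def first_of(c):
    """Earliest instant matching the components (missing parts become 0)."""
    return _to_us(c["hour"], c["min"] or 0, c["sec"] or 0, c["subsec"] or 0.0)

def last_of(c):
    """Latest instant matching the components (missing parts become their max)."""
    minute = 59 if c["min"] is None else c["min"]
    if c["sec"] is None:
        sec, sub = 59, 0.999999  # assumed microsecond resolution
    else:
        sec, sub = c["sec"], c["subsec"] or 0.0
    return _to_us(c["hour"], minute, sec, sub)
```

With this sketch, "T013015.5" and "T01:30:15.5" yield the same components, and `last_of(get_time_components("T03"))` corresponds to 03:59:59.999999.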
Subsetting with "T01/T03", for example, does not currently give expected results. This requires fixing .subsetTimeOfDay(). Unit tests are also updated for time-of-day subsetting. Fixes joshuaulrich#326
The other file is approaching 300 lines, and it is better to do this now. Doing it later would detach the history in the new file from the original file.
This was removed in the commit with the subject: "Correct time of day subsetting when using hours." It's not clear why it was removed, so I'm restoring it.
They create noise in the test results file. This makes it harder to quickly find any problematic tests.
b84483a to f47ebcc
@claymoremarshall thanks a lot for taking the initiative to fix this bug! Your code and tests were a really good start, and made it a lot easier for me to make updates. I restored the 'basic format' functionality and made the regex a bit more robust. I also removed the ability to omit zero-padding on everything except hours. That wasn't supported previously, so we're not losing anything, and it makes the xts time-of-day strings conform to the standard (except that the leading zero can be omitted from hours, which isn't in the standard).
Subsetting by time of day using hours only (e.g. "T01/T03") does not currently give the expected results; it returns the incorrect rows for the time window.
This requires modifying .subsetTimeOfDay(), which did not cover all of the behaviour previously provided by subsetting via .parseISO8601().
Careful attention is needed to ensure that the second time string (T03 in the above example) covers all bars up to the end of the hour (03:59:59.9999), as was previously the behaviour with .parseISO8601().
An (optional) timeString validation is done to ensure that passed-in time strings are valid.
Unit tests also updated for time of day subsetting.
Fixes #326
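The intended behaviour of an hours-only window can be illustrated with a minimal Python sketch. It is not the xts implementation: `subset_hours` is a hypothetical helper that assumes timezone-naive timestamps, hours-only strings such as "T01/T03", microsecond resolution, and a window that does not wrap past midnight.

```python
from datetime import datetime, time

def subset_hours(timestamps, window):
    """Keep timestamps whose time of day falls inside an hours-only window.

    The end hour is extended to the last instant of that hour, so "T03"
    as an end bound covers everything up to 03:59:59.999999.
    """
    start_s, end_s = window.split("/")
    start = time(int(start_s.lstrip("T")), 0, 0)
    end = time(int(end_s.lstrip("T")), 59, 59, 999999)
    return [ts for ts in timestamps if start <= ts.time() <= end]

# Illustrative data only.
bars = [
    datetime(2020, 1, 1, 0, 30),
    datetime(2020, 1, 1, 1, 0),
    datetime(2020, 1, 1, 2, 15),
    datetime(2020, 1, 1, 3, 0),
    datetime(2020, 1, 1, 3, 59, 59, 500000),
    datetime(2020, 1, 1, 4, 0),
]
selected = subset_hours(bars, "T01/T03")
```

Here `selected` keeps the four bars from 01:00:00 through 03:59:59.5 and drops the 00:30 and 04:00 bars, which is the end-of-hour coverage the description asks for.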