Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds lazy reader support for timestamps #623

Merged
merged 24 commits into from
Aug 28, 2023
Merged

Adds lazy reader support for timestamps #623

merged 24 commits into from
Aug 28, 2023

Conversation

zslayton
Copy link
Contributor

Builds on outstanding PRs #609, #612, #613, #614, #616, #617, #619, #620, #621, and #622.

Adds support for reading text timestamps in the lazy reader.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@zslayton zslayton changed the base branch from main to lazy-annotations August 14, 2023 07:20
@codecov
Copy link

codecov bot commented Aug 14, 2023

Codecov Report

Patch coverage: 93.46% and project coverage change: +0.28% 🎉

Comparison is base (01354e8) 81.25% compared to head (e8a9713) 81.54%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #623      +/-   ##
==========================================
+ Coverage   81.25%   81.54%   +0.28%     
==========================================
  Files         123      123              
  Lines       22715    23158     +443     
  Branches    22715    23158     +443     
==========================================
+ Hits        18458    18885     +427     
+ Misses       2554     2549       -5     
- Partials     1703     1724      +21     
Files Changed Coverage Δ
src/lazy/text/encoded_value.rs 75.96% <0.00%> (-0.60%) ⬇️
src/lazy/text/value.rs 67.04% <0.00%> (-0.78%) ⬇️
src/lazy/any_encoding.rs 49.37% <60.00%> (+0.80%) ⬆️
src/lazy/text/raw/reader.rs 88.65% <84.21%> (-0.37%) ⬇️
src/lazy/text/matched.rs 76.56% <88.65%> (+9.23%) ⬆️
src/lazy/text/buffer.rs 91.10% <100.00%> (+2.62%) ⬆️

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor Author

@zslayton zslayton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ PR tour

Comment on lines +553 to +556
assert_eq!(
reader.next()?.expect_value()?.read()?,
RawValueRef::Timestamp(Timestamp::with_year(2023).with_month(8).build()?)
);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This simple test confirms that the LazyRawAnyReader can surface timestamps as expected. More rigorous correctness testing happens later in the diff.

/// Matches a timestamp offset of any format.
fn match_timestamp_offset(self) -> IonParseResult<'data, MatchedTimestampOffset> {
alt((
value(MatchedTimestampOffset::Zulu, tag("Z")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it have any affect on memory allocation or other operations if we added another branch here for +00:00? I.e.:

value(MatchedTimestampOffset::Zulu, tag("+00:00"))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be a wash; it makes reading +00:00 cheaper at the expense of an extra branch in the general case. I've added the change for now, we can always revisit it later with benchmarks.

let good_inputs = &[
"2023T",
"2023-08T",
"2023-08-13", // T is optional for ymd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, what? 🤯
How did I go so long without knowing this?

// throughout.
let year_text = matched_input.slice(0, 4).as_text().unwrap();
let year = u32::from_str(year_text).unwrap();
let timestamp = Timestamp::with_year(year);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice timestamp builder you have there. 😉

Base automatically changed from lazy-annotations to main August 23, 2023 20:13
Copy link
Contributor

@popematt popematt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two test cases that need a minor fix, but otherwise it looks great.

Comment on lines 1684 to 1685
"2023-08-18T14:35:52.+24:30", // Out of bounds offset hour
"2023-08-18T14:35:52.+00:60", // Out of bounds offset minute
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These both have a dot but no fractional digits, so the test case can pass even if the out of bounds offset is not correctly rejected.

@zslayton zslayton merged commit eda4e2b into main Aug 28, 2023
@zslayton zslayton deleted the lazy-timestamps branch August 28, 2023 16:34
@zslayton zslayton self-assigned this Aug 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants