Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow default values when parsing, and error on unused fields #1094

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

pitdicker
Copy link
Collaborator

@pitdicker pitdicker commented May 28, 2023

Default values when parsing

NaiveTime:
When parsing seconds and nanoseconds may already be omitted. This is now extended to minutes.

NaiveDate:
When a day field is missing, default to the first day of the month when parsing year-month-day, or default to the first day of the week when parsing year-week-weekday.
If there are multiple choices, such as when there is no day field but both a month and week field, we don't arbitrarily pick a default.

NaiveDateTime:
Uses the combination of fallbacks from NaiveTime and NaiveDate. And if hour is missing, defaults that to zero also.

DateTime:
No changes. Only seconds and nanoseconds may be omitted.

Error on unused fields

  • NaiveTime errors if there are any date fields or an offset present.
  • NaiveDate errors if there are any time fields or an offset present.
  • NaiveDateTime errors if there is an offset present.

The various choices here are relatively easy to change. We could for example make NaiveDateTime require a full date and at least an hour.

Fixes #528, #961.
cc @MarcoGorelli

@pitdicker pitdicker force-pushed the parsing_defaults branch 3 times, most recently from 7ec6b30 to e46deec Compare May 28, 2023 15:08
src/naive/time/tests.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@jtmoon79 jtmoon79 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As usual, it's possible I'm misunderstanding, but I think some necessary test asserts are missing.

src/naive/date.rs Outdated Show resolved Hide resolved
@pitdicker
Copy link
Collaborator Author

I have removed the part from this PR to error on unused fields. Now it should accept strictly more strings as valid when parsing, so it should be good for the 0.4.x branch.

I will open a follow up PR against main if this is merged.

Comment on lines 522 to 447
None => {
if self.second.is_none() && self.nanosecond.is_none() {
0
} else {
return Err(NOT_ENOUGH);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: writing this as a match expression will be slightly more concise and IMO easier to read.

@@ -553,7 +553,16 @@ impl Parsed {
/// Either way those fields have to be consistent to each other.
pub fn to_naive_datetime_with_offset(&self, offset: i32) -> ParseResult<NaiveDateTime> {
let date = self.to_naive_date();
let time = self.to_naive_time();
let time = if self.hour_div_12.is_none()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's your consideration for doing this here? Why not do this inside to_naive_time()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in to_naive_time() we want to have at least an hour, an empty format string should not be accepted as a valid time in my opinion. But if we have a date I think it is reasonable to make the hour optional.

NaiveDate::from_isoywd_opt(isoyear, isoweek, weekday).ok_or(OUT_OF_RANGE)?;
(verify_ymd(date) && verify_ordinal(date), date)
}
#[rustfmt::skip]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why skip formatting there?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rustfmt turns this:

            #[rustfmt::skip]
            (true, Some(year), None, &Parsed { month: Some(month),
                                               week_from_sun: None, week_from_mon: None, .. }) => {
                let date = NaiveDate::from_ymd_opt(year, month, 1).ok_or(OUT_OF_RANGE)?;
                (verify_isoweekdate(date) && verify_ordinal(date), date)
            }
            #[rustfmt::skip]
            (true, Some(year), None, &Parsed { week_from_sun: Some(week),
                                               week_from_mon: None, month: None, .. }) => {
                let date = resolve_week_number(year, week, Weekday::Sun, Weekday::Sun)?;
                (verify_ymd(date) && verify_isoweekdate(date) && verify_ordinal(date), date)
            }
            #[rustfmt::skip]
            (true, Some(year), None, &Parsed { week_from_mon: Some(week),
                                               week_from_sun: None, month: None, .. }) => {
                let date = resolve_week_number(year, week, Weekday::Mon, Weekday::Mon)?;
                (verify_ymd(date) && verify_isoweekdate(date) && verify_ordinal(date), date)
            }

into this:

            (
                true,
                Some(year),
                None,
                &Parsed { month: Some(month), week_from_sun: None, week_from_mon: None, .. },
            ) => {
                let date = NaiveDate::from_ymd_opt(year, month, 1).ok_or(OUT_OF_RANGE)?;
                (verify_isoweekdate(date) && verify_ordinal(date), date)
            }
            (
                true,
                Some(year),
                None,
                &Parsed { week_from_sun: Some(week), week_from_mon: None, month: None, .. },
            ) => {
                let date = resolve_week_number(year, week, Weekday::Sun, Weekday::Sun)?;
                (verify_ymd(date) && verify_isoweekdate(date) && verify_ordinal(date), date)
            }
            (
                true,
                Some(year),
                None,
                &Parsed { week_from_mon: Some(week), week_from_sun: None, month: None, .. },
            ) => {
                let date = resolve_week_number(year, week, Weekday::Mon, Weekday::Mon)?;
                (verify_ymd(date) && verify_isoweekdate(date) && verify_ordinal(date), date)
            }

It felt less readable to me (but now it honestly seems not as bad as while I was playing with this code).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I've learned to abide with rustfmt's choices.

Copy link
Collaborator Author

@pitdicker pitdicker Jul 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I have to learn that yet 😆 :. In some cases rustfmt makes the code less readable, and I try to rewrite it in a way that formats nice again.
Using an extra pair of braces here seems to give pretty nice formatting:

            (
                (true, Some(year), None),
                &Parsed { month: Some(month), week_from_sun: None, week_from_mon: None, .. },
            ) => {
                let date = NaiveDate::from_ymd_opt(year, month, 1).ok_or(OUT_OF_RANGE)?;
                (verify_isoweekdate(date) && verify_ordinal(date), date)
            }
            (
                (true, Some(year), None),
                &Parsed { week_from_sun: Some(week), week_from_mon: None, month: None, .. },
            ) => {
                let date = resolve_week_number(year, week, Weekday::Sun, Weekday::Sun)?;
                (verify_ymd(date) && verify_isoweekdate(date) && verify_ordinal(date), date)
            }

self.to_naive_date_inner(true)
}

fn to_naive_date_inner(&self, use_default: bool) -> ParseResult<NaiveDate> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case for this? I'd prefer not to do this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a use_default flag, so that we can require that a date which includes an offset does not accept a string without minutes, hours or days.

But to_naive_date is a public function, so I could not change the signature of that one and made an 'inner' method. It is also not my preference...

@@ -580,7 +580,7 @@ impl Parsed {
(None, Some(_)) => 0, // UNIX timestamp may assume 0 offset
(None, None) => return Err(NOT_ENOUGH),
};
let datetime = self.to_naive_datetime_with_offset(offset)?;
let datetime = self.to_naive_datetime_with_offset_inner(offset, false)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a date includes an offset in hours and maybe minutes, I think we should require the date and time to also have a precision of hours and minutes without fallback.

@pitdicker pitdicker force-pushed the parsing_defaults branch 2 times, most recently from 897b50f to b352038 Compare June 12, 2023 10:29
NaiveDate::from_isoywd_opt(isoyear, isoweek, weekday).ok_or(OUT_OF_RANGE)?;
(verify_ymd(date) && verify_ordinal(date), date)
}
#[rustfmt::skip]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I've learned to abide with rustfmt's choices.

/// If the format string doesn't contain a day field, it defaults to the first day.
/// If year and month are given it defaults to the first day of the month.
/// If year and week are given it defaults to the first day of the week.
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this detailed explanation in the docstring.

However, it's somewhat confusing.

My suggestion includes referring specifically to fmt, use the phrase specifier (that is used elsewhere), and being more explicit about the inferred subject:

/// If `fmt` does not contain a day specifier then the day value defaults to the first day of the week.
/// If `fmt` only has year and month specifiers then the day value defaults to the first day of the month.
/// If `fmt` only has year and week specifier then the day value value defaults to the first day of the week.

(I'm not sure if my proposed sentences are technically correct, you'll have to review that)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion. I have updated it to:

    /// If `fmt` does not contain enough specifiers to get a complete date, parsing inserts a
    /// default value:
    /// - If `fmt` has year and month specifiers parsing assumes it is the first day of the month.
    /// - If `fmt` has year and week specifiers parsing assumes it is the first day of the week.

/// Missing hours, minutes, seconds and nanoseconds are assumed to be zero.
/// A missing day will, depending on the other available fields, default to the first day of
/// the month or the first day of the week.
/// Out-of-bound times or insufficient fields are errors otherwise.
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this detailed explanation in the docstring.

However, it's somewhat confusing.

My suggestion includes referring specifically to fmt, using the formal phase specifier (as in "strftime specifier" which is how it's referred elsewhere) and being more explicit about the inferred subject:

/// If `fmt` does not contain an hours, minutes, seconds or nanoseconds specifier than that is assumed to be zero.
//// If `fmt` does not contain a day specifier then it will first day of the month or first day of the week, depending upon other specifiers.
/// Otherwise, out-of-bound times or insufficient fields are errors.

Ok(date)
}

fn resolve_year(y: Option<i32>, q: Option<i32>, r: Option<i32>) -> ParseResult<Option<i32>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a docstring for resolve_year.

Please use full variables names, e.g. y should be year, etc.
I don't know what q or r mean (I have some guesses, but it'd be better to use full words for those variables).

Tangential: a PR that type-aliased years, months, etc. would greatly increase code readability, i.e. type year = i32;, etc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tangential: a PR that type-aliased years, months, etc. would greatly increase code readability, i.e. type year = i32;, etc.

I quite like bare integers. But are you interested in trying out how it looks?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants