Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"critical" flag should be preserved when parsing #2403

Closed
ljharb opened this issue Sep 13, 2022 · 12 comments
Closed

"critical" flag should be preserved when parsing #2403

ljharb opened this issue Sep 13, 2022 · 12 comments

Comments

@ljharb
Copy link
Member

ljharb commented Sep 13, 2022

If I start with a ! in string x, and parse that string, the information that "the critical flag was present" is lost.

If I want to preserve it, I'd have to keep the string around alongside the object, or I'd have to parse the string myself (which I would hope nobody should ever have to do).

If that bit is kept, then toString could output the ! by default - toString could accept a three-state option for "default", "critical", or "not critical". Those that don't care about the flag could continue to ignore it, and those who do can preserve that info if desired, and can explicitly override it as desired.

@ljharb
Copy link
Member Author

ljharb commented Sep 13, 2022

cc @FrankYFTang

@syg
Copy link

syg commented Sep 13, 2022

AFAIU the intention of the flag from IETF is to inform the immediate next consumer of the string, such as Temporal's parser. Notably it is not "preserve this in perpetuity for serialization/deserialization chains". Temporal objects preserving the marked critical annotations goes against that intention and is undesirable for implementations.

Do you have a use case for why a flag intended for the immediate next consumer of a string is information that must be preserved? There is really no reason to do this preservation.

If I want to preserve it, I'd have to keep the string around alongside the object

That's exactly what you ought to do, that's not an extra burden. Temporal instances don't preserve original input -- if your use case is to have lossless transmission of the original input, you should keep the string alongside its Temporal instantiation.

@ljharb
Copy link
Member Author

ljharb commented Sep 13, 2022

Sure. However, if my intent is to normalize as much as possible, I might still want to know if it was marked as critical in the first place.

Whether it implicitly passes along to the toString output is of much lesser consequence to me - i'd be totally fine if the default toString never included a !, only when explicitly requested - but i'd still potentially need to know when to request it based on whether it was present in the original string or not.

@syg
Copy link

syg commented Sep 13, 2022

but i'd still potentially need to know when to request it based on whether it was present in the original string or not.

The consequences of that are highly undesirable for implementations: we'd have to keep storage for every optional bracket. I agree with @FrankYFTang's position here and oppose having accessors that expose the input state on potentially needing them alone. There needs to be substantive use cases.

@justingrant
Copy link
Collaborator

I agree with @syg here. Having been involved in the IETF discussions that led to the critical flag, my guess is that the main users of critical flags will be:

  1. Users who add their own custom extensions and who want to ensure that receivers will fail who don't know about those custom extensions. Temporal's data modal won't store custom extensions anyways, so I think the proposed solution above (storing the original string) seems like a good idea.
  2. Users who want to ensure that receivers will always reject timezone vs. offset conflicts caused by political changes to time zones that happen after strings are stored. Temporal already rejects these conflicts by default, so this seems unnecessary for us to worry about. Where the critical flag may be useful is when communicating with Java (which doesn't reject conflicts by default) if a future Java release supports the critical flag. But that doesn't really affect us, IMO.

Also, critical flags are just one of many things that Temporal won't round-trip when parsing... we don't round-trip number precision, we don't round trip whether optional fields were present, we don't round-trip case sensitivity, etc. Not round-tripping the critical flag seems like consistent behavior.

TL;DR - IMO we should wait until there's actual demand for this before complicating implementations with use cases that have not yet been proven in the wild.

@justingrant
Copy link
Collaborator

justingrant commented Sep 13, 2022

I learned that this issue came up in plenary (my apologies, I had a conflict for this meeting and couldn't attend) so I'll add a bit of context and add some feedback.

First, it's helpful to note that the critical flags Temporal recognizes (on time zone and calendar) are ignored during parsing because the critical flags already match Temporal's default parsing behavior. If we see a calendar we don't know, we throw. If we see a time zone we don't know, we throw. If the time zone and the offset conflict with each other, by default we throw. Therefore, whether or not the critical flag is present, Temporal will act the same. Programmers can almost always ignore it.

Second, when I'm sending a string to a Java app (or any existing legacy library that uses the same string formats that we do in Temporal), I must not send a calendar annotation because that app will crash. For this reason, Temporal's default toString output was designed to hide the ISO calendar annotation, even if the original input included [u-ca=iso8601]. The critical flag seems like a similar case: it represents default behavior in input strings that we omit from the output because we want default output to be as compatible as possible.

Third, if we forwarded the critical flag (or [u-ca=iso8601]) downstream, it'd provide a way for valid input sent by accident (or by a malicious sender) to cause output that will later break a downstream reader without having any noticeable impact to the behavior of the ECMAScript program in the middle. Those "silent carrier" bugs are something we'd like to avoid.

So I'm glad we're not focusing on changing the default behavior of toString!

Whether it implicitly passes along to the toString output is of much lesser consequence to me - i'd be totally fine if the default toString never included a !, only when explicitly requested - but i'd still potentially need to know when to request it based on whether it was present in the original string or not.

This is a good starting point for this discussion! I assume there are at least two ways to achieve "need to know when to request it based on whether it was present in the original string or not":

  1. Persist the critical flags in Temporal objects and expose their values publicly (but without changing default behavior of toString)
  2. Enhance existing and/or add new parsing methods to return info about whether the critical flag was present in the input.

Either one or these could be web-compatible. So is there urgency to decide now? Seems like we can wait to see whether there's demand, and can add in a follow-up proposal if needed.

BTW, this isn't the first time we've heard about requests for richer parsing for uncommon use cases. Others we've heard about include: parsing strings with some valid and some invalid fields, retaining numeric precision and trailing zeroes, knowing whether optional values were in the input string, and a few more.

So if we were to address this critical flag parsing thing, then we may want to gang it up with other advanced parsing cases if possible.

just one of many things that Temporal won't round-trip when parsing

In case you're curious, here's a quick list of other things about the original string that Temporal's parsing won't tell the caller about:

  • A calendar annotation of [iso8601] vs. omitting a calendar annotation altogether (because ISO is the default)
  • Leap seconds in the original string e.g. 23:59:60
  • Original time zone (time zones are converted to canonical time zone names)
  • Original calendar (calendars can be canonicalized, for example the deprecated islamiccc calendar can be replaced with islamic-civil)
  • Sub-second numeric precision / trailing zeroes
  • Whether seconds were present when parsing a time, or if the default of 0 was used instead.
  • Case sensitivity of time zone names or calendar names
  • Punctuation used or not (e.g. 20010101 vs. 2001-01-01)
  • Separator character (if any) used to separate date and time - can use T, t, or a space.
  • UTC offset syntax e.g. +02 vs. +02:00 vs. +0200

My point isn't to say that Temporal's parsing is bad, only that the critical flag is one of many—and not even the most popular!—cases that Temporal's parsing doesn't support. To be clear, we don't support them in V1 for a reason: none of them seemed to be worth making a complex API even more complex at this time, esp. in advance of programmer demand. We can always extend later.

@ptomato
Copy link
Collaborator

ptomato commented Sep 13, 2022

FWIW I also agree that the flag is not intended to be preserved. It's a message to the parser — it basically says "you must refuse to parse this if you can't honor the annotation" — so once we have accepted the string, I don't believe the semantics need to be preserved across deserialization and reserialization.

It's not impossible that someone might want to parse a string, use Temporal to adjust the date or time, and write the string back out while preserving the exact text format of the string. (As a more extreme example of some of the things Justin mentioned, you might want to add one hour to +0020220913 102045,000[!aMeRiCa/vAnCoUvEr][keep=this][u-ca=iso8601] and write it back out exactly like it was, without normalizing it to 2022-09-13T10:20:45-07:00[America/Vancouver].) I agree with Justin that's a responsibility for a dedicated parsing library (which we deemed out of scope for Temporal) or at least a future proposal that adds richer parsing methods.

I'm not sure going back and forth here is going to be productive — it seems we don't (and probably won't) agree on the intrinsic value of accommodating that theoretical usage if it brings real costs with it.

Therefore, what I'd propose is that we proceed with PR #2397, but make sure that it would be possible to get that information in the future in a web-compatible way, should a concrete use case present itself. I agree with Justin that that already is the case with #2397 as written. @ljharb, would that be something you can agree with? I hope that would let us move forward, since the door for supporting the as-yet-unknown use case remains open.

@ljharb
Copy link
Member Author

ljharb commented Sep 13, 2022

@justingrant thanks for the context. your first point, i understand: Temporal is stricter than IETF requires, which is great. Your second and third points as well: i'm convinced the default output should be maximally compatible, as long as all the edge case output formats are available.

Regarding your two options, I prefer the first - merely providing a .isCritical or similar, and leaving it to userland to figure out how to get that into the output (the toString options object seems like a fine way). I do agree that if the default format is unchanged, then it would be web-compatible to add later - however, it's worth noting that shipping without isCritical and adding it later adds a sizable burden on polyfills, although of course, this doesn't hold weight in normative decisions :-)

@ptomato Given that I'm now convinced that default output should remain unchanged, I'm content to proceed as-is with a potential temporal v2 follow-on for adding an accessor for this.

@ljharb ljharb closed this as completed Sep 13, 2022
@ptomato
Copy link
Collaborator

ptomato commented Sep 13, 2022

Thanks. I will try to get an item onto the agenda for Wednesday.

@FrankYFTang
Copy link
Contributor

FrankYFTang commented Sep 16, 2022

I want to point out while Temporal does parse this string the design goal of Temporal is not a parser. It is not the design requirement for Temporal to provide information for anything it parsed from the grammar. If you think the user really need the .isCritical, please propose a new stage 0 TemporalParser proposal to expose that. What you are asking is a reasonable usage for a TemporalParser but we have no such object exist. All the Temporal object are not a parser, they just use an internal parser to perform the parsing job and the interface of these Temporal objects themself are not designed as a parser.

@ljharb
Copy link
Member Author

ljharb commented Sep 17, 2022

If they use an internal parser then they’re still a parser, they just don’t expose the parsed structure directly :-)

@FrankYFTang
Copy link
Contributor

If they use an internal parser then they’re still a parser, they just don’t expose the parsed structure directly :-)

yes, it USES a parser but it is NOT a parser.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants