Well-defined timezone handling #25

Closed
dchiba opened this issue Jan 30, 2020 · 4 comments
Labels
requirements: Issues related with MF requirements list
resolve-candidate: This issue appears to have been answered or resolved, and may be closed soon.

Comments

@dchiba

dchiba commented Jan 30, 2020

This thread is a spin-off from the requirements gathering (issue #3) about the way timezone is handled in formatting a date/time value.

Original description:

The way a timezone adjustment is made in date/time formatting should be clearly specified. The default timezone conversion behavior should be reasonable and unambiguous. The message author should be able to optionally specify a desired timezone conversion. This is meant to make it easier for applications to support timezones correctly.

@grhoten commented:

It's my preference that the time zone handling should be a part of the calendar object being formatted and not in the message format.

The date/calendar object being formatted could carry timezone information, in which case the formatter can simply print the value in that timezone. However, in many cases the original value is normalized in UTC or otherwise missing timezone information, so the formatter must figure out the timezone to print the value in.

Typically, a calendar object uses a field based internal data structure which contains a timezone field, while a date object uses a simple incremental millisecond count since the Unix time epoch in UTC. The latter type is often chosen as the only native datatype for date/time in modern programming environments as it is easier to evaluate and meets a great majority of the needs of the applications.

Requiring the application to convert a date to a calendar just for presentation in the desired timezone is undesirable. In fact, modern programming environments already support implicit conversion for basic application scenarios such as printing the current date/time or a timestamp in the user's local time. Applications don't have to specify the presentation timezone, because a 'default timezone' is automatically applied. For instance, in a browser, the default timezone is the local timezone of the machine it is running on, and new Date().toString() prints the current date and time in that local timezone.

The desired default timezone is often the local timezone of the end user, but there are many exceptions; for instance, on a mobile device, the desired default may not be the timezone of the user's physical location. Instead, it may be that of the user's "home" location.

Here is a summary of different patterns to apply a timezone in formatting a date/time value:

  1. The value is timezone independent so no conversion is applied. e.g. DOB
  2. The "default timezone" as defined by the environment should be used.
  3. The user's preferred/home timezone set in the user profile should be used. e.g. appointments in personal calendar
  4. A business entity has an associated timezone that should be used. e.g. flight schedule, stock price quotes
  5. A timezone may be explicitly specified.

The syntax should use reasonable defaults and support options that enable the message author to specify how timezone should be applied.
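
A minimal Java sketch of how the five patterns above could map to a presentation timezone. Everything here (the ZonePolicy enum, the Context class, the resolve method) is a hypothetical name introduced for illustration, not part of any MessageFormat API. An empty result means "apply no conversion" (pattern 1).

```java
import java.time.ZoneId;
import java.util.Objects;
import java.util.Optional;

class PresentationZoneResolver {
    // One constant per pattern in the list above (hypothetical names).
    enum ZonePolicy { ZONE_INDEPENDENT, PLATFORM_DEFAULT, USER, BUSINESS_ENTITY, EXPLICIT }

    // Hypothetical runtime context supplied by the application.
    static final class Context {
        final ZoneId userZone;    // from the user profile, may be null
        final ZoneId entityZone;  // e.g. the zone of a stock exchange, may be null

        Context(ZoneId userZone, ZoneId entityZone) {
            this.userZone = userZone;
            this.entityZone = entityZone;
        }
    }

    static Optional<ZoneId> resolve(ZonePolicy policy, Context ctx, ZoneId explicitZone) {
        switch (policy) {
            case ZONE_INDEPENDENT:
                return Optional.empty(); // pattern 1: no conversion at all
            case PLATFORM_DEFAULT:
                return Optional.of(ZoneId.systemDefault()); // pattern 2
            case USER:
                // pattern 3: fall back to a default when the profile has no zone
                return Optional.of(ctx.userZone != null ? ctx.userZone : ZoneId.systemDefault());
            case BUSINESS_ENTITY:
                return Optional.of(Objects.requireNonNull(ctx.entityZone, "entityZone")); // pattern 4
            case EXPLICIT:
                return Optional.of(Objects.requireNonNull(explicitZone, "explicitZone")); // pattern 5
            default:
                throw new IllegalArgumentException("Unknown policy: " + policy);
        }
    }

    public static void main(String[] args) {
        Context ctx = new Context(ZoneId.of("America/Los_Angeles"), ZoneId.of("America/New_York"));
        for (ZonePolicy policy : ZonePolicy.values()) {
            System.out.println(policy + " -> " + resolve(policy, ctx, ZoneId.of("Asia/Tokyo")));
        }
    }
}
```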

=== Examples for Each Pattern ===

Pattern 1, timezone independent scenario: Most date values that represent a specific day are timezone independent, so date values should be printed in a timezone-neutral way by default. If a time portion is supplied, it should be ignored.

To print a date of birth, the message template may look similar to the following (The markup is for illustration of the concept only. 'dob' is the parameter name, 'date' is the type.):

DOB: {dob, date}

Let's consider a sample output 'DOB: January 29, 2020' in the US locale. The argument set to dob may take several different forms. Here is how the argument value(s) may look:

  • ISO 8601 date string : "2020-01-29" - no timezone conversion
  • ISO 8601 datetime string : "2020-01-29T12:34:56Z" - 'T' and the rest is ignored
  • A linear millisecond count since the Unix time epoch : 1580336155304 - evaluated in UTC and time portion is chopped off
  • A native Date object : Same as the millisec count as long as the date type is implemented using the Unix epoch based count (this is the case in JavaScript, Java and many others)
  • A Calendar object : {year:2020,month:1,day:29,...} Only the significant fields (year, month and day) are used, the others are ignored.
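
The argument forms listed above could be normalized to a timezone-neutral date roughly as follows. This is a sketch under assumptions: the class and method names are hypothetical, java.time.LocalDate stands in for the field-based Calendar object, and millisecond counts are evaluated in UTC as described in the list.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;

class ZoneIndependentDate {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("MMMM d, yyyy", Locale.US);

    // Reduces each supported argument form to year/month/day with no zone conversion.
    static LocalDate toLocalDate(Object arg) {
        if (arg instanceof LocalDate) {
            return (LocalDate) arg; // field-based value: use the date fields as-is
        }
        if (arg instanceof String) {
            String s = (String) arg;
            int t = s.indexOf('T');
            // Ignore 'T' and everything after it, if present.
            return LocalDate.parse(t < 0 ? s : s.substring(0, t));
        }
        if (arg instanceof Long) {
            // Epoch milliseconds: evaluate in UTC, then drop the time portion.
            return Instant.ofEpochMilli((Long) arg).atZone(ZoneOffset.UTC).toLocalDate();
        }
        if (arg instanceof Date) {
            return ((Date) arg).toInstant().atZone(ZoneOffset.UTC).toLocalDate();
        }
        throw new IllegalArgumentException("Unsupported argument type: " + arg);
    }

    static String formatDob(Object arg) {
        return "DOB: " + FMT.format(toLocalDate(arg));
    }

    public static void main(String[] args) {
        System.out.println(formatDob("2020-01-29"));
        System.out.println(formatDob("2020-01-29T12:34:56Z"));
        System.out.println(formatDob(1580336155304L));
        System.out.println(formatDob(new Date(1580336155304L)));
        System.out.println(formatDob(LocalDate.of(2020, 1, 29)));
    }
}
```

Each call in main prints 'DOB: January 29, 2020', regardless of the timezone of the host.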

Pattern 2, default timezone scenario: This pattern is for compatibility with the platform defaults so the behavior would vary depending on the platform's implementation. I feel this pattern does not have to be supported as long as the other patterns are supported. I am still including this as a pattern because it helps understand the differences from the other patterns and why using a platform default could cause problems.

In particular, because of the dependency on the platform's default implementation, the application behavior may be unpredictable or require a specific configuration to make the behavior predictable.

To print a date of birth using the platform's default timezone, the message template may look similar to the following (The markup is for illustration of the concept only. 'dob' is the parameter name, 'date' is the type, 'platform' is for selecting this option.):

DOB: {dob, date, platform}

Let's consider a sample output 'DOB: January 29, 2020' in the US locale. The argument set to dob may take several different forms. Here is how the argument value(s) may look:

  • ISO 8601 date string : "2020-01-29" - no timezone conversion
  • ISO 8601 datetime string : "2020-01-29T12:34:56Z" - the UTC based moment is converted to the default timezone and the date fields get printed. This may be one day ahead of or behind the date in UTC depending on the UTC offset of the default timezone.
  • A linear millisecond count since the Unix time epoch : 1580336155304 - the UTC based count is converted to the default timezone and the date fields get printed. Effectively the same processing as the previous case.
  • A native Date object : Same as the millisec count as long as the date type is implemented using the Unix epoch based count (this is the case in JavaScript, Java and many others)
  • A Calendar object : {year:2020,month:1,day:29,...} Only the significant fields (year, month and day) are used, the others are ignored. Platform default designation is also ignored, or an exception may be thrown because the expression and the argument type contradict each other.
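
To make the unpredictability concrete, here is a small Java sketch that prints the date fields of the sample millisecond count under two different "default" timezones. The class and method names are hypothetical, and the default zone is passed in as a parameter only so the effect can be shown without changing the JVM default; a real platform would use ZoneId.systemDefault().

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class PlatformDefaultDate {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("MMMM d, yyyy", Locale.US);

    // Prints only the date fields of an instant, as seen in the given zone.
    static String formatDob(Instant instant, ZoneId defaultZone) {
        return "DOB: " + FMT.format(instant.atZone(defaultZone));
    }

    public static void main(String[] args) {
        Instant instant = Instant.ofEpochMilli(1580336155304L); // 2020-01-29T22:15:55.304Z
        System.out.println(formatDob(instant, ZoneId.of("UTC")));        // January 29
        System.out.println(formatDob(instant, ZoneId.of("Asia/Tokyo"))); // January 30
        System.out.println(formatDob(instant, ZoneId.systemDefault()));  // depends on the host
    }
}
```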

Pattern 3, user's preferred/home timezone: A timestamp value that represents a specific moment is typically converted to a presentation timezone, which is normally the timezone of the user's home location. So timestamp values should be printed with timezone conversion applied by default. If no timezone is set, a default may be used.

To print when an object is last updated, the message template may look similar to the following (The markup is for illustration of the concept only. 'instant' is the parameter name, 'datetime' is the type, 'user' selects this option.):

Last updated: {instant, datetime, user}

Let's consider a sample output 'Last updated: January 29, 2020 05:20:30 PM' in the US locale where the user's preferred timezone is US Pacific time (8 hours behind UTC). The argument set to instant may take several different forms. Here is how the argument value(s) may look:

  • ISO 8601 local datetime string (no UTC offset) : "2020-01-29T17:20:30" - no timezone conversion
  • ISO 8601 datetime string : "2020-01-30T01:20:30Z" - the UTC based moment is converted to the user timezone 'America/Los_Angeles' and all fields get printed. The date in the sample output is one day behind the date in the UTC based timestamp because of the UTC offset of the user timezone (-8 hours).
  • A linear millisecond count since the Unix time epoch : 1580336155304 - the UTC based count is converted to the user timezone and all fields get printed. Effectively the same processing as the previous case.
  • A native Date object : Same as the millisec count as long as the date type is implemented using the Unix epoch based count (this is the case in JavaScript, Java and many others)
  • A Calendar object : {year:2020,month:1,day:29,hour:17,minute:20,second:30,tz:"America/Los_Angeles",...} Only the significant fields (year, month, day, hour, minute, second and tz) are used, the others are ignored. User timezone designation is also ignored, or an exception may be thrown because the expression and the argument type contradict each other.
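
The first two string cases above can be sketched in Java as follows. The class and method names are hypothetical; the user timezone is passed in explicitly here, whereas a real implementation would presumably read it from the user profile.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;
import java.util.Locale;

class UserZoneFormat {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("MMMM d, yyyy hh:mm:ss a", Locale.US);

    static String lastUpdated(String iso, ZoneId userZone) {
        // Try to read the string as a moment with an offset first, then as a local datetime.
        TemporalAccessor parsed = DateTimeFormatter.ISO_DATE_TIME
                .parseBest(iso, ZonedDateTime::from, LocalDateTime::from);
        LocalDateTime local;
        if (parsed instanceof ZonedDateTime) {
            // A specific moment: convert it to the user's timezone.
            local = ((ZonedDateTime) parsed).withZoneSameInstant(userZone).toLocalDateTime();
        } else {
            // No offset in the input: print the fields without conversion.
            local = (LocalDateTime) parsed;
        }
        return "Last updated: " + FMT.format(local);
    }

    public static void main(String[] args) {
        ZoneId user = ZoneId.of("America/Los_Angeles");
        System.out.println(lastUpdated("2020-01-30T01:20:30Z", user));
        System.out.println(lastUpdated("2020-01-29T17:20:30", user));
    }
}
```

Both calls print 'Last updated: January 29, 2020 05:20:30 PM'.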

Pattern 4, business entity timezone: A timestamp value that represents a specific moment may be converted for presentation to a timezone associated with a business entity.

When printing the time of intraday stock prices, the stock exchange is the business entity and the timezone of its location would be the relevant business entity timezone in which the timestamps should be presented. In this case, the message template may look similar to the following (The markup is for illustration of the concept only. 'time' is the parameter name, 'datetime' is the type, 'tz' may be the name of a function that supplies the timezone of the stock market, 'price' is where the stock price would go.):

{time, datetime, tz} {price}

Let's consider a sample output 'January 29, 2020 12:00:00 PM $234.56' in the US locale in the timezone of the stock market. The argument set to time may take several different forms. Here is how the argument value(s) may look:

  • ISO 8601 local datetime string (no UTC offset) : "2020-01-29T12:00:00" - no timezone conversion; rare, an exception may be thrown for the mismatch.
  • ISO 8601 datetime string : "2020-01-29T17:00:00Z" - the UTC based moment is converted to the timezone supplied by the tz function, e.g. 'America/New_York', so the 5 hour offset between UTC and US Eastern time is applied. The time in the sample output is 5 hours behind the time in the UTC based timestamp.
  • A linear millisecond count since the Unix time epoch : 1580336155304 - the UTC based count is converted to the timezone resolved by the tz parameter. Effectively the same processing as the previous case.
  • A native Date object : Same as the millisec count as long as the date type is implemented using the Unix epoch based count (this is the case in JavaScript, Java and many others)
  • A Calendar object : {year:2020,month:1,day:29,hour:12,minute:00,second:00,tz:"America/New_York",...} Only the significant fields (year, month, day, hour, minute, second and tz) are used, the others are ignored. The timezone designation in the expression is also ignored, or an exception may be thrown because the expression and the argument type contradict each other.
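
A rough Java sketch of the business entity pattern. The tz method below plays the role of the 'tz' function in the template; the exchange codes and the lookup table are assumptions made up for this example, as are the class and method names.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class EntityZoneFormat {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("MMMM d, yyyy hh:mm:ss a", Locale.US);

    // Hypothetical lookup table mapping a business entity to its timezone.
    private static final Map<String, ZoneId> EXCHANGE_ZONES = new HashMap<>();
    static {
        EXCHANGE_ZONES.put("NYSE", ZoneId.of("America/New_York"));
        EXCHANGE_ZONES.put("TSE", ZoneId.of("Asia/Tokyo"));
    }

    // Supplies the timezone of the given stock exchange.
    static ZoneId tz(String exchange) {
        ZoneId zone = EXCHANGE_ZONES.get(exchange);
        if (zone == null) {
            throw new IllegalArgumentException("Unknown exchange: " + exchange);
        }
        return zone;
    }

    // Formats the moment in the exchange's timezone, followed by the price.
    static String quote(Instant time, String exchange, double price) {
        return FMT.format(time.atZone(tz(exchange)))
                + String.format(Locale.US, " $%.2f", price);
    }

    public static void main(String[] args) {
        Instant time = Instant.parse("2020-01-29T17:00:00Z");
        System.out.println(quote(time, "NYSE", 234.56)); // January 29, 2020 12:00:00 PM $234.56
        System.out.println(quote(time, "TSE", 234.56));  // January 30, 2020 02:00:00 AM $234.56
    }
}
```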

Pattern 5, hardcoding scenarios: A specific timezone may be explicitly set in the message when it is appropriate to do so. This may be the case when the presentation timezone is known at the time the message is composed.

To print intraday stock prices in an application for NYSE, it may be fine to hardcode the US Eastern timezone. Then the message template may look similar to the following (The markup is for illustration of the concept only. 'time' is the parameter name, 'datetime' is the type, 'America/New_York' is the presentation timezone, 'price' is where the stock price would go.):

{time, datetime, 'America/New_York'} {price}

Let's consider a sample output 'January 29, 2020 12:00:00 PM $234.56' in the US locale in US Eastern time. The argument set to time may take several different forms. Here is how the argument value(s) may look:

  • ISO 8601 local datetime string (no UTC offset) : "2020-01-29T12:00:00" - no timezone conversion; rare, an exception may be thrown for the mismatch.
  • ISO 8601 datetime string : "2020-01-29T17:00:00Z" - the UTC based moment is converted to the specified timezone. The time in the sample output is 5 hours behind the time in the UTC based timestamp.
  • A linear millisecond count since the Unix time epoch : 1580336155304 - the UTC based count is converted to the specified timezone. Effectively the same processing as the previous case.
  • A native Date object : Same as the millisec count as long as the date type is implemented using the Unix epoch based count (this is the case in JavaScript, Java and many others)
  • A Calendar object : {year:2020,month:1,day:29,hour:12,minute:00,second:00,tz:"America/New_York",...} Only the significant fields (year, month, day, hour, minute, second and tz) are used, the others are ignored. In this case, it may be appropriate to convert from the timezone of the calendar to the specified timezone, although this is debatable. Alternatively, an exception may be thrown because the expression and the argument type contradict each other.
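
The two possible treatments of a Calendar argument whose timezone differs from the hardcoded one can be sketched in Java as below. This is only an illustration of the open question: the class name, the strict flag, and the 9 AM Pacific sample value are assumptions, and java.time.ZonedDateTime stands in for the Calendar object.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class ExplicitZoneFormat {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("MMMM d, yyyy hh:mm:ss a", Locale.US);

    // The timezone hardcoded in the message template.
    static final ZoneId PRESENTATION = ZoneId.of("America/New_York");

    // strict = true: reject a calendar whose zone contradicts the template.
    // strict = false: convert the calendar to the template's zone.
    static String format(ZonedDateTime calendar, boolean strict) {
        if (strict && !calendar.getZone().equals(PRESENTATION)) {
            throw new IllegalArgumentException(
                    "Argument zone " + calendar.getZone() + " contradicts " + PRESENTATION);
        }
        return FMT.format(calendar.withZoneSameInstant(PRESENTATION));
    }

    public static void main(String[] args) {
        // Hypothetical calendar value: 9 AM Pacific is noon Eastern.
        ZonedDateTime pacific = ZonedDateTime.of(2020, 1, 29, 9, 0, 0, 0,
                ZoneId.of("America/Los_Angeles"));
        System.out.println(format(pacific, false)); // January 29, 2020 12:00:00 PM
        try {
            format(pacific, true);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```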

=== Summary ===

Well-defined timezone handling enables the message author to control the presentation timezone to achieve the intended application behavior, by making the timezone conversion behavior predictable. The set of arguments the application needs to supply at runtime is clearly defined and a strict coding pattern is enforced to reduce the chance of encountering an unexpected result. Correct coding patterns to achieve desired application behaviors are promoted through syntax checking of the message, linting of the application and good documentation.

This is the opposite of what typically happens today: currently, the way application code is written largely defines the presentation behavior, so it is hard for the message author to tell if the message will be printed in the intended timezone. It is the application developer who is responsible for writing the correct code that performs the intended timezone conversion. With well-defined timezone handling, application developers would no longer play the primary role, because the message itself describes how timezone conversion is to happen.

@mihnita mihnita added the requirements Issues related with MF requirements list label Sep 24, 2020
@aphillips
Member

Time zone is interesting because it is orthogonal to locale and can have a "contextual" value (i.e. a default present in the runtime) as well as a need for developers to control it for a given MessageFormat instance.

For example, my host might be running in UTC. I might be formatting a message for a customer in America/Los_Angeles. My pattern is something like:

{Your order was processed on {$date :date skeleton=EyMd} at {$date :date skeleton=jm}}

To get the right results, I would expect to be able to control the timezone externally on the formatter. To use a Java example:

MessageFormatter fmt = MessageFormatter.builder(pattern)
           .setLocale(customerLocale)
           .setTimeZone(customerTZ)
           .build();

However... this seems like an implementation detail or feature. The core question is whether we should require implementations to do anything? I can't think of anything in the core syntax that requires changing here.

@aphillips aphillips added the resolve-candidate This issue appears to have been answered or resolved, and may be closed soon. label Jun 28, 2023
@macchiati
Member

macchiati commented Jun 28, 2023 via email

@aphillips
Member

@macchiati noted:

I think datetimes formatting are like units. You really want to format an element that consists of two parts: the date and the timezone.

I agree and modern Temporal data types help encapsulate this. That said, there are a variety of use cases in which the caller needs or wants to specify a specific time zone (or conversely wants a floating time presentation of a value that appears to have a time zone). Supporting these use cases requires some ability to pass in the desired time zone.

Some work on this is being documented in updates to the W3C time zone document, such as here: https://w3c.github.io/timezone/#serializations (shoutout to CJ Butenhoff and Abhijeet Kumar among others who did a lot of research on this). The ability to float and unfloat times and to convert or apply time zones to a given incremental time value is important for consistently getting the desired results.
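
For readers unfamiliar with the terms, floating, unfloating and applying a time zone can be sketched with java.time as follows. The class and method names are hypothetical; the sketch assumes that "floating" means dropping the zone while keeping the wall-clock fields, and "unfloating" means attaching a zone to those fields.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class FloatingTime {
    // Float: keep the wall-clock fields, discard the zone.
    static LocalDateTime floatTime(ZonedDateTime zoned) {
        return zoned.toLocalDateTime();
    }

    // Unfloat: interpret wall-clock fields as being in the given zone.
    static ZonedDateTime unfloat(LocalDateTime floating, ZoneId zone) {
        return floating.atZone(zone);
    }

    // Apply a zone to an incremental time value (epoch milliseconds).
    static ZonedDateTime applyZone(long epochMillis, ZoneId zone) {
        return Instant.ofEpochMilli(epochMillis).atZone(zone);
    }

    public static void main(String[] args) {
        ZoneId pacific = ZoneId.of("America/Los_Angeles");
        LocalDateTime floating = LocalDateTime.of(2020, 1, 29, 17, 20, 30);
        ZonedDateTime anchored = unfloat(floating, pacific);
        System.out.println(anchored.toInstant());            // 2020-01-30T01:20:30Z
        System.out.println(floatTime(anchored));             // 2020-01-29T17:20:30
        System.out.println(applyZone(1580336155304L, pacific));
    }
}
```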

One way to handle one part of this might be by providing tz options in the pattern syntax that developers can pass as arguments. For example:

{Today is {$date :datetime skeleton=EyMdjm tz=$myPassedTimeZone}}

What I don't see is a need for specific syntactical changes to the ABNF related to time zone. The registry entry for datetime should deal with time zone options. Implementations might also provide support for formatter-specific time zone overrides, such as in my example above (since the default time zone of the runtime is not always the one you want as your formatting default).
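
One way an implementation might combine a per-placeholder tz option, a formatter-level override and the runtime default is sketched below in Java. The precedence order (option first, then formatter override, then runtime default) is an assumption of this sketch and is not specified anywhere in this thread; the class and method names are likewise hypothetical.

```java
import java.time.ZoneId;
import java.util.Collections;
import java.util.Map;

class ZoneOptionResolver {
    // options: the options on the placeholder, e.g. {"tz": "$myPassedTimeZone"}
    // args: the arguments the caller passed to the formatter
    // formatterZone: a formatter-level override, or null if none was set
    static ZoneId resolve(Map<String, ?> options, Map<String, ?> args, ZoneId formatterZone) {
        Object tz = options.get("tz");
        if (tz instanceof String && ((String) tz).startsWith("$")) {
            // A variable reference: look up the value in the passed arguments.
            tz = args.get(((String) tz).substring(1));
        }
        if (tz instanceof ZoneId) {
            return (ZoneId) tz;
        }
        if (tz instanceof String) {
            return ZoneId.of((String) tz); // a literal zone ID such as "America/New_York"
        }
        if (formatterZone != null) {
            return formatterZone;
        }
        return ZoneId.systemDefault(); // last resort: the runtime default
    }

    public static void main(String[] args) {
        Map<String, String> options = Collections.singletonMap("tz", "$myPassedTimeZone");
        Map<String, ZoneId> passed =
                Collections.singletonMap("myPassedTimeZone", ZoneId.of("America/Los_Angeles"));
        System.out.println(resolve(options, passed, ZoneId.of("UTC"))); // America/Los_Angeles
        System.out.println(resolve(Collections.<String, Object>emptyMap(), passed, ZoneId.of("UTC"))); // UTC
    }
}
```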

@aphillips
Member

Closing resolve-candidates per discussion in 2023-07-24 call
