Parsing a string with a custom format #2

Open
ptomato opened this issue Feb 11, 2021 · 4 comments

@ptomato
Collaborator

ptomato commented Feb 11, 2021

Summary of discussion from tc39/proposal-temporal#796:

When working with legacy or third-party systems, you often find yourself receiving date strings in the most unwieldy formats. To remedy this, date libraries often support custom format strings to make it possible to parse such strings.

A Temporal.(type).fromFormat(string, formatString) method was proposed. The date format pattern syntax from Unicode TR 35 was identified as a likely microformat to use. It's also used by date-fns with a few additions.
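
To make the idea concrete, here is a minimal userland sketch of format-driven parsing. It is not the proposed API: the function name parseWithFormat is hypothetical, it returns a plain object of fields rather than a Temporal object, and it assumes only three numeric TR 35-style tokens (yyyy, MM, dd). Everything else in the format string is treated as a literal that must match exactly.

```javascript
// Hypothetical helper, NOT part of Temporal or any library.
// Assumption: only these three numeric tokens are supported.
const SKETCH_TOKENS = new Map([
  ["yyyy", { field: "year", pattern: "(\\d{4})" }],
  ["MM", { field: "month", pattern: "(\\d{2})" }],
  ["dd", { field: "day", pattern: "(\\d{2})" }],
]);

function parseWithFormat(input, formatString) {
  const fieldOrder = [];
  // Split the format into known tokens and the literal text between them.
  const regexSource = formatString
    .split(/(yyyy|MM|dd)/)
    .map((piece) => {
      const token = SKETCH_TOKENS.get(piece);
      if (token) {
        fieldOrder.push(token.field);
        return token.pattern;
      }
      // Escape literal text so characters like "." or "/" match themselves.
      return piece.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    })
    .join("");

  const match = new RegExp(`^${regexSource}$`).exec(input);
  if (!match) {
    // Error out instead of guessing, unlike Date.parse().
    throw new RangeError(`"${input}" does not match format "${formatString}"`);
  }

  const fields = {};
  fieldOrder.forEach((name, index) => {
    fields[name] = Number(match[index + 1]);
  });
  return fields;
}

console.log(parseWithFormat("07-28-2020", "MM-dd-yyyy"));
```

A string that does not match the format throws a RangeError, so there is no silent fallback to an implementation-defined guess.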

This was the most popular feature that didn't make it into the original Temporal proposal, as determined unscientifically by GitHub emoji reactions and number of times it was mentioned in our feedback survey.

Advantages:

Parsing is difficult to do correctly, and possibly out of reach entirely for inexperienced developers. This is one of the reasons developers have turned to third-party libraries instead of using legacy Date.

Parsing textual month names or era names in a locale-aware way would require including a large bundle of locale data, defeating the purpose of using Temporal rather than Moment or another locale-aware library.

Concerns:

The philosophy in the original Temporal proposal was that parsing anything other than ISO strings is in the domain of business logic because everybody knows their own use case better than Temporal can.

Whatever is enabled here should not repeat the mistakes of Date.parse().

Prior art:

  • Moment.js supports passing a format string as a second argument: moment("07-28-2020", "MM-DD-YYYY")
  • Luxon also supports this through its .fromFormat() method: DateTime.fromFormat("07-28-2020", "MM-dd-yyyy")
  • date-fns: parse('07/28/2020', 'MM/dd/y', new Date())
  • Java's DateTimeFormatter is an entity you construct with your pattern ('yyyy-MM-dd' etc), and use for both parsing and printing/formatting.

Constraints / corner cases:

  • Temporal.PlainDate.fromFormat('2021-02-11', 'yyyy-MM') — format string doesn't include all the required fields
  • Temporal.PlainDate.fromFormat('2021-13-11', 'yyyy-MM-dd') — month is invalid in the ISO calendar but could be valid in another calendar. Need a way to indicate in which calendar the string is parsed.
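
One way to handle both corner cases is to validate the parsed fields against an explicitly named calendar and throw on anything missing or out of range. The sketch below is only an illustration under stated assumptions: checkPlainDateFields and its calendar parameter are hypothetical names, only the month range is checked, and the table covers just two calendars (the ISO calendar with 12 months and the Ethiopic calendar with 13).

```javascript
// Hypothetical validation step, NOT part of Temporal.
// Assumption: only these two calendars are known to the sketch.
const MONTHS_PER_YEAR = new Map([
  ["iso8601", 12],
  ["ethiopic", 13],
]);
const REQUIRED_DATE_FIELDS = ["year", "month", "day"];

function checkPlainDateFields(fields, calendar = "iso8601") {
  // Case 1: the format string did not produce every required field.
  for (const name of REQUIRED_DATE_FIELDS) {
    if (!Number.isInteger(fields[name])) {
      throw new TypeError(`missing required field: ${name}`);
    }
  }

  // Case 2: whether month 13 is valid depends on the calendar.
  const maxMonth = MONTHS_PER_YEAR.get(calendar);
  if (maxMonth === undefined) {
    throw new RangeError(`calendar not covered by this sketch: ${calendar}`);
  }
  if (fields.month < 1 || fields.month > maxMonth) {
    throw new RangeError(`month ${fields.month} is out of range for ${calendar}`);
  }
  return { ...fields, calendar };
}

// Month 13 is accepted only when the caller names a 13-month calendar.
console.log(checkPlainDateFields({ year: 2013, month: 13, day: 5 }, "ethiopic"));
```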
@craigmiller160

I am a huge proponent of this proposal. The Temporal API is absolutely amazing, and I am so impressed by its capabilities. The inability to properly format it, however, is a huge limitation.

@Eonasdan

Hello, I maintain a popular OSS date/time picker. I'm excited to see what Temporal is bringing to the JavaScript table because let's face it, the JS Date object kinda sucks.

Originally, I was using momentjs to handle input field parsing. A developer could provide the format string they wished to use and my picker, via moment, would parse string to date and date back to string into the input field.

Momentjs is now deprecated and should not be used on new projects. At first, I looked at dayjs since it is almost a drop-in replacement, but this would require a different library and sometimes 2-3 extra scripts to be loaded to be on par.

I ended up crafting my own solution that enhances the native Date object and makes heavy use of Intl. However, even with that, I find that extracting and displaying date information is too limited with the current native solutions. I went back to dayjs for examples of how they handle string parsing and created a plugin that non-en-US users could use to provide and parse custom date strings. I'm now looking at pulling this plugin into the main code since Intl doesn't allow you to provide an output format, e.g. MM-yyyy.

My custom code uses format tokens close to the C# format tokens. I couldn't find an agreed-upon set of standards for that.

All that to say is that the one thing I find missing from Temporal is parsing in and formatting out. It seems to me that once Temporal is fully supported by the browsers, I will still need something custom to do this part. Given the number of JS libraries that attempt to do this in any meaningful way, I think that it would be a great idea if a native JS solution existed to handle this.

It was mentioned that a concern is if a string + format was incorrect/missing. I think in this instance, it is perfectly reasonable for the API to simply error.

If you'd like to see what I have, you can find that here. It's not perfect, and I am hunting down a bug with it, but it works pretty well so far.

Thanks for your consideration, and sorry for the wall of text.

@justingrant

Hi @Eonasdan - thanks so much for your kind words about Temporal! Glad it can be helpful for you.

> All that to say is that the one thing I find missing from Temporal is parsing in and formatting out.

For formatting out, there are the toLocaleString methods of each Temporal object, which are wrappers around essentially the same functionality in an upcoming, Temporal-aware version of Intl.DateTimeFormat.

What's missing for you with toLocaleString that makes it inadequate for your needs?

For parsing, locale-aware parsing is a long-running technical problem that Temporal is not planning to solve any time soon, so libraries like yours (or Luxon or day.js or...) will be helpful!

@Eonasdan

I'm hoping to make a POC of my picker using Temporal at some point btw. I'm curious how well it will work and what problems it could solve and what it could replace that I had to do myself.

The problem with Intl.DateTimeFormat and toLocaleString is that they are very rigid. Intl.DateTimeFormat('en', {dateStyle: 'short', timeStyle: 'long'}).format() will produce 10/26/22, 5:58:58 PM EDT for me, but what if I don't want the comma? There is no built-in way to specify that I want MM/dd/yyyy hh:mm:ss A z or MM-dd-yyyy-hh:mm:ss a z without some kind of library to deal with this.

There's no way to get MM-yyyy and so on.
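
For context, the closest native workaround today is to assemble the string by hand from the parts that Intl.DateTimeFormat exposes through formatToParts(). The sketch below shows that for the MM-yyyy case. The helper name formatMonthYear is hypothetical, and the en-US locale and UTC default are assumptions chosen so the parts are predictable; it is the kind of custom glue code the comment describes, not a format-string API.

```javascript
// Hypothetical helper built on the real Intl.DateTimeFormat formatToParts() method.
// Assumptions: en-US locale (Gregorian calendar, Latin digits), UTC by default.
function formatMonthYear(date, timeZone = "UTC") {
  const parts = new Intl.DateTimeFormat("en-US", {
    year: "numeric",
    month: "2-digit",
    timeZone,
  }).formatToParts(date);

  // Index the parts by type so the separators chosen by Intl can be ignored.
  const byType = Object.fromEntries(parts.map((part) => [part.type, part.value]));

  // Join the pieces with whatever separator the caller actually wants.
  return `${byType.month}-${byType.year}`;
}

console.log(formatMonthYear(new Date(Date.UTC(2022, 9, 26)))); // "10-2022"
```

Each new output shape needs its own hand-written assembly like this, which is the gap a native token-based format would close.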

If JS had an opinionated way of saying "these are the tokens you can use to parse dates in/out", there would be less need for all these date libraries out there. When you look at the sea of JS libraries that attempt to fill a hole in JS, it seems like it would be worth reviewing the merits of providing a native solution.
