revisit round-trip matching constraint for number literal inferencing #57404

Closed
6 tasks done
dimitropoulos opened this issue Feb 14, 2024 · 16 comments
@dimitropoulos
Contributor

dimitropoulos commented Feb 14, 2024

🔎 Search Terms

infer number, extends number, extends bigint, binary number, number representation, number notation, exponential notation, binary notation, hex numbers, hexadecimal numbers, hexadecimal notation, literal numbers, number literals

✅ Viability Checklist

  • This wouldn't be a breaking change in existing TypeScript/JavaScript code
    • strictly speaking might depend on the implementation, but it seems like some exist
  • This wouldn't change the runtime behavior of existing JavaScript code
  • This could be implemented without emitting different JS based on the types of the expressions
  • This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, new syntax sugar for JS, etc.)
  • This isn't a request to add a new utility type: https://github.com/microsoft/TypeScript/wiki/No-New-Utility-Types
    • it seems like it can be implemented without a utility type, but that will depend on the implementation
  • This feature would agree with the rest of our Design Goals: https://github.com/Microsoft/TypeScript/wiki/TypeScript-Design-Goals
    • I reread them twice just now and I can strawman reasons against checking this box, but it seems fairly reasonable, really

⭐ Suggestion

Back in #48094, constrained "infer" types in template literals were set to be limited by a "round trip" constraint. Meaning: numeric inference of string literals would only be allowed for literals that remain the same going from string to number and then back to string again. E.g.:

  • "123" satisfies the constraint because "123" -> 123 -> "123"
  • 🟥 "0x10" isn't allowed because "0x10" -> 16 -> "16"

That makes sense. I can certainly understand the tradeoff of preferring type system performance and simplicity over the nuanced edge cases of number-to-string conversion. This kind of type-system arithmetic wasn't a common use case for end users at the time.

However, since #48094, there've been quite a few use cases that have popped up.

Some of them even impact real-world libraries* that have been inconvenienced by not being able to infer number literals that don't satisfy the round-trip constraint.

Let's consider a common ToNumber utility type and less-common variant ToBigInt, defined roughly as:

type ToNumber<T extends string> = T extends `${infer N extends number}`
  ? N
  : never;

The following table has:

  • ✅ Allowed: what can be inferred as a number literal today
  • 🟥 Blocked: what would be possible if the round-trip constraint were removed

Each use case below shows the post-TS 4.8 behavior, followed by notes.

Fractional Numbers

Literal   ToNumber
2         ✅ 2
2.1       ✅ 2.1
-2        ✅ -2
-2.1      ✅ -2.1
2.0       🟥 number
2.10      🟥 number

Fractional number representations ending in zero are probably the ones that people hit the most.

This is one that people try to fix with recursion. For example, see @anuraghazra's attempt at fixing this problem (link). You pay one level of recursion per digit of the number, which is probably OK since the recursion max is 100.
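For readers who haven't seen the recursive workaround, here is a minimal sketch of one way it might look. This is not @anuraghazra's code (the link isn't preserved here); `StripTrailingZeros` is a made-up name, and the sketch assumes TS 4.8+ for `infer ... extends`.

```typescript
type ToNumber<T extends string> = T extends `${infer N extends number}`
  ? N
  : never;

// Hypothetical helper: drop trailing zeros after the decimal point,
// one character per recursion step, so the text becomes canonical.
type StripTrailingZeros<S extends string> =
  S extends `${infer Int}.${infer Frac}`
    ? Frac extends `${infer Rest}0`
      ? StripTrailingZeros<`${Int}.${Rest}`>
      : Frac extends ""
        ? Int // nothing left after the dot, e.g. "2." becomes "2"
        : S
    : S;

// These assignments only type-check if the literal was inferred.
const normalizedTwoPointOne: ToNumber<StripTrailingZeros<"2.10">> = 2.1;
const normalizedTwo: ToNumber<StripTrailingZeros<"2.0">> = 2;
console.log(normalizedTwoPointOne, normalizedTwo);
```

The cost is one instantiation per trailing zero, which is the "recursion tax" mentioned above.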

The 1e-6 Boundary

Literal     ToNumber
0.000001    ✅ 0.000001
0.0000001   🟥 number
1e-6        🟥 number
1e-7        ✅ 1e-7

For small numbers, TypeScript switches its underlying notation somewhat arbitrarily at the 1e-6 boundary.

This causes a "flip": decimal notation infers fine down to the 1e-6 boundary, but as soon as your number gets smaller than that, the inferencing breaks unless you switch to e-notation.

The 1e20 Boundary

Literal                   ToNumber
100000000000000000000     ✅ 100000000000000000000
1000000000000000000000    🟥 number
1e+20                     🟥 number
1e20                      🟥 number
1e+21                     ✅ 1e+21
1e21                      🟥 number

This one is sort of an inverse of the above (just for large numbers) but with an added footgun: if you don't have the + sign once you get into the e-notation range, then it will also not work because TypeScript always includes the + in this range.
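Both boundaries can be observed at runtime with plain number-to-string conversion. A small sketch (the variable names are arbitrary):

```typescript
// Values on either side of the two notation boundaries.
const samples: number[] = [0.000001, 0.0000001, 1e20, 1e21];

// String() yields the canonical text that a template literal would produce.
const canonical: string[] = samples.map((value) => String(value));
console.log(canonical);

// "1e21" parses fine, but its canonical text carries an explicit "+".
const reprinted: string = String(Number("1e21"));
console.log(reprinted);
```

The canonical forms are "0.000001", "1e-7", "100000000000000000000" and "1e+21", which matches the rows that infer a literal in the two tables above; "1e21" reprints as "1e+21", which is the missing-plus footgun.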

Base Notations

Literal       ToNumber
0b10          🟥 number
0xff          🟥 number
0o12345670    🟥 number

Binary, Hexadecimal, and Octal number notations don't work. Common approaches for binary require lots of recursion if you have a scenario where you need to convert a binary number to decimal.

Hex numbers have perhaps even more use cases. A lot of the use cases that are listed in #54925 also apply here (e.g. RGB values, reading bytes, etc.).

Numeric Separators

Literal           ToNumber
1_000.000_1       🟥 never
1_000             🟥 never
0b11_1110_1000    🟥 never
0x31_78_c6        🟥 never
1_000n            🟥 never

Separators are allowed in number literals but never in strings: they're always dropped in TypeScript's representation, so any string input containing a separator can never match.
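The never results for separators (versus number for the hex and binary rows) seem to line up with what Number() does at runtime: a string containing "_" doesn't parse at all. The sketch below is only a runtime analogy under that assumption, not a description of the checker, and `parseStatus` is a made-up helper.

```typescript
// Hypothetical helper: classify a string by how Number() treats it.
function parseStatus(text: string): string {
  const parsed = Number(text);
  if (isNaN(parsed)) {
    return "not numeric"; // e.g. anything containing a separator
  }
  return String(parsed) === text ? "round-trips" : "numeric, but not canonical";
}

console.log(parseStatus("1000"));       // plain decimal text
console.log(parseStatus("0xff"));       // parses, but reprints as "255"
console.log(parseStatus("1_000"));      // Number("1_000") is NaN
console.log(parseStatus("0x31_78_c6")); // also NaN
```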

BigInts

BigInts as inputs are not allowed*. The typical code you see for this looks like

type ToBigInt<T extends string> = T extends `${infer N extends bigint}` ? N : never;

2n as a literal works, but ToBigInt<"2n"> results in never, and ToBigInt<"2"> is the way to get 2n. This is different from the above because in this situation the string representation and the number representation do match, but it only works if the input is not a bigint. This seems to break the round-trip rule, because in this case if the input is 2n and the output is 2n then you'd think they'd match.

* You can sorta flip the result of the last two cases by adding an n after the matching clause:
type ToBigIntN<T extends string> = T extends `${infer N extends bigint}n` ? N : never;
But this seems pretty inconsistent with how binary and hex numbers do actually match (although, not "all the way" to the point of being a literal).
also: a quirky consequence regarding `-0n` (don't laugh)

I also noticed that there's a (presumably unintended) behavioral mismatch regarding -0n. As far as I can tell, this is the only situation where you can get a bigint out "the other side".

There is no negative-zero BigInt as there are no negative zeros in integers. -0.0 is an IEEE floating-point concept that only appears in the JavaScript Number type (source). Yet, TypeScript allows it. I think that's sorta fine because, actually the BigInt constructor also allows it, and that's presumably what this code courses through anyway.

type PeopleOnTwitterAreGonnaMakeFunOfAnyoneWhoComplainsAboutThis = [
  /*✅*/ -0n, // 0n
  /*🟥*/ ToBigInt<"-0n">, // never

  /*✅*/ ToBigInt<"1">, // 1n
  /*✅*/ ToBigInt<"0">, // 0n
  /*🟥*/ ToBigInt<"-0">, // bigint
  /*✅*/ ToBigInt<"-1">, // -1n
]

⏯ Playground Link

Playground Link

thanks to @JoshuaKGoldberg for suggestions on how to clean up this issue's formatting

@fatcerberus

fatcerberus commented Feb 14, 2024

I don’t think any of these are bugs because `${number}` very intentionally only matches strings that roundtrip through a string-number-string coercion. In other words it doesn’t mean “any numeric string” but rather “any string which can be produced by `${num}` at runtime”. The behaviors you’ve observed are all a natural consequence of that, AFAICT.

As for your addendum, that’s a different thing entirely (but is also intentional): #46124 (comment)

When placeholders are immediately next to each other, the first placeholder just infers a single character from the source. The kind of placeholder being inferred to doesn't matter for determining the inferred text, it only matters for validating the text.

@dimitropoulos
Contributor Author

Thanks! I'm on the same page with you there about this being a consequence of the round-tripping. That's why I quoted Ron at the top of the Context section describing how the round-tripping was originally considered at design time. Actually, the test cases in that PR were a large motivator for finally opening this issue (as a bug, not a feature request) because those test cases happen to just fly by some of the nuance I mention here that feels buggy from a "normal person using TypeScript" standpoint.

If it's more appropriate to convert this to a feature request, that's fine by me: I just want to know if there's any way to go about improving this (again, I have ideas, but I wanna make sure the problem domain is agreed on first). Seems like there should be some way to protect against most of the cases I brought up (hopefully 😄).

Thanks also for the note re: the addendum!

@jcalz
Contributor

jcalz commented Feb 14, 2024

I don’t think `${number}` actually has such a round-trip requirement, even though the inference does:

#41893 (comment)

It’s inconsistent but I’m not sure they have an appetite for doing anything to change it.

@dimitropoulos
Contributor Author

haha, I totally know what you mean @jcalz re: appetite to improve type numbers. Actually, as a personal rule I try very hard to never submit "make numbers better" feature requests (as a matter of respect). In this case, I realized when you add it all up it's quite a mountain of strange behavior (until you realize what's really going on re: round-tripping).

I think most people hit the "trailing zero" problem first, and it's the only one with a workaround (albeit a tad expensive, recursion-wise). My hope with this issue was to show all the places with the same root cause. When you look at it all from a distance.. it's a lot! And besides, I have the appetite to fix it now that it's thoroughly blocked me (hundreds of thousands of binary numbers, making recursive techniques a non-option).

@jcalz
Contributor

jcalz commented Feb 14, 2024

Nobody here is asking for my opinion but here it comes anyway:

I think template literal types should only refer to what happens when you use a template literal string. So `${number}` and T extends `${infer N extends number}` ? N : never should only refer to the set of values you can get when serializing a numeric value via template literal string, which would essentially require round-tripping.

It is a very natural and reasonable thing to want to have some way to represent a string which could successfully be parsed as a number, but that's not what template literal strings do, at all; and pushing such semantics into template literal types feels like a category error to me.

In my own personal version of TypeScript that lives only in my dreams, I would have a completely different set of tools for what you're trying to do that don't (ab)use template literals. Imagine an intrinsic type like Numberable corresponding to any type which would successfully be parsed as a number (I suppose this would exclude NaN as being considered a "success") and then an intrinsic type ToNumber<T extends Numberable> which represents the type of the output of Number(t) where t is of type T. Or some other syntax, but it should stay well away from template literals.

But in the issue I referenced it was made clear that nobody but me wants it that way. Now this issue is requesting that such behavior be expanded to include inference. I think that's probably ultimately fine; I'd happily use such a feature if it existed (and it is a feature request, not a bug, this is working as intended). But it's hard to explain how or why this would have anything at all to do with template literals, leading to a weird mental model that does special crazy magic for numbers but not for, say, booleans.

@dimitropoulos
Contributor Author

dimitropoulos commented Feb 14, 2024

Imagine an intrinsic type like Numberable

That would be fantastic! Last time this came up, the blocker was something to do with intrinsic being strongly tied to string literal types, but perhaps in the two years since then (especially with NoInfer having landed) that requirement has been loosened? I'm probably just missing it, but it's not immediately clear to me how the issue you referenced means that nobody wants it that way.

re: bug vs feature request. sure: that's fine by me :) -> I guess all I wanted to know first is something like:

yes, it's intended behavior that ToNumber<"0.000001"> returns a literal but ToNumber<"0.0000001"> doesn't.

After all, not every observable behavior of the number inferring has been considered by-design, for example whitespace infers to number and is accepted as a bug. To me, some of the things in this PR are not far afield from that one.

@fatcerberus

fatcerberus commented Feb 14, 2024

@jcalz FWIW, I agree with you - especially given @ahejlsberg’s stated insistence that template type inference not “turn into another regex engine”. It would make perfect sense to me if `${number}`-the-type consistently represented only the set of things producible by `${numberVar}`-the-string-literal.

Of course, I’m also a pragmatist who recognizes that what I want doesn’t really square with the way template type inference is used by TS coders today; as such, I consider this genie to be already out of the bottle, and have to reluctantly agree with the implied feature request: if it’s indeed intentional that "2.0" is assignable to `${number}`, then that should work in the opposite direction too.

RyanCavanaugh added the "Needs Investigation" label (This issue needs a team member to investigate its status.) Feb 14, 2024
@RyanCavanaugh
Member

I don't know how we'd ever square this circle in a way that people found acceptable. Probably 99.9% of embedded ${number} literals are consumed in a parseInt/parseFloat/+n context and it seems beyond comprehensibility to reject "2.0" in that context.

This is a crash because,

A crash is when tsc exits abnormally due to an exception. Untasteful behavior is not a crash 😉

@fatcerberus

it seems beyond comprehensibility to reject "2.0" in that context.

...well yes, that's the point of the issue - "2.0" is rejected when inferring from a template literal. 😉

@dimitropoulos
Contributor Author

dimitropoulos commented Feb 14, 2024

@RyanCavanaugh ohhhhh as in a runtime crash! sorry about that! I stared at the options on the form for a long time

- This is a crash
- This changed between versions ______ and _______
- This changed in commit or PR _______
- This is the behavior in every version I tried, and I reviewed the FAQ for entries about _________
- I was unable to test this on prior versions because _______

and I was thinking about it in terms of "a type that parses can suddenly stop parsing and return never" but now I see that was pretty near-sighted of me, haha. I almost forgot that <sarcasm>some people use TypeScript for more than just the type system</sarcasm>! heh. sorry!

It seems like that's a clear indication it should be a feature request so I updated it as such (as best I could, hopefully I didn't screw something up).


re:

in a way that people found acceptable

I realized that I didn't say it above anywhere, but I just wanted to clarify that I find the current implementation completely acceptable. Anyone who says that TypeScript hasn't "gone far enough" with this stuff is being silly. ✨ TypeScript is wonderful ✨ even if you can find ways (like 2.0) to trip it up. Everything's a trade-off, haha. I totally understand.

So on that note @rbuckton since you're assigned to this I wanted to say the intentions for making this issue were:

  1. (most importantly) To confirm that the behaviors I described above are all known and acceptable. I know some of them may be, but I didn't see discussion or tests that covered all of them.
  2. There didn't seem to be a place that sort of compiled the current "here's all the edge cases" and gave them the context of what they have in common. Even if this issue is immediately closed, it can be a reference for that stuff whenever it comes up.
  3. To indicate that I am highly motivated to help in any way I can to get this work over the line. I have the time and I'm willing to sink it into this problem.

dimitropoulos changed the title from "number literal inferencing is dependent on TS's internal representation due to a round-trip matching constraint" to "revisit round-trip matching constraint for number literal inferencing" Feb 14, 2024
@vtgn

vtgn commented May 13, 2024

Hi!
I've got the same problems as some of the ones listed above, and I'm disappointed my types don't work as logically expected. :'(
I think that ToNumber type should return the same result as the following type for the same argument in number:

type ResolvedNumber<N extends number> = N

Example:
ResolvedNumber<2e1> returns 20
The same way, it would be logical that ToNumber<"2e1"> returns 20.

@vtgn

vtgn commented May 13, 2024

@dimitropoulos You forgot the ".<digits>" format in your table, which doesn't work either.
Ex: ToNumber<".1"> returns number instead of 0.1

@rbuckton
Member

While there may be some odd quirks to review for some cases, the general principle is that you can only use infer to pull out of a string what would have been put into the string. If you write `${2.0}` in a JS engine, you will always get "2" and never "2.0". Regardless of how you write your number, what actually gets put into the string is the canonical numeric string representation of that number, so that is all we should ever try to extract. infer is not a general-purpose ToNumber mechanism.

@rbuckton
Member

rbuckton commented May 30, 2024

As to the specific use cases mentioned:

Fractional Numbers

The canonical string representations of 2.0 and 2.10 are "2" and "2.1", respectively.

The 1e-6 Boundary
The 1e20 Boundary

This is specific to how JavaScript formats IEEE floats for a radix of 10, per Step 6 of Number::toString. Once a value passes a specific order of magnitude in either direction, the canonical string representation uses the scientific notation form (Steps 7+ of the algorithm). Decimal notation is used while the algorithm's n is between -5 and 21; since the scientific exponent is n - 1, that corresponds to exponents from -6 to 20, which is why it switches at these boundaries.
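Restated as a predicate, the rule might look like the sketch below. `usesExponentNotation` is a hypothetical name, and the sketch assumes finite, non-zero inputs; it is a simplification of the spec steps in terms of magnitude, not a transcription of them.

```typescript
// Hypothetical helper. Assumption: value is finite and non-zero.
// Canonical text switches to e-notation below 1e-6 and from 1e21 upward.
function usesExponentNotation(value: number): boolean {
  const magnitude = Math.abs(value);
  return magnitude < 1e-6 || magnitude >= 1e21;
}

const probes: number[] = [1e-6, 1e-7, 0.5, 123456789, 1e20, 1e21, -1e21];
for (const probe of probes) {
  // Compare the prediction with what String() actually prints.
  const printed = String(probe);
  console.log(printed, usesExponentNotation(probe), printed.indexOf("e") !== -1);
}
```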

Base Notations

The canonical string representation of a given Number is always presented either as a base 10 fixed decimal or via scientific notation, via the rules mentioned above.

Numeric Separators

Numeric separators are not preserved in an actual Number value and thus are not included in the canonical string representation.

BigInts

The canonical string representation for a given BigInt never includes the n suffix.


I don't believe any of these should be supported by infer as they are not actually produced by .toString() for any runtime values of Number or BigInt. If you wanted something like a type ToNumber<S, Radix> = intrinsic, that's a different feature entirely.

@dimitropoulos
Contributor Author

I don't believe any of these should be supported by infer

Thanks @rbuckton! That's the answer I was looking for. I really appreciate you taking a closer look. I just wanted clarification if all of these are intended, and it seems like they are so.. that's that! I continue to be eternally grateful for the powerhouse of engineering known as TypeScript. I'll be fine without this little wrinkle (of JavaScript itself, ultimately) being ironed out.

If you wanted something like a type ToNumber<S, Radix> = intrinsic, that's a different feature entirely.

Totally agree. I'm sure anyone that tries to make the case for this in the future will reference this issue or the cases I mentioned, but I won't be the one to file it! haha.


For the 100th time. Hats off to the TypeScript team for what's already possible. Anyone who reads this issue and somehow draws the conclusion that TypeScript isn't good enough.... I'd suggest you reconsider.

@teamchong

The round-trip constraint for numeric literals ensures consistency in TypeScript, as noted in the discussion. Using TypeScript types to parse number formats can address many edge cases. Here’s an example approach: https://tsplay.dev/weAbgw

rbuckton removed the "Needs Investigation" label (This issue needs a team member to investigate its status.) Oct 22, 2024