Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow pattern literal types like http://${string} to exist and be reasoned about #40598

Merged
merged 8 commits into from
Sep 23, 2020

Conversation

weswigham
Copy link
Member

There's no open issue, but we mentioned how /${string} simplified to just string was a bit of a shortcoming in teams, and it looked easy enough to change, so here it is. With this change, we can now reason over nongeneric patterns and match them, allowing things like:

type Protocol<T extends string, U extends string> = `${T}://${U}`;
function download(hostSpec: Protocol<"http" | "https" | "ftp", string>) { }
// ok, has protocol
download("http://example.com/protocol");
// issues error - no protocol
download("example.com/noprotocol");
// issues error, incorrect protocol
download("gopher://example.com/protocol");

@ahejlsberg
Copy link
Member

ahejlsberg commented Sep 17, 2020

I had code substantially similar to this in my branch for a while, but decided against keeping it in the initial PR. I really like the core concept if we can overcome the concerns I had. They are:

  • This starts to feel like RegEx string literals, except not as powerful. Counterpoint of course is that we already have the functionality in inference and inference is supposed to reflect our type relations.
  • What about union and intersection reduction involving these template literal pattern types? Take for example the intersection type `get${string}` & ('getFoo' | 'setFoo'). Should that reduce to just 'getFoo'?. What about `get${string}` & `${string}Prop`? Should that reduce to `get${string}Prop`? This could get complicated.
  • This will handle ${string} and we already handle patterns with ${boolean} because that reduces to 'true' | 'false'. But what about ${number} and ${bigint}. Seems like we need to support those as well, perhaps by saying we succeed for a string that round trips to numeric and back to the same string representation. But then what about `${number}${number}`? Should that infer a single digit similar to how we infer a single character for strings? And what about `${number}.{$string}` when inferring from '123.456'? Would we infer 123.456 for the number position, or just 123?
  • If we support ${number} and ${bigint}, how would it be possible to constrain an infer T declaration to only infer a number? Would we permit constraints in infer declarations, like infer T extends number?

In general I want to be as minimalistic as possible, but it is not entirely clear where to stop.

@RyanCavanaugh
Copy link
Member

@ahejlsberg I think those are great open questions and ones that we should have some tentative next steps on. The motivating scenario here

type HttpLink<T extends string> = `http://${T}>`
function fn(foo: HttpLink<string>) { }
fn("wat");

just looks so broken without this that I don't feel super comfortable shipping it without at least issuing some kind of interim error when the widening to string occurs

@weswigham
Copy link
Member Author

@ahejlsberg

This starts to feel like RegEx string literals, except not as powerful. Counterpoint of course is that we already have the functionality in inference and inference is supposed to reflect our type relations.

Having implemented those: Yeah, sure, they do feel similar in that it captures subsets of strings - but regexes are definitely the kitchen sink of string matching, while this is quite focused (and inline with what we've already decided to ship). If anything, considering the reaction within the team, there's some surprise they don't behave like this already - a string is conceptually just the set of all string literals, right? So the character / followed by "any string literal" is conceptually distinct from "all string literals" - a subset.

What about union and intersection reduction involving these template literal pattern types? Take for example the intersection type get${string} & ('getFoo' | 'setFoo'). Should that reduce to just 'getFoo'?.

Probably! Hopefully the subtype relationship works out for the one half of that, and all that's needed for the other half is simplifying get${string} & "StringThatDoesn'tMatch" to never as one of our literal intersection rules.

What about get${string} & ${string}Prop? Should that reduce to get${string}Prop? This could get complicated.

I wanna say "no" - matching patterns up to patterns, while possible, I don't think is as intuitive as matching literals to patterns - I see leaving it as get${string} & ${string}Prop as fine. Especially since with something like a${string} & ${string}a, there's no simple template it could merge down to and actually be equivalent, without also extracting specific literal cases for common prefixes and postfixes, like "a" | a${string}a. So yeah, while it's possible, I think it's not as high value - there's probably more clarity in retaining the input patterns, anyway.

This will handle ${string} and we already handle patterns with ${boolean} because that reduces to 'true' | 'false'. But what about ${number} and ${bigint}. Seems like we need to support those as well, perhaps by saying we succeed for a string that round trips to numeric and back to the same string representation. But then what about ${number}${number}? Should that infer a single digit similar to how we infer a single character for strings? And what about ${number}.{$string} when inferring from '123.456'? Would we infer 123.456 for the number position, or just 123?

I don't think we directly need to handle non-stringy holes quite so much. In fact, non-string holes I question the value of allowing at all; it feels very.... coercive, which consequently feels unsafe.

@ahejlsberg
Copy link
Member

I don't think we directly need to handle non-stringy holes quite so much. In fact, non-string holes I question the value of allowing at all; it feels very.... coercive, which consequently feels unsafe.

I think an argument can be made that we should handle ${number} and ${bigint} holes in relations, but have all inference results be strings (i.e. infer T positions are always constrained to string and there's no way to change that). I say this because it is already the case for ${boolean} holes, so we're halfway there already.

With respect to matching on ${number} positions, I think we should simply slurp up characters the same we we do for ${string} and then afterwards require that the string round trips.

@weswigham
Copy link
Member Author

I think an argument can be made that we should handle ${number} and ${bigint} holes in relations

While that makes sense (sort-of) for template-to-template relations (so, the current check generalizes to non-string types) where the string parts are similar enough, there's not much of a reasonable (IMO) literal-to-template relation for holes like that (and literal-to-template relations are where these patterns have value validating APIs). Plus, I think such holes are of little practical use, anyway - there's not really a demonstrated need for them, at least not yet (TBH, I think there's a bigger need for things like an Infinity and NaN type than those, since those capture edge cases in existing code while nongeneric ${bigint} holes would capture a coercive operation I don't think anyone's ever even asked us to think about checking yet). I'm more fine leaving the holes as mostly strings only - definitely strings only for the non-generic ones.

@DanielRosenwasser
Copy link
Member

Chatted about that with @RyanCavanaugh earlier today, and they're kind of impossible to reason about. What's the implementation for compatibility with `${number}`?

@treybrisbane
Copy link

Sorry to randomly jump in here, but if I may throw a random thought at y'all... 🙂

I don't think we directly need to handle non-stringy holes quite so much. In fact, non-string holes I question the value of allowing at all; it feels very.... coercive, which consequently feels unsafe.

Given the existence of #40580, would it make sense to introduce a ToString<T> intrinsic type to make that coercion more of a conversion?
E.g.

type Nay<S extends string, N extends number> = `${S}-${N}`; // "Error: Cannot embed non-string type N into string"

type Yay<S extends string, N extends number> = `${S}-${ToString<N>}`; // Works

IMO, this would make the "all inference results are strings" approach much more intuitive, as infer T doesn't have any kind of explicit "from string" operation (thus indicating that the inferred type would remain in string form).

It would also provide a bit of future flexibility, as it would leave the door open to some hypothetical future infer T extends ParseNumber<T> syntax to implement reverse conversions.

Thoughts?

@ahejlsberg
Copy link
Member

Plus, I think such holes are of little practical use, anyway - there's not really a demonstrated need for them, at least not yet

See #40538. We currently error on types like `${string}` and `${number}`, but we can't keep them from happening through instantiation. We need some behavior for them and our current behavior is not ideal. Hence my suggestion above.

@ahejlsberg
Copy link
Member

What's the implementation for compatibility with `${number}`?

The same as `${string}`, except we would require the matching substring to satisfy (+str).toString() === str. We already have a similar check in the isNumericLiteralName function in the compiler.

@ahejlsberg
Copy link
Member

Given the existence of #40580, would it make sense to introduce a ToString<T> intrinsic type to make that coercion more of a conversion?

That's a possibility, but what would ToString<number> resolve to? Presumably string, but then we still have the issue in #40538:

type Foo<T extends number> = `${ToString<T>}`;
const foo : Foo<number> = "bar";  // Huh?

@treybrisbane
Copy link

That's a possibility, but what would ToString<number> resolve to? Presumably string, but then we still have the issue in #40538

You're correct, that would still be a problem.

From a theoretical perspective, number is effectively "the union of all possible numeric literals". Assuming the hypothetical ToString operator distributes over unions, ToString<number> would presumably resolve to "the union of the string form of all possible numeric literals". This is obviously not quite string, but the actual type is not something that can currently be expressed in TypeScript. The closest alternative is, as you point out, string.

So I guess one way to get a "correct" result for ToString<number>/${number} would be to somehow introduce the concept of "the union of the string form of all possible numeric literals". Sounds like a nontrivial increase in scope though...

@weswigham
Copy link
Member Author

Again - we don't have to define these holes such that they have "meaningful" results for all type inputs - we can issue errors if the holes have non-string types in them in checked contexts, and if they get a non-string type via instantiation, do something like what we do for T[K] where K is non-property-key and return never as the result (or the error any type, depending on context). That, ofc, also leaves that open to being broadened to @ahejlsberg 's hypothetical behavior in the future, if we see fit to do so.

@DanielRosenwasser
Copy link
Member

DanielRosenwasser commented Sep 18, 2020

I tend to err on the side of picking the conservative never behavior, but I don't really know how non-observable the behavior is (i.e. how breaky is it if you change it later?).

@weswigham did you mean for all primitives? Or everything apart from strings?

@weswigham
Copy link
Member Author

Everything apart from strings.

@ahejlsberg
Copy link
Member

we can issue errors if the holes have non-string types in them in checked contexts

In the original PR there's logic to error when the type of a placeholder is exactly string, number, or bigint, so that's there already.

and if they get a non-string type via instantiation, do something like what we do for T[K] where K is non-property-key and return never as the result

So `${number}` would resolve to never or would stick around but not be compatible with anything? Either way seems even stranger than the string resolution we have now.

I'll say it again, I think the right solution here is to have ${number} and ${bigint} placeholders behave just like ${string} for purposes of matching, but then additionally require that the matched substring successfully parses as a number or bigint.

@weswigham
Copy link
Member Author

@ahejlsberg per our discussion in today's design meeting, this now allows number and bigint to persist in template holes as well, and they have appropriate "is parseable as" checks in relationship checking. In addition, I broadened the template constraint to allow null and undefined in the holes as well, since those trivially stringify to "null" and "undefined" just as true and false stringify. In addition, I handle any in a template - it behaves essentially the same as string, but allows for more flexible assignability in the new template -> template assignability relation which also now exists. The relation is what you may expect - the text must exactly match (which is potentially stricter than is required, but is a fine place to start) and all the corresponding holes must be related - except all hole types are assignable to string (and any) type'd holes (since they accept any characters). Because of this, the position is marked unreliable for variance calculations.

@weswigham
Copy link
Member Author

So to anyone watching: Reviews would be welcome now~

@tpict
Copy link

tpict commented Sep 18, 2020

Is it possible to get a playground of this PR to compare it against #40538?

@weswigham
Copy link
Member Author

@typescript-bot pack this

@typescript-bot
Copy link
Collaborator

typescript-bot commented Sep 18, 2020

Heya @weswigham, I've started to run the tarball bundle task on this PR at 916aec6. You can monitor the build here.

@typescript-bot
Copy link
Collaborator

typescript-bot commented Sep 18, 2020

Hey @weswigham, I've packed this into an installable tgz. You can install it for testing by referencing it in your package.json like so:

{
    "devDependencies": {
        "typescript": "https://typescript.visualstudio.com/cf7ac146-d525-443c-b23c-0d58337efebc/_apis/build/builds/85793/artifacts?artifactName=tgz&fileId=7D8AAC1002D527E1F7882C8F4446E1EDCC8042C72C8F7CE66574F92142CC7D8402&fileName=/typescript-4.1.0-insiders.20200918.tgz"
    }
}

and then running npm install.


There is also a playground for this build.

@tpict
Copy link

tpict commented Sep 19, 2020

Thank you!

What is the intended behavior when named pattern literal types are themselves used in a template literal type? For example:

type A = `${number}`;
type B = `${A} ${A}`;    // reduces to string
const example: B = "anything";

I'm not sure if it's viable to "unwrap" the types inside a template literal to preserve strictness, but as-is this looks like a variation of #40538.

@weswigham
Copy link
Member Author

Templates placed into templates should essentially concatenate - good catch; I'm surprised we didn't already have that behavior.

@ahejlsberg
Copy link
Member

ahejlsberg commented Sep 20, 2020

Templates placed into templates should essentially concatenate - good catch; I'm surprised we didn't already have that behavior.

We did have that behavior early on in the original PR, but I took it out when I added casing modifiers because it gets somewhat more complicated. However, the casing modifiers are going away in #40580, so I'll add the normalization logic back in there.

EDIT: Normalization now back in #40580.

@weswigham
Copy link
Member Author

@ahejlsberg now that #40580 is merged, I've resolved the conflicts, sync'd this, and added an explicit test of concatenating a pattern with another pattern.

Copy link
Member

@ahejlsberg ahejlsberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a few minor changes.

@weswigham weswigham merged commit a960463 into microsoft:master Sep 23, 2020
@weswigham weswigham deleted the template-patterns branch September 23, 2020 08:09
@tpict
Copy link

tpict commented Sep 23, 2020

Thanks for this, big improvement 🙂

One question/comment about inserting string/number/bigint into a template literal type as opposed to a union of literals. As an outsider, it feels like there's a little bit of a disconnect between the two. When assigning to one of these types, my mental models are:

  1. for "pattern literal types", the typechecker will scan through the type until it finds a mismatch with the value or it reaches the end
  2. for "template literal types", the cross product of each union is taken, and the typechecker attempts to assign the value to each product until it succeeds or it reaches the end

My question is: in the second case, why don't we preserve the original type declaration instead of taking the product, and evaluate it when attempting assignment? I'm sure there's a technical reason for this, and it should be publicly documented–to take one of @ahejlsberg's examples, it feels really weird that you can't express a zip code, which on the surface appears to only require up to 50 comparisons to check valid assignments:

type Digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9;
type Zip = `${Digit}${Digit}${Digit}${Digit}${Digit}`;  // Error

and yet something like

type Example = `hello ${string}`;

has infinitely more permutations, but works fine.

@ahejlsberg
Copy link
Member

why don't we preserve the original type declaration instead of taking the product, and evaluate it when attempting assignment?

We favor the normalized string literal union representation because it enables more scenarios. For example, string literal union types can be unioned and intersected with other literal types, can narrow in control flow analysis, can be transformed using distributive conditional types, etc.

You're right that in certain cases that lead to very large cross products we might do better by preserving the un-normalized representation and match on that instead. It's an optimization we could consider, though I don't know how practical it is or how useful it would really be.

@tpict
Copy link

tpict commented Sep 23, 2020

That makes total sense!

I would appreciate the optimisation being considered–I imagine there's a negative correlation between cases where "the normalised representation of this type is oversized" and cases where "I want to put this type in a switch statement or map over it". Template literal types are a big win for TS users even when they're used for string validation alone. It's a little sad to not have that feature in scenarios that would be incompatible with features (mapping, control flow etc) that wouldn't make a whole lot of sense to use with a union of that complexity anyway.

I guess there's a question of "how does one communicate the difference between normalised/unnormalised template types to TypeScript users?", and the solution might lie somewhere in the delineation of applications (validation vs transforming & control flow).

@jcalz
Copy link
Contributor

jcalz commented Oct 20, 2020

So it looks like this person wants a type like { [K in `x${number}`]?: number } but that isn't currently useful and evaluates to {}. It makes sense that the compiler wouldn't magically support this use case, but is there some canonical issue about asking for constraint types in a property key position? I assume regex-validated types would suffer from the same problem.

@jcalz
Copy link
Contributor

jcalz commented Jan 17, 2021

heh, this question wonders why pattern literals can't usually be matched via infer:

type SplitComma<T> =
    T extends `${infer L},${infer R}` ? [L, R] : never;

type Okay = SplitComma<`left,right`> // ["left", "right"]
type NotOkay = SplitComma<`left,${string}`> // never!

or even

type Simpler<T> = T extends `${infer U}` ? U : never;
type Okay1 = Simpler<"abc"> // abc
type Okay2 = Simpler<`${string}`> // string
type NotOkay1 = Simpler<string> // never!
type NotOkay2 = Simpler<`a${string}`> // never!

Playground link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Author: Team For Uncommitted Bug PR for untriaged, rejected, closed or missing bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants