Support user-defined type guard functions #1007

Closed
RyanCavanaugh opened this issue Oct 30, 2014 · 85 comments
Labels
Fixed A PR has been merged for this issue Suggestion An idea for TypeScript

Comments

@RyanCavanaugh
Member

We currently only support some baked-in forms of type checking for type guards -- typeof and instanceof. In many cases, users will have their own functions that can provide run-time type information.

Proposal: a new syntax, valid only in return type annotations, of the form x is T where x is a declared parameter in the signature, or this, and T is any type. This type is actually considered as boolean, but "lights up" type guards. Examples:

function isCat(a: Animal): a is Cat {
  return a.name === 'kitty';
}

var x: Animal;
if(isCat(x)) {
  x.meow(); // OK, x is Cat in this block
}

class Node {
  isLeafNode(): this is LeafNode { throw new Error('abstract'); }
}
class ParentNode extends Node {
  isLeafNode(): this is LeafNode { return false; }
}
class LeafNode extends Node {
  isLeafNode(): this is LeafNode { return true; }
}
var someNode: LeafNode|ParentNode;
if(someNode.isLeafNode()) {
  // someNode: LeafNode in this block
}

The forms if(userCheck([other args,] expr [, other args])) { and if(expr.userCheck([any args])) would apply the type guard to expr the same way that expr instanceof t and typeof expr === 'literal' do today.

@RyanCavanaugh RyanCavanaugh added Suggestion An idea for TypeScript In Discussion Not yet reached consensus labels Oct 30, 2014
@Nevor

Nevor commented Oct 31, 2014

The second example does not seem to scale very well:

class Node {
  isLeafNode(): this is LeafNode { throw new Error('abstract'); }
  isOtherNode(): this is OtherNode { throw new Error('abstract'); }
}
class ParentNode extends Node {
  isLeafNode(): this is LeafNode { return false; } 
  isOtherNode(): this is OtherNode { return false; } 
}
class OtherNode extends Node {
  isLeafNode(): this is LeafNode { return false; }
  isOtherNode(): this is OtherNode { return true; } 
}
class LeafNode extends Node {
  isLeafNode(): this is LeafNode { return true; }
  isOtherNode(): this is OtherNode { return false; } 
}
var someNode: LeafNode|ParentNode|OtherNode;
if(someNode.isLeafNode()) {
  // someNode: LeafNode in this block
}

Furthermore, we are back to typing that depends on class construction, while one would expect this to work with interfaces alone for items as simple as these.

Finally, given that we are stuck with classes, we can currently encode this idiom more lightly with instanceof:

class SuperNode { }

class ParentNode extends SuperNode { 
  private constrain;  // using dummy private to disjoint classes
}
class OtherNode extends SuperNode { 
  private constrain; 
}
class LeafNode extends SuperNode { 
  private constrain; 
  leafNode : string;
}


var someNode : ParentNode | OtherNode | LeafNode;

if(someNode instanceof LeafNode) {
  someNode.leafNode;
}

The first example seems to be an excellent case for #1003

@RyanCavanaugh
Member Author

Perhaps the second example was not clear in its intent. Consider something like this, where instanceof would not work:

interface Sortable {
  sort(): void;
}

class BaseCollection {
  isSortable(): this is Sortable { return false; }
}

class List extends BaseCollection implements Sortable {
  isSortable(): this is Sortable { return true; }
  sort() { ... }
}

class HashSet extends BaseCollection {
  isSortable(): this is Sortable { return false; }
}

class LinkedList extends BaseCollection implements Sortable {
  isSortable(): this is Sortable { return true; }
  sort() { ... }
}

@Nevor

Nevor commented Oct 31, 2014

Indeed, its intent is clearer now. So I have two comments:

  1. With the current compiler, we can leverage the fact that everything is either "undefined" or something to create an optional conversion:
interface Sortable  {
  sort(): void;
}

class BaseCollection {
  asSortable() : Sortable { return undefined }
}

class List extends BaseCollection implements Sortable {
  asSortable() { return this }
  sort(){}
}

class HashSet extends BaseCollection {
}

class LinkedList extends BaseCollection implements Sortable {
  asSortable() { return this; }
  sort() { }
}

function someFun(collection : BaseCollection) {
  var asSortable = collection.asSortable();
  if(asSortable) {
    asSortable.sort();
  }
}

But I agree that it would be strange to do something through asSortable and then go back to collection to call other methods.

  2. I assume that the compiler verifies that the result of "this is Sortable" is in accordance with what the class actually implements (the verification might be done structurally). In that case, what about generating the "return true"?
class List extends BaseCollection implements Sortable {
  isSortable() : this is Sortable; // would return true
  sort() { ... }
}

class OtherList extends BaseCollection {
  isSortable() : this is Sortable; // would return true
  sort() { ... }
}

class OtherList2 extends BaseCollection {
  isSortable() : this is Sortable; // would return false
}

class OtherList3  {
  isSortable() : this is Sortable; // would return true
  sort() { ... }
}

class OtherList4 {
  isSortable() : this is Sortable; // would return false
}

@bgever

bgever commented Nov 20, 2014

This is a cool idea. Have you considered combining it with generics? Then you could do something like this for array checks:

function isNumber(nbr: any): nbr is number {
    return typeof nbr === 'number';
}

function isString(str: any): str is string {
    return typeof str === 'string';
}

function isArrayOf<T>(of: (item: any) => item is T, a: any[]): a is T[] {
  // Ensure that there's at least one item of T in the array, and all others are of T too.
  return a.length > 0 && a.every(of);
}

function isArrayOrEmptyOf<T>(of: (item: any) => item is T, a: any[]): a is T[] {
  // Accepts an empty array, or where all items are of T.
  return a.every(of);
}

// The name `check` is a placeholder; the original function was anonymous.
function check(input: string|string[]|number[]) {
    if (isString(input)) {
        // It's a string.
    }
    else if (isArrayOrEmptyOf(isString, input)) {
        // It's an array of string, or
        // an empty array (could be either of string or number!)
    }
    else if (isArrayOf(isNumber, input)) {
        // It's an array of number.
    }
}

@ahejlsberg
Member

@bgever We did indeed consider that. It breaks down to (a) allowing x is T where T is a type parameter, and (b) extending type inference such that a type argument can be inferred from the target type of a user defined type guard function. All up I think this could be really useful.
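
A minimal sketch of points (a) and (b), written against the form of the feature that later shipped rather than anything proposed in this comment. The predicate target is a type parameter T, and T is inferred from the guard that is passed in. The helper name narrowAll is illustrative and does not come from the thread.

```typescript
function isNumber(item: unknown): item is number {
  return typeof item === "number";
}

function isString(item: unknown): item is string {
  return typeof item === "string";
}

// (a) The predicate target `T` is a type parameter.
// (b) `T` is inferred from the predicate type of the guard argument.
function narrowAll<T>(items: unknown[], guard: (item: unknown) => item is T): T[] {
  const out: T[] = [];
  for (const item of items) {
    if (guard(item)) {
      out.push(item); // item is narrowed to T inside this block
    }
  }
  return out;
}

const mixed: unknown[] = [1, "a", 2, "b", null];
const numbers = narrowAll(mixed, isNumber); // inferred as number[]
const strings = narrowAll(mixed, isString); // inferred as string[]
```

No explicit type argument is needed at the call sites; the guard alone determines the element type of the result.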

@Arnavion
Contributor

One example usage for the standard library - Array.isArray can be changed to:

interface ArrayConstructor {
    isArray<T>(arg: any): arg is T[];
}

@awerlang

@Arnavion I believe that for Array.isArray it would be more correct to express it in a non-generic way:

interface ArrayConstructor {
    isArray(arg: any): arg is any[];
}

This would support union types like MyCustomObject | any[]
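
A small sketch of how the non-generic signature behaves on such a union. The local isArray wrapper and the summarize function are illustrative stand-ins, not part of the standard library declarations discussed here.

```typescript
interface MyCustomObject {
  id: number;
}

// Local stand-in with the non-generic signature suggested above.
function isArray(arg: unknown): arg is any[] {
  return Array.isArray(arg);
}

function summarize(value: MyCustomObject | any[]): string {
  if (isArray(value)) {
    // value: any[] here; nothing is asserted about the element type.
    return `array of ${value.length} items`;
  }
  // value: MyCustomObject here, because the array member was ruled out.
  return `object with id ${value.id}`;
}
```

The guard removes the array member from the union in the else branch without claiming anything about what the array contains.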

And how about:

interface Object {
    hasOwnProperty<T | any>(v: string): v is T;
}

Object.hasOwnProperty<MyCustomObject>.call(obj, 'id');

Or it could narrow to possible matching types obj could be:

var obj: { id: number; description: string; } | any;
if ('id' in obj) {
  // obj.id is valid here  
}

@Arnavion
Contributor

@awerlang

Actually, I suppose isArray<T>(): T[] would give the false assurance that the array only has elements of type T. It might be better to require the user to give an explicit assertion <T[]> so that they know they have to perform some validation after the call to isArray to be sure they actually have a T[].

Your last example results in obj being of type any because of union type behavior, so obj.id is also allowed already. If you meant you wanted obj to be considered of the first type in the union (and thus obj.id would have type number) inside the if block, then that doesn't seem right. obj could be { id: "foo" } which would not be assignable to the first type.

For your second example, I assume you meant hasOwnProperty<T>(obj: T | any, v: string): obj is T;. This has the same problem - the presence of an 'id' member doesn't really tell you that obj is of MyCustomObject type and not some other type that also has an id property. So it should be something like hasOwnProperty(obj: any, v: string): obj is { [v]: any };.

Even if the obj was of type { id: number, description: string } | { foo: string } (i.e., the union doesn't contain any), the second and third example should still result in obj being of type { id: any } inside the if-block, because obj could actually be { foo: "5", id: "bar" } which isn't assignable to the first type but is assignable to the second.
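
The point can be sketched with a guard that narrows only to "has a property with this name" and says nothing about the property's type. The helper hasOwn and its use of Record are assumptions made for illustration (the `{ [v]: any }` notation above is not valid syntax), and the sketch relies on language features added after this comment was written.

```typescript
// Hypothetical helper: proves only that `key` exists on `obj`.
function hasOwn<K extends string>(obj: unknown, key: K): obj is Record<K, unknown> {
  return typeof obj === "object" && obj !== null &&
    Object.prototype.hasOwnProperty.call(obj, key);
}

function readId(obj: unknown): string {
  if (hasOwn(obj, "id")) {
    // obj.id has type unknown: the guard did not tell us it is a number.
    const id = obj.id;
    return typeof id === "number" ? `numeric id ${id}` : "non-numeric id";
  }
  return "no id";
}
```

An object such as { foo: "5", id: "bar" } passes the guard, and the caller still has to check the type of id before using it.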

@awerlang

@Arnavion

I think having isArray(arg): arg is any[] is convenient, even though we don't assert element type.

About inferring an object type by duck-typing, I don't consider it a problem in itself, although I see the problems it may lead to. This is a concern in #1427. How would you handle this?

@Arnavion
Contributor

Sorry, perhaps I wasn't clear. I was agreeing with you that isArray should return arg is any[].

And yes, #1427 is exactly the same as what I said for your other two examples.

@mintern

mintern commented Feb 17, 2015

Here's another use case based on a filesystem Entry as defined in the not-yet-standard File System API:

interface Entry {
    isFile: boolean;
    isDirectory: boolean;
    name: string;
    ...
}

interface DirectoryEntry extends Entry { ... }
interface FileEntry extends Entry { ... }

When (<Entry>entry).isFile, entry is a FileEntry, and when entry.isDirectory, entry is a DirectoryEntry.

Note that the *Entry classes are not easily accessible, so instanceof can't be used.
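
A sketch of how user-defined guards could cover this case. The Fs-prefixed interfaces mirror the Entry shapes above, and the size and children members are hypothetical placeholders for the elided parts.

```typescript
interface FsEntry {
  isFile: boolean;
  isDirectory: boolean;
  name: string;
}

// `size` and `children` are hypothetical members, added only so the
// narrowed types have something distinctive to use.
interface FsFileEntry extends FsEntry {
  size: number;
}
interface FsDirectoryEntry extends FsEntry {
  children: FsEntry[];
}

function isFileEntry(e: FsEntry): e is FsFileEntry {
  return e.isFile;
}

function isDirectoryEntry(e: FsEntry): e is FsDirectoryEntry {
  return e.isDirectory;
}

function label(e: FsEntry): string {
  if (isFileEntry(e)) {
    return `${e.name} (${e.size} bytes)`; // e: FsFileEntry
  }
  if (isDirectoryEntry(e)) {
    return `${e.name}/ (${e.children.length} children)`; // e: FsDirectoryEntry
  }
  return e.name;
}
```

The guards rely on the boolean flags alone, so no constructor has to be reachable for instanceof.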

@RyanCavanaugh RyanCavanaugh added Help Wanted You can do this and removed In Discussion Not yet reached consensus labels Apr 27, 2015
@RyanCavanaugh RyanCavanaugh added this to the Community milestone Apr 27, 2015
@RyanCavanaugh
Member Author

Approved; assignment to Anders is tentative (someone else can give this a shot if they're feeling up to the challenge)

@tinganho
Contributor

@RyanCavanaugh I decided to take a shot at this. Just one question. Can the type guard function take multiple arguments?

function isCat(a: Animal, b: number): a is Cat {
  return a.name === 'kitty';
}

and if so does the argument index need to match the parameter index?

if(isCat(b, a)) {
  a.meow(); // error
}

No error if matched index:

if(isCat(a, b)) {
  a.meow();
}

I'm not sure of its usefulness though.

@RyanCavanaugh
Member Author

Multiple arguments should be allowed.

I don't understand the example, though. What are the types of the unbound variables a and b in the bottom two code blocks?

@tinganho
Contributor

a is of type Animal and b is just a number. Let's assume that isCat takes any as an argument too.
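
For reference, a sketch of a multi-argument guard as the feature eventually behaved: the predicate names one parameter, and the argument passed in that position is the one that gets narrowed. The minNameLength parameter and the speak function are illustrative assumptions.

```typescript
interface Animal {
  name: string;
}
interface Cat extends Animal {
  meow(): string;
}

// The predicate refers to parameter `a`; the second parameter only
// influences the runtime check.
function isCat(a: Animal, minNameLength: number): a is Cat {
  return a.name.length >= minNameLength && "meow" in a;
}

function speak(pet: Animal, limit: number): string {
  if (isCat(pet, limit)) {
    return pet.meow(); // pet: Cat, because it was passed in position `a`
  }
  return `${pet.name} stays quiet`;
}
```

Only the expression passed for the named parameter is narrowed; the other arguments keep their declared types.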

@spion

spion commented Sep 12, 2016

@shelby3

First, you can't populate the prototype chain with interfaces. This will not work because instanceof does not work reliably cross-realm. For example a instanceof Array can return false if the array a came from e.g. another iframe (more generally, another realm). See this discussion

Secondly, npm (the most popular package manager for JavaScript at the moment) compounds the above problem by installing multiple semver-incompatible (and up until npm 2.0, also semver-compatible) versions of the same library into different directories. This in turn means that a class defined in such a module may actually have more than one value; and again instanceof won't work reliably, similarly to the way it doesn't work reliably cross-realm.
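
The duplicate-install problem can be imitated without npm. In this sketch a factory function stands in for a module that gets evaluated twice; the names loadPointModule, PointV1, PointV2 and isPointLike are all hypothetical.

```typescript
// Each call evaluates the class expression again and yields a distinct
// constructor, much like two installed copies of the same library.
function loadPointModule() {
  return class Point {
    constructor(public x: number, public y: number) {}
  };
}

const PointV1 = loadPointModule();
const PointV2 = loadPointModule();

const p = new PointV1(1, 2);

// Nominal check: tied to the identity of one particular constructor.
const nominalSame = p instanceof PointV1;
const nominalOther = p instanceof PointV2;

// Structural check: tied only to the shape of the value.
interface PointLike {
  x: number;
  y: number;
}
function isPointLike(v: unknown): v is PointLike {
  return typeof v === "object" && v !== null &&
    typeof (v as PointLike).x === "number" &&
    typeof (v as PointLike).y === "number";
}
const structural = isPointLike(p);
```

Here nominalSame is true and nominalOther is false even though both classes have identical source, while the structural guard accepts the value either way.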

Finally, this is simply not how most JavaScript is written. In fact it cannot be how most JS is written, as this would be a TypeScript-only feature. And here we come to a clash with TypeScript's design goals: to be a type system which helps with existing JavaScript code. Just look at how Promises/A+ thenables are specified. It's all about a method then present on the thenable object. Not about some non-existing constant "Thenable" that should be in the prototype chain. Admittedly, some of this is a product of the other two instanceof problems above. The rest of the reasons are complex, but mainly it's a combination of "no single module system" (this constant would need to be defined in some JS module and exported from it), desire to keep JS code small and therefore devoid of dependencies, desire for interoperability etc. Nevertheless, these reasons confine most JS code to structural checking (and since TypeScript aims to model JS code, it's therefore confined to structural types).
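
As an illustration of the structural style described here, a thenable check can be written as a type guard. This is a sketch, not code from any promise library; the name isThenable is an assumption, and PromiseLike is the standard library's structural promise type.

```typescript
// Structural check in the spirit of Promises/A+: a thenable is an
// object or function that has a callable `then` property.
function isThenable(value: unknown): value is PromiseLike<unknown> {
  if (value === null || (typeof value !== "object" && typeof value !== "function")) {
    return false;
  }
  return typeof (value as { then?: unknown }).then === "function";
}
```

The guard accepts a native Promise and any third-party object with a then method alike, with no shared constructor in the prototype chain.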

As to why type guards are okay, I'll just quote myself without the "you are wrong" part:

This sort of feature is precisely what makes TypeScript different from other typed languages. While they decide NOT to trust the user unless their input fits their narrow preconceptions of what is correctly modelled code, TypeScript takes a different approach. TypeScript mostly trusts the user. The user, in turn, needs to take special care in ensuring some of their code is correct, like type guards. But not all, as it would be with dynamic languages.

This is the biggest win of TypeScript: it lets you pick between "great power, great responsibility" (carefully hand-check the code, ensure that your type assertions are correct) and "lesser power, lesser responsibility" (once you hand-"prove" type assertions are correct, you don't have to do it for the rest of the code).

It's not a type system that ensures correctness. It's a type system that works with the user to ensure it. It's a "bring your own lemma" type system.

So this feature is sound in this sense: "The compiler cannot automatically check this, but if you supply your own unchecked proof that the type is indeed correct, it will accept that". This is still useful, as the code that needs to be carefully checked by a human is confined to a type guard.

@shelby3

shelby3 commented Sep 12, 2016

@aluanhaddad wrote:

However, it's perfectly valid to program with functions and object literals and to never use classes or even manually wired inheritance via prototypes.

Agreed. That is why I mentioned the purely structural option as an alternative to the feature of this issue which was adopted.

It is difficult to have an open discussion and ignore discussion. Therefore...

First, I am unblocking you (perhaps contrary to my better judgement) because it seems I can trust you to talk technicals (and you may have valuable discussion to share) and not to involve me in discrimination claims. If that changes, I may regrettably be forced to backtrack. I am not stating this as if I desire any authority or control over you (nor to insinuate any judgement of blame or correctness), rather this is just a statement of my personal policy w.r.t. you. Please avoid making any insinuations that would cause me to consider a legal protection stance to be more important than open discussion. For me, open discussion is paramount, but I do have to be pragmatic in this era where we all commit 3 felonies per day just by breathing.

Simply match on structure at the guard instead (and use the result of the match to narrow the type). So you'd need some new compiler provided function that performs this structural match similar to my second paragraph in this comment. And at compile-time, this function need not be called because the compiler can infer the type. At run-time, the function would actually need to be called to do a structural match to check for the interface type.

What specifically do you have in mind? Type guards are most often used when type information has been lost because it comes from untyped APIs and unknown sources. If the compiler could always determine the actual type, we wouldn't be having this discussion but that is simply not possible.

if (someNode.isA(Sortable)) {
    someNode.sort()
}

Note the compiler has type checked that at compile-time in the above case.

The compiler would emit:

if (someNode.isA({ sort:function() {} })) {
    someNode.sort()
}

So the isA function would check that the properties of the interface match structurally, up to the capabilities of what the runtime can check structurally (no instanceof nominal checks in this strategy), e.g. the existence of the sort property, that it has a typeof x == 'function', and the number of parameters of the function.
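
A rough sketch of what such a runtime matcher might look like, written as a free function with a hand-written shape object because no compiler emits one. Everything here (isA as a standalone function, sortableShape, trySort, the example classes) is hypothetical and only approximates the behaviour described above.

```typescript
// Checks that every property of `shape` exists on `value` with the same
// typeof, and that functions declare the same number of parameters.
function isA<T extends Record<string, unknown>>(value: unknown, shape: T): value is T {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  for (const key of Object.keys(shape)) {
    const expected: unknown = shape[key];
    const actual: unknown = candidate[key];
    if (typeof actual !== typeof expected) {
      return false;
    }
    if (typeof expected === "function" && typeof actual === "function" &&
        actual.length !== expected.length) {
      return false;
    }
  }
  return true;
}

// Hand-written stand-in for the shape the compiler would emit for Sortable.
const sortableShape = { sort(): void {} };

class List {
  items = [3, 1, 2];
  sort(): void {
    this.items.sort((a, b) => a - b);
  }
}

class HashSet {
  add(_x: number): void {}
}

function trySort(collection: unknown): boolean {
  if (isA(collection, sortableShape)) {
    collection.sort();
    return true;
  }
  return false;
}
```

Parameter and return types are erased at runtime, so the check cannot go beyond property presence, typeof, and declared parameter count.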

That seems to be much more sane than the feature that was adopted, because at least it enforces structural type checking at compile-time (rather than depending on human error) and even marginal structural type checking at runtime.

Note if there is no else case on the guard, I presume by default it should throw an exception at runtime if the if condition is false.

Perhaps a compiler option would be to omit the runtime checks, then the programmer is confident their runtime environment is cordoned soundly.

@spion wrote:

So this feature is sound in this sense: "The compiler cannot automatically check this, but if you supply your own unchecked proof that the type is indeed correct, it will accept that". This is still useful, as the code that needs to be carefully checked by a human is confined to a type guard.

Maybe useful to some but terribly unsound because it breaks the internal consistency of the TypeScript type system (not just bypassing it to enable runtime unsoundness), and I believe I have shown above that there is another way that wouldn't break the internal consistency of the TypeScript type system.

@shelby3

shelby3 commented Sep 12, 2016

The following concerns the nominal typing idea I promulgated in this thread, which is orthogonal to the prior comment of mine explaining a purely structural idea.

@spion wrote:

First, you can't populate the prototype chain with interfaces. This will not work because instanceof does not work reliably cross-realm.

Structural typing can fail also due to false positive matches (both at compile-time and runtime). Nominal typing can fail dynamically at runtime (due to changes to the prototype chain or as you explained below), but not at compile-time.

Choose your poison.

For example a instanceof Array can return false if the array a came from e.g. another iframe (more generally, another realm). See this discussion

Yeah I was aware of that from this, and thanks for citing that source which explains it more completely.

Secondly, npm (the most popular package manager for JavaScript at the moment) compounds the above problem by installing multiple semver-incompatible (and up until npm 2.0, also semver-compatible) versions of the same library into different directories. This in turn means that a class defined in such a module may actually have more than one value; and again instanceof won't work reliably, similarly to the way it doesn't work reliably cross-realm.

I'd need a more thorough explanation to understand how npm managed to break instanceof, but what an individual framework does to make itself incompatible with one of JavaScript's capabilities, should not preclude us from supporting and not ignoring that capability.

As in all things with JavaScript, the programmer has to be aware and be careful, because JavaScript is a dynamic, highly open ecosystem. Programmers will pressure frameworks in a free market and the free market will work it out. It is not our authority to decide for the free market.

I do not consider these potential pitfalls with nominal typing to be a rational justification to completely avoid nominal typing with JavaScript. As I wrote above, structural typing also has pitfalls. Programmers should have both nominal and structural typing in their toolchest.

It is bogus for anyone to claim that JavaScript is only set up for structural typing. JavaScript has prototype inheritance (Douglas Crockford), which can support nominal typing. The fact that it's globally retroactive and mutable, is one of its features.

Finally, this is simply not how most JavaScript is written.

By 'this', I assume you are referring to employing JavaScript's prototype chain for nominal typing in general, and specifically for instanceof guards.

I doubt very much that instanceof or constructor.name are never used for nominal runtime typing guards in the entire JavaScript universe.

We don't write general purpose programming languages (i.e. TypeScript) to cater only to 90% of the programmers. A general purpose programming language that is supposed to be compatible with the JavaScript ecosystem should offer the entire capability of the language.

You don't get to decide for the universe. This is an ecosystem and free market.

In fact it cannot be how most JS is written, as this would be a TypeScript-only feature.

How is supporting a JavaScript feature a TypeScript-only feature? Offering ways to type the prototype chain is providing a way to interoperate (to some degree more than now) with the use of that prototype chain in non-TypeScript software.

You seem to often make declarations of fact which are factually devoid of complete evidence, e.g. "you are wrong", "you are inconsistent", and "in fact it cannot be". Could you please try to be a bit more open-minded and focus on fully proving your arguments (and allowing the possibility that through discussion you might realize otherwise) before declaring them as fact.

And here we come to a clash with TypeScript's design goals: to be a type system which helps with existing JavaScript code.

What is the proven clash? And I don't think the goal is "existing JavaScript code" but rather "the existing ECMAScript standard".

Just look at how Promises/A+ thenables are specified. It's all about a method then present on the thenable object. Not about some non-existing constant "Thenable" that should be in the prototype chain. Admittedly, some of this is a product of the other two instanceof problems above.

In my code, I detect instanceof Promise because I am using ES6 generators to simulate ES7 async / await. You even wrote a recent blog which sort of explains why I prefer to use generators (great minds think alike eh :)

You somehow think you know what every existing JavaScript code in the universe is doing. How did you achieve such omniscience given that the speed-of-light is finite?

The rest of the reasons are complex, but mainly it's a combination of "no single module system" (this constant would need to be defined in some JS module and exported from it), desire to keep JS code small and therefore devoid of dependencies, desire for interoperability etc. Nevertheless, these reasons confine most JS code to structural checking (and since TypeScript aims to model JS code, it's therefore confined to structural types).

Prototype inheritance is inherently locally coherent and modular because it is object-based, so there doesn't need to be any global coherence. Could you please explain more clearly what problem you envision?

@spion

spion commented Sep 12, 2016

I do not consider these potential pitfalls with nominal typing to be a rational justification to completely avoid nominal typing with JavaScript. As I wrote above, structural typing also has pitfalls. Programmers should have both nominal and structural typing in their toolchest.

You can consider it whatever you want, the fact is that the kind of JS that developers normally write is based on basic structural checking, and as such it's TypeScript's primary job to support that.

Additionally, nominal type systems suffer from the "interface afterthought" problem. Example:

  1. First promise library P1 appears, written by author A1
  2. A1 never gave second thought to the "Thenable" idea, so they never wrote a separate "Thenable" (interface) module.
  3. Another author A2 writes a better promise library P2. Due to the nature of nominal types, it's incompatible with P1.
  4. Because of incompatibility with P1, P2 cannot achieve enough traction and adoption without explicitly depending on P1 or inheriting from P1, as a lot of things use P1, and P1 uses nominal checking to ensure correct types.
  5. Frustrated by this, A2 writes Thenable module, and urges A1 to adopt it.
  6. A1 sees no value in this, due to the fact that library P2 doesn't have enough traction to justify a change in P1, and due to the fact that from their point of view P1 is obviously better.
  7. The development of promises has stalled.

Regarding your structural check idea, can you please tell me what the cost will be to check the following?

interface Node<T> {
  __special_tag_to_check_if_value_is_node: string;
  data: T;
  children: Array<Node<T>>;
}

Because with type guards, I can make it be O(1) and be reasonably sure it's correct unless someone is trying to deliberately subvert it.
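
A sketch of the constant-time guard alluded to here. It reads a single tag property and trusts the rest of the structure, whereas a full structural check would have to visit every node in the tree. The interface is renamed TreeNode to avoid clashing with the DOM's Node type, and NODE_TAG, isTreeNode and makeNode are hypothetical names.

```typescript
interface TreeNode<T> {
  __special_tag_to_check_if_value_is_node: string;
  data: T;
  children: Array<TreeNode<T>>;
}

// Hypothetical tag value agreed on by whoever constructs nodes.
const NODE_TAG = "tree-node";

// O(1): one property read, independent of the size of the subtree.
function isTreeNode(value: unknown): value is TreeNode<unknown> {
  return typeof value === "object" && value !== null &&
    (value as { __special_tag_to_check_if_value_is_node?: unknown })
      .__special_tag_to_check_if_value_is_node === NODE_TAG;
}

function makeNode<T>(data: T, children: Array<TreeNode<T>> = []): TreeNode<T> {
  return { __special_tag_to_check_if_value_is_node: NODE_TAG, data, children };
}
```

The guard is unchecked in the sense discussed in this thread: the compiler accepts the claim, and its correctness depends on nodes only ever being built through code that sets the tag honestly.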

@shelby3
Copy link

shelby3 commented Sep 12, 2016

@spion wrote:

You can consider it whatever you want, the fact is that the kind of JS that developers normally write is based on basic structural checking, and as such it's TypeScript's primary job to support that.

Please re-read my prior comment as I have rebutted this "normally" argument.

Additionally, nominal type systems suffer from the "interface afterthought" problem. Example:

You are conflating nominal typing with subclassing. That is why I am preparing to promulgate typeclass typing of the prototype chain. I am hopefully going to radically impact the JavaScript universe on a significant scale. TypeScript can come along for the ride or it can be obstinate. Either way, I am going to see this concept gets implemented (eventually), unless I discover it is flawed. I've been working on this concept for the past several years (on and off) and very intensely this past May. If I can get others interested now, that would be best.

@spion

spion commented Sep 12, 2016

I do not consider these potential pitfalls with nominal typing to be a rational justification to completely avoid nominal typing with JavaScript. As I wrote above, structural typing also has pitfalls. Programmers should have both nominal and structural typing in their toolchest.

The difference in pitfalls is fundamental. With instanceof checks, code that is supposed to work breaks. With structural checks, code that is not supposed to work breaks at run time, rather than compile time.

@shelby3

shelby3 commented Sep 12, 2016

@spion wrote:

The difference in pitfalls is fundamental. With instanceof checks, code that is supposed to work breaks. With structural checks, code that is not supposed to work breaks at run time, rather than compile time.

That is an interesting perspective, but it depends on who and what was "supposed to". If instanceof breaks, is it because the programmer was supposed to be aware of the couple of general ways it can fail and avoid them? So then was it supposed to work or not supposed to work?

Your logic presumes that structural code is supposed to fail because the programmer designed the code wrong, but wasn't he supposed to design it correctly? And structural code can fail at compile-time, if we count the inability to distinguish between nominal intent and structure as a failure of structural compile-time typing as compared to nominal.

I hope you are somewhat convinced that your choices were somewhat arbitrary.

Apparently one of the ways I piss people off without even trying to, is I think much more generally (or let's say I just keep thinking and don't assume I've ever finalized my understanding) and they just can't understand why I don't adhere to the very obvious viewpoint that they think is the only possible one. Unfortunately I am not smart enough to be able to both think generally and find a way to hide it and bring it out in a politically astute way making others think that I adhered to their view and then we together generalized it together (or some political methodology like that). I tend to be too matter-of-fact, especially when I am at the bandwidth limit of my capabilities.

@spion

spion commented Sep 12, 2016

Prototype inheritance is inherently locally coherent and modular because it is object-based, so there doesn't need to be any global coherence. Could you please explain more clearly what problem you envision?

Promises/A+ thenables are a good example. If you want to specify a Thenable nominal interface, there needs to be a single value that represents it in the prototype chain (in order for instanceof to work). To get this single value into all libraries that implement Thenable, it needs to be a module of a module system that guarantees a single instance will be delivered when requested via import. AFAIC This is not guaranteed by either ES6 modules or CommonJS modules, so at best you would need to ensure it in the module loader spec, and any environment that uses different loaders as well (nodejs?)

Btw, ES6 tried and failed to solve this problem (for users) with Symbols. The final solution ended up being a string-addressed global symbol registry.
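
The registry mentioned here is exposed through Symbol.for. A small sketch of the difference between a plain symbol and a registry symbol, and of two independent pieces of code agreeing on a brand through a string key alone. The key "example.thenable" and the helper names are hypothetical.

```typescript
// Hypothetical registry key that both sides agree on by convention.
const KEY = "example.thenable";

// Plain symbols are unique per call, like a class evaluated twice.
const local1: symbol = Symbol("thenable");
const local2: symbol = Symbol("thenable");
const localsMatch = local1 === local2;

// Symbol.for looks the key up in the global symbol registry, so the
// same string always yields the same symbol.
const sharedMatch = Symbol.for(KEY) === Symbol.for(KEY);

// "Library A" brands its objects without exporting anything.
function makeBranded(): object {
  return Object.defineProperty({}, Symbol.for(KEY), { value: true });
}

// "Library B" recognises the brand using only the agreed string key.
function isBranded(value: unknown): boolean {
  return typeof value === "object" && value !== null &&
    Reflect.get(value, Symbol.for(KEY)) === true;
}
```

The brand works without a shared module instance, but the uniqueness that symbols were meant to provide is traded for a string that any code can reproduce.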

@spion

spion commented Sep 12, 2016

@shelby3

That is an interesting perspective, but it depends on who and what was "supposed to". If instanceof breaks, is it because the programmer was supposed to be aware of the couple of general ways it can fail and avoid them? So then was it supposed to work or not supposed to work?

That would mean avoiding instanceof to check arguments passed externally, as they may come from another realm. Which means avoiding its use as a type guard in many cases where such arguments may come externally (e.g. a library accepting arguments provided by the consumer would not be able to use this)

edit: removed problematic section.

@shelby3

shelby3 commented Sep 12, 2016

@spion wrote:

That is an interesting perspective, but it depends on who and what was "supposed to". If instanceof breaks, is it because the programmer was supposed to be aware of the couple of general ways it can fail and avoid them? So then was it supposed to work or not supposed to work?

That would mean avoiding instanceof to check arguments passed externally, as they may come from another realm. Which means avoiding its use as a type guard.

Or avoiding the other realms. We shouldn't presume the only use of ECMAScript is in broken realms such as the browser and NPM. The (expanse of possibilities in the unpredictable future of the) universe is not so tiny.

I believe you are somewhat oversimplifying and pigeonholing your case studies (which is a problem if we extrapolate these as the only truth worth caring about).

The language can be orthogonal to realms. For example using ECMAScript to code mobile apps. (which is one of the reasons I am here)

I won't disagree that the browser and NPM are huge and important realms today, but even today they are not 100% of the JS ecosystem.

We also can't know if those existing huge realms won't fix themselves when under free market pressure to do so, or be superseded by larger new realms.

We shouldn't conclude the language features are eternally broken just because some huge legacy realms (which I believe are dying) broke those features.

@RyanCavanaugh
Member Author

Previous comments here have been running afoul of the Code of Conduct, but I appreciate everyone redirecting their attention to the technical discussion at hand. Let's keep it that way.

@shelby3

shelby3 commented Sep 12, 2016

@spion wrote:

Prototype inheritance is inherently locally coherent and modular because it is object-based, so there doesn't need to be any global coherence. Could you please explain more clearly what problem you envision?

Promises/A+ thenables are a good example. If you want to specify a Thenable nominal interface, there needs to be a single value that represents it in the prototype chain (in order for instanceof to work).

If we are referring to subclassing and not typeclasses, the Promise constructor function controls what will be put in the prototype chain. Even if you import multiple instances of Promise, each will use only one Thenable interface per prototype chain. But with redundant imports, these Thenable interfaces will not all be the same reference in memory, since we'd have multiple instances of the Promise constructor function. So I agree that non-redundant imports are necessary if we expect a unified instanceof for all instances, as long as instanceof is based on matching by reference in memory rather than on matching the name (and possibly the source code) of the constructor function.

To get this single value into all libraries that implement Thenable, it needs to be a module of a module system that guarantees a single instance will be delivered when requested via import.

Yes and I designed such an import system for my coding, but it doesn't solve the cross-realm issue. And this would require me to be sure all libraries I use which can pass my code a Promise also use a consistent importing system that enforces non-redundant imports.

A consistent, system-wide importing mechanism is important, I agree, but not if the other possibilities mentioned above and below can work sufficiently well.

AFAICT this is not guaranteed by either ES6 modules or CommonJS modules, so at best you would need to ensure it in the module loader spec, and in any environment that uses different loaders as well (nodejs?)

I understand there are broken legacy realms. C'est la vie. We move forward anyway.

Btw, ES6 tried and failed to solve this problem (for users) with Symbols. The final solution ended up being a string-addressed global symbol registry.

Instead of string keys, they could have used 160-bit cryptographic hashes (or any approximation to a random oracle) to be probabilistically sure of no collisions.

Perhaps they needed me around to suggest that? I find it difficult to imagine that no one else would have thought of using hashes to solve the problem of global collisions.

And this seems it would be a good way to make name space issues orthogonal to the module import system.

Did it fail because of name space collisions, lack of adoption, or what?

@spion

spion commented Sep 12, 2016

Or avoiding the other realms. We shouldn't presume the only use of ECMAScript is in broken realms such as the browser and NPM. The (expanse of possibilities in the unpredictable future of the) universe is not so tiny.

This is not what realms means. A realm can be thought of as a fresh global "scope". For example an iframe has a different "realm" from its parent window. That means it has its own unique global (window) object, as well as its own copies of the other globals, e.g. the Array constructor and Array.prototype. As a result, passing an array you got from an iframe to a function that checks instanceof Array means the array will fail the check.
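As a rough sketch of a realm-independent alternative: Array.isArray does not depend on which realm's Array constructor created the value, so a user-defined type guard can be built on it instead of on instanceof Array. The name isStringArray is made up for this illustration.

```typescript
// Hypothetical guard name, used only for this illustration.
// Array.isArray does not compare against one realm's Array constructor,
// unlike `value instanceof Array`.
function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === 'string');
}

function joinIfStrings(value: unknown): string {
  if (isStringArray(value)) {
    // value is narrowed to string[] in this block
    return value.join(',');
  }
  return '';
}
```

The guard only inspects the value itself, so it does not matter which global object the array was created under.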

This is not merely a theoretical concern.

In CommonJS, every module is wrapped by a header and footer that form a closure (shown here in the form Node.js uses):

(function (exports, require, module, __filename, __dirname) {
  // <module code here>
});

The module system then calls this function, at its discretion, with an empty exports object and an initialized module object.

It's the same problem with e.g. class expressions:

function makeClass() {
  return class C {
    // definition here
  }
}

let C1 = makeClass(), C2 = makeClass(), c1 = new C1(), c2 = new C2();

assert(c1 instanceof C2) // fails
assert(c2 instanceof C1) // fails

Did it fail because of name space collisions, lack of adoption, or what?

It failed because they came up with a neat little way to define unique constants that don't clash with anything existing and can be used as object keys, then wanted to expose this mechanism to users somehow, but failed to take cross-realm issues into account. The keys were now too unique: user code that executed in each realm generated its own; only language-defined ones were guaranteed to be the same cross-realm. So we're back to string keys, which is where we were in the first place before Symbols entered the scene.
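A small sketch of the difference described above, assuming an ES2015+ runtime: Symbol() creates a fresh value on every call, while Symbol.for() looks its string key up in the global registry. The key 'thenable' is just an example string.

```typescript
// Annotated as `symbol` so the comparison type-checks; the key 'thenable'
// is an arbitrary example, not a registered convention.
const localA: symbol = Symbol('thenable');
const localB: symbol = Symbol('thenable');

// Symbol.for() consults the string-addressed global registry.
const sharedA: symbol = Symbol.for('thenable');
const sharedB: symbol = Symbol.for('thenable');

// Two Symbol() calls never produce the same value, even with equal descriptions.
const localSymbolsMatch: boolean = localA === localB;

// Two Symbol.for() calls with the same key return the same symbol.
const registrySymbolsMatch: boolean = sharedA === sharedB;

// Only registry symbols map back to a string key.
const sharedKey: string | undefined = Symbol.keyFor(sharedA);
const localKey: string | undefined = Symbol.keyFor(localA);
```

So with Symbol.for() the uniqueness guarantee is only as strong as the string key, which is the "back to string keys" point.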

By all means, take the 160-bit cryptographic hash idea to esdiscuss. May I ask though, what exactly do you plan to hash to get the unique key that solves the multi-realm problem?

@yortus
Contributor

yortus commented Sep 13, 2016

As of ES6, instanceof is decoupled from the prototype chain due to Symbol.hasInstance. Walking the prototype chain is now just the default behaviour. But in obj instanceof Obj, if the Obj value has its own [Symbol.hasInstance] property, then that will determine how instanceof behaves.
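A minimal sketch of that behaviour, assuming an ES2015+ runtime. The Cat class and the 'kitty' check are borrowed from the example at the top of this issue. The property is installed with Object.defineProperty because Function.prototype[Symbol.hasInstance] is non-writable, so a plain assignment on the constructor would not take effect.

```typescript
interface Named {
  name: string;
}

class Cat {
  name: string;
  constructor(name: string) {
    this.name = name;
  }
}

// Give Cat its own Symbol.hasInstance property. From here on, `x instanceof Cat`
// calls this function instead of walking the prototype chain.
Object.defineProperty(Cat, Symbol.hasInstance, {
  value: (candidate: unknown): boolean =>
    typeof candidate === 'object' &&
    candidate !== null &&
    (candidate as Partial<Named>).name === 'kitty',
});

const plainObject: Named = { name: 'kitty' }; // never created with `new Cat`
const realCat = new Cat('tom');               // created with `new Cat`, other name

const plainObjectIsCat: boolean = plainObject instanceof Cat;
const realCatIsCat: boolean = realCat instanceof Cat;
```

Here the plain object passes the check and the constructed instance does not, which shows that the prototype chain is no longer consulted once the constructor has its own [Symbol.hasInstance].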

@eggers

eggers commented Sep 13, 2016

@shelby3 TypeScript wasn't meant to be a new language. It was meant to just add types on top of JavaScript. One of the current trends in JavaScript is something called duck typing. (If it walks like a duck and quacks like a duck, then it's a duck.) That is a very different concept from inheritance and is in fact antithetical to it. Interfaces are TypeScript's answer to duck typing. Putting interfaces on the prototype chain would defeat their purpose. User-defined type guards were meant to solve type guards for interfaces. Maybe there is a better way of creating them. (I personally would prefer that they were more tightly bound to the interfaces themselves.) However, user-defined type guards definitely belong in TypeScript, and they definitely don't belong on the prototype chain.

/* DuckTyping */

interface Foo { foo: string }

function bar(foo: Foo) {
  // something
}

var foo = {foo: 'bar'};
bar(foo); // Legal even though Foo was never explicitly implemented.

/* Multiple Inheritance */

interface Car {
  goto(dest: Land): void;
}

interface Boat {
  goto(dest: Water): void;
}

class HoverCraft {
  goto(dest: Water|Land) {
    // something
  }
}

@yortus That's awesome. Maybe a separate mechanism for user-defined type guards could be implemented that could be down-compiled as the current type guards are. I personally think that it would be more intuitive to write something like this (the type guards should be more closely bound to the interfaces than they currently are):

interface Cat {
  name: string;
  static [Symbol.hasInstance](animal: Animal) {
    return animal.name === 'kitty';
  }
}

if (c instanceof Cat) { // or maybe `c implements Cat`
  // dog barks.
}

which could compile to:

// es6
class Cat {  
  static [Symbol.hasInstance](instance) {
    return instance.name === 'kitty';
  }
}

if (c instanceof Cat) {
  // dog barks.
}

// es5

var Cat = (function () {
    function Cat() {
    }
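    // Function.prototype[Symbol.hasInstance] is non-writable, so define an own property
    Object.defineProperty(Cat, Symbol.hasInstance, {
        value: function (instance) { return instance.name === 'kitty'; }
    });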
    return Cat;
}());

if (Cat[Symbol.hasInstance](c)) {
  // dog barks.
}

@shelby3

shelby3 commented Sep 13, 2016

@spion wrote:

Or avoiding the other realms. We shouldn't presume the only use of ECMAScript is in broken realms such as the browser and NPM. The (expanse of possibilities in the unpredictable future of the) universe is not so tiny.

This is not what realms means.

I did not define 'realms'.

A realm can be thought of as a fresh global "scope". For example an iframe has a different "realm" from its parent window.

I claim it is evident that I knew that, from my noting that "avoiding" that problem could involve "avoiding ... broken realms such as [in] the browser". The point is that if the browser is creating these fresh global "scopes" without some mechanism such as Symbol (which btw I wasn't aware of until you mentioned it) to fix the problem, then the browser is a promulgator of broken design w.r.t. realms.

I do not presume that the problem with realms can't be fixed anywhere. I am not claiming you presume it can't. If you are confident it is broken everywhere and/or can't or won't be fixed everywhere (or anywhere of significance from your perspective), I am very much interested to read your explanation. I am presuming, until you specify otherwise, that your predominant concern is the global "scopes" (realms) issue. I realize you are also concerned about existing popular module systems.

In CommonJS, every module is ...

It is possible to ensure every module is only instantiated once within the same realm "scope". I have module code doing it. It may or may not be possible with CommonJS and other existing module systems. I haven't looked into that yet.

It's the same problem with e.g. class expressions:

What is the problem you envision? If the module for the function makeClass() was only instantiated once, then by default all instances will have the same prototype and [Symbol.hasInstance] properties.

By all means, take the 160-bit cryptographic hash idea to esdiscuss. May I ask though, what exactly do you plan to hash to get the unique key that solves the multi-realm problem?

Yeah I realized today while I was driving, that in my sleepless state I had forgotten to specify what gets hashed. It would need to be the entire module's code concatenated with a nonce incremented for each unique key requested by that module.

@shelby3

shelby3 commented Sep 13, 2016

@eggers please note I have made three different possible suggestions to choose from. One of them is to use purely structural matching for the user guard (which afaics appears to fix the serious soundness flaws that this issue's "fix" created), so I am not advocating putting any interface in the prototype chain for that suggestion.

@yortus thank you.

@shelby3

shelby3 commented Sep 13, 2016

@aluanhaddad wrote:

I would definitely be curious to see your formal type class proposal.
...
Walking the prototype chain and doing reference comparisons is far from reliable.

I think that as you have yourself stated, a different language, one targeting a type aware VM may well be the only way to achieve your goals.

To the extent that TypeScript can embrace the unreliability of expected structural (interface and subclassed) types at runtime (even possibly with optional runtime nominal and structural checks that fall through to a default or error case), I think perhaps a similar level of typing assurance can be attained with typeclasses. Typeclasses would only make sense nominally, as otherwise they are the same as the structural interfaces we already have.

And I believe typeclasses are much more flexible for extension, compared to subclassing. I intend to attempt to explain that soon in the issue thread I created for that.

If we are going to add some nominal capabilities that are compatible with JavaScript's existing paradigms, such as instanceof, then typeclasses would give us the flexibility of extension we get with structural types. Note that instanceof may become much more reliable.

P.S. you are referring to the comment I made about Google's SoundScript: if they succeed in achieving runtime soundness, I believe it will essentially be a much different language.

@eggers

eggers commented Sep 13, 2016

@shelby3 Ah, I missed that structural proposal. There has been a lot to read the last couple of days in here.

I actually do think something like that would work. It would add some runtime overhead for a large interface, as checking for the existence of many fields would take some time, but probably not a prohibitive amount. By default the TypeScript compiler could output code that checked for all properties/functions, with the option of overriding [Symbol.hasInstance] with a custom check. However, rather than using a special function on interfaces, isA(x), I would use a keyword like implements or implementationOf, or maybe overload instanceof.
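To make the idea concrete, here is a hand-written stand-in for the kind of structural check the compiler might emit. TypeScript does not generate this today; the Animal interface and the function names are invented for this sketch.

```typescript
// Invented interface for illustration only.
interface Animal {
  name: string;
  speak(): string;
}

// Hand-written equivalent of a generated check: one test per member of Animal.
function isAnimal(value: unknown): value is Animal {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate['name'] === 'string' &&
    typeof candidate['speak'] === 'function'
  );
}

function describeValue(value: unknown): string {
  if (isAnimal(value)) {
    // value is narrowed to Animal in this block
    return `${value.name} says ${value.speak()}`;
  }
  return 'not an animal';
}
```

The cost grows with the number of members checked, which is the runtime overhead mentioned above. Note that this only checks that speak is a function, not what it accepts or returns.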

@shelby3

shelby3 commented Sep 13, 2016

@eggers wrote:

By default the TypeScript compiler could output code that checked for all properties/functions, with the option of overriding [Symbol.hasInstance] with a custom check.

Also, when the compiler constructed the instance within the function, then it could optimize away the runtime structural check.

@eggers

eggers commented Sep 13, 2016

@shelby3 can you explain more about how it could optimize away the structural check? If you need to do one thing if it's an implementation of Animal and another if it's one of Vehicle, you still need to structurally check which object it implements.

@shelby3

shelby3 commented Sep 14, 2016

@eggers when the compiler knows that the instance was constructed within the function, then it knows at runtime it has to be of the type that was constructed, thus it doesn't need to do any runtime check for the structural type.

Also I want to add that afaics ideas for tagging the structural type (i.e. roughly a simulation of a nominal type) instead of checking its structure (which I presume exist to increase performance), such as the feature that was implemented for this issue #1007, are breaking structural type checking. And the feature of #1007 is even worse IMO, because it additionally breaks the internal consistency of the compiler: it relies on the (error-prone) human to tell the compiler which type a (metadata) tag corresponds to.

Today (since I have now caught up on some sleep) I am going to be initiating+participating in a more holistic analysis of all this and tying it into the nominal typing discussion, as well as my proposal for typeclasses. I'll try to remember to cross-link from this issue discussion. I also will learn more about the adhoc tagging paradigm that has been adopted by JS frameworks and libraries. This is a learning process for me as well.

@nickredmark

Is there a way to define a type guard that activates a type if it returns? Something that would allow you to write something like this:

try {
   checkIsA(o)
   // from here on o has type A
} catch(e) {
}

@kitsonk
Contributor

kitsonk commented Oct 13, 2017

This is not a support forum.

Questions should be asked at StackOverflow or on Gitter.im.

@yortus
Contributor

yortus commented Oct 13, 2017

@nmaro it has been suggested but not implemented so far. See for example #8655, and other issues linked from there.
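For readers arriving later: assertion signatures (asserts x is T) were added in TypeScript 3.7 and cover this pattern. A minimal sketch, assuming TypeScript 3.7 or newer. The interface A and the helper readId are placeholders made up for this example, and checkIsA mirrors the name used in the question.

```typescript
// Placeholder type for illustration.
interface A {
  id: number;
}

// If this function returns normally, the compiler treats `o` as A afterwards.
function checkIsA(o: unknown): asserts o is A {
  if (typeof o !== 'object' || o === null || typeof (o as Partial<A>).id !== 'number') {
    throw new TypeError('value is not an A');
  }
}

function readId(o: unknown): number {
  try {
    checkIsA(o);
    // from here on o has type A
    return o.id;
  } catch (e) {
    // checkIsA threw, so o was not an A
    return -1;
  }
}
```

As with x is T guards, the compiler trusts the function body, so the check is only as sound as the code that implements it.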

@microsoft microsoft locked and limited conversation to collaborators Jun 18, 2018