Option Infer Turbo (aka Smart Variable Typing) #172

Open
AnthonyDGreen opened this issue Oct 2, 2017 · 16 comments

Comments

@AnthonyDGreen
Contributor

I often get asked why when you perform a type-test the variable doesn't magically get that type inside the If.

Dim obj As Object = "A string"

If TypeOf obj Is String AndAlso obj.Length > 0 Then
    obj.StringStuff()
End If

There's a list of reasons why it's not that simple but I think I have a design that addresses them.

Back-compat

This is worth burning an Option statement on. When we added local type inference it would have been a breaking change so we added Option Infer for back-compat reasons, leaving it Off on project upgrade but On for new projects.

Caveats

  • Only works on non-static local variables and value (ByVal) parameters.

This avoids problems where a property would return a different object of a different type on subsequent invocations, or a field or ByRef parameter is mutated on a different thread or even the same thread. Both the current pattern (TryCast the value into a local, then test it for Nothing) and pattern matching require the same thing, so this restriction is not a drawback relative to the alternatives.
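For reference, here is a minimal sketch of the current pattern referred to above, written in today's VB. The module and function names (`TryCastPattern`, `DescribeLength`) are made up for illustration and are not part of the proposal.

```vb
Option Strict On

Imports System

Module TryCastPattern
    ' Hypothetical helper: returns the length of 'obj' if it is a
    ' non-empty String, otherwise -1.
    Public Function DescribeLength(obj As Object) As Integer
        ' Copy the value into a strongly typed local first...
        Dim str As String = TryCast(obj, String)

        ' ...then test the local for Nothing before touching String members.
        If str IsNot Nothing AndAlso str.Length > 0 Then
            Return str.Length
        End If

        Return -1
    End Function
End Module
```

Because `str` is a local, nothing else can swap the value between the test and the use, which is the same guarantee the caveat above is after.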

How does it work under the hood?

I think of it like a leaky binder. When you have constructs which have boolean conditionals there's an opportunity for a binder (context) to "leak" out either on the "true path" or the "false path" depending on the operators involved. In this context there's a sort of shadow variable with the same name as the variable being tested with the type of the type test.

So, for example take the expression TypeOf obj Is String AndAlso obj.Length > 0 OrElse obj Is Nothing the binder "leaks" into the right operand of the AndAlso operator so in that context 'obj' refers to the String typed 'obj' variable, not the original one. It doesn't leak into the right side of the OrElse because that's not on the "true path". By contrast in the expression TypeOf obj IsNot String OrElse obj.Length = 0 the binder does leak into the right hand of the OrElse because TypeOf ... IsNot ... leaks on the "false path".

This is what lets guard statements work:

If TypeOf obj IsNot String Then Throw New Exception()

' obj has type 'String' here.

The "scope" of the binder is everything after the If statement (within the same block). This means that within that scope overload resolution will always treat obj as a String.
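To make the guard scenario concrete, this is roughly what has to be written by hand today to get the same effect. It is only a sketch of the status quo, not of how the compiler would implement the proposal; `GuardPattern`, `FirstChar` and the local `str` are illustrative names.

```vb
Option Strict On

Imports System

Module GuardPattern
    ' Hypothetical helper: returns the first character of a String passed as Object.
    Public Function FirstChar(obj As Object) As Char
        ' Guard: leave early unless 'obj' is a String.
        If TypeOf obj IsNot String Then Throw New ArgumentException("Expected a String.", NameOf(obj))

        ' Under the proposal 'obj' itself would be usable as a String from here on.
        ' Today a second, explicitly typed local is needed.
        Dim str As String = DirectCast(obj, String)
        Return str(0) ' Throws IndexOutOfRangeException for an empty string.
    End Function
End Module
```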

This leaking has to apply to the short-circuiting logic operators, the ternary conditional operator, If, Do, and While statements and maybe When clauses on exceptions. So, for example:

' This code has a bug in it, I know.
' Or maybe this should have been 'Do While TypeOf node IsNot StatementSyntax'
Do Until TypeOf node Is StatementSyntax 
    node = node.Parent
Loop

' At this point, node has the type StatementSyntax.

This all happens during "initial binding"; it's not based on flow-analysis.

What about Where clauses in queries?

We can go one of two ways.

  1. You only get the strong typing within the where if the expression is joined with a boolean operator or conditional because we can't know if the Where clause actually executed the lambda and the use of this feature should never result in exceptions.

  2. We could translate the Where into a Let, a Where, and then a Select. It's a bit of a stretch but we're already doing magic on this feature so...

Does it automagically upcast?

This doesn't happen if the type test would widen the type of the variable so:

Dim str As String = ""

If TypeOf str Is Object Then
    ' str is NOT reduced to 'Object' here.
End If

What if the same variable is tested multiple times?

The types are intersected. We actually support intersection types in generic methods today when a type parameter has multiple constraints. It's the one place in the language where you can say something is an IDisposable AND an IComparable so we should follow all the same rules there.
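As a reminder of what that existing support looks like, here is a small sketch of a generic method whose type parameter carries two constraints. The `Token` class and the `CompareThenDispose` method are invented for this example.

```vb
Option Strict On

Imports System

' Illustrative type that implements both interfaces.
Class Token
    Implements IComparable, IDisposable

    Public Property Id As Integer
    Public Property Disposed As Boolean

    Public Function CompareTo(obj As Object) As Integer Implements IComparable.CompareTo
        Return Id.CompareTo(DirectCast(obj, Token).Id)
    End Function

    Public Sub Dispose() Implements IDisposable.Dispose
        Disposed = True
    End Sub
End Class

Module IntersectionConstraints
    ' Inside this method T is known to be an IComparable AND an IDisposable,
    ' so members of both interfaces are available on 'value'.
    Public Function CompareThenDispose(Of T As {IComparable, IDisposable})(value As T, other As Object) As Integer
        Dim result As Integer = value.CompareTo(other) ' Member of IComparable.
        value.Dispose()                                ' Member of IDisposable.
        Return result
    End Function
End Module
```

A variable tested against both interfaces would, under the proposal, presumably expose members the same way `value` does here.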

What about value types?

The idea is that this feature creates strongly typed aliases to objects. So the scenario for testing for a value type necessarily requires a boxed value type on the heap. Today when you unbox a value type from the heap we immediately copy it into a local variable so that any mutation to the copy doesn't change the value on the heap. For this feature we want to preserve the idea that it's just a strongly-typed reference, not a copy, and IL lets us do this. The unbox IL instruction actually pushes a managed reference on the stack. Instead of copying the value type we can copy this reference into a "ref local" (and this would be transparent to the user) so a mutation to that value, either through say an interface method or the typed value, will be consistent. It's critical to preserve identity.
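The copy-on-unbox behaviour described above can be observed in today's VB. The sketch below uses a made-up mutable structure (`Counter`) and interface (`ICounter`) to show that a mutation through the unboxed local and a mutation through the interface end up on different values.

```vb
Option Strict On

Imports System

' Illustrative interface and mutable structure; neither is part of the proposal.
Interface ICounter
    Sub Increment()
    ReadOnly Property Count As Integer
End Interface

Structure Counter
    Implements ICounter

    Private _count As Integer

    Public Sub Increment() Implements ICounter.Increment
        _count += 1
    End Sub

    Public ReadOnly Property Count As Integer Implements ICounter.Count
        Get
            Return _count
        End Get
    End Property
End Structure

Module UnboxingCopies
    ' Returns {count seen through the unboxed copy, count seen through the box}.
    Public Function Demo() As Integer()
        Dim boxed As Object = New Counter()

        ' Unboxing into a typed local copies the value out of the box.
        Dim copy As Counter = DirectCast(boxed, Counter)
        copy.Increment() ' Mutates only the copy.

        ' Calls through the interface mutate the boxed value itself.
        DirectCast(boxed, ICounter).Increment()
        DirectCast(boxed, ICounter).Increment()

        Return {copy.Count, DirectCast(boxed, ICounter).Count} ' {1, 2}
    End Function
End Module
```

With the "ref local" approach described above, both numbers would presumably agree, because the typed alias and the box would refer to the same storage.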

Are the variables immutable?

No. But here are the rules for mutation:

  • Within that scope you can assign the variable a value of the same type or more derived as long as the invariants at that point aren't broken. Under the hood we'd have to reassign every alias up to that point, I guess.

  • You can also assign things of a wider type (anything assignable to the original variable). This does not cause an implicit narrowing conversion. Instead, from that point it's illegal to use that variable in a manner which relies on the type guard having succeeded. That's where flow analysis comes in. So even if you re-assign an Object variable which has been promoted to a String variable with an Integer value you can still use it like an Object. It's just that any code which used it like a String, including overload resolution, type inference, member accesses, etc, will report an error.

Dim obj As Object = ""

If TypeOf obj Is String Then
    Console.WriteLine(obj) ' Calls String overload.

    GC.KeepAlive(obj) ' Calls Object overload. No error.

    obj = 1

    GC.KeepAlive(obj) ' Calls Object overload. No error.

    Console.WriteLine(obj) ' Still calls String overload but reports an error.
End If

This way flow analysis doesn't have to feed type information back into initial binding. It sort of works on the idea that the Object alias of obj gets re-assigned, but the String alias of obj becomes unassigned. So flow analysis just tracks reads of the String alias while it is unassigned. In theory one could reassign the String alias of obj to fix this. And any usage of obj as an Object (e.g. by calling members of Object or implicit widening conversion) really reads from the Object alias so doesn't count as a read from unassigned.

The solution in this situation is either to remove the write to the variable, re-guard the code that requires obj to be String, or explicitly cast obj to Object. While all of those workarounds seem ugly they're also the only legitimate code to write in those situations.

This idea that flow analysis reports an error rather than "downgrading" the type is super important to avoid silently changing the meaning of code with shadowed members:

Class C
    Public Shadows ToString As Integer = 5
End Class

Dim obj As Object = New C

If TypeOf obj Is C Then
    Console.WriteLine(obj.ToString) ' Calls Integer overload.

    obj = New Object

    ' Still calls Integer overload but reports an error.
    ' Doesn't silently start calling Object overload when you
    ' add the line of code above.
    Console.WriteLine(obj.ToString) 
End If
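The reason a silent downgrade would be dangerous is that member lookup already depends on the static type today. The sketch below reuses the class `C` from the example above and shows the two different bindings; `ShadowLookup`, `ViaC` and `ViaObject` are names made up for illustration.

```vb
Option Strict On

Imports System

Class C
    ' A field that hides Object.ToString by name.
    Public Shadows ToString As Integer = 5
End Class

Module ShadowLookup
    ' Static type C: 'ToString' binds to the Integer field.
    Public Function ViaC(value As C) As String
        Return CStr(value.ToString)
    End Function

    ' Static type Object: 'ToString' binds to Object.ToString(),
    ' which returns the type name for C.
    Public Function ViaObject(value As Object) As String
        Return value.ToString()
    End Function
End Module
```

Passing the same instance to both functions gives different results, so quietly switching `obj` from the `C` binding to the `Object` binding would change behaviour without any visible edit at the call site.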

What about Gotos?

The same assignment analysis applies. If the reference is reachable at a point where the alias is unassigned an error is reported and the same solutions apply:

Dim str As Object = ""

If TypeOf str Is String Then
    
    1:
    Console.WriteLine(str) ' Error reported.
End If

Goto 1

Does an assignment cause re-inference if a narrower type is assigned?

That would be madness. We should discuss it!

What about Select Case on type?

I've always thought of the principal function of Select Case being to use the same "left" operand for multiple tests without repeating the name over and over. So if TypeOf is the operator, the natural syntax for Select Case would look like applying it multiple times.

Select Case TypeOf obj
    Case Is String
        ' obj has String type here.
    Case Is Integer
        ' obj has Integer type here.
End Select

Or

Select Case obj
    Case TypeOf Is String
        ' obj has String type here.
    Case TypeOf Is Integer
        ' obj has Integer type here.
End Select

The advantage of the first form is it has a little less repetition of the TypeOf keyword and reads very straightforwardly--"What's the syntax in VB for doing a Select Case on the type of an object?" Select Case TypeOf obj.

The advantage of the second form is it doesn't put Select Case into any special mode and so you can still use all the other kinds of Case clauses in the same block. I don't know how often that's actually a scenario though.

Both forms reuse a concept already in the language (TypeOf) and don't add a whole new thing (Match) for a common scenario. In a lot of ways the Case s As String design was a consolation prize to semantics like this.

How would this work in the IDE?

I imagine we'd use a slightly different classification to indicate that the variable is "enhanced" at that point in the program. So let's say your identifiers are black by default; in a region where the type has been re-inferred it'll be purple. Then, if you lose the enhancement somehow it'll go back to black. Maybe if you hover over it the quick info will say something like "This variable has been enhanced with String type and can be used like a string here." or something.

Summary

I think this is the most "Visual Basic" feature ever! It's very "Do what I mean" and is fairly intuitive. The last time a developer asked me why the variable doesn't automatically get that type when he checks it, I sat down to write a whole blog essay about all the technical reasons that won't work. For VB, as much as we can, it's nice to avoid a first-time programmer needing to read an essay from some compiler nerd about threading and overload resolution and shadowing (like, what are any of those things?) to explain why their very reasonable intuition doesn't work.

And this is nothing particularly innovative or out there; this is actually how TypeScript and other languages work already.

I also like the idea of rehabilitating the very readable TypeOf operator which I've felt has suffered a lot since the introduction of TryCast. It's like TypeOf is so self-explanatory but we have this sort of inside baseball gotcha that "Ah-ha, FxCop will tell you that really TypeOf uses the isinst instruction which pushes a casted value on the stack and checks it for null so doing a castclass after that is really just casting twice so you shouldn't do it and instead you should use the TryCast operator and check for null for performance or FxCop and people on forums will laugh at you--THEY'RE ALL GOING TO LAUGH AT YOU!". From the same folks who brought you "Ah-ha! Lists start with 0 here because of pointer arithmetic :)"

@AdamSpeight2008
Contributor

@AnthonyDGreen
Does this strategy work with multiple types? (eg #23)

If TypeOf obj Is T0 OrElse TypeOf obj Is T1 OrElse TypeOf obj Is T2 ... Then
' What's the type of obj ?
End If

@AnthonyDGreen
Contributor Author

No, because the only type obj could have would have to be a union type, which we don't have in VB/C# yet. And even then to interact with it would require separate type checks.

I guess given two class types we could compute the most derived common ancestor. That could be pretty neat, actually.

@AnthonyDGreen
Contributor Author

I thought more on it. I think nearest common ancestor would be very complicated implementation-wise, and in that case it's enough to explicitly type obj as the nearest common ancestor to get the same effect. I would like nearest common ancestor for ternary If though...

@mattwar

mattwar commented Oct 2, 2017

Is this a real type change of the local variable (which implies either a change to the runtime, IL or lots of injected type conversions each time it's referenced) or a new variable using the same name?

If it's a new variable, what happens if you assign to the variable inside the block and then reference it outside the block?

Dim obj As Object = "Initial Text"

If TypeOf obj Is String Then
    obj = "Different Text"
End If

Console.WriteLine(obj)

What text gets written?

@AnthonyDGreen
Contributor Author

See the section "Are the variables immutable?". I think it answers your questions.

@johnnliu

johnnliu commented Oct 3, 2017

Thank you for an awesome long weekend holiday read. Really fancy that the type inference work in TypeScript (and typeless JavaScript) has an influence on how we can resurrect an old keyword like TypeOf.

Q: Does intellisense not get confused when you hover over different variables and it decides that there's a more specific type in play within the block?

@AnthonyDGreen
Contributor Author

@johnnliu,

IntelliSense doesn't get confused so long as the compiler doesn't. It just asks the compiler for its understanding of the identifier under the cursor and displays the result. Sometimes there's some extra smarts to link up related but otherwise separate entities.

@AdamSpeight2008
Contributor

@AnthonyDGreen Greatest Common Type (GCT) is already used in array literals, so it should be possible to use it in the multiple possible type scenario.

@AdamSpeight2008
Contributor

@AnthonyDGreen Or am I thinking of Dominant Type?

@AnthonyDGreen
Contributor Author

That is dominant type.

@esentio

esentio commented Oct 15, 2017

Kind of reminds me of pattern matching, although it's not really the same thing. Is this supposed to be the "VB version" of pattern matching or would VB eventually get both features?

@AnthonyDGreen
Contributor Author

@esentio

Pattern matching is a general term that can mean different things. What I've referred to in the past as "Type Case":

Select Case obj
    Case t As T
End Select

could be described as a special-case of pattern matching or it could not. For Visual Basic 2015 we were originally looking at doing it stand-alone (as a simple extension to Select Case). In Visual Basic 2017 we were thinking of implementing it as a special-case of a broader pattern-matching infrastructure (which didn't exist) as C# 7 did. For VB 16 we're thinking of addressing the scenario of concisely checking and casting an object as a stand-alone scenario without subsuming it under the umbrella of pattern matching.

That said, there are still scenarios beyond type-checking for which pattern matching could add value. Proposals #101, #140, #139, #141, #160, and #124 discuss those scenarios.

So it's not so much that this is pattern matching or an alternative to all pattern matching. It is one approach to addressing a common programming scenario which could also be solved by some forms of pattern matching. Some languages take this approach and others rely solely on pattern matching, however they are not mutually exclusive.

That said, whatever pattern matching VB does get will depend on the merits of those other scenarios and right now #140, and to a lesser extent #139 and #101 are the only scenarios that I feel would significantly move the needle for VB users (myself included). The rest seems neat but uncommon. What do you think?

@VBAndCs

VBAndCs commented Jul 9, 2020

If TypeOf obj Is s As String AndAlso s.Length > 0 Then
   
End If

@VBAndCs

VBAndCs commented Oct 18, 2020

I think the perfect syntax can result from combining Anthony's proposal with mine, so we need to declare no new variables to deal with the target type. The Select TypeOf will be the indication to the compiler to do this trick:

Select TypeOf O
   Case Nothing
      Console.WriteLine("Nothing")
   Case String
      Console.WriteLine(O(0))
   Case Date
      Console.WriteLine(O.ToShortDateString( ))
End Select

Which can be lowered to:

If O Is Nothing Then
   Console.WriteLine("Nothing")
ElseIf TypeOf O Is String Then
   Dim O1 = CType(O, String)
   Console.WriteLine(O1(0))
ElseIf TypeOf O Is Date Then
   Dim O2 = CType(O, Date)
   Console.WriteLine(O2.ToShortDateString())
End If

which can avoid any complications in Anthony's proposal.

@zspitz

zspitz commented Oct 18, 2020

any complications in Anthony's proposal.

Please clarify what complications you are referring to. Your syntax is irrelevant to this proposal, which discusses aliasing the current variable to the type described by the TypeOf ... Is ... test.

This:

If O Is Nothing Then
    Console.WriteLine("Nothing")
ElseIf TypeOf O Is String Then
    Console.WriteLine(O(0))
ElseIf TypeOf O Is Date Then
    Console.WriteLine(O.ToShortDateString( ))
End If

could also be lowered in the way you describe, without introducing any new syntax.

And if what's bothering you is the clunkiness of the TypeOf ... Is ... syntax, then that should certainly be addressed.

@AdamSpeight2008
Contributor

When my mental health allows, I am still investigating a TypeClauseSyntax for Select Case. I have to see how compatible it is with the current IsClause.

Select Case obj
       Case Is Nothing 
            Console.WriteLine("Nothing")
       Case Is String  Into S
            Console.WriteLine($"Is String {S}")
       Case Is Date    Into D
            Console.WriteLine($"Is Date {D}")
       Case Else
End Select

It would be a great fit with When clauses as well, e.g.

        Case Is String  Into S When S.Length > 2
            Console.WriteLine($"Is String {S}")


7 participants