Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Definite assignment versus pattern-matching when pattern always matches #14651

Closed
gafter opened this issue Oct 20, 2016 · 34 comments
Closed

Definite assignment versus pattern-matching when pattern always matches #14651

gafter opened this issue Oct 20, 2016 · 34 comments
Assignees
Labels
Area-Compilers Area-Language Design Feature Request Language-C# New Language Feature - Pattern Matching Pattern Matching Resolution-Fixed The bug has been fixed and/or the requested behavior has been implemented
Milestone

Comments

@gafter
Copy link
Member

gafter commented Oct 20, 2016

This issue arises from a recent question by @uTeisT on pattern-matching.

Consider a pattern-matching operation that always succeeds:

    void TestFunc(int i)
    {
        if (i is int j) { }
        Console.WriteLine(j);
    }

This results in a definite assignment error on the use of j. That is because the current spec says that pattern variables are definitely assigned when true for the pattern-matching operation. The implication is that they are not definitely assigned when then pattern match fails (even though that cannot happen here), and so they are not definitely assigned after the if statement.

We could modify the specification so that a pattern-matching operation that always succeeds results in the pattern variables being definitely assigned. That is, a pattern-matching operation that is irrefutable (i.e. known to always succeed) would cause all variables to be definitely assigned when false.

That would allow this code to compile without error.

Note that the converse situation, a pattern-matching operation that cannot succeed (because, for example, of a type mismatch), is a compile-time error, so no special definite assignment rule is required.

@tamlin-mike
Copy link

I'm not convinced that's a good idea. {} denotes scope. Should really that 'if' in the example introduce 'j' to the surrounding scope? I would have expected it only to be valid within the scope of that statement (i.e. inside the "{}" in this case). That would to me be the path of least surprise.

If starting to release what at first glance seems to be scope-limited variable introductions into outer scopes, ... that seems like a very slippery slope (read: not a can but a whole barrel of worms).

To quote wiser persons than myself: Here be dragons!

@HaloFour
Copy link

@tamlin-mike

I agree with your sentiment, but the scoping issue was already discussed and effectively decided upon: #12939

This issue specifically addresses definite assignment of j in that scope.

@tamlin-mike
Copy link

@HaloFour

Gotcha. OK, so it's (currently, and hopefully for a looong time) limited to 'out' variables, and implicit 'out' variables like 'j' in the scenario presented?

While syntactically a bit confusing at first blush, I now see the merit and I'm less concerned. Provided scope doesn't expand, explicitly or implicitly (even by mistake, f.ex. with an unnamed tuple), I'm (reluctantly, from a syntactical purist POV) cool with it.

@DavidArno
Copy link

DavidArno commented Oct 21, 2016

@gafter,

"I told you so" is unprofessional, but...

The team ignored lots of sensible advice and (a) introduced non-conditional out var's and (b) refused to introduce guard and broke scoping in the process. You've made your bed and now have to lie in it. Having introduced leaky scope for variables, C# is going to suffer years of problems like this one.

@gafter
Copy link
Member Author

gafter commented Oct 21, 2016

@DavidArno This isn't a "problem", it is a place where the behavior has to be specified. The same "problem" exists no matter what the scope, due to operators such as && and ||.

@ghost
Copy link

ghost commented Oct 21, 2016

@gafter: Have you, or anyone else, considered this:

int? ni = (a is int i || int.TryParse(a, out i) ? i : null);

// "i" is in scope here

How does this make any sense?

Correction: Fixed syntax error pointed out by @HaloFour.

@HaloFour
Copy link

@Kaelum

That expression wouldn't be valid anyway, you're trying to declare i twice.

@ghost
Copy link

ghost commented Oct 21, 2016

@HaloFour that is not true. With the current scoping rules (#12939), "i" would only be defined once. If it is an int, the is declaration is used and the TryParse is never executed. If it isn't an int, the is declaration does not occur and the TryParse declaration is used.

@HaloFour
Copy link

@Kaelum

Whether or not the TryParse is executed isn't relevant, the variable is still declared twice and the scopes still overlap. That statement would not be valid even under the previous scoping rules since i would have remained in scope for at least the remainder of the expression, even if the pattern evaluated to false.

@ghost
Copy link

ghost commented Oct 21, 2016

@HaloFour: Again, that is not true. In fact, that is an almost exact replica of some code that @MadsTorgersen wrote in the exact same thread:

#12939 (comment)

@HaloFour
Copy link

@Kaelum

He's using out i, not out int i. That does work because it's not redeclaring i, it's simply reusing the existing i pattern variable which, despite never being assigned, may be passed as an out variable.

@ghost
Copy link

ghost commented Oct 21, 2016

@HaloFour: Ok, remove the int then. The statement still stands, how does the following make any sense?

int? ni = (a is int i || int.TryParse(a, out i) ? i : null);

// "i" is in scope here

@HaloFour
Copy link

@Kaelum

I'd agree that it doesn't and that the new scoping rules are a mistake, but it seems that the language design team has settled that decision.

@DavidArno
Copy link

@gafter,

@DavidArno This isn't a "problem"

You are right. It isn't a "problem"; it's a problem.

The way I see it, @HaloFour and I have really quite strongly disagreed with each other over a whole raft of ideas here. Occasionally, we've semi-agreed. On this issue though, we are as one: the new scoping rules are a mistake. If such often diverse views are so totally in agreement, the language team really ought to stop and take notice. Sadly, you won't though.

@gafter
Copy link
Member Author

gafter commented Oct 21, 2016

The "problem" cited is the definite assignment state in a syntactic position where "definitely assigned when true" does not apply. Scope is a separate issue.

@ghost
Copy link

ghost commented Oct 22, 2016

@gafter: That is an inaccurate statement. If the scope didn't leak, the "problem" would not exist.

Into the rabbit hole we go...

@tamlin-mike
Copy link

tamlin-mike commented Oct 22, 2016

Previously I was thinking "Well, to prevent leaking, just wrap it in curly braces". Then comes @Kaelum with a voice of sanity and an example (many thanks for it) displaying "Whooops, that won't work!".

I wonder (just brainstorming here): Could it become "cleaner" if instead allowing a statement to not end in ';', but have a following {} (where introduced variables would be "live", but not after closing brace)?
I'm in no way suggesting it, merely bouncing the idea, but it could provide both the scoping we probably want, while keeping the convenience of defining variables on-the-fly-ish.

Edit: But then, how the heck would a result from such a compound statement be expressed? Should a "return" statement be mandatory? sigh this is a mess.

Perhaps Mads should be included in this discussion?

@qrli
Copy link

qrli commented Oct 22, 2016

@Kaelum
Regarding the scoping rule, my understanding is both choices have examples which do not make sense. They are both wrong to some extent. That's why there is long discussion and flipflop. The design team has to make a compromise in the end.

So raising up a counter example will help not flip the decision again. And you example can be workaround by using switch expression, which looks less critical than non-sense examples of the opposite choice. To make a change, we need a much smarter solution which can solves the non-sense examples of both sides.

@ghost
Copy link

ghost commented Oct 22, 2016

@qrli
There wasn't any public RFC discussions prior to this absolutely ridiculous change, which is not only a serious breaking change, but it is the biggest breaking change that has ever been made to any C based language. By not questioning the decision, and just accepting it, we would be negligent in our duties to ensure that changes like this do not continue to happen. What we had previously didn't break anything, it only had some limitations that were inline with existing functionality.

I do not accept this change, and never will, as long as it allows scope to leak in this manner. I do have a question for you, how do you think that "switch" can be used to replace the code in the example that I showed? My example is a single line of code, of which there are several when used in an actual application, and is a variation of code that is currently in use in several applications.

This latest scope change needs to be rolled back to what we previously had, and then we can move forward. This "Option 3" is by far the worst of all the options, and will continue to spiral down the rabbit hole. I'd rather force the discussion now, than have people realize a year later that we are now totally f*****.

@gafter
Copy link
Member Author

gafter commented Oct 22, 2016

That is an inaccurate statement. If the scope didn't leak, the "problem" would not exist.

Not true. In code such as if (i is int j || M(j)) precisely the same "problem" occurs, and it occurs under either set of scope rules.

it is the biggest breaking change that has ever been made to any C based language

@Kaelum Can you please provide a specific example of code that used to compile, and now either does not compile, or has different behavior?

@ghost
Copy link

ghost commented Oct 22, 2016

Not true. In code such as if (i is int j || M(j)) precisely the same "problem" occurs, and it occurs under either set of scope rules.

@gafter: I don't know how you can make that statement, as "j" would only be in scope for the then block of the if statement under the previous proposed set of rules. Whereas it is in scope at the if statement level under the current proposed set of rules. That is a very glaring difference.

Can you please provide a specific example of code that used to compile, and now either does not compile, or has different behavior?

@gafter: Yes I can, and I am so glad that you asked this exact question, but first, we need to go back a bit to look at the bigger picture. If you must, the example is at the bottom of this post. However, this entire mess has snowballed from a couple shortcomings, and the current methods for working around them.

What are the shortcomings? The major one is, the inability to use Fields or Properties as out parameters, so developers create temporary variables to work-around the shortcoming. And the minor one is, not being able to pass a Nullable<T> for the out parameter, when T is expected. I won't discuss the merits or pitfalls of the latter, as I don't know if that would even be possible. However, the major shortcoming should be possible.

To work-around the major shortcoming, you are required to create a temporary variable, so that you can do something similar to a for-loop. However, you are required to do this at the block level of the if statement, or higher, thus poluting the scope with a variable that is truly never needed at that level. Nothing earth shattering, and you can easily see the declaration for the temporary variable.

So, what happened? Someone put in a request for a "nice to have feature". The ability to declare a variable in the method call with an out parameter. This is not a required feature, and not having it would not break anything. However, if done correctly, it could cleanup code for those who use it, and make the overall code appearance much nicer. It would also be very similar to how the for-loop works, so people would use variables like i, j, or k for the temporary variable. Which is exactly where we were with the previously proposed set of rules.

With the horrendous changes that have been proposed as "Option 3", bringing us to the current proposed set of rules, we now have "hidden" variable declarations in places that can be extremely difficult to find, and there is no valid reason allowing this to happen. Remember, this is for "nice to have feature"! This is not required in any way. So why is there a proposal to forever screw up the scoping rules of an AWESOME language, simply to implement a "nice to have feature"?

That is the $64K question! How did a simple "nice to have feature" turn into the "Scoping Rule Proposal from HELL"? All this being said, the example below exhibits how implementing this "nice to have feature" in existing code, will cause scoping issues with existing code:

// If you do something like the following (an example directly from @MadsTorgensen)
if (o is int i || (o is string s && int.TryParse(s, out i)))
{
    // i is definitely assigned and o is either an int or a string with an int in it
}
.
. // a few pages of code
.
if (a is int t || int.TryParse(a, out t))
{
    // some code
}
.
. // a few pages of code
.
// And then have this
for (int i = 0; i < 10; i++)
{
    // Your code
}
.
. // a few pages of code
.
// And then have this
List<int> list = new int[] { 0, 1, 2, 3, 4, 5, 6 }.Where(t => t < 3).ToList();

You would not expect "i" or "t" (not sure what would happen with this) to have already been declared. You also have the following to consider:

int? one = (a is int i || int.TryParse(a, out i) ? i : null);
int? two = (b is int j || int.TryParse(b, out j) ? j : null);
int? three = (c is int k || int.TryParse(c, out k) ? k : null);
int? four = (d is int l || int.TryParse(d, out l) ? l : null);
int? five = (e is int m || int.TryParse(e, out m) ? m : null);
int? six = (f is int n || int.TryParse(f, out n) ? n : null);
int? seven = (g is int o || int.TryParse(g, out o) ? o : null);
int? eight = (h is int p || int.TryParse(h, out p) ? p : null);

// i-p are in scope here

The above code would become a reality. Due to poluted scoping, what do you think it would look like if we attempted the same thing using a DBNull test on a DataReader for a database record that has 30+ columns?

The bottom line is that every single argument that I have seen to support this change, only covers pieces of the picture, not realizing that each small piece is destroying the landscape. This is why I am vehemently opposed to this scoping change, and everyone should be extremely concerned that this is even being considered.

@ghost
Copy link

ghost commented Oct 23, 2016

And one more thing. I don't know if the previous set of rules allowed this, but it would be in line with existing scoping:

if ((int i = GetInt()) > 3)
{
    // i is in scope here.
}

// i is not is scope here.

Again, this is very useful, but not absolutely required. However, this would not be possible under the currently proposed scoping rules.

@ghost
Copy link

ghost commented Oct 23, 2016

BTW, if someone truly wants i to be in scope at the if level, what is stopping them from doing this?

int i;

if (o is int i || (o is string s && int.TryParse(s, out i)))
{
    // i is definitely assigned and o is either an int or a string with an int in it
}

The is int i would only create the variable i if it doesn't already exist. Maybe @MadsTorgersen should weigh in on this.

EDIT: This is "Option 4".

@gafter
Copy link
Member Author

gafter commented Oct 23, 2016

@Kaelum None of your examples appear to be legal C# 6. You said there was a breaking change. Where is the C# 6 code that has changed behavior?

Not true. In code such as if (i is int j || M(j)) precisely the same "problem" occurs, and it occurs under either set of scope rules.

@gafter: I don't know how you can make that statement, as "j" would only be in scope for the then block of the if statement under the previous proposed set of rules. Whereas it is in scope at the if statement level under the current proposed set of rules. That is a very glaring difference.

I can make that statement as the problem occurs in the argument to the call to M in the condition of the if. It occurs under either set of scoping rules.

@ghost
Copy link

ghost commented Oct 23, 2016

@gafter, I have stated my position. If you don't understand my position, there is nothing more to say.

I can make that statement as the problem occurs in the argument to the call to M in the condition of the if. It occurs under either set of scoping rules.

I just took another look at your code. How is that code even valid? M(j) could never work, as j would not be initialized.

@gafter
Copy link
Member Author

gafter commented Oct 23, 2016

In that code, M is never called except when j is definitely assigned.

@ghost
Copy link

ghost commented Oct 23, 2016

@gafter, you used || not &&.

If you intended to use &&, and your statements stand, then your definition of the rules is completely different from @MadsTorgersen's definition of the rules as seen at #12939 (comment).

@qrli
Copy link

qrli commented Oct 23, 2016

@Kaelum Your examples are not good examples. I understand there is scary legacy code which has very long functions, but we are talking about new language features which will only be used in new code. When you try to use the new feature in legacy code, you will also do some refactor to extract some sub functions to simplify the beast. So it is not a problem to reuse short variable names.

int? one = (a is int i || int.TryParse(a, out i) ? i : null);
int? two = (b is int j || int.TryParse(b, out j) ? j : null);
int? three = (c is int k || int.TryParse(c, out k) ? k : null);
int? four = (d is int l || int.TryParse(d, out l) ? l : null);
int? five = (e is int m || int.TryParse(e, out m) ? m : null);
int? six = (f is int n || int.TryParse(f, out n) ? n : null);
int? seven = (g is int o || int.TryParse(g, out o) ? o : null);
int? eight = (h is int p || int.TryParse(h, out p) ? p : null);

No, you will not do that. Write a simple function to remove all that duplication:

int? ExtractInt(object obj) => obj is int i || int.TryParse(obj, out i) ? i : null;

@ghost ghost mentioned this issue Oct 24, 2016
@ghost
Copy link

ghost commented Oct 24, 2016

I just created #14697 as an alternate proposal to that defined in #12939, which would elimite this issue, and I believe makes significantly more sense.

@timgoodman
Copy link

What's the purpose of writing a pattern match which is guaranteed to succeed? Is it just to make the local variable readonly?

I wonder if such a statement is more likely to be a mistake, in which case it might be better to leave it as an error. One could always write int j = i; for now, and live with the fact that it's not readonly for one more release of C#, after which I hope we'll get something like let j = i; which much better conveys that the intent is to make j be readonly, not to check whether i is an int.

@HaloFour
Copy link

@timgoodman

Pattern variables aren't readonly anymore.

@timgoodman
Copy link

@HaloFour Thanks, I didn't realize that. (It's a bit of a shame, I liked how the original proposal with let gave you readonly locals as a bonus.)

So that being the case, why would anyone write code like above, instead of just int j = i;? Maybe having a pointless pattern match should be at least a warning. (Although I guess having it be a definite assignment error is a bit more confusing.)

@gafter
Copy link
Member Author

gafter commented Oct 26, 2016

So that being the case, why would anyone write code like above

It is much more compelling when written to get the effect of Swift's guard statement (i.e. using an if statement), which will make much more sense once we expand support for pattern-matching to recursive constructs. But we want to get the definite assignment correct today so that the extension to recursive patterns is smooth.

@AdamSpeight2008
Copy link
Contributor

AdamSpeight2008 commented Oct 27, 2016

@gafter

It is much more compelling when written to get the effect of Swift's guard statement

Only if it explicitly stated that is coder meant, eg by making them explicitly state that intention.

I think this is an more of an issue related to introducing variable into scoping.
Introducing a variable should into the nearest outer scope, not nearest inner scope. (by default). Then require to coder to explicit specify a change to use the inner scopes of say if``'sthenscope or the`else` scope. True this will be more typing for those cases, but it be easier to understand the scoping(s) if it explicitly written when change from the default.

    void TestFunc(int i)
    {
        if (i is int j) { j is in scope and assigned }
        else { /* j is in scope and not assigned */ }
        Console.WriteLine(j); /* j is in scope and assigned if the pattern succeeds, otherwise not assigned*/
    }

Inner

    void TestFunc(int i)
    {
        if (i is int ~j) { j is in scope and assigned }
        else { /* j is not in scope and not assigned */ }
        Console.WriteLine(j); /* j is not in scope and not assigned */
    }
    void TestFunc(int i)
    {
        if (i isnot int ~j) { j is not in scope and not assigned }
        else { /* j is in scope and assigned */ }
        Console.WriteLine(j); /* j is not in scope and not assigned */
    }

@gafter gafter modified the milestones: 2.0 (RTM), 2.0 (RC.2) Oct 28, 2016
gafter added a commit to gafter/roslyn that referenced this issue Oct 28, 2016
…specified

Also clean up some grammar in Syntax.xml
Fixes dotnet#14651
@gafter gafter added 4 - In Review A fix for the issue is submitted for review. and removed 3 - Working labels Oct 28, 2016
gafter added a commit that referenced this issue Oct 31, 2016
…specified (#14801)

Also clean up some grammar in Syntax.xml
Fixes #14651
@gafter gafter added Resolution-Fixed The bug has been fixed and/or the requested behavior has been implemented and removed 4 - In Review A fix for the issue is submitted for review. labels Oct 31, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers Area-Language Design Feature Request Language-C# New Language Feature - Pattern Matching Pattern Matching Resolution-Fixed The bug has been fixed and/or the requested behavior has been implemented
Projects
None yet
Development

No branches or pull requests

7 participants