Skip to content
This repository has been archived by the owner on Apr 13, 2023. It is now read-only.

throw and return statements should affect flow sensitive typing #3997

Closed
CeylonMigrationBot opened this issue Jan 5, 2014 · 52 comments
Closed

Comments

@CeylonMigrationBot
Copy link

[@sgalles] throw and return statements should affect flow sensitive typing :

Foo foo = ... ;
if (!is Bar foo) {
     throw FooBarException();
}
foo.bar(); //foo is a Bar here 

Same behavior expected if return is used instead of throw

[Migrated from ceylon/ceylon-spec#891]
[Closed at 2015-10-05 15:41:31]

@CeylonMigrationBot
Copy link
Author

[@pthariensflame] I like this, but it might be undecidable. What do you think, @RossTate?

@CeylonMigrationBot
Copy link
Author

[@loicrouchon] Might be linked with #3642

@CeylonMigrationBot
Copy link
Author

[@RossTate] It's pretty easy to decide. At any branch point in the CFG you refine the types of variables as appropriate given the branch condition. At any merge point in the CFG, you join the types of variables. If there's a problem, it's because of loops, but I think for this application we'd be fine. Note that I'm saying "in the CFG" rather than in the syntax, which addresses all the Ceylon issues I've seen around on this topic.

@CeylonMigrationBot
Copy link
Author

[@gavinking] It's recently come to my attention that there is a whole class of people who vehemently prefer to write:

void fun(String? str, Integer int) {
    if (!exists str) { return; }
    if (str.empty) { return; }
    if (int<0) { return; }
    //do the real work here
}

Instead of, for example:

void fun(String? str, Integer int) {
    if (exists str, !str.empty) {
        if (int>=0) { 
            //do the real work here
        }
    }
}

There's even a whole name for former style: "guards". I almost never write code like this and in fact I actively dislike it.

Nevertheless, I don't think it's a good thing that Ceylon actually prevents you from using this style. So we should make stuff like if (!exists x) { return; } have the same effect that assert (exists x); has today.

@CeylonMigrationBot
Copy link
Author

[@FroMage] Yahoo!!!! :)

@CeylonMigrationBot
Copy link
Author

[@luolong] I emphatically vote Yes! :)

@CeylonMigrationBot
Copy link
Author

[@gavinking] So even though this looks not much more difficult than #3180, sadly, it's actually significantly harder. It would require doing control flow analysis and type assignment at once in the same visitor, essentially merging ControlFlowVisitor with ExpressionVisitor into one big blob of waaay too many responsibilities.

@CeylonMigrationBot
Copy link
Author

[@FroMage] Why does control flow depend on type assignment?

@CeylonMigrationBot
Copy link
Author

[@lucaswerkmeister]
@FroMage: I tried swapping the phases in the typechecker, and it seems the following cause problems:

Anything returnsDefinitely1() {
    while (true) {
        return "";
    }
}
Anything returnsDefinitely2() {
    assert (false);
}

(possibly amongst other things)

@CeylonMigrationBot
Copy link
Author

[@gavinking]

Why does control flow depend on type assignment?

Because control flow analysis looks at things like if (true).

@CeylonMigrationBot
Copy link
Author

[@FroMage] But that sounds like a very special case. Perhaps the part of flow-control that deals with special cases like this can be split into a later phase? Or are these affecting flow-sensitive typing?

@CeylonMigrationBot
Copy link
Author

[@gavinking] No, it has to all be done at once. For example if I have:

Integer fun() {
    if (1==1) { assert (false); } else { return 0; }
}

The only way I can tell that this function definitely returns is by knowing the type of false.

@CeylonMigrationBot
Copy link
Author

[@FroMage] Yes, of course, but what I meant to ask is: do we need to know if something definitely returns for flow typing? Or just a subset of flow visitor?

@CeylonMigrationBot
Copy link
Author

[@gavinking]

Yes, of course, but what I meant to ask is: do we need to know if something definitely returns for flow typing?

Well, yes, we do, that's the whole point of this issue!

@CeylonMigrationBot
Copy link
Author

[@gavinking] String fun(String? str) {
if (is Null str) { assert (false); }
return str;
}

@CeylonMigrationBot
Copy link
Author

[@FroMage] Well, no, the point of the issue is that we handle returns, like:

void fun(String? str){
 if(is Null str){ return; }
 print(str);
}

The case of assert(false) is pathological, especially given that it should be written:

String fun(String? str) {
    assert (!is Null str);
    return str;
}

I don't care that assert(false) is not handled, it's such a special case. Now, perhaps there are more interesting cases that need typing for return flow, I just can't think of one at the moment.

@CeylonMigrationBot
Copy link
Author

[@lucaswerkmeister]

The only way I can tell that this function definitely returns is by knowing the type of false.

I don’t think it needs the type of false or true:

shared Anything f() {
    value myTrue = true;
    if (myTrue) {
        return nothing;
    }
    // error: does not definitely return
}

It seems to depend on the identity of the true / false value, which is also what the spec says (5.4.3):

A boolean condition is considered to be always satisfied if it is a value reference to true. A boolean condition is considered to be never satisfied if it is a value reference to false.

Perhaps this will change with #3971, when we can denote the types \otrue and \ofalse, but right now I don’t think this needs type information: It only depends on imports and scoping. Perhaps it’s possible to insert a phase before flow control that only tracks references to true and false? (I. e., if a true / false BME refers to the language local values.)

@CeylonMigrationBot
Copy link
Author

[@gavinking] @lucaswerkmeister resolving references and assigning types is something that has to happen at the same time as part of one phase. We can't resolve references and then later assign types to things.

@CeylonMigrationBot
Copy link
Author

[@lucaswerkmeister] But we wouldn’t be resolving arbitrary references in that phase – we would just track two names and check if they’re shadowed by anything else (import alias, other import, toplevel value in package, value in surrounding scope). We don’t need to know any types, because only the original true and false count (see above), and we don’t need to inspect QMEs, and therefore don’t need the qualifying expression’s type, because any qualified member is never true or false (unless someone uses package.true in the language module itself).

Or does flow control need more type information than “always satisfied” and “never satisfied”?

@CeylonMigrationBot
Copy link
Author

[@gavinking] I also have to take into account inherited members. It's nowhere near as simple as you're making out.

@CeylonMigrationBot
Copy link
Author

[@lucaswerkmeister] Ohh, the superclass can also have a member true. Dangit. That probably makes the phase too complicated, and too close to full typechecking.

@CeylonMigrationBot
Copy link
Author

[@gavinking] Note that the functionality which blocks this popular issue is #3784 and #3783, both of which have been around since M6, along with other related similar stuff in ControlFlowVisitor.

@CeylonMigrationBot
Copy link
Author

[@gavinking] Given how difficult this would be to implement, we should consider an alternative, which would be to add a block to assert, for example:

Hell, I'm not even sure if the else keyword is really needed for readability here. How about just:

assert (nonempty list) { return null; }
assert (is Foo arg) { throw IllegalTypeException(); }
assert (!name.empty, length>0) { return ""; }

Or, if you guys think it's necessary for readability, we could require an else keyword here:

assert (nonempty list) else { return null; }
assert (is Foo arg) else { throw IllegalTypeException(); }
assert (!name.empty, length>0) else { return ""; }

This is much, much easier to implement, and doesn't actually seem worse to me.

@CeylonMigrationBot
Copy link
Author

[@lucaswerkmeister] I think the else makes the meaning much more obvious to readers unfamiliar with the syntax. With else, I really like this idea (though I’d probably still want to have the original issue eventually).

@CeylonMigrationBot
Copy link
Author

[@sgalles] Well I'm certainly not against it because https://groups.google.com/d/msg/ceylon-users/AbcLpWNrlCY/9Aa7XLXhPcQJ 😄
And yes, I also like the version with else

@CeylonMigrationBot
Copy link
Author

[@jvasileff] I agree with @lucaswerkmeister's comment.

It may also make sense to allow break and continue.

@CeylonMigrationBot
Copy link
Author

[@lukedegruchy] FWIW, I vote for the else version.

@CeylonMigrationBot
Copy link
Author

[@lucaswerkmeister] Actually, I don’t fully understand the semantics yet. What does this do?

void printFirst(Anything[] seq) {
    assert (nonempty seq) else { print("<empty>"); }
    print(seq.first.string);
}

Implicit return after print("<empty>");? Or implicit throw? Or error from ControlFlowVisitor?

@CeylonMigrationBot
Copy link
Author

[@jvasileff] Error, "Does not definitely return", right?

@CeylonMigrationBot
Copy link
Author

[@lucaswerkmeister] Good idea, that error message could apply there.

@CeylonMigrationBot
Copy link
Author

[@gavinking] Also, assert typically lets you express the conditions in non-negated form, i.e. exists instead of !exists.

@CeylonMigrationBot
Copy link
Author

[@gavinking] Alright, so I have, for 1.2, pushed a totally rubbish implementation of this feature which handles things like:

if (!exists foo) { return; }
foo.bar();

and:

if (!is Foo foo) { throw; }
foo.bar();

I detect if blocks which end in a return, throw, break, or continue.

  • I do not detect assert (false) nor while (true).
  • Nor do I evaluate conditional branches to see if every branch definitely returns.

Furthermore, the code that supports this is by far the nastiest in the whole typechecker.

My motivation for pushing this now, right before the 1.2 release is that I discovered how much code like this is being written right now:

if (!exists foo) {
    return null;
}
assert (exists foo);

Code like the above is an abomination in Ceylon, and undermines the typesafety we provide. Furthermore, code like that breaks immediately as soon as the issue is solved. So you guys definitely should not have been writing code like that!

Therefore, it was better to do this sooner rather than later.

@CeylonMigrationBot
Copy link
Author

[@quintesse]
@jvasileff you have a condition and a true-block and a false-block, to me that's the definition of an if statement.

@sgalles well, first I don't find that it makes code "super clear", but I'll admit it's useful. But an assert is an assert and an if is an if, one is used to throw an exception for the highly unanticipated situation where the condition isn't true while an if is used to make choices. The moment you give an assert an else-block and use it to make decisions it's become an if and you should be using an if in my opinion.

So you guys definitely should not have been writing code like that!

Well you can say that to us, but you just can't prevent the rest of the world from doing that, it's the closest thing Ceylon permits to what is an "early-exit" code style.

@CeylonMigrationBot
Copy link
Author

[@gavinking] I have reimplemented this in a slightly less-crappy way.

@CeylonMigrationBot
Copy link
Author

[@FroMage] So this works for what sort of conditions? is and exists for sure. Others? Multiple conditions?

Due to the semantics of is/exists wrt variables and transient values, it's not possible to get a new definition (virtual else var, after the if condition) that is not constant, which means that it doesn't matter that int is not evaluated three times here (contrary to how it looks):

            if (!exists int) {
                return 0;
            }
            return int+int;

In fact, ATM with my implementation it's evaluated twice instead of the once that would be possible. I assume that's not a problem?

@CeylonMigrationBot
Copy link
Author

[@gavinking]

  • is, exists, nonempty
  • No, as soon as you have multiple conditions, you can't apply this reasoning, since you have no idea which of the conditions failed.

@CeylonMigrationBot
Copy link
Author

[@gavinking]

Due to the semantics of is/exists wrt variables and transient values, it's not possible to get a new definition (virtual else var, after the if condition) that is not constant, which means that it doesn't matter that int is not evaluated three times here (contrary to how it looks):

Right. Exactly. @tombentley totally misunderstood the point I was making about what was wrong with the previous implementation. It does not matter if you re-evaluate the variable. It does matter if you re-evaluate the condition. Which was a problem with the previous implementation.

@CeylonMigrationBot
Copy link
Author

[@FroMage] OK thanks.

@CeylonMigrationBot
Copy link
Author

[@FroMage] Mmm, does that also work with destructuring?

@CeylonMigrationBot
Copy link
Author

[@gavinking] @FroMage no, I don't think so.

@CeylonMigrationBot
Copy link
Author

[@FroMage] OK.

@CeylonMigrationBot
Copy link
Author

[@FroMage] Just keep in mind that I won't implement anything for what you say is not supported ;)

@CeylonMigrationBot
Copy link
Author

[@FroMage] Done.

@CeylonMigrationBot
Copy link
Author

[@FroMage] Sorry I meant to close the backend issue ;)

@CeylonMigrationBot
Copy link
Author

[@gavinking] Well, it's done anyway, and moved to #4540.

@CeylonMigrationBot
Copy link
Author

[@lucaswerkmeister] Shouldn’t we have a tracking issue for making this work even with assert (false); and while (true) (milestone “postponed indefinitely”)?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants