issues with the semantics of top-level expressions #24569
Comments
Would it be possible to compile loops only if they don't call anything that does one of the above special actions or results in eval being called? I suspect that requires trying some amount of compilation and bailing out if those conditions are encountered, which is a bit suboptimal.
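A hypothetical illustration (not code from the thread) of the kind of loop this question is about: one that must not be compiled against a fixed world, because each iteration performs one of the "special actions" by defining a method via `eval`.

```julia
# Each iteration replaces a method, so the loop body cannot assume a
# fixed world: later iterations must see the newest definition.
for i in 1:3
    @eval step() = $i                 # redefines `step` every iteration
    println(Base.invokelatest(step))  # must observe the newest method
end
```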
I think that's undecidable.
No kidding, but we can be conservative here. We don't need a precise answer, we just want to be able to optimize cases where it's obvious that none of the problematic operations happen.
For a start, that requires no function calls, and I think that pretty much rules out almost all the useful cases.
It doesn't eliminate cases where the entire loop body is inlined, which is not a useless case. Even if there are function calls but no dynamic dispatch, it would be possible to know that a particular method doesn't itself call any top-level-affecting constructs. Once you have dynamic dispatch, or call a known method that can't be shown not to affect the top level, then sure, the game is up. But I really don't think that's a negligible or useless set of cases.
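A hypothetical example of the "obviously safe" case described above: the loop body inlines to plain integer arithmetic, with no dynamic dispatch, no allocation, and nothing that could add methods or call `eval`.

```julia
# The body reduces to inlined integer arithmetic, so it could in
# principle be compiled safely under a fixed world.
total = 0
for i in 1:1_000_000
    global total += i * i
end
```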
The loop also has to avoid all allocation, since finalizers can modify global state.
That's a tautology. You can't know whether the loop body can be inlined unless you can determine that it has no such side effects. Or we need to implement (2) above: teach inference how to reason about all of the possible side effects of top-level code.
That is why I mentioned trying to compile the code first and bailing out if it turns out to be too dynamic or to do something top-level.
That's exactly what I said is impossible. "No allocation" isn't even defined in the language semantics, so finalizers can run anywhere.
I don't think we should worry about finalizers here. They should not be adding methods, and we can either say they are not allowed to, or that it's undefined when those effects are seen. Other than that, the problem is the time needed to develop either an analysis for this or deoptimization techniques.
I think this is fine, but in that case we should disallow it explicitly, just like task switches.
This we can certainly do, but it doesn't help, since we still need to satisfy causality. The point is that we don't have to do that: the user can do the synchronization themselves. As long as we allow finalizers to run on the same thread/task that's currently running code (which can be top-level code), it would be possible for the user to synchronize with that code and expect certain effects to be visible based on transitivity. Also note that such synchronization doesn't even require atomics, since both run on the same thread, which is an observable fact that the code can and sometimes has to rely on (a good example being a recursive lock). This makes such synchronization impossible to detect (it can be any heap read/write).

Adding threading makes things much worse, since the finalizer can synchronize using proper atomics/locks (which they do need sometimes) with another thread that defines a new method. That means the top-level loop can be synchronized with a finalizer that synchronizes with another thread that defines a new method, and none of the above can actually deal with this.

I would also point out that I doubt just being able to compile a loop in the same world is going to help much. The loop has to not reference any non-const global for this to be remotely inferable (otherwise we would have gotten significantly fewer complaints about slow top-level code). Assuming that the type of a global doesn't change will also not be valid due to finalizers, and I think forbidding finalizers from mutating global variables is not a good idea.
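A hypothetical illustration of the hazard described above: a finalizer mutates a global that a top-level loop reads. Since finalizers may run on the current thread at essentially any allocation or GC point, compiled top-level code cannot assume the global's value is stable.

```julia
# A finalizer mutating global state that top-level code observes.
mutable struct Token end

flag = false
t = Token()
finalizer(t) do _
    global flag = true   # runs whenever the GC collects `t`
end
t = nothing              # drop the last reference

while !flag              # top-level loop observing the finalizer's effect
    GC.gc()              # GC activity can trigger the finalizer
end
```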
Top-level expressions are special because they are able to cause certain side-effects that strongly affect program behavior and that ordinary expressions can't cause. These are:

1. Changes to global bindings (`global`, `const`, `import`, `using`)

Fortunately, that's all. (Type definitions could be on this list, but the only tricky part of their behavior is assigning a global name, which is equivalent to item 1.)
The most desirable behavior for any statements not inside methods is "interpreter-like": all statements should see all effects of all previous statements. Unfortunately this is at odds with compiling fast code, which would like to assume a fixed world state (i.e. world counter). This tension has led to several bugs and tricky design decisions, such as:
Code behaving differently based on whether it gets compiled: #2586 #24566
Top-level code breaking inference's fixed-world assumption: #24316
Problems with binding effects: #18933 #22984 #12010
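A hypothetical sketch of the "interpreter-like" expectation above: each top-level statement should see the effects of the one before it.

```julia
# Sequential top-level statements with a method redefinition between calls.
f() = 1
a = f()    # expected: 1
f() = 2
b = f()    # expected: 2 -- but a compiler that fixed the world at the
           # start of this block could still call the old method
```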
Three general kinds of solutions are possible: (1) change top-level semantics to enable optimizations, (2) make optimizations sound with respect to top-level semantics, or (3) don't optimize top-level expressions.

(1) is likely to be unpopular, since inference's usual fixed-world assumption would lead to surprises such as top-level code not seeing the effects of the statements just before it. That leads us to (2). But it wouldn't be able to optimize very much: consider a call like `f(1)`, where we can't reasonably assume anything about what it does.

That brings us to (3). In practice, it is basically never useful to infer and compile a top-level expression. The only exceptions are casual benchmarking (`@time`), and maybe an occasional long-running loop in the REPL. So as a compromise, for a long time we've been compiling top-level expressions if they have loops. But that's a rather blunt instrument, since compiling loops that e.g. call `eval` is not worthwhile, and in other cases is unsound (#24316).

I'm not sure what to do, but overall I think we should optimize top-level code much less often (which might improve load times as well). One idea is to add a `@compile` or `@fast` macro that turns off top-level semantics, basically equivalent to wrapping the code in a 0-arg function and calling it.

Thoughts?
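The proposed macro could be sketched roughly as follows (hypothetical; no such macro exists): wrap the expression in a zero-argument function and call it, so the body is compiled under ordinary fixed-world method semantics rather than top-level semantics.

```julia
# Sketch of a `@fast` macro: the body becomes the body of an anonymous
# 0-arg function, which is then immediately called.
macro fast(ex)
    quote
        (function ()
            $(esc(ex))
        end)()
    end
end

# Usage: the loop now runs inside a compiled function body.
@fast begin
    s = 0
    for i in 1:1000
        s += i
    end
    s
end
```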