WIP: World-age partition bindings #54654

Keno · 2024-06-02T21:49:58Z

This implements world-age partitioning of bindings as proposed in #40399. In effect, much like methods, the global view of bindings now depends on your currently executing world. This means that const bindings can now have different values in different worlds. In principle it also means that regular global variables could have different values in different worlds, but there is currently no case where the system does this.

Motivation

The reasons for this change are manifold:

The primary motivation is to permit Revise to redefine structs. This has been a feature request since the very begining of Revise (redefining struct timholy/Revise.jl#18) and there have been numerous attempts over the past 7 years to address this, as well as countless duplicate feature request. A past attempt to implement the necessary julia support in Support type renaming #22721 failed because the consequences and semantics of re-defining bindings were not sufficiently worked out. One way to think of this implementation (at least with respect to types) is that it provides a well-grounded implementation of Support type renaming #22721.
A secondary motivation is to make const-redefinition no longer UB (although const redefinition will still have a significant performance penalty, so it is not recommended). See e.g. the full discussion in Add devdocs on UB #54099 and Behavior of reassignment to a const #38584.
Not currently implemented, but this mechanism can be used to re-compile code where bindings are introduced after the first compile, which is a common performance trap for new users (Track backedges through not-yet-defined bindings? #53958).
Not currently implemented, but this mechanism can be used to clarify the semantics of bindings import and resolution to address issues like compiling top-level expressions forces global binding resolution #14055.

Implementation

In this PR:

Binding gets min_world/max_world fields like CodeInstance
Various lookup functions walk this linked list using the current task world_age as a key
Inference accumulates world bounds as it would for methods
Upon binding replacement, we walk all methods in the system, invalidating those whose uninferred IR references the replaced GlobalRef
One primary complication is that our IR definition permits const globals in value position, but if binding replacement is permitted, the validity of this may change after the fact. To address this, there is a helper in Core.Compiler that gets invoked in the type inference world and will rewrite the method source to be legal in all worlds.
A new @world macro can be used to access bindings from old world ages. This is used in printing for old objects.
The const-override behavior was changed to only be permitted at toplevel. The warnings about it being UB was removed.

Of particular note, this PR does not include any mechanism for invalidating methods whose signatures were created using an old Binding (or types whose fields were the result of a binding evaluation). There was some discussion among the compiler team of whether such a mechanism should exist in base, but the consensus was that it should not. In particular, although uncommon, a pattern like:

f() = Any
g(::f()) = 1
f() = Int

Does not redefine g. Thus to fully address the Revise issue, additional code will be required in Revise to track the dependency of various signatures and struct definitions on bindings.

Demo

julia> struct Foo
               a::Int
       end

julia> g() = Foo(1)
g (generic function with 1 method)

julia> g()
Foo(1)

julia> f(::Foo) = 1
f (generic function with 1 method)

julia> fold = Foo(1)
Foo(1)

julia> struct Foo
               a::Int
               b::Int
       end

julia> g()
ERROR: MethodError: no method matching Foo(::Int64)
The type `Foo` exists, but no method is defined for this combination of argument types when trying to construct it.

Closest candidates are:
  Foo(::Int64, ::Int64)
   @ Main REPL[6]:2
  Foo(::Any, ::Any)
   @ Main REPL[6]:2

Stacktrace:
 [1] g()
   @ Main ./REPL[2]:1
 [2] top-level scope
   @ REPL[7]:1

julia> f(::Foo) = 2
f (generic function with 2 methods)

julia> methods(f)
# 2 methods for generic function "f" from Main:
 [1] f(::Foo)
     @ REPL[8]:1
 [2] f(::@world(Foo, 0:26898))
     @ REPL[4]:1

julia> fold
@world(Foo, 0:26898)(1)

Performance consideration

On my machine, the validation required upon binding replacement for the full system image takes about 200ms. With CedarSim loaded (I tried OmniPackage, but it's not working on master), this increases about 5x. That's a fair bit of compute, but not the end of the world. Still, Revise may have to batch its validation. There may also be opportunities for performance improvement by operating on the compressed representation directly.

Semantic TODO

Do we want to change the resolution time of bindings to (semantically) resolve them immediately?
Do we want to introduce guard bindings when inference assumes the absence of a binding?
When (if ever) do globals get declared implicitly?

Implementation TODO

Keno · 2024-06-04T18:21:41Z

Summarizing discussion with @JeffBezanson (only part of it) @StefanKarpinski @vtjnash @topolarity @gbaraldi @oscardssmith from today about the remaining semantic questions:

Semantic TODO

When does binding resolution happen semantically?

Example 1

Right now, the precise point-in-time of binding resolution is ill-defined. To illustrate this, consider:

module Exporter
	export foo
	foo = 1
end

using .Exporter
f() = foo

julia> f()
1

julia> global foo = 2
ERROR: cannot assign a value to imported variable Exporter.foo from module Main
Stacktrace:
 [1] top-level scope
   @ REPL[5]:1

julia> global foo = 2
2

julia> f()
2

but also

julia> code_typed(f)
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = Main.foo::Any
└──      return %1
) => Any

julia> global foo = 2
ERROR: cannot assign a value to imported variable Exporter.foo from module Main
Stacktrace:
 [1] top-level scope
   @ REPL[5]:1

Basically, right now bindings get resolved whenever anything in the system happens to look at a binding, but since the running-or-not of inference is outside the semantics of the language, these semantics are ill-defined.

Example 2

Similar for binding ambiguousness:

module Exporter1
	export foo
	foo = 1
end

module Exporter2
	export foo
	foo = 2
end
using .Exporter1

with

julia> foo
1

julia> using .Exporter2
WARNING: using Exporter2.foo in module Main conflicts with an existing identifier.

julia> foo
1

julia> using .Exporter1

julia> using .Exporter2

julia> foo
WARNING: both Exporter2 and Exporter1 export "foo"; uses of it in module Main must be qualified
ERROR: UndefVarError: `foo` not defined in `Main`
Hint: It looks like two or more modules export different bindings with this name, resulting in ambiguity. Try explicitly importing it from a particular module, or qualifying the name with the module it should come from.

(similar to example 1, the explicit reference of foo is not necessary, anything in the system that might resolve the binding counts).

Proposed semantics

The proposed semantics are (and this was not fully spelled out in the discussion, so there may be some further debate on this) that bindings resolve (in decreasing order of priority)

The most recently declared (in world age terms) global/const binding, unless explicitly deleted.
The most recent (in world age terms) explicit import using Foo: x, unless explicitly deleted.
An implicit import from all using'd modules available in the world age.

Note 1: Only toplevel global declares new bindings. A function level global declares a new (Any-typed) binding only if there was no previous top-level global.
Note 2: global x (even at top level) does not declare a new binding if previously declared global, otherwise equivalent to global x::Any.
Note 3: global x::T does not declare a new binding if x is already a global in the current module and T is egal to the currently declared type of x.

To see concrete effects, the assignment in Example 1, global foo = 2 becomes allowed. f() subsequently returns 2. In the first case of Example 2, the second use of foo will give the same ambiguous error as the first example does.

Overall, these semantics remove any consideration of code-execution order (with the exception of the ordering of world-age-incrementing top-level declarations) and should be a lot clearer. There should also be no change in binding resolution in cases that are not currently errors or warnings. Cases that are may change slightly, but as discussed above, we currently don't actually guarantee those resolutions, because binding resolution may happen at any time.

The one wrinkle here is that currently, there is precisely one case where bindings are introduced by non-toplevel code as discovered in #54607. We discussed this somewhat extensively. This change was introduced in 1.9 to allow modification of existing globals in other modules, but accidentally also permitted the creation of new bindings. The overall consensus was the independent of any semantics changes here, we need to correct this oversight, I will be submitting a PR shortly to attempt to correct this issue for 1.11, though we may have to go through a deprecation since it was in the system for two releases. Regardless, it shouldn't be an impediment for this change.

What to do about replacement of mutable bindings?

The semantics of binding replacement for const bindings are fairly clear and match the semantics of method replacement reasonably closely: The values that you see are the values that happened to be assigned when the world age is captured. However, this issue becomes trickier with mutable bindings. Here are some examples (I'll be using opaque closures as the canonical world age capture mechanism, but feel free to substitute tasks or whatever other world-age capture mechanism you prefer).

Example

For example, what does the following do:

global x::Integer = 1
old_age = get_world_counter()
oc = @opaque ()->(global x = x+1)
global x::Int = 3

Now, consider the following:

@world(x, old_age)
oc();
x
@world(x, old_age)

As implemented in this PR, bindings are fully partitioned, so this would give 1 2 3 2. However, the concern is e.g. with Revise, binding replacement may not always be fully obvious and users may not understand why e.g. long running tasks suddenly stop updating globals.

Other options

For completeness, I will list all the options we discussed, although some of these are probably bad:

Suggestion 1: Merge bindings with egal metadata across world ages.

This is not quite relevant to the example, but there was a proposal that in:

global x::Int # World Age 1
global x::Float64 # World Age 2
global x::Int # World Age 3

the bindings in world ages 1 and 3 should alias. I think we largely discarded this proposal as

Not addressing the problem fully and
Being confusing to users, as narrowing/widening does not get these semantics, so for people who don't know the precise semantics here, when bindings are shared or not can be confusing.

Suggestion 2: setglobal! assigns in every compatible world age

Basically, the semantics here would be getglobal reads the last setglobal! that assigned a value of a compatible type. So the result in the above example would change to 3 4 4 4, but in this slightly modification:

global x::Float64 = 1
old_age = get_world_counter()
oc = @opaque ()->(global x = x+1)
global x::Int = 3

you would get 1.0, 2.0, 3, 2.0 as in the current semantics.

We liked this semantically, but were concerned about the difficulty of implementation.

Suggestion 3: Writing outdated mutable bindings is an error

Relatively straightforward. If you replace a binding, then from that point on, all code running in old world ages will error upon assignment. In our running example, we would have 1 OldBindingError 3 1.
There are two primary downside here:

an additional pointer-sized load on every global assignment, but that's likely reasonable and if you really wanted to could probably be fixed by rewriting the binding pointer.
Any setglobal! call can never be proven nothrow

@topolarity raised the point that it seems odd to disallow this at top level, while permitting mutations through mutable objects, but that same concern applies to mutable values accessed through const

Suggestion 4: Do that, but also error on reads

Basically, as soon as you replace a mutable binding, the old one becomes toxic and loudly errors. In our running example, this gives OldBindingError, OldBindingError, 3, OldBindingError. This is quite aggressive, but I do think it is acceptable, because it requires the combination of code that:

Runs in an old world
Makes use of global mutable bindings
Has those bindings replaced

This situation is not super common, particularly since global mutable state is generally discouraged in Julia (as in other languages).

An open question is whether we want to extend this behavior to const bindings with mutable type.

Suggestion 5 [late submission]: type assert (using the old binding type) on read and write

Similar to suggestion 4, but rather than erroring unconditionally, old world bindings would gain typeasserts (on read with the old type, on write with the new type). global x::T is equivalent to global x::T = convert(T, @world(x, get_world_counter()). Should also still be compatible with suggestion 2.

Conclusion

I think we arrived at starting with suggestion 4. It's the most conservative and we should explicitly reserve the potential of revisiting the error cases in the future. Suggestion 2 was also well liked, but implementation difficulty was a concern. Additionally, with the indicated semantic reservation, switching from suggestion 4 to suggestion 2 is feasible, but not vice versa.

Eager resolution of bindings?

@JeffBezanson is concerned about losing the redefinition error in the following case:

f() = sin(1)
f()
sin = 4

also known as the missing=false issue. There were several proposals to improve this:

Eagerly resolving whether a binding is imported to a global and disallowing switching this until the module is closed (otherwise as above).
Have an opt-in mode that disallows shadowing imported bindings entirely

Should we do guard entries?

I originally made this a semantic question, although it's really more of a performance consideration. That said, with the changes to binding resolution discussed above, I believe guard entries are required for correctness anyway, so I think this question is moot.

As discussed in [1], the implicit creation of bindings through the setglobal! intrinsic was accidentally added in 1.9 unintentionally and will be removed (ideally) or at the very least deprecated in 1.11. The recommended replacement syntax is `Core.eval(mod, Expr(:global, sym))` to introduce the binding and `invokelatest(setglobal!, mod, sym, val)` to set it. The invokelatest is not presently required, but may be required for JuliaLang/julia#54654, so it's included in the recommendation. [1] JuliaLang/julia#54607

PR #44231 (part of Julia 1.9) introduced the ability to modify globals with `Mod.sym = val` syntax. However, the intention of this syntax was always to modify *existing* globals in other modules. Unfortunately, as implemented, it also implicitly creates new bindings in the other module, even if the binding was not previously declared. This was not intended, but it's a bit of a syntax corner case, so nobody caught it at the time. After some extensive discussions and taking into account the near future direction we want to go with bindings (#54654 for both), the consensus was reached that we should try to undo the implicit creation of bindings (but not the ability to assign the *value* of globals in other modules). Note that this was always an error until Julia 1.9, so hopefully it hasn't crept into too many packages yet. We'll see what pkgeval says. If use is extensive, we may want to consider a softer removal strategy. Across base and stdlib, there's two cases affected by this change: 1. A left over debug statement in `precompile` that wanted to assign a new variable in Base for debugging. Removed in this PR. 2. Distributed wanting to create new bindings. This is a legimitate use case for wanting to create bindings in other modules. This is fixed in JuliaLang/Distributed.jl#102. As noted in that PR, the recommended replacement where implicit binding creation is desired is: ``` Core.eval(mod, Expr(:global, sym)) invokelatest(setglobal!, mod, sym, val) ``` The `invokelatest` is not presently required, but may be needed by #54654, so it's included in the recommendation now. Fixes #54607

vchuravy · 2024-06-05T00:53:11Z

I am not sure about any resolution that will cause code running to error.

I do think it is acceptable, because it requires the combination of code that:

Runs in an old world

Makes use of global mutable bindings

Has those bindings replaced

This situation is not super common, particularly since global mutable state is generally discouraged in Julia (as in other languages).

I am not convinced that this situation isn't common and it would prohibit any use of global mutable bindings by any code that executes in a separate/frozen world-ages. We might in due time want to be able to execute "compilation unit" that are fully separated from the rest of the system.

However, the concern is e.g. with Revise, binding replacement may not always be fully obvious and users may not understand why e.g. long running tasks suddenly stop updating globals.

I do understand the concerns, but for me the conclusion seems more to be that globals must be accessed in the world-age of the task / or long running tasks must be restarted.

I am not convinced that the Revise use-case outweighs the statement: "Old code must be able to keep running"

Keno · 2024-06-05T01:07:14Z

I am not convinced that this situation isn't common and it would prohibit any use of global mutable bindings by any code that executes in a separate/frozen world-ages. We might in due time want to be able to execute "compilation unit" that are fully separated from the rest of the system.

Pretty much yes. I will admit that I generally think mutable globals are the wrong tool for anything beyond basic repl usage and I think of frozen world ages as a somewhat advanced feature, so I don't mind putting the complexity there as much. That said, separate compilation units can of course have their own entirely separate semantics and not observe global assignments from outside at all.

"Old code must be able to keep running"

What about Suggestion 5 where old code does mostly keep running, unless you assign a value to the global that is incompatible with the type restriction that was active in the captured world age. That seems like it would mostly let old code keep running unless you're changing something massively about the globals.

As discussed in [1], the implicit creation of bindings through the setglobal! intrinsic was accidentally added in 1.9 unintentionally and will be removed (ideally) or at the very least deprecated in 1.11. The recommended replacement syntax is `Core.eval(mod, Expr(:global, sym))` to introduce the binding and `invokelatest(setglobal!, mod, sym, val)` to set it. The invokelatest is not presently required, but may be required for JuliaLang/julia#54654, so it's included in the recommendation. [1] JuliaLang/julia#54607

vchuravy · 2024-06-05T01:26:09Z

It feels weird to add extra runtime overhead to a language feature. In some sense I would want to keep the current semantics that long-running tasks must opt-in to seeing "new"/"unexpected" state with invokelatest.

global running::Bool =  true

@spawn begin
      while running
          # 
      end
end

global running::Int

That is the scenario we are worried about? If I now change running my task will never finish.

For me that is equivalent to

const running =  Ref{Bool}(true)

@spawn begin
      while running[]
          # 
      end
end

const running = Ref{Int}(0)

Of course you raised this point in your proposal:

An open question is whether we want to extend this behavior to const bindings with mutable type.

For me information shouldn't travel back in time and thus anything we do here can't impact code executing in a prior world-age. I would rather make the rule simple: "If you modify a binding, that modification is only visible from here on out" and not have complicated rules about unifying bindings across time.

The "right" way fro me to write long-running tasks for that in a hot reloading scenario is probably:

global running::Bool = true
function execute()
     # work
     return running
end

@spawn while true
           invokelates(execute) || break
end

Keno · 2024-06-05T01:31:49Z

It feels weird to add extra runtime overhead to a language feature
Yes, but the overhead is actually less than you might expect, because precompile can already unset bindings, which is somewhat equivalent.

That is the scenario we are worried about? If I now change running my task will never finish.

Yes, or even something more benign like global running::Union{Int, Bool} where the user would be confused by setting running = false after that declaration does not terminate the loop.

For me that is equivalent to
I would rather make the rule simple: "If you modify a binding, that modification is only visible from here on out" and not have complicated rules about unifying bindings across time.

Yes, that was also what I originally wanted to do and is what is implemented in this PR, but you and I both have a very sophisticated understanding of world ages. I fear that most users may not.

Keno · 2024-06-05T01:34:39Z

If you modify a binding

One of the biggest problems I have is that "modifying a binding" is not necessarily an explicit operation. From the users perspective, they just changed the type on a binding, or even worse in the:

struct Foo
   a::Int # changed to `Integer` by the user using Revise
end
global foo::Foo = Foo(1)

case, the user may not be touching the binding at all at the source level, but revise still has to do the rebinding.

vchuravy · 2024-06-05T01:53:42Z

Absolutely! For me the question is if the Revise use-case trumps :

No additional run-time costs
Code that ran once in a world-age will not break, e.g. can we keep freezing world-ages

For me 2. is more important than the Revise use-case. We should empower Revise as much as possible, but also acknowledge that it can't do a perfect job with top-level state.

Keno · 2024-06-05T02:04:08Z

I don't think it's really about Revise. Revise is fine with either semantics. The question is more about the semantics of old-world code, for which there are two competing priorities:

It should keep running as is
It should match the mental model of the users

For 1, I really don't think that the typeassert solution is that bad. In the pre-typed-globals world (which is still extremely common), people would regularly do things like while running::Bool; end. Doing that implicitly is pretty much the semantics of Suggestion 5 with identical old-world behavior and I think it's a lot simpler of a mental model, because it does not require explaining to users the precise extents of world ages.

Keno · 2024-06-05T02:11:22Z

Said another way, I think the issue is mostly about type restriction on globals. If we didn't have that feature, I think the answer would be fairly obvious that there's only one mutable location for a binding (even if you can shadow it with a const for some subset of the world age). So then the question is if the type restriction feature is compelling enough of a reason to world-age fracture by type restriction semantically, and it just seems too niche for that to be realistic.

I think the semantics of:

There's only one global location for a given binding
Optionally you may declare a (world-age-fractured) type for the global which introduces implicit converts and typeasserts.

seems like a very simple mental model.

StefanKarpinski · 2024-06-05T14:32:01Z

I think I like that last approach, but I want to be sure what it means. Is the idea:

There is exactly one current value for a mutable global binding
If the binding in the current world looks like x::T then we guarantee that getting x returns a value of type T by e.g. doing convert(T, currentvalue(:x))::T
If this conversion is an error in the world age where an assignment happens, it's an immediate error and the assignment fails (the old value is retained)
If this conversion is not an error in the world age where an assignment happens, a new "true value" is established and used everywhere
In worlds where the true value can be converted to the expected type, the conversion is performed and the converted value is returned upon access to the global
In worlds where the true value cannot be converted to the expected type, accessing the global is an error.

Is that what you have in mind here, @Keno? One thing to consider is: if the user writes x::T = v do we consider v to be the true value of the global or do we consider convert(T, v)::T to be the true value?

vchuravy · 2024-06-05T15:06:09Z

Thinking about it some more this morning, I think "Suggestion 5" may be workable. typeassert on read and write with the type being world-age based.

It allows changing global running::Bool to global running::Nothing and then back, once the user realized the mistake.

Keno · 2024-06-05T16:03:47Z

There is exactly one current value for a mutable global binding

Yes

If the binding in the current world looks like x::T then we guarantee that getting x returns a value of type T by e.g. doing convert(T, currentvalue(:x))::T

I wasn't suggesting the additional convert on access (only assignment), since we don't usually convert on access. Although arguably we don't have to, since the can never have a mismatch.

If this conversion is an error in the world age where an assignment happens, it's an immediate error and the assignment fails (the old value is retained)

Yes

If this conversion is not an error in the world age where an assignment happens, a new "true value" is established and used everywhere

Yes

In worlds where the true value can be converted to the expected type, the conversion is performed and the converted value is returned upon access to the global

Yes, modulo above question on convert-on-access

In worlds where the true value cannot be converted to the expected type, accessing the global is an error.

Yes

Keno · 2024-06-05T16:18:16Z

only assignment

Thinking about this some more (and consistent with what I wrote yesterday), I don't think we can introduce old-world converts to/from the new type either on access or on write, because in general the new type may be defined in a new world, so the convert is likely to fail. Of course, we could implicitly transition to the latest world, but I think that's too magical. I think the only semantics that are sensible are:

# M.x
getglobal(M, :x)::get_binding_type(M, :x)
# M.x = val
setglobal!(M, :x, convert(get_binding_type(M, :x), val)::invokelatest(get_binding_type, M, :x))

Where get_binding_type looks up the binding type for running world and get set/getglobal are semantically untyped (for these purposes anyway, those type asserts are probably implicit in those intrinsics).

PR #44231 (part of Julia 1.9) introduced the ability to modify globals with `Mod.sym = val` syntax. However, the intention of this syntax was always to modify *existing* globals in other modules. Unfortunately, as implemented, it also implicitly creates new bindings in the other module, even if the binding was not previously declared. This was not intended, but it's a bit of a syntax corner case, so nobody caught it at the time. After some extensive discussions and taking into account the near future direction we want to go with bindings (#54654 for both), the consensus was reached that we should try to undo the implicit creation of bindings (but not the ability to assign the *value* of globals in other modules). Note that this was always an error until Julia 1.9, so hopefully it hasn't crept into too many packages yet. We'll see what pkgeval says. If use is extensive, we may want to consider a softer removal strategy. Across base and stdlib, there's two cases affected by this change: 1. A left over debug statement in `precompile` that wanted to assign a new variable in Base for debugging. Removed in this PR. 2. Distributed wanting to create new bindings. This is a legimitate use case for wanting to create bindings in other modules. This is fixed in JuliaLang/Distributed.jl#102. As noted in that PR, the recommended replacement where implicit binding creation is desired is: ``` Core.eval(mod, Expr(:global, sym)) invokelatest(setglobal!, mod, sym, val) ``` The `invokelatest` is not presently required, but may be needed by #54654, so it's included in the recommendation now. Fixes #54607

StefanKarpinski · 2024-06-05T18:18:29Z

The reason I mentioned convert-on-access is because you could have a situation like this:

global x::Any     # world 1
global x::String  # world 2
global x::Float64 # world 3
global x::Int     # world 4

What happens when one does x = 1 in world 4? If you had conversion on-access (or conversion on assignment to all compatible types), then when accessing x in world 3, you would get 1.0. I think that doing the convert eagerly versus lazily with caching would be largely equivalent. Either way, accessing x after that in world 2 would be an error. But one of the subtleties I wanted to tease out is what happens if you do x = big(1) in world 4 and then access x in world 1? Does world 1 see big(1) or Int(1)?

Keno · 2024-06-05T18:26:32Z

I think doing anything beyond the standard conversion on assignment we do right now is too magical. convert does not preserve object identity, so if you have a mutable object, different accesses to the objects no longer alias, and it all just becomes very confusing. I think it's fine in your example, for world 3 accesses to error as well.

StefanKarpinski · 2024-06-05T18:31:47Z

So basically:

Convert in the world where assignment happens
If the resulting value is type-compatible with an older world, also bind it there
In worlds where the new value is not type compatible, access becomes an error

A couple of clarifying questions:

What if assignment occurs in a world that is not the latest? Error?
Should we stop scanning historical worlds at the first failed assignment or continue?

Keno · 2024-06-05T18:44:18Z

Basically, the way to think about it is that there is only one storage location for Mod.sym, semantically typed by the binding type in the latest world age.

Convert in the world where assignment happens

yes

If the resulting value is type-compatible with an older world, also bind it there

In worlds where the new value is not type compatible, access becomes an error

There's only one storage location - if the actual stored value is not compatible with the binding type that was declared in a previous world age, access becomes an error.

A couple of clarifying questions:

What if assignment occurs in a world that is not the latest? Error?

Convert according to the old binding type and then attempt to assign (without convert) into the global. If the type (after convert in the old world) is not compatible with the latest declared type (without any convert), this is an error.

Should we stop scanning historical worlds at the first failed assignment or continue?

There's no scanning of historical worlds, the typeassert is on access.

PR #44231 (part of Julia 1.9) introduced the ability to modify globals with `Mod.sym = val` syntax. However, the intention of this syntax was always to modify *existing* globals in other modules. Unfortunately, as implemented, it also implicitly creates new bindings in the other module, even if the binding was not previously declared. This was not intended, but it's a bit of a syntax corner case, so nobody caught it at the time. After some extensive discussions and taking into account the near future direction we want to go with bindings (#54654 for both), the consensus was reached that we should try to undo the implicit creation of bindings (but not the ability to assign the *value* of globals in other modules). Note that this was always an error until Julia 1.9, so hopefully it hasn't crept into too many packages yet. We'll see what pkgeval says. If use is extensive, we may want to consider a softer removal strategy. Across base and stdlib, there's two cases affected by this change: 1. A left over debug statement in `precompile` that wanted to assign a new variable in Base for debugging. Removed in this PR. 2. Distributed wanting to create new bindings. This is a legimitate use case for wanting to create bindings in other modules. This is fixed in JuliaLang/Distributed.jl#102. As noted in that PR, the recommended replacement where implicit binding creation is desired is: ``` Core.eval(mod, Expr(:global, sym)) invokelatest(setglobal!, mod, sym, val) ``` The `invokelatest` is not presently required, but may be needed by #54654, so it's included in the recommendation now. Fixes #54607

topolarity · 2024-06-05T21:28:38Z

If the type (after convert in the old world) is not compatible with the latest declared type (without any convert), this is an error.

Does this have to be an error? I would have expected this to be an error upon access in the new world, rather than modification in the old one.

Otherwise, it seems like global x::Int has to be an error if the contents of x are incompatible with the new type, even though the variable may be initialized before any use in the new world.

Keno · 2024-06-05T21:31:39Z

Does this have to be an error? I would have expected this to be an error upon access in the new world, rather than modification in the old one.

It is feasible to make it symmetric, but then every new world access will need to have a type-assert on read, even if the binding hasn't been replaced, which has performance implications. I also think I like the semantics of erroring in the old world better, but I'm open to discussion.

Otherwise, it seems like global x::Int has to be an error if the contents of x are incompatible with the new type, even though the variable may be initialized before any use in the new world.

It tries to convert, and is an error if not possible.

Keno · 2024-06-05T21:36:34Z

but then every new world access will need to have a type-assert on read

I guess we could cache the typeassert for the newest world along with the world age and collapse the check into one.

So yeah, I think either is probably feasible. Your proposal does have the advantage of letting old code keep running if the new code never actually touches the global.

@isdefined

This is a prepratory commit for #54654 to change the lowering of `const` and typed globals to be compatible with the new semantics. Currently, we lower `const a::T = val` to: ``` const a global a::T a = val ``` (which further expands to typed-globals an implicit converts). This works, because, under the hood, our const declarations are actually assign-once globals. Note however, that this is not syntactically reachable, since we have a parse error for plain `const a`: ``` julia> const a ERROR: ParseError: # Error @ REPL[1]:1:1 const a └─────┘ ── expected assignment after `const` Stacktrace: [1] top-level scope @ none:1 ``` However, this lowering is not atomic with respect to world age. The semantics in #54654 require that the const-ness and the value are established atomically (with respect to world age, potentially on another thread) or undergo invalidation. To resolve this issue, this PR changes the lowering of `const a::T = val` to: ``` let local a::T = val const (global a) = a end ``` where the latter is a special syntax form `Expr(:const, GlobalRef(,:a), :a)`. A similar change is made to const global declarations, which previously lowered via intrinsic, i.e. `global a::T = val` lowered to: ``` global a Core.set_binding_type!(Main, :a, T) _T = Core.get_binding_type(Main, :a) if !isa(val, _T) val = convert(_T, val) end a = val ``` This changes the `set_binding_type!` to instead be a syntax form `Expr(:globaldecl, :a, T)`. This is not technically required, but we currently do not use intrinsics for world-age affecting side-effects anywhere else in the system. In particular, after #54654, it would be illegal to call `set_binding_type!` in anything but top-level context. Now, we have discussed in the past that there should potentially be intrinsic functions for global modifications (method table additions, etc), currently only reachable through `Core.eval`, but such an intrinsic would require semantics that differ from both the current `set_binding_type!` and the new `:globaldecl`. Using an Expr form here is the most consistent with our current practice for these sort of things elsewhere and accordingly, this PR removes the intrinsic. Note that this PR does not yet change any syntax semantics, although there could in principle be a reordering of side-effects within an expression (e.g. things like `global a::(@isdefined(a) ? Int : Float64)` might behave differently after this commit. However, we never defined the order of side effects (which is part of what this is cleaning up, although, I am not formally defining any specific ordering here either - #54654 will do some of that), and that is not a common case, so this PR should be largely considered non-semantic with respect to the syntax change.

@isdefined

This is a prepratory commit for #54654 to change the lowering of `const` and typed globals to be compatible with the new semantics. Currently, we lower `const a::T = val` to: ``` const a global a::T a = val ``` (which further expands to typed-globals an implicit converts). This works, because, under the hood, our const declarations are actually assign-once globals. Note however, that this is not syntactically reachable, since we have a parse error for plain `const a`: ``` julia> const a ERROR: ParseError: # Error @ REPL[1]:1:1 const a └─────┘ ── expected assignment after `const` Stacktrace: [1] top-level scope @ none:1 ``` However, this lowering is not atomic with respect to world age. The semantics in #54654 require that the const-ness and the value are established atomically (with respect to world age, potentially on another thread) or undergo invalidation. To resolve this issue, this PR changes the lowering of `const a::T = val` to: ``` let local a::T = val const (global a) = a end ``` where the latter is a special syntax form `Expr(:const, GlobalRef(,:a), :a)`. A similar change is made to const global declarations, which previously lowered via intrinsic, i.e. `global a::T = val` lowered to: ``` global a Core.set_binding_type!(Main, :a, T) _T = Core.get_binding_type(Main, :a) if !isa(val, _T) val = convert(_T, val) end a = val ``` This changes the `set_binding_type!` to instead be a syntax form `Expr(:globaldecl, :a, T)`. This is not technically required, but we currently do not use intrinsics for world-age affecting side-effects anywhere else in the system. In particular, after #54654, it would be illegal to call `set_binding_type!` in anything but top-level context. Now, we have discussed in the past that there should potentially be intrinsic functions for global modifications (method table additions, etc), currently only reachable through `Core.eval`, but such an intrinsic would require semantics that differ from both the current `set_binding_type!` and the new `:globaldecl`. Using an Expr form here is the most consistent with our current practice for these sort of things elsewhere and accordingly, this PR removes the intrinsic. Note that this PR does not yet change any syntax semantics, although there could in principle be a reordering of side-effects within an expression (e.g. things like `global a::(@isdefined(a) ? Int : Float64)` might behave differently after this commit. However, we never defined the order of side effects (which is part of what this is cleaning up, although, I am not formally defining any specific ordering here either - #54654 will do some of that), and that is not a common case, so this PR should be largely considered non-semantic with respect to the syntax change.

@isdefined

This is a prepratory commit for #54654 to change the lowering of `const` and typed globals to be compatible with the new semantics. Currently, we lower `const a::T = val` to: ``` const a global a::T a = val ``` (which further expands to typed-globals an implicit converts). This works, because, under the hood, our const declarations are actually assign-once globals. Note however, that this is not syntactically reachable, since we have a parse error for plain `const a`: ``` julia> const a ERROR: ParseError: # Error @ REPL[1]:1:1 const a └─────┘ ── expected assignment after `const` Stacktrace: [1] top-level scope @ none:1 ``` However, this lowering is not atomic with respect to world age. The semantics in #54654 require that the const-ness and the value are established atomically (with respect to world age, potentially on another thread) or undergo invalidation. To resolve this issue, this PR changes the lowering of `const a::T = val` to: ``` let local a::T = val const (global a) = a end ``` where the latter is a special syntax form `Expr(:const, GlobalRef(,:a), :a)`. A similar change is made to const global declarations, which previously lowered via intrinsic, i.e. `global a::T = val` lowered to: ``` global a Core.set_binding_type!(Main, :a, T) _T = Core.get_binding_type(Main, :a) if !isa(val, _T) val = convert(_T, val) end a = val ``` This changes the `set_binding_type!` to instead be a syntax form `Expr(:globaldecl, :a, T)`. This is not technically required, but we currently do not use intrinsics for world-age affecting side-effects anywhere else in the system. In particular, after #54654, it would be illegal to call `set_binding_type!` in anything but top-level context. Now, we have discussed in the past that there should potentially be intrinsic functions for global modifications (method table additions, etc), currently only reachable through `Core.eval`, but such an intrinsic would require semantics that differ from both the current `set_binding_type!` and the new `:globaldecl`. Using an Expr form here is the most consistent with our current practice for these sort of things elsewhere and accordingly, this PR removes the intrinsic. Note that this PR does not yet change any syntax semantics, although there could in principle be a reordering of side-effects within an expression (e.g. things like `global a::(@isdefined(a) ? Int : Float64)` might behave differently after this commit. However, we never defined the order of side effects (which is part of what this is cleaning up, although, I am not formally defining any specific ordering here either - #54654 will do some of that), and that is not a common case, so this PR should be largely considered non-semantic with respect to the syntax change. Also fixes #54787 while we're at it.

This is a re-worked extraction of #54654, adjusted to support the new semantics arrived at over the course of that thread. Note that this is the data-structure change only. The semantics in this PR still match current master to the greatest extent possible. In particular, this does not fully implement thread-safety of the old semantics, since that works differently after #54654 (this PR essentially implements the old semantics by mutating in place data structures that are supposed to be immutable), though note that there were some thread safety concerns in the existing also. The core idea here is to split `Binding` in two: A new `Binding` with minimal data and a `BindingPartition` that holds all data that is world-age partitioned. In the present PR, these are always in 1:1 correspondednce, but after #54654, there will be multiple `BindingPartition`s for every `Binding`. Essentially the `owner` and `ty` fields have been merged into a new `restriction` field of `BindingPartition`, which may also hold the value of a constant (consistent with the final semantics reached in #54654). The non-partitioned binding->value field is now used exclusively for non-constant globals. The disambiguation for how to interpret the `restriction` field happens via flags. `->imported` grew to 2 bits and can now be one of `NONE`/`IMPLICIT`/ `EXPLICIT`/`GUARD`. `GUARD` corresponds to the old `b->owner == NULL` case. `NONE` corresponds to the old `b->owner == b` case, while IMPLICIT/EXPLICIT correspond to `b->owner != b` and the old `imported` flag. Other than that, the behavior of the flags is unchanged. Additionally, fields are provided for `min_world`/`max_world`/`next` and a `declared` flag in `Binding`, but these are unused in this PR and prepratory only. Because of above mentioned thread-safety concerns, I am not anticipating merging this PR by itself, but I think it is a good review partition to cut down on the number of changes in the main PRs.

Currently we error when attempting to serialize Bindings that do not beloing to the incremental module (GlobalRefs have special logic to avoid looking at the binding field). With #54654, Bindings will show up in more places, so let's just unique them properly by their module/name identity. Of course, we then have two objects so serialized (both GlobalRef and Binding), which suggests that we should perhaps finish the project of unifying them. This is not currently possible, because the existence of a binding object in the binding table has semantic content, but this will change with #54654, so we can do such a change thereafter.

This is a follow up to resolve a TODO left in #54773 as part of preparatory work for #54654. Currently, our lowering for type definition contains an early `isdefined` that forces a decision on binding resolution before the assignment of the actual binding. In the current implementation, this doesn't matter much, but with #54654, this would incur a binding invalidation we would like to avoid. To get around this, we extend the (internal) `isdefined` form to take an extra argument specifying whether or not to permit looking at imported bindings. If not, resolving the binding is not required semantically, but for the purposes of type definition (where assigning to an imported binding would error anyway), this is all we need.

Currently we error when attempting to serialize Bindings that do not beloing to the incremental module (GlobalRefs have special logic to avoid looking at the binding field). With #54654, Bindings will show up in more places, so let's just unique them properly by their module/name identity. Of course, we then have two objects so serialized (both GlobalRef and Binding), which suggests that we should perhaps finish the project of unifying them. This is not currently possible, because the existence of a binding object in the binding table has semantic content, but this will change with #54654, so we can do such a change thereafter.

This is a follow up to resolve a TODO left in #54773 as part of preparatory work for #54654. Currently, our lowering for type definition contains an early `isdefined` that forces a decision on binding resolution before the assignment of the actual binding. In the current implementation, this doesn't matter much, but with #54654, this would incur a binding invalidation we would like to avoid. To get around this, we extend the (internal) `isdefined` form to take an extra argument specifying whether or not to permit looking at imported bindings. If not, resolving the binding is not required semantically, but for the purposes of type definition (where assigning to an imported binding would error anyway), this is all we need.

timholy · 2024-07-11T20:40:31Z

Upon discussion at JuliaCon, this seems more workable than I feared. My one remaining concern is whether it will still be possible to re-evaluate the same (unchanged) structure definition without redefining the binding and invalidating the old type. Revise currently does a lot of that en route to its quest to compute accurate method signatures (which often involve creating new structs).

Currently if you re-evaluate the same method definition it does invalidate the old one and create a new (equivalent) one. Revise works hard to avoid doing this, but doesn't take the same care for types because types are needed for signatures and re-evaluating them currently doesn't trigger any invalidation. (Just an error message if they've changed.)

bvdmitri · 2024-08-09T16:58:00Z

I didn't find it during the discussion, but I found another implicit way to resolve the binding using tab-completions:

julia> module A
           function foo end
           export foo
       end
Main.A

julia> module B
           function foo end
           export foo
       end
Main.B

julia> using .A

julia> foo(<TAB> # after pressing `TAB` simply remove the line but do not call foo

julia> using .B
WARNING: using B.foo in module Main conflicts with an existing identifier.

The warning suggests that the binding for foo has been resolved implicitly, though the value of foo itself hasn't been used in any form. Without using TAB-completion no warning is printed (which suggests that the binding hasn't been resolved yet)

julia> module A
           function foo end
           export foo
       end
Main.A

julia> module B
           function foo end
           export foo
       end
Main.B

julia> using .A

julia> using .B

I don't know if its an issue here, just an interesting observation that may be relevant to the discussion.

Keno · 2024-08-09T16:59:41Z

Yes, it was discussed. This will change to no longer implicitly resolve the binding.

This is a re-worked extraction of #54654, adjusted to support the new semantics arrived at over the course of that thread. Note that this is the data-structure change only. The semantics in this PR still match current master to the greatest extent possible. The core idea here is to split `Binding` in two: A new `Binding` with minimal data and a `BindingPartition` that holds all data that is world-age partitioned. In the present PR, these are always in 1:1 correspondednce, but after #54654, there will be multiple `BindingPartition`s for every `Binding`. Essentially the `owner` and `ty` fields have been merged into a new `restriction` field of `BindingPartition`, which may also hold the value of a constant (consistent with the final semantics reached in #54654). The non-partitioned binding->value field is now used exclusively for non-constant globals. The disambiguation for how to interpret the `restriction` field happens via flags. `->imported` grew to 2 bits and can now be one of `NONE`/`IMPLICIT`/ `EXPLICIT`/`GUARD`. `GUARD` corresponds to the old `b->owner == NULL` case. `NONE` corresponds to the old `b->owner == b` case, while IMPLICIT/EXPLICIT correspond to `b->owner != b` and the old `imported` flag. Other than that, the behavior of the flags is unchanged. Additionally, fields are provided for `min_world`/`max_world`/`next`, but these are unused in this PR and prepratory only.

Now that I've had a few months to recover from the slog of adding `BindingPartition`, it's time to renew my quest to finish #54654. This adds the basic infrastructure for having multiple partitions, including making the lookup respect the `world` argument - on-demand allocation of missing partitions, `Base.delete_binding` and the `@world` macro. Not included is any inference or invalidation support, or any support for the runtime to create partitions itself (only `Base.delete_binding` does that for now), which will come in subsequent PRs.

Keno requested a review from timholy June 2, 2024 21:50

giordano changed the title ~~WIP: World-age parition bindings~~ WIP: World-age partition bindings Jun 2, 2024

Keno mentioned this pull request Jun 4, 2024

It is now possible to create globals in a different module #54607

Closed

Keno mentioned this pull request Jun 5, 2024

Don't rely on implicit binding creation by setglobal JuliaLang/Distributed.jl#102

Merged

Keno mentioned this pull request Jun 5, 2024

Don't let setglobal! implicitly create bindings #54678

Merged

Keno mentioned this pull request Jun 23, 2024

RFC: Export versioning #54905

Open

Keno mentioned this pull request Jul 1, 2024

staticdata: Unique Bindings by mod/name #54993

Merged

Keno mentioned this pull request Jul 2, 2024

lowering: Don't resolve type bindings earlier than necessary #54999

Merged

Keno mentioned this pull request Jul 23, 2024

Canonicalize names of nested functions by keeping a more fine grained counter -- per (module, method name) pair #53719

Merged

Taaitaaiger mentioned this pull request Sep 29, 2024

Constant can be set multiple times with jl_set_const #55921

Closed

Keno mentioned this pull request Oct 18, 2024

Add basic infrastructure for binding replacement #56224

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: World-age partition bindings #54654

WIP: World-age partition bindings #54654

Keno commented Jun 2, 2024 •

edited

Loading

Keno commented Jun 4, 2024 •

edited

Loading

Semantic TODO

vchuravy commented Jun 5, 2024

Keno commented Jun 5, 2024

vchuravy commented Jun 5, 2024 •

edited

Loading

Keno commented Jun 5, 2024

Keno commented Jun 5, 2024

vchuravy commented Jun 5, 2024

Keno commented Jun 5, 2024

Keno commented Jun 5, 2024

StefanKarpinski commented Jun 5, 2024

vchuravy commented Jun 5, 2024

Keno commented Jun 5, 2024

Keno commented Jun 5, 2024

StefanKarpinski commented Jun 5, 2024

Keno commented Jun 5, 2024

StefanKarpinski commented Jun 5, 2024

Keno commented Jun 5, 2024

topolarity commented Jun 5, 2024

Keno commented Jun 5, 2024

Keno commented Jun 5, 2024

timholy commented Jul 11, 2024 •

edited

Loading

bvdmitri commented Aug 9, 2024

Keno commented Aug 9, 2024

WIP: World-age partition bindings #54654

Are you sure you want to change the base?

WIP: World-age partition bindings #54654

Conversation

Keno commented Jun 2, 2024 • edited Loading

Motivation

Implementation

Demo

Performance consideration

Semantic TODO

Implementation TODO

Keno commented Jun 4, 2024 • edited Loading

Semantic TODO

When does binding resolution happen semantically?

Example 1

Example 2

Proposed semantics

What to do about replacement of mutable bindings?

Example

Other options

Suggestion 1: Merge bindings with egal metadata across world ages.

Suggestion 2: setglobal! assigns in every compatible world age

Suggestion 3: Writing outdated mutable bindings is an error

Suggestion 4: Do that, but also error on reads

Suggestion 5 [late submission]: type assert (using the old binding type) on read and write

Conclusion

Eager resolution of bindings?

Should we do guard entries?

vchuravy commented Jun 5, 2024

Keno commented Jun 5, 2024

vchuravy commented Jun 5, 2024 • edited Loading

Keno commented Jun 5, 2024

Keno commented Jun 5, 2024

vchuravy commented Jun 5, 2024

Keno commented Jun 5, 2024

Keno commented Jun 5, 2024

StefanKarpinski commented Jun 5, 2024

vchuravy commented Jun 5, 2024

Keno commented Jun 5, 2024

Keno commented Jun 5, 2024

StefanKarpinski commented Jun 5, 2024

Keno commented Jun 5, 2024

StefanKarpinski commented Jun 5, 2024

Keno commented Jun 5, 2024

topolarity commented Jun 5, 2024

Keno commented Jun 5, 2024

Keno commented Jun 5, 2024

timholy commented Jul 11, 2024 • edited Loading

bvdmitri commented Aug 9, 2024

Keno commented Aug 9, 2024

Keno commented Jun 2, 2024 •

edited

Loading

Keno commented Jun 4, 2024 •

edited

Loading

vchuravy commented Jun 5, 2024 •

edited

Loading

timholy commented Jul 11, 2024 •

edited

Loading