-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Partition bindings by world age #40399
Comments
I'm not sure we can reasonably handle that. I think we don't really handle it for methods either? |
We don't handle backedges to the generated function generator, but we do handle backedes in the expanded code. Here, since we don't have the syntactic version of the generated code available, we'd need to store those as forward edges. |
This could also solve #14055 |
On certain workloads, profiling shows a surprisingly high fraction of inference time spent looking up bindings in modules. Bindings use a hash table, so they're expected to be quite fast, but certainly not zero. A big contributor to the problem is that we do basically treat it as zero, looking up bindings for GlobalRefs multiple times for each statement (e.g. in `isconst`, `isdefined`, to get the types, etc). This PR attempts to improve the situation by adding an extra field to GlobalRef that caches this lookup. This field is not serialized and if not set, we fallback to the previous behavior. I would expect the memory overhead to be relatively small, since we do intern GlobalRefs in memory to only have one per binding (rather than one per use). # Benchmarks The benchmarks look quite promising. Consider this artifical example (though it's actually not all that far fetched, given some of the generated code we see): ``` using Core.Intrinsics: add_int const ONE = 1 @eval function f(x, y) z = 0 $([:(z = add_int(x, add_int(z, ONE))) for _ = 1:10000]...) return add_int(z, y) end g(y) = f(ONE, y) ``` On master: ``` julia> @time @code_typed g(1) 1.427227 seconds (1.31 M allocations: 58.809 MiB, 5.73% gc time, 99.96% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` On this PR: ``` julia> @time @code_typed g(1) 0.503151 seconds (1.19 M allocations: 53.641 MiB, 5.10% gc time, 33.59% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` I don't expect the same speedup on other workloads, but there should be a few % speedup on most workloads still. # Future extensions The other motivation for this is that with a view towards #40399, we will want to more clearly define when bindings get resolved. The idea here would then be that binding resolution replaces generic `GlobalRefs` by GlobalRefs with a set binding cache, and any unresolved bindings would be treated conservatively by inference and optimization.
On certain workloads, profiling shows a surprisingly high fraction of inference time spent looking up bindings in modules. Bindings use a hash table, so they're expected to be quite fast, but certainly not zero. A big contributor to the problem is that we do basically treat it as zero, looking up bindings for GlobalRefs multiple times for each statement (e.g. in `isconst`, `isdefined`, to get the types, etc). This PR attempts to improve the situation by adding an extra field to GlobalRef that caches this lookup. This field is not serialized and if not set, we fallback to the previous behavior. I would expect the memory overhead to be relatively small, since we do intern GlobalRefs in memory to only have one per binding (rather than one per use). # Benchmarks The benchmarks look quite promising. Consider this artifical example (though it's actually not all that far fetched, given some of the generated code we see): ``` using Core.Intrinsics: add_int const ONE = 1 @eval function f(x, y) z = 0 $([:(z = add_int(x, add_int(z, ONE))) for _ = 1:10000]...) return add_int(z, y) end g(y) = f(ONE, y) ``` On master: ``` julia> @time @code_typed g(1) 1.427227 seconds (1.31 M allocations: 58.809 MiB, 5.73% gc time, 99.96% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` On this PR: ``` julia> @time @code_typed g(1) 0.503151 seconds (1.19 M allocations: 53.641 MiB, 5.10% gc time, 33.59% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` I don't expect the same speedup on other workloads, but there should be a few % speedup on most workloads still. # Future extensions The other motivation for this is that with a view towards #40399, we will want to more clearly define when bindings get resolved. The idea here would then be that binding resolution replaces generic `GlobalRefs` by GlobalRefs with a set binding cache, and any unresolved bindings would be treated conservatively by inference and optimization.
On certain workloads, profiling shows a surprisingly high fraction of inference time spent looking up bindings in modules. Bindings use a hash table, so they're expected to be quite fast, but certainly not zero. A big contributor to the problem is that we do basically treat it as zero, looking up bindings for GlobalRefs multiple times for each statement (e.g. in `isconst`, `isdefined`, to get the types, etc). This PR attempts to improve the situation by adding an extra field to GlobalRef that caches this lookup. This field is not serialized and if not set, we fallback to the previous behavior. I would expect the memory overhead to be relatively small, since we do intern GlobalRefs in memory to only have one per binding (rather than one per use). # Benchmarks The benchmarks look quite promising. Consider this artifical example (though it's actually not all that far fetched, given some of the generated code we see): ``` using Core.Intrinsics: add_int const ONE = 1 @eval function f(x, y) z = 0 $([:(z = add_int(x, add_int(z, ONE))) for _ = 1:10000]...) return add_int(z, y) end g(y) = f(ONE, y) ``` On master: ``` julia> @time @code_typed g(1) 1.427227 seconds (1.31 M allocations: 58.809 MiB, 5.73% gc time, 99.96% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` On this PR: ``` julia> @time @code_typed g(1) 0.503151 seconds (1.19 M allocations: 53.641 MiB, 5.10% gc time, 33.59% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` I don't expect the same speedup on other workloads, but there should be a few % speedup on most workloads still. # Future extensions The other motivation for this is that with a view towards #40399, we will want to more clearly define when bindings get resolved. The idea here would then be that binding resolution replaces generic `GlobalRefs` by GlobalRefs with a set binding cache, and any unresolved bindings would be treated conservatively by inference and optimization.
On certain workloads, profiling shows a surprisingly high fraction of inference time spent looking up bindings in modules. Bindings use a hash table, so they're expected to be quite fast, but certainly not zero. A big contributor to the problem is that we do basically treat it as zero, looking up bindings for GlobalRefs multiple times for each statement (e.g. in `isconst`, `isdefined`, to get the types, etc). This PR attempts to improve the situation by adding an extra field to GlobalRef that caches this lookup. This field is not serialized and if not set, we fallback to the previous behavior. I would expect the memory overhead to be relatively small, since we do intern GlobalRefs in memory to only have one per binding (rather than one per use). # Benchmarks The benchmarks look quite promising. Consider this artifical example (though it's actually not all that far fetched, given some of the generated code we see): ``` using Core.Intrinsics: add_int const ONE = 1 @eval function f(x, y) z = 0 $([:(z = add_int(x, add_int(z, ONE))) for _ = 1:10000]...) return add_int(z, y) end g(y) = f(ONE, y) ``` On master: ``` julia> @time @code_typed g(1) 1.427227 seconds (1.31 M allocations: 58.809 MiB, 5.73% gc time, 99.96% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` On this PR: ``` julia> @time @code_typed g(1) 0.503151 seconds (1.19 M allocations: 53.641 MiB, 5.10% gc time, 33.59% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` I don't expect the same speedup on other workloads, but there should be a few % speedup on most workloads still. # Future extensions The other motivation for this is that with a view towards #40399, we will want to more clearly define when bindings get resolved. The idea here would then be that binding resolution replaces generic `GlobalRefs` by GlobalRefs with a set binding cache, and any unresolved bindings would be treated conservatively by inference and optimization.
On certain workloads, profiling shows a surprisingly high fraction of inference time spent looking up bindings in modules. Bindings use a hash table, so they're expected to be quite fast, but certainly not zero. A big contributor to the problem is that we do basically treat it as zero, looking up bindings for GlobalRefs multiple times for each statement (e.g. in `isconst`, `isdefined`, to get the types, etc). This PR attempts to improve the situation by adding an extra field to GlobalRef that caches this lookup. This field is not serialized and if not set, we fallback to the previous behavior. I would expect the memory overhead to be relatively small, since we do intern GlobalRefs in memory to only have one per binding (rather than one per use). # Benchmarks The benchmarks look quite promising. Consider this artifical example (though it's actually not all that far fetched, given some of the generated code we see): ``` using Core.Intrinsics: add_int const ONE = 1 @eval function f(x, y) z = 0 $([:(z = add_int(x, add_int(z, ONE))) for _ = 1:10000]...) return add_int(z, y) end g(y) = f(ONE, y) ``` On master: ``` julia> @time @code_typed g(1) 1.427227 seconds (1.31 M allocations: 58.809 MiB, 5.73% gc time, 99.96% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` On this PR: ``` julia> @time @code_typed g(1) 0.503151 seconds (1.19 M allocations: 53.641 MiB, 5.10% gc time, 33.59% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` I don't expect the same speedup on other workloads, but there should be a few % speedup on most workloads still. # Future extensions The other motivation for this is that with a view towards #40399, we will want to more clearly define when bindings get resolved. The idea here would then be that binding resolution replaces generic `GlobalRefs` by GlobalRefs with a set binding cache, and any unresolved bindings would be treated conservatively by inference and optimization.
On certain workloads, profiling shows a surprisingly high fraction of inference time spent looking up bindings in modules. Bindings use a hash table, so they're expected to be quite fast, but certainly not zero. A big contributor to the problem is that we do basically treat it as zero, looking up bindings for GlobalRefs multiple times for each statement (e.g. in `isconst`, `isdefined`, to get the types, etc). This PR attempts to improve the situation by adding an extra field to GlobalRef that caches this lookup. This field is not serialized and if not set, we fallback to the previous behavior. I would expect the memory overhead to be relatively small, since we do intern GlobalRefs in memory to only have one per binding (rather than one per use). # Benchmarks The benchmarks look quite promising. Consider this artifical example (though it's actually not all that far fetched, given some of the generated code we see): ``` using Core.Intrinsics: add_int const ONE = 1 @eval function f(x, y) z = 0 $([:(z = add_int(x, add_int(z, ONE))) for _ = 1:10000]...) return add_int(z, y) end g(y) = f(ONE, y) ``` On master: ``` julia> @time @code_typed g(1) 1.427227 seconds (1.31 M allocations: 58.809 MiB, 5.73% gc time, 99.96% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` On this PR: ``` julia> @time @code_typed g(1) 0.503151 seconds (1.19 M allocations: 53.641 MiB, 5.10% gc time, 33.59% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` I don't expect the same speedup on other workloads, but there should be a few % speedup on most workloads still. # Future extensions The other motivation for this is that with a view towards #40399, we will want to more clearly define when bindings get resolved. The idea here would then be that binding resolution replaces generic `GlobalRefs` by GlobalRefs with a set binding cache, and any unresolved bindings would be treated conservatively by inference and optimization.
On certain workloads, profiling shows a surprisingly high fraction of inference time spent looking up bindings in modules. Bindings use a hash table, so they're expected to be quite fast, but certainly not zero. A big contributor to the problem is that we do basically treat it as zero, looking up bindings for GlobalRefs multiple times for each statement (e.g. in `isconst`, `isdefined`, to get the types, etc). This PR attempts to improve the situation by adding an extra field to GlobalRef that caches this lookup. This field is not serialized and if not set, we fallback to the previous behavior. I would expect the memory overhead to be relatively small, since we do intern GlobalRefs in memory to only have one per binding (rather than one per use). # Benchmarks The benchmarks look quite promising. Consider this artifical example (though it's actually not all that far fetched, given some of the generated code we see): ``` using Core.Intrinsics: add_int const ONE = 1 @eval function f(x, y) z = 0 $([:(z = add_int(x, add_int(z, ONE))) for _ = 1:10000]...) return add_int(z, y) end g(y) = f(ONE, y) ``` On master: ``` julia> @time @code_typed g(1) 1.427227 seconds (1.31 M allocations: 58.809 MiB, 5.73% gc time, 99.96% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` On this PR: ``` julia> @time @code_typed g(1) 0.503151 seconds (1.19 M allocations: 53.641 MiB, 5.10% gc time, 33.59% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` I don't expect the same speedup on other workloads, but there should be a few % speedup on most workloads still. # Future extensions The other motivation for this is that with a view towards #40399, we will want to more clearly define when bindings get resolved. The idea here would then be that binding resolution replaces generic `GlobalRefs` by GlobalRefs with a set binding cache, and any unresolved bindings would be treated conservatively by inference and optimization.
On certain workloads, profiling shows a surprisingly high fraction of inference time spent looking up bindings in modules. Bindings use a hash table, so they're expected to be quite fast, but certainly not zero. A big contributor to the problem is that we do basically treat it as zero, looking up bindings for GlobalRefs multiple times for each statement (e.g. in `isconst`, `isdefined`, to get the types, etc). This PR attempts to improve the situation by adding an extra field to GlobalRef that caches this lookup. This field is not serialized and if not set, we fallback to the previous behavior. I would expect the memory overhead to be relatively small, since we do intern GlobalRefs in memory to only have one per binding (rather than one per use). # Benchmarks The benchmarks look quite promising. Consider this artifical example (though it's actually not all that far fetched, given some of the generated code we see): ``` using Core.Intrinsics: add_int const ONE = 1 @eval function f(x, y) z = 0 $([:(z = add_int(x, add_int(z, ONE))) for _ = 1:10000]...) return add_int(z, y) end g(y) = f(ONE, y) ``` On master: ``` julia> @time @code_typed g(1) 1.427227 seconds (1.31 M allocations: 58.809 MiB, 5.73% gc time, 99.96% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` On this PR: ``` julia> @time @code_typed g(1) 0.503151 seconds (1.19 M allocations: 53.641 MiB, 5.10% gc time, 33.59% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` I don't expect the same speedup on other workloads, but there should be a few % speedup on most workloads still. # Future extensions The other motivation for this is that with a view towards #40399, we will want to more clearly define when bindings get resolved. The idea here would then be that binding resolution replaces generic `GlobalRefs` by GlobalRefs with a set binding cache, and any unresolved bindings would be treated conservatively by inference and optimization.
On certain workloads, profiling shows a surprisingly high fraction of inference time spent looking up bindings in modules. Bindings use a hash table, so they're expected to be quite fast, but certainly not zero. A big contributor to the problem is that we do basically treat it as zero, looking up bindings for GlobalRefs multiple times for each statement (e.g. in `isconst`, `isdefined`, to get the types, etc). This PR attempts to improve the situation by adding an extra field to GlobalRef that caches this lookup. This field is not serialized and if not set, we fallback to the previous behavior. I would expect the memory overhead to be relatively small, since we do intern GlobalRefs in memory to only have one per binding (rather than one per use). # Benchmarks The benchmarks look quite promising. Consider this artifical example (though it's actually not all that far fetched, given some of the generated code we see): ``` using Core.Intrinsics: add_int const ONE = 1 @eval function f(x, y) z = 0 $([:(z = add_int(x, add_int(z, ONE))) for _ = 1:10000]...) return add_int(z, y) end g(y) = f(ONE, y) ``` On master: ``` julia> @time @code_typed g(1) 1.427227 seconds (1.31 M allocations: 58.809 MiB, 5.73% gc time, 99.96% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` On this PR: ``` julia> @time @code_typed g(1) 0.503151 seconds (1.19 M allocations: 53.641 MiB, 5.10% gc time, 33.59% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` I don't expect the same speedup on other workloads, but there should be a few % speedup on most workloads still. # Future extensions The other motivation for this is that with a view towards #40399, we will want to more clearly define when bindings get resolved. The idea here would then be that binding resolution replaces generic `GlobalRefs` by GlobalRefs with a set binding cache, and any unresolved bindings would be treated conservatively by inference and optimization.
On certain workloads, profiling shows a surprisingly high fraction of inference time spent looking up bindings in modules. Bindings use a hash table, so they're expected to be quite fast, but certainly not zero. A big contributor to the problem is that we do basically treat it as zero, looking up bindings for GlobalRefs multiple times for each statement (e.g. in `isconst`, `isdefined`, to get the types, etc). This PR attempts to improve the situation by adding an extra field to GlobalRef that caches this lookup. This field is not serialized and if not set, we fallback to the previous behavior. I would expect the memory overhead to be relatively small, since we do intern GlobalRefs in memory to only have one per binding (rather than one per use). # Benchmarks The benchmarks look quite promising. Consider this artifical example (though it's actually not all that far fetched, given some of the generated code we see): ``` using Core.Intrinsics: add_int const ONE = 1 @eval function f(x, y) z = 0 $([:(z = add_int(x, add_int(z, ONE))) for _ = 1:10000]...) return add_int(z, y) end g(y) = f(ONE, y) ``` On master: ``` julia> @time @code_typed g(1) 1.427227 seconds (1.31 M allocations: 58.809 MiB, 5.73% gc time, 99.96% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` On this PR: ``` julia> @time @code_typed g(1) 0.503151 seconds (1.19 M allocations: 53.641 MiB, 5.10% gc time, 33.59% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` I don't expect the same speedup on other workloads, but there should be a few % speedup on most workloads still. # Future extensions The other motivation for this is that with a view towards #40399, we will want to more clearly define when bindings get resolved. The idea here would then be that binding resolution replaces generic `GlobalRefs` by GlobalRefs with a set binding cache, and any unresolved bindings would be treated conservatively by inference and optimization.
On certain workloads, profiling shows a surprisingly high fraction of inference time spent looking up bindings in modules. Bindings use a hash table, so they're expected to be quite fast, but certainly not zero. A big contributor to the problem is that we do basically treat it as zero, looking up bindings for GlobalRefs multiple times for each statement (e.g. in `isconst`, `isdefined`, to get the types, etc). This PR attempts to improve the situation by adding an extra field to GlobalRef that caches this lookup. This field is not serialized and if not set, we fallback to the previous behavior. I would expect the memory overhead to be relatively small, since we do intern GlobalRefs in memory to only have one per binding (rather than one per use). # Benchmarks The benchmarks look quite promising. Consider this artifical example (though it's actually not all that far fetched, given some of the generated code we see): ``` using Core.Intrinsics: add_int const ONE = 1 @eval function f(x, y) z = 0 $([:(z = add_int(x, add_int(z, ONE))) for _ = 1:10000]...) return add_int(z, y) end g(y) = f(ONE, y) ``` On master: ``` julia> @time @code_typed g(1) 1.427227 seconds (1.31 M allocations: 58.809 MiB, 5.73% gc time, 99.96% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` On this PR: ``` julia> @time @code_typed g(1) 0.503151 seconds (1.19 M allocations: 53.641 MiB, 5.10% gc time, 33.59% compilation time) CodeInfo( 1 ─ %1 = invoke Main.f(Main.ONE::Int64, y::Int64)::Int64 └── return %1 ) => Int64 ``` I don't expect the same speedup on other workloads, but there should be a few % speedup on most workloads still. # Future extensions The other motivation for this is that with a view towards #40399, we will want to more clearly define when bindings get resolved. The idea here would then be that binding resolution replaces generic `GlobalRefs` by GlobalRefs with a set binding cache, and any unresolved bindings would be treated conservatively by inference and optimization.
This PR generalizes the idea from #49199 and uses inference to analyze the types of REPL expression. This approach offers several advantages over the current `get_[value|type]`-based implementation: - The need for various special cases is eliminated, as lowering normalizes expressions, and inference handles all language features. - Constant propagation allows us to obtain accurate completions for complex expressions safely (see #36437). Analysis on arbitrary REPL expressions can be done by the following steps: - Lower a given expression - Form a top-level `MethodInstance` from the lowered expression - Run inference on the top-level `MethodInstance` This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that: - aggressively resolve global bindings to enable reasonable completions for lines like `Mod.a.|` (where `|` is the cursor position) - does not optimize the inferred code, as `REPLInterpreter` is only used to obtain the type or constant information of given expressions In particular, aggressive binding resolution presents challenges for `REPLInterpreter`'s cache validation (since #40399 hasn't been resolved yet). To avoid cache validation issue, `REPLInterpreter` only allows aggressive binding resolution for top-level frame representing REPL input code (`repl_frame`) and for child `getproperty` frames that are constant propagated from the `repl_frame`. This works, since 1.) these frames are never cached, and 2.) their results are only observed by the non-cached `repl_frame` Note that the code cache for `REPLInterpreter` is separated from the native code cache, ensuring that code caches produced by `REPLInterpreter`, where bindings are aggressively resolved and the code is not really optimized, do not affect the native code execution. A hack has also been added to avoid serializing `CodeInstances`s produced by `REPLInterpreter` during precompilation to workaround #48453. closes #36437 replaces #49199
This PR generalizes the idea from #49199 and uses inference to analyze the types of REPL expression. This approach offers several advantages over the current `get_[value|type]`-based implementation: - The need for various special cases is eliminated, as lowering normalizes expressions, and inference handles all language features. - Constant propagation allows us to obtain accurate completions for complex expressions safely (see #36437). Analysis on arbitrary REPL expressions can be done by the following steps: - Lower a given expression - Form a top-level `MethodInstance` from the lowered expression - Run inference on the top-level `MethodInstance` This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that: - aggressively resolve global bindings to enable reasonable completions for lines like `Mod.a.|` (where `|` is the cursor position) - aggressively concrete evaluates `:inconsistent` calls to provide reasonable completions for cases like `Ref(Some(42))[].|` - does not optimize the inferred code, as `REPLInterpreter` is only used to obtain the type or constant information of given expressions Aggressive binding resolution presents challenges for `REPLInterpreter`'s cache validation (since #40399 hasn't been resolved yet). To avoid cache validation issue, `REPLInterpreter` only allows aggressive binding resolution for top-level frame representing REPL input code (`repl_frame`) and for child `getproperty` frames that are constant propagated from the `repl_frame`. This works, since 1.) these frames are never cached, and 2.) their results are only observed by the non-cached `repl_frame` `REPLInterpreter` also aggressively concrete evaluate `:inconsistent` calls within `repl_frame`, allowing it to get get accurate type information about complex expressions that otherwise can not be constant folded, in a safe way, i.e. it still doesn't evaluate effectful expressions like `pop!(xs)`. Similarly to the aggressive binding resolution, aggressive concrete evaluation doesn't present any cache validation issues because `repl_frame` is never cached. Also note that the code cache for `REPLInterpreter` is separated from the native code cache, ensuring that code caches produced by `REPLInterpreter`, where bindings are aggressively resolved and the code is not really optimized, do not affect the native code execution. A hack has also been added to avoid serializing `CodeInstances`s produced by `REPLInterpreter` during precompilation to workaround #48453. closes #36437 replaces #49199
This PR generalizes the idea from #49199 and uses inference to analyze the types of REPL expression. This approach offers several advantages over the current `get_[value|type]`-based implementation: - The need for various special cases is eliminated, as lowering normalizes expressions, and inference handles all language features. - Constant propagation allows us to obtain accurate completions for complex expressions safely (see #36437). Analysis on arbitrary REPL expressions can be done by the following steps: - Lower a given expression - Form a top-level `MethodInstance` from the lowered expression - Run inference on the top-level `MethodInstance` This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that: - aggressively resolve global bindings to enable reasonable completions for lines like `Mod.a.|` (where `|` is the cursor position) - aggressively concrete evaluates `:inconsistent` calls to provide reasonable completions for cases like `Ref(Some(42))[].|` - does not optimize the inferred code, as `REPLInterpreter` is only used to obtain the type or constant information of given expressions Aggressive binding resolution presents challenges for `REPLInterpreter`'s cache validation (since #40399 hasn't been resolved yet). To avoid cache validation issue, `REPLInterpreter` only allows aggressive binding resolution for top-level frame representing REPL input code (`repl_frame`) and for child `getproperty` frames that are constant propagated from the `repl_frame`. This works, since 1.) these frames are never cached, and 2.) their results are only observed by the non-cached `repl_frame` `REPLInterpreter` also aggressively concrete evaluate `:inconsistent` calls within `repl_frame`, allowing it to get get accurate type information about complex expressions that otherwise can not be constant folded, in a safe way, i.e. it still doesn't evaluate effectful expressions like `pop!(xs)`. Similarly to the aggressive binding resolution, aggressive concrete evaluation doesn't present any cache validation issues because `repl_frame` is never cached. Also note that the code cache for `REPLInterpreter` is separated from the native code cache, ensuring that code caches produced by `REPLInterpreter`, where bindings are aggressively resolved and the code is not really optimized, do not affect the native code execution. A hack has also been added to avoid serializing `CodeInstances`s produced by `REPLInterpreter` during precompilation to workaround #48453. closes #36437 replaces #49199
This PR generalizes the idea from #49199 and uses inference to analyze the types of REPL expression. This approach offers several advantages over the current `get_[value|type]`-based implementation: - The need for various special cases is eliminated, as lowering normalizes expressions, and inference handles all language features. - Constant propagation allows us to obtain accurate completions for complex expressions safely (see #36437). Analysis on arbitrary REPL expressions can be done by the following steps: - Lower a given expression - Form a top-level `MethodInstance` from the lowered expression - Run inference on the top-level `MethodInstance` This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that: - aggressively resolve global bindings to enable reasonable completions for lines like `Mod.a.|` (where `|` is the cursor position) - aggressively concrete evaluates `:inconsistent` calls to provide reasonable completions for cases like `Ref(Some(42))[].|` - does not optimize the inferred code, as `REPLInterpreter` is only used to obtain the type or constant information of given expressions Aggressive binding resolution presents challenges for `REPLInterpreter`'s cache validation (since #40399 hasn't been resolved yet). To avoid cache validation issue, `REPLInterpreter` only allows aggressive binding resolution for top-level frame representing REPL input code (`repl_frame`) and for child `getproperty` frames that are constant propagated from the `repl_frame`. This works, since 1.) these frames are never cached, and 2.) their results are only observed by the non-cached `repl_frame` `REPLInterpreter` also aggressively concrete evaluate `:inconsistent` calls within `repl_frame`, allowing it to get get accurate type information about complex expressions that otherwise can not be constant folded, in a safe way, i.e. it still doesn't evaluate effectful expressions like `pop!(xs)`. Similarly to the aggressive binding resolution, aggressive concrete evaluation doesn't present any cache validation issues because `repl_frame` is never cached. Also note that the code cache for `REPLInterpreter` is separated from the native code cache, ensuring that code caches produced by `REPLInterpreter`, where bindings are aggressively resolved and the code is not really optimized, do not affect the native code execution. A hack has also been added to avoid serializing `CodeInstances`s produced by `REPLInterpreter` during precompilation to workaround #48453. closes #36437 replaces #49199
This PR generalizes the idea from #49199 and uses inference to analyze the types of REPL expression. This approach offers several advantages over the current `get_[value|type]`-based implementation: - The need for various special cases is eliminated, as lowering normalizes expressions, and inference handles all language features. - Constant propagation allows us to obtain accurate completions for complex expressions safely (see #36437). Analysis on arbitrary REPL expressions can be done by the following steps: - Lower a given expression - Form a top-level `MethodInstance` from the lowered expression - Run inference on the top-level `MethodInstance` This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that: - aggressively resolve global bindings to enable reasonable completions for lines like `Mod.a.|` (where `|` is the cursor position) - aggressively concrete evaluates `:inconsistent` calls to provide reasonable completions for cases like `Ref(Some(42))[].|` - does not optimize the inferred code, as `REPLInterpreter` is only used to obtain the type or constant information of given expressions Aggressive binding resolution presents challenges for `REPLInterpreter`'s cache validation (since #40399 hasn't been resolved yet). To avoid cache validation issue, `REPLInterpreter` only allows aggressive binding resolution for top-level frame representing REPL input code (`repl_frame`) and for child `getproperty` frames that are constant propagated from the `repl_frame`. This works, since 1.) these frames are never cached, and 2.) their results are only observed by the non-cached `repl_frame` `REPLInterpreter` also aggressively concrete evaluate `:inconsistent` calls within `repl_frame`, allowing it to get get accurate type information about complex expressions that otherwise can not be constant folded, in a safe way, i.e. it still doesn't evaluate effectful expressions like `pop!(xs)`. Similarly to the aggressive binding resolution, aggressive concrete evaluation doesn't present any cache validation issues because `repl_frame` is never cached. Also note that the code cache for `REPLInterpreter` is separated from the native code cache, ensuring that code caches produced by `REPLInterpreter`, where bindings are aggressively resolved and the code is not really optimized, do not affect the native code execution. A hack has also been added to avoid serializing `CodeInstances`s produced by `REPLInterpreter` during precompilation to workaround #48453. closes #36437 replaces #49199
…g#49206) This PR generalizes the idea from JuliaLang#49199 and uses inference to analyze the types of REPL expression. This approach offers several advantages over the current `get_[value|type]`-based implementation: - The need for various special cases is eliminated, as lowering normalizes expressions, and inference handles all language features. - Constant propagation allows us to obtain accurate completions for complex expressions safely (see JuliaLang#36437). Analysis on arbitrary REPL expressions can be done by the following steps: - Lower a given expression - Form a top-level `MethodInstance` from the lowered expression - Run inference on the top-level `MethodInstance` This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that: - aggressively resolve global bindings to enable reasonable completions for lines like `Mod.a.|` (where `|` is the cursor position) - aggressively concrete evaluates `:inconsistent` calls to provide reasonable completions for cases like `Ref(Some(42))[].|` - does not optimize the inferred code, as `REPLInterpreter` is only used to obtain the type or constant information of given expressions Aggressive binding resolution presents challenges for `REPLInterpreter`'s cache validation (since JuliaLang#40399 hasn't been resolved yet). To avoid cache validation issue, `REPLInterpreter` only allows aggressive binding resolution for top-level frame representing REPL input code (`repl_frame`) and for child `getproperty` frames that are constant propagated from the `repl_frame`. This works, since 1.) these frames are never cached, and 2.) their results are only observed by the non-cached `repl_frame` `REPLInterpreter` also aggressively concrete evaluate `:inconsistent` calls within `repl_frame`, allowing it to get get accurate type information about complex expressions that otherwise can not be constant folded, in a safe way, i.e. it still doesn't evaluate effectful expressions like `pop!(xs)`. Similarly to the aggressive binding resolution, aggressive concrete evaluation doesn't present any cache validation issues because `repl_frame` is never cached. Also note that the code cache for `REPLInterpreter` is separated from the native code cache, ensuring that code caches produced by `REPLInterpreter`, where bindings are aggressively resolved and the code is not really optimized, do not affect the native code execution. A hack has also been added to avoid serializing `CodeInstances`s produced by `REPLInterpreter` during precompilation to workaround JuliaLang#48453. closes JuliaLang#36437 replaces JuliaLang#49199
To remind myself what was meant here. If you have world-age partitioned bindings, you can (semantically) resolve imports eagerly and then replace them if necessary. |
This implements world-age partitioning of bindings as proposed in #40399. In effect, much like methods, the global view of bindings now depends on your currently executing world. This means that `const` bindings can now have different values in different worlds. In principle it also means that regular global variables could have different values in different worlds, but there is currently no case where the system does this. # Motivation The reasons for this change are manifold: 1. The primary motivation is to permit Revise to redefine structs. This has been a feature request since the very begining of Revise (timholy/Revise.jl#18) and there have been numerous attempts over the past 7 years to address this, as well as countless duplicate feature request. A past attempt to implement the necessary julia support in #22721 failed because the consequences and semantics of re-defining bindings were not sufficiently worked out. One way to think of this implementation (at least with respect to types) is that it provides a well-grounded implementation of #22721. 2. A secondary motivation is to make `const`-redefinition no longer UB (although `const` redefinition will still have a significant performance penalty, so it is not recommended). See e.g. the full discussion in #54099. 3. Not currently implemented, but this mechanism can be used to re-compile code where bindings are introduced after the first compile, which is a common performance trap for new users (#53958). 4. Not currently implemented, but this mechanism can be used to clarify the semantics of bindings import and resolution to address issues like #14055. # Implementation In this PR: - `Binding` gets `min_world`/`max_world` fields like `CodeInstance` - Various lookup functions walk this linked list using the current task world_age as a key - Inference accumulates world bounds as it would for methods - Upon binding replacement, we walk all methods in the system, invalidating those whose uninferred IR references the replaced GlobalRef - One primary complication is that our IR definition permits `const` globals in value position, but if binding replacement is permitted, the validity of this may change after the fact. To address this, there is a helper in `Core.Compiler` that gets invoked in the type inference world and will rewrite the method source to be legal in all worlds. - A new `@world` macro can be used to access bindings from old world ages. This is used in printing for old objects. - The `const`-override behavior was changed to only be permitted at toplevel. The warnings about it being UB was removed. Of particular note, this PR does not include any mechanism for invalidating methods whose signatures were created using an old Binding (or types whose fields were the result of a binding evaluation). There was some discussion among the compiler team of whether such a mechanism should exist in base, but the consensus was that it should not. In particular, although uncommon, a pattern like: ``` f() = Any g(::f()) = 1 f() = Int ``` Does not redefine `g`. Thus to fully address the Revise issue, additional code will be required in Revise to track the dependency of various signatures and struct definitions on bindings. # Demo ``` julia> struct Foo a::Int end julia> g() = Foo(1) g (generic function with 1 method) julia> g() Foo(1) julia> f(::Foo) = 1 f (generic function with 1 method) julia> fold = Foo(1) Foo(1) julia> struct Foo a::Int b::Int end julia> g() ERROR: MethodError: no method matching Foo(::Int64) The type `Foo` exists, but no method is defined for this combination of argument types when trying to construct it. Closest candidates are: Foo(::Int64, ::Int64) @ Main REPL[6]:2 Foo(::Any, ::Any) @ Main REPL[6]:2 Stacktrace: [1] g() @ Main ./REPL[2]:1 [2] top-level scope @ REPL[7]:1 julia> f(::Foo) = 2 f (generic function with 2 methods) julia> methods(f) # 2 methods for generic function "f" from Main: [1] f(::Foo) @ REPL[8]:1 [2] f(::@world(Foo, 0:26898)) @ REPL[4]:1 julia> fold @world(Foo, 0:26898)(1) ``` # Performance consideration On my machine, the validation required upon binding replacement for the full system image takes about 200ms. With CedarSim loaded (I tried OmniPackage, but it's not working on master), this increases about 5x. That's a fair bit of compute, but not the end of the world. Still, Revise may have to batch its validation. There may also be opportunities for performance improvement by operating on the compressed representation directly. # Semantic TODO - [ ] Do we want to change the resolution time of bindings to (semantically) resolve them immediately? - [ ] Do we want to introduce guard bindings when inference assumes the absence of a binding? # Implementation TODO - [ ] Precompile re-validation - [ ] Various cleanups in the accessors - [ ] Invert the order of the binding linked list to make the most recent one always the head of the list - [ ] CodeInstances need forward edges for GlobalRefs not part of the uninferred code - [ ] Generated function support
This implements world-age partitioning of bindings as proposed in #40399. In effect, much like methods, the global view of bindings now depends on your currently executing world. This means that `const` bindings can now have different values in different worlds. In principle it also means that regular global variables could have different values in different worlds, but there is currently no case where the system does this. The reasons for this change are manifold: 1. The primary motivation is to permit Revise to redefine structs. This has been a feature request since the very begining of Revise (timholy/Revise.jl#18) and there have been numerous attempts over the past 7 years to address this, as well as countless duplicate feature request. A past attempt to implement the necessary julia support in #22721 failed because the consequences and semantics of re-defining bindings were not sufficiently worked out. One way to think of this implementation (at least with respect to types) is that it provides a well-grounded implementation of #22721. 2. A secondary motivation is to make `const`-redefinition no longer UB (although `const` redefinition will still have a significant performance penalty, so it is not recommended). See e.g. the full discussion in #54099. 3. Not currently implemented, but this mechanism can be used to re-compile code where bindings are introduced after the first compile, which is a common performance trap for new users (#53958). 4. Not currently implemented, but this mechanism can be used to clarify the semantics of bindings import and resolution to address issues like #14055. In this PR: - `Binding` gets `min_world`/`max_world` fields like `CodeInstance` - Various lookup functions walk this linked list using the current task world_age as a key - Inference accumulates world bounds as it would for methods - Upon binding replacement, we walk all methods in the system, invalidating those whose uninferred IR references the replaced GlobalRef - One primary complication is that our IR definition permits `const` globals in value position, but if binding replacement is permitted, the validity of this may change after the fact. To address this, there is a helper in `Core.Compiler` that gets invoked in the type inference world and will rewrite the method source to be legal in all worlds. - A new `@world` macro can be used to access bindings from old world ages. This is used in printing for old objects. - The `const`-override behavior was changed to only be permitted at toplevel. The warnings about it being UB was removed. Of particular note, this PR does not include any mechanism for invalidating methods whose signatures were created using an old Binding (or types whose fields were the result of a binding evaluation). There was some discussion among the compiler team of whether such a mechanism should exist in base, but the consensus was that it should not. In particular, although uncommon, a pattern like: ``` f() = Any g(::f()) = 1 f() = Int ``` Does not redefine `g`. Thus to fully address the Revise issue, additional code will be required in Revise to track the dependency of various signatures and struct definitions on bindings. ``` julia> struct Foo a::Int end julia> g() = Foo(1) g (generic function with 1 method) julia> g() Foo(1) julia> f(::Foo) = 1 f (generic function with 1 method) julia> fold = Foo(1) Foo(1) julia> struct Foo a::Int b::Int end julia> g() ERROR: MethodError: no method matching Foo(::Int64) The type `Foo` exists, but no method is defined for this combination of argument types when trying to construct it. Closest candidates are: Foo(::Int64, ::Int64) @ Main REPL[6]:2 Foo(::Any, ::Any) @ Main REPL[6]:2 Stacktrace: [1] g() @ Main ./REPL[2]:1 [2] top-level scope @ REPL[7]:1 julia> f(::Foo) = 2 f (generic function with 2 methods) julia> methods(f) [1] f(::Foo) @ REPL[8]:1 [2] f(::@world(Foo, 0:26898)) @ REPL[4]:1 julia> fold @world(Foo, 0:26898)(1) ``` On my machine, the validation required upon binding replacement for the full system image takes about 200ms. With CedarSim loaded (I tried OmniPackage, but it's not working on master), this increases about 5x. That's a fair bit of compute, but not the end of the world. Still, Revise may have to batch its validation. There may also be opportunities for performance improvement by operating on the compressed representation directly. - [ ] Do we want to change the resolution time of bindings to (semantically) resolve them immediately? - [ ] Do we want to introduce guard bindings when inference assumes the absence of a binding? - [ ] Precompile re-validation - [ ] Various cleanups in the accessors - [ ] Invert the order of the binding linked list to make the most recent one always the head of the list - [ ] CodeInstances need forward edges for GlobalRefs not part of the uninferred code - [ ] Generated function support
I've been doing some work that has a fairly big dependency stack (100-200 packages) and as such takes a few minutes to warm up. Because of this non-Revise-able code is extremely painful, so I'd like to take another look at timholy/Revise.jl#18. This has some fairly long history of course, but to avoid getting buried in the discussion, I figured a new issue for my specific proposal would be best.
One solution that was previously discussed (in #22721) was to have backedges for bindings, but this was rejected as too expensive. My proposal aims to basically do this, but shift the cost of backedge tracking to the redefinition stage where it is more acceptable.
In particular, my proposal is:
getfield(Main, :foo)
which will get resolved via the getfield_tfunc.cc @JeffBezanson @vtjnash
If this proposal seems reasonable, I'd hope to nerdsnipe @timholy into taking charge :)
The text was updated successfully, but these errors were encountered: