-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor CodeInfo/CodeInstance separation and interfaces #53219
Conversation
9371cb3
to
d0e03c7
Compare
This will conflict with #52233 which I am hoping to merge in the next few days. I went through this PR and I didn't see anything to bad, only my unease about uncached CodeInstances :) |
Planning to conduct a thorough review around tomorrow. On a broader note, maybe relevant to Valentin's PR as well, it might be worth discussing whether this change should be incorporated into v1.11. While I agree that external uses of the |
We're still before the feature freeze, so I think it makes sense to merge these kinds of internals-breaking changes. Otherwise, we'll merge it right after the feature freeze and we'll need to maintain both versions through the full RC period. |
d0e03c7
to
1a1de9c
Compare
Rebased onto master after #52233 merge |
jinx, I also just rebased this locally ;). Let's see if we ended up in the same place ;). |
No diff modulo whitespace, so I'll just switch to this. |
There's some more cleanup required here post-rebase. Are you working on that, or should I go ahead? |
Go ahead :) |
base/reflection.jl
Outdated
@@ -1314,7 +1314,7 @@ uncompressed_ir(m::Method) = isdefined(m, :source) ? _uncompressed_ir(m, m.sourc | |||
isdefined(m, :generator) ? error("Method is @generated; try `code_lowered` instead.") : | |||
error("Code for this Method is not available.") | |||
_uncompressed_ir(m::Method, s::CodeInfo) = copy(s) | |||
_uncompressed_ir(m::Method, s::String) = ccall(:jl_uncompress_ir, Any, (Any, Ptr{Cvoid}, Any), m, C_NULL, s)::CodeInfo | |||
_uncompressed_ir(m::Method, s::String) = ccall(:jl_uncompress_ir, Any, (Any, Any), m, s)::CodeInfo | |||
_uncompressed_ir(ci::Core.CodeInstance, s::String) = ccall(:jl_uncompress_ir, Any, (Any, Any, Any), ci.def.def::Method, ci, s)::CodeInfo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be much safer not to change this API, and just ignore the extra argument. I want to be able to use it in the future to improve compression ratios anyways
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll keep the change to the C API, but I'll leave the julia side API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems like unnecessary code churn. Can we just not include either change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the point of having C API with unused arguments. External packages shouldn't be calling this directly anyway. If they are that's a bug.
# Even if we already have a CI for this, it's possible that the new CI has more | ||
# information (E.g. because the source was limited before, but is no longer - this | ||
# happens during bootstrap). In that case, allow the result to be recached. | ||
if result.src === nothing || (ci.inferred !== nothing || ci.invoke != C_NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why these particular conditions? Is this what already_inferred_quick_test
is already supposed to be checking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already_inferred_quick_test
handles the case where you're trying to cache the same MI twice in the same call to typeinf_ext (it just checks mi.inInference
). However, this case is different. Here we check whether we need to upgrade the CodeInstance that's already in the cache to add source to it. This can happen because the last time we inferred it, we couldn't save the source because one of our (unused) body statements had a limited rettype. The condition is essentially ci_has_abi
. I should probably replace it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that both already_inferred_quick_test
and the cache override here are both unique to the bootstrapping process, consolidating this logic within already_inferred_quick_test(interp, mi, result)
might be beneficial?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the cache override is unique to the bootstrapping process. You can create the same situation at any time by first inferring something that will poison the the code and then it gets re-inferred when the execution actually reaches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pair of conditions though (ci.inferred !== nothing || ci.invoke != C_NULL)
seems to be not something that has a well-defined meaning in a multithread program (e.g. it introduces a data-race here).
src/codegen.cpp
Outdated
@@ -6804,7 +6806,7 @@ static Function* gen_cfun_wrapper( | |||
|
|||
jl_codectx_t ctx(M->getContext(), params); | |||
ctx.f = cw; | |||
ctx.world = world; | |||
ctx.min_world = ctx.max_world = world; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a little bit tricky to have world
in both params and ctx
, since it leads to odd cases like this. It might be easier to only have them in ctx
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean only have them in the params, since the ctx
has a reference to that?
I see what you mean - will delete from the params struct.
Removing |
It's an internal field like the rest of the fields removed in this PR. Cthulhu's reliance on it is a bug that we'll fix. |
It is documented as being part of the API here: Lines 723 to 725 in 02699bb
If it was a problem for other reasons, we could argue to make a breaking change to remove that, but as it is, it is simple to restore that field without affecting the rest of this PR. |
That's developer documentation, which does not constrain the interface. We do need to update that documentation, both for this change and several others in the past few months. The problem with that field is that there's no good semantic point at which to set it and that any user of the field is likely doing the wrong thing. Before this PR it was just being set at some random point in inference. |
JL_GC_PUSH1(&forced_ci); | ||
if (forced_ci) { | ||
// Force compile of this codeinst even though it already has an ->invoke | ||
_jl_compile_codeinst(forced_ci, NULL, *jl_ExecutionEngine->getContext()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this codeinst is not in a cache, that means it is not legal to call this method, as this method pushes the object into the global debuginfo / JIT structs and later returns those, which can result in GC use-after free bugs
int should_skip_inference = !jl_is_method(mi->def.method) || jl_symbol_name(mi->def.method->name)[0] == '@'; | ||
|
||
if (!should_skip_inference) { | ||
codeinst = jl_type_infer(mi, world, 0, SOURCE_MODE_ABI); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to have introduced a datarace on the mi->inInference
field, which seems likely to lead to incorrect compilation and bad performance of threaded code. It was previously protected by the codegen lock.
""" | ||
function ci_has_abi(code::CodeInstance) | ||
ci_has_source(code) && return true | ||
return code.invoke !== C_NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are monotonic loads, so you've introduced a data-race here also. I think you wanted an :acquire
on one of these (may this one) to fix that, so that it must observe this being set. And also switch the order of the checks, so that it correctly pairs with the release store.
In such a case, it will first set the `invoke` field to a method that | ||
will block the thread until compilation is completed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, the comment here mentions "first", which seems to intend to imply a happens-before, but the code doesn't have that
This drops the unnecessary breaking change from #53219 by re-adding the optional `parent` field to CodeInfo. A few months ago, I had actually already put together a branch that also fixed Keno's complaints about how it was not set reliably, so I copied that code here also, so that it should get set more reliably whenever a CodeInfo is associated with a MethodInstance (either because it called `retrieve_code_info` to get IR from the Method or `uncompress_ir` to get it from the inference cache). This does not entirely fix Cthulhu's test errors, since I don't see any particular reason to re-introduce the other fields (min_world, max_world, inferred, edges, method_for_inference_limit_heuristics) that got removed or are now set incorrectly, and most errors appear to be instead related to the `Expr(:boundscheck, true/false)` change. However, they could be trivially re-added back as placeholders, if someone encounters a broken package that still needs them.
return CodeInstance(mi, cache_owner(interp), Any, Any, nothing, src, Int32(0), | ||
get_inference_world(interp), get_inference_world(interp), | ||
UInt32(0), UInt32(0), nothing, UInt8(0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These values appear to be mostly set incorrectly now, which can lead to compile corruption (e.g. #51872), as well as now failing to handle the case when retrieve_code_info
may fail and return nothing instead of CodeInfo. The old logic had a helper function jl_get_codeinst_for_src
which used to handle this case, so that function could either be un-deleted, or that operation could be added as an alternative way to call the CodeInstance
constructor, since it needs identical handling to how result
would be treated later.
return CodeInstance(mi, cache_owner(interp), Any, Any, nothing, src, Int32(0), | |
get_inference_world(interp), get_inference_world(interp), | |
UInt32(0), UInt32(0), nothing, UInt8(0)) | |
CodeInstance(interp, src, can_discard_trees=) |
It looks like this causes some fairly extreme performance issues (up to 50x longer inference times), though curiously also sometimes provides up to a 5x speed up |
When we use options like code coverage, we can't use the native code present in the cache file since it is not instrumented. PR #52123 introduced the capability of skipping the native code during loading, but created the issue that subsequent packages could have an explicit or implicit dependency on the native code. PR #53439 tainted the current process by setting `use_sysimage_native_code`, but this flag is propagated to subprocesses and lead to a regression in test time. Move this to a process local flag to avoid the regression. In the future we might be able to change the calling convention for cross-image calls to `invoke(ci::CodeInstance, args...)` instead of `ci.fptr(args...)` to handle native code not being present. --------- Co-authored-by: Jameson Nash <vtjnash@gmail.com>
…3219) The `CodeInfo` type is one of the oldest types in the system and has grown a bit of cruft. In particular, the `rettype`, `inferred`, `parent`, `edges`, `min_world`, `max_world` fields are not used for the original purpose of representing code, but for one or more of (in decreasing order of badness): 1. Smuggling extra results from inference into the compiler 2. Sumggling extra arguments into OpaqueClosure constructors 3. Passing extra information from generated functions to inference The first of these points in particular causes a fair bit of mixup between caching concerns and compiler concerns and results in external abstract interpreters maintainging their own dummy CodeInfos, just to comply with the interface. Originally, I just wanted to clean up that bit, but it didn't really make sense as a standalone piece, so this PR is more comprehensive. In particular, this PR: 1. Removes the `parent`, `inferred` and `rettype` fields of `CodeInfo`. They are largely vestigal and code accessing these is probably doing the wrong thing. They should instead be looking at either the CodeInstance or remembering the query that was asked of the cache in the first place. 2. Makes `edges`, `min_world` and `max_world` used for generated functions only. All other uses were replaced by appropriate queries on the CodeInstance. In particular, inference no longer sets these. In the future we may want to consider removing these also and having generated functions return some other object, but that is a topic to revisit once the broader compiler plugins landscape is more clear. 3. Makes the external type inference interface return `CodeInstance` rather than `CodeInfo`. This results in a lot of cleanup, because many functions had multiple code paths, some for CodeInstance and others for fallback to inference/CodeInfo. This is all cleaned up now. If you don't have a CodeInstance, you can ask inference for one. This CodeInstance may or may not be in the cache, but you can look at its types, compile it, etc. 4. Moves the main inference entrypoint out of the codegen library. There is still a little bit of entangelement, but this makes codegen much more of an independent system that you give a CodeInstance and it just fills in the invoke pointer for. With these changes, only the third use of the above mentioned fields remains. The overall theme here is decoupling. Over time, various parties have wanted to use the julia compiler with custom IR datastructure, backend code generators, caches, etc. This doesn't quite get us all the way there, but makes inference and codegen much more independent with a clear IR-format-independent interface (CodeInstance). --------- Co-authored-by: Valentin Churavy <v.churavy@gmail.com>
…ang#53393) This drops the unnecessary breaking change from JuliaLang#53219 by re-adding the optional `parent` field to CodeInfo. A few months ago, I had actually already put together a branch that also fixed Keno's complaints about how it was not set reliably, so I copied that code here also, so that it should get set more reliably whenever a CodeInfo is associated with a MethodInstance (either because it called `retrieve_code_info` to get IR from the Method or `uncompress_ir` to get it from the inference cache). This does not entirely fix Cthulhu's test errors, since I don't see any particular reason to re-introduce the other fields (min_world, max_world, inferred, edges, method_for_inference_limit_heuristics) that got removed or are now set incorrectly, and most errors appear to be instead related to the `Expr(:boundscheck, true/false)` change. However, they could be trivially re-added back as placeholders, if someone encounters a broken package that still needs them.
) Continuing the development in #53219, according to the plan in https://hackmd.io/@vtjnash/codeinstances, this further separates the meaning behind CodeInfo and CodeInstance, such that CodeInstance can only be used as a call target, and cannot be used for code generation, while CodeInfo can only be used to generate code (or for reflection on what code would be generated), and cannot be used as a call target. Basically, the eventual idea is that CodeInfo will only show up now as an input (e.g. for doing inlining or codegen) and is ephemeral, while a CodeInstance is what shows up in a cache (e.g. as a callable object).
This is the best way I could think of to determine whether an OC contains inferred code after JuliaLang/julia#53219. Also deletes a fastpath we had for OCs containing inferred code in `maybe_evaluate_builtin` as it doesn't really seem necessary and we would otherwise scan the IR twice for OCs we do want to interpret.
This is the best way I could think of to determine whether an OC contains inferred code after JuliaLang/julia#53219. Also deletes a fastpath we had for OCs containing inferred code in `maybe_evaluate_builtin` as it doesn't really seem necessary and we would otherwise scan the IR twice for OCs we do want to interpret.
The
CodeInfo
type is one of the oldest types in the system and has grown a bit of cruft. In particular, therettype
,inferred
,parent
,edges
,min_world
,max_world
fields are not used for the original purpose of representing code, but for one or more of (in decreasing order of badness):The first of these points in particular causes a fair bit of mixup between caching concerns and compiler concerns and results in external abstract interpreters maintainging their own dummy CodeInfos, just to comply with the interface. Originally, I just wanted to clean up that bit, but it didn't really make sense as a standalone piece, so this PR is more comprehensive.
In particular, this PR:
Removes the
parent
,inferred
andrettype
fields ofCodeInfo
. They are largely vestigal and code accessing these is probably doing the wrong thing. They should instead be looking at either the CodeInstance or remembering the query that was asked of the cache in the first place.Makes
edges
,min_world
andmax_world
used for generated functions only. All other uses were replaced by appropriate queries on the CodeInstance. In particular, inference no longer sets these. In the future we may want to consider removing these also and having generated functions return some other object, but that is a topic to revisit once the broader compiler plugins landscape is more clear.Makes the external type inference interface return
CodeInstance
rather thanCodeInfo
. This results in a lot of cleanup, because many functions had multiple code paths, some for CodeInstance and others for fallback to inference/CodeInfo. This is all cleaned up now. If you don't have a CodeInstance, you can ask inference for one. This CodeInstance may or may not be in the cache, but you can look at its types, compile it, etc.Moves the main inference entrypoint out of the codegen library. There is still a little bit of entangelement, but this makes codegen much more of an independent system that you give a CodeInstance and it just fills in the invoke pointer for.
With these changes, only the third use of the above mentioned fields remains.
The overall theme here is decoupling. Over time, various parties have wanted to use the julia compiler with custom IR datastructure, backend code generators, caches, etc. This doesn't quite get us all the way there, but makes inference and codegen much more independent with a clear IR-format-independent interface (CodeInstance).