Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Prevent tainting native code loading from propagating (#53457)
When we use options like code coverage, we can't use the native code present in the cache file since it is not instrumented. PR #52123 introduced the capability of skipping the native code during loading, but created the issue that subsequent packages could have an explicit or implicit dependency on the native code. PR #53439 tainted the current process by setting `use_sysimage_native_code`, but this flag is propagated to subprocesses and lead to a regression in test time. Move this to a process local flag to avoid the regression. In the future we might be able to change the calling convention for cross-image calls to `invoke(ci::CodeInstance, args...)` instead of `ci.fptr(args...)` to handle native code not being present. --------- Co-authored-by: Jameson Nash <vtjnash@gmail.com>
- Loading branch information
b8a0a39
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Executing the daily package evaluation, I will reply here when finished:
@nanosoldier
runtests(isdaily = true)
b8a0a39
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.
b8a0a39
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nanosoldier
runbenchmarks(ALL, isdaily = true)
b8a0a39
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your job failed.
b8a0a39
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aviatesk can you fix the bug in inference here:
b8a0a39
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this was a bug exposed by #53219. Previously to that commit, CodeInfo which had not been optimized would be not be copied into the cache because they did not get the inferred flag set on them. That PR removed that flag in favor of assuming it was always set, a faulty assumption which causes IRInterpreter to encounter unoptimized IR, a situation that it wasn't written defensively to handle, which causes it to crash, as seen here.
Seems the most direct fix may be to revert the deletion of that field, to undo this regression, which might also fix the performance issues noticed on that PR also (#53459)
b8a0a39
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the debugging effort. That explanation makes sense. I've run into something similar problem myself (JuliaDebug/JuliaInterpreter.jl#611 (comment)). And I confirmed the benchmark runs without problem by applying the following diff and revising afterwards:
Meanwhile, I see no issue with reviving the
inferred
field. Moreover, we could use a format likestate::UInt8
, where0x00
represents "lowered",0x01
does "inferred" (after abstract interpretation only), and so0x02
does "optimized"?b8a0a39
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was mistaken.
slottypes
turns out to be alwaysnothing
inCodeInfo
decompressed fromjl_uncompress_ir
, meaning the above diff effectively disabled irinterp, which is why there seemed to be no issue.The real issue stems from
abs_call
disablingmay_optimize
. Due to theCodeInstance
refactor, the removal of theci.inferred
check atjulia/base/compiler/typeinfer.jl
Line 341 in aecd8fd
CodeInfo
to be cached forInferenceBenchmarker
. However, irinterp fundamentally requires optimized IR, creating a problem (since it's currently assumed that all cachedCodeInstances
are optimized)."b8a0a39
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are basically 2 issues here: 1 is that cache_the_tree is true here when it should be false (because it is not optimized IR). The second, semi-related issue, is that irinterp doesn't seem to have a state bit that checks whether irinterp is possible to run. I think bringing back
inferred
would fix the first issue (although their might be other fields that are equivalent to checking if it is optimized, that punning makes it more complicated to make changes later). The second issue perhaps is best suited to having a bit of state for that case as well, but it also happens that currentlyinferred::Bool
was exactly equivalent anyways.