Refactor CodeInfo/CodeInstance separation and interfaces (JuliaLang#53219)

The `CodeInfo` type is one of the oldest types in the system and has
grown a bit of cruft. In particular, the `rettype`, `inferred`,
`parent`, `edges`, `min_world`, `max_world` fields are not used for the
original purpose of representing code, but for one or more of (in
decreasing order of badness):

1. Smuggling extra results from inference into the compiler
2. Smuggling extra arguments into OpaqueClosure constructors
3. Passing extra information from generated functions to inference

The first of these points in particular causes a fair bit of mixup
between caching concerns and compiler concerns and results in external
abstract interpreters maintaining their own dummy CodeInfos, just to
comply with the interface. Originally, I just wanted to clean up that
bit, but it didn't really make sense as a standalone piece, so this PR
is more comprehensive.

In particular, this PR:

1. Removes the `parent`, `inferred` and `rettype` fields of `CodeInfo`.
They are largely vestigial, and code accessing them is probably doing the
wrong thing. Such code should instead look at the CodeInstance or
remember the query that was asked of the cache in the first place.

2. Makes `edges`, `min_world` and `max_world` used for generated
functions only. All other uses were replaced by appropriate queries on
the CodeInstance. In particular, inference no longer sets these. In the
future we may want to consider removing these as well and having generated
functions return some other object, but that is a topic to revisit once
the broader compiler plugins landscape is clearer.

3. Makes the external type inference interface return `CodeInstance`
rather than `CodeInfo`. This results in a lot of cleanup, because many
functions had multiple code paths, some for CodeInstance and others for
fallback to inference/CodeInfo. This is all cleaned up now. If you don't
have a CodeInstance, you can ask inference for one. This CodeInstance
may or may not be in the cache, but you can look at its types, compile
it, etc.

4. Moves the main inference entrypoint out of the codegen library. There
is still a little bit of entanglement, but this makes codegen much more
of an independent system that you give a CodeInstance and it just fills
in the invoke pointer for.

With these changes, only the third use of the above-mentioned fields
remains.
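
One way to see the effect of items 1 and 2 on a given Julia build is to
inspect the fields of `Core.CodeInfo` directly. The sketch below relies only
on `fieldnames`; the constant `LEGACY_FIELDS` and the function
`legacy_field_status` are illustrative names, not part of this PR. The
result depends on the Julia version, so no output is shown.

```julia
# Field names listed in the commit message as no longer serving their
# original purpose. (Illustrative constant, not defined anywhere in Base.)
const LEGACY_FIELDS = (:rettype, :inferred, :parent, :edges, :min_world, :max_world)

# Report, for each of those names, whether it is still a field of
# Core.CodeInfo on the running Julia build.
function legacy_field_status()
    present = fieldnames(Core.CodeInfo)
    return [name => (name in present) for name in LEGACY_FIELDS]
end

for (name, has_field) in legacy_field_status()
    println(rpad(String(name), 10), has_field ? "present" : "absent")
end
```

On a build that includes this change, `parent`, `inferred` and `rettype`
would be expected to show up as absent, per item 1.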

The overall theme here is decoupling. Over time, various parties have
wanted to use the Julia compiler with custom IR data structures, backend
code generators, caches, etc. This doesn't quite get us all the way
there, but makes inference and codegen much more independent with a
clear IR-format-independent interface (CodeInstance).

---------

Co-authored-by: Valentin Churavy <v.churavy@gmail.com>
2 people authored and tecosaur committed Mar 4, 2024
1 parent 4ce2c5d commit b0dfc9b
Showing 37 changed files with 673 additions and 668 deletions.
14 changes: 11 additions & 3 deletions base/compiler/inferencestate.jl
@@ -318,7 +318,7 @@ mutable struct InferenceState
         dont_work_on_me = false
         parent = nothing

-        valid_worlds = WorldRange(src.min_world, src.max_world == typemax(UInt) ? get_world_counter() : src.max_world)
+        valid_worlds = WorldRange(1, get_world_counter())
         bestguess = Bottom
         exc_bestguess = Bottom
         ipo_effects = EFFECTS_TOTAL
@@ -338,13 +338,21 @@ mutable struct InferenceState
         InferenceParams(interp).unoptimize_throw_blocks && mark_throw_blocks!(src, handler_at)
         !iszero(cache_mode & CACHE_MODE_LOCAL) && push!(get_inference_cache(interp), result)

-        return new(
+        this = new(
             linfo, world, mod, sptypes, slottypes, src, cfg, method_info,
             currbb, currpc, ip, handlers, handler_at, ssavalue_uses, bb_vartables, ssavaluetypes, stmt_edges, stmt_info,
             pclimitations, limitations, cycle_backedges, callers_in_cycle, dont_work_on_me, parent,
             result, unreachable, valid_worlds, bestguess, exc_bestguess, ipo_effects,
             restrict_abstract_call_sites, cache_mode, insert_coverage,
             interp)
+
+        # Apply generated function restrictions
+        if src.min_world != 1 || src.max_world != typemax(UInt)
+            # From generated functions
+            this.valid_worlds = WorldRange(src.min_world, src.max_world)
+        end
+
+        return this
     end
 end

@@ -799,7 +807,7 @@ function IRInterpretationState(interp::AbstractInterpreter,
     method_info = MethodInfo(src)
     ir = inflate_ir(src, mi)
     return IRInterpretationState(interp, method_info, ir, mi, argtypes, world,
-                                 src.min_world, src.max_world)
+                                 code.min_world, code.max_world)
 end

 # AbsIntState
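
The `valid_worlds` change in the `InferenceState` constructor above can be
modeled in isolation. This is a minimal sketch under stated assumptions:
`ToyWorldRange` and `initial_valid_worlds` are hypothetical stand-ins, not
the compiler's `WorldRange` or any function in this commit, and the current
world counter is passed in as a plain argument instead of being read from
`get_world_counter()`.

```julia
# Toy stand-in for the compiler's WorldRange (hypothetical type).
struct ToyWorldRange
    min_world::UInt
    max_world::UInt
end

# Hypothetical helper mirroring the new constructor logic: start from
# 1:current_world, then narrow only if the source carries a restriction,
# which after this commit is expected to come from generated functions.
function initial_valid_worlds(src_min::UInt, src_max::UInt, current_world::UInt)
    valid = ToyWorldRange(UInt(1), current_world)
    if src_min != 1 || src_max != typemax(UInt)
        valid = ToyWorldRange(src_min, src_max)
    end
    return valid
end

# Source with default bounds: the range is left at 1:current_world.
unrestricted = initial_valid_worlds(UInt(1), typemax(UInt), UInt(100))

# Source with explicit bounds: the range is taken from the source.
restricted = initial_valid_worlds(UInt(5), UInt(42), UInt(100))
```

In this sketch, a source that restricts only `min_world` keeps
`max_world` at `typemax(UInt)`, as the new assignment does, whereas the old
one-line expression clamped it to the current world counter.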
9 changes: 2 additions & 7 deletions base/compiler/optimize.jl
@@ -107,19 +107,17 @@ is_declared_noinline(@nospecialize src::MaybeCompressed) =
 # OptimizationState #
 #####################

-is_source_inferred(@nospecialize src::MaybeCompressed) =
-    ccall(:jl_ir_flag_inferred, Bool, (Any,), src)
-
 function inlining_policy(interp::AbstractInterpreter,
     @nospecialize(src), @nospecialize(info::CallInfo), stmt_flag::UInt32)
     if isa(src, MaybeCompressed)
-        is_source_inferred(src) || return nothing
         src_inlineable = is_stmt_inline(stmt_flag) || is_inlineable(src)
         return src_inlineable ? src : nothing
     elseif isa(src, IRCode)
         return src
     elseif isa(src, SemiConcreteResult)
         return src
+    elseif isa(src, CodeInstance)
+        return inlining_policy(interp, src.inferred, info, stmt_flag)
     end
     return nothing
 end
@@ -222,7 +220,6 @@ end
 function ir_to_codeinf!(src::CodeInfo, ir::IRCode)
     replace_code_newstyle!(src, ir)
     widen_all_consts!(src)
-    src.inferred = true
     return src
 end

@@ -240,8 +237,6 @@ function widen_all_consts!(src::CodeInfo)
         end
     end

-    src.rettype = widenconst(src.rettype)
-
     return src
 end

2 changes: 0 additions & 2 deletions base/compiler/ssair/legacy.jl
@@ -55,8 +55,6 @@ Mainly used for testing or interactive use.
 inflate_ir(ci::CodeInfo, linfo::MethodInstance) = inflate_ir!(copy(ci), linfo)
 inflate_ir(ci::CodeInfo, sptypes::Vector{VarState}, argtypes::Vector{Any}) = inflate_ir!(copy(ci), sptypes, argtypes)
 function inflate_ir(ci::CodeInfo)
-    parent = ci.parent
-    isa(parent, MethodInstance) && return inflate_ir(ci, parent)
     # XXX the length of `ci.slotflags` may be different from the actual number of call
     # arguments, but we really don't know that information in this case
     argtypes = Any[ Any for i = 1:length(ci.slotflags) ]