Adapt to GPUCompiler 0.18 #673
This is now mostly functioning, but performance-wise there remain some issues:
|
@vtjnash Valentin tells me that return type with world can only be inferred on main, which is causing breakages as we adapt to world. Any ideas (and/or any possibility of getting that inferred elsewhere)? |
Oh, I had already written that comment, and github didn't post it: JuliaGPU/GPUCompiler.jl#394 (comment) |
It looks like the world variation of Core.Compiler.ReturnType isn't even
inferred on master, @vtjnash?
https://github.com/JuliaLang/julia/blob/6b934f91d1b9c50c5b783b6aa36cf2648999461c/base/compiler/tfuncs.jl#L2507
On Mon, Mar 27, 2023 at 11:12 AM Jameson Nash wrote:
> Oh, I had already written that comment, and github didn't post it: JuliaGPU/GPUCompiler.jl#394 (comment)
|
Yeah, the compiler is forbidden from propagating anything about world values, since those are not constants. |
I don't understand; code_typed shows a constant int, no? |
Only because the generator lied to inference |
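To illustrate the point above that a world value is runtime state rather than a constant, here is a minimal sketch in plain Julia. The function name `answer` and the variable names are hypothetical and not part of this PR; the only library calls assumed are `Base.get_world_counter` and `Base.invoke_in_world`.

```julia
# Hypothetical example: the same call gives different results in different worlds.
answer() = 1
world_old = Base.get_world_counter()   # a world in which answer() returns 1

answer() = 2.0                         # redefinition advances the world counter
world_new = Base.get_world_counter()

# The world is an ordinary UInt value that changes while the program runs,
# so the result (and the return type) of a call depends on which world is used.
old_result = Base.invoke_in_world(world_old, answer)
new_result = Base.invoke_in_world(world_new, answer)

println((world_old < world_new, old_result, new_result))
```

Because the return type differs between the two worlds, a result that depends on a world argument can only be folded if that argument is really known, which is the concern raised in this thread.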
Trialing this out in the EnzymeInterpreter. @vtjnash, is there a reason why this is considered harmful? It partially resolves the issue (at least in nested AD, where we can use the enforced interpreter).
# call where the function is known exactly
function Core.Compiler.abstract_call_known(interp::EnzymeInterpreter, @nospecialize(f),
        arginfo::Core.Compiler.ArgInfo, si::Core.Compiler.StmtInfo, sv::Union{InferenceState, Core.Compiler.IRCode},
        max_methods::Int = isa(sv, InferenceState) ? get_max_methods(f, sv.mod, interp) : 0)
    (; fargs, argtypes) = arginfo
    la = length(argtypes)
    if Core.Compiler.is_return_type(f)
        wc = Base.get_world_counter()
        @show f, argtypes, interp.world, wc
        # Only fold when every argument, including the world, is a known constant
        if all(x->isa(x, Core.Const), argtypes)
            # Form with a function, a tuple type, and a world
            if length(argtypes) == 4 && isa(argtypes[4].val, UInt64)
                world = argtypes[4].val
                # Never fold for a world newer than the current world counter
                if world <= wc
                    res = Core.Compiler.return_type(argtypes[2].val, argtypes[3].val, world)
                    @show res
                    # NOTE: `call` does not appear to be defined in this snippet, so this
                    # line would fail if verbose_stmt_info(interp) is true
                    info = Core.Compiler.verbose_stmt_info(interp) ? Core.Compiler.MethodResultPure(ReturnTypeCallInfo(call.info)) : Core.Compiler.MethodResultPure()
                    return Core.Compiler.CallMeta(Core.Const(res), Core.Compiler.EFFECTS_TOTAL, info)
                end
            end
            # Form with a signature tuple type and a world
            if length(argtypes) == 3 && isa(argtypes[3].val, UInt64)
                world = argtypes[3].val
                if world <= wc
                    res = Core.Compiler.return_type(argtypes[2].val, world)
                    @show res
                    info = Core.Compiler.verbose_stmt_info(interp) ? Core.Compiler.MethodResultPure(ReturnTypeCallInfo(call.info)) : Core.Compiler.MethodResultPure()
                    return Core.Compiler.CallMeta(Core.Const(res), Core.Compiler.EFFECTS_TOTAL, info)
                end
            end
        end
    end
    # Otherwise fall back to the generic AbstractInterpreter method
    return Base.@invoke Core.Compiler.abstract_call_known(interp::AbstractInterpreter,
        f::Any, arginfo::Core.Compiler.ArgInfo, si::Core.Compiler.StmtInfo, sv::Union{InferenceState, Core.Compiler.IRCode}, max_methods::Int)
end |
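The guard used in the snippet above (only answer when the requested world is not newer than the current world counter) can be tried outside of inference. This is a rough sketch, assuming the Julia 1.9-era internal method `Core.Compiler.return_type(f, tt, world)` that the snippet itself calls; the helper name `return_type_if_known_world` and the function `double` are hypothetical.

```julia
# Hypothetical helper mirroring the `world <= wc` check from the snippet above.
# Core.Compiler.return_type is a compiler internal and may change between versions.
function return_type_if_known_world(@nospecialize(f), tt::DataType, world::UInt)
    # Refuse to answer for a world that does not exist yet
    world <= Base.get_world_counter() || return nothing
    return Core.Compiler.return_type(f, tt, world)
end

double(x) = 2x
current_world = Base.get_world_counter()

rt = return_type_if_known_world(double, Tuple{Int}, current_world)
future = return_type_if_known_world(double, Tuple{Int}, typemax(UInt))

println((rt, future))
```

The sketch only covers the runtime side; whether it is sound to fold such a result into a `Core.Const` during inference is exactly the question being asked here.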
At least |
The information conveyed by |
works except:
mul_kernel: Error During Test at /home/wmoses/git/Enzyme.jl/test/cuda.jl:18
Got exception outside of a @test
MethodError: no method matching Value(::Nothing; ctx::Context)
Closest candidates are:
Value(::Metadata; ctx)
@ LLVM ~/.julia/packages/LLVM/HykgZ/src/core/metadata.jl:53
Value(::Ptr{LLVM.API.LLVMOpaqueValue}) got unsupported keyword argument "ctx"
@ LLVM ~/.julia/packages/LLVM/HykgZ/src/core/value.jl:33
Stacktrace:
[1] add_kernel_state!(mod::LLVM.Module)
@ GPUCompiler ~/.julia/packages/GPUCompiler/anMCs/src/irgen.jl:562
[2] module_pass_callback(ptr::Ptr{Nothing}, data::Ptr{Nothing})
@ LLVM ~/.julia/packages/LLVM/HykgZ/src/pass.jl:19
[3] LLVMRunPassManager
@ ~/.julia/packages/LLVM/HykgZ/lib/13/libLLVM_h.jl:4898 [inlined]
[4] run!
@ ~/.julia/packages/LLVM/HykgZ/src/passmanager.jl:39 [inlined]
[5] macro expansion
@ ~/.julia/packages/GPUCompiler/anMCs/src/optim.jl:241 [inlined]
[6] macro expansion
@ ~/.julia/packages/LLVM/HykgZ/src/base.jl:102 [inlined]
[7] optimize!(job::CompilerJob, mod::LLVM.Module)
@ GPUCompiler ~/.julia/packages/GPUCompiler/anMCs/src/optim.jl:185
[8] macro expansion
@ ~/.julia/packages/GPUCompiler/anMCs/src/driver.jl:366 [inlined]
[9] macro expansion
@ ~/.julia/packages/TimerOutputs/LHjFw/src/TimerOutput.jl:253 [inlined]
[10] macro expansion
@ ~/.julia/packages/GPUCompiler/anMCs/src/driver.jl:365 [inlined]
[11] macro expansion
@ ~/.julia/packages/TimerOutputs/LHjFw/src/TimerOutput.jl:253 [inlined]
[12] macro expansion
@ ~/.julia/packages/GPUCompiler/anMCs/src/driver.jl:355 [inlined]
[13] emit_llvm(job::CompilerJob, method_instance::Any; libraries::Bool, deferred_codegen::Bool, optimize::Bool, cleanup::Bool, only_entry::Bool, validate::Bool, ctx::ThreadSafeContext)
@ GPUCompiler ~/.julia/packages/GPUCompiler/anMCs/src/utils.jl:83
[14] emit_llvm
@ ~/.julia/packages/GPUCompiler/anMCs/src/utils.jl:77 [inlined]
[15] compile(job::CompilerJob, ctx::ThreadSafeContext)
@ CUDA ~/.julia/packages/CUDA/q3GG0/src/compiler/compilation.jl:106
[16] #203
@ ~/.julia/packages/CUDA/q3GG0/src/compiler/compilation.jl:100 [inlined]
[17] ThreadSafeContext(f::CUDA.var"#203#204"{CompilerJob{PTXCompilerTarget, CUDA.CUDACompilerParams}})
@ LLVM ~/.julia/packages/LLVM/HykgZ/src/executionengine/ts_module.jl:14
[18] JuliaContext(f::CUDA.var"#203#204"{CompilerJob{PTXCompilerTarget, CUDA.CUDACompilerParams}})
@ GPUCompiler ~/.julia/packages/GPUCompiler/anMCs/src/driver.jl:74
[19] compile
@ ~/.julia/packages/CUDA/q3GG0/src/compiler/compilation.jl:99 [inlined]
[20] actual_compilation(cache::Dict{UInt64, Any}, key::UInt64, cfg::CompilerConfig{PTXCompilerTarget, CUDA.CUDACompilerParams}, ft::Type, tt::Type, world::UInt64, compiler::typeof(CUDA.compile), linker::typeof(CUDA.link))
@ GPUCompiler ~/.julia/packages/GPUCompiler/anMCs/src/cache.jl:184
[21] cached_compilation(cache::Dict{UInt64, Any}, cfg::CompilerConfig{PTXCompilerTarget, CUDA.CUDACompilerParams}, ft::Type, tt::Type, compiler::Function, linker::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/anMCs/src/cache.jl:163
[22] macro expansion
@ ~/.julia/packages/CUDA/q3GG0/src/compiler/execution.jl:310 [inlined]
[23] macro expansion
@ ./lock.jl:267 [inlined]
[24] cufunction(f::typeof(grad_mul_kernel), tt::Type{Tuple{CuDeviceVector{Float32, 1}, CuDeviceVector{Float32, 1}}}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ CUDA ~/.julia/packages/CUDA/q3GG0/src/compiler/execution.jl:306
[25] cufunction(f::typeof(grad_mul_kernel), tt::Type{Tuple{CuDeviceVector{Float32, 1}, CuDeviceVector{Float32, 1}}})
@ CUDA ~/.julia/packages/CUDA/q3GG0/src/compiler/execution.jl:303
[26] macro expansion
@ ~/.julia/packages/CUDA/q3GG0/src/compiler/execution.jl:104 [inlined]
[27] macro expansion
@ ~/git/Enzyme.jl/test/cuda.jl:24 [inlined]
[28] macro expansion
@ ~/git/Enzyme.jl/julia-1.9.0-rc1/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
[29] top-level scope
@ ~/git/Enzyme.jl/test/cuda.jl:19
[30] include(fname::String)
@ Base.MainInclude ./client.jl:478
[31] top-level scope
@ ~/git/Enzyme.jl/test/runtests.jl:1909
[32] include(fname::String)
@ Base.MainInclude ./client.jl:478
[33] top-level scope
@ none:6
[34] eval
@ ./boot.jl:370 [inlined]
[35] exec_options(opts::Base.JLOptions)
@ Base ./client.jl:280
[36] _start()
@ Base ./client.jl:522
Test Summary: | Error Total Time
mul_kernel | 1 1 10.4s |
JuliaGPU/GPUCompiler.jl#394