Inference regression in 1.11 #55230

Closed
Sbozzolo opened this issue Jul 24, 2024 · 12 comments · Fixed by #55413
Labels
compiler:latency (Compiler latency); regression (Regression in behavior compared to a previous version); regression 1.11 (Regression in the 1.11 release); types and dispatch (Types, subtyping and method dispatch)

@Sbozzolo
Contributor

We are seeing massive latency increases in ClimaAtmos.jl with Julia 1.11, starting from the alphas and continuing in 1.11-rc1 (CliMA/ClimaAtmos.jl#3186).

ClimaAtmos.jl no longer compiles in any reasonable time with Julia 1.11-rc1. I identified the offending function as get_atmos, a simple function that returns a keyword-defined type. For the most part, this function simply converts string keywords read from YAML files to types, then returns an AtmosModel (defined below).
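
As a minimal sketch of why a string-to-type conversion of this kind tends to produce Union-typed values: the string is only known at run time, so inference can at best conclude that the result is one of several types. The type and function names below are hypothetical stand-ins, not the actual ClimaAtmos code.

```julia
# Hypothetical stand-ins for model types; not the real ClimaAtmos definitions.
struct DryModel end
struct EquilMoistModel end

# Convert a string read from a config file into a singleton value.
function moisture_from_string(s::String)
    s == "dry" && return DryModel()
    s == "equil" && return EquilMoistModel()
    return nothing
end

# Ask inference what it can prove about the return type for a String argument.
rt = only(Base.return_types(moisture_from_string, (String,)))

# Whatever inference reports has to cover all three possible results.
@show rt
@show Union{Nothing, DryModel, EquilMoistModel} <: rt
```

With many such conversions feeding one constructor call, each argument of that call can carry its own small Union.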

I used SnoopCompile (master) to try to get a little more insight. The get_atmos function, as it is, takes too long to compile (on Julia 1.10, this is instantaneous), so I had to remove part of it (as mentioned in the referenced issue). Once I do so, I see with @snoop_inference that:

InferenceTimingNode: 1.171199/288.711156 on Core.Compiler.Timings.ROOT() with 21 direct children

Plotting the inference:
(image: plot of the inference timing, not reproduced here)

All the time is spent in types.jl line 329. This line simply defines our struct:

Base.@kwdef struct AtmosModel{
    MC,
    MM,
    PM,
    CM,
    CCDPS,
    F,
    S,
    RM,
    LA,
    EXTFORCING,
    EC,
    AT,
    TM,
    EEM,
    EDM,
    ESMF,
    ESDF,
    ENP,
    EVR,
    TCM,
    NOGW,
    OGW,
    HD,
    VD,
    DM,
    SAM,
    VS,
    RS,
    ST,
    IN,
    SM,
    SA,
    NUM,
}
    model_config::MC = nothing
    moisture_model::MM = nothing
    precip_model::PM = nothing
    cloud_model::CM = nothing
    call_cloud_diagnostics_per_stage::CCDPS = nothing
    forcing_type::F = nothing
    subsidence::S = nothing
    radiation_mode::RM = nothing
    ls_adv::LA = nothing
    external_forcing::EXTFORCING = nothing
    edmf_coriolis::EC = nothing
    advection_test::AT = nothing
    tendency_model::TM = nothing
    edmfx_entr_model::EEM = nothing
    edmfx_detr_model::EDM = nothing
    edmfx_sgs_mass_flux::ESMF = nothing
    edmfx_sgs_diffusive_flux::ESDF = nothing
    edmfx_nh_pressure::ENP = nothing
    edmfx_filter::EVR = nothing
    turbconv_model::TCM = nothing
    non_orographic_gravity_wave::NOGW = nothing
    orographic_gravity_wave::OGW = nothing
    hyperdiff::HD = nothing
    vert_diff::VD = nothing
    diff_mode::DM = nothing
    sgs_adv_mode::SAM = nothing
    viscous_sponge::VS = nothing
    rayleigh_sponge::RS = nothing
    sfc_temperature::ST = nothing
    insolation::IN = nothing
    surface_model::SM = nothing
    surface_albedo::SA = nothing
    numerics::NUM = nothing
end

I have not tried without keyword arguments.
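
For readers unfamiliar with the pattern, here is a three-field miniature of the struct above (TinyModel is a hypothetical name used only for illustration). It sketches how Base.@kwdef adds a keyword constructor on top of the default positional one, so both construction styles end up at a method that takes one argument per type parameter.

```julia
# Hypothetical three-field miniature of the struct above.
Base.@kwdef struct TinyModel{A, B, C}
    a::A = nothing
    b::B = nothing
    c::C = nothing
end

# Keyword construction: unspecified fields fall back to `nothing`.
m_kw = TinyModel(; b = Val(1))

# Positional construction uses the default outer constructor,
# which takes one argument per type parameter.
m_pos = TinyModel(nothing, Val(1), nothing)

@show typeof(m_kw)
@show typeof(m_kw) == typeof(m_pos)
```

Whether dropping the keyword form changes the compile time of the real struct is not something this sketch can answer.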

I don't think that what we are doing up to this point is particularly complex or unorthodox, and most of the types involved are very simple (mostly singletons and bools, all immutable).

Originally posted by @Sbozzolo in #55171 (comment)

@KristofferC KristofferC added the compiler:latency Compiler latency label Jul 24, 2024
@KristofferC
Member

KristofferC commented Jul 24, 2024

Doing the very scientific thing of hitting Ctrl-C after some time, it seems to be stuck in subtyping:

Internal error: during type inference of
get_atmos(ClimaAtmos.AtmosConfig{Float32, ClimaParams.ParamDict{Float32}, Base.Dict{String, Any}, ClimaComms.SingletonCommsContext{ClimaComms.CPUSingleThreaded}, Tuple{String}}, ClimaAtmos.Parameters.ClimaAtmosParameters{Float32, Thermodynamics.Parameters.ThermodynamicsParameters{Float32}, RRTMGP.Parameters.RRTMGPParameters{Float32}, Insolation.Parameters.InsolationParameters{Float32}, Nothing, Nothing, CloudMicrophysics.Parameters.WaterProperties{Float32}, SurfaceFluxes.Parameters.SurfaceFluxesParameters{Float32, SurfaceFluxes.UniversalFunctions.BusingerParams{Float32}, Thermodynamics.Parameters.ThermodynamicsParameters{Float32}}, ClimaAtmos.Parameters.TurbulenceConvectionParameters{Float32}, ClimaAtmos.Parameters.SurfaceTemperatureParameters{Float32}})
Encountered unexpected error in runtime:
InterruptException()
subtype at julia/src/subtype.c:1407
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
exists_subtype at julia/src/subtype.c:1651 [inlined]
_forall_exists_subtype at julia/src/subtype.c:1682
forall_exists_subtype at julia/src/subtype.c:1696 [inlined]
ijl_subtype_env at julia/src/subtype.c:2146
jl_type_intersection_env_s at julia/src/subtype.c:4405
jl_typemap_intersection_node_visitor at julia/src/typemap.c:543 [inlined]
jl_typemap_intersection_visitor at julia/src/typemap.c:812
jl_typemap_intersection_visitor at julia/src/typemap.c:770
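
The 16 consecutive subtype_unionall frames in the trace are consistent with a type that carries many nested `where` clauses, assuming each frame handles one UnionAll layer (an interpretation of the trace, not something it states). A small sketch of how to count those layers on any type, using a hypothetical helper named unionall_depth:

```julia
# Hypothetical helper: count how many `where` layers wrap a type.
function unionall_depth(T)
    depth = 0
    while T isa UnionAll
        depth += 1
        T = T.body  # peel off one `where` layer
    end
    return depth
end

@show unionall_depth(Int)                          # no type variables
@show unionall_depth(Array)                        # Array{T, N} where N where T
@show unionall_depth(Tuple{<:Any, <:Any, <:Any})   # one variable per slot
```

A struct with 33 type parameters, like the AtmosModel above, has 33 such layers when written without parameters.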

@KristofferC
Member

I think

using ClimaAtmos
using AtmosphericProfilesLibrary 
using Interpolations
T1 = Tuple{Type{ClimaAtmos.AtmosModel{MC, MM, PM, CM, CCDPS, F, S, RM, LA, EXTFORCING, EC, AT, TM, EEM, EDM, ESMF, ESDF, ENP, EVR, TCM, NOGW, OGW, HD, VD, DM, SAM, VS, RS, ST, IN, SM, SA, NUM} where NUM where SA where SM where IN where ST where RS where VS where SAM where DM where VD where HD where OGW where NOGW where TCM where EVR where ENP where ESDF where ESMF where EDM where EEM where TM where AT where EC where EXTFORCING where LA where RM where S where F where CCDPS where CM where PM where MM where MC}, Union{ClimaAtmos.BoxModel, ClimaAtmos.PlaneModel, ClimaAtmos.SingleColumnModel, ClimaAtmos.SphericalModel, Nothing}, Union{ClimaAtmos.DryModel, ClimaAtmos.EquilMoistModel, ClimaAtmos.NonEquilMoistModel, Nothing}, Union{ClimaAtmos.Microphysics0Moment, ClimaAtmos.Microphysics1Moment, ClimaAtmos.NoPrecipitation}, Union{ClimaAtmos.DiagnosticEDMFCloud, ClimaAtmos.GridScaleCloud, ClimaAtmos.QuadratureCloud}, Union{ClimaAtmos.CallCloudDiagnosticsPerStage, Nothing}, Union{ClimaAtmos.HeldSuarezForcing, Nothing}, Union{Nothing, ClimaAtmos.Subsidence{T} where T}, Union{Nothing, ClimaAtmos.RadiationDYCOMS{Float32}, ClimaAtmos.RRTMGPInterface.AllSkyRadiation, ClimaAtmos.RRTMGPInterface.AllSkyRadiationWithClearSkyDiagnostics, ClimaAtmos.RRTMGPInterface.ClearSkyRadiation, ClimaAtmos.RRTMGPInterface.GrayRadiation, ClimaAtmos.RadiationTRMM_LBA{AtmosphericProfilesLibrary.TimeZProfile{Interpolations.Extrapolation{Float32, 2, Interpolations.GriddedInterpolation{Float32, 2, Array{Float32, 2}, Tuple{Interpolations.Gridded{Interpolations.Linear{Interpolations.Throw{Interpolations.OnGrid}}}, Interpolations.Gridded{Interpolations.Linear{Interpolations.Throw{Interpolations.OnGrid}}}}, Tuple{Base.StepRangeLen{Float32, Float64, Float64, Int64}, Array{Float32, 1}}}, Tuple{Interpolations.Gridded{Interpolations.Linear{Interpolations.Throw{Interpolations.OnGrid}}}, Interpolations.Gridded{Interpolations.Linear{Interpolations.Throw{Interpolations.OnGrid}}}}, Interpolations.Flat{Nothing}}}}}, 
Union{Nothing, ClimaAtmos.LargeScaleAdvection{_A, _B} where _B where _A}, Union{Nothing, ClimaAtmos.GCMForcing{Float64}}, Union{Nothing, ClimaAtmos.EDMFCoriolis{_A, _B, _C} where _C where _B where _A}, Any, Union{ClimaAtmos.NoGridScaleTendency, ClimaAtmos.NoSubgridScaleTendency, ClimaAtmos.UseAllTendency, Nothing}, Union{ClimaAtmos.GeneralizedEntrainment, ClimaAtmos.GeneralizedHarmonicsEntrainment, ClimaAtmos.NoEntrainment, ClimaAtmos.PiGroupsEntrainment}, Union{ClimaAtmos.ConstantAreaDetrainment, ClimaAtmos.GeneralizedDetrainment, ClimaAtmos.GeneralizedHarmonicsDetrainment, ClimaAtmos.NoDetrainment, ClimaAtmos.PiGroupsDetrainment}, Any, Any, Any, Any, Union{Nothing, ClimaAtmos.DiagnosticEDMFX{_A, _B, Float32} where _B where _A, ClimaAtmos.PrognosticEDMFX{_A, _B, Float32} where _B where _A}, Union{Nothing, ClimaAtmos.NonOrographyGravityWave{Float32}}, Union{Nothing, ClimaAtmos.OrographicGravityWave{Float32, String}}, Union{Nothing, ClimaAtmos.ClimaHyperdiffusion{_A} where _A}, Union{Nothing, ClimaAtmos.FriersonDiffusion{_A, Float32} where _A, ClimaAtmos.VerticalDiffusion{_A, Float32} where _A}, Union{ClimaAtmos.Explicit, ClimaAtmos.Implicit}, Union{ClimaAtmos.Explicit, ClimaAtmos.Implicit}, Union{Nothing, ClimaAtmos.ViscousSponge{Float32}}, Union{Nothing, ClimaAtmos.RayleighSponge{Float32}}, Union{ClimaAtmos.RCEMIPIISST, ClimaAtmos.ZonallyAsymmetricSST, ClimaAtmos.ZonallySymmetricSST, Nothing}, Union{ClimaAtmos.IdealizedInsolation, ClimaAtmos.RCEMIPIIInsolation, ClimaAtmos.TimeVaryingInsolation, Nothing}, Union{ClimaAtmos.PrescribedSurfaceTemperature, ClimaAtmos.PrognosticSurfaceTemperature{Int64}}, Union{ClimaAtmos.CouplerAlbedo, ClimaAtmos.ConstantAlbedo{Float32}, ClimaAtmos.RegressionFunctionAlbedo{Float32, ClimaAtmos.var"#134#136"{Float32}}}, ClimaAtmos.AtmosNumerics{EN_UP, TR_UP, ED_UP, ED_SG_UP, DYCORE, LIM} where LIM where DYCORE where ED_SG_UP where ED_UP where TR_UP where EN_UP}

T2 = Tuple{Type{ClimaAtmos.AtmosModel{MC, MM, PM, CM, CCDPS, F, S, RM, LA, EXTFORCING, EC, AT, TM, EEM, EDM, ESMF, ESDF, ENP, EVR, TCM, NOGW, OGW, HD, VD, DM, SAM, VS, RS, ST, IN, SM, SA, NUM} where NUM where SA where SM where IN where ST where RS where VS where SAM where DM where VD where HD where OGW where NOGW where TCM where EVR where ENP where ESDF where ESMF where EDM where EEM where TM where AT where EC where EXTFORCING where LA where RM where S where F where CCDPS where CM where PM where MM where MC}, MC, MM, PM, CM, CCDPS, F, S, RM, LA, EXTFORCING, EC, AT, TM, EEM, EDM, ESMF, ESDF, ENP, EVR, TCM, NOGW, OGW, HD, VD, DM, SAM, VS, RS, ST, IN, SM, SA, NUM} where NUM where SA where SM where IN where ST where RS where VS where SAM where DM where VD where HD where OGW where NOGW where TCM where EVR where ENP where ESDF where ESMF where EDM where EEM where TM where AT where EC where EXTFORCING where LA where RM where S where F where CCDPS where CM where PM where MM where MC

@time T1 <: T2

is a reproducer

@KristofferC
Member

For me, ClimaAtmos precompiles for a very long time on 1.10 though so I cannot really check if this subtype check is fast there...

@Sbozzolo
Contributor Author

> For me, ClimaAtmos precompiles for a very long time on 1.10 though so I cannot really check if this subtype check is fast there...

Ah yeah, interesting! I just tried your reproducer on 1.10.4 and found that indeed the subtype check is slow there too.

However, if you look at our CI, you'll see that we can run without problems on 1.10.4 but not on 1.11-rc1.

This still reproduces the problem, so there's probably something else going on.

import ClimaAtmos

config = ClimaAtmos.AtmosConfig()
params = ClimaAtmos.create_parameter_set(config)
@time ClimaAtmos.get_atmos(config, params)

For 1.10.4: 3.481628 seconds (12.20 M allocations: 533.572 MiB, 2.93% gc time, 99.90% compilation time)

@KristofferC
Member

KristofferC commented Jul 24, 2024

> However, if you look at our CI, you'll see that we can run without problems on 1.10.4 but not on 1.11-rc1.

Not sure, locally on my M1 it was precompiling forever, but it seems to work OK when SSHing to a different system.

> Ah yeah, interesting! I just tried your reproducer on 1.10.4 and found that indeed the subtype check is slow there too.

Okay, might be some other change that causes this subtyping query to take place on 1.11 but not on 1.10 then.

@JeffBezanson JeffBezanson added regression Regression in behavior compared to a previous version regression 1.11 Regression in the 1.11 release labels Jul 24, 2024
@KristofferC
Member

Bisected to a61d1b4 (#50927)

commit a61d1b47a68297704814188a3509c011ec2a8fa1
Author: Jameson Nash <vtjnash@gmail.com>
Date:   Fri Sep 15 18:32:16 2023 -0400

    inference: apply tmerge limit elementwise to the Union (#50927)
    
    This allows forming larger unions, as long as each element in the Union
    is both relatively distinct and relatively simple. For example:
    
        tmerge(Base.BitSigned, Nothing) == Union{Nothing, Int128, Int16, Int32, Int64, Int8}
        tmerge(Tuple{Base.BitSigned, Int}, Nothing) == Union{Nothing, Tuple{Any, Int64}}
        tmerge(AbstractVector{Int}, Vector) == AbstractVector
    
    Disables a test from dc8d885, which does not seem possible to handle currently.
    
    This makes somewhat drastic changes to make this algorithm more
    commutative and simpler, since we dropped the final widening to `Any`.
    
    Co-authored-by: pchintalapudi <34727397+pchintalapudi@users.noreply.github.com>
    Co-authored-by: Oscar Smith <oscardssmith@gmail.com>

cc @vtjnash

@oscardssmith
Member

That's unfortunate. Hopefully the subtyping precision improvements are salvageable.

@KristofferC
Member

The subtyping query in #55230 (comment) could perhaps be made faster?

@KristofferC KristofferC added this to the 1.11 milestone Jul 30, 2024
@topolarity topolarity added the types and dispatch Types, subtyping and method dispatch label Jul 30, 2024
@gbaraldi
Member

gbaraldi commented Jul 30, 2024

With no packages

julia> T1 = Tuple{Union{Val{1}, Val{2}, Val{3}, Val{4}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Nothing},
       Union{Val{1}, Nothing},
       Union{Nothing, Val{1}},
       Union{Val{1}, Val{2}, Val{3}, Val{4}, Val{5}, Val{6}, Val{7}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Any,
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Val{4}},
       Union{Val{1}, Val{2}, Val{3}, Val{4}, Val{5}},
       Any,
       Any,
       Any,
       Any,
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}},
       Union{Val{1}, Val{2}},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}},
       Union{Val{1}, Val{2}, Val{3}},
       Val{1}}

julia> T2 = Tuple{MC, MM, PM, CM, CCDPS, F, S, RM, LA, EXTFORCING, EC, AT, TM, EEM, EDM, ESMF, ESDF, ENP, EVR, TCM, NOGW, OGW, HD, VD, DM, SAM, VS, RS, ST, IN, SM, SA, NUM} where NUM where SA where SM where IN where ST where RS where VS where SAM where DM where VD where HD where OGW where NOGW where TCM where EVR where ENP where ESDF where ESMF where EDM where EEM where TM where AT where EC where EXTFORCING where LA where RM where S where F where CCDPS where CM where PM where MM where MC

@time T1 <: T2

@vtjnash
Member

vtjnash commented Jul 30, 2024

somewhat more simplified:

julia> T1 = Tuple{Union{Val{1}, Val{2}, Val{3}, Val{4}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Nothing},
       Union{Val{1}, Nothing},
       Union{Nothing, Val{1}},
       Union{Val{1}, Val{2}, Val{3}, Val{4}, Val{5}, Val{6}, Val{7}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Val{3}}}
Tuple{Union{Val{1}, Val{2}, Val{3}, Val{4}, Nothing}, Union{Val{1}, Val{2}, Val{3}, Nothing}, Union{Val{1}, Val{2}, Val{3}, Nothing}, Union{Val{1}, Val{2}, Val{3}, Nothing}, Union{Val{1}, Nothing}, Union{Val{1}, Nothing}, Union{Val{1}, Nothing}, Union{Val{1}, Val{2}, Val{3}, Val{4}, Val{5}, Val{6}, Val{7}, Nothing}, Union{Val{1}, Val{2}, Nothing}, Union{Val{1}, Val{2}, Nothing}, Union{Val{1}, Val{2}, Nothing}, Union{Val{1}, Val{2}, Val{3}}}

julia> T2 = Tuple{<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any};

julia> @time T1 <: T2
  2.016903 seconds
true

@oscardssmith
Member

Much simpler:

julia> T1 = NTuple{12, Union{Val{1}, Val{2}, Val{3}, Val{4}}}
julia> T2 = Tuple{<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any}
julia> @time T1 <: T2
 19.878233 seconds
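
To see how the cost grows with tuple length, the same query can be built for smaller sizes. This is a sketch under assumptions: wildcard_tuple is a hypothetical helper that constructs Tuple{<:Any, ..., <:Any} programmatically, and the "combinations" column (4^n, one Union member chosen per slot) is only a guess at what drives the runtime, not something established above. Timings depend entirely on the Julia version and are not shown.

```julia
# One slot of the left-hand tuple: a four-element Union.
Slot = Union{Val{1}, Val{2}, Val{3}, Val{4}}

# Hypothetical helper: build Tuple{<:Any, ..., <:Any} with n slots.
function wildcard_tuple(n)
    vars = [TypeVar(Symbol(:T, i)) for i in 1:n]
    T = Tuple{vars...}
    for v in reverse(vars)
        T = UnionAll(v, T)  # wrap one `where` per slot variable
    end
    return T
end

for n in 2:2:8
    T1 = NTuple{n, Slot}
    T2 = wildcard_tuple(n)
    t0 = time_ns()
    result = T1 <: T2
    seconds = (time_ns() - t0) / 1e9
    println("n = ", n, "  combinations = ", 4^n,
            "  subtype = ", result, "  seconds = ", seconds)
end
```

If the time roughly quadruples for each added slot on an affected build, that would fit the combinatorial reading; a flat profile would suggest the query is handled by a faster path.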
