NHDaly
Member

@NHDaly NHDaly commented Aug 22, 2025

In --trace-compile Tag nested precompiles with # nested_compile.

Before:

  ./usr/bin/julia --startup=no --trace-compile=stderr -e '
    Base.@assume_effects :foldable function nested1(x)
        sum(collect(x for _ in 1:10_000_000))
    end
    f1(x) = nested1(sizeof(x)) + x
    f1(2)'
  precompile(Tuple{typeof(Main.nested1), Int64})
  precompile(Tuple{typeof(Main.f1), Int64})

  ./usr/bin/julia --startup=no --trace-compile=stderr --trace-compile-timing -e '
    Base.@assume_effects :foldable function nested1(x)
        sum(collect(x for _ in 1:10_000_000))
    end
    f1(x) = nested1(sizeof(x)) + x
    f1(2)'
  #=    8.1 ms =# precompile(Tuple{typeof(Main.nested1), Int64})
  #=   71.3 ms =# precompile(Tuple{typeof(Main.f1), Int64})

After:

  ./usr/bin/julia --startup=no --trace-compile=stderr -e '
    Base.@assume_effects :foldable function nested1(x)
        sum(collect(x for _ in 1:10_000_000))
    end
    f1(x) = nested1(sizeof(x)) + x
    f1(2)'
  precompile(Tuple{typeof(Main.nested1), Int64}) # nested_const_compilation
  precompile(Tuple{typeof(Main.f1), Int64})

  ./usr/bin/julia --startup=no --trace-compile=stderr --trace-compile-timing -e '
    Base.@assume_effects :foldable function nested1(x)
        sum(collect(x for _ in 1:10_000_000))
    end
    f1(x) = nested1(sizeof(x)) + x
    f1(2)'
  #=    8.1 ms =# precompile(Tuple{typeof(Main.nested1), Int64}) # nested_const_compilation
  #=   71.3 ms =# precompile(Tuple{typeof(Main.f1), Int64})

@NHDaly NHDaly marked this pull request as draft August 22, 2025 17:48
@nsajko nsajko added the observability (metrics, timing, understandability, reflection, logging, ...) label Aug 22, 2025
@NHDaly NHDaly marked this pull request as ready for review August 25, 2025 14:35
@giordano
Member

Just to understand, by "nested" here you mean it's not the top-level function call being invoked when triggering the precompilation, so that the user knows that it's sufficient to do

precompile(Tuple{typeof(Main.f1), Int64})

to precompile Main.nested1 too?

Base automatically changed from nhd-fixup-trace-compile-timing-1 to master August 25, 2025 15:34
@KristofferC
Member

From what I understand, there is a dynamic dispatch, so it is not enough to run that single precompile signature, because you don't statically resolve the nested call.

@topolarity
Member

> From what I understand, there is a dynamic dispatch, so it is not enough to run that single precompile signature, because you don't statically resolve the nested call.

This is a dynamic dispatch that is executed (always in an identical way, with identical arguments, etc.) during a concrete-eval of some compile-time function, so it is statically resolved / implied by the outer precompile(...) in that sense

@NHDaly
Member Author

NHDaly commented Aug 25, 2025

Exactly like Cody said.

This is a weird edge case that I only learned about last week. Apparently, during compilation, the compiler can decide that a function call should evaluate to a compile-time constant even though inference can't work out the result on its own: the compiler needs to actually run the function to see what it returns. So during compilation, the compiler does a dynamic dispatch to that function (which compiles it and logs this precompile statement), runs it to get the Const return value, and then proceeds with compiling the original function.

Since there's a dynamic dispatch, we log the precompile statement. But as Cody says, it is fully implied by the outer call. So you could probably skip it if you were replaying compilation logs for snooping, but the end result should be identical if you do run it. And it's kind of nice to log it, since it is a dynamic dispatch, but I think we need to make clear to the user that this isn't a normal precompile statement.

Maybe we can find a better term for this than nested_compilation though, since I don't want to imply that this is the same as a "normal" nested compilation like a(x) = b(x); b(x) = x+1, where compiling a would trigger compilation of b.

Maybe we could call this # nested_const_compilation or # nested_compiler_dispatch or # const_eval_nested_compilation or something?

```julia
  ./usr/bin/julia --startup=no --trace-compile=stderr -e '
    Base.@assume_effects :foldable function nested1(x)
        sum(collect(x for _ in 1:10_000_000))
    end
    f1(x) = nested1(sizeof(x)) + x
    f1(2)'
  precompile(Tuple{typeof(Main.nested1), Int64}) # nested_compile
  precompile(Tuple{typeof(Main.f1), Int64})

  ./usr/bin/julia --startup=no --trace-compile=stderr --trace-compile-timing -e '
    Base.@assume_effects :foldable function nested1(x)
        sum(collect(x for _ in 1:10_000_000))
    end
    f1(x) = nested1(sizeof(x)) + x
    f1(2)'
  #=    8.1 ms =# precompile(Tuple{typeof(Main.nested1), Int64}) # nested_compile
  #=   71.3 ms =# precompile(Tuple{typeof(Main.f1), Int64})
```
@NHDaly NHDaly force-pushed the nhd-trace-compile-timing-nested_compile branch from 2c6ebab to 7d6e467 Compare August 25, 2025 19:48
@IanButterworth
Member

It's a description so no need for underscores?

@topolarity
Member

topolarity commented Aug 25, 2025

> (always in an identical way, with identical arguments, etc.)

I think I was incorrect when I said this.

Effects analysis seems too limited right now to construct a practical example, but the definition of :consistent does not enforce that the execution is 100% deterministic.

It only requires that the return value is a deterministic function of the arguments (via egality), so this would be legal:

Base.@assume_effects :foldable function complicated_identity(N::Int)
    T = rand(Bool) ? typeof(N) : Any
    data = T[N,]      # temporary vector, so not an observable side-effect (still foldable)
    return only(data) # the method invoked here is not consistent, but the result is
end

and does not emit the same "nested" precompile for every execution.
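As a rough check of the egality point, the sketch below re-declares the function from this comment and calls it repeatedly. It only assumes that `Base.@assume_effects` with `:foldable` is available (Julia 1.8 or later). The element type of the temporary vector can differ between calls, yet every call returns a value that is `===` to its argument.

```julia
# Same definition as in the comment above, repeated so the snippet is self-contained.
Base.@assume_effects :foldable function complicated_identity(N::Int)
    T = rand(Bool) ? typeof(N) : Any
    data = T[N,]      # Vector{Int} or Vector{Any}, chosen at random
    return only(data) # dispatches to a different method instance depending on T
end

# Call it many times; the internal path varies but the result does not.
results = [complicated_identity(7) for _ in 1:100]
all_egal = all(r -> r === 7, results)
```

Which `only` specialization gets compiled, and therefore which nested precompile statement might be logged, depends on the random branch taken, which is the non-determinism being described.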

@vtjnash
Member

vtjnash commented Aug 25, 2025

We should probably be enforcing that any options (like precompile dispatch counting) are forcibly disabled while running the compiler. You do not want to ignore these after the fact or mark them differently in the output, since then you will miss actual real dispatches that occur at runtime simply because they also could occur during compile.

@NHDaly
Member Author

NHDaly commented Aug 26, 2025

> We should probably be enforcing that any options (like precompile dispatch counting) are forcibly disabled while running the compiler. You do not want to ignore these after the fact or mark them differently in the output, since then you will miss actual real dispatches that occur at runtime simply because they also could occur during compile.

Ah, that's an interesting point. So you're saying we should somehow "forget" that we've compiled this, and then the first time it's dispatched to again in the future (outside the compiler) we should log it as if we'd compiled it then? 🤔 That seems difficult. I do see the merit, but it seems like a much harder code change, and I'm not entirely sure it's worth it.

I think that probably people shouldn't skip those during compilation replay for snooping, to be careful about the situation you described. We mostly just wanted to annotate these to avoid double-counting during post-processing aggregation for our observability data: "how much time did this engine spend compiling, according to the precompile traces?" (We also look at other metrics like Base.cumulative_compile_time_ns(), but it's helpful to correlate these.)
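The aggregation described here could look roughly like the sketch below. It is only an illustration: `total_compile_ms` and `TIMING_LINE` are hypothetical names, not part of Julia, and the tag text is an assumption taken from the "After" output in this PR (the final spelling was still under discussion), so it is left as a keyword argument.

```julia
# Hypothetical post-processing helper, not part of Julia.
# Matches lines such as:
#   #=    8.1 ms =# precompile(Tuple{typeof(Main.f1), Int64}) # some_tag
# Capture 1 is the time in ms; capture 2 is the trailing comment, if any.
const TIMING_LINE = r"^\s*#=\s*([0-9.]+)\s*ms\s*=#\s*precompile\(.*\)\s*(?:#\s*(.*))?$"

function total_compile_ms(lines; skip_tag = "nested_const_compilation")
    total = 0.0
    for line in lines
        m = match(TIMING_LINE, line)
        m === nothing && continue          # not a timed precompile line
        tag = m.captures[2]                # `nothing` when there is no trailing comment
        if tag !== nothing && occursin(skip_tag, tag)
            continue                       # assumed to be counted inside the outer call already
        end
        total += parse(Float64, m.captures[1])
    end
    return total
end

# The two lines from the "After" output in the PR description.
trace_lines = [
    "  #=    8.1 ms =# precompile(Tuple{typeof(Main.nested1), Int64}) # nested_const_compilation",
    "  #=   71.3 ms =# precompile(Tuple{typeof(Main.f1), Int64})",
]
total = total_compile_ms(trace_lines)
```

With the tagged line skipped, only the 71.3 ms entry for `f1` contributes to the total. Whether the nested time really is contained in the outer entry is the assumption behind skipping it.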

Maybe this needs a paragraph in a documentation page about the --trace-compile feature to explain the nuances above?

Co-authored-by: Ian Butterworth <i.r.butterworth@gmail.com>
@NHDaly
Member Author

NHDaly commented Sep 6, 2025

So, @IanButterworth / @gbaraldi: Any thoughts on this? I agree with @vtjnash that long-term we might want to think about not emitting precompile statements at all for these, or doing some other kind of math to subtract the compilation time from their parents or something like that. But for now, I think just tagging these so that the user can differentiate them is a good improvement. Thoughts from your end?

@IanButterworth
Member

Perhaps adding it to the docs would help outline the motivation for highlighting it? Without docs, I can see users like myself not understanding what it means.

@IanButterworth
Member

IanButterworth commented Sep 6, 2025

Tests if possible too? News too.

@IanButterworth IanButterworth added the labels needs tests (Unit tests are required for this change), needs docs (Documentation for this change is required), and needs news (A NEWS entry is required for this change) Sep 6, 2025
@NHDaly NHDaly changed the title from "In --trace-compile Tag nested precompiles with # nested_compile." to "In --trace-compile Tag nested precompiles with # nested const compilation." Sep 6, 2025
NHDaly added a commit to RelationalAI/julia that referenced this pull request Sep 6, 2025