Avoid writing jitdump code for block data#122274
Avoid writing jitdump code for block data#122274tommcdon wants to merge 5 commits intodotnet:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes an issue (#120419) where JIT dumps fail when reporting stub blocks whose memory is reserved but not yet committed. The fix prevents writing code bytes for block-level allocations while maintaining backward compatibility for individual method reporting.
- Adds a
reportCodeBlockboolean parameter toPAL_PerfJitDump_LogMethod(defaults totrue) - Updates the stub logging logic to skip code bytes for block-type allocations
- Removes dead code check that would never execute
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/coreclr/pal/inc/pal.h | Adds reportCodeBlock parameter with default value true to maintain backward compatibility |
| src/coreclr/pal/src/misc/perfjitdump.cpp | Implements conditional code size reporting based on reportCodeBlock parameter; removes dead code check |
| src/coreclr/vm/perfmap.cpp | Updates call site to pass false for block-type stub allocations |
|
Tagging subscribers to this area: @steveisok, @dotnet/dotnet-diag |
| PAL_PerfJitDump_LogMethod((void*)pCode, codeSize, name.GetUTF8(), nullptr, nullptr); | ||
| // For block-level stub allocations, the memory may be reserved but not yet committed. | ||
| // Emitting code bytes in that case can cause jitdump logging to fail, and the bytes | ||
| // are optional in the jitdump specification. |
There was a problem hiding this comment.
It feels unfortunate that we would sacrifice our code bytes. I imagine one of the major reasons someone would choose the jitdump format rather than the perfmap format is to get those code bytes.
I'm wondering...
- If we think folks want JitDumps without code bytes maybe we should have a config that eliminates the code bytes everywhere by design. That would give them a much smaller jitdump file than they have today.
- If we moved where we call ReportStubBlock() from the point where memory is reserved to the point where the memory is committed I'm guessing that is a neglible change in the total number of stubs reported in the file. At most it would be one record per page when we already report other jitted code/stubs at densities of >100 per page.
There was a problem hiding this comment.
It feels unfortunate that we would sacrifice our code bytes.
I believe the intent of #113943 was to add configuration knobs to control the volume of stub data reported to the perf tool. I have noticed that perf inject is much faster after that change, mostly it reduces the symbolic output the perf tool needs to process. Adding @davidwrighton who might have more to add.
Count of jitted*.so files for each DOTNET_PerfMapStubGranularity using the repro from #120419:
0- groups both reporting of stub types as well as allocations (default) - 750 jitted-*.so files after running perf inject. 12 are aggregated functions and therefore do not have code.1- groups both reporting of stub types but report each block allocation - 753 jitted-*.so files after running perf inject, 12 are aggregated functions and therefore do not have code.2- report allocations of grouped stub types - 3774 jitted-*.so files after running perf inject3- enables 1 and 2 - each unique stub type allocation is reported - 3774 jitted-*.so files after running perf inject
For cases 0 and 1, we report groups of stubs as a 'virtual group of stubs'. In that case, I felt that reporting the code bytes for these aggregated functions that we are reporting didn't seem to add value. Perf inject creates a "so" file that represents a function with code inside of it. The so's that represent block data will contain a pointer to a large block of data containing code for multiple stubs (by design with 0 and 1 configurations). While the intent is not to be sacrificing code bytes, we are sacrificing reporting of unique stub data, which in turn means that we do not have unique code data that represents the aggregated symbol. In terms of missing code data overall, a relatively small percentage will be missing code. In the test case, ~1.6% is missing code (out of ~750 jitted-*.so files created, 12 were aggregated functions). I would expect for larger apps, the percentage to decrease.
In cases 2 and 3, the code data will be present.
If we think folks want JitDumps without code bytes maybe we should have a config that eliminates the code bytes everywhere by design. That would give them a much smaller jitdump file than they have today.
My expectation is that if customers do not wish to have code bytes might use perfmaps and not jitdumps. With that said, I would be happy to add another config switch to disable code bytes for jitdumps. Thoughts?
If we moved where we call ReportStubBlock() from the point where memory is reserved to the point where the memory is committed, I'm guessing that is a negligible change in the total number of stubs reported in the file. At most it would be one record per page when we already report other jitted code/stubs at densities of >100 per page.
I will give this try.
There was a problem hiding this comment.
With that said, I would be happy to add another config switch to disable code bytes for jitdumps. Thoughts?
I'm not actively encouraging us to do it, I just wanted to leave the door open in case you or others were aware of customer interest in jitdump without code bytes. I'm not aware of it.
I believe the intent of #113943 was to add configuration knobs to control the volume of stub data reported to the perf tool.
I had the same perception - I don't think 113943 was trying to remove code bytes. It seemed like it was an unintended bug. This PR, as-is, would elevate the missing code from bug to by-design which I'm hoping we can avoid.
I will give this try.
Thanks!
Fixes #120419. The implementation of JIT dumps in the linux
perftool creates dynamic shared objects representing JIT'd methods. In .NET 10, #113943 changed the default granularity for stubs, which now reports the block of memory that stores stubs instead of the individual methods. If the event is fired before the block memory is committed (only is reserved),PerfJitDumpState::LogMethodfails to read the memory from the block and then all subsequent JIT dump events will fail. The linuxperftool will favor jit dumps over perfmaps if jit dumps are available, which breaks DOTNET_PerfMapEnabled=1 or 2 settings. This fix addresses the issue by no longer reporting the code byte for block allocations, which is only enabled when DOTNET_PerfMapStubGranularity is set to 0 or 1.