Enable SuperPMI collection of a libraries tests run #91101

Merged

Conversation

BruceForstall
Member

Currently, we have a PMI collection of the libraries tests. A PMI collection doesn't represent code that is actually run, so it has no PGO data or PGO-driven compilations and no OSR compilations, and it tends to overemphasize generics, since it attempts many instantiations that might not occur in practice.

Similar to #74961, which enabled a collection of a run of the coreclr tests, this change enables a collection of a run of the libraries tests.

We collect two different scenarios: "normal", meaning no configuration switch variables are set, and "no_tiered_compilation", meaning we set DOTNET_TieredCompilation=0. Because the amount of data collected is so large, we collect each of these scenarios in a separate process, which produces its own .mch file. (If done all at once, we end up running out of disk space on the Azure DevOps machines that collect all the per-Helix collections and merge them into the single, large resulting .mch file.)

The changes here are similar to (and sometimes a copy of) the changes in #74961, altered because the process of running the libraries tests is somewhat different in a few ways.

The "control flow" is as follows:

  • eng/pipelines/coreclr/superpmi-collect.yml: specifies two additional collection runs, as described above.
  • eng/pipelines/libraries/run-test-job.yml: specifies the scenarios to run and the additional job dependencies, and adds the logic to post-process all the per-Helix-machine .mch files.
  • src/libraries/sendtohelix.proj: additional logic to add files needed for collection to the Helix correlation payload. In particular, we need superpmi.py (and dependencies), and superpmi/mcs/superpmi-shim-collector, as well as the JIT dll itself (which is already in the payload, but not in an easily found location). We could probably significantly trim down what we copy, as currently I just copy the entire Core_Root, which is over 1GB, and we only need 4 files.
  • src/libraries/sendtohelixhelp.proj: define some Helix "pre" and "post" commands. The "pre" commands set up the collection before the tests are run. The "post" commands merge/dedup/thin the collection, preparing it to be uploaded to artifact storage.
  • eng/testing/RunnerTemplate.cmd/sh: This is built into every libraries test RunTests.cmd/sh file, and is activated (enabling SuperPMI collection) by the Helix "pre" commands mentioned above.

The change to CultureInfoCurrentCulture.cs is to fix a problem where the test creates a new clean environment but copies over a few environment variables. It needs to also copy over SuperPMIShim* variables.
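
To illustrate the kind of fix described above, here is a minimal Python sketch (not the actual C# test code). It builds a "clean" environment for a child process while carrying over the `DOTNET_*` and `SuperPMIShim*` variables, so the child still runs under the collector shim. The helper name `make_clean_environment`, the prefix-matching approach, and the sample variable in the usage line are assumptions made for this example.

```python
import os

# Prefixes that must survive into the child environment. DOTNET_ and
# SuperPMIShim come from the PR description; the helper is hypothetical.
PRESERVED_PREFIXES = ("DOTNET_", "SuperPMIShim")


def make_clean_environment(source_env, extra=None):
    """Return a new environment holding only the preserved variables,
    plus any explicitly requested extras."""
    clean = {
        name: value
        for name, value in source_env.items()
        if name.startswith(PRESERVED_PREFIXES)
    }
    clean.update(extra or {})
    return clean


if __name__ == "__main__":
    # Show which variables would be passed to a child process.
    child_env = make_clean_environment(os.environ, {"LANG": "en_US.UTF-8"})
    for name in sorted(child_env):
        print(name)
```

Without the second prefix, a child process launched by the test would silently drop out of the collection, which is the symptom the change addresses.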

The collected data is quite large: about 700,000 methods in the "normal" scenario, and 300,000 in the "no_tiered_compilation" scenario, for a total of about 17GB for both.

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 25, 2023
@ghost ghost assigned BruceForstall Aug 25, 2023
@BruceForstall
Member Author

Here is win-x64 mcs -jitflags for the "normal" scenario:

Grouped Flag Appearances (645027 contexts)

bits                count  percent  parsed
0000000084050200        1    0.00%  FROZEN_ALLOC_ALLOWED IL_STUB BBINSTR TIER0 VECTOR512_THROTTLING
0000000084090200        1    0.00%  FROZEN_ALLOC_ALLOWED IL_STUB BBINSTR_IF_LOOPS TIER0 VECTOR512_THROTTLING
0000000080100200     7182    1.11%  FROZEN_ALLOC_ALLOWED BBOPT VECTOR512_THROTTLING
0000000080110200     3217    0.50%  FROZEN_ALLOC_ALLOWED IL_STUB BBOPT VECTOR512_THROTTLING
0000000088110200        1    0.00%  FROZEN_ALLOC_ALLOWED IL_STUB BBOPT TIER1 VECTOR512_THROTTLING
0000000080510200      200    0.03%  FROZEN_ALLOC_ALLOWED IL_STUB BBOPT PUBLISH_SECRET_PARAM VECTOR512_THROTTLING
0000000083510200       11    0.00%  FROZEN_ALLOC_ALLOWED IL_STUB BBOPT PUBLISH_SECRET_PARAM REVERSE_PINVOKE TRACK_TRANSITIONS VECTOR512_THROTTLING
0000000080110204       10    0.00%  DEBUG_CODE FROZEN_ALLOC_ALLOWED IL_STUB BBOPT VECTOR512_THROTTLING
0000000080100010       53    0.01%  DEBUG_INFO BBOPT VECTOR512_THROTTLING
0000000084040210   107280   16.63%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR TIER0 VECTOR512_THROTTLING
0000000084080210   346267   53.68%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR_IF_LOOPS TIER0 VECTOR512_THROTTLING
0000000081100210      225    0.03%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT REVERSE_PINVOKE VECTOR512_THROTTLING
0000000088100210    26893    4.17%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING
8600000088100210      145    0.02%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_METHOD_PROFILE HAS_DYNAMIC_PROFILE
8800000088100210        5    0.00%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_STATIC_PROFILE
9800000088100210        4    0.00%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_LIKELY_CLASS HAS_STATIC_PROFILE
a400000088100210     4307    0.67%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_CLASS_PROFILE HAS_DYNAMIC_PROFILE
a600000088100210       12    0.00%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_CLASS_PROFILE HAS_METHOD_PROFILE HAS_DYNAMIC_PROFILE
c400000088100210    74654   11.57%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_DYNAMIC_PROFILE
c600000088100210     2828    0.44%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_METHOD_PROFILE HAS_DYNAMIC_PROFILE
c800000088100210        3    0.00%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_STATIC_PROFILE
e400000088100210    36318    5.63%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_CLASS_PROFILE HAS_DYNAMIC_PROFILE
e600000088100210     2549    0.40%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_CLASS_PROFILE HAS_METHOD_PROFILE HAS_DYNAMIC_PROFILE
0000000088140210     7382    1.14%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR BBOPT TIER1 VECTOR512_THROTTLING
8800000088140210       76    0.01%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_STATIC_PROFILE
9800000088140210      187    0.03%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_LIKELY_CLASS HAS_STATIC_PROFILE
c800000088140210     2719    0.42%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_STATIC_PROFILE
d800000088140210    17949    2.78%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_LIKELY_CLASS HAS_STATIC_PROFILE
0000000084240210      110    0.02%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR FRAMED TIER0 VECTOR512_THROTTLING
0000000084280210      305    0.05%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR_IF_LOOPS FRAMED TIER0 VECTOR512_THROTTLING
0000000088300210       73    0.01%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT FRAMED TIER1 VECTOR512_THROTTLING
a400000088300210        6    0.00%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT FRAMED TIER1 VECTOR512_THROTTLING HAS_PGO HAS_CLASS_PROFILE HAS_DYNAMIC_PROFILE
c400000088300210      645    0.10%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT FRAMED TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_DYNAMIC_PROFILE
e400000088300210      277    0.04%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT FRAMED TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_CLASS_PROFILE HAS_DYNAMIC_PROFILE
0000000088340210       72    0.01%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR BBOPT FRAMED TIER1 VECTOR512_THROTTLING
8800000088340210        5    0.00%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR BBOPT FRAMED TIER1 VECTOR512_THROTTLING HAS_PGO HAS_STATIC_PROFILE
c800000088340210       25    0.00%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR BBOPT FRAMED TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_STATIC_PROFILE
d800000088340210      151    0.02%  DEBUG_INFO FROZEN_ALLOC_ALLOWED BBINSTR BBOPT FRAMED TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_LIKELY_CLASS HAS_STATIC_PROFILE
0000000080100014       13    0.00%  DEBUG_CODE DEBUG_INFO BBOPT VECTOR512_THROTTLING
0000000080100214      872    0.14%  DEBUG_CODE DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT VECTOR512_THROTTLING
000000008010021c       79    0.01%  DEBUG_CODE DEBUG_EnC DEBUG_INFO FROZEN_ALLOC_ALLOWED BBOPT VECTOR512_THROTTLING
0000000080000230        3    0.00%  DEBUG_INFO MIN_OPT FROZEN_ALLOC_ALLOWED VECTOR512_THROTTLING
0000000080200230       27    0.00%  DEBUG_INFO MIN_OPT FROZEN_ALLOC_ALLOWED FRAMED VECTOR512_THROTTLING
c400000088100290      793    0.12%  DEBUG_INFO OSR FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_DYNAMIC_PROFILE
c600000088100290       50    0.01%  DEBUG_INFO OSR FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_METHOD_PROFILE HAS_DYNAMIC_PROFILE
e400000088100290      976    0.15%  DEBUG_INFO OSR FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_CLASS_PROFILE HAS_DYNAMIC_PROFILE
e600000088100290       66    0.01%  DEBUG_INFO OSR FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING HAS_PGO HAS_EDGE_PROFILE HAS_CLASS_PROFILE HAS_METHOD_PROFILE HAS_DYNAMIC_PROFILE

Individual Flag Appearances

     974    0.15%  DEBUG_CODE
      79    0.01%  DEBUG_EnC
  634404   98.35%  DEBUG_INFO
      30    0.00%  MIN_OPT
    1885    0.29%  OSR
  644961   99.99%  FROZEN_ALLOC_ALLOWED
    3441    0.53%  IL_STUB
  135957   21.08%  BBINSTR
  346573   53.73%  BBINSTR_IF_LOOPS
  191033   29.62%  BBOPT
    1696    0.26%  FRAMED
     211    0.03%  PUBLISH_SECRET_PARAM
     236    0.04%  REVERSE_PINVOKE
      11    0.00%  TRACK_TRANSITIONS
  453964   70.38%  TIER0
  179171   27.78%  TIER1
  645027  100.00%  VECTOR512_THROTTLING
    5650    0.88%  HAS_METHOD_PROFILE
  123626   19.17%  HAS_DYNAMIC_PROFILE
   21124    3.27%  HAS_STATIC_PROFILE
   18291    2.84%  HAS_LIKELY_CLASS
   44511    6.90%  HAS_CLASS_PROFILE
  140003   21.70%  HAS_EDGE_PROFILE
  144750   22.44%  HAS_PGO
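
The "Individual Flag Appearances" numbers can be cross-checked against the grouped table: each individual count should be the sum of the counts of every grouped row that contains that flag. The sketch below does this for the `MIN_OPT` and `OSR` rows only (copied from the table above); the function names are hypothetical, and this is not how `mcs -jitflags` itself is implemented.

```python
from collections import Counter

TOTAL_CONTEXTS = 645027

# (count, parsed flags) copied from the MIN_OPT and OSR rows of the grouped table.
GROUPED_ROWS = [
    (3, "DEBUG_INFO MIN_OPT FROZEN_ALLOC_ALLOWED VECTOR512_THROTTLING"),
    (27, "DEBUG_INFO MIN_OPT FROZEN_ALLOC_ALLOWED FRAMED VECTOR512_THROTTLING"),
    (793, "DEBUG_INFO OSR FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING "
          "HAS_PGO HAS_EDGE_PROFILE HAS_DYNAMIC_PROFILE"),
    (50, "DEBUG_INFO OSR FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING "
         "HAS_PGO HAS_EDGE_PROFILE HAS_METHOD_PROFILE HAS_DYNAMIC_PROFILE"),
    (976, "DEBUG_INFO OSR FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING "
          "HAS_PGO HAS_EDGE_PROFILE HAS_CLASS_PROFILE HAS_DYNAMIC_PROFILE"),
    (66, "DEBUG_INFO OSR FROZEN_ALLOC_ALLOWED BBOPT TIER1 VECTOR512_THROTTLING "
         "HAS_PGO HAS_EDGE_PROFILE HAS_CLASS_PROFILE HAS_METHOD_PROFILE "
         "HAS_DYNAMIC_PROFILE"),
]


def tally_flags(rows):
    """Sum the context count of every row in which each flag appears."""
    totals = Counter()
    for count, flags in rows:
        for flag in flags.split():
            totals[flag] += count
    return totals


def percent(count, total=TOTAL_CONTEXTS):
    """Format a count as a percentage of all contexts, as in the table."""
    return f"{100.0 * count / total:.2f}%"


if __name__ == "__main__":
    totals = tally_flags(GROUPED_ROWS)
    for flag in ("MIN_OPT", "OSR"):
        print(f"{totals[flag]:>8}  {percent(totals[flag]):>7}  {flag}")
```

Only `MIN_OPT` and `OSR` are fully covered by this subset of rows; the other flags also appear in rows that were left out, so their partial sums here will not match the table.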

@@ -62,25 +62,34 @@ private void WriteRunScript(string templateContent, string extension)
        {
            bool isUnix = extension == ".sh";
            string lineFeed = isUnix ? "\n" : "\r\n";
            string[] newlineSeparator = new string[] { Environment.NewLine };
Member Author

It turns out that I don't use the changes in this file anymore (I did in an intermediate stage). But it seems like a useful change to take anyway, as currently the generated commands don't handle multi-line commands.
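
One plausible reading of "handling multi-line commands", sketched in Python rather than the C# of the actual task: split every command on embedded newlines and re-join all resulting lines with the line ending of the target script type. The function name `render_commands` and the exact splitting behavior are assumptions for illustration; the real `WriteRunScript` change may differ.

```python
def render_commands(commands, extension):
    """Flatten possibly multi-line commands into script text that uses
    the line ending matching the script type (.sh or .cmd)."""
    line_feed = "\n" if extension == ".sh" else "\r\n"
    lines = []
    for command in commands:
        # splitlines() handles both "\n" and "\r\n" inside a command.
        lines.extend(command.splitlines())
    return line_feed.join(lines)


if __name__ == "__main__":
    script = render_commands(["echo first\necho second", "echo third"], ".cmd")
    print(repr(script))
```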

@BruceForstall
Member Author

@jakobbotsch PTAL
cc @dotnet/jit-contrib

@BruceForstall BruceForstall added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Aug 25, 2023
@ghost

ghost commented Aug 25, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@BruceForstall
Member Author

@ericstj This touches the libraries CI test run process. Maybe you or someone you suggest should review?

@BruceForstall BruceForstall removed the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 25, 2023
@BruceForstall BruceForstall added this to the 9.0.0 milestone Aug 25, 2023
@BruceForstall
Member Author

I've seen a failure like this multiple times during the "Prepare Testhost" step:

"D:\a\_work\1\s\src\native\corehost\build.cmd" Release x86 commit b468dc3a4c450a2991fda0642935f4931f575aa6 outputrid win-x86 portable incremental-native-build rootdir D:\a\_work\1\s\ runtimeflavor coreclr runtimeconfiguration Checked
  Error: Visual Studio 2022 with C++ tools required. Please see https://github.com/dotnet/runtime/blob/main/docs/workflow/requirements/windows-requirements.md for build requirements.
  Failed to generate native component build project!
D:\a\_work\1\s\src\native\corehost\corehost.proj(159,5): error MSB3073: The command ""D:\a\_work\1\s\src\native\corehost\build.cmd" Release x86 commit b468dc3a4c450a2991fda0642935f4931f575aa6 outputrid win-x86 portable incremental-native-build rootdir D:\a\_work\1\s\ runtimeflavor coreclr runtimeconfiguration Checked" exited with code 1.
##[error]src\native\corehost\corehost.proj(159,5): error MSB3073: (NETCORE_ENGINEERING_TELEMETRY=Build) The command ""D:\a\_work\1\s\src\native\corehost\build.cmd" Release x86 commit b468dc3a4c450a2991fda0642935f4931f575aa6 outputrid win-x86 portable incremental-native-build rootdir D:\a\_work\1\s\ runtimeflavor coreclr runtimeconfiguration Checked" exited with code 1.

I don't think anything I've done would be involved during this step. If the "Visual Studio 2022 with C++ tools required" is not a red herring, then perhaps some of the machines (but not all?) that we are running on in AzDO/Helix for the internal dnceng instance are misconfigured?

Current test run:

https://dev.azure.com/dnceng/internal/_build/results?buildId=2252048&view=results

Member

@jakobbotsch jakobbotsch left a comment

LGTM!

I assume these two new collections effectively subsume libraries.pmi and libraries_tests.pmi -- can we remove those collections? Both in the interest of saving download/disk size but also to avoid misleading/disproportionate diffs on changes.

@ericstj ericstj requested a review from a team August 25, 2023 15:41
Member

@ericstj ericstj left a comment

I'd like either @carlossanlop or @ViktorHofer to have a look over this.

@@ -62,7 +67,7 @@
<Message Condition="'$(TestArchiveRuntimeFile)' != ''" Importance="High" Text="TestArchiveRuntimeFile: $(TestArchiveRuntimeFile)" />

<!-- Re-invoke MSBuild on this project to create the correlation payload -->
<MSBuild Projects="$(MSBuildProjectFile)" Targets="PrepareCorrelationPayloadDirectory" Properties="Scenarios=$(_Scenarios)" />
<MSBuild Projects="$(MSBuildProjectFile)" Targets="PrepareCorrelationPayloadDirectory" Properties="$(_PropertiesToPass)" />
Member

This ends up passing a lot more global properties than were previously passed. Why do we need to do that? Moreover, why are we even doing an MSBuild invocation rather than just running the target in the same evaluation?

Member Author

Is it a problem to pass through more properties?

I need it to pass through the SuperPmiCollect property, at least. It seemed simpler and more consistent to just pass through all the ones we pass through on the per-scenario msbuild re-invocation. It's a bit unfortunate that the libraries and runtime versions of this file are (so) different: it (src\tests\Common\helixpublishwitharcade.proj) passes through all the properties.

I can't answer the question about re-invocation. Maybe it provides easier-to-control ordering? The coreclr .proj file does this too (and for more cases, it appears).

Comment on lines 295 to 372
########################################################################################################
#
# Finalize SuperPMI collection: (1) merge all MCH files generated by all Helix jobs, (2) upload MCH file to Azure Storage, (3) upload log files
# Note that all these steps are "condition: always()" because we want to upload as much of the collection
# as possible, even if there were test failures.
#
########################################################################################################

- ${{ if eq(parameters.SuperPmiCollect, true) }}:

  # Create required directories for merged mch collection and superpmi logs
  - ${{ if ne(parameters.osGroup, 'windows') }}:
    - script: |
        mkdir -p $(MergedMchFileLocation)
        mkdir -p $(SpmiLogsLocation)
      displayName: 'Create SuperPMI directories'
      condition: always()
  - ${{ if eq(parameters.osGroup, 'windows') }}:
    - script: |
        mkdir $(MergedMchFileLocation)
        mkdir $(SpmiLogsLocation)
      displayName: 'Create SuperPMI directories'
      condition: always()

  - script: $(PythonScript) $(Build.SourcesDirectory)/src/coreclr/scripts/superpmi.py merge-mch -log_level DEBUG -pattern $(MchFilesLocation)$(SuperPmiCollectionName).$(SuperPmiCollectionType)*.mch -output_mch_path $(MergedMchFileLocation)$(SuperPmiCollectionName).$(SuperPmiCollectionType).$(MchFileTag).mch -core_root $(SuperPmiMcsPath)
    displayName: 'Merge $(SuperPmiCollectionName)-$(SuperPmiCollectionType) SuperPMI collections'
    condition: always()

  - template: /eng/pipelines/common/upload-artifact-step.yml
    parameters:
      rootFolder: $(MergedMchFileLocation)
      includeRootFolder: false
      archiveType: $(archiveType)
      tarCompression: $(tarCompression)
      archiveExtension: $(archiveExtension)
      artifactName: 'SuperPMI_Collection_$(SuperPmiCollectionName)_$(SuperPmiCollectionType)_$(osGroup)$(osSubgroup)_$(archType)_$(buildConfig)'
      displayName: 'Upload artifacts SuperPMI $(SuperPmiCollectionName)-$(SuperPmiCollectionType) collection'
      condition: always()

  # Add authenticated pip feed
  - task: PipAuthenticate@1
    displayName: 'Pip Authenticate'
    inputs:
      artifactFeeds: public/dotnet-public-pypi
      onlyAddExtraIndex: false
    condition: always()

  # Ensure the Python azure-storage-blob package is installed before doing the upload.
  - script: $(PipScript) install --user --upgrade pip && $(PipScript) install --user azure.storage.blob==12.5.0 --force-reinstall
    displayName: Upgrade Pip to latest and install azure-storage-blob Python package
    condition: always()

  - script: $(PythonScript) $(Build.SourcesDirectory)/src/coreclr/scripts/superpmi.py upload -log_level DEBUG -arch $(archType) -build_type $(buildConfig) -mch_files $(MergedMchFileLocation)$(SuperPmiCollectionName).$(SuperPmiCollectionType).$(MchFileTag).mch -core_root $(Build.SourcesDirectory)/artifacts/bin/coreclr/$(osGroup).x64.$(buildConfigUpper)
    displayName: 'Upload SuperPMI $(SuperPmiCollectionName)-$(SuperPmiCollectionType) collection to Azure Storage'
    condition: always()
    env:
      CLRJIT_AZ_KEY: $(clrjit_key1) # secret key stored as variable in pipeline

  - task: CopyFiles@2
    displayName: Copying superpmi.log of all partitions
    inputs:
      sourceFolder: '$(MchFilesLocation)'
      contents: '**/$(SuperPmiCollectionName).$(SuperPmiCollectionType)*.log'
      targetFolder: '$(SpmiLogsLocation)'
    condition: always()

  - task: PublishPipelineArtifact@1
    displayName: Publish SuperPMI logs
    inputs:
      targetPath: $(SpmiLogsLocation)
      artifactName: 'SuperPMI_Logs_$(SuperPmiCollectionName)_$(SuperPmiCollectionType)_$(osGroup)$(osSubgroup)_$(archType)_$(buildConfig)'
    condition: always()

########################################################################################################
#
# End of SuperPMI processing
#
########################################################################################################
Member

These changes add a lot of complexity to the already non-trivial libraries yml files. Please consider moving this block into another YML and conditionally import it based on the PMI boolean.

Comment on lines 79 to 94
# Convenience variables
- name: buildConfig
  value: ${{ parameters.buildConfig }}
- name: archType
  value: ${{ parameters.archType }}
- name: osGroup
  value: ${{ parameters.osGroup }}
- name: osSubgroup
  value: ${{ parameters.osSubgroup }}
- name: buildConfigUpper
  ${{ if eq(parameters.buildConfig, 'debug') }}:
    value: 'Debug'
  ${{ if eq(parameters.buildConfig, 'release') }}:
    value: 'Release'
  ${{ if eq(parameters.buildConfig, 'checked') }}:
    value: 'Checked'
Member

@ViktorHofer ViktorHofer Aug 25, 2023

I'm not exactly happy about defining these variables here. We don't need any of those in src/libraries and some of those might already be defined. I would prefer if you find a way to avoid them.

In general, as the SuperPMI logic is completely optional, it would be good not to increase the YML complexity. Can you move this into a feature YML template?

Member Author

It's a little unfortunate these aren't already defined at a higher level (base-job.yml?) -- they are available in the coreclr side.

@BruceForstall
Member Author

I assume these two new collections effectively subsume libraries.pmi and libraries_tests.pmi -- can we remove those collections? Both in the interest of saving download/disk size but also to avoid misleading/disproportionate diffs on changes.

I think it makes sense to remove one or both. We did the same thing when we added the coreclr_tests run collection: we removed the corresponding PMI collection. Now, these are the last remaining PMI collections. An argument could be made that PMI generates unique code patterns not otherwise seen due to its attempt to aggressively instantiate generics. If we decide that's not interesting enough, for the cost, we can drop the PMI collections. @AndyAyersMS: opinions?

Comment on lines 136 to 183
<ItemGroup Condition=" '$(TargetOS)' == 'windows' and '$(SuperPmiCollect)' == 'true' ">
  <!-- Set variables needed by the test wrapper scripts to do SuperPMI collection -->
  <HelixPreCommand Include="set spmi_enable_collection=1" />
  <!-- spmi_collect_dir can point to any temporary directory. We choose %HELIX_WORKITEM_PAYLOAD%\spmi_collect for convenience, and
       because we know it can be used, but %TEMP%\spmi_collect might be better. -->
  <HelixPreCommand Include="set spmi_collect_dir=%HELIX_WORKITEM_PAYLOAD%\spmi_collect" />
  <HelixPreCommand Include="if not exist %spmi_collect_dir% mkdir %spmi_collect_dir%" />
  <HelixPreCommand Include="set spmi_core_root=%HELIX_CORRELATION_PAYLOAD%\coreclr" />
</ItemGroup>

<ItemGroup Condition=" '$(TargetOS)' == 'windows' and '$(SuperPmiCollect)' == 'true' ">
  <!-- Merge all the per-test generated .MC files as a post-step. Superpmi.py needs superpmi, mcs, and the JIT to do processing. -->
  <HelixPostCommand Include="echo on" />
  <HelixPostCommand Include="set spmi_core_root=%HELIX_CORRELATION_PAYLOAD%\coreclr" />
  <HelixPostCommand Include="set spmi_collection_name=$(SuperPmiCollectionName)" />
  <HelixPostCommand Include="set spmi_collection_type=$(SuperPmiCollectionType)" />
  <HelixPostCommand Include="set spmi_collection_mch_file_tag=$(TargetOS).$(TargetArchitecture).$(Configuration)" />
  <HelixPostCommand Include="set spmi_superpmi_py=%HELIX_CORRELATION_PAYLOAD%\spmi_scripts\superpmi.py" />
  <HelixPostCommand Include="set spmi_upload_dir=%HELIX_WORKITEM_UPLOAD_ROOT%" />
  <HelixPostCommand Include="if not exist %spmi_upload_dir% mkdir %spmi_upload_dir%" />
  <HelixPostCommand Include="set spmi_output_base_name=%spmi_collection_name%.%spmi_collection_type%.%spmi_collection_mch_file_tag%" />
  <HelixPostCommand Include="set spmi_finalmch=%spmi_upload_dir%\%spmi_output_base_name%.mch" />
  <HelixPostCommand Include="set spmi_log_file=%spmi_upload_dir%\%spmi_output_base_name%.log" />
  <HelixPostCommand Include="%HELIX_PYTHONPATH% %spmi_superpmi_py% collect -log_level DEBUG -core_root %spmi_core_root% --skip_cleanup --clean --ci --skip_collection_step --skip_toc_step -temp_dir %spmi_collect_dir% -output_mch_path %spmi_finalmch% -log_file %spmi_log_file%" />
</ItemGroup>

<ItemGroup Condition=" '$(TargetOS)' != 'windows' and '$(SuperPmiCollect)' == 'true' ">
  <!-- Set variables needed by the test wrapper scripts to do SuperPMI collection -->
  <HelixPreCommand Include="export spmi_enable_collection=1" />
  <HelixPreCommand Include="export spmi_collect_dir=$HELIX_WORKITEM_PAYLOAD/spmi_collect" />
  <HelixPreCommand Include="mkdir -p $spmi_collect_dir" />
  <HelixPreCommand Include="export spmi_core_root=$HELIX_CORRELATION_PAYLOAD/coreclr" />
</ItemGroup>

<ItemGroup Condition=" '$(TargetOS)' != 'windows' and '$(SuperPmiCollect)' == 'true' ">
  <!-- Merge all the per-test generated .MC files as a post-step. Superpmi.py needs superpmi, mcs, and the JIT to do processing. -->
  <HelixPostCommand Include="export spmi_core_root=$HELIX_CORRELATION_PAYLOAD/coreclr" />
  <HelixPostCommand Include="export spmi_collection_name=$(SuperPmiCollectionName)" />
  <HelixPostCommand Include="export spmi_collection_type=$(SuperPmiCollectionType)" />
  <HelixPostCommand Include="export spmi_collection_mch_file_tag=$(TargetOS).$(TargetArchitecture).$(Configuration)" />
  <HelixPostCommand Include="export spmi_superpmi_py=$HELIX_CORRELATION_PAYLOAD/spmi_scripts/superpmi.py" />
  <HelixPostCommand Include="export spmi_upload_dir=$HELIX_WORKITEM_UPLOAD_ROOT" />
  <HelixPostCommand Include="mkdir -p $spmi_upload_dir" />
  <HelixPostCommand Include="export spmi_output_base_name=$spmi_collection_name.$spmi_collection_type.$spmi_collection_mch_file_tag" />
  <HelixPostCommand Include="export spmi_finalmch=$spmi_upload_dir/$spmi_output_base_name.mch" />
  <HelixPostCommand Include="export spmi_log_file=$spmi_upload_dir/$spmi_output_base_name.log" />
  <HelixPostCommand Include="$HELIX_PYTHONPATH $spmi_superpmi_py collect -log_level DEBUG -core_root $spmi_core_root --skip_cleanup --clean --ci --skip_collection_step --skip_toc_step -temp_dir $spmi_collect_dir -output_mch_path $spmi_finalmch -log_file $spmi_log_file" />
</ItemGroup>
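
For readers tracing how the post commands above compose the per-work-item output names, here is a small Python sketch of the same string building for the non-Windows case. The function names and the example values in the usage section (`libraries_tests`, `run`, the payload paths) are hypothetical; the `superpmi.py collect` arguments are copied from the command lines quoted above.

```python
import posixpath


def mch_file_tag(target_os, target_arch, configuration):
    """Mirror spmi_collection_mch_file_tag: <os>.<arch>.<config>."""
    return f"{target_os}.{target_arch}.{configuration}"


def output_base_name(collection_name, collection_type, tag):
    """Mirror spmi_output_base_name: <name>.<type>.<tag>."""
    return f"{collection_name}.{collection_type}.{tag}"


def collect_command(python, payload_root, collect_dir, upload_dir, base_name):
    """Build the argument list of the final merge/dedup/thin step."""
    superpmi_py = posixpath.join(payload_root, "spmi_scripts", "superpmi.py")
    core_root = posixpath.join(payload_root, "coreclr")
    return [
        python, superpmi_py, "collect",
        "-log_level", "DEBUG",
        "-core_root", core_root,
        "--skip_cleanup", "--clean", "--ci",
        "--skip_collection_step", "--skip_toc_step",
        "-temp_dir", collect_dir,
        "-output_mch_path", posixpath.join(upload_dir, base_name + ".mch"),
        "-log_file", posixpath.join(upload_dir, base_name + ".log"),
    ]


if __name__ == "__main__":
    # Example values are placeholders, not taken from a real run.
    base = output_base_name("libraries_tests", "run",
                            mch_file_tag("linux", "x64", "Release"))
    print(" ".join(collect_command("python3", "/payload", "/work/spmi_collect",
                                   "/upload", base)))
```

Because every Helix work item writes a file following this naming scheme, the later pipeline merge step can pick them all up with a single `<name>.<type>*.mch` pattern.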
Member

Same feedback as for the YML change. In general, the sendtohelix and sendtohelixhelp projects are already quite complex. I would prefer if we can move this optional feature into an optional import, i.e. an msbuild targets file.

Disk space on AzDO machines is running out during collection. Print
disk space to see what we've got.

E.g., for linux-x64, the Helix work items create 25.5GB, and the merged
.mch is 12GB. In total, the correlation payload is 1.8GB and all
downloaded workitems are 29GB. The total space needed including merged
.mch is 44GB.
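
A minimal Python sketch of the "print disk space" idea, using the standard library's `shutil.disk_usage`. The function names and the way a space requirement is passed in are assumptions for illustration; the actual pipeline change may use shell commands instead.

```python
import shutil

GIB = 1024 ** 3


def format_disk_space(path="."):
    """Describe total, used, and free space on the volume containing path."""
    usage = shutil.disk_usage(path)
    return (f"{path}: total {usage.total / GIB:.1f} GiB, "
            f"used {usage.used / GIB:.1f} GiB, "
            f"free {usage.free / GIB:.1f} GiB")


def has_room_for(path, required_gb):
    """Return True if the volume has at least required_gb (decimal GB) free."""
    return shutil.disk_usage(path).free >= required_gb * 10 ** 9


if __name__ == "__main__":
    print(format_disk_space("."))
    # 44 is the linux-x64 total quoted in the commit message above.
    print("enough for merge:", has_room_for(".", 44))
```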

The CultureInfoCurrentCulture test needs to copy `SuperPMIShim*` environment variables when
creating a new "empty" environment. (It already copies `DOTNET_*`
variables.)

Put HelixPreCommand and HelixPostCommand specification
in sendtohelix-superpmi-collect.targets file.

Note that the order of these commands should not be sensitive
(that is, it should not matter if they are added before or
after other Helix commands).
@BruceForstall BruceForstall force-pushed the SuperPmiLibsTestRunCollection3 branch from b468dc3 to a71c37f Compare August 29, 2023 04:43
@BruceForstall
Member Author

@ViktorHofer @ericstj I extracted some SuperPMI-specific logic to separate .yml and .targets files. PTAL.

The current test run is:

https://dev.azure.com/dnceng/internal/_build/results?buildId=2254732&view=results

(but it keeps hitting the "no VS2022 build tools" error (like every other pipeline))

value: 'Debug'
${{ if eq(parameters.buildConfig, 'release') }}:
value: 'Release'
${{ if eq(parameters.buildConfig, 'checked') }}:
Member

I still wonder why we need this here. We don't have a checked configuration in libraries. Also, please move this and other variables that are only needed by the superpmi YML file into the YML file, i.e. PythonScript and others.

Member Author

I was just being consistent with the definition in the coreclr files.

Also, please move this and other variables that are only needed by the superpmi YML file into the YML file, i.e. PythonScript and others.

I don't believe that's possible: the superpmi YML file is a "steps" template and as such can't define variables. (The ones here are not actual YML variable definitions either; they are "variables" parameters to a template that later injects them into a "variables:" section.)

Add superpmi-collect-variables.yml "variables template".
These are used by the superpmi-postprocess-step.yml file,
as well as other SuperPMI collection code.
```
The 'stages' parameter is not a valid StageList.
/eng/pipelines/libraries/run-test-job.yml (Line: 31, Col: 1): Unexpected value 'variables'
```
@BruceForstall
Member Author

@ViktorHofer I was able to extract the setting of superpmi variables. I was not able to make it work to extract out the naming of the job (templates are very particular and annoyingly limited). PTAL.

Member

@ViktorHofer ViktorHofer left a comment

Thank you for reacting to my feedback and cleaning things up.

@BruceForstall
Member Author

The test failure in Depth1Test looks like a variant of #90777.

@BruceForstall
Member Author

Test in internal pipeline with these changes: https://dev.azure.com/dnceng/internal/_build/results?buildId=2255715&view=results

@BruceForstall BruceForstall merged commit 3ce2c88 into dotnet:main Aug 30, 2023
@BruceForstall BruceForstall deleted the SuperPmiLibsTestRunCollection3 branch August 30, 2023 17:54
BruceForstall added a commit to BruceForstall/runtime that referenced this pull request Aug 31, 2023
With dotnet#91101 we have a SuperPMI collection of the libraries tests being
run, so the existing PMI-based collection is somewhat duplicative:
remove the PMI collection.

It's arguable that the PMI collection of the libraries themselves is
also duplicative, but we can decide whether to remove that one separately.
BruceForstall added a commit that referenced this pull request Sep 6, 2023
With #91101 we have a SuperPMI collection of the libraries tests being
run, so the existing PMI-based collection is somewhat duplicative:
remove the PMI collection.

It's arguable that the PMI collection of the libraries themselves is
also duplicative, but we can decide whether to remove that one separately.
@ghost ghost locked as resolved and limited conversation to collaborators Sep 29, 2023