@StephanTLavavej

After 5.5 years, fixes #820.

Commits

  • New-AzGalleryImageDefinition doesn't need the original Publisher/Offer/Sku.
    • We're storing a custom image in a newly created Azure Compute Gallery. While we have to label it with a Publisher, Offer, and Sku, they don't have to match the original ones we used from the Azure Marketplace. They also don't have to be globally unique (and indeed, previously they weren't), since they're specific to this gallery. We could name them distinctly (embedding the resource group name), but since we never have to refer to them again, I figured that using invariant 'StlPublisher'/'StlOffer'/'StlSku' was simplest.
    • I have to do this because we're not going to have an original Publisher/Offer/Sku for ARM64, at least for now.
  • Add an Arch parameter to provision-image.ps1.
    • This allows create-1es-hosted-pool.ps1 to tell provision-image.ps1 what to do.
    • My original attempt of having provision-image.ps1 inspect the PROCESSOR_ARCHITECTURE environment variable didn't work for a curious reason. When executed in New PowerShell, this accurately reports ARM64. However, in a Command Prompt or Old PowerShell, PROCESSOR_ARCHITECTURE is reported as x86. (And provision-image.ps1 is executed in Old PowerShell, because it's what installs New PowerShell.) I still don't fully understand why this happens (I suspect it's a compatibility layer), but it's easy enough to avoid.
    • I compare $Arch with -ieq (case-insensitive equality) since that's how the parameter's ValidateSet works.
  • We're never going back to Spot priority.
    • The past is in the past! Let it go!
  • Fix vsDevCmdBat by hardcoding "C:\Program Files".
    • Similarly, the ProgramFiles environment variable was expanding to "C:\Program Files (x86)" in Command Prompts and Old PowerShell (see environment variable investigation below), but VS correctly installs itself to "C:\Program Files". There's nothing to be gained from this environment variable parameterization, since it's only used for the CI machines, whose environment we fully control.
    • I looked into switching our Command Prompt scripts to New PowerShell, but that's difficult because there isn't a PowerShell-friendly way to put the build tools on the path, especially for ARM64-hosted builds. Perhaps in the future, but not now. Fortunately, during the VS 2026 upgrade I had centralized this path (in Toolset update: MSVC Compiler 19.50 Preview 1, Clang 20 #5717), so it only needs to be updated in one location.
  • Skip numeric.limits.members/traps.pass.cpp for Clang.
    • This is the one runtime test issue encountered, an unexpected pass for ARM64. It's caused by interactions between Clang and MSVC's predefined macros, the test's macro inspection, and our probably-bogus product code. Skip it until we can properly fix the root cause.
    • We recently got MSVC-internal ARM64 runtime test coverage, which is why we were almost entirely clean. This one was missed because it had been marked as :2 FAIL (i.e. expected failure for Clang). The MSVC-internal test harness will outright skip anything that's mentioned for any reason, so it was never running this test for ARM64 and therefore never discovered that it was unexpectedly passing.
  • PowerShell 7.5.4.
    • May as well update it, not needed for anything though. Not regenerating the x64 pool at this time.
  • Add the ability to create ARM64 pools.
    • Lots of fun here, see below.
  • Use the ARM64 pool to build and run tests natively.
    • We now have two separate 1ES Hosted Pools. poolName is the classic x64 pool (at this time I saw no value in renaming it to x64PoolName). The new arm64PoolName is separately prepared, and takes care of the ARM64-hosted checks.
    • azure-pipelines.yml then uses the ARM64 pool for 'Build and Test ARM64'. We set the hostArch to arm64 (yay for previously parameterizing this), and we drop testsBuildOnly: true so we'll perform normal execution of the tests.
    • The Standard_D32ps_v6 SKU (Azure Cobalt 100) is delightfully fast, only a bit slower than Standard_F32as_v6 which is non-SMT AMD Zen 4. I found that using 10 shards instead of the default 8 results in roughly the test execution time that we're used to.
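
As a minimal sketch of the Arch parameter described in the commits above: the function name Get-ArchDescription and its return strings are hypothetical and do not come from provision-image.ps1. Only the 'x64'/'arm64' values, the use of ValidateSet, and the -ieq comparison are taken from the description.

```powershell
# Hypothetical helper for illustration; not the actual contents of provision-image.ps1.
function Get-ArchDescription {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory)]
        [ValidateSet('x64', 'arm64')] # ValidateSet ignores case by default, so 'ARM64' is accepted.
        [string]$Arch
    )

    # -ieq is explicitly case-insensitive, matching what ValidateSet lets through.
    if ($Arch -ieq 'arm64') {
        return 'ARM64 provisioning path'
    }

    return 'x64 provisioning path'
}
```

Calling `Get-ArchDescription -Arch 'ARM64'` takes the ARM64 branch, while a value outside the set (for example 'x86') is rejected by parameter validation before the function body runs.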

How to create ARM64 pools

In create-1es-hosted-pool.ps1, the Arch is now a mandatory parameter instead of hardcoded. Again I compare it case-insensitively with -ieq. For x64, we do everything the same (redundantly adding one bit of info at the end). For ARM64, we select Standard_D32ps_v6 (32-core Azure Cobalt 100 without local storage). I'm using 32 VMs, half the size of our x64 pool. And I'm using 2025-datacenter-azure-edition-arm64 via a Direct Shared Gallery, given to me by the Azure team.

When calling New-AzVMConfig, I have to set the DiskControllerType to 'SCSI' because 'NVMe' is not supported for this SKU. (I expect this will change with some future SKU, avoiding the need for this variation, but I have no info and couldn't share if I did.) I pass -SecurityType 'TrustedLaunch' to silence a notification message about Trusted Launch being available, since this SKU and image do support Trusted Launch. Finally, I pass -SharedGalleryImageId, which is the only way I found to use this Direct Shared Gallery along with our other options.

This supersedes the need to call Set-AzVMSourceImage, which I've commented.
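
The ARM64 arguments described above could be collected for splatting roughly like this. This is a sketch, not the code in create-1es-hosted-pool.ps1: the function name Get-Arm64VmConfigArgs is hypothetical, the VM name and image ID are caller-supplied placeholders, and it assumes an Az.Compute version whose New-AzVMConfig accepts all five parameters.

```powershell
# Hypothetical helper for illustration; it only builds the argument table and calls no Azure cmdlet.
function Get-Arm64VmConfigArgs {
    Param(
        [Parameter(Mandatory)][string]$VmName,
        [Parameter(Mandatory)][string]$SharedGalleryImageId # Placeholder; the real ID is not given in this PR.
    )

    return @{
        VMName               = $VmName
        VMSize               = 'Standard_D32ps_v6' # 32-core Azure Cobalt 100 without local storage.
        DiskControllerType   = 'SCSI'              # 'NVMe' is not supported for this SKU.
        SecurityType         = 'TrustedLaunch'     # Silences the Trusted Launch notification.
        SharedGalleryImageId = $SharedGalleryImageId
    }
}

# Intended use (requires the Az.Compute module and an Azure login):
#   $vmArgs = Get-Arm64VmConfigArgs -VmName $VmName -SharedGalleryImageId $ImageId
#   $VM = New-AzVMConfig @vmArgs
```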

When setting up the New-AzGalleryImageDefinition, I again need to handle the NVMe vs. SCSI variation. We were already setting Trusted Launch, so no variation there. (At some point, possibly in the past, Trusted Launch By Default might make that setting unnecessary.)

Finally, New-AzGalleryImageDefinition defaults to x64, and will fail if asked to store an ARM64 image without explicitly being told so. Our $Arch varies in the correct way ('x64' vs. 'arm64') so I'm directly passing that as the Architecture.
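
A rough sketch of the labeling and architecture arguments for New-AzGalleryImageDefinition follows. The function name Get-GalleryImageDefinitionArgs is hypothetical, and the sketch deliberately omits the NVMe vs. SCSI and Trusted Launch settings, along with the resource group, gallery, and location arguments, none of which are spelled out here.

```powershell
# Hypothetical helper for illustration; it only builds the argument table and calls no Azure cmdlet.
function Get-GalleryImageDefinitionArgs {
    Param(
        [Parameter(Mandatory)]
        [ValidateSet('x64', 'arm64')]
        [string]$Arch
    )

    return @{
        Publisher    = 'StlPublisher' # Invariant labels; they only need to be meaningful within this gallery.
        Offer        = 'StlOffer'
        Sku          = 'StlSku'
        Architecture = $Arch          # Passed through directly; the cmdlet defaults to x64 otherwise.
    }
}

# Intended use (requires the Az.Compute module and an Azure login; other arguments omitted):
#   $galleryArgs = Get-GalleryImageDefinitionArgs -Arch $Arch
#   New-AzGalleryImageDefinition @galleryArgs ...
```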

Seems simple? Only because I burned an incredible amount of time over the weekend figuring out this exact sequence of incantations from instructions that only vaguely gestured at them!

Environment Variable Investigation

Command Prompt

"C:\Windows\system32\cmd.exe"
ProgramFiles: C:\Program Files (x86)
PROCESSOR_ARCHITECTURE: x86
PROCESSOR_IDENTIFIER: ARMv8 (64-bit) Family 8 Model D49 Revision   0, MICROSOFT CORPORATION

Old PowerShell (5.1.26100.6899)

"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"
ProgramFiles: C:\Program Files (x86)
PROCESSOR_ARCHITECTURE: x86
PROCESSOR_IDENTIFIER: ARMv8 (64-bit) Family 8 Model D49 Revision   0, MICROSOFT CORPORATION

New PowerShell (7.5.3)

"C:\Program Files\PowerShell\7\pwsh.exe"
ProgramFiles: C:\Program Files
PROCESSOR_ARCHITECTURE: ARM64
PROCESSOR_IDENTIFIER: ARMv8 (64-bit) Family 8 Model D49 Revision   0, MICROSOFT CORPORATION
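
One possible explanation, which this PR does not confirm, is that the Command Prompt and Old PowerShell sessions were 32-bit processes running under WOW64. In such processes Windows reports PROCESSOR_ARCHITECTURE as x86 and exposes the native architecture in PROCESSOR_ARCHITEW6432. Under that assumption, a hypothetical helper (not part of the PR's scripts) could recover the native architecture like this:

```powershell
# Hypothetical helper for illustration. The parameters default to the current process's
# environment, and can be overridden to try out values like those recorded above.
function Get-NativeArchitecture {
    Param(
        [string]$ProcessorArchitecture = $env:PROCESSOR_ARCHITECTURE,
        [string]$ProcessorArchiteW6432 = $env:PROCESSOR_ARCHITEW6432
    )

    # WOW64 sets PROCESSOR_ARCHITEW6432 only for 32-bit processes on 64-bit Windows.
    if (-not [string]::IsNullOrEmpty($ProcessorArchiteW6432)) {
        return $ProcessorArchiteW6432
    }

    # Native processes have no PROCESSOR_ARCHITEW6432, so PROCESSOR_ARCHITECTURE is already accurate.
    return $ProcessorArchitecture
}
```

The explicit Arch parameter that this PR adds avoids the question entirely, which is simpler than relying on a second environment variable.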


StephanTLavavej requested a review from a team as a code owner on October 28, 2025 05:16
StephanTLavavej added the infrastructure label (Related to repository automation) on Oct 28, 2025
StephanTLavavej added the test (Related to test code) and ARM64 (Related to the ARM64 architecture) labels on Oct 28, 2025
github-project-automation bot moved this to Initial Review in STL Code Reviews on Oct 28, 2025
StephanTLavavej moved this from Initial Review to Final Review in STL Code Reviews on Oct 28, 2025
StephanTLavavej moved this from Final Review to Ready To Merge in STL Code Reviews on Oct 28, 2025

@StephanTLavavej

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews on Oct 28, 2025
StephanTLavavej merged commit e21d834 into microsoft:main on Oct 29, 2025
41 checks passed
github-project-automation bot moved this from Merging to Done in STL Code Reviews on Oct 29, 2025
StephanTLavavej deleted the arm64 branch on October 29, 2025 16:02

Development

Successfully merging this pull request may close these issues.

Run tests for ARM64