JIT: Skip redundant AND masking in NarrowWithSaturation codegen #122898

laveeshb · 2026-01-05T21:05:43Z

NarrowWithSaturation already clamps its inputs via Min() internally, so the subsequent vpand that masks the result is redundant. This adds an inputsAlreadyClamped parameter to gtNewSimdNarrowNode so that callers like NarrowWithSaturation can skip the extra AND.

Approach suggested by @tannergooding in the issue.

Before:

vpminuw  xmm1, xmm0, xmmword ptr [...]
vpand    xmm1, xmm1, xmm0
vpminuw  xmm2, xmm0, xmmword ptr [...]
vpand    xmm0, xmm2, xmm0
vpackuswb xmm0, xmm1, xmm0

After:

vpminuw  xmm1, xmm0, xmmword ptr [...]
vpminuw  xmm0, xmm0, xmmword ptr [...]
vpackuswb xmm0, xmm1, xmm0
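
To see why the vpand can be dropped, here is a minimal scalar sketch of a single 16-bit lane being narrowed to a byte. It is only a model, not the JIT's code: the function names are hypothetical, and `pack_unsigned_saturate` assumes the documented vpackuswb behaviour (read the 16-bit lane as signed, saturate to 0..255).

```python
BYTE_MAX = 0xFF  # largest value representable in the narrowed (byte) lane


def pack_unsigned_saturate(lane: int) -> int:
    """Model one vpackuswb lane: read 16 bits as signed, saturate to 0..255."""
    signed = lane - 0x10000 if lane >= 0x8000 else lane
    return max(0, min(signed, BYTE_MAX))


def narrow_before(lane: int) -> int:
    """Sequence from the 'Before' listing: min, then and, then pack."""
    clamped = min(lane, BYTE_MAX)   # vpminuw
    masked = clamped & BYTE_MAX     # vpand: a no-op on an already clamped value
    return pack_unsigned_saturate(masked)


def narrow_after(lane: int) -> int:
    """Sequence from the 'After' listing: min, then pack."""
    clamped = min(lane, BYTE_MAX)   # vpminuw
    return pack_unsigned_saturate(clamped)


def all_lanes_agree() -> bool:
    """Exhaustively compare both sequences over every 16-bit lane value."""
    return all(
        narrow_before(x) == narrow_after(x) == min(x, BYTE_MAX)
        for x in range(0x10000)
    )
```

Once the lane has been clamped to 0..255, the mask cannot change any bits and the signed reinterpretation inside the pack cannot misread the value, so both sequences produce the same saturated byte for every input.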

ARM64 isn't affected: it uses AdvSimd intrinsics that handle this natively.

Tested with the System.Runtime.Intrinsics and System.Numerics.Vectors test suites, and added a disasm check test.

dotnet-policy-service · 2026-01-05T21:06:56Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copilot

Pull request overview

This PR optimizes the codegen for Vector128/256.NarrowWithSaturation by eliminating redundant AND masking instructions on x86/x64. The optimization recognizes that when inputs are already clamped to the target range by preceding Min/Max operations, the subsequent AND operations are unnecessary.

Key Changes:

- Added an optional inputsAlreadyClamped parameter to gtNewSimdNarrowNode to skip AND masking when inputs are guaranteed to be within target range
- Updated the NarrowWithSaturation codegen path to pass inputsAlreadyClamped=true after explicit clamping operations
- Added regression test with disasm checks to verify the optimization

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.


| File | Description |
| --- | --- |
| src/coreclr/jit/compiler.h | Added optional `inputsAlreadyClamped` parameter (default false) to `gtNewSimdNarrowNode` function signature |
| src/coreclr/jit/gentree.cpp | Implemented conditional logic to skip AND masking operations when `inputsAlreadyClamped` is true for four specific narrowing scenarios (TYP_UBYTE and TYP_USHORT paths in both AVX2 and SSE2) |
| src/coreclr/jit/hwintrinsicxarch.cpp | Modified `NarrowWithSaturation` implementation to pass `inputsAlreadyClamped=true` when calling `gtNewSimdNarrowNode` after explicit Min/Max clamping |
| src/tests/JIT/Regression/JitBlue/Runtime_116526/Runtime_116526.csproj | Added test project configuration with disasm checking enabled and JIT optimization settings |
| src/tests/JIT/Regression/JitBlue/Runtime_116526/Runtime_116526.cs | Added regression test with disasm assertions to verify `vpand` is not generated and functional tests to verify correctness for UInt16→Byte and UInt32→UShort narrowing |


Vector128/256.NarrowWithSaturation was generating redundant vpand instructions because gtNewSimdNarrowNode didn't know the inputs were already clamped to the target range by the preceding Min() operations. This change adds an optional parameter to gtNewSimdNarrowNode to skip the AND masking when the caller knows inputs are already in range.

The fix applies to x86/x64 only. ARM64 uses native AdvSimd instructions that don't have this issue.

Before:

vpminuw  xmm1, xmm0, xmmword ptr [...]
vpand    xmm1, xmm1, xmm0 ; redundant
vpminuw  xmm2, xmm0, xmmword ptr [...]
vpand    xmm0, xmm2, xmm0 ; redundant
vpackuswb xmm0, xmm1, xmm0

After:

vpminuw  xmm1, xmm0, xmmword ptr [...]
vpminuw  xmm0, xmm0, xmmword ptr [...]
vpackuswb xmm0, xmm1, xmm0

Copilot AI review requested due to automatic review settings January 5, 2026 21:05

github-actions bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jan 5, 2026

dotnet-policy-service bot added the community-contribution label (Indicates that the PR has been added by a community member) Jan 5, 2026

Copilot started reviewing on behalf of laveeshb January 5, 2026 21:06

Copilot AI reviewed Jan 5, 2026

laveeshb force-pushed the fix/narrow-with-saturation-codegen branch from bd48091 to ea91488 January 5, 2026 21:20

This was referenced Jan 6, 2026

[Android][CoreCLR] System.Security.Cryptography.Tests killed by lowmemorykiller #118603

Open

iOS.Device test WorkItemExecutions #122874

Open

saucecontrol reviewed Jan 6, 2026

src/tests/JIT/Regression/JitBlue/Runtime_116526/Runtime_116526.cs (outdated)

laveeshb force-pushed the fix/narrow-with-saturation-codegen branch from ea91488 to 431213e January 6, 2026 02:25
