Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Compare opt against zero involving a shift oper. #8461

Merged
merged 1 commit into from
Dec 6, 2016

Conversation

sivarv
Copy link
Member

@sivarv sivarv commented Dec 5, 2016

This is a regression introduced by PR #7787
That PR meant to optimize test instruction in case of GT_EQ/NE comparison against zero.

On xarch shift instructions (SAL/SAR/SHR/SHL) modify SF/ZF flags provided shift count is not zero. The following is a repro case whose very first iteration through the loop involves shift-right by zero shift count and hence the loop will not be executed since 'test' instruction is incorrectly optimized out.

           // Absolute bits
            int bitCount = 0;
            while ((0 != (100 >> bitCount)) && (31 > bitCount))
            {
                bitCount++;
            }
            // Sign bit
            bitCount++;

On xarch depending on the operand size, hardware narrows down shift count by masking upper bits. Flags are not modified if the resulting shift count is zero.

Fix: For now disabling compare opt peep for GT_EQ/NE(shift, 0).

Filed issue https://github.com/dotnet/coreclr/issues/8462 to track re-doing this optimization correctly.

Fixes #8460

@sivarv
Copy link
Member Author

sivarv commented Dec 5, 2016

@dotnet/jit-contrib - Please review this.

tree->gtFlags |= GTF_ZSF_SET;
// Codegen of shift oper sets ZF and SF flags provided shift count is not zero.
// Note that Rotate Left/Right instructions don't set ZF and SF flags.
if (tree->OperIsShift() && (shiftBy->gtIntConCommon.IconValue() != 0))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the condition needs to be tree->OperIsShift() && shiftBy->IsCnsIntOrI() && (shiftBy->gtIntConCommon.IconValue() != 0), as it would appear that shiftBy is not guaranteed to be a constant int.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it is in else part of the if condition on line 1167-1168. If that condition is false would imply that shiftBy is a constant >= 0 and <=255


In reply to: 90954305 [](ancestors = 90954305)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we haven't already narrowed the constant value to [0..63]

Then you probably need to to handle shifts of 64,128,192, etc..
As those values may also behave the same as a a shift of zero on the hardware.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if we have a 32-bit operation then 32, 96, etc.. need to be considered as zero.

Copy link
Member Author

@sivarv sivarv Dec 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right, depending on the size of the operand the shift-count gets narrowed by the hardware by masking upper bits. For e.g. while shifting 32-bit operands, upper 3 bits of shift count is masked by hardware to derive shift count. If the resulting shift count is zero, then it won't modify the flags.

Given the urgency of this fix, I am going to disable this optimization for shift operations with a CQ issue to implement correctly.

@briansull
Copy link

briansull commented Dec 5, 2016

*** Important ***

The Test case looks like it will issue a variable shift instruction and not use a constant shift with a zero immediate. So the entire optimization has to be disabled for variable shifts if the instruction doesn't set the flags properly when a zero value is used in ECX. (the variable amount to shift)

C#:

// Absolute bits
int bitCount = 0;
while ((0 != (100 >> bitCount)) && (31 > bitCount))
{
bitCount++;
}
// Sign bit
bitCount++;

if (tree->OperIsShift() && (shiftBy->gtIntConCommon.IconValue() != 0))
{
tree->gtFlags |= GTF_ZSF_SET;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you did disable this optimization for variable sized shifts.

It is surprising that there are no asm diffs for this change.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's rather surprising. Also, even though I'm guessing that the test case reflects the original bug case, it would be good to test the constant cases as well.


In reply to: 90961146 [](ancestors = 90961146)

@sivarv
Copy link
Member Author

sivarv commented Dec 5, 2016

@briansull @CarolEidt
For now I am disabling compare opt for GT_EQ/NE(shift, 0) case given the urgency of the fix. I am going to file a CQ issue to correctly fix this for shift operations.

@briansull
Copy link

briansull commented Dec 5, 2016

LGTM

@briansull
Copy link

AsmDiffs this time?

@sivarv
Copy link
Member Author

sivarv commented Dec 5, 2016

Yes I have already filed tracking issue (please see PR description above)

@sivarv
Copy link
Member Author

sivarv commented Dec 5, 2016

Yes, this time SPMI shows asm diffs when measured against CQPerf suite on desktop.

00.35:         <filename>                                            baseline        diff improvement  %improvement    #funcs
00.35:         CQ_PERF_Clean_Thin_Unique_all.dasm                        1545        1549          -4         -0.26         1
00.35:            1 functions regressed (4 total bytes regression)
00.35:               Regression # 1 goes from  1545 to  1549, diff of     4 ( 0.26%) :: Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.Lexer:LexXmlDocCommentLeadingTrivia(byref):this

Here are the diffs across all desktop Framework assemblies:

02.70:         <filename>                                            baseline        diff improvement  %improvement    #funcs
02.70:         All_FX_Clean_Thin_Unique_all.dasm                         2370        2382         -12         -0.51         5
02.70:            5 functions regressed (12 total bytes regression)
02.70:               Regression # 1 goes from    70 to    72, diff of     2 ( 2.78%) :: System.Diagnostics.TraceListener:set_TraceOutputOptions(int):this
02.70:               Regression # 2 goes from   113 to   115, diff of     2 ( 1.74%) :: System.Web.UI.Control:UpdateNamingContainer(ref):this
02.70:               Regression # 3 goes from   733 to   737, diff of     4 ( 0.54%) :: System.Web.UI.Control:get_EffectiveClientIDMode():int:this
02.70:               Regression # 4 goes from   591 to   593, diff of     2 ( 0.34%) :: System.Windows.Forms.ControlPaint:CreateHBitmapTransparencyMask(ref):long
02.70:               Regression # 5 goes from   863 to   865, diff of     2 ( 0.23%) :: System.Text.RegularExpressions.Regex:.ctor(ref,int,struct,bool):this
02.70:         TOTAL                                                     2370        2382         -12         -0.51         5

Note that this compare opt is looking for GT_EQ/NE(shift, 0) pattern for optimizing 'test' instruction. It is quite possible that this IR pattern doesn't occur that commonly.

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sivarv sivarv merged commit 1c8d3d9 into dotnet:master Dec 6, 2016
@sivarv sivarv deleted the shiftFix branch December 6, 2016 00:28
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Compare opt against zero involving a shift oper.

Commit migrated from dotnet/coreclr@1c8d3d9
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants