[RyuJIT] Assert failure "unexpected operand size" when folding LCL_FLD into Avx.BlendVariable #10640

Closed
fiigii opened this issue Jul 6, 2018 · 4 comments · Fixed by dotnet/coreclr#18849
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI bug

@fiigii
Contributor

fiigii commented Jul 6, 2018

The following assert failure happens

Assert failure(PID 2324 [0x00000914], Thread: 8032 [0x1f60]): Assertion failed '!"unexpected operand size"' in 'Packet256Tracer:GetNaturalColor(struct,struct,struct,struct,ref):struct:this' (IL size 830)

    File: d:\workspace\coreclr\src\jit\emitxarch.cpp Line: 10343
    Image: C:\Program Files\dotnet\dotnet.exe

when RyuJIT tries to generate the vblendvps below:

IN00cd:        vinsertf128 ymm8, ymm9, 1
Added IP mapping: 0x0095 STACK_EMPTY (G_M18437_IG06,ins#34,ofs#229)
Generating: N583 (  0,  3) [000182] ------------                 IL_OFFSET void   IL offset: 0x95 REG NA
Generating: N585 (  0,  0) [000178] ------------       t178 =    HWIntrinsic simd32 float SetZeroVector256 REG mm7 $626
IN00ce:        vxorps   ymm7, ymm7, ymm7
                                                             /--*  t178   simd32 
Generating: N587 (  0,  3) [000181] DA----------              *  STORE_LCL_VAR simd32 V17 loc10        d:3 mm7 REG mm7
							V17 in reg mm7 is becoming live  [000181]
							Live regs: 0000F0E8 {rbx rbp rsi rdi r12 r13 r14 r15 xmm8} => 0000F0E8 {rbx rbp rsi rdi r12 r13 r14 r15 xmm7 xmm8}
							Live vars: {V00 V01 V02 V03 V04 V05 V06 V07 V08 V14 V15 V21 V24 V25 V31} => {V00 V01 V02 V03 V04 V05 V06 V07 V08 V14 V15 V17 V21 V24 V25 V31}
Added IP mapping: 0x009C STACK_EMPTY (G_M18437_IG06,ins#35,ofs#234)
Generating: N589 (  8,  9) [000195] ------------                 IL_OFFSET void   IL offset: 0x9c REG NA
Generating: N591 (  1,  1) [000183] ------------       t183 =    LCL_VAR   simd32 V17 loc10        u:3 mm7 REG mm7 $7cb
Generating: N593 (  3,  4) [000184] -c----------       t184 =    LCL_FLD   simd32 V16 loc9         u:3[+0] Fseq[Xs] NA REG NA <l:$764, c:$765>
Generating: N595 (  1,  1) [000187] ------------       t187 =    LCL_VAR   simd32 V15 loc8         u:3 mm8 REG mm8 $6ff
                                                             /--*  t183   simd32 
                                                             +--*  t184   simd32 
                                                             +--*  t187   simd32 
Generating: N597 (  8,  9) [000191] ------------       t191 = *  HWIntrinsic simd32 float BlendVariable REG mm0 $627
IN00cf:        vblendvps ymm0, ymm7, [V16 rsp+1190H], ymm8

This bug was detected while working on https://github.com/dotnet/coreclr/issues/17798. .NET Core 2.1 works fine, so it seems to have been introduced by the recent containment change.

@tannergooding @CarolEidt

@fiigii
Contributor Author

fiigii commented Jul 6, 2018

The C# code being compiled:

    private ColorPacket256 GetNaturalColor(Vector256<int> things, VectorPacket256 pos, VectorPacket256 norms, VectorPacket256 rds, Scene scene)
    {
        var colors = ColorPacket256Helper.DefaultColor;
        for (int i = 0; i < scene.Lights.Length; i++)
        {
            var light = scene.Lights[i];
            var colorPacket = light.Color.ToColorPacket256();
            var lights = light.ToPacket256();
            var ldis = lights.Positions - pos;
            var livec = ldis.Normalize();
            var neatIsectDis = TestRay(new RayPacket256(pos, livec), scene);

            // is in shadow?
            var mask1 = Compare(neatIsectDis, ldis.Lengths, FloatComparisonMode.LessThanOrEqualOrderedNonSignaling);
            var mask2 = Compare(neatIsectDis, SetZeroVector256<float>(), FloatComparisonMode.NotEqualOrderedNonSignaling);
            var isInShadow = And(mask1, mask2);

            Vector256<float> illum = VectorPacket256.DotProduct(livec, norms);
            Vector256<float> illumGraterThanZero = Compare(illum, SetZeroVector256<float>(), FloatComparisonMode.GreaterThanOrderedNonSignaling);
            var tmpColor1 = illum * colorPacket;
            var defaultRGB = SetZeroVector256<float>();
            Vector256<float> lcolorR = BlendVariable(defaultRGB, tmpColor1.Xs, illumGraterThanZero);
            Vector256<float> lcolorG = BlendVariable(defaultRGB, tmpColor1.Ys, illumGraterThanZero);
            Vector256<float> lcolorB = BlendVariable(defaultRGB, tmpColor1.Zs, illumGraterThanZero);
            ColorPacket256 lcolor = new ColorPacket256(lcolorR, lcolorG, lcolorB);

            Vector256<float> specular = VectorPacket256.DotProduct(livec, rds.Normalize());
            Vector256<float> specularGraterThanZero = Compare(specular, SetZeroVector256<float>(), FloatComparisonMode.GreaterThanOrderedNonSignaling);

            var difColor = new ColorPacket256(1, 1, 1);
            var splColor = new ColorPacket256(1, 1, 1);
            var roughness = SetAllVector256<float>(1);

            for (int j = 0; j < scene.Things.Length; j++)
            {
                Vector256<float> thingMask = StaticCast<int, float>(CompareEqual(things, SetAllVector256<int>(j)));
                var rgh = SetAllVector256<float>(scene.Things[j].Surface.Roughness);
                var dif = scene.Things[j].Surface.Diffuse(pos);
                var spl = scene.Things[j].Surface.Specular(pos);

                roughness = BlendVariable(roughness, rgh, thingMask);

                difColor.Xs = BlendVariable(difColor.Xs, dif.Xs, thingMask);
                difColor.Ys = BlendVariable(difColor.Ys, dif.Ys, thingMask);
                difColor.Zs = BlendVariable(difColor.Zs, dif.Zs, thingMask);

                splColor.Xs = BlendVariable(splColor.Xs, spl.Xs, thingMask);
                splColor.Ys = BlendVariable(splColor.Ys, spl.Ys, thingMask);
                splColor.Zs = BlendVariable(splColor.Zs, spl.Zs, thingMask);
            }

            var tmpColor2 = VectorMath.Pow(specular, roughness) * colorPacket;
            Vector256<float> scolorR = BlendVariable(defaultRGB, tmpColor2.Xs, specularGraterThanZero);
            Vector256<float> scolorG = BlendVariable(defaultRGB, tmpColor2.Ys, specularGraterThanZero);
            Vector256<float> scolorB = BlendVariable(defaultRGB, tmpColor2.Zs, specularGraterThanZero);
            ColorPacket256 scolor = new ColorPacket256(scolorR, scolorG, scolorB);

            var oldColor = colors;

            colors = colors + ColorPacket256Helper.Times(difColor, lcolor) + ColorPacket256Helper.Times(splColor, scolor);

            colors = new ColorPacket256(BlendVariable(colors.Xs, oldColor.Xs, isInShadow), BlendVariable(colors.Ys, oldColor.Ys, isInShadow), BlendVariable(colors.Zs, oldColor.Zs, isInShadow));

        }
        return colors;
    }

@tannergooding
Member

@fiigii, do you have the full source code available somewhere so that I can validate locally?

@tannergooding
Member

tannergooding commented Jul 7, 2018

It looks like the failure has nothing to do with LCL_FLD, etc. It happens because the code in question uses one of the upper 8 XMM registers for the third source operand (ymm8 in the dump above), which causes the immediate value to be >= 128 and fail the "fits in byte" check.

The fix is to add the same casting logic that I added for the other immediate values in codegen.
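
As a rough illustration of the diagnosis above (a sketch, not the actual code in emitxarch.cpp): the four-operand VEX form of vblendvps carries its third source register in bits 7:4 of a trailing immediate byte, so any of xmm8 through xmm15 sets bit 7 and pushes the value to 128 or higher. The helper names below are hypothetical, and the signed-range test is only an assumption about what the "fits in byte" check looks like.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: place a register number (0-15) in bits 7:4 of the
// trailing immediate byte, as the four-operand VEX encoding requires.
constexpr int encodeRegAsImm(unsigned regNum)
{
    return static_cast<int>(regNum << 4);
}

// Assumed shape of a "fits in byte" check: the signed 8-bit range.
constexpr bool fitsInSignedByte(int value)
{
    return value >= -128 && value <= 127;
}

// Sketch of the described fix: cast to a signed byte before the check.
// The low 8 bits (the bits that are actually emitted) are unchanged.
// Two's complement wrap-around is guaranteed from C++20 and is what
// mainstream compilers do for earlier standards as well.
constexpr int castToSignedByte(int value)
{
    return static_cast<std::int8_t>(value);
}

// Walks through the xmm7 / xmm8 boundary that the dump above runs into.
inline void checkBlendOperandExamples()
{
    assert(fitsInSignedByte(encodeRegAsImm(7)));   // xmm7 -> 0x70, passes
    assert(!fitsInSignedByte(encodeRegAsImm(8)));  // xmm8 -> 0x80, fails
    assert(fitsInSignedByte(castToSignedByte(encodeRegAsImm(8))));
    assert((castToSignedByte(encodeRegAsImm(8)) & 0xFF) == 0x80);
}
```

With the cast in place, the value for xmm8 becomes -128, which passes the range check while still encoding as the byte 0x80.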

@tannergooding
Member

Fix is here: dotnet/coreclr#18820

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the 3.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 16, 2020