JIT: inconsistent CQ for div/mod by power of 2 idioms #11442

AndyAyersMS · 2018-11-10T18:18:42Z

Method	Mean	Error	StdDev	Ratio
DivAndMod	19.96 us	0.0407 us	0.0340 us	1.00
Div32Rem	16.27 us	0.0267 us	0.0237 us	0.81

Could be that these aren't exactly equivalent for negative values. But here i is non-negative.

public class BitArrayPerf
{
    private const int N = 10_000;
    private readonly int[] data;

    public BitArrayPerf()
    {
        data = new int[N];
        var rand = new Random(42);
        for (int i = 0; i < N; i++)
        {
            data[i] = rand.Next();
        }
    }

    [Benchmark(Baseline = true)]
    public int DivAndMod()
    {
        for (int i = 0; i < N; i++)
        {
            data[i / 32] |= (1 << (i % 32));
        }
        return data[N - 1];
    }

    [Benchmark]
    public int Div32Rem()
    {
        for (int i = 0; i < N; i++)
        {
            int quotient = Div32Rem(i, out int remainder);
            data[quotient] |= 1 << remainder;
        }
        return data[N - 1];
    }

    private static int Div32Rem(int number, out int remainder)
    {
        int quotient = number >> 5;   // Divide by 32.
        remainder = number - (quotient << 5);   // Multiply by 32.
        return quotient;
    }
}

category:cq
theme:basic-cq
skill-level:intermediate
cost:small
impact:small

The text was updated successfully, but these errors were encountered:

gfoidl · 2018-11-11T21:47:19Z

Although for the modulo a bit-operation (& (b - 1)) should be used.

private static int Div32Rem(int number, out int remainder)
{
    int quotient = number >> 5;     // Divide by 32.
    remainder = number & 31;        // Modulo 32
    return quotient;
}

as it's faster.

PS: I assume you are absolutely aware of this, but just that this is noted 😉

mikedn · 2018-11-11T21:58:08Z

Although for the modulo a bit-operation (& (b - 1)) should be used.

Yep, the JIT already does that for unsigned modulo.

And it's worth mentioning that it's faster:

Method	Mean	Error	StdDev	Scaled
DivAndMod	21.02 us	0.2705 us	0.2530 us	1.00
Div32Rem	16.82 us	0.1139 us	0.1009 us	0.80
Div32Rem_2	15.38 us	0.0987 us	0.0923 us	0.73

Not only that it's one operation less but it's also independent from the quotient computation so it can be done in parallel.

erozenfeld · 2018-11-13T23:56:11Z

@dotnet/jit-contrib

danmoseley · 2022-12-22T18:02:29Z

Yep, the JIT already does that for unsigned modulo.

I may be missing something, but does that mean that these should produce the same code?

 public static uint X(uint i)
    {
        return (uint)(i % ((long)uint.MaxValue + 1));
    }
    
    public static uint X2(uint i)
    {
        return (uint)(i % 0x1_0000_0000L);
    }    
    
    public static uint Y(uint i)
    {
        return (uint)i & uint.MaxValue;
    }

I get

XX.X(UInt32)
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: sub esp, 8
    L0006: push 0
    L0008: push ecx
    L0009: push 1
    L000b: push 0
    L000d: call 0x72042250
    L0012: mov [ebp-8], eax
    L0015: mov [ebp-4], edx
    L0018: mov eax, [ebp-8]
    L001b: mov esp, ebp
    L001d: pop ebp
    L001e: ret

XX.X2(UInt32)
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: sub esp, 8
    L0006: push 0
    L0008: push ecx
    L0009: push 1
    L000b: push 0
    L000d: call 0x72042250
    L0012: mov [ebp-8], eax
    L0015: mov [ebp-4], edx
    L0018: mov eax, [ebp-8]
    L001b: mov esp, ebp
    L001d: pop ebp
    L001e: ret

XX.Y(UInt32)
    L0000: mov eax, ecx
    L0002: and eax, 0xffffffff
    L0005: ret

Is that covered by this issue?

edit: I guess 32 bit only.
https://sharplab.io/#v2:C4LghgzgtgPgAgJgAQA0VIN4FgBQT9K4GE65wCMAbEgK4CWAdsKgBT1NJ0CURB2exAnADsSNo2BcWdJAFIxLADYB7BgHMu7YADoAsmAAeANTCKaAUyQBqJOS5cA3L3wBfZyUEVqW1AnEdud35BIVF/SWk5JAAGA3IAfWik6MTkgBlHdxdid3ckL1oJJABNcM4eAXxgkPywrS4ZADJCpj1DEzNzJ0qkNx7cFyA===
https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgAIANS8gbwFgAocl8p1txp4gRiXICuASwB2GKgAphY8kICU7Vg2YdWxAOzkpojHIlDyAUi0SANhBEBzOdIwA6ALLYEANWymBMcgGpyPOXIA3IosAL4hnKq8/LZUpNoy8hHKqmqaCbr6RuQADAg8APo5xTlFJQAyQRGhHBER5NGCOuQAmhmyCiosKakN6bZyBgBkTWKOzm4eMMFd5OGzTKFAA==

maybe it's low priority then.

TIHan · 2023-01-11T00:30:40Z

Conerning this example:

    public int DivAndMod()
    {
        for (int i = 0; i < N; i++)
        {
            data[i / 32] |= (1 << (i % 32));
        }
        return data[N - 1];
    }

This could actually do the i & (32 - 1) transformation in the JIT if we verify that i is never negative.
The only way for this to work is if you know the range which we do for N since it is a const of 10_000.

TIHan · 2023-01-28T01:01:48Z

Related: #12218

msftgits transferred this issue from dotnet/coreclr Jan 31, 2020

msftgits added this to the Future milestone Jan 31, 2020

BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020

BruceForstall removed the JitUntriaged CLR JIT issues needing additional triage label Nov 25, 2020

TIHan self-assigned this Oct 31, 2022

danmoseley mentioned this issue Dec 22, 2022

Implementation of Lemire's nearly divisionless method #79790

Merged

TIHan modified the milestones: Future, 8.0.0 Jan 17, 2023

TIHan removed their assignment Jan 31, 2023

JulieLeeMSFT assigned TIHan May 18, 2023

TIHan modified the milestones: 8.0.0, Future Jun 9, 2023

BruceForstall added the tenet-performance Performance related issue label Oct 15, 2024

BruceForstall unassigned TIHan Oct 15, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JIT: inconsistent CQ for div/mod by power of 2 idioms #11442

JIT: inconsistent CQ for div/mod by power of 2 idioms #11442

AndyAyersMS commented Nov 10, 2018 •

edited by BruceForstall

Loading

gfoidl commented Nov 11, 2018

mikedn commented Nov 11, 2018

erozenfeld commented Nov 13, 2018

danmoseley commented Dec 22, 2022 •

edited

Loading

TIHan commented Jan 11, 2023 •

edited

Loading

TIHan commented Jan 28, 2023

JIT: inconsistent CQ for div/mod by power of 2 idioms #11442

JIT: inconsistent CQ for div/mod by power of 2 idioms #11442

Comments

AndyAyersMS commented Nov 10, 2018 • edited by BruceForstall Loading

gfoidl commented Nov 11, 2018

mikedn commented Nov 11, 2018

erozenfeld commented Nov 13, 2018

danmoseley commented Dec 22, 2022 • edited Loading

TIHan commented Jan 11, 2023 • edited Loading

TIHan commented Jan 28, 2023

AndyAyersMS commented Nov 10, 2018 •

edited by BruceForstall

Loading

danmoseley commented Dec 22, 2022 •

edited

Loading

TIHan commented Jan 11, 2023 •

edited

Loading