Skip to content

[AArch64] multiply-by-parts idiom could be recognized #60200

@rose00

Description

@rose00

reproducer: https://gcc.godbolt.org/z/xsTP44xvn

The two functions compute the same result, a 64x64-to-128 bit unsigned multiply. The portable version (that avoids __int128) does fold to good code like the non-portable one on either x86 or aarch64.

The portable version is modeled on:

https://github.com/Cyan4973/xxHash/blob/8e5fdcbe70687573265b7154515567ee7ca0645c/xxh3.h#L338

Since mul and umulh are cheap, there is a special advantage to
recognizing and simplify portable code that performs these
operations by parts using 32-to-64-bit multiplies.

For comparison, similar simplifications target rbit and ror.
https://gcc.godbolt.org/z/Ghb3rfP5W

Also popcnt is combined into the intrinsic from the portable code.

The existing implementations are in places like
InstCombine/InstCombineAndOrXor.cpp
and
AggressiveInstCombine/AggressiveInstCombine.cpp.

Handling the present use case would require recognition that a 128-bit multiply is happening, but based only on observations of 64-bit values. Both 64-bit values, built up from 32-bit temporary values, would get replaced by a full 128-bit combo value, which would then be split up again (or not).

It seems to me that the "bit provenance" algorithm which recognizes bit-reverses might possible be adaptable to a "scaled term" provenance algorithm that would enable a pattern matcher to verify that a nest of shift/multiply/mask nodes actually did add up to 128 bits of product.

This is my first bug report. While I do realize that "missed optimizations" are not really bugs, I did not find in the user guide a way to make a record of suggestions like this. I hope this is actionable, perhaps after moving into whatever bucket such things should actually go into.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions