Implement more cases for getMaxBits #2879

MaxGraey · 2020-05-27T08:18:06Z

Complete the 64-bit cases in the range AddInt64 ... ShrSInt64.
Implement ExtendSInt32 and ExtendUInt32 for the unary cases.
For the binary cases, implement:
- AddInt32 / AddInt64
- MulInt32 / MulInt64
- RemUInt32 / RemUInt64
- RemSInt32 / RemSInt64
- DivUInt32 / DivUInt64
- DivSInt32 / DivSInt64
- and more

Also add more fast paths for some getMaxBits calculations.

kripken

This will need new testcases.

src/passes/OptimizeInstructions.cpp

kripken · 2020-05-29T22:48:31Z

Fuzz testcase (wasm-opt -O2 --fuzz-exec):

(module
 (type $none_=>_none (func))
 (type $i32_=>_none (func (param i32)))
 (type $none_=>_anyref (func (result anyref)))
 (import "fuzzing-support" "log-i32" (func $fimport$0 (param i32)))
 (memory $0 1 1)
 (export "func_35_invoker" (func $1))
 (func $0 (result anyref)
  (local $0 anyref)
  (drop
   (i32.load offset=4 align=2
    (i32.and
     (i32.rotr
      (i32.const 15)
      (i32.const 15)
     )
     (i32.const 15)
    )
   )
  )
  (local.get $0)
 )
 (func $1
  (drop
   (call $0)
  )
  (call $fimport$0
   (i32.const 0)
  )
 )
)

MaxGraey · 2020-05-29T23:27:25Z

Thanks! I haven't tested this PR yet. I still need to write unit tests first.

MaxGraey · 2020-06-06T08:38:36Z

I wonder whether we could significantly speed up all this computation by caching maxBits inside Expression and just recalculating getMaxBits after every optimization? 🤔

kripken · 2020-08-20T02:39:32Z

src/ir/bits.h

+          }
+          auto value = c->value.geti32();
+          auto maxBitsRight = 31 - Index(CountLeadingZeroes(value));
+          return std::max(Index(0), maxBitsLeft - maxBitsRight);


The type here looks wrong. Both maxBits values are unsigned, so their difference is unsigned too. If the difference would be negative, it becomes a very big positive number and "wins" in the max operation.
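To make the wrap-around concrete, here is a minimal sketch. It assumes that Index is a 32-bit unsigned integer; the two function names are hypothetical and exist only for this illustration. The signed variant is one possible fix, not necessarily the one that landed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumption for this sketch: Index is a 32-bit unsigned integer.
using Index = uint32_t;

// Same shape as the reviewed expression: the subtraction is performed in
// unsigned arithmetic, so a "negative" difference wraps to a huge value
// and std::max can never clamp it to zero.
Index clampedDiffUnsigned(Index maxBitsLeft, Index maxBitsRight) {
  return std::max(Index(0), maxBitsLeft - maxBitsRight);
}

// One possible fix (hypothetical): subtract in a signed type, then clamp.
Index clampedDiffSigned(Index maxBitsLeft, Index maxBitsRight) {
  int32_t diff = int32_t(maxBitsLeft) - int32_t(maxBitsRight);
  return Index(std::max(int32_t(0), diff));
}
```

With maxBitsLeft = 3 and maxBitsRight = 5, the unsigned version yields 4294967294 while the signed version yields 0. Both agree whenever the left operand is at least as large as the right one.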

Aside from the type it also looks wrong. We know that

left <= 2^maxBitsLeft
right <= 2^maxBitsRight

(from the definition of maxBits). But that doesn't imply

left / right <= 2^(maxBitsLeft - maxBitsRight)

The only inequality I can almost see how to prove is

left / right <= 2^maxBitsLeft

but even that isn't quite right, as right may be 0, even if maxBitsRight > 0. This is similar to the issue from before: maxBits is an upper bound.
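A small counterexample shows why the subtraction is unsound when maxBitsRight is only an upper bound. The sketch below checks the candidate inequality for concrete values; quotientBoundHolds is a hypothetical helper written for this illustration, using the same "<=" form as the inequalities above.

```cpp
#include <cassert>
#include <cstdint>

// Checks the candidate bound  left / right <= 2^(maxBitsLeft - maxBitsRight)
// for one concrete pair of values. The maxBits arguments are only upper
// bounds: each value merely has to satisfy value <= 2^maxBits.
bool quotientBoundHolds(uint32_t left, uint32_t right,
                        uint32_t maxBitsLeft, uint32_t maxBitsRight) {
  assert(right != 0);
  assert(maxBitsRight <= maxBitsLeft && maxBitsLeft <= 32);
  // Preconditions taken from the definition quoted in the thread.
  assert(uint64_t(left) <= (uint64_t(1) << maxBitsLeft));
  assert(uint64_t(right) <= (uint64_t(1) << maxBitsRight));
  uint64_t bound = uint64_t(1) << (maxBitsLeft - maxBitsRight);
  return uint64_t(left / right) <= bound;
}
```

Take left = 15 with maxBitsLeft = 4, and right = 1 with a reported maxBitsRight of 4 (a valid upper bound, since 1 <= 2^4). The candidate bound is 2^0 = 1, yet 15 / 1 = 15, so quotientBoundHolds(15, 1, 4, 4) returns false. With right = 8 the same bound happens to hold, which is why the problem is easy to miss.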

Hmm. Perhaps it's not so simple. I decided to check how Souper does this, and it looks more complex.

So it seems it uses:

if (getMaxBits(right) != 32)
  return min(32, 31 + getMinBits(left) - getMaxBits(right));
else
  return getMinBits(left);

> but even that isn't quite right, as right may be 0, even if maxBitsRight > 0. This is similar to the issue from before: maxBits is an upper bound.

That concern can easily be dismissed, because the current implementation calculates max bits for division only for x / C (a constant): https://github.com/WebAssembly/binaryen/pull/2879/files#diff-58b7cf13906e9b5a4ba6306aaea394aaR158. The right value is always a constant in my scenario. I don't try to calculate bits for the general case like x / y.

Oh, I see. Yes, if the right value is a const, then this is simpler. In that case, the variable name maxBitsRight is confusing - it's not the max bits, it's the actual number of bits.
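For the constant-divisor case, the bound can be derived directly: if x < 2^m and the constant c satisfies c >= 2^k with k = floor(log2(c)), then x / c <= x / 2^k < 2^(m - k), and the quotient is 0 when m < k. The following is a minimal sketch of that reasoning for unsigned 32-bit division. The function names are hypothetical and this is not the code that landed in the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// floor(log2(v)) for v != 0, computed by counting shifts.
uint32_t floorLog2(uint32_t v) {
  assert(v != 0);
  uint32_t r = 0;
  while (v >>= 1) {
    ++r;
  }
  return r;
}

// Upper bound on the number of bits of x / c, given that x < 2^maxBitsLeft
// and c is a known nonzero constant. The subtraction is done in a signed
// type so that a negative difference clamps to zero instead of wrapping.
uint32_t maxBitsOfUDivByConst(uint32_t maxBitsLeft, uint32_t c) {
  assert(c != 0);
  int32_t bits = int32_t(maxBitsLeft) - int32_t(floorLog2(c));
  return uint32_t(std::max(int32_t(0), bits));
}
```

For example, an 8-bit value divided by 5 needs at most 8 - 2 = 6 bits (255 / 5 = 51, which is below 64), and a 4-bit value divided by 16 is always 0.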

I refactored this and am now using bitsRight instead of maxBitsRight.

tlively

Since so much of the code is similar for the 32-bit and 64-bit versions of operations, it would be nice to deduplicate it somehow (but I'd be fine landing this without that improvement, too).

src/ir/bits.h

tlively · 2020-09-14T00:13:17Z

src/ir/bits.h

+          auto bitsRight =
+            32 - Index(CountLeadingZeroes(value - 1)); // ceiled Log2


Can you give some more intuition for why this is correct? Also, since it is not intuitive and appears multiple times, it would be good to extract it out into a separate helper function with a good explanatory comment.
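One way to see why the expression is a ceiled log2: if 2^(k-1) < v <= 2^k, then 2^(k-1) <= v - 1 < 2^k, so v - 1 occupies exactly k bits, and k is ceil(log2(v)). Subtracting the leading-zero count of v - 1 from the word size gives that bit length. The sketch below spells this out with a portable leading-zero loop; countLeadingZeros32 and ceilLog2 are hypothetical stand-ins for this illustration, and the sketch assumes the leading-zero count of 0 is 32.

```cpp
#include <cassert>
#include <cstdint>

// Portable count of leading zero bits in a 32-bit word; returns 32 for 0.
// Hypothetical stand-in for the CountLeadingZeroes helper used in the diff.
int countLeadingZeros32(uint32_t v) {
  int n = 0;
  for (uint32_t mask = 0x80000000u; mask != 0 && (v & mask) == 0; mask >>= 1) {
    ++n;
  }
  return n;
}

// ceil(log2(v)) for v >= 1: the bit length of v - 1.
//   v = 1 -> v - 1 = 0, no bits set -> 0
//   v = 8 -> v - 1 = 7 (0b111)      -> 3
//   v = 9 -> v - 1 = 8 (0b1000)     -> 4
int ceilLog2(uint32_t v) {
  assert(v != 0);
  return 32 - countLeadingZeros32(v - 1);
}
```

Exact powers of two map to their exponent, and every other value rounds up to the next exponent, which is the behavior a helper comment would need to document.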

src/ir/bits.h

tlively · 2020-09-14T00:19:26Z

src/ir/bits.h

-        return std::max(getMaxBits(binary->left, localInfoProvider),
-                        getMaxBits(binary->right, localInfoProvider));
+      case XorInt32: {
+        auto maxBits = getMaxBits(binary->right, localInfoProvider);


Is there reason to believe that checking the right side first will save more work than checking the left side first, or is it an arbitrary choice?

Yes, the right side could be cheaper due to canonicalization, which always forces a constant to the right. Also, x ^ -1 is quite a common case.
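The diff excerpt does not show the rest of the case, so the following is only a plausible reading of the idea: inspect the right operand first and stop early when it already needs all 32 bits. The sketch uses a toy expression type, not Binaryen's IR, and counts visited nodes to show the saving for x ^ -1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy expression node (NOT Binaryen's IR): a 32-bit constant or an xor.
struct Expr {
  bool isConst;
  uint32_t value;     // used when isConst is true
  const Expr* left;   // used when isConst is false
  const Expr* right;  // used when isConst is false
};

// Number of bits needed to represent v (0 for v == 0).
uint32_t bitLength(uint32_t v) {
  uint32_t n = 0;
  while (v != 0) {
    ++n;
    v >>= 1;
  }
  return n;
}

// Max bits of an xor tree, checking the right operand first.
// `visited` counts how many nodes were examined.
uint32_t maxBits(const Expr* e, int& visited) {
  ++visited;
  if (e->isConst) {
    return bitLength(e->value);
  }
  uint32_t bits = maxBits(e->right, visited);
  if (bits == 32) {
    return 32; // the result cannot get larger, so skip the left subtree
  }
  return std::max(bits, maxBits(e->left, visited));
}
```

For an expression of the form (a ^ b) ^ -1, only the outer node and the constant on the right are visited. Without the early exit, the whole left subtree would be walked as well.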

src/ir/bits.h


tlively · 2020-09-16T21:02:41Z

src/support/bits.h

+extern template int CeilLog2(uint32_t);
+extern template int CeilLog2(uint64_t);


It looks like it would be a lot simpler to just overload the CeilLog2 for uint32_t and uint64_t without using any templates. That probably applies to all these functions, but since they're already like that, this can be left for a follow up.

That was my first attempt, but templates in C++ are still a mystery to me sometimes =) I didn't manage to reconcile the bits.h header with this part of bits.cpp:

template<typename T> int CeilLog2(T v) { return sizeof(T) * 8 - CountLeadingZeroes(v - 1); }

Also, it is quite hard to restrict T to unsigned integer types only.

Yeah, no problem, let's leave cleaning that up to a follow-up PR.
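For the follow-up, both directions discussed here can be sketched briefly: plain overloads for uint32_t and uint64_t, or a template restricted with static_assert and std::is_unsigned. All names below are hypothetical, and the leading-zero loop is a portable stand-in rather than the project's helper.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: leading zero count for any unsigned integer type.
template<typename T> int countLeadingZeros(T v) {
  static_assert(std::is_unsigned<T>::value, "T must be unsigned");
  int n = 0;
  T mask = T(T(1) << (sizeof(T) * 8 - 1));
  while (mask != 0 && (v & mask) == 0) {
    ++n;
    mask >>= 1;
  }
  return n;
}

// Option 1: plain overloads, no template in the public interface.
int ceilLog2(uint32_t v) { return 32 - countLeadingZeros<uint32_t>(v - 1); }
int ceilLog2(uint64_t v) { return 64 - countLeadingZeros<uint64_t>(v - 1); }

// Option 2: a template that rejects signed types at compile time.
template<typename T> int ceilLog2T(T v) {
  static_assert(std::is_unsigned<T>::value, "T must be unsigned");
  return int(sizeof(T) * 8) - countLeadingZeros<T>(T(v - 1));
}
```

One trade-off of the overload version: a call such as ceilLog2(8) with a plain int literal is ambiguous, so callers have to pass a value of an explicit unsigned type, for example ceilLog2(uint32_t(8)). The template version names the type at the call site instead, as in ceilLog2T<uint64_t>(x), and ceilLog2T<int>(x) fails to compile.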

tlively

@MaxGraey is this all fuzzed and ready to land?

MaxGraey · 2020-09-16T22:02:20Z

Let me refuzz this

MaxGraey · 2020-09-17T00:22:05Z

Refuzzed:

Invocations so far:
   FuzzExec: 17393
   CompareVMs: 4822
   CheckDeterminism: 1509
   Wasm2JS: 4094
   Asyncify: 4400

ITERATION: 20444

But I want to fuzz more tomorrow.

tlively · 2020-09-17T00:46:25Z

Sounds good. Let me know when you're done fuzzing and we can land this.

MaxGraey · 2020-09-17T16:45:08Z

Additional fuzzing:

Invocations so far:
   FuzzExec: 37722
   CompareVMs: 10319
   CheckDeterminism: 3260
   Wasm2JS: 9071
   Asyncify: 9567

ITERATION: 44267

MaxGraey · 2020-09-17T16:46:15Z

I guess it's ready to land. But it would be great if somebody else also fuzzed it.

tlively · 2020-09-17T20:04:27Z

I think that sounds like a reasonable amount of fuzzing, so I'll merge this.


MaxGraey added 5 commits May 27, 2020 11:15

implement 64-bit cases for getMaxBits

ee08839

add ExtendSInt32 & ExtendUInt32 for unary cases

374c9b6

fix ExtendSInt32

3a0f72e

implement RemUInt32 / RemUInt64 for getMaxBits

091e15e

implement DivUInt(32/64) / DivSInt(32/64)

18a2aa2

MaxGraey changed the title ~~[WIP] Implement 64-bit cases for getMaxBits~~ Implement 64-bit cases for getMaxBits May 27, 2020

MaxGraey changed the title ~~Implement 64-bit cases for getMaxBits~~ Implement more cases for getMaxBits May 27, 2020

Implement RemSInt32 / RemSInt64

168fed4

kripken reviewed May 27, 2020


update according review

d04ae7d

MaxGraey changed the title ~~Implement more cases for getMaxBits~~ [WIP] Implement more cases for getMaxBits May 29, 2020

MaxGraey added 16 commits May 30, 2020 09:01

Merge branch 'master' into more-getmaxbits

c08376c

fix missing returns for rest cases

2dc792c

Merge branch 'master' into more-getmaxbits

56146d3

Merge branch 'master' into more-getmaxbits

3f68768

rearrange cases

36f73e5

calc getMaxBits also for multiply

c77d5d4

Merge branch 'master' into more-getmaxbits

d60b53e

lint

0bba5b6

generalize div / rem for all type dividers

f43c8db

add fast paths when divider or multiplier is zero

a6415e1

more optimizations

d979620

lint

6685e6d

revert generalizations for div / rem

7726f18

return max bits of left expr for udiv as default case

a39091a

Merge branch 'master' into more-getmaxbits

908eb96

return zero when lhs also zero for mul

df767f4

MaxGraey added 2 commits August 18, 2020 21:19

Merge branch 'master' into more-getmaxbits

790a05c

remove some zero checks

30b1842

kripken reviewed Aug 20, 2020


MaxGraey added 10 commits August 22, 2020 19:22

Merge branch 'master' into more-getmaxbits

341a3ef

add comments + refactorings

b91ee8c

Merge branch 'master' into more-getmaxbits

719d8c3

refactor DivSInt32|64 / DivUInt32|64

146c966

more tests and fixes

a74f10e

refactor

b4d614d

lint

9b2fe4d

more tests

162c94d

indent

8404d91

simplify

4be0956

tlively reviewed Sep 14, 2020


MaxGraey and others added 5 commits September 16, 2020 15:54

Merge branch 'master' into more-getmaxbits

ec62420

Update src/ir/bits.h

5fc687a

Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com>

Update src/ir/bits.h

0be9076

Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com>

add CeilLog2 util

33c9eab

lint

277d159

tlively reviewed Sep 16, 2020


tlively approved these changes Sep 16, 2020


tlively merged commit 2d47c0b into WebAssembly:master Sep 17, 2020

MaxGraey deleted the more-getmaxbits branch September 17, 2020 20:06

kripken added a commit that referenced this pull request Sep 22, 2020

ir/bits.h cleanups after #2879 (#3156)

0f9339d

Improve some comments, and remove fast paths that are just optimizations for compile time (code clarity matters more here).
