Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement more cases for getMaxBits #2879

Merged
merged 88 commits into from
Sep 17, 2020

Conversation

MaxGraey
Copy link
Contributor

@MaxGraey MaxGraey commented May 27, 2020

  • Complete 64-bit cases in range AddInt64 ... ShrSInt64

  • ExtendSInt32 and ExtendUInt32 for unary cases

  • For binary cases

    • AddInt32 / AddInt64
    • MulInt32 / MulInt64
    • RemUInt32 / RemUInt64
    • RemSInt32 / RemSInt64
    • DivUInt32 / DivUInt64
    • DivSInt32 / DivSInt64
    • and more

Also more fast paths for some getMaxBits calculations

@MaxGraey MaxGraey changed the title [WIP] Implement 64-bit cases for getMaxBits Implement 64-bit cases for getMaxBits May 27, 2020
@MaxGraey MaxGraey changed the title Implement 64-bit cases for getMaxBits Implement more cases for getMaxBits May 27, 2020
Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need new testcases.

src/passes/OptimizeInstructions.cpp Outdated Show resolved Hide resolved
src/passes/OptimizeInstructions.cpp Outdated Show resolved Hide resolved
src/passes/OptimizeInstructions.cpp Outdated Show resolved Hide resolved
@kripken
Copy link
Member

kripken commented May 29, 2020

Fuzz testcase (wasm-opt -O2 --fuzz-exec):

(module
 (type $none_=>_none (func))
 (type $i32_=>_none (func (param i32)))
 (type $none_=>_anyref (func (result anyref)))
 (import "fuzzing-support" "log-i32" (func $fimport$0 (param i32)))
 (memory $0 1 1)
 (export "func_35_invoker" (func $1))
 (func $0 (result anyref)
  (local $0 anyref)
  (drop
   (i32.load offset=4 align=2
    (i32.and
     (i32.rotr
      (i32.const 15)
      (i32.const 15)
     )
     (i32.const 15)
    )
   )
  )
  (local.get $0)
 )
 (func $1
  (drop
   (call $0)
  )
  (call $fimport$0
   (i32.const 0)
  )
 )
)

@MaxGraey
Copy link
Contributor Author

MaxGraey commented May 29, 2020

Thanks! I'm not yet tested this PR. Still need write unit tests at first

@MaxGraey MaxGraey changed the title Implement more cases for getMaxBits [WIP] Implement more cases for getMaxBits May 29, 2020
@MaxGraey
Copy link
Contributor Author

MaxGraey commented Jun 6, 2020

I'm wondering could we significantly speedup all this computation if cache maxBits inside Expression and just recalc getMaxBits after every optimization? 🤔

src/ir/bits.h Outdated
}
auto value = c->value.geti32();
auto maxBitsRight = 31 - Index(CountLeadingZeroes(value));
return std::max(Index(0), maxBitsLeft - maxBitsRight);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type here looks wrong. the maxBits are both unsigned, so the difference is unsigned too. So if it would be negative, it will be a very big positive number, and "win" in the max operation.

Aside from the type it also looks wrong. We know that

left <= 2^maxBitsLeft
right <= 2^maxBitsRight

(from the definition of maxBits). but that doesn't imply

left / right <= 2^(maxBitsLeft - maxBitsRight)

The only inequality I can almost see how to prove is

left / right <= 2^maxBitsLeft

but even that isn't quite right, as right may be 0, even if maxBitsRight > 0. This is similar to the issue from before: maxBits is an upper bound.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Perhaps it's not so simple. Decide check how do this in souper and it's look more complex

Copy link
Contributor Author

@MaxGraey MaxGraey Aug 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems there uses

if (getMaxBits(right) != 32)
  return min(32, 31 + getMinBits(left) - getMaxBits(right))
else
  return getMinBits(left);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but even that isn't quite right, as right may be 0, even if maxBitsRight > 0. This is similar to the issue from before: maxBits is an upper bound.

It's easily could be rejected due to current implementation do max bits div calcs only for x / C(onst): https://github.com/WebAssembly/binaryen/pull/2879/files#diff-58b7cf13906e9b5a4ba6306aaea394aaR158. Right value always const in my scenario. I don't try calc bits for general case like x / y

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. Yes, if the right value is a const, then this is simpler. In that case, the variable name maxBitsRight is confusing - it's not the max bits, it's the actual number of bits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored and now using bitsRight instead maxBitsRight

Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since so much of the code is similar for the 32-bit and 64-bit versions of operations, it would be nice to deduplicate it somehow (but I'd be fine landing this without that improvement, too).

src/ir/bits.h Outdated Show resolved Hide resolved
src/ir/bits.h Outdated
Comment on lines 185 to 186
auto bitsRight =
32 - Index(CountLeadingZeroes(value - 1)); // ceiled Log2
Copy link
Member

@tlively tlively Sep 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give some more intuition for why this is correct? Also, since it is not intuitive and appears multiple times, it would be good to extract it out into a separate helper function with a good explanatory comment.

src/ir/bits.h Outdated Show resolved Hide resolved
return std::max(getMaxBits(binary->left, localInfoProvider),
getMaxBits(binary->right, localInfoProvider));
case XorInt32: {
auto maxBits = getMaxBits(binary->right, localInfoProvider);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there reason to believe that checking the right side first will save more work than checking the left side first, or is it an arbitrary choice?

Copy link
Contributor Author

@MaxGraey MaxGraey Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, right side could be cheaper due to canonization which always force constant on the right. Also x ^ -1 is quite often case.

src/ir/bits.h Show resolved Hide resolved
MaxGraey and others added 5 commits September 16, 2020 15:54
Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com>
Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com>
Comment on lines +56 to +57
extern template int CeilLog2(uint32_t);
extern template int CeilLog2(uint64_t);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it would be a lot simpler to just overload the CeilLog2 for uint32_t and uint64_t without using any templates. That probably applies to all these functions, but since they're already like that, this can be left for a follow up.

Copy link
Contributor Author

@MaxGraey MaxGraey Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was my first attempt, but templates in C++ still a mystery to me sometimes=) I didn’t manage to make friends with bits.h header this bits.cpp part:

template<typename T>
int CeilLog2(T v) {
  return sizeof(T) * 8 - CountLeadingZeroes(v - 1);
}

Also it quite hard to restrict T to unsigned integer types only.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, no problem, let's leave cleaning that up to a follow-up PR.

Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaxGraey is this all fuzzed and ready to land?

@MaxGraey
Copy link
Contributor Author

Let me refuzz this

@MaxGraey
Copy link
Contributor Author

Refuzzed:

Invocations so far:
   FuzzExec: 17393
   CompareVMs: 4822
   CheckDeterminism: 1509
   Wasm2JS: 4094
   Asyncify: 4400

ITERATION: 20444

But I want fuzz more tomorrow

@tlively
Copy link
Member

tlively commented Sep 17, 2020

Sounds good. Let me know when you're done fuzzing and we can land this.

@MaxGraey
Copy link
Contributor Author

Additional fuzzing:

Invocations so far:
   FuzzExec: 37722
   CompareVMs: 10319
   CheckDeterminism: 3260
   Wasm2JS: 9071
   Asyncify: 9567

ITERATION: 44267

@MaxGraey
Copy link
Contributor Author

I guess it ready for landing. But will be great if somebody also fuzz it

@tlively
Copy link
Member

tlively commented Sep 17, 2020

I think that sounds like a reasonable amount of fuzzing, so I'll merge this.

@tlively tlively merged commit 2d47c0b into WebAssembly:master Sep 17, 2020
@MaxGraey MaxGraey deleted the more-getmaxbits branch September 17, 2020 20:06
kripken added a commit that referenced this pull request Sep 22, 2020
Improve some comments, and remove fast paths that are just optimizations for
compile time (code clarity matters more here).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants