`IsCollinear`: detect and handle integer multiplication wrapping (version 2) #834

reunanen · 2024-05-10T20:51:45Z

This is similar to #832, but tries to avoid branching and division.

@here-abarany, what do you think?

here-abarany · 2024-05-10T21:05:18Z

Looks good to me.

Do note that the C# and Pascal implementations very likely suffer from the same issues.

reunanen · 2024-05-10T21:10:23Z

Do note that the C# and Pascal implementations very likely suffer from the same issues.

Yes. Hopefully easier to port this than the more complicated implementation.

reunanen · 2024-05-10T21:11:21Z

@AngusJohnson, what do you think of this alternative?

AngusJohnson · 2024-05-11T07:17:14Z

@AngusJohnson, what do you think of this alternative?

Juha, I definitely think we're getting much closer, and I really appreciate the effort you've made.

I was a little concerned that these changes would affect performance, but in my testing so far, it doesn't seem to have done so 👍.
And I was wondering if it would be better to avoid calling ProductIsEqual when none of the upper 32bits in a,b,c,d are set (which I believe would be likely for most users). But given that calling ProductIsEqual has negligible impact on performance, that additional check probably wouldn't be necessary.

So really the only (temporary) reservation I have is getting my head around the logic in CalculateCarry(). Thanks for the link there which is helping a little 😜.

reunanen · 2024-05-11T07:45:41Z

And I was wondering if it would be better to avoid calling ProductIsEqual when none of the upper 32bits in a,b,c,d are set (which I believe would be likely for most users). But given that calling ProductIsEqual has negligible impact on performance, that additional check probably wouldn't be necessary.

Yeah, and as pointed out by @here-abarany, that kind of branching may very well be more detrimental to performance than the additional instructions.

So really the only (temporary) reservation I have is getting my head around the logic in CalculateCarry(). Thanks for the link there which is helping a little 😜.

There are also alternative implementations out there, some of which are perhaps a bit clearer. I could try and find you some?

AngusJohnson · 2024-05-11T08:12:12Z

I could try and find you some?

I hope that won't be necessary 🤞. I just need a bit of time (and a clear head) to work it out 😁.
(I'm very reluctant to use code that I don't fully understand.)

Edit: Here's my working understanding of CalculateCarry() in pseudocode ...

to test for a * b overflow carry
given that a & b are type Int64
 
let:
  aHi = a & 0xFFFFFFFF00000000;
  aLo = a & 0xFFFFFFFF;
  bHi = b & 0xFFFFFFFF00000000;
  bLo = b & 0xFFFFFFFF;
then:  
  a * b == (aHi + aLo) * (bHi + bLo)
        == (aHi * bHi) + (aHi * bLo) + (aLo * bHi) + (aLo * bLo)
also:
  (aHi * bHi) must be > 64bits, so carry = ((aHi >> 32) * (bHi >> 32))
but:
  (aHi * bLo) + (aLo * bHi) + (aLo * bLo) may also (just) overflow 
  so if ((aHi * bLo) + (aLo * bHi) + (aLo * bLo) > MAX_INT64) ++carry
  or safely ... if ((aLo * bHi) + (aLo * bLo) > (MAX_INT64 - (aHi * bLo))) ++carry

reunanen · 2024-05-11T10:18:53Z

Oops, accidentally closed this... 🙈

AngusJohnson · 2024-05-11T10:33:24Z

I'll probably rewrite CalculateCarry so that I can understand what's happening there😁.
I just hope this doesn't break things again🤞.

inline uint64_t CalculateCarry(uint64_t a, uint64_t b)
{
  const uint64_t aHi = a & 0xFFFFFFFF00000000;
  const uint64_t aLo = a & 0xFFFFFFFF;
  const uint64_t bHi = b & 0xFFFFFFFF00000000;
  const uint64_t bLo = b & 0xFFFFFFFF;
  // a * b == (aHi + aLo) * (bHi + bLo)
  // a * b == (aHi * bHi) + (aHi * bLo) + (aLo * bHi) + (aLo * bLo)
  uint64_t carry = (aHi >> 32) * (bHi >> 32);
  // but since (aHi * bLo) + (aLo * bHi) + (aLo * bLo) may also (just) overflow 
  // do this (but safely) ... if ((aHi * bLo) + (aLo * bHi) + (aLo * bLo) > MAX_UINT64) ++carry
  if ((aLo * bHi) + (aLo * bLo) > (std::numeric_limits<uint64_t>::max() - (aHi * bLo))) ++carry;
  return carry;
}

reunanen · 2024-05-11T10:46:56Z

I just hope this doesn't break things again🤞.

For what it's worth: I noticed that there's already some serious test automation coverage – in the sense that whenever I made a mistake doing this, it was flagged by at least some (and often multiple) test cases. Which is great! 💪

Of course doesn't mean one should blindly rely on the tests only, but at least simple mistakes will probably be detected right away.

reunanen · 2024-05-11T10:48:43Z

// do safely ... if ((aHi * bLo) + (aLo * bHi) + (aLo * bLo) > MAX_INT64) ++carry

Just a nitpick: that should be MAX_UINT64, right?

reunanen · 2024-05-11T10:51:24Z

And yes, your version is very readable, now I think I understand this too 😅

You use redundant parentheses more than I would, but that's a matter of taste I guess.

AngusJohnson · 2024-05-11T11:42:18Z

You use redundant parentheses more than I would, but that's a matter of taste I guess.

I was deliberately heavy handed with parentheses so they would parallel the commented code logic above.

AngusJohnson · 2024-05-11T12:41:03Z

I'll probably rewrite CalculateCarry so that I can understand what's happening there😁. I just hope this doesn't break things again🤞.

inline uint64_t CalculateCarry(uint64_t a, uint64_t b)
{
  const uint64_t aHi = a & 0xFFFFFFFF00000000;
  const uint64_t aLo = a & 0xFFFFFFFF;
  const uint64_t bHi = b & 0xFFFFFFFF00000000;
  const uint64_t bLo = b & 0xFFFFFFFF;
  // a * b == (aHi + aLo) * (bHi + bLo)
  // a * b == (aHi * bHi) + (aHi * bLo) + (aLo * bHi) + (aLo * bLo)
  uint64_t carry = (aHi >> 32) * (bHi >> 32);
  // but since (aHi * bLo) + (aLo * bHi) + (aLo * bLo) may also (just) overflow 
  // do this (but safely) ... if ((aHi * bLo) + (aLo * bHi) + (aLo * bLo) > MAX_UINT64) ++carry
  if ((aLo * bHi) + (aLo * bLo) > (std::numeric_limits<uint64_t>::max() - (aHi * bLo))) ++carry;
  return carry;
}

Actually, looking again more closely, I don't think the code above is correct.
And if it is wrong then I believe I was somewhat mislead by the StackOverflow link provided.

I think the following code is correct (or almost correct), but it's bedtime again so I may not be thinking straight ...

  inline uint64_t CalcOverflowCarry(uint64_t a, uint64_t b)
  {
    const uint64_t aHi = a & 0xFFFFFFFF00000000;
    const uint64_t aLo = a & 0xFFFFFFFF;
    const uint64_t bHi = b & 0xFFFFFFFF00000000;
    const uint64_t bLo = b & 0xFFFFFFFF;
    // a * b == (aHi + aLo) * (bHi + bLo)
    // a * b == (aHi * bHi) + (aHi * bLo) + (aLo * bHi) + (aLo * bLo)
    return (aHi >> 32) * (bHi >> 32) + (((aHi >> 32) * bLo) >> 32) + (((bHi >> 32) * aLo) >> 32);
  }

reunanen · 2024-05-11T12:57:21Z

return (aHi >> 32) * (bHi >> 32) + ((aHi * bLo) >> 32) + ((bHi * aLo) >> 32);

Hmm but won't aHi * bLo and bHi * aLo possibly wrap here?

AngusJohnson · 2024-05-11T13:01:24Z

return (aHi >> 32) * (bHi >> 32) + ((aHi * bLo) >> 32) + ((bHi * aLo) >> 32);

Hmm but won't aHi * bLo and bHi * aLo possibly wrap here?

You just missed my edit. (See above.)
Anyhow, this is why I'm so keen to understand the underlying code.
I doesn't stop me making mistakes, but hopefully it helps me avoid making a lot more.

  inline uint64_t CalcOverflowCarry(uint64_t a, uint64_t b)
  {
    //const uint64_t aHi = a & 0xFFFFFFFF00000000;
    const uint64_t aHiShr32 = a >> 32;
    const uint64_t aLo = a & 0xFFFFFFFF;
    //const uint64_t bHi = b & 0xFFFFFFFF00000000;
    const uint64_t bHiShr32 = b >> 32;
    const uint64_t bLo = b & 0xFFFFFFFF;
    // a * b == (aHi + aLo) * (bHi + bLo)
    // a * b == (aHi * bHi) + (aHi * bLo) + (aLo * bHi) + (aLo * bLo)
    //carry = (aHi >> 32) * (bHi >> 32) + (((aHi >> 32) * bLo) >> 32) + (((bHi >> 32) * aLo) >> 32);
    return aHiShr32 * bHiShr32 + ((aHiShr32 * bLo) >> 32) + ((bHiShr32 * aLo) >> 32);
  }

tomwiel · 2024-05-11T17:30:08Z

Just in case, this one (Emulate64x64to128) looks quite simple too.

AngusJohnson · 2024-05-11T20:44:01Z

Thanks Tom. You've just reminded me of my own Int128Mul function and Int128 class in Clipper1. I'm very confident that Int128Mul there (line 354) performs correctly.

azrafe7 · 2024-05-11T21:27:35Z

Hi all, following the discussion and throwing in my two cents...
This one https://stackoverflow.com/a/1815371/1158913 seems to behave correctly for me, returning the expected carry.

I've checked against a naive implementation (carry128()) using uint_128 (BASELINE to ensure correct results - also tested in Python) and an adaptation of the solution in the link (multiply_carry()):

typedef unsigned __int128 uint128_t;
uint64_t carry128(uint64_t a, uint64_t b) {
    uint128_t a_128 = a;
    uint128_t res_128 = a_128 * b;
    uint64_t carry = (uint64_t) (res_128 >> 64);
    
    return carry;
}

uint64_t hi(uint64_t x) {
    return x >> 32;
}

uint64_t lo(uint64_t x) {
    return ((1ULL << 32) - 1) & x;
}

// https://stackoverflow.com/a/1815371/1158913
uint64_t multiply_carry(uint64_t a, uint64_t b) {
    // actually uint32_t would do, but the casting is annoying
    uint64_t s0, s1, s2, s3; 
    
    uint64_t x = lo(a) * lo(b);
    s0 = lo(x);
    
    x = hi(a) * lo(b) + hi(x);
    s1 = lo(x);
    s2 = hi(x);
    
    x = s1 + lo(a) * hi(b);
    s1 = lo(x);
    
    x = s2 + hi(a) * hi(b) + hi(x);
    s2 = lo(x);
    s3 = hi(x);
    
    uint64_t result = s1 << 32 | s0;
    uint64_t carry = s3 << 32 | s2;
    
    return carry;
}

Examples of testing carry returned from multiplication with random numbers:

TEST: 0x51eaed81157de061 * 0x3a271fb2745b6fe9
BASELINE carry128: 0x129bbebdfae0464e
CalculateCarry   : 0x129bbebdd0c2c2b2 WRONG
CalcOverflowCarry: 0x129bbebdfae0464e OK
multiply_carry   : 0x129bbebdfae0464e OK

TEST: 0xc2055706a62883fa * 0x26c78bc79c2322cc
BASELINE carry128: 0x1d640701d192519b
CalculateCarry   : 0x1d6407014210e7ab WRONG
CalcOverflowCarry: 0x1d640701d1925199 WRONG
multiply_carry   : 0x1d640701d192519b OK

More tests...

TEST: 0x874ddae32094b0de * 0x9b1559a06fdf83e0
BASELINE carry128: 0x51f76c49563e5bfe
CalculateCarry   : 0x51f76c490760b8e0 WRONG
CalcOverflowCarry: 0x51f76c49563e5bfd WRONG
multiply_carry   : 0x51f76c49563e5bfe OK

TEST: 0x81fb3ad3636ca900 * 0x239c000a982a8da4
BASELINE carry128: 0x12148e28207b83a3
CalculateCarry   : 0x12148e27c5644c3e WRONG
CalcOverflowCarry: 0x12148e28207b83a2 WRONG
multiply_carry   : 0x12148e28207b83a3 OK

TEST: 0x4be0b4c5d2725c44 * 0x990cd6db34a04c30
BASELINE carry128: 0x2d5d1a4183fd6165
CalculateCarry   : 0x2d5d1a40f6935288 WRONG
CalcOverflowCarry: 0x2d5d1a4183fd6164 WRONG
multiply_carry   : 0x2d5d1a4183fd6165 OK

TEST: 0x978ec0c0433c01f6 * 0x2df03d097966b536
BASELINE carry128: 0x1b3251d91fe272a5
CalculateCarry   : 0x1b3251d8cbf286c1 WRONG
CalcOverflowCarry: 0x1b3251d91fe272a4 WRONG
multiply_carry   : 0x1b3251d91fe272a5 OK

TEST: 0x49c5cbbcfd716344 * 0xc489e3b34b007ad3
BASELINE carry128: 0x38a32c74c8c191a4
CalculateCarry   : 0x38a32c73f0912874 WRONG
CalcOverflowCarry: 0x38a32c74c8c191a3 WRONG
multiply_carry   : 0x38a32c74c8c191a4 OK

TEST: 0xd3361cdbeed655d5 * 0x1240da41e324953a
BASELINE carry128: 0x0f0f4fa11e7e8f2a
CalculateCarry   : 0x0f0f4fa0520fd19c WRONG
CalcOverflowCarry: 0x0f0f4fa11e7e8f28 WRONG
multiply_carry   : 0x0f0f4fa11e7e8f2a OK

TEST: 0x51b854f8e71b0ae0 * 0x6f8d438aae530af5
BASELINE carry128: 0x239c04ee3c8cc248
CalculateCarry   : 0x239c04eda032b5b0 WRONG
CalcOverflowCarry: 0x239c04ee3c8cc247 WRONG
multiply_carry   : 0x239c04ee3c8cc248 OK

TEST: 0xbbecf7dbc6147480 * 0xbb0f73d0f82e2236
BASELINE carry128: 0x895170f4e9a216a7
CalculateCarry   : 0x895170f3a2b5c2f1 WRONG
CalcOverflowCarry: 0x895170f4e9a216a5 WRONG
multiply_carry   : 0x895170f4e9a216a7 OK

WRONG (of 10 tries): 
  CalculateCarry   : 10
  CalcOverflowCarry: 9
  multiply_carry   : 0

Hope it's helpful.

AngusJohnson · 2024-05-11T21:38:43Z

Thanks Giuseppe.
StackOverflow is a fantastic resource, but the code there isn't always reliable (even from contrubutors with very high reputations).
Anyhow, I've just updated clipper.core.h again with a revised CalcOverflowCarry function.

Otherwise a minor code tidy and updated code comments

Add test case that fails

b0610d6

reunanen force-pushed the fix-831-simplified branch from f0e2726 to ea35ce6 Compare May 10, 2024 20:56

IsCollinear: detect and handle integer multiplication wrapping

08b8437

reunanen force-pushed the fix-831-simplified branch from ea35ce6 to 08b8437 Compare May 10, 2024 20:58

reunanen closed this May 11, 2024

reunanen reopened this May 11, 2024

reunanen mentioned this pull request May 11, 2024

IsCollinear: detect and handle integer multiplication wrapping #832

Closed

AngusJohnson merged commit 6e4feb8 into AngusJohnson:main May 11, 2024
14 checks passed

reunanen deleted the fix-831-simplified branch May 11, 2024 10:40

AngusJohnson added a commit that referenced this pull request May 12, 2024

Updated IsCollinear functions in C# and Delphi (#834)

235879b

Otherwise a minor code tidy and updated code comments

reunanen mentioned this pull request May 12, 2024

More carry calculation fixing #835

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`IsCollinear`: detect and handle integer multiplication wrapping (version 2) #834

`IsCollinear`: detect and handle integer multiplication wrapping (version 2) #834

reunanen commented May 10, 2024

here-abarany commented May 10, 2024

reunanen commented May 10, 2024

reunanen commented May 10, 2024

AngusJohnson commented May 11, 2024

reunanen commented May 11, 2024

AngusJohnson commented May 11, 2024 •

edited

Loading

reunanen commented May 11, 2024

AngusJohnson commented May 11, 2024 •

edited

Loading

reunanen commented May 11, 2024

reunanen commented May 11, 2024

reunanen commented May 11, 2024

AngusJohnson commented May 11, 2024

AngusJohnson commented May 11, 2024 •

edited

Loading

reunanen commented May 11, 2024

AngusJohnson commented May 11, 2024 •

edited

Loading

tomwiel commented May 11, 2024

AngusJohnson commented May 11, 2024 •

edited

Loading

azrafe7 commented May 11, 2024 •

edited

Loading

AngusJohnson commented May 11, 2024

IsCollinear: detect and handle integer multiplication wrapping (version 2) #834

IsCollinear: detect and handle integer multiplication wrapping (version 2) #834

Conversation

reunanen commented May 10, 2024

here-abarany commented May 10, 2024

reunanen commented May 10, 2024

reunanen commented May 10, 2024

AngusJohnson commented May 11, 2024

reunanen commented May 11, 2024

AngusJohnson commented May 11, 2024 • edited Loading

reunanen commented May 11, 2024

AngusJohnson commented May 11, 2024 • edited Loading

reunanen commented May 11, 2024

reunanen commented May 11, 2024

reunanen commented May 11, 2024

AngusJohnson commented May 11, 2024

AngusJohnson commented May 11, 2024 • edited Loading

reunanen commented May 11, 2024

AngusJohnson commented May 11, 2024 • edited Loading

tomwiel commented May 11, 2024

AngusJohnson commented May 11, 2024 • edited Loading

azrafe7 commented May 11, 2024 • edited Loading

AngusJohnson commented May 11, 2024

`IsCollinear`: detect and handle integer multiplication wrapping (version 2) #834

`IsCollinear`: detect and handle integer multiplication wrapping (version 2) #834

AngusJohnson commented May 11, 2024 •

edited

Loading

AngusJohnson commented May 11, 2024 •

edited

Loading

AngusJohnson commented May 11, 2024 •

edited

Loading

AngusJohnson commented May 11, 2024 •

edited

Loading

AngusJohnson commented May 11, 2024 •

edited

Loading

azrafe7 commented May 11, 2024 •

edited

Loading