Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IsCollinear: detect and handle integer multiplication wrapping #832

Closed
wants to merge 6 commits into from

Conversation

reunanen
Copy link
Contributor

@reunanen reunanen commented May 9, 2024

Resolves #831.

@reunanen reunanen marked this pull request as ready for review May 9, 2024 07:07

if (MultiplyWrap::IsDifferentForSure(ab, cd))
{
return false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If plain integer multiplication gives a different result, then the true result is different, even if there's wrapping.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming these comments are to explain what's going on for the review, perhaps it's an indication that they should be code comments instead to make them more permanent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was thinking of that too. Maybe. Or even better: the code should be obvious even without the comments. Not sure if the comments are needed as it is now – thought that if they are not necessary, then they are that much less harmful and less noisy when they are here and not in the source file. What do you think? Genuine question.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a play by play isn't necessarily required, but a high-level explanation is probably worth having. Looking at MultiplyWrap takes a bit of parsing through the interface and reasoning through the math to figure out what it's intended to do. Similarly, checks and operations in this function require that baseline knowledge to understand what's going on.

If there's a top-level commenting explaining what MultiplyWrap does at a conceptual level and how it's used in combination with the overflow computation that follows, then the specifics of the code will probably become self-evident.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your feedback, much appreciated.

Honestly I don't really love the name MultiplyWrap. Think that just renaming it to something better might already help a bit.

Copy link

@here-abarany here-abarany May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe SignedMultiply?

Edit: Something like OverflowMultiply may also be a good choice if you move CalculateCarry() into this class and have it be more self-contained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like OverflowMultiply may also be a good choice if you move CalculateCarry() into this class and have it be more self-contained.

The main reason I didn't move CalculateCarry() there just yet is that it's kind of uint64_t specific, when MultiplyWrap is more naturally uintmax_t. Yea I know in practice they are the same...


if (!ab.IsWrap() && !cd.IsWrap())
{
return true;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there isn't wrapping, then we can take a shortcut.

}

const auto carry_ab = CalculateCarry(a, b);
const auto carry_cd = CalculateCarry(c, d);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calculate carry only if really needed (at least one result wrapped, and the potentially wrapped multiplication results are equal).

private:
void Init(intmax_t value, uintmax_t& abs_value)
{
if (value >= 0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps bit math could be used here to avoid conditionals that may degrade performance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, was thinking of that too, but wasn't sure if it'd be premature optimization at this point.


static bool IsDifferentForSure(const MultiplyWrap& a, const MultiplyWrap& b)
{
if (a.abs_result != b.abs_result) return true;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth looking at the disassembly to verify that these multiple if statements get optimized out, otherwise they could be combined into a single return value. For example, the following return statement should be equivalent:

return a.abs_result != b.abs_result || (a.abs_result != 0 && a.sign != b.sign);

(I only check a.abs_result != 0 since that portion would only get executed if a.abs_result == b.abs_result)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Noticed that as well, but decided for the time being to leave the redundant b.abs_result != 0 part for clarity and symmetricity.

Might try to improve on these fronts in the near future. Thanks again for the feedback!

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi again Juha, and thank you for your help with this tricky issue.

    bool IsWrap() const
    {
      return abs_a != 0 && abs_result / abs_a != abs_b;
    }

I'm trying without success to get my head around the code logic above, though I am just waking up 🤣.
Could you explain how this tests for presumably overflow wrap?
And ISTM that there's a potential divide by zero and, if so, how this is handled in C++?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect a divide by zero to be handled by abs_a != 0 && short-circuiting.

It seems to be detecting overflow by switching around the terms of the formula to make sure it gets the expected result. i.e.

  • abs_result = abs_a * abs_b
  • abs_result / abs_a = abs_b

This code assumes that abs_result / abs_a != abs_b would always be true when overflow occurred. On the surface that seems like it would be true, but considering that integer division would throw away the remainder I'm not completely sure at how robust that would be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect a divide by zero to be handled by abs_a != 0 && short-circuiting.

Correct.

This code assumes that abs_result / abs_a != abs_b would always be true when overflow occurred. On the surface that seems like it would be true, but considering that integer division would throw away the remainder I'm not completely sure at how robust that would be.

Hmm, I see what you mean (I think). Proving that this (commonly used, I believe) method is correct is discussed e.g. here, but right now I can neither confirm or deny that the proof itself is correct. You didn't have a counterexample in mind, did you? 🙂

Copy link

@here-abarany here-abarany May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a counter-example, or any proof that it doesn't work, it was mainly a concern I had. After going through the proof you linked many times and comparing to other related posts, I'm pretty sure I understand it and accept it as correct, so I'd say that concern was invalid.

That said, between the divisions, and conditional, and required short-circuit to avoid a CPU exception for divide by zero, it's very possible that in practice performing this early-out may actually be slower than always computing the carry bits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's very possible that in practice performing this early-out may actually be slower than always computing the carry bits

Good point. Here, Angus mentioned that using int128 is slow, and I figured calculating the carry like this is probably even slower. But that of course doesn't mean that evaluating the conditionals or the division is fast.

Does anybody have any real-life example(s) where the perf impact of these operations matters? I'm afraid that if I create some benchmark for testing this, I'll be optimizing for some completely irrelevant situation (and pessimizing the real-world scenarios while at it).

Maybe, for simplicity, I could just remove the short-circuits from here? Until it's proven they are beneficial to begin with? Then there's at least no need to document them, or find a better name for MultiplyWrap for that matter.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be in favor of removing the short-circuit for overflow. If it's both more complicated and is questionable if it's faster, I think leaving them out unless proven beneficial otherwise is the better option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Will look into it.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect a divide by zero to be handled by abs_a != 0 && short-circuiting.

Indeed. I'll use my "just waking up" as the excuse 🤣.

@tomwiel
Copy link

tomwiel commented May 11, 2024

About safe multiplication: Starting with C23, C has ckd_mul():
if (ckd_mul(&res, a, b)) { res is wrapped result} else { res is correct result}
Other languages would need the same.

@reunanen reunanen closed this May 11, 2024
@reunanen
Copy link
Contributor Author

Closed this PR in favor of #834.

@reunanen reunanen deleted the fix-831 branch May 11, 2024 10:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regression when removing type casting in IsCollinear()
4 participants