Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove undefined behavior and add stricter checks in console arg parsing #3613

Merged
merged 2 commits into from
Feb 6, 2020

Conversation

kunaltyagi
Copy link
Member

@kunaltyagi kunaltyagi commented Jan 31, 2020

Started with a wish to reduce duplication, but grew into a monster than has better semantics, stricter bound checking and fewer UB along with less duplicated code.

The original implementation had several silent failures (on edge cases) such as

  • use of atoi and atol
    • returns 0 on failure to parse (wrong input)
    • no bounds checked for unsigned int (negative numbers became large positive numbers)
  • non-handling of edge case where sizeof (int) == sizeof (long int) for unsigned int

Step 1: Use strtoX

Usage of strtoX allowed error handling, but needed to be separated in a template-function to prevent duplication of logic. 3 sub-cases were needed for minimal overhead:

  • double, float, long int were simple enough
  • integral types (int, unsigned int) needed bounds checked
  • boolean was a simple indirection

Step 2: Check bounds for integral types

It was still possible to have just 2 simple functions, but

  • It would have required duplicating the checks (which I already messed up once) apart from being non-intuitive. So I introduced a template-function hop

Step 3: Now do it in cross-platform manner

This worked for LP64, but failed on LLP64. Current solution should work with all 64-bit memory models.

  • The edge case on LLP64 required the jumping through TMP hoops to get the right type. This is a clean-ish solution (another method was using std::conditional all over, which I didn't like).
  • Using MPL was a bit overkill for this situation, even though the solution would have been a tad bit cleaner
  • Using if constexpr wasn't an option in C++14

Final implementation

100 lines which implement:

  • a generic function to encapsulate C style errorno checking
  • a generic function to encapsulate integral bounds checking
  • TMP to find the correct type for various 64 bit system implementations

And @larshg was nice enough to debug the platform dependent issue. Wouldn't have caught it for a few more days without him

@kunaltyagi
Copy link
Member Author

I'll remove the use of atoi and atof functions with strtol and strtod (which aren't UB footguns) to have better semantics.

common/src/parse.cpp Outdated Show resolved Hide resolved
common/src/parse.cpp Outdated Show resolved Hide resolved
common/src/parse.cpp Outdated Show resolved Hide resolved
test/common/test_parse.cpp Outdated Show resolved Hide resolved
test/common/test_parse.cpp Outdated Show resolved Hide resolved
test/common/test_parse.cpp Show resolved Hide resolved
@kunaltyagi kunaltyagi force-pushed the dup1 branch 4 times, most recently from f6739e7 to 5078619 Compare February 4, 2020 13:08
Copy link
Member

@SergioRAgostinho SergioRAgostinho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides my really minor comment, LGTM. Up to you do decide to address it or not. Feel free to merge after.

common/src/parse.cpp Outdated Show resolved Hide resolved
Copy link
Member

@SergioRAgostinho SergioRAgostinho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized the CI is red :')

@larshg
Copy link
Contributor

larshg commented Feb 6, 2020

Hi @kunaltyagi

The cast of max to long int gives -1:
image

I tried to change it to long unsigned int - but that didn't work well with the -314 case :)

@kunaltyagi
Copy link
Member Author

The cast of max to long int gives -1:

That's a nice find. I should have seen it coming. Dang it Windows. At least it tells me I need to change my SFINAE. Thanks for the hint @larshg

@larshg
Copy link
Contributor

larshg commented Feb 6, 2020

What does it return linux/unix side ? Max of singed int? ie. 2,147,483,647 or?

@kunaltyagi
Copy link
Member Author

On my system, sizeof(int) < sizeof(long int).

@larshg
Copy link
Contributor

larshg commented Feb 6, 2020

Ahh, yeah its equal here.
According to this, the use of long long int should do it. And the test passes here.

@kunaltyagi
Copy link
Member Author

Oh well, now I can spend atleast 1 PR strengthening the parse.cpp itself

common/src/parse.cpp Outdated Show resolved Hide resolved
common/src/parse.cpp Outdated Show resolved Hide resolved
@kunaltyagi kunaltyagi changed the title Removing duplicated logic Removing undefined behavior and stricter checks in console arg parsing Feb 6, 2020
@kunaltyagi
Copy link
Member Author

@larshg Since you have a Windows PC, could you check the latest push? It compiles for me without issues.

@larshg
Copy link
Contributor

larshg commented Feb 6, 2020

👍 yes, it compiles and unit tests passes.

@kunaltyagi kunaltyagi added the needs: code review Specify why not closed/merged yet label Feb 6, 2020
Copy link
Member

@SergioRAgostinho SergioRAgostinho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first sight this feels slightly over engineered :') Thank you for chiming in @larshg

@kunaltyagi
Copy link
Member Author

kunaltyagi commented Feb 6, 2020

this feels slightly over engineered

Yeah, that's true. Net result: ~300 lines added. Positive side: 200 lines are tests.

The original implementation had several silent failures (on edge cases) such as

  • use of atoi and atol
    • returns 0 on failure to parse (wrong input)
    • no bounds checked for unsigned int (negative numbers became large positive numbers)
  • non-handling of edge case where sizeof (int) == sizeof (long int) for unsigned int

Step 1: Use strtoX

Usage of strtoX allowed error handling, but needed to be separated in a template-function to prevent duplication of logic. 3 sub-cases were needed for minimal overhead:

  • double, float, long int were simple enough
  • integral types (int, unsigned int) needed bounds checked
  • boolean was a simple indirection

Step 2: Check bounds for integral types

It was still possible to have just 2 simple functions, but

  • It would have required duplicating the checks (which I already messed up once) apart from being non-intuitive. So I introduced a template-function hop

Step 3: Now do it in cross-platform manner

This worked for LP64, but failed on LLP64. Current solution should work with all 64-bit memory models.

  • The edge case on LLP64 required the jumping through TMP hoops to get the right type. This is a clean-ish solution (another method was using std::conditional all over, which I didn't like).
  • Using MPL was a bit overkill for this situation, even though the solution would have been a tad bit cleaner
  • Using if constexpr wasn't an option in C++14

Final implementation

100 lines which implement:

  • a generic function to encapsulate C style errorno checking
  • a generic function to encapsulate integral bounds checking
  • TMP to find the correct type for various 64 bit system implementations

And @larshg was nice enough to debug the platform dependent issue. Wouldn't have caught it for a few more days without him

@SergioRAgostinho
Copy link
Member

I just copied your explanation to the OP. Future people will appreciate it. :)

@SergioRAgostinho SergioRAgostinho merged commit 8888816 into PointCloudLibrary:master Feb 6, 2020
@kunaltyagi kunaltyagi deleted the dup1 branch February 6, 2020 16:15
@taketwo taketwo removed the needs: code review Specify why not closed/merged yet label Feb 6, 2020
@taketwo taketwo changed the title Removing undefined behavior and stricter checks in console arg parsing Remove undefined behavior and add stricter checks in console arg parsing Mar 18, 2020
@taketwo taketwo added the changelog: enhancement Meta-information for changelog generation label Mar 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog: enhancement Meta-information for changelog generation module: common
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants