Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Examples of dangerous bugs and diagnostics catching them #40

Closed
springmeyer opened this issue Feb 20, 2017 · 7 comments
Closed

Examples of dangerous bugs and diagnostics catching them #40

springmeyer opened this issue Feb 20, 2017 · 7 comments

Comments

@springmeyer
Copy link
Contributor

Context

Currently the scope of skel is best practices through working examples.

However in #29 we are experimenting with writing sample code that exhibits "common performance problems" like thread contention along with code that demonstrates idealized cpu usage for async code. The motivation is that:

  • the scenarios (good perf and bad) have more value and are easier to explain in direction comparison
  • a problem like thread contention is quietly sinister: you can't see it, there is no obvious crash or error. Just a slow program.

So, we plan to document the tell-tale signs of things like thread contention when seen via a profiler.

Demonstrating diagnostics to catch hard to see bugs

Similar to thread contention that may be difficult to diagnose without specialized methods like function level profiling, there is a class of dangerous bugs that can cause silent memory corruption and are not often detectable without advanced methods.

Therefore, we could consider adding example code (perhaps first in an advanced branch) that contains intentional dangerous bugs that corrupt memory. Then write documentation for how to detect them.

Examples/ideas:

@springmeyer
Copy link
Contributor Author

further idea: write a function that uses v8 objects and does not have a handlescope and leaks therefore. Ensure that the leak checker run on travis catches the leak.

@springmeyer
Copy link
Contributor Author

springmeyer commented Aug 1, 2017

This ticket is blue sky and most valuable for thinking about what is possible. However @mapsam over in mapbox/hpp-skel#17 (comment) just did a quick test in hpp-skel to add a bug in a throwaway branch to confirm manually that the sanitizers are working. We should do the same for node-cpp-skel here, just to quickly confirm they are working. @mapsam want to pair with @GretaCB on this in a free moment?

  • Create a throwaway branch
  • Add a bug to the code (either a memory leak or undefined behavior or both), ideally inside the threadpool function
  • Ensure the correct sanitizer catches it.

@GretaCB
Copy link
Contributor

GretaCB commented Aug 8, 2017

@springmeyer I created a throwaway branch and removed the Handlescopes to trigger undefined behaviour within the threadpool. Looks like Travis successfully failed with the undefined sanitizer 🎉

@springmeyer
Copy link
Contributor Author

Looks like Travis successfully failed with the undefined sanitizer 🎉

👍 Per chat, this proves that things are working. Which is great. So, going forward we'll be confident that if we make a coding mistake in node-cpp-skel (or a project based on it) we should have help from the address and undefined sanitizer.

Note: The error that was thrown in your branch actually is one that was suppressed in the master branch. It is coming from the vptr part of the undefined behavior sanitizer and arose when you added -faddress=undefined since that overrode the -fno-sanitize=vptr,function at

echo 'export MASON_SANITIZE_CXXFLAGS="${MASON_SANITIZE} -fno-sanitize=vptr,function -fsanitize-address-use-after-scope -fno-omit-frame-pointer -fno-common"' >> ${config}
. So, this still proves things are working, but did not actually detect a problem due to the code changes (only your .travis.yml change: https://github.com/mapbox/node-cpp-skel/compare/sanitize#diff-354f30a63fb0907d4ad57269548329e3R79).

@springmeyer
Copy link
Contributor Author

@GretaCB - noticed your last commit definitely triggered the use-after-free error, as we hoped! 🎉

The sanitizer output is verbose, but the key thing to note is the type of error heap-use-after-free and the line it was encountered: hello_async.cpp:130:17. That means line 130 and 17 characters in, which points exactly to the bug x[5]; at 663df40...4df31d7#diff-032fe0b90870cc3a55a6326685d9adbcR130

=================================================================
==4355==ERROR: AddressSanitizer: heap-use-after-free on address 0x60700000a825 at pc 0x7fb32b9d0926 bp 0x7fb329d38b50 sp 0x7fb329d38b48
READ of size 1 at 0x60700000a825 thread T8
    #0 0x7fb32b9d0925 in object_async::do_expensive_work(bool, std::string const&) /home/travis/build/mapbox/node-cpp-skel/build/../src/object_async/hello_async.cpp:130:17

at https://travis-ci.org/mapbox/node-cpp-skel/jobs/262424521#L664

@GretaCB
Copy link
Contributor

GretaCB commented Aug 9, 2017

Gave a couple more bugs a try in the sanitizer:

AddressSanitizer: attempting free on address which was not malloc()-ed: 0x60700000a821 in thread T8
LeakSanitizer: detected memory leaks

@springmeyer
Copy link
Contributor Author

Per chat - this ^^ is great news. I now propose pausing on this quest. There are ideas in the description of even more types of problems we could try to trigger (to see if the sanitizers catch the errors). But for the sake of time, probably best to use node-cpp-skel as a place to build-back checks when we hit real world problems. And not put more effort now into trying to simulate errors. I'll therefore close this and re-open new issues if/when we:

  • a need to add more sanitizers
  • hit a narly bug in production, find a way to detect it/prevent it, and want to build that learning back here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants