Initial implementation of tests with sanitizers #21
My concern is that I have basically zero experience with sanitizers, so I'd be rather helpless when something goes wrong with them. Would it be possible to isolate these tests more from each other? Like, have a |
esac

# run the tests (some also without validation, to exercise those code paths in Miri)
The comments here still refer to Miri? Also running on other targets will not work so easily with sanitizers.
Or alternatively, would it make sense to look into having this as part of rustc CI? |
I wasn’t expecting a review quite so fast 😂 I just started a PR so I could see CI results, so a lot of the comments aren’t updated. Do you mean as a periodic task in rust-lang/rust? I suppose that would be an option too, but I don’t know how that would interact with everything else if these tests fail. I don’t really know how likely failures are, this will be interesting. If you’d prefer it in a separate repo that’s fine too of course. It just seemed that with a lot of overlap, having one place to make any setup changes is easier than two. Regarding failures, I honestly don’t know what to expect with this and I don’t know who maintains the implementation. This came out of a suggestion to dogfood the sanitizers feature since it may head to stabilization soon. So I’ll fix the directories, and just consider this borrowing your CI for a bit until someone else can chime in :)
Periodic, or even with each PR, not sure how far we want to go.
How big is the overlap in the end? All of the scripts that run the actual tests are separate, right? The overlap is in the crate setup that lets us invoke I'd say it depends on how much the parts that are different can be isolated, so that I don't have to worry about the sanitizer part when I need to patch the Miri part. If that can be done, then I'm fine with sharing the parts that can be shared. Also someone should feel in charge of keeping this working so I can ping them in case of trouble. Would you be that someone? |
We should probably rename the project if we land this.^^ miri-test-libstd was anyway not a great name since it also tests libcore and more... |
You can only compile with one sanitizer at a time and we supposedly support like 7,
Yeah, that's about it. It really makes no difference whether it is here or elsewhere, it just seemed maybe nice to keep the similar structure together (but it's also not running yet so that could change).
I wouldn't mind being that someone, but I'm also not a team member. I'm sure there's at least one person on the compiler team that wouldn't mind being a fallback, I'll ask around once I (hopefully) get this working.
If it winds up that we have different repos, I'm voting to name this one SANity check :) It will be pretty interesting to see how the results of this all compare to Miri. I suspect Miri catches a lot more, but it's interesting that some of the sanitizers can see through to the C side as well (have to figure that bit out yet...)
Finally just told it to mark everything as a pass so we'd see actual results. Initial group of failures looks pretty repetitive:
I'm sure a lot of those could be false positives, just need to do a bit of chasing. Everything passed |
Hm. My biggest concern at the moment is that the backtraces aren't useful, I'm not confident in debugging from CI without good backtraces. Are we somehow compiling without debuginfo? |
I guess llvm-symbolizer needed to be installed for better output, the first result is now https://github.com/rust-lang/miri-test-libstd/actions/runs/6434877064/job/17474985110?pr=21#step:4:69. Still not that easy to pinpoint but better, if I'm reading that right it seems like maybe it's yelling at something inside the panic handler? |
At least the tryreserve test can be explained https://github.com/rust-lang/rust/blob/64fa0c34d7cb1a2d522414ab2c87024e465bd613/library/alloc/tests/vec.rs#L1641 |
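For readers unfamiliar with this kind of test, here is a minimal sketch of what a fallible-reservation check looks like. It is not the linked test itself; `reserve_bytes` is a hypothetical helper introduced only for illustration. The point is that such tests deliberately request absurd capacities and expect an `Err`, which a sanitizer's allocator may treat as a fatal error unless it is allowed to return null.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: ask for `additional` bytes of capacity and
/// report failure as a value instead of aborting the process.
fn reserve_bytes(additional: usize) -> Result<usize, TryReserveError> {
    let mut v: Vec<u8> = Vec::new();
    v.try_reserve(additional)?;
    Ok(v.capacity())
}

fn main() {
    // A small request is expected to succeed.
    assert!(reserve_bytes(16).is_ok());

    // usize::MAX bytes exceeds the largest valid allocation size, so
    // this is rejected as a capacity overflow.
    assert!(reserve_bytes(usize::MAX).is_err());

    // A request this large is a valid size but is expected to fail
    // once it reaches the allocator. The result is printed rather than
    // asserted because it depends on the platform and on the allocator
    // in use.
    let huge = reserve_bytes(isize::MAX as usize);
    println!("huge reservation failed: {}", huge.is_err());
}
```

The last call is the interesting one for sanitizer runs, since the oversized request goes through whatever allocator the build uses.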
Yep, in my experience it's pretty common that you need to permit allocator failure due to tests like this. The MSan backtraces look like this:
The paths suggest we've somehow linked together artifacts from the local build and the precompiled standard library from rustup. MSan false positives are pretty common when you only instrument part of the program, which sure looks like is happening here. It wouldn't surprise me if straightening out whatever is causing this fixes all the other errors. |
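One common way to avoid partial instrumentation is to rebuild the standard library with the same sanitizer flag as the rest of the program, using one build per sanitizer. The sketch below only assembles the command lines and does not run them. The helper name `sanitizer_invocation`, the short sanitizer list, and the target triple are assumptions for illustration, and this repository's scripts may need a different invocation.

```rust
/// Illustrative subset of the names accepted by `-Zsanitizer=`.
const SANITIZERS: [&str; 3] = ["address", "memory", "thread"];

/// Hypothetical helper: returns the RUSTFLAGS value and the cargo
/// arguments for a single sanitizer run. Each run uses exactly one
/// sanitizer, and `-Zbuild-std` asks cargo to rebuild the standard
/// library so that it is instrumented too.
fn sanitizer_invocation(sanitizer: &str, target: &str) -> (String, Vec<String>) {
    let rustflags = format!("-Zsanitizer={}", sanitizer);
    let args = vec![
        "+nightly".to_string(),
        "test".to_string(),
        "-Zbuild-std".to_string(),
        "--target".to_string(),
        target.to_string(),
    ];
    (rustflags, args)
}

fn main() {
    // Print one command line per sanitizer; nothing is executed here.
    for s in SANITIZERS.iter() {
        let (flags, args) = sanitizer_invocation(s, "x86_64-unknown-linux-gnu");
        println!("RUSTFLAGS={} cargo {}", flags, args.join(" "));
    }
}
```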
I think we're better off setting ASAN_OPTIONS something like
At the very least letting the allocator return null. I honestly don't trust |
I updated the flags but didn't add Do you know of a way to fix cross-build linking? I am sort of wondering if maybe it is better to let We could almost do it as part of rust-lang/rust CI like Ralf suggested, since there's about an hour free time between x86 finishing and a full run. But we would want to reuse the built artifacts but parallelize these sanitizer tests, I don't know of a good way to do that |
That's the pile of flags I use to look for UB. Leaks are just annoying, and I've seen tests that are supposed to leak. Happy to start with a stricter approach for the standard library tests. (I'm not aware of false positives)
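As a rough illustration of the two points made in this exchange (letting the allocator return null, and not reporting leaks), the sketch below composes an `ASAN_OPTIONS` string. It is not the flag list from the comment above, which is not reproduced in this thread; `asan_options` is a hypothetical helper, and only the two option names `allocator_may_return_null` and `detect_leaks` are assumed.

```rust
/// Hypothetical helper: build a colon-separated ASAN_OPTIONS value
/// from two boolean switches.
fn asan_options(allow_null_alloc: bool, detect_leaks: bool) -> String {
    let opts = [
        // Let oversized allocations return null instead of aborting.
        ("allocator_may_return_null", allow_null_alloc),
        // Leak detection; disabled when tests leak on purpose.
        ("detect_leaks", detect_leaks),
    ];
    opts.iter()
        .map(|(name, on)| format!("{}={}", name, *on as u8))
        .collect::<Vec<_>>()
        .join(":")
}

fn main() {
    // Lenient setup: tolerate allocation failure, ignore leaks.
    println!("ASAN_OPTIONS={}", asan_options(true, false));
}
```

A stricter setup for the standard library tests would pass `true` for `detect_leaks` and see which tests complain.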
I guess that if we can't rely on |
@tgross35 I am going to close this PR due to inactivity. Feel free to reopen when you want to get back to this. :) |
It was proposed to start running sanitizers as part of occasional CI in the same way we run Miri; this is a first pass at doing that.
Originally my plan was to fork the repo but since there is quite a bit of common behavior, I think it could make sense to keep it together.