Two "coverage" job attempts on different machines failed at:
Mashing retry again to see what the 3rd attempt does... |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
Force-pushed from dac26ac to 56eacdf
Same coverage failures as before, but #26115 resolves most of the latest nightly clippy issues to avoid bloating the scope of this issue |
Force-pushed from f5b75a8 to c4e3de0
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
Force-pushed from 5440a11 to b1af88d
Looks like we are timing out for all 3 of these test failures. I'm guessing this is a soft timeout rather than some sort of deadlock, because the tests pass on my laptop. I'll try increasing the wait time to confirm before diving into what might have slowed down after moving to 1.62. |
Heh... should we bump all the way to rust v1.63.0 now instead? |
Increased the timeout 10x and still seeing failures. Need to keep digging. |
I just added Rust 1.63.0 here |
Interestingly, 1.62 was hanging on some tests at |
New set of failures for the following tests:
Looks like they're all failing the assert check at the end of flush due to bucket age advancing unexpectedly. |
@jeffwashington Looks like there may still be some lingering races in the bucket aging? Not sure if this is something new with rust v1.63, or just exacerbated by it. |
Yeah, I can repro this easily on my laptop if I put the non-advancing thread to sleep before this assert check. It looks like we are somehow releasing the flush lock while we are still inside the flush function. This allows another thread to grab the lock, complete the flush routine, increment buckets flushed, and allow thread 0 to increment age. WIP to understand how/why the flush lock is getting released |
Don't fully understand what's going on yet, but this function is the problem:
If it is changed to the following, the lock actually seems to work
|
Does the version in master work? `(!already_flushing).then(|| Self { flushing })`. I'm guessing it was a clippy lint that caused the change in this PR. |
Yep, master version seems to work. This problem can be simplified down to this basic unit test: |
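The test itself did not survive in this thread. A minimal sketch of what such a test could look like follows, assuming a hypothetical `Guard` type that sets an `AtomicBool` on lock and clears it on drop. The names `Guard`, `lock_lazy`, `lock_eager`, and `flag_survives_contention` are illustrative and not taken from the PR.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical RAII guard: the flag is set while a guard exists and
/// cleared again when the guard is dropped.
struct Guard<'a> {
    flag: &'a AtomicBool,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.flag.store(false, Ordering::Release);
    }
}

impl<'a> Guard<'a> {
    /// Closure form: the guard is only constructed if the flag was clear.
    fn lock_lazy(flag: &'a AtomicBool) -> Option<Self> {
        let already_set = flag.swap(true, Ordering::AcqRel);
        (!already_set).then(|| Self { flag })
    }

    /// `then_some` form: the guard is constructed unconditionally. When the
    /// flag was already set, the fresh guard is dropped on the spot, and its
    /// `Drop` clears a flag that another caller still believes it holds.
    fn lock_eager(flag: &'a AtomicBool) -> Option<Self> {
        let already_set = flag.swap(true, Ordering::AcqRel);
        (!already_set).then_some(Self { flag })
    }
}

/// Takes the lock once, attempts a second (failing) lock, and reports
/// whether the flag is still set while the first guard is alive.
fn flag_survives_contention(use_eager: bool) -> bool {
    let flag = AtomicBool::new(false);
    let lock = |f| {
        if use_eager {
            Guard::lock_eager(f)
        } else {
            Guard::lock_lazy(f)
        }
    };
    let first = lock(&flag);
    assert!(first.is_some());
    let second = lock(&flag);
    assert!(second.is_none());
    // `first` is still in scope here, so the flag should still be set.
    flag.load(Ordering::Acquire)
}
```

With the closure form the flag stays set after the failed second attempt. With `then_some` it is cleared even though the first guard is still alive, which matches the "lock released while still inside flush" symptom described above.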
Ah, duh. `(!already_flushing).then_some(Self { flushing })` will always create the `Self { flushing }` value, because the argument to `then_some` is evaluated eagerly. There are a few workarounds.
I'd do option 1, and probably add a comment in |
@brooksprumo - Do you mean keep the version in master?: |
@brennanwatt Yep! Sorry for my typo above, fixed it! |
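Keeping the master version could look roughly like the sketch below. The type name `FlagGuard` is hypothetical, and the lint name `clippy::unnecessary_lazy_evaluations` is an assumption: the thread does not say which clippy lint suggested the change.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical guard type; the real type name is not shown in this thread.
struct FlagGuard<'a> {
    flushing: &'a AtomicBool,
}

impl<'a> FlagGuard<'a> {
    /// Returns `Some(guard)` if this caller set the flag, or `None` if the
    /// flag was already set by someone else.
    // Assumed lint name: silence it here so the closure is not "simplified".
    #[allow(clippy::unnecessary_lazy_evaluations)]
    #[must_use = "dropping the guard clears the flag immediately"]
    fn lock(flushing: &'a AtomicBool) -> Option<Self> {
        let already_flushing = flushing.swap(true, Ordering::AcqRel);
        // Do not change this to `then_some(Self { flushing })`: the guard
        // would be built eagerly, and dropping it would clear the flag.
        (!already_flushing).then(|| Self { flushing })
    }
}

impl Drop for FlagGuard<'_> {
    fn drop(&mut self) {
        self.flushing.store(false, Ordering::Release);
    }
}

/// Small usage check: returns (second lock failed, flag still held while the
/// first guard is alive, flag released after the first guard is dropped).
fn demo() -> (bool, bool, bool) {
    let flushing = AtomicBool::new(false);
    let guard = FlagGuard::lock(&flushing);
    let second_failed = FlagGuard::lock(&flushing).is_none();
    let still_held = flushing.load(Ordering::Acquire);
    drop(guard);
    let released = !flushing.load(Ordering::Acquire);
    (second_failed, still_held, released)
}
```

The comment next to the closure is the important part: it records why the lazy form is deliberate, so a later lint-driven cleanup does not reintroduce the bug.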
Able to get a green CI (over at #27095) after the lock changes discussed above |
Created #27148 based on this PR + fixes discussed above |
Closing this one as #27148 has been merged. |