Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli::discard_failures_in_background is flaky #14

Closed
dimo414 opened this issue Mar 28, 2022 · 6 comments
Closed

cli::discard_failures_in_background is flaky #14

dimo414 opened this issue Mar 28, 2022 · 6 comments

Comments

@dimo414
Copy link
Owner

dimo414 commented Mar 28, 2022

Reported by @tranzystorek-io in #12 (comment) - example failure blaming this line:

---- cli::discard_failures_in_background stdout ----
thread 'cli::discard_failures_in_background' panicked at 'SystemTime { tv_sec: 1648411197, tv_nsec: 770703005 } !> SystemTime { tv_sec: 1648411197, tv_nsec: 770703005 }', tests/cli.rs:227:9
@dimo414
Copy link
Owner Author

dimo414 commented Mar 28, 2022

This test is known to be brittle (see 58e62b) though I don't totally understand why. Presumably a race with the backgrounded process but IIRC I observed the process completing even though the modtime hadn't changed. Will try to dig into this further, or we can always disable it in the meantime if it's causing trouble.

@tranzystorekk
Copy link
Contributor

How critical is this test? If not too much, I'd opt for disabling

@dimo414
Copy link
Owner Author

dimo414 commented Apr 11, 2022

Well it does provide coverage for supported behavior so I'd rather fix it, but if it's causing too much trouble I'll disable it for now.

@zbentley
Copy link

zbentley commented May 1, 2022

I may be misreading this, but it looks like the test is sleeping for a total of 1sec (10x 100ms) rather than the invalidate threshold of 10sec. Is it that simple? Or am I missing something?

@dimo414
Copy link
Owner Author

dimo414 commented May 7, 2022

The sleep duration should only need to be long enough for the command to execute, it isn't related to the --stale threshold because we've already marked the directory as 15s stale. As mentioned above "I observed the process completing even though the modtime hadn't changed" previously. Here's what I'm seeing:

  • Increasing the loop iterations (e.g. 1..100) does not avoid the race condition
  • Commenting out the modtime assert allows the test to pass seemingly consistently
  • The following assert of the file's contents should fail if the subprocess hasn't executed yet, because it's reading the contents of the same file we're checking the mod-time of

In other words, it seems like occasionally file's modtime isn't being updated even though it's been written to since line 217. That doesn't make sense to me but I'm not sure how else to explain it :)

@dimo414

This comment was marked as outdated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants