add --repeat-until-failure to mix test #13398
Conversation
Shall we make it an integer, so we can put a maximum number of repeats on it?
Changed it to be an integer. Works fine, but I tried writing some unit tests and that did not go well...
Thank you! To be honest, I am not very happy that we need to copy all test cases in the runner now. You can use ExUnit to run tests programmatically and that would become a memory leak. I can think of two other options:
Thoughts?
I don't think 1 is a good solution, because non-flaky modules could prevent others from running? Concerning 2: maybe such a repeat tag would be nice as a complement to this in general? I adapted the code to only store the modules when the repeat option is set.
(force-pushed from 5e83893 to b3e8c4b)
lib/ex_unit/lib/ex_unit/server.ex (outdated)

end

defp maybe_add_persistent_sync_modules(state, name) do
  if Application.fetch_env!(:ex_unit, :repeat_until_failure) > 0 do
Sorry, but I am worried about depending on global state here. For example, ExUnit.configure can be called at any time and change the meaning of this. Ideally we will keep most state management in the runner. I can think of two other options:

- Pass a flag when we call `take_modules` in the runner to also make a copy of the modules we are taking
- Have the runner collect all modules as it runs them and explicitly push them again on `restore_modules`

I think 2 is the cleanest and requires only a small change to `async_loop`. There is no need for additional state and we mostly rely on existing server APIs. Thoughts?
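For readers following along, here is a minimal sketch of what option 2 could look like. All names are hypothetical stand-ins: `Server` is a local stub for ExUnit's module server, and `run_modules/2` and `repeat?/1` are placeholders, so this is not ExUnit's actual runner code.

```elixir
defmodule RunnerSketch do
  # Hedged sketch of option 2, not ExUnit's actual runner: the loop remembers
  # every async module it has taken and, once a pass completes, pushes the
  # whole set back to the server so the next repetition can run it again.
  defmodule Server do
    def take_async_modules(_max), do: []
    def take_sync_modules, do: []
    def restore_modules(_async, _sync), do: :ok
  end

  def async_loop(config, taken \\ []) do
    case Server.take_async_modules(config.max_cases) do
      [] ->
        sync = Server.take_sync_modules()
        run_modules(sync, config)

        # restore_modules/2 is the new server call proposed in the comment above
        if repeat?(config), do: Server.restore_modules(taken, sync)

      modules ->
        run_modules(modules, config)
        async_loop(config, taken ++ modules)
    end
  end

  # stand-ins for the real runner helpers
  defp run_modules(_modules, _config), do: :ok
  defp repeat?(_config), do: false
end
```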
Thank you for all the feedback, I appreciate that very much. I changed the code to use your suggestion 2.
Maybe it would be nice to use a new seed for every repeat if it was not explicitly configured?
Sounds like a solid idea to me. :) Maybe we move the loop a bit higher in the stack and run the whole configuration again?
This feature will be very useful, thanks for implementing it @SteffenDE 😃
Sounds good. I wonder how we should handle the seed, because we store the generated seed in the env, and then we cannot easily tell whether the seed was explicitly set or generated: see elixir/lib/ex_unit/lib/ex_unit.ex, lines 484 to 487 at 713bac0.
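For readers without the permalink handy, here is a hedged paraphrase of the behaviour being described, not the actual referenced lines: once ExUnit falls back to a generated seed, it persists it into the application environment, after which a later read cannot tell a generated seed from an explicit one.

```elixir
# Hedged illustration only; the real code at the referenced lines differs.
# A missing seed is replaced by a generated one and written back to the env,
# so a later Application.fetch_env!(:ex_unit, :seed) looks the same whether
# the user passed an explicit seed or not.
opts = []  # stand-in for the parsed mix test options
seed = opts[:seed] || :rand.uniform(1_000_000)
Application.put_env(:ex_unit, :seed, seed)
```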
What would happen if we don't persist the seed? (the tests still pass if I exclude the seed)
Hmm yep. Not persisting the seed is not compatible with #12442...
# we want to change the seed when using `repeat_until_failure`
Application.put_env(:ex_unit, :seed_generated, true)
not pretty, but not sure what would be better
So, one important difference with the seed when using I think that's reasonable.
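For context, here is a minimal sketch of how a repeat loop could consume the `:seed_generated` marker from the diff above, reshuffling with a fresh seed on each repetition only when the original seed was generated. The module name and `run_suite/1` are hypothetical; the real ExUnit implementation differs.

```elixir
defmodule RepeatSeedSketch do
  # Hedged sketch, not ExUnit's actual code: when :seed_generated was set,
  # every repetition picks a fresh seed; an explicitly configured seed is kept.
  def repeat_until_failure(_opts, 0), do: :no_failure

  def repeat_until_failure(opts, remaining) do
    opts =
      if Application.get_env(:ex_unit, :seed_generated, false) do
        Keyword.put(opts, :seed, :rand.uniform(1_000_000))
      else
        opts
      end

    case run_suite(opts) do
      %{failures: 0} -> repeat_until_failure(opts, remaining - 1)
      failed -> failed
    end
  end

  # stand-in for actually running the suite
  defp run_suite(_opts), do: %{failures: 0}
end
```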
I will go ahead and merge it and explore some ideas. :) Thank you!
💚 💙 💜 💛 ❤️
Landing on this PR as I am debugging a flaky test at the moment 🥲 20 minutes pass as I upgrade the codebase to 1.17 to try that new shiny tool... 20 minutes to debug the test, which ended up being an issue with multiple
And done, thanks! That helped a lot! One thing that could help is being able to, say, run tests with a concurrency of 10 or 100, as my issue would have been more prominent the slower things were, or to have another way of "slowing things down".
The world isn't perfect and neither are our tests. When debugging flaky tests I found myself wanting to run `mix test` repeatedly until a test fails. Of course you can do a loop in bash, fish, etc., but a lot of time is spent on starting the BEAM and the application. This is where this PR comes in. It proposes a new flag `--repeat-until-failure` that re-runs all tests until at least one fails.
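To make the proposal concrete, a hedged usage sketch follows. The repeat count is arbitrary, and the programmatic form assumes the option is also exposed through ExUnit's configuration under the same name, which is an assumption rather than something stated in this thread.

```elixir
# Hedged usage sketch; the repeat count 10_000 is arbitrary.
#
#     $ mix test --repeat-until-failure 10000
#
# Assumed programmatic equivalent (option name inferred from the flag name):
ExUnit.start(repeat_until_failure: 10_000)
```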