Flaky tests #248
Could the flaky tests be parsed out and posted as individual statuses? That might make them visible enough.
@rmg you mean as fails? If they're flagged as pass, GitHub will still fold them into "success".
As passes. The "folding" is only a UI thing; the status list could still be expanded to show them, which is a lot more visible than having to dig through the build logs.
Then it could just as well be part of the current status, which would currently show: All tests passed! (3 flaky, 14 skips, 790 total)
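For illustration, here is a minimal sketch of what per-test statuses could look like, using GitHub's commit-status API. The token, repo, sha, and test names below are placeholders, not values from this thread:

```python
# Sketch: post each flaky-but-passing test as its own commit status so it
# shows up in the PR's expandable status list. All values are placeholders.
import requests

GITHUB_TOKEN = "..."    # hypothetical token with repo:status scope
REPO = "nodejs/node"    # example repository
SHA = "0000000"         # commit under test

def post_flaky_status(test_name):
    # One status per flaky test, each under its own "context", so it is
    # listed individually even though the combined state remains "success".
    url = "https://api.github.com/repos/%s/statuses/%s" % (REPO, SHA)
    payload = {
        "state": "success",
        "context": "ci/flaky/%s" % test_name,
        "description": "passed, but marked flaky",
    }
    requests.post(url, json=payload,
                  headers={"Authorization": "token %s" % GITHUB_TOKEN})

for test in ["test-tls-server", "test-http-pipeline"]:  # example names
    post_flaky_status(test)
```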
This may be unrealistic, but I would like to see a small and committed group of people aggressively work on the flaky tests and make them Not Flaky. Speaking of which, nodejs/node#3636 needs an LGTM. Just saying.
(Gratuitous self-promotion, but I am doing my part to try to rally people to fix flaky tests.)
@Trott what are your thoughts on making it a realistic goal? How can we put more pressure on either flagging tests as flaky or making sure they are followed up on?
Here's what I've been thinking:
(Wherever the flaky-test firefighters assemble, obviously, I want in.)
Maybe we can raise alerts if the buildbots go down |
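As a rough idea of what such an alert could be, here is a minimal sketch that polls a CI endpoint; the URL points at node's Jenkins instance, and the "alert" (a print) is a stand-in for a real notification:

```python
# Sketch: poll the CI server and raise an alert when it stops answering.
import requests

CI_URL = "https://ci.nodejs.org/api/json"  # Jenkins exposes a JSON API here

def ci_is_up():
    try:
        return requests.get(CI_URL, timeout=10).status_code == 200
    except requests.RequestException:
        return False

if not ci_is_up():
    print("ALERT: CI appears to be down")  # placeholder for a real alert
```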
The current documented policy for flaky tests (https://github.com/nodejs/node/wiki/Flaky-tests#what-to-do-when-you-encounter-a-new-flaky-test) calls for opening an issue to track them when you mark the test as flaky, and assigning the issue to the next release milestone. I am not sure we have followed it in 100% of cases. During the node-accept-pull-request experiment we were aggressively marking tests as flaky, but there were so many different ones failing randomly that we had to give up.

Reliability of the build bots is definitely something we need to address, and I certainly haven't given up on that. Btw, @joaocgreis is working on a CI job that stresses a single test in CI to determine whether it's flaky, which should be a useful tool in this context (a sketch of such a stress run follows this comment).

Assuming we do finally improve the reliability of the build bots, the part that I think could use some improvement is clarifying who takes responsibility for fixing a flaky test, and how to motivate people to do it. The person who marks a test as flaky is usually the collaborator who is determining that the test is not failing because of the current pull request's changes. They are not motivated to fix the test, and not necessarily the most qualified to work on the particular test that is failing. In a dev team working for one company, you could probably just assign the issue to the test author/owner; I am not sure that would work in an open source project. So how do we motivate collaborators to investigate and fix these failures? Here are some options we could consider:
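As an aside on the stress job mentioned above, a minimal sketch of that kind of run might look like the following; the test command is a placeholder (any repeatable test invocation would do):

```python
# Sketch: run one test over and over to estimate its failure rate.
import subprocess

RUNS = 100
CMD = ["python", "tools/test.py", "parallel/test-tls-server"]  # hypothetical target

failures = sum(
    subprocess.run(CMD, capture_output=True).returncode != 0
    for _ in range(RUNS)
)
print("%d/%d runs failed (%.1f%%)" % (failures, RUNS, 100.0 * failures / RUNS))
```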
Possibly something to completely hand off to the new testing group; @Trott has basically been owning this whole area in the time between when this issue was posted and now. Shall we close? Is there anything in here that @nodejs/testing wants to capture first?
Sounds good to me. For an 'outsider' it might be good to define the scope of the Build and Testing working groups somewhere (i.e. where do I post which problem?).
@jbergstroem wrote:
The Testing WG is putting together docs so we can be chartered by the TSC. Because some of our charter is likely to overlap with the existing Build WG charter, the Build WG will probably have to ratify our charter as well. It would be great if as many build folks as are willing would read the very short draft docs for the Testing WG and comment. You can find them here: https://github.com/nodejs/testing/pull/2/files
Before the nodejs merge we only had pass/fail. We had a discussion about this during the merge, and the conclusion was to introduce the flaky-test structure already found in nodejs so we could always have a passing [although flaky] test suite for releases.
We [the build group] just brought it up again since the tests have a tendency to rot. Also, reporting tests to GitHub (see #236) will make it even less visible, since a flaky pass is still a pass.
What can we do to improve this? Do we add a policy for moving flaky tests to fail after a while? Removing them altogether? I know @Trott has been doing a lot of work to improve the status of flaky tests (well, tests in general) -- perhaps you want to chip in?
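For readers unfamiliar with the flaky-test structure referred to above: node's test runner reads per-directory .status files, and marking a test flaky looks roughly like this (the test name and platform condition are illustrative):

```
prefix parallel

# A flaky test still runs, but its failure does not fail the suite.
[$system==win32]
test-tls-server : PASS,FLAKY
```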