-
Notifications
You must be signed in to change notification settings - Fork 156
Re-runs should be part of the main job flow #217
Comments
You might be able to implement a "yellow" status for "failing, but still has retries" in the grid, would make it a little more obvious whats happening, although if it still had the "waiting" indicator & the red result, that would prolly do it too. |
What is the reason behind necessity of reruns? Is it really possible for a test to pass one time, and fail on another, while testing not changing code base? |
@jayarjo Re-runs are not related to the test suite, but are to fix environmental issues. When working on a localhost this may not be an issue in most cases, but when working over the internet the imperfections of our international network really show. Check out Result correction for more information. If you don't need them, they are easily omitted by setting |
Yes sure, just wanted to know a theory behind the solution. Thanks :) |
@Krinkle any news on this one? |
I've looked into this, mostly to get our Jenkins jobs to wait for reruns instead of declaring failure immediately. I learned something, but not enough to solve this. Jenkins uses node-testswarm to call the job action to figure out if a job is done or not. For that it iterates through all uaRuns and looks at their status (which I think maps to runresults.status in the DB). As long as they are "new" or "progress", it waits. The problem here is that runresults.status has no concept of reruns (only busy, finished, 2 kinds of timeout). The concept of rerun exists in the run_useragent table, where status can be idle, busy and done (passed and/or reached max). For the Jenkins API calls, this is what we care about, not (only) the runresult.status. At first glance we can just change this line to map the value of $runUaRow->status to something node-testswarm can understand. Since that value doesn't tell us if the run has passed or failed, its more complicated then that. I can't figure out how to merge the two status values, if that's even the right approach. |
It looks you've got it in memory better than I do at the moment. However here's the approach I would take. The I remember now why I did not do this originally. The reason is that reruns are restricted to running on a different client. A big factor in why a job failed and might pass if re-run is things that relate to whatever goggled up state or network issues or something else a client might have. That's why we at the very least require a rerun not be distributed to the same client again. The new client could be a new VM worker, or the same one after they refreshed the run page (either is enough to get a new client id). Because of this requirement (enforced in the While integrating automated reruns into the initial CI run makes sense, that would require testswarm-browserstack to account for it as well. Otherwise you'd be waiting indefinitely (node-testswarm waiting for reruns to complete, and testswarm-browserstack doing nothing since its clients have finished all tests and are not given new tests). In practice reruns do eventually finish because of margins and delays. E.g. when a job is finished (except for reruns) we don't immediately terminate the worker (for one, it takes us a minute to realise the worker is finished in the first place). So the TestSwarm Run page stays open and continues to poll for more tests to run. Let's start observing at a point in time where job 10 has completed (with pending reruns) and there are no online clients. Job 11 is submitted and VM workers are spawned. The VMs poll the |
As discussed in IRC: We'd have to first update testswarm-browserstack to account for reruns, then TestSwarm API to expose reruns per job, then node-testswarm to wait for them. Sounds okay to me, we'd finally get some value out of doing reruns. |
@Krinkle any estimate if/when you could implement this? |
Right now they're kind of implemented as an independent thing. So the job looks complete when it really isn't.
Suggesting to change the display that if a run is not "complete" (
!passing
andruns < max
) to not display "fail", but wait for the re-run.Notes:
The text was updated successfully, but these errors were encountered: