Google Chrome CI job occasionally does not run any tests #5406
Potentially has overlap with #5408. |
The problem here is actually specific to the Chrome job, so I'm updating the In both reported cases, ChromeDriver failed to run any tests at all. The full
...repeated 3 more times (for a total of 5 "Init failed" reports), followed by
|
The relevant issues here are #5336 and #5341. References to this in #5395 and #5343 are mis-attributed, as both of those PRs ran tests to completion with results. What distinguishes this from #5407 is that here, ChromeDriver tries to start 5 times each on various ports, starting at 4444 and continuing to 4463. |
Going back through PRs, all PRs since 11:15am EDT on April 7 have completed Chrome tests that match the Firefox tests. I am also no longer able to recreate the timeouts on my fork. Unless this PR success is due to re-triggering Chrome jobs, I propose that this be closed. This is unsatisfying to me personally, because I was not able to track down a root cause for the timeouts, but as I can no longer recreate the behavior, I don't hold out much hope of figuring it out. |
Looks like we're still having trouble:
This is pretty elusive, though: of forty builds that occurred over the last 5 days, it's only occurred once. This may be a regression in the "Chromedriver" binary. I say this because the "check stability" script was authored to fetch the most recently-published version, and version 2.29 of ChromeDriver was released on April 4, which roughly corresponds to when we started to experience this instability. Nothing in that change log seems particularly relevant, but you never know. I would like to follow up with the ChromeDriver development group, but we have very little information to share at the moment. As a preliminary step, I am attempting to capture ChromeDriver debugging output at the moment of failure. This output is highly verbose, so we can't enable logging generally (doing so would cause log truncation and potentially obscure output that is relevant to test contributors). Instead, I've opened a dedicated pull request to collect the data. I plan on manually re-triggering that build until the timeout occurs. I'll report back here when I've got some data. |
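For a rough sense of how many re-triggers that could take, here is a back-of-the-envelope sketch. It assumes the one-in-forty rate mentioned above is representative and that builds fail independently of each other; both are assumptions, not measurements, and the function names are made up for illustration.

```python
import math

# Assumption: failure rate taken from the "one in forty builds" observation.
FAILURE_RATE = 1 / 40


def prob_at_least_one_failure(builds, rate=FAILURE_RATE):
    """Probability of seeing the timeout at least once in `builds` runs."""
    return 1 - (1 - rate) ** builds


def builds_needed(confidence, rate=FAILURE_RATE):
    """Smallest number of runs that reaches the given chance of one failure."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - rate))


if __name__ == "__main__":
    print(round(prob_at_least_one_failure(40), 3))
    print(builds_needed(0.5), builds_needed(0.95))
```

Under those assumptions, forty re-triggers would still miss the failure roughly a third of the time, so collecting the debug output may take a while.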
Does 2.28 work with the current dev channel of Chrome? Should we try rolling back to 2.28 and seeing if we get stability back? |
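To make the rollback idea concrete, a minimal sketch of pinning the download instead of fetching the most recent release. The helper name `chromedriver_url` is hypothetical, and the URL layout is my assumption about how ChromeDriver 2.x releases are hosted; it is not taken from the "check stability" script.

```python
# Assumption: ChromeDriver 2.x release archives are served under this base URL.
BASE_URL = "https://chromedriver.storage.googleapis.com"


def chromedriver_url(version=None, platform="linux64"):
    """Build the URL to fetch.

    With no version, return the pointer file that names the latest release
    (the "most recently published" behavior). With a version such as "2.28",
    return the archive URL for exactly that release.
    """
    if version is None:
        return "{}/LATEST_RELEASE".format(BASE_URL)
    return "{}/{}/chromedriver_{}.zip".format(BASE_URL, version, platform)


if __name__ == "__main__":
    print(chromedriver_url())        # latest-release pointer
    print(chromedriver_url("2.28"))  # pinned rollback candidate
```

Pinning like this would only tell us something if 2.28 can still drive the current dev channel, which is the open question above.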
I discussed that possibility with @bobholt. Given the infrequency of the |
Alright, we now have some relevant debugging information. gh-5544 references a
I'll skip the analysis here since it's not immediately relevant to WPT. The good news is that this is a known problem: there are already two separate
The bad news is that they have not received very much traction from the @foolip is there anyone on either team that you can talk to about increasing |
@pavelfeldman or @RByers, do you know who's responsible for triaging ChromeDriver bugs? |
Unfortunately our main ChromeDriver owner (Sam) has recently left Google. I'm trying to find a web platform team to help own it. @NavidZ may be able to help. |
Based on the bugs that @jugglinmike linked, #5626 may offer a band-aid over the problem (although I'm not yet sure). @RByers: Assuming it does, is it more beneficial to land the band-aid fix, or to leave Chrome running but not blocking PRs? I know that @jugglinmike would like to avoid the band-aid since we don't really understand it, and it may delay a real fix. However, I don't want to leave you in a situation where you are importing tests that are unstable in a way that we could have caught. |
(reposting from IRC) if we implement the workaround, we may never see a fix from Chromium. The bug will continue to trip up application developers. I think ignoring Chromium failures is the better solution because it avoids the workflow interruptions in WPT, places some pressure on the Chromium team, and (importantly) is something we actually understand. |
For the record here, we agreed to land the temporary work-around, but I'm also pushing to get a workaround landed in Chrome and/or a proper fix to glib shipped at high priority. This is a problem for anyone trying to automate Chrome, so it is important regardless of whether WPT has a workaround. |
Thank you, Rick! 🌈 |
FYI the work-around has now landed in chromium and is included in the latest Chrome dev-channel build (60.0.3095.5). I suggest we try removing the workaround in WPT. |
#6438 ran tests only on Firefox. Is the issue with Safari/Edge related to this? |
This issue is specific to the problem of Chrome hanging on startup, causing timeout errors during webdriver connection. We believe that issue is now fixed, so closing (but there still appear to be other reasons why tests may not run). |
See:
https://travis-ci.org/w3c/web-platform-tests/builds/218464208
cc/ @jgraham @jugglinmike @bobholt