
Fix intermittent unit/reference test failures #12123

Closed · 6 of 7 tasks
timvandermeij opened this issue Jul 26, 2020 · 21 comments
@timvandermeij (Contributor) commented Jul 26, 2020

To aid debugging, pull request #12124 also included the browser name in the logs for the unit tests.

The following is a list of unit test failures we have seen:

It's clear from this overview that most problems are related to unexpected action abortions, both in Chrome/Firefox and on Windows/Linux. This wasn't happening a few days ago, so maybe something regressed? The second one is also interesting since we explicitly catch the exception in the unit test.
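
An exception being reported as unhandled even though the test explicitly catches it can happen in plain JavaScript whenever a rejected promise has more than one derived chain. A minimal, hypothetical sketch (unrelated to the actual pdf.js code):

```javascript
// Minimal sketch (not pdf.js code): explicitly catching a rejection on
// one branch of a promise does not cover other promises derived from it,
// so the runtime can still report an unhandled rejection.
let unhandled = 0;
process.on("unhandledRejection", () => {
  unhandled += 1;
});

const failing = Promise.reject(new Error("MissingPDFException"));

// This branch is explicitly handled...
failing.catch(() => {});

// ...but this derived promise rejects too and has no handler attached,
// which is what surfaces as "Unhandled promise rejection" in the logs.
failing.then(() => "never reached");
```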

Pull request #12125 tries to improve the situation for the reference test failures.

@timvandermeij changed the title from "Intermittent unit/reference test failures on the bots" to "Intermittent unit/reference test failures" on Jul 26, 2020
@timvandermeij changed the title from "Intermittent unit/reference test failures" to "Fix intermittent unit/reference test failures" on Jul 26, 2020
@timvandermeij (Contributor, Author) commented:

It seems to have something to do with commit 47ab676, which is also from the first PR where we noticed the issues. If I revert that commit, the unit tests on Linux seem to pass all the time. Strange...

@Snuffleupagus (Collaborator) commented:

> If I revert that commit, the unit tests on Linux seem to pass all the time.

Most likely the updated Jasmine packages are responsible in that case. Perhaps we should just revert those changes for now, and also pin the packages at the old ~3.5.0 versions?

(Obviously that wouldn't really fix things, only hide the errors for now, and future updates would be blocked until the problems have been identified and addressed.)

@timvandermeij (Contributor, Author) commented:

I have opened #12125, which should solve the problems for now so that the unit tests all pass again. This issue will remain open for the follow-up work, because those upgrades need to be unblocked.

@timvandermeij (Contributor, Author) commented Jul 26, 2020

I filed an upstream issue for Jasmine: jasmine/jasmine#1840

@timvandermeij (Contributor, Author) commented Jul 27, 2020

I received a reply from the Jasmine developers, and it seems to be a pre-existing issue in our unit tests that only surfaced in 3.6 because of behavior changes in Jasmine. The idea is to track down the test that causes it by making it reproducible, for example by running the tests until failures happen while keeping track of the random seed so we can replay the run. Another approach is to track it down backwards, i.e., find what can throw the AbortException and trace it back to a particular test, but that may be more difficult.
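
The first approach can be sketched roughly like this. This is a hypothetical Node illustration, not pdf.js tooling: `runTests` is a stand-in for the real test runner (e.g. `gulp unittest`), and the "failure" is simulated deterministically so the loop's behavior is visible.

```javascript
// Hypothetical sketch of "run the tests until a failure happens while
// keeping track of the random seed". `runTests` is a stand-in for the
// real runner; here it deterministically "fails" on the third run.
function runTests(seed) {
  return seed !== 3; // pretend the intermittent failure hits this seed
}

let failingSeed = null;
for (let run = 1; run <= 100; run++) {
  const seed = run; // in practice: a fresh random seed for every run
  if (!runTests(seed)) {
    failingSeed = seed;
    console.log(`Failure on run ${run}; replay with seed ${seed}.`);
    break;
  }
}
```

Once a failing seed is recorded, the same spec ordering can be replayed until the culprit test is isolated.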

@timvandermeij (Contributor, Author) commented:

Only one unit test failure remains; once that is fixed, the Jasmine update is unblocked.

@Snuffleupagus (Collaborator) commented Aug 1, 2020

TEST-UNEXPECTED-FAIL | creates pdf doc from non-existent URL | in chrome | Unhandled promise rejection: MissingPDFException: Missing PDF "http://127.0.0.1:38165/test/pdfs/non-existent.pdf".

I've been able to reproduce this a handful of times locally with Jasmine 3.6.1, even on Windows, but it's extremely intermittent[1] to the point that I cannot do any meaningful debugging :-(


[1] Something like 1-2 failures for 100 runs, even with a constant Jasmine seed.

@timvandermeij (Contributor, Author) commented:

I'll also give this a try. Perhaps it's more easily reproduced on Linux for some reason, or we can find a way to make it more reproducible so it's hopefully easier to find a fix.

@timvandermeij (Contributor, Author) commented Aug 1, 2020

I'm not having any "luck" so far with reproducing the failure at all on Linux. If you still have it, could you post the random seed so I can try with that?

@Snuffleupagus (Collaborator) commented Aug 1, 2020

> If you still have it, could you post the random seed so I can try with that?

I don't think that the seed actually matters though, since I've seen the intermittent failure for different ones.

Also, I forgot to mention that (almost) all of my testing was done directly in the browser, using http://localhost:8888/test/unit/unit_test.html?spec=api%20getDocument%20creates%20pdf%20doc%20from%20non-existent%20URL, which means that only the affected test runs and the seed should thus be (mostly) irrelevant. Still, I only managed something like a ~1% failure rate.

Edit: When testing in the browser, don't forget to run `TESTING=true gulp generic` first (and also after making any changes in `src/core/` files) so that the worker is used. Otherwise you're probably not going to be able to reproduce this at all, given just how different the "fakeWorker" code paths are.

My apologies, #12123 (comment) wasn't entirely clear on the details.

@timvandermeij (Contributor, Author) commented:

Ah, that helps, thanks! I was worried that the problem wouldn't show up if I ran just that one particular test, perhaps because another test might influence this one, but it's good to know that you managed to reproduce it by running that single test, since that makes things easier. I had also not used the `TESTING=true` environment variable (I just ran `gulp unittest` once and then re-ran the tests in the browser window, with some code to not close the window after the test), so that may explain my lack of observed test failures ;-)

@Snuffleupagus (Collaborator) commented Aug 1, 2020

> I just ran `gulp unittest` once

Note that that command actually uses `TESTING=true gulp generic` under the hood; I just find it more convenient to use the latter format directly when running unit tests manually in the browser.


Edit: Also, I suspect that the source of the intermittent unit-test failure might be related to the timing of destroying the worker-thread `MessageHandler` instance and/or the `webWorker` instance itself.
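
To illustrate the kind of teardown race being suspected, here is a purely hypothetical sketch; the class and names below are invented for illustration and are not the actual pdf.js `MessageHandler`. If pending requests are rejected during destroy after the caller has stopped listening, the rejection surfaces as unhandled; attaching the rejection handler before teardown avoids that.

```javascript
// Hypothetical stand-in for a message handler with pending requests; not
// the actual pdf.js implementation.
class FakeMessageHandler {
  constructor() {
    this.pendingRejects = new Set();
  }

  // Returns a promise that stays pending until a response arrives or the
  // handler is destroyed.
  request() {
    let rejectFn;
    const promise = new Promise((resolve, reject) => {
      rejectFn = reject;
    });
    this.pendingRejects.add(rejectFn);
    return promise;
  }

  // Destroying the handler rejects all pending requests. If a caller has
  // not (yet) attached a rejection handler at this point, the rejection
  // is reported as unhandled -- the suspected timing issue.
  destroy(reason) {
    for (const reject of this.pendingRejects) {
      reject(reason);
    }
    this.pendingRejects.clear();
  }
}

// Safe ordering: the catch handler is attached before destroy() runs, so
// the rejection is observed instead of becoming unhandled.
const handler = new FakeMessageHandler();
let caught = null;
const pending = handler.request().catch(err => {
  caught = err.message;
});
handler.destroy(new Error("Worker was destroyed"));
```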

@timvandermeij (Contributor, Author) commented:

The remaining intermittent failure didn't happen for me locally at all, but happens more often on the bots. If it's related to ordering, we now at least have a run with a random seed: http://54.67.70.0:8877/e79cd113d2ab405/output.txt

@brendandahl (Contributor) commented:

For the issue where text on page 11 of the tracemonkey document disappears, I was able to create a "smaller" test case and file a bug with Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=1146296

@timvandermeij (Contributor, Author) commented:

That's great news! Let's hope the Chrome team can resolve this.

@timvandermeij (Contributor, Author) commented:

It looks like a fix is hopefully coming soon given that the upstream bugs are resolved. We'll keep an eye on this.

@timvandermeij (Contributor, Author) commented Dec 19, 2020

The fix mentioned above didn't change the majority of the intermittent failures, unfortunately. However, I do notice that recent browser updates (automatic for Firefox, and through Puppeteer updates for Chrome) resolved the few intermittent failures that happened in Firefox, so I don't see any Firefox reference tests failing intermittently anymore. Chrome on Windows also improved, with around six failures remaining, but Chrome on Linux is still problematic.

@brendandahl (Contributor) commented:

I don't think Chrome has been updated in Puppeteer to the fixed version yet. It should be in version 88.0.43XX, and the bots are still on an older version.

@brendandahl (Contributor) commented:

We're now on Chrome 89.0.4389.0. It looks like a number of the Windows issues have been fixed, but there are now more intermittent Linux failures. They seem to be mainly small pixel changes, versus the old behavior where text would be completely missing.

@timvandermeij (Contributor, Author) commented:

The most recent Puppeteer update seems to have resolved most issues, but there are still a few intermittent ones, mainly in Chrome, albeit much fewer than before. The unit test mentioned above seems to fail a bit less often, but still fails from time to time.

@timvandermeij (Contributor, Author) commented:

Closing since I've now opened a new issue with the remaining test failures for a better overview.
