
Inconsistent handling of script errors between the three engines #731

Closed
AkA84 opened this issue Apr 13, 2018 · 20 comments

AkA84 (Contributor) commented Apr 13, 2018

(This is with backstopjs@3.2.9)

I noticed an inconsistency in how the three engines handle script errors, with Puppeteer being the most troublesome, imho.

I came up with this setup, which you can use to replicate the issue:

Three scenarios, one of which is designed to fail

"scenarios": [
  {
    "label": "Google",
    "url": "https://google.com"
  },
  {
    "label": "Google / Wait for wrong selector",
    "url": "https://google.com",
    "onReadyScript": "wait-for-wrong-selector.js"
  },
  {
    "label": "Google / Type",
    "url": "https://google.com",
    "onReadyScript": "type.js"
  }
]

For the two scenarios with the onReadyScript, there is a variant of the script for each of the engines:

Casper

// casper/type.js
module.exports = casper => {
  casper.then(() => {
    casper.fillSelectors('form.tsf', {
      'input#lst-ib': 'foobarbar'
    });
  });
};

// casper/wait-for-wrong-selector.js
module.exports = casper => {
  casper.then(() => {
    casper.fillSelectors('form.tsf', {
      'input#lst-ib': 'foobarbar'
    });
  });

  casper.then(() => {
    casper.waitForSelector('.does-not-exist');
  });
};

Chromy

// chromy/type.js
module.exports = chromy => {
  chromy.type('input#lst-ib', 'foobarbar');
}

// chromy/wait-for-wrong-selector.js
module.exports = chromy => {
  chromy.type('input#lst-ib', 'foobarbar');
  chromy.wait('.does-not-exist');
}

Puppeteer

// puppet/type.js
module.exports = async puppet => {
  await puppet.type('input#lst-ib', 'foobarbar');
}

// puppet/wait-for-wrong-selector.js
module.exports = async puppet => {
  await puppet.type('input#lst-ib', 'foobarbar');
  await puppet.waitFor('.does-not-exist', { timeout: 1000 });
}

Now backstop reference and backstop test are run with all three engines, and these are the results I've found:

Casper

COMMAND | Executing core for `reference`
# ...
CasperJS:  CREATING NEW REFERENCE FILES
CasperJS:  Current location is https://google.com
CasperJS:  Capturing screenshots for desktop (1920x900)
CasperJS:  Ready event received.
CasperJS:  Current location is https://google.com
CasperJS:  Wait timeout of 5000ms expired, exiting.

Testing script failed with code: 1
# ...
      COMMAND | Command `reference` ended with an error after [9.965s]

With CasperJS the script error makes the entire command immediately fail and exit, meaning that BackstopJS can't capture the remaining screenshots in the list.

Given that reference fails, there's no point in running test, since it will fail as well and no report will be generated.

Chromy

COMMAND | Executing core for `reference`
# ...
CREATING NEW REFERENCE FILES
9222 INFO >  BackstopTools have been installed.
CREATING NEW REFERENCE FILES
9223 Chrome v65 detected.
9223 INFO >  BackstopTools have been installed.
CREATING NEW REFERENCE FILES
9224 Chrome v65 detected.
9224 INFO >  BackstopTools have been installed.

# ...
      COMMAND | Command `reference` successfully executed in [10.396s]

Chromy marks the task as successful and doesn't report any error whatsoever, even though the screenshot for "Google / Wait for wrong selector" has not been generated.

Running test gives the same output, but it's marked as failed. This is the report's outcome:

COMMAND | Executing core for `report`
      compare | Chromy error: WaitTimeoutError. See scenario – Google / Wait for wrong selector (desktop)
      compare | OK: Google backstop_default_Google_0_document_0_desktop.png
      compare | OK: Google / Type backstop_default_Google___Type_0_document_0_desktop.png
      # ...
      report | 2 Passed
      report | 1 Failed
      # ...
      COMMAND | Command `report` ended with an error after [0.177s]
      COMMAND | Command `test` ended with an error after [10.845s]

Now report reports on the WaitTimeoutError and fails, showing the missing screenshot for "Google / Wait for wrong selector" in the HTML report.

[Screenshot: Chromy HTML report]

Puppeteer

COMMAND | Executing core for `reference`
# ...
CREATING NEW REFERENCE FILE
Browser Console Log 0: JSHandle:BackstopTools have been installed.
# ...
CREATING NEW REFERENCE FILE
Browser Console Log 0: JSHandle:BackstopTools have been installed.
######## Error running Puppeteer ######### Error: waiting failed: timeout 1000ms exceeded
# ...
CREATING NEW REFERENCE FILE
Browser Console Log 0: JSHandle:BackstopTools have been installed.

# ...

      COMMAND | Command `reference` successfully executed in [5.624s]

While it reports that something went wrong with a script (which is good for debugging), Puppeteer still allows the entire test suite to run, and the reference command is marked as successful.

Running test gives the same output, and it's marked as successful as well. This is the report's outcome:

COMMAND | Executing core for `report`
      compare | OK: Google / Type backstop_default_Google___Type_0_document_0_desktop.png
      compare | OK: Google backstop_default_Google_0_document_0_desktop.png
       # ...
       report | 2 Passed
       report | 0 Failed
       # ...
      COMMAND | Command `report` successfully executed in [0.476s]
      COMMAND | Command `test` successfully executed in [6.424s]

There is no mention whatsoever of "Google / Wait for wrong selector", which at this point is simply ignored; report is marked as successful and the HTML report shows that everything is fine... but with only 2 screenshots instead of 3.

[Screenshot: Puppeteer HTML report]


While CasperJS is too eager to stop everything as soon as it finds an error, Puppeteer's handling of script errors imho is worse as it simply marks the whole run as successful even if there are screenshots missing. If you have a big test suite (we have 120+ screenshots in our project), it's very easy to trust the report and not realize that some screenshots might be missing.

I think the best behaviour would be a mix of Chromy + Puppeteer: show the script error in the output for debugging purposes, let the command run to completion, and then mark test/report as failed and show the missing screenshot(s) in the report.
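To make that concrete, here is a minimal sketch of the desired flow (a hypothetical illustration, not BackstopJS internals; the function and parameter names are made up): each scenario script runs inside a try/catch, the error is logged for debugging, the run continues, and the error is carried forward so the report step can mark the scenario as failed.

// Hypothetical sketch of the proposed behaviour; not actual BackstopJS code.
// `runOnReadyScript` stands in for however the engine loads the user script.
async function captureScenario (page, scenario, screenshotPath, runOnReadyScript) {
  try {
    if (scenario.onReadyScript) {
      // Any throw inside the user script (wait timeout, missing method, ...) lands in catch.
      await runOnReadyScript(page, scenario);
    }
    await page.screenshot({ path: screenshotPath });
    return { label: scenario.label, error: null };
  } catch (err) {
    // Log for debugging, keep processing the remaining scenarios,
    // and carry the error forward so `test`/`report` can mark this scenario as failed.
    console.error(`Script error in "${scenario.label}": ${err.message}`);
    return { label: scenario.label, error: err.message };
  }
}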

garris (Owner) commented Apr 13, 2018

@AkA84 thank you very much for doing this comparison! The intended Puppeteer behavior in a case like this is to throw an unexpected error, so this is definitely a bug. I would love to see this fixed, and I have been putting off closing #704 and announcing an official release of puppeteer until this is addressed.

@krisdigital — 👆 any ideas here?

krisdigital (Contributor) commented:

@garris With this PR it should behave like Chromy, plus the error messages.

garris (Owner) commented Apr 13, 2018

@krisdigital — thank you so much! Will follow up soon.

garris (Owner) commented Apr 15, 2018

The puppeteer version is now fixed on @latest. The error messages are also being sent to the front end -- now I just need to add this information to the layout.

npm install -g backstopjs@3.2.11

garris (Owner) commented Apr 16, 2018

Hello All, Puppeteer errors are now correctly surfaced in the web report...

[Screenshot: Puppeteer errors surfaced in the web report, 2018-04-16]

AkA84 (Author) commented Apr 18, 2018

@garris @krisdigital I installed the latest version (v3.2.12) and ran those 3 scenarios again, and now backstop (with Puppeteer) exits as soon as an error is encountered:

COMMAND | Executing core for `reference`
createBitmaps | Selected 3 of 3 scenarios.
CREATING NEW REFERENCE FILE
Browser Console Log 0: JSHandle:BackstopTools have been installed.
x Close Browser
CREATING NEW REFERENCE FILE
Browser Console Log 0: JSHandle:BackstopTools have been installed.
Puppeteer encountered an error while running scenario "Google / Wait for wrong selector"
Error: waiting for selector ".does-not-exist" failed: timeout 1000ms exceeded
      COMMAND | Command `reference` ended with an error after [3.799s]
      COMMAND | TypeError: Cannot read property 'join' of undefined
                    at processScenarioView (node_modules/backstopjs/core/util/runPuppet.js:252:151)
                    at <anonymous>

I noticed that the line referenced in the error is this one:

const testPair = engineTools.generateTestPair(config, scenario, viewport, variantOrScenarioLabelSafe, scenarioLabelSafe, 0, `${scenario.selectors.join('__')}`);

scenario.selectors in my case is undefined. This is the scenario object for me:

{
  label: 'Google / Wait for wrong selector',
  url: 'https://google.com',
  onReadyScript: 'wait-for-wrong-selector.js',
  sIndex: 1
}
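For what it's worth, a defensive fallback along these lines (just a sketch against the line quoted above; the 'document' default is an assumption, not the actual commit that closed this) would avoid the TypeError when a scenario defines no selectors:

// Sketch only: fall back to a default selector list when the scenario doesn't define one,
// so the test-pair file name can still be built.
const selectors = scenario.selectors || ['document'];
const testPair = engineTools.generateTestPair(
  config, scenario, viewport,
  variantOrScenarioLabelSafe, scenarioLabelSafe,
  0, selectors.join('__')
);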

krisdigital (Contributor) commented:

@AkA84 @garris I think that error was introduced here ddce7a5#diff-2b0d3e4bf62b40b86f9b984a5f9003ecR252

garris (Owner) commented Apr 18, 2018

Thank you for finding this! I will fix this ASAP.

garris closed this as completed in d4f9956 on Apr 18, 2018
AkA84 (Author) commented Apr 19, 2018

@garris (and @krisdigital ) everything works great now, thanks!

AkA84 (Author) commented Apr 20, 2018

I'm sorry @garris, I just noticed that it still doesn't work as expected: the missing screenshots are marked as green, as if Backstop compared the two error images.

[Screenshot: missing screenshots shown as passing in the report]

This wasn't the case from what I see here, so something must have broken after that.

garris (Owner) commented Apr 20, 2018

@AkA84 ah, so did you approve that error state? That is actually legal! 😄

You are allowed to check for (and approve) selectors not being available.

AkA84 (Author) commented Apr 22, 2018

@garris no, I've never approved any report in any of the tests I did as part of this issue, which is why I'm reporting this as a bug. I think the behaviour should be the one displayed in your screenshot

[Screenshot: the report from garris's comment above, with the error marked as failed]

and the one that Chromy has (allowing for the relevant differences):

Running test gives the same output, but it's marked as failed. [...]
Now report reports on the WaitTimeoutError and it fails, showing in the html report the missing screenshot for "Google / Wait for wrong selector"

otherwise imho this could be even worse than it was before: before, if you knew the total number of scenarios, you could quickly see that the number reported on was not right; now you can't even do that! You basically have to scroll down yourself and see if there are any errors reported (= you can't trust the report).

And I'm not even considering the scenario where you are running backstop as part of a CI process, where neither of the two options is even possible :/

garris (Owner) commented Apr 22, 2018

Did you run reference at any point? You had to run reference or approve for that reference image to get there. If you did not do either of those things then there is an error. Otherwise this is working as designed.

igorpavlov-zz commented Apr 23, 2018

@garris Does this mean that I would need to check all my screenshots manually? We are a relatively large project currently running 100+ scenarios, and we cannot really afford to check that each reference screenshot is valid.

I think it would make sense to just mark the scenario as failed if:

  1. there are script or selector errors
  2. the screenshot could not be taken for any reason

Please let us know what you think.

AkA84 (Author) commented Apr 23, 2018

@garris I think I understand where we are diverging with our opinions on how things should work.

The way we use backstop (the first time, at least) in our project is to first run backstop reference directly, so that we automatically get the baseline screenshots, and then to run backstop test to catch any diffs. In this case, if a screenshot had a script error, backstop would not report it after running test, on the assumption that if the same error appeared on the reference screenshot then the dev must have approved it and therefore must be fine with it.

Instead, if I understand correctly, the way you expect it to be used is to run test directly even the first time, then manually check that all the screenshots look OK, and finally approve them so they get promoted to the original set of baseline screenshots. By checking the would-be baseline screenshots manually, the dev is sure to be aware of any screenshot with script errors, and if they're fine with having that as a reference, then Backstop will be fine with it as well.

I see two possible problems with the latter approach (on the assumption that I got you right):

  1. It requires a manual check to approve the first set of baseline screenshots. This is fine if you have a small set of scenarios, but if you have a considerable number (we have more than 120 scenarios at the moment), this step could not only become an annoyance but also be prone to human error
  2. It might not work well with CI. We intend to use backstop at some point in Jenkins, but in order to do that we need to create the reference screenshots automatically, and thus to trust that backstop reports any script errors that it might find, however consistent they might be

The last point I thought worth discussing was:

You are allowed to check for (and approve) selectors not being available.

But wouldn't it be odd to rely on a scenario with a failing script to check whether a selector is available or not? Shouldn't backstop encourage devs, in that case, to write a script that successfully checks that a selector doesn't exist, instead of using a failing script that expects a selector to exist?
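For example, a script along these lines (a hypothetical Puppeteer sketch, not something from this issue) would pass deliberately when the selector is absent, instead of relying on a wait timeout:

// puppet/expect-selector-absent.js -- hypothetical sketch
module.exports = async page => {
  // page.$() resolves to null when the selector does not match anything.
  const element = await page.$('.does-not-exist');
  if (element) {
    throw new Error('Did not expect ".does-not-exist" to be present');
  }
};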

Also please note that this is not simply about a selector existing or not, it's about any script error. For example, I changed my type.js script to call a non-existing method, and backstop still passed the comparison despite it:

module.exports = async engine => {
  // ...
  await engine.nonExistingMethod('.does-not-exist', { timeout: 1000 }); // <---
}

[Screenshot: report still passing despite the script error]

And I tried one last thing: to have, for the same scenario, one error thrown for reference, and a different error thrown for test

// when running "reference"
module.exports = async engine => {
  // ...
  await engine.nonExistingMethod('.does-not-exist', { timeout: 1000 });
}

// when running "test"
module.exports = async engine => {
  // ...
  await engine.waitFor('.does-not-exist', { timeout: 1000 });
}

and backstop still passed the comparison, reporting only on the second error

[Screenshot: report passing, with only the second error reported]

but this means that a scenario with a failing script will always pass comparison as long as the script keeps failing, no matter what the reason for the failure is (it could be a non-existent selector, a wrong method, an internal engine error, etc.)

garris (Owner) commented Apr 23, 2018

I understand your thinking — however, this has not been a primary use case as of yet.

That said — I think this is a useful feature and it would be very easy to add. How about, at the end of a backstop reference flow, we look through the results collection to see if there were any client errors — if there were any errors, we exit with a 1. Or, you could add this behavior to the reference and test commands behind a config flag (e.g. failOnException). I think that would be cool.

Or maybe simpler for you on the CI side, you could just grep for error messages that pop up during the run and decide specifically what kinds of things you want to check for.

Do any of these sound good to you?
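For the CI-side option, one possible sketch (the report location and the engineErrorMsg field name are assumptions about BackstopJS's JSON output, not a documented contract) would be to scan the JSON report after the run and exit non-zero if any scenario recorded an engine error:

// ci-check.js -- hypothetical sketch; report path and field names are assumptions.
const report = require('./backstop_data/json_report/jsonReport.json');

const errored = (report.tests || []).filter(t => t.pair && t.pair.engineErrorMsg);
if (errored.length > 0) {
  errored.forEach(t =>
    console.error(`Engine error in "${t.pair.label}": ${t.pair.engineErrorMsg}`));
  process.exit(1);
}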

igorpavlov-zz commented Apr 23, 2018

@garris Thanks a lot for the reply. I think we should still allow reference to continue running, even if there are errors. Having all these errors at the end is actually very handy: imagine we have 2 errors, this allows us to see them all by running BS once, not twice.

The issue is that the test is still considered passed while there are errors; it failed, so the test is not working.

garris (Owner) commented Apr 23, 2018

@igorpavlov Sure. So then you would want failOnException: true to fail an individual scenario if there were any exceptions (even if the comparison passes), right? This seems like a useful feature. What do you guys think?

AkA84 (Author) commented Apr 24, 2018

Thanks @garris for your reply!

If by adding that failOnException flag (or flags, one for reference and one for test) we can achieve the following, it sounds great to me!

  • reference can be successful (exit code = 0) even if there are script errors, basically as it is now. If someone wants the command to end with exit code 1, they can simply set failOnException: true on it
  • test, and by extension report, should fail (exit code = 1) if there are script errors. And by fail I don't mean "exit immediately"; I mean "finish the whole run and then report the error". The report should then mark as failed any scenario with both screenshots missing due to script errors, and that scenario should be counted in the "failed" total. This, of course, only if failOnException: true (see the sketch after this list)
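For illustration, the flag being discussed might end up looking something like this in backstop.json (a sketch only; the flag did not exist at this point, and whether it lives at the top level or per scenario is an open question):

{
  "id": "backstop_default",
  "engine": "puppeteer",
  "failOnException": true,
  "scenarios": [
    {
      "label": "Google / Wait for wrong selector",
      "url": "https://google.com",
      "onReadyScript": "wait-for-wrong-selector.js"
    }
  ]
}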

Does that match what you had in mind for the failOnException flag?

garris (Owner) commented Apr 24, 2018

Yes. That is very specifically exactly what I was thinking.
