
[test] Add visual regression tests #1081

Merged · 17 commits · Feb 24, 2021

Conversation

oliviertassinari
Member

@oliviertassinari oliviertassinari commented Feb 19, 2021

  • I'm using the same stack @eps1lon has put in place for the main repository. It's kicking ass. It's insanely fast:

[Screenshot: Capture d’écran 2021-02-20 à 00 41 51]

I have noticed a few polish changes we can make in the core; I will apply them in a batch once I have enough items on my list.

  • For the screenshots, we use the stories and the demos of the documentation.
  • The whole process in the CI takes about 2 minutes. It runs in parallel to the other tasks so we won't feel it.
  • From a cost perspective, each build costs about $0.01 in CircleCI resources (200 screenshots).
    It's less than 2 minutes at 10 credits per minute. On Argos-CI, the cost is so small that it isn't even worth mentioning. Chromatic's pricing is $0.008/image, so about $1 per build, 100x more expensive. Percy and Applitools are likely in the same order.
  • I had to disable the random generation of data to guarantee stable diffs for each PR.
  • useData can be seriously slow. I had to reduce the order of magnitude of the data generated to keep the test run duration reasonable. We time out after 2s in dev and 4s in the CI.
  • The initial Argos-CI build was made by comparing a run I did locally with the one from the CI: https://www.argos-ci.com/mui-org/material-ui-x/builds/4
  • I have removed most of our dependency on puppeteer; we could remove it entirely.
  • Cost: 3 hours.
  • One chunk of Continuous Integration #37, 6 months later
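
The cost comparison above can be sanity-checked with quick arithmetic. A sketch, assuming CircleCI's 10 credits per minute at roughly $0.0005 per credit and Chromatic's per-image price applied to ~200 screenshots (the per-credit dollar price is an assumption, not a figure quoted in this PR):

```javascript
// Back-of-envelope CI cost comparison based on the figures discussed above.
const minutesPerBuild = 2;
const creditsPerMinute = 10;      // CircleCI rate cited in the PR
const dollarsPerCredit = 0.0005;  // assumption: approximate CircleCI pricing
const circleCost = minutesPerBuild * creditsPerMinute * dollarsPerCredit; // ≈ $0.01

const screenshotsPerBuild = 200;
const chromaticPerImage = 0.008;  // Chromatic price cited in the PR
const chromaticCost = screenshotsPerBuild * chromaticPerImage; // ≈ $1.60

// Roughly two orders of magnitude apart, in line with the ~100x claim.
const ratio = chromaticCost / circleCost;
```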

What's left to do later:

  • Give it a try. The solution is already stress-tested in the main repo; however, the demos we have aren't. We might have unstable content or other issues that will require further cleaning.
  • Consider removing screenshots. Right now, we can include almost all of them, as the overhead is negligible. However, we might still want to remove useless tests.
  • Fix the rendering of the demos. I have noticed that many demos don't render well. I didn't look into how to solve this.

@oliviertassinari oliviertassinari marked this pull request as ready for review February 20, 2021 00:12
@oliviertassinari oliviertassinari mentioned this pull request Feb 20, 2021
@@ -30,12 +30,17 @@ const MAX_CIRCLE_CI_CONCURRENCY = 83;
module.exports = function setKarmaConfig(config) {
const baseConfig = {
basePath: '../',
- browsers: ['ChromeHeadlessNoSandbox'],
+ browsers: ['chromeHeadless'],
Member

?

Member Author

It's an arbitrary name. I have replicated the name we use in the main repo.

Member

The names are relevant to the CLI. You can do yarn test:karma --browsers browserNameA,browserNameB.

So the longer the name, the more you have to type in your terminal if you just want to test a subset of browsers. Think of the diff as

- yarn test:karma --browsers ChromeHeadlessNoSandbox
+ yarn test:karma --browsers chromeHeadless

And since we don't have a chromeHeadlessSandbox I don't see why we need the NoSandbox qualifier in the first place. I haven't found the information whether we run with or without a sandbox useful.
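
For context, Karma lets you register a launcher under any name via customLaunchers, which is what makes the short chromeHeadless alias possible. A minimal sketch (the exact flags here are assumptions, not copied from this PR):

```javascript
// karma.conf.js (sketch): alias a headless Chrome launcher under a short name.
module.exports = function setKarmaConfig(config) {
  config.set({
    // The name here is what you type on the CLI:
    //   yarn test:karma --browsers chromeHeadless
    browsers: ['chromeHeadless'],
    customLaunchers: {
      chromeHeadless: {
        base: 'ChromeHeadless',
        // --no-sandbox is typically required in Docker-based CI images.
        flags: ['--no-sandbox'],
      },
    },
  });
};
```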

@oliviertassinari
Member Author

@mui-org/x I have pushed a few fixes. Let me know if it's OK to move forward

@oliviertassinari oliviertassinari self-assigned this Feb 22, 2021
client: {
mocha: {
// Some BrowserStack browsers can be slow.
timeout: (process.env.CIRCLECI === 'true' ? 4 : 2) * 1000,
Member Author

"test:regressions:dev": "concurrently \"yarn test:regressions:build --watch\" \"yarn test:regressions:server\"",
"test:regressions:run": "mocha --config test/regressions/.mocharc.js --delay 'test/regressions/**/*.test.js'",
"test:regressions:server": "serve test/regressions",
"test:argos": "node ./scripts/pushArgos.js",
Member

Isn't there a way to reduce the number of scripts?

Member Author

You only have to run one script locally: yarn test:regressions. The others are here to help debug (when working on improving the regression generation tool).

@@ -33,12 +33,6 @@ const GlobalStyle = createGlobalStyle`
width: 100%;
}

.main-container {
Member

why do we need to change that?

Member Author

I'm removing dead CSS code (at least, that's the objective).

Member

Well, I think it should be done in the demo-move-to-the-docs PR...

Member Author

@oliviertassinari oliviertassinari Feb 23, 2021

I didn't intend to break the demo-app, if that's what you assume. I can definitely revert.

@dtassone
Member

What would be great for this one is to have a run-through and discuss what is happening.

@oliviertassinari
Member Author

oliviertassinari commented Feb 23, 2021

What would be great for this one is to have a run-through and discuss what is happening.

@dtassone For more context, the latest iteration comes from mui/material-ui#23500; we were using vrtest before (set up in 2017). From a high-level perspective:

  • We use webpack to build all the demos of the docs and the stories
  • We make this build available on port 5000
  • We use mocha to control Playwright
  • Playwright loads the build from port 5000
  • Mocha goes through all the demos that are available. It basically navigates to each URL, waits for the next rAF, takes a screenshot, and moves on to the next one
  • Once Mocha has finished running, we get a folder with all the screenshots
  • This folder is pushed to Argos-CI
  • Argos-CI compares all the screenshots with the baseline one it has (the fork commit between HEAD and the PR).
  • Argos-CI reports the diffs as a GitHub status. If there is a change, it needs to be manually approved
  • @mnajdova was wondering what approve does. It only does one thing: change the GitHub status (it doesn't replace the baseline screenshots)
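
Conceptually, the comparison step at the end boils down to counting pixels that differ between the baseline screenshot and the new capture. A simplified sketch of that idea (not Argos-CI's actual implementation):

```javascript
// Count differing pixels between two same-sized RGBA pixel buffers.
function countDiffingPixels(baseline, candidate) {
  if (baseline.length !== candidate.length) {
    throw new Error('Screenshots must have identical dimensions');
  }
  let diffs = 0;
  for (let i = 0; i < baseline.length; i += 4) {
    // Compare the R, G, B, and A channels of each pixel.
    if (
      baseline[i] !== candidate[i] ||
      baseline[i + 1] !== candidate[i + 1] ||
      baseline[i + 2] !== candidate[i + 2] ||
      baseline[i + 3] !== candidate[i + 3]
    ) {
      diffs += 1;
    }
  }
  return diffs;
}

// A build needs manual approval when more than `threshold` pixels changed.
function needsApproval(baseline, candidate, threshold = 0) {
  return countDiffingPixels(baseline, candidate) > threshold;
}
```

Real services typically add anti-aliasing tolerance and perceptual thresholds on top of this, which is part of what keeps the diffs stable.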

@dtassone
Member

  • Are the rows data different?
  • What is the size of the screen?

@oliviertassinari
Member Author

oliviertassinari commented Feb 23, 2021

Are the rows data different?

To some extent, yes. I have used DISABLE_CHANCE_RANDOM to guarantee that the generated data is always identical between two different builds. This is critical: any flakiness kills the usefulness of the tool and our ability to depend on it. I know you have raised that this means screenshots can't test the sorting feature. My counter-argument is that it shouldn't be tested with a screenshot. This tool shines at catching CSS rendering regressions. For functional tests (not regressions), it's both too slow to run and too slow to iterate on; you don't get a clear signal on whether the logic is right or wrong, what you get is a visual difference you then need to check. It's mentally draining.
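
The idea behind disabling random generation can be sketched as follows: derive the demo data from a fixed seed so every build produces byte-identical rows, hence byte-identical screenshots. The generator below (mulberry32) and the row shape are illustrative only; the repo actually guards the chance.js library behind the DISABLE_CHANCE_RANDOM variable:

```javascript
// mulberry32: a tiny seeded PRNG, used here for illustration only.
function mulberry32(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Hypothetical row generator: deterministic when the env variable is set,
// so two CI runs render pixel-identical grids.
function generateRows(count, deterministic = process.env.DISABLE_CHANCE_RANDOM === 'true') {
  const random = deterministic ? mulberry32(42) : Math.random;
  return Array.from({ length: count }, (_, id) => ({
    id,
    quantity: Math.floor(random() * 1000),
  }));
}
```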

What is the size of the screen?

The viewport should be 1000x700px.

@oliviertassinari
Member Author

oliviertassinari commented Feb 23, 2021

I have an idea for the rendering issue with the data grid stories. I will try to apply the same global class name we use to set the dimensions of the grid in Storybook. That should do it.

@eps1lon
Member

eps1lon commented Feb 24, 2021

It basically navigates to each URL, waits for the next rAF

I don't know if you do things differently here, but conceptually it's "wait for the demo to be rendered". How we determine when a demo is rendered is a bit more tricky, but on the main repo we're just using React + events. We don't rely on any scheduling internals.
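
The "React + events" handshake described here can be sketched as a promise the demo resolves once React has committed, which the screenshot runner awaits instead of relying on rAF. The names below are illustrative, not the repo's actual API:

```javascript
// A one-shot signal: the demo calls markRendered() (e.g. from a React
// useEffect after the first commit), and the screenshot runner awaits
// `rendered` before capturing the page.
function createRenderSignal() {
  let markRendered;
  const rendered = new Promise((resolve) => {
    markRendered = resolve;
  });
  return { rendered, markRendered };
}
```

In the browser, the demo would call signal.markRendered() inside a useEffect; the runner would await signal.rendered and only then take the screenshot.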

@oliviertassinari
Member Author

OK, I'm merging with a follow-up to get proper spacing in the stories.

@oliviertassinari oliviertassinari merged commit 31d3633 into mui:master Feb 24, 2021
@oliviertassinari oliviertassinari deleted the visual-regressions branch February 24, 2021 11:59