Proposal: enable spec globbing, enforce isolated renderer processes for each spec #681

brian-mann · 2017-09-21T20:32:19Z

What users want

To be able to run multiple specs files

How we could do this

Add glob support to cypress run
cypress run --spec cypress/integration/login/**/*

Doesn't sound that hard, but there are many potential areas this will affect.

What I'm proposing

Enable multiple specs, but force each spec to run in its own isolated renderer process

Why is this necessary?

The default behavior is bad

As of today, we automatically "merge" and group multiple spec files by using a magical route called: __all.

This means that when you iterate on the spec itself: it's isolated because it's the only spec file served. But when you run them all together, they are no longer isolated. This can cause unforeseen issues to arise when running all the specs together which are difficult if not impossible to reproduce.

Users don't use the Test Runner and the CLI as intended

We constantly see users running all their specs from the Test Runner.

The Test Runner is not designed for this use case. It's designed to run a single or handful of specs at a time as you iterate and build your tests. The Test Runner is specialized for debuggability while the CLI is designed for running all of your tests.

When the Test Runner runs, it's constantly creating snapshots, holding references in memory, etc. These quickly add up and your tests end up going slower and slower and oftentimes crash the entire renderer process. This is a design decision, not a bug. When running from the Test Runner it's specialized to enable users to interact with the tests' commands. Running from the Test Runner never "ends" or exits.

This is much different than the CLI. Running from the CLI prevents you from interacting with anything. The CLI will run "to completion" and exit whereas the Test Runner never actually ends. Because of these differences, when running from the CLI, Cypress makes several internal performance optimizations to prevent memory exhaustion and to run as fast as possible.

Watch mode is disabled automatically on 'All Tests'

Users expect "Run All Tests" to reload when you change a spec file. File watching is a CPU intensive task and because we've had users write hundreds of spec files, decided not to always do this. This goes back to the theory that you shouldn't be using this mode to run all your tests.

Solving the Test Runner problem

One possible solution is to count the number of tests you're running in the Test Runner and automatically disable the built in debuggability features and to run the same way the CLI does. We'd need to show a warning message to the user indicating we're "switching off" these features and you won't be able to interact with any of the tests.

This may be pretty annoying but at least it will communicate the intent of the Test Runner well. This option could be exposed via cypress.json such as { interactabilityThreshold: 5 }. Users could then set this really high or disable it to get the old functionality back.

The Downsides

Running All the Tests

By enforcing specs to run in their own isolated process, we instantly have a huge problem for the Test Runner.

For instance there is a:

'Run All Tests' button
A seeded example_spec that contains wayyyyy too many tests and does not represent a normal use case.

This could be solved by removing the Run All Tests button but users would likely complain about this and it's unexpected.

Why can every other testing tool run all tests but Cypress can't?

The example_spec could be solved by splitting up the kitchen sink tests into a series of files:

querying_spec
stubbing_spec
network_spec
etc, etc, etc

Splitting up the example_spec into smaller files is an easy win: but then we're back to the problem above: "How do I run all my tests?"

We could kick the can on this, and continue to let users run all of their tests from the Test Runner. Unfortunately, if we introduce a "warning message" this would continue to communicate confusing intent.

Why is it warning me like I'm doing something wrong when all I've done is click the first button I've seen!

Another option is when running "All Tests" from the Test Runner we iterate and highlight each spec but then kill the process when switching to the new spec. This kind of defeats the purpose of the GUI as you would not be able to interact (and failures would instantly go away). Not a good option.

The Upsides (why this would help all users)

Forcing each spec to run in its own isolated renderer would do a few very important things:

It would prevent any global leakage from one spec to the next
It would automatically restart the browser / renderer process in between each spec
Memory would automatically be purged and we'd never have to worry about memory leaks
The browser would likely crash much less often
This positions users to better maintain and manage a growing suite of tests
Introducing parallelization and load balancing becomes painlessly simple
Users would overall see more consistent test results

Parallelization

Forcing specs to run isolated is actually the same thing as performing parallelizations by spinning up multiple docker containers or instances and then dividing all of the spec files between them.

When this happens you still have all the same problems as what we listed above:

You will need to stitch together multiple external reports
You will receive multiple isolated videos
You will receive multiple stdout text

Because users are already doing this (manually) it means they're overcoming these problems, or they aren't really problems to begin with - which is why I'm okay with making this the default behavior.

This problem is further mitigated by the way the Dashboard Service works. It can automatically associate groups of independent specs to the same run. This means all the data will be aggregated in the same place making it appear as if the run happened on a single machine.

Load Balancing

Splitting up the specs enables us to easily load balance parallelization.

Instead of manually balancing by machine like this:

Machine A gets Spec 1,2,3
Machine B gets Spec 4,5
Machine C gets Spec 6,7,8,9
Machine D gets Spec 10,11

Each machine would run at the same time and simply ask "for the next spec file".

That means the work completes as fast as possible without needing to manually balance or tweak anything.

Load balancing is something our Dashboard Service is already doing on internal projects.

Failure Isolation

Another benefit is that by splitting up the specs and taking multiple videos, you have a much easier time focusing on the real thing you care about - failures.

Instead of having a 30+min run, your videos would only be the duration of each spec file. Much better and easier to work with.

More consistent results

Splitting out spec files will absolutely without a doubt yield better, more consistent results. Browsers aren't perfect and Cypress pushes them to their absolute limit. Garbage collection does not work in our favor - letting the browser decide when to GC oftentimes leads it to eating up huge chunks of "primed" memory and there is no way to force it to release.

We've been internally running in isolated specs (oftentimes thousands of specs across dozens of files) and it has removed nearly all flake. Previously this flake only cropped up when running all the specs at once and it was incredibly difficult to default and took a huge amount of effort.

Better video names

Videos would no longer have a random name. Instead they could be named after the spec file: cypress/videos/login_spec.js.mp4

New Challenges

The default spec reporter and the additional Cypress messages would run for as many N specs you have. Instead of receiving a single "report" Cypress would essentially iterate through each spec file and start the "Tests Running" message all the way through to "Tests Finished". Internally we'd keep state with the number of failures and still exit with the total aggregate.

External Reporters

This would unfortunately create a problem for external / 3rd party reporters. Instead of receiving a single report, it would generate a report for each spec file. While this sounds bad it's not that different than what already happens when you parallelize a run.

The text was updated successfully, but these errors were encountered:

brian-mann · 2017-09-21T20:53:19Z

Related to

brian-mann · 2017-09-23T18:43:56Z

After debating this and considering many different options we've decided on the following:

CLI

provide --spec globbing support
isolate each spec and purge primed memory by killing the renderer process in between specs
the terminal's stdout will repeat the same messaging we have now for each spec file. "Starting Tests", "Finished Tests", etc.
we will record an independent video for each spec and name the video after the spec file
we will upload data to the dashboard as soon as each spec finishes which provides results faster
we will provide a final aggregate of all the failures / duration at the very end of the run. We will continue to exit with the number of failures
instead of trying to 'stitch' all the specs together as one contiguous run, we are going to split each into its own 'start' and 'end'. This will mean there will be a N reporters for N number of specs.

Test Runner

Continue enabling users to "Run all Tests"
Enable clicking folders to automatically run a subset of specs
Instead of killing the renderer process between specs the browser will simply be "navigated" to the next spec URL.
This will isolate each spec, and be much faster and less abrupt than killing the renderer like we will for the CLI. However this makes sense considering Cypress already has the code for swapping the URL and rehydrating the state of the reporter. We already do this when swapping domains in a cy.visit.
Isolating specs will help with memory exhaustion, but its still possible to blow out on a very long and intense spec file.
To solve this we'll begin purging snapshots by reducing the default minimum of 50, down to 5.
Purged snapshots will only affect passing tests and we will not purge data for failing tests. This will enable users to review the commands and output for failed tests - as they likely don't care about the data for passing tests anyway.
We'll need to add a good user experience / messaging around why snapshots were purged
When running multiple specs we'll add the spec file as part of the command log which will give you a shortcut for focusing in on a spec file.

Dashboard

Upgrade the dashboard to enable multiple videos per instance (spec)
Update the UI to account for multiple independent specs
Group and associate parallelized instances to a single 'group'
Help the user by focusing all attention on failing tests
Display the spec on each failure
Add a section for 'stdout' for each failure.

Documentation

we will direct / document common solutions for things like aggregating multiple reporter output (JSON, JUnit, etc)
document how to use multiple reporters
document reducing the default minimum of snapshots from 50, down to 5.
We'll modify the workflow and documentation to suggest iterating on a spec and not trying to run all tests from the Test Runner.

jennifer-shehane · 2018-05-04T14:32:10Z

Reopening for issue tracking purposes - since this is an epic with multiple issues within it in Zenhub. The work for the issue itself has been done.

jennifer-shehane · 2018-05-14T19:47:20Z

I tested the spec globbing in the develop branch and ran into a couple of unexpected behaviors:

1. When passing a --project flag with a --spec flag, I expected the --spec flag to work from the project directory i specified, but it only works from the CWD. This was probably mostly confusing because of the way the error message printed as just:

We searched for any files matching this glob pattern:

../cypress/integration/runs/tabs/insights**

I couldn't tell which project dir it was looking within since it didn't print to the err msg.

2. Sometimes it thinks Cypress isn't installed

This code worked fine:

npm run cypress:run -- --project ../cypress-dashboard --spec ../cypress-dashboard/cypress/integration/app_spec.coffee

This code threw an error about install:

Code

npm run cypress:run -- --project ../cypress-dashboard --spec ../cypress-dashboard/cypress/integration/runs/tabs/insights**

Error

> node ./cli/bin/cypress run --dev "--project" "../cypress-dashboard" "--spec" "../cypress-dashboard/cypress/integration/runs/tabs/insights_parallel_spec.coffee" "../cypress-dashboard/cypress/integration/runs/tabs/insights_spec_metrics_spec.coffee"

No version of Cypress is installed.

Please reinstall Cypress by running: cypress install
----------

Cypress executable not found at: /Users/jennifer/Dev/Projects/cypress/cli/dist/Cypress.app/Contents/MacOS/Cypress
----------

Platform: darwin (14.5.0)
Cypress Version: 0.0.0

ryan-mulrooney · 2018-05-27T00:38:06Z

Would someone be able to point towards some documentation for spec globbing, please (@jennifer-shehane?)?

I have the following folder structure:

cypress
- integration
  - e2e
    - e2e_test1
    - e2e_test2

I used the follow CLI command to attempt to glob and run the tests: " ./node_modules/.bin/cypress run -s ./cypress/integration/e2e/e2e**"

I expected the both e2e tests to be run concurrently, but they were ran sequentially.

Thanks.

brian-mann · 2018-05-30T04:27:37Z

Released in 3.0.0.

brian-mann · 2018-05-30T04:29:42Z

@ryan-mulrooney https://docs.cypress.io/guides/guides/command-line.html#Run-tests-specifying-a-glob-of-where-to-look-for-test-files

Spec globbing has nothing to do with running concurrently. Trying to run concurrent e2e tests is not recommended and doesn't work the way you really expect it to. Browsers will not act the same if they are not in focus. It's better to parallelize specs at the operating system level in CI (usually with docker containers).

One of our next big feature releases will have spec load balancing / parallelization to help you do just that.

wmonk · 2018-07-03T09:17:05Z

@brian-mann "One of our next big feature releases will have spec load balancing / parallelization to help you do just that.", is there a way to track the implementation of this? As mentioned previously in this thread, parallelizing through docker & glob is the preferred method right now, but we are wondering wether to spend time doing this, or wait until Cypress has support for it.

Do you know what this feature might look like as well?

jennifer-shehane · 2018-07-18T15:34:35Z

@wmonk Hi, yes! There is this issue #1690

We had a lot of discussion behind the scenes on the implementation details, which have now been completely settled. Am going to update that issue with more detail.

WhatFreshHellIsThis · 2021-12-17T23:26:28Z

It doesn't appear this was ever implemented or am I missing something?
Enable clicking folders to automatically run a subset of specs
clicking on a folder just opens or closes the folder, nothing offered to test that folder.

brian-mann added the stage: proposal 💡 No work has been done of this issue label Sep 21, 2017

brian-mann self-assigned this Sep 21, 2017

jennifer-shehane mentioned this issue Oct 10, 2017

Any plans to run tests in parallel? #64

Closed

brian-mann mentioned this issue Oct 18, 2017

Execute a subset of specs by subfolders in UI #759

Closed

brian-mann mentioned this issue Oct 26, 2017

Cypress slows down drastically over long runs - restart renderer between spec files? #431

Closed

jennifer-shehane assigned jennifer-shehane and bahmutov Oct 30, 2017

jennifer-shehane added stage: in progress pkg/example This is due to an issue in the packages/example directory and removed stage: proposal 💡 No work has been done of this issue labels Oct 30, 2017

jennifer-shehane mentioned this issue Oct 30, 2017

example spec file too long cypress-io/cypress-example-kitchensink#9

Closed

jennifer-shehane added Epic Requires breaking up into smaller issues pkg/desktop-gui This is due to an issue in the packages/desktop-gui directory pkg/reporter This is due to an issue in the packages/reporter directory labels Oct 30, 2017

This was referenced Nov 1, 2017

Recover from renderer / browser crashes #349

Closed

Paused before potential out-of-memory crash #882

Closed

brian-mann mentioned this issue Nov 10, 2017

Running tests from a specific groups #901

Closed

This was referenced Nov 24, 2017

Do not crash if onBeforeLog returns false #958

Closed

If running single spec, name output video file using spec name #980

Closed

jennifer-shehane added type: feature New feature that does not currently exist type: performance 🏃‍♀️ Performance related labels Dec 1, 2017

This was referenced Dec 1, 2017

Tests slows down to a crawl and eventually times out/crashes #990

Closed

Add option to run specs outside of cypress directory #1012

Closed

Cypress resolves import incorrectly? #1035

Closed

This was referenced Dec 16, 2017

Renderer process crashed #598

Closed

Crash after allocation failure #1077

Closed

brian-mann closed this as completed Apr 15, 2018

brian-mann mentioned this issue Apr 15, 2018

Change the way "Run All Specs" are run in Interactive Mode (from the GUI) #1586

Open

brian-mann mentioned this issue Apr 25, 2018

The more the test, the higher the memory occupation #1636

Closed

chrisbreiding mentioned this issue May 1, 2018

Plugins API - preprocessing with source file - electron/ chrome run #1654

Closed

jennifer-shehane mentioned this issue May 4, 2018

Multiple files, multiple beforeEach, not all run #1678

Closed

jennifer-shehane reopened this May 4, 2018

jennifer-shehane mentioned this issue May 14, 2018

Support Firefox browser #1096

Closed

11 tasks

jennifer-shehane added stage: pending release and removed stage: in progress labels May 17, 2018

brian-mann closed this as completed May 30, 2018

brian-mann mentioned this issue Jun 14, 2018

3.0 - custom reporters broken now that tests are run in isolation #1824

Closed

brian-mann mentioned this issue Jun 26, 2018

Test runs fine individually but fails when "run all" is given #2028

Closed

jennifer-shehane mentioned this issue Jul 12, 2018

Reasonable Metrics for Test Latency #2133

Closed

kuceb mentioned this issue Jul 17, 2018

cypress slows dramatically then crashes during text input, especially with clear #579

Closed

brian-mann mentioned this issue Sep 3, 2018

Tests Run Slowly #2206

Closed

jennifer-shehane removed stage: pending release labels Nov 5, 2018

jennifer-shehane mentioned this issue Dec 20, 2018

Cypress Run - Do not relaunch browser in between each spec file #2951

Closed

rbaxter08 mentioned this issue Feb 5, 2020

Bug: Crash "Error: write EPIPE" when running all specs #5581

Closed

lmiller1990 mentioned this issue Jul 19, 2022

CT Performance Improvements: v10 slower than v9 #22353

Closed

jennifer-shehane added Cypress Cloud Feature request or issue in Cypress Cloud, not App and removed external: cloud labels Oct 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: enable spec globbing, enforce isolated renderer processes for each spec #681

Proposal: enable spec globbing, enforce isolated renderer processes for each spec #681

brian-mann commented Sep 21, 2017 •

edited by jennifer-shehane

Loading

brian-mann commented Sep 21, 2017 •

edited

Loading

brian-mann commented Sep 23, 2017 •

edited by jennifer-shehane

Loading

jennifer-shehane commented May 4, 2018

jennifer-shehane commented May 14, 2018 •

edited

Loading

ryan-mulrooney commented May 27, 2018 •

edited

Loading

brian-mann commented May 30, 2018

brian-mann commented May 30, 2018 •

edited

Loading

wmonk commented Jul 3, 2018

jennifer-shehane commented Jul 18, 2018

WhatFreshHellIsThis commented Dec 17, 2021

Proposal: enable spec globbing, enforce isolated renderer processes for each spec #681

Proposal: enable spec globbing, enforce isolated renderer processes for each spec #681

Comments

brian-mann commented Sep 21, 2017 • edited by jennifer-shehane Loading

What users want

How we could do this

What I'm proposing

Why is this necessary?

The default behavior is bad

Users don't use the Test Runner and the CLI as intended

Watch mode is disabled automatically on 'All Tests'

Solving the Test Runner problem

The Downsides

Running All the Tests

The Upsides (why this would help all users)

Parallelization

Load Balancing

Failure Isolation

More consistent results

Better video names

New Challenges

External Reporters

brian-mann commented Sep 21, 2017 • edited Loading

brian-mann commented Sep 23, 2017 • edited by jennifer-shehane Loading

CLI

Test Runner

Dashboard

Documentation

jennifer-shehane commented May 4, 2018

jennifer-shehane commented May 14, 2018 • edited Loading

ryan-mulrooney commented May 27, 2018 • edited Loading

brian-mann commented May 30, 2018

brian-mann commented May 30, 2018 • edited Loading

wmonk commented Jul 3, 2018

jennifer-shehane commented Jul 18, 2018

WhatFreshHellIsThis commented Dec 17, 2021

brian-mann commented Sep 21, 2017 •

edited by jennifer-shehane

Loading

brian-mann commented Sep 21, 2017 •

edited

Loading

brian-mann commented Sep 23, 2017 •

edited by jennifer-shehane

Loading

jennifer-shehane commented May 14, 2018 •

edited

Loading

ryan-mulrooney commented May 27, 2018 •

edited

Loading

brian-mann commented May 30, 2018 •

edited

Loading