
Overall automated testing strategy #671

Open · phil-davis opened this issue May 21, 2021 · 1 comment

phil-davis (Contributor) commented May 21, 2021

Background

Over the years we have developed various testing tools to test parts of the ownCloud systems, such as:

  1. unit tests in the same language as the units/classes they are testing (PHPUnit, JUnit, etc.; a minimal example follows this list)
  2. static code analysis tools that find potential problems without executing any code (code-style checks, linters, Phan, PHPStan, etc.)
  3. "integration" tests that cover more than one class, or that have different back-ends "plugged in" (these sometimes use the unit test framework, but exercise a group of things together, with real filesystems or databases functioning in the tests)
  4. "end-to-end" tests that run a full system-under-test and exercise it through its externally available entry points (API test suites, UI tests that "click around" in a browser, desktop client tests that drive the desktop client through its UI, https://docs.pact.io/, etc.), using frameworks such as Behat, Nightwatch and Jest

We now have a lot of test scenarios, and running them all on every PR to a repo takes a long time. The resources of the drone server and agents are often stretched to the limit, and CI for PRs keeps getting slower. We want to optimize the automated testing so that we still find "most" regressions/bugs while not running unnecessary and slow test scenarios.

The main two back-end server products are:
a. ownCloud10
b. OCIS

Those should have a "compatible" API. That means there should be a "core" of API functionality that works exactly the same in both products, while either product may have extra features/settings that are not implemented in the other.

Clients should be able to use those back-end products with minimal need to understand which back-end product is running. Where different features/settings are available, the back-end product would normally advertise that in some common "capabilities" report. That allows clients to know whether a capability is available, rather than having to know whether they are talking to ownCloud10 or OCIS.
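As an illustration, a client can feature-detect through the OCS capabilities endpoint instead of branching on the server product. A minimal TypeScript sketch, assuming the usual ocs/data/capabilities JSON layout; the capability key and credentials are placeholders:

```typescript
// Sketch: ask the server what it can do via the capabilities report,
// instead of checking whether it is ownCloud 10 or OCIS.
// The response shape (ocs.data.capabilities) is the usual OCS layout; verify
// against the real servers before relying on it.
async function supportsCapability(
  serverUrl: string,
  section: string,
  key: string,
): Promise<boolean> {
  const response = await fetch(
    `${serverUrl}/ocs/v1.php/cloud/capabilities?format=json`,
    { headers: { Authorization: 'Basic ' + btoa('user:password') } }, // placeholder credentials
  );
  const body = await response.json();
  const capabilities = body?.ocs?.data?.capabilities ?? {};
  return Boolean(capabilities[section]?.[key]);
}

// Example: only enable a versions panel when the back-end reports it.
// await supportsCapability('https://server.example', 'files', 'versioning')
```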

These are the main clients developed by ownCloud:

  1. web - the new browser-based UI
  2. "traditional" web UI used with ownCloud 10
  3. desktop client (sync client for Windows/Mac/Linux)
  4. iOS app
  5. Android app

The "traditional" web UI only runs with an ownCloud10 back-end. The other clients should work seamlessly with either back-end server products.

Testing Strategy

We have test suites available to us for:

  • API (runs reasonably fast)
  • browser-based UI (suites for both the "traditional" and "web" UI); these are quite slow to run, often up to 1 minute per scenario
  • desktop-client tests (basic tests exist and more are being written); they will be "moderately slow" because they have to interact with a UI
  • "simulation" of client-server interactions (pact tests). These have a "recording" of the interactions between a client and a server (requests and responses). Test pipelines can "replay" the interactions and verify that the responses are as expected. Replaying against a server is much faster than actually performing the actions in the client, so these tests can quickly confirm that changed server software still responds in a compatible way to a known client (a minimal consumer-test sketch follows this list).
Putting these together, the proposed strategy is:

  1. Use the core API tests as the "standard" for compatibility of back-end server implementations. Run them against both back-ends, as we do now.
  2. Run the "traditional" webUI tests in core oC10 CI. But if there are no UI code changes, we could skip the UI tests (we need to investigate how to reliably detect this; see the sketch after this list).
  3. Run "pact" (or similar style) tests of the web client against core oC10 and OCIS servers in the core and ocis repos. This gives confidence that the web client will still work, without the slowness of clicking around in the UI.
  4. Create and run "pact" test suites that simulate the other clients (desktop, iOS, Android) against core oC10 and OCIS servers (as they become available).
  5. Run the full set of "web" UI tests in the owncloud/web repo (with possible optimization based on mapping tests to JS code coverage).
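One possible way to implement the "no UI code changes" detection from item 2 is a small pre-step that diffs the PR against its target branch and only triggers the webUI suites when UI-related paths changed. The path patterns and the exit-code convention below are assumptions to be checked against the actual repo layout and Drone setup:

```typescript
import { execSync } from 'child_process';

// Path patterns that count as "UI code" (illustrative; tune for the real repo layout).
const UI_PATHS = [/^core\/js\//, /^core\/css\//, /^apps\/[^/]+\/js\//, /^settings\/js\//];

function changedFiles(targetBranch: string): string[] {
  const out = execSync(`git diff --name-only origin/${targetBranch}...HEAD`, {
    encoding: 'utf8',
  });
  return out.split('\n').filter(Boolean);
}

function needsUiTests(targetBranch = 'master'): boolean {
  return changedFiles(targetBranch).some((file) =>
    UI_PATHS.some((pattern) => pattern.test(file)),
  );
}

if (!needsUiTests()) {
  console.log('No UI-related changes detected; skipping webUI suites.');
  // Drone treats exit code 78 as "stop the remaining steps but report success";
  // confirm this against the Drone version in use before relying on it.
  process.exit(78);
}
```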

Testing Permutations

We have many permutations of ways to set up a back-end server. For example, in core ownCloud10 different databases and storage implementations can be used. Apps can be installed that "get in the middle" of almost every request: encryption inserts a layer of processing to/from the storage, user_ldap inserts different authentication flows, and the activity and audit apps cause extra hooks to fire on almost every request. In OCIS there are also different storage drivers, ways of authenticating,...

To check that each of these "major" installation permutations works, we run the test suites multiple times, once against each permutation. For example, the core test suites are run with the encryption app using master-key encryption, and again using user-keys encryption. cs3org/reva CI runs the API tests 3 times, once with each storage driver.
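To make the cost of a full matrix concrete, the permutations multiply quickly. A small sketch of how a CI config generator could expand storage and database choices into one pipeline per combination (the names are illustrative):

```typescript
// Expand the "major" installation permutations into individual CI pipelines.
const storages = ['local', 'objectstore', 'ocis', 's3ng'];    // illustrative values
const databases = ['sqlite', 'mariadb', 'postgres', 'oracle'];

interface Pipeline {
  name: string;
  storage: string;
  database: string;
}

const pipelines: Pipeline[] = storages.flatMap((storage) =>
  databases.map((database) => ({
    name: `apiTests-${storage}-${database}`,
    storage,
    database,
  })),
);

// 4 storages x 4 databases = 16 API-test pipelines, before encryption,
// user_ldap, activity/audit and browser-size permutations are even considered.
console.log(pipelines.map((p) => p.name).join('\n'));
```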

We often think of other permutations that would be nice to cover. For example, in some UI suites we run the tests in different-sized browser windows so that we know the UI works in mobile-phone-sized windows. This issue considers doing that for the "traditional" oC10 web UI: owncloud/core#38759

The frequency of running the permutations can be reviewed. Of course, the less frequently a permutation is run, the more difficult it becomes to triage a regression in it.

Possible Optimizations

  • investigate whether we can map tests to executed code (Test Impact Analysis) and then automatically run only the tests that are relevant to the code changed in a PR (see "reduce test run time with Test Impact Analysis" web#5127 for a first PoC in progress for web; a sketch of the selection step follows this list)
  • run fewer scenarios in regular PRs, for example do not run all the permutations of the permissions tests, but run the full suite nightly (disadvantage: a more obscure regression will not be noticed until after the PR is merged)
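The first bullet essentially reduces to a selection step like the following. A minimal sketch, assuming a coverage map (test file to the source files it executed) has been collected on a previous full run; the map format is an assumption:

```typescript
// Test Impact Analysis, selection step: keep only the tests whose recorded
// coverage overlaps the files changed in the PR.
type CoverageMap = Record<string, string[]>; // test file -> source files it executed

function selectTests(coverage: CoverageMap, changedFiles: string[]): string[] {
  const changed = new Set(changedFiles);
  return Object.entries(coverage)
    .filter(([, covered]) => covered.some((file) => changed.has(file)))
    .map(([testFile]) => testFile);
}

// Example with a toy coverage map:
const coverage: CoverageMap = {
  'tests/upload.spec.js': ['src/upload.js', 'src/http.js'],
  'tests/share.spec.js': ['src/share.js', 'src/http.js'],
};
console.log(selectTests(coverage, ['src/share.js'])); // -> ['tests/share.spec.js']
```

Changes to files that no test covers (build config, CI scripts, the coverage map itself) should fall back to running the full suite.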
phil-davis self-assigned this May 21, 2021
phil-davis (Contributor, Author) commented:

Just a note about test coverage. If we include test scenarios for "all" the combinations of things that can happen to files, the test suite gets long. If we leave out combinations, we get bugs reported. Here is one example:

owncloud/files_antivirus#334 (comment)

If you upload a new file, it gets scanned correctly. If you upload overwriting an existing file, then there is some issue with the virus scanning.

We often see bugs that are specific to combinations of conditions. Here is an initial list of combinations that usually need to be checked (a parameterized sketch follows the list):

  • upload a new file
  • upload overwriting an existing file
  • download file
  • download a folder of files
  • delete a file/folder
  • move/rename a file/folder
  • do these inside folders down some level(s)
  • put special characters and spaces in file names and folder paths
  • make sure old file versions are kept and can be restored (and after the file is renamed...)
  • make sure deleted files go in the trashbin and can be restored
  • restore files from the trashbin to different places, and when there is a new file with a matching name already there
  • check all the above when the user has fewer permissions
  • and more and more...
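One way to keep such combinations from drifting apart is to drive them from a single parameterized suite. A Jest-style sketch; `uploadFile` and `isScanned` are stand-in fakes here and would be wired to the real client/server calls in an actual suite:

```typescript
// Stand-in fakes so the sketch runs on its own; a real suite would upload via
// WebDAV and query the antivirus app instead.
const files = new Map<string, string>();
const scanned = new Set<string>();

async function uploadFile(path: string, content: string): Promise<void> {
  files.set(path, content);
  scanned.add(path); // the real server would fire the antivirus hook here
}

async function isScanned(path: string): Promise<boolean> {
  return scanned.has(path);
}

describe.each([
  { name: 'upload a new file', overwrite: false },
  { name: 'upload overwriting an existing file', overwrite: true },
])('antivirus scanning: $name', ({ overwrite }) => {
  it('scans the uploaded content', async () => {
    if (overwrite) {
      await uploadFile('/test.txt', 'original content');
    }
    await uploadFile('/test.txt', 'new content that must be scanned');
    expect(await isScanned('/test.txt')).toBe(true);
  });
});
```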

Murphy's Law says that the combination we do not test will be the combination that has a regression and breaks.
