reduce test run time with Test Impact Analysis #5127

Closed
individual-it opened this issue May 20, 2021 · 1 comment

@individual-it
Member

The whole idea is based on https://martinfowler.com/articles/rise-test-impact-analysis.html

issue

Currently, our web UI tests sum up to nearly 17h of CI time (including building web, setting up oc & ocis, etc.). To reduce the feedback time for developers opening a PR, those tests are massively parallelized (90+ UI test pipelines), but it still takes around 30min to run all pipelines.

The proposal is to use Test Impact Analysis to run fewer tests so that:

  • developers can find and run all tests that cover a specific code change
  • reduce the time developers have to wait for CI feedback
  • reduce CI cost
  • save the world by reducing energy use

basic algorithm

  1. run every test and collect coverage data per Scenario. POC PR for that: tests: add e2e coverage #5108 (all the glory to @LukasHirt because he found a way to get the coverage!)
  2. reverse the list to create a source file => test scenario map
  3. on every PR run only those scenarios that are associated with the changed source file
  4. on merge run all tests and rebuild the map
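Steps 1 and 2 could be sketched roughly as follows. This is only a sketch: it assumes one coverage-summary-style JSON file per scenario, mapping each source file to its covered line count (the format the jq query in the POC suggests); the directory layout and file naming are hypothetical.

```python
import json
from collections import defaultdict
from pathlib import Path

def build_map(coverage_dir):
    """Build a source-file -> set-of-scenarios map from per-scenario
    coverage summaries (assumed: one JSON file per scenario)."""
    source_to_scenarios = defaultdict(set)
    for summary in Path(coverage_dir).glob("*.json"):
        scenario = summary.stem  # assumed: file name identifies the scenario
        data = json.loads(summary.read_text())
        for source_file, stats in data.items():
            # only record files the scenario actually exercised
            if stats.get("lines", {}).get("covered", 0) > 0:
                source_to_scenarios[source_file].add(scenario)
    return source_to_scenarios
```

On every PR (step 3) the map is then queried with the changed file paths; on merge (step 4) it is simply rebuilt from scratch.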

further considerations

  • the map could be stored in git or in S3
  • changes in some files should trigger a run of the whole UI test suite
    • CSS files (they are not reported in the coverage)
    • test related files
    • dependencies
    • drone related files
  • changes in some files should not trigger any UI test run
    • docs
    • changelogs
    • unit tests
  • some tests should be tagged to be run always
    • smoke tests?
    • tests with a specific tag (in the current POC we are losing coverage on page reload or on session restart)
  • create a nice command, so that the developer can run all tests related to the current change, e.g. considering all files that differ from master (git diff --name-only master)
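The always-run / never-run rules and the map lookup could be combined into one decision helper. All patterns below are illustrative placeholders, not the project's real file layout:

```python
import fnmatch

# Illustrative patterns only: files whose change should trigger the whole
# suite, and files whose change should trigger no UI test run at all.
RUN_EVERYTHING = ["*.css", "tests/*", "package.json", ".drone.star"]
RUN_NOTHING = ["docs/*", "changelog/*", "*.spec.js"]

def scenarios_for_change(changed_files, source_to_scenarios, always_run):
    """Decide which scenarios to run for a set of changed files.
    Returns None to signal 'run the full suite'."""
    to_run = set(always_run)  # e.g. smoke tests tagged to run always
    for path in changed_files:
        if any(fnmatch.fnmatch(path, p) for p in RUN_EVERYTHING):
            return None
        if any(fnmatch.fnmatch(path, p) for p in RUN_NOTHING):
            continue
        # files missing from the map are unknown, so play it safe
        if path not in source_to_scenarios:
            return None
        to_run |= source_to_scenarios[path]
    return to_run
```

For a docs-only change this returns just the always-run set; for a CSS or CI-config change it falls back to the full suite.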

possible future improvements

  • get coverage only in When and Then steps, not in Given steps as those are only for "setup" (maybe using AfterStep hook)
  • build the map not based on file but on functions or git diff chunks
  • add the test code to the map, so that when e.g. a page object is changed, only tests that use that object are executed
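For the variant based on git diff chunks, the changed line ranges could be taken from `git diff -U0` output and then intersected with per-line coverage data. A minimal parser sketch (hunk headers only, no rename or deletion handling):

```python
import re

# Matches unified-diff hunk headers like "@@ -10,0 +11,2 @@"
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")

def changed_lines(diff_text):
    """Extract {file: set of changed line numbers} from `git diff -U0`."""
    changes, current = {}, None
    for line in diff_text.splitlines():
        if line.startswith("+++ b/"):
            current = line[len("+++ b/"):]
            changes.setdefault(current, set())
        elif current:
            m = HUNK_RE.match(line)
            if m:
                start = int(m.group(1))
                count = int(m.group(2) or 1)  # missing count means 1 line
                changes[current].update(range(start, start + count))
    return changes
```

A scenario would then be selected only if its coverage overlaps one of these line sets, instead of merely touching the same file.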

demonstrating potential & limitations

I've created a map for 591 scenarios and then checked how many of those would have to be run for some recent PRs.
This is not super accurate, because the map is created from the current master and would potentially have been different at the moment each PR was created, but it should give an idea of what can be achieved.
Also, this is not claiming to be a statistical analysis; it should only show the potential and the limitations of the approach.

| PR | Title | Contributor | # of scenarios to run |
| --- | --- | --- | --- |
| #5023 | fix: display navigation for resolved private link | @LukasHirt | 4 |
| #5027 | feat: add focus trap to left sidebar | @LukasHirt | 579 |
| #5046 | Enhancher Trashbin a11y | @janackermann | 534 |
| #5053 | Change appearance of the share action buttons | @JammingBen | 34 |
| #5056 | Fix z-index on the new file menu | @JammingBen | 560 |
| #5073 | background loading for avatars | @kulmann | 584 |
| #5095 | Skip editors route check on empty array | @kulmann | 584 |
| #5112 | Fix collaborator tag for currentUser in sidebar | @pascalwengerter | 584 |
| #5118 | fix: make skip to link visible on focus | @LukasHirt | 1 |
| #5122 | Fix indirect via-share in collaborator sidebar section | @pascalwengerter | 211 |
Script to find all coverage files that report any coverage of a given file
#!/bin/bash
# usage: ./find-tests.sh "src/App.vue,src/foo.js"
touchedFiles=$1
tests=""
for touchedFile in $(echo "$touchedFiles" | sed "s/,/ /g")
do
    for file in $(find coverage/ -name "*.json")
    do
        # sum all covered-line counts reported for the touched file
        covered=$(jq 'keys[] as $k | "\($k) \(.[$k].lines.covered)"' -r "$file" | grep "$touchedFile" | cut -d" " -f2 | awk '{s+=$1} END {print s+0}')
        if [ "$covered" -gt 0 ]; then
            tests="$tests $file"
        fi
    done
done

echo "$tests" | tr " " "\n" | sort -u | grep -v "^$"

conclusion

With this naive approach, the number of tests run would decrease only insignificantly for many PRs, but for some it would make a huge difference, especially for those that change only a small part of the code.
The other optimisations listed above should reduce the number of tests executed even further.

@individual-it
Member Author

As the discussion has progressed in the last weeks, I would suggest first investing more into unit tests and seeing if we can cut down on acceptance tests; then we can think about coming back to this.
