- Docker and docker-compose usage, recommending Docker Deep Dive by Nigel Poulton to get to the necessary level
- Python at scripting level, recommending Introducing Python (Modern Computing in Simple Packages) by Lubanovic
- Python parallelism, recommending the chapter "Concurrency" from Expert Python Programming by Jaworski
- Knowing what testing is needed for, recommending Unit Testing Principles, Practices, and Patterns by Khorikov
- Knowing what CI tools like GitHub Actions or GitLab CI are needed for
We have around 5000 unit and integration tests in a Django backend API project, which we run with pytest. They take about half an hour to finish, both locally and in GitHub Actions (GA) CI. As a result, they produce coverage.xml and junit.xml files, which are exported to GA plugins to visualize results with a better GUI. We wish to run them faster.
We are using a docker-compose file like the one below to raise a dev environment:
version: '3.8'
services:
  db:
    image: postgres:13
    environment:
      POSTGRES_HOST_AUTH_METHOD: trust
      POSTGRES_DB: default
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
    expose:
      - "5432"
  redis:
    image: redis
    expose:
      - "6379"
  app:
    links:
      - db
      - redis
    build: .
In GitHub Actions, all of our tests are run with pretty much the command below:
docker-compose run -v $(pwd):/code -u 0 app pytest --cov=. --junit-xml=unit.xml .
The main advantages of this approach:
- the same tests are runnable locally,
- the same sidecar dependencies are raised in CI, so the local dev environment and the CI environment are the same test environment,
- the code for this CI test run is easily transferable to any other CI tool.
As disadvantages:
- people need to make sure the CI runner has docker-compose available inside the CI job, with a Docker daemon provided,
- sidecar containers may potentially not come up fast enough. In that case, you may want to use a tool like wait_for_it.sh, which, without any extra dependencies, waits the necessary time for a dependency to become available (in our case, for example):
wait_for_it.sh db:5432 -t 60 && wait_for_it.sh redis:6379 -t 60 && pytest
Important note
As artifacts of a test run, we produce unit.xml and coverage.xml files, so that passed tests and coverage results are published into the GitHub Actions graphical interface. This makes it faster to see the logs of specific broken tests without digging through raw log output, and to see changes in test coverage metrics.
- name: Publish Unit Test Results
  uses: EnricoMi/publish-unit-test-result-action@v1
  if: always()
  continue-on-error: true
  with:
    files: unit.xml
    check_name: Pytest tests
- name: Display coverage
  if: always()
  continue-on-error: true
  uses: ewjoachim/coverage-comment-action@v1
  with:
    GITHUB_TOKEN: ${{ github.token }}
    COVERAGE_FILE: coverage.xml
In the Python ecosystem, people usually assume pytest-xdist is the way to make tests run in parallel. It was suggested that I collect flaky-test data in Datadog and, after analyzing it, fix the tests so that nothing would prevent them from running under pytest-xdist. We needed to run our tests faster :)
First, we collected tests broken by wrongly working Django translation, which depends on Linux gettext. According to the information we found, gettext is not really thread-safe:
The GNU Gettext runtime supports only one locale per process. It is not thread-safe to use multiple locales and encodings in the same process. This is perfectly fine for applications that interact directly with a single user like most GUI applications, but is problematic for services and servers.
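For illustration, the broken tests were roughly of this shape. The example below is made up rather than taken from our suite, and assumes a German .po entry exists for the string:
# Hypothetical translation-dependent test; "de" and "Hallo" are assumptions.
from django.utils import translation


def test_greeting_is_translated():
    # translation.override() switches the active locale for this block; per the
    # gettext limitation quoted above, this kind of locale state is what makes
    # such tests fragile when tests using different locales run in parallel
    # within one process.
    with translation.override("de"):
        assert translation.gettext("Hello") == "Hallo"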
After finding an interesting source of information on this page, we can assume that the default way pytest-xdist runs tests looks like multithreading, but multiprocessing is clearly possible as well (see the code below). That could fix at least part of the test problems.
pip install pytest-xdist
# The most primitive case, sending tests to multiple CPUs:
pytest -n NUM
# Execute tests within 3 subprocesses.
pytest --dist=each --tx 3*popen//python=python3.6
# Execute tests in 3 forked subprocesses. Won't work on Windows.
pytest --dist=each --tx 3*popen//python=python3.6 --boxed
# Sending tests to the ssh slaves
pytest --dist=each --tx ssh=first_slave --tx ssh=second_slave --rsyncdir package package
# Sending tests to the socket server, link is available below.
python socketserver.py :8889 &
python socketserver.py :8890 &
pytest --dist=each --tx socket=localhost:8889 --tx socket=localhost:8890
Collecting further data from Datadog, we found tests that broke because they used the same cache storage in the shared Redis instance.
Other tests broke because one of them asserted the count of SQL queries made by the ORM. It became clear that the test was breaking because another process was running queries against the database at the same time, which allowed us to conclude that the shared sidecar db container broke tests simply because they all used the same database instance.
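For illustration, such a query-count assertion looks roughly like this hypothetical test (the endpoint and the expected count are made up), written with pytest-django's django_assert_num_queries fixture:
# Hypothetical query-count test, not taken from our real suite.
import pytest


@pytest.mark.django_db
def test_user_list_query_count(client, django_assert_num_queries):
    # The assertion fails whenever the number of ORM queries differs from the
    # expectation, which is the kind of check that, per the analysis above,
    # became flaky once several test processes worked against one shared database.
    with django_assert_num_queries(2):
        response = client.get("/api/users/")
    assert response.status_code == 200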
I received recommendations to fix the caching issue by using an in-memory solution, but it became obvious to me that we would keep running into one new reason or another why tests break, again and again.
This allows us to draw the following conclusions about using pytest-xdist:
- there is an additional development cost to fix all the current parallelism problems in the tests, and keeping them fixed will keep increasing development cost in the future;
- we would be degrading our test environment by replacing sidecar containers like Redis with in-memory alternatives, which decreases the quality of the tests.
Instead of using pytest-xdist... I realized we could just split the pytest tests into groups. Luckily, there is even a library for this: pytest-split. Each group of tests runs in its own docker-compose group of containers, so each process has its own db instance, redis instance, and whatever other sidecar dependency is needed. Thus, it is a perfect imitation of tests still being run in sequence, while actually being run in parallel :) The only small problem left to solve after that is merging the coverage and junit output results for our GitHub Actions GUI.
Since we are in Python, the solution is implemented in Python as well, with the help of the subprocess library for multiprocessing and the argparse library for a better self-documented interface.
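A minimal sketch of the idea, not our exact script: it assumes a compose service called app (as in the file above), pytest-split installed in the image, and made-up report paths under reports/:
# Sketch: run each pytest-split group in its own docker-compose project,
# so every group gets its own database and redis sidecars.
import argparse
import subprocess


def run_group(group: int, splits: int) -> subprocess.Popen:
    # A unique -p project name gives every group its own set of containers.
    project_name = f"pytest_group_{group}"
    command = [
        "docker-compose", "-p", project_name, "run", "--rm", "app",
        "pytest",
        "--splits", str(splits), "--group", str(group),  # pytest-split flags
        "--cov=.",
        f"--cov-report=xml:reports/coverage_{group}.xml",
        f"--junit-xml=reports/junit_{group}.xml",
    ]
    return subprocess.Popen(command)


def main() -> int:
    parser = argparse.ArgumentParser(description="Run pytest groups in parallel")
    parser.add_argument("--splits", type=int, default=2, help="number of test groups")
    args = parser.parse_args()

    # pytest-split groups are numbered from 1; start them all, then wait.
    processes = [run_group(group, args.splits) for group in range(1, args.splits + 1)]
    return 1 if any(process.wait() != 0 for process in processes) else 0


if __name__ == "__main__":
    raise SystemExit(main())
The real solution in the repository does more (building and tagging the image, collecting and merging reports; see the "under the hood" list further below), but the core mechanism is exactly this: subprocess plus a unique -p project name per group.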
First, we raise a self-hosted GitHub Actions runner with docker-compose available inside. Since the GitHub Actions installation documentation only offers a manual installation on Linux, and is clearly lacking in this regard compared to GitLab CI, which offers ready-made runner installations for Docker and even Kubernetes, we wrote our own solution that automates the default GitHub Actions runner installation through the pyexpect library, and thus containerizes it as well. The command to quickly raise a GA runner becomes TOKEN=your_github_token_to_register_runner docker-compose up. See the full code in the repository for reference.
- Install Docker (for example, on Ubuntu)
- Install docker-compose
git clone https://github.com/darklab8/darklab_github_ci.git
TOKEN=your_github_token_to_register_runner docker-compose up -d
# To check that the launch was successful:
docker-compose logs
app_1 | python3: running listener
app_1 |
app_1 | √ Connected to GitHub
app_1 |
app_1 | Current runner version: '2.294.0'
app_1 | 2022-09-16 21:39:58Z: Listening for Jobs
As the last step, we need to invoke our pipeline:
- Just push any code to the master branch of the repository with parallel_pytest installed, open a pull request merging commits to master, or request a new workflow run from the Actions interface :) You can see the full code of the solution here.
Under the hood it:
- checks out the repository code in the CI run
- builds the image if necessary and assigns it a tag name
- opens multiple subprocesses, each running its own docker-compose with a unique -p project_name in order to avoid container reuse
- outputs results as reports/junit_{number of process}.xml and reports/.coverage_{number of process}.xml
- runs a script to merge the multiple coverage and junit results into single files
- publishes the results into the GitHub GUI
python3 -m make parallel_pytest --dry # can help you to see the commands without running them, or to run the tests with a smaller set of selected tests.
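A rough sketch of what the merge step could look like; this is not the exact script from the repository, and it assumes raw per-process coverage data files plus the junitparser package for the junit XML files:
# Sketch: merge per-process junit and coverage results into single files.
import glob
import subprocess

from junitparser import JUnitXml  # assumed third-party helper for junit XML


def merge_junit(pattern: str = "reports/junit_*.xml", out: str = "unit.xml") -> None:
    merged = None
    for path in sorted(glob.glob(pattern)):
        suite = JUnitXml.fromfile(path)
        merged = suite if merged is None else merged + suite
    if merged is not None:
        merged.write(out)


def merge_coverage(pattern: str = "reports/.coverage_*", out: str = "coverage.xml") -> None:
    # "coverage combine" merges raw .coverage data files into one dataset,
    # "coverage xml" then exports the combined result for the GUI plugins.
    subprocess.run(["coverage", "combine", *sorted(glob.glob(pattern))], check=True)
    subprocess.run(["coverage", "xml", "-o", out], check=True)


if __name__ == "__main__":
    merge_junit()
    merge_coverage()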
- Our tests are now runnable in parallel, and we can enjoy having them run faster:
  - 1 core = 24 minutes 40 secs (no parallelism)
  - 2 cores = 15 minutes 31 secs
  - 6 cores = 8 minutes (PC with an Intel i5-10400, 6 cores / 12 threads)
- We haven't changed anything in the working code to make it more multithreading/multiprocessing safe. We solved the issue at a level above the code, so we don't need to keep the code any safer in the future than it already is.
- We kept the code running with sidecar containers for PostgreSQL, Redis, or anything else that is needed, thus having less difference between dev and prod and following the 10th rule (dev/prod parity) of The Twelve-Factor App.
- We received a solution universal enough to be reapplied to any other repository that needs to be sped up.
- We kept our paradigm of staying CI agnostic and can still run the solution easily locally.
- Refactoring the code to be more universal, so it becomes a reusable third-party library for any other repo
- Making the parallel_pytest script available as a compiled binary in Golang? That would make it a lightweight, installation-free solution, reusable in any repository with minimal time and weight to add
- Testing the same solution with applications in other languages, as long as a language-specific way to split the tests is found