Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pyproject.toml to dev-setup-ractoring #664

Conversation

afuetterer
Copy link
Member

@afuetterer afuetterer commented Aug 4, 2023

Hi there,

update: This PR got a little bit out of hand and I added a few more suggestions. Please let me know what we think.

My goal was to bring down the durations of your CI runs, which I succesfully did.

CI duration

I saw that your test suite has an impressive number of > 17k tests. The runtime of a single test run (one Python version + one DB) on CI took between 1 and 2 hours. So enough time for coffee and lunch. The PR proposes a few performance improvement in your test suite as well your CI setup.

Proposed changes (succesfully tested)

  • update used GitHub actions
  • set up pip cache in CI
  • separate DB setup steps in CI (i.e. do not start mysql service when testing postgres)
  • use the installed postgresql server in ubuntu 20.04 image in CI (i.e. no docker service), sadly I could not figure out how to access the postgresql db with postgres/postgres credentials, so I created a new user (postgres_user)
  • update testing dependencies (pytest, pytest-cov, etc.)
  • remove obsolete package: pytest-pythonpath
  • run tests in parallel using pytest-xdist
  • run tests in random order to identify dependent test cases
  • use faster password hasher when testing, i.e. MD5PasswordHasher
  • set DEBUG=False and other suggestions in testing config

So far all tests are still working and the run time came done to ~ 15 minutes per run. This is still enough time for coffee, but not for lunch anymore. I think you could do even better with more tweaks. But please let me know what you think.

For example: Do you really need to run the tests against sqlite in CI? Actually you test Django's capability to handle different backends, right?

Unable (unsuccesfully tested)

  • apt package installation takes ~ 1 minute, there is a GitHub action to cache apt package installation, but it does not work with latex packages

TOML setup

There is still room for improvement, but I wanted to suggest the toml for further discussion.

I added the following dependency groups:

  • mysql
  • postgres
  • tests
  • dev
  • ci

But tests, dev and ci are pretty similar.

Do you want to enforce using pre-commit for contributors? That is always a good idea. But if you want to use isort, flake8, (maybe black), you need to have on linting/reformatting-PR, I guess with the described 600+ files being changed. Also pre-commit should run on CI.

Flake8 sadly cannot be configured using the toml file. I would nowadays always recommend using ruff, which is really great and fast and adopted by major projects like pandas. ruff can replace isort and flake8 in my opinion.

Another question is still #648. The installation process using the toml file does not work with Python 3.6.

What do you think?

This was referenced Aug 4, 2023
@afuetterer afuetterer force-pushed the add-toml-to-dev-setup-refactoring branch 4 times, most recently from 9b1729e to 09907eb Compare August 5, 2023 17:25
@afuetterer afuetterer changed the title refactor!: add pyproject.toml [WIP] refactor!: add pyproject.toml Aug 6, 2023
@afuetterer afuetterer force-pushed the add-toml-to-dev-setup-refactoring branch from 09907eb to 65fe8a1 Compare August 13, 2023 08:13
@afuetterer afuetterer changed the title [WIP] refactor!: add pyproject.toml Add pyproject.toml to dev-setup-ractoring Aug 13, 2023
@afuetterer

This comment was marked as outdated.

setup.cfg Outdated Show resolved Hide resolved
pyproject.toml Show resolved Hide resolved
@jochenklar
Copy link
Member

How big is the difference between md5 hashing and the regular PBKDF2? If we change the settings I would also change the settings in my dev setup (which is documented here: https://github.com/rdmorganiser/rdmo/blob/master/docs/dev.md). The idea is to use the same fixtures for development as for testing.

@afuetterer

This comment was marked as resolved.

- run: python -Im pip install -e .[dev]
- run: python -Ic 'import rdmo; print(rdmo.__version__)'

required-checks-pass:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check can be set as required in pull requests.

@afuetterer

This comment was marked as resolved.

@jochenklar
Copy link
Member

Yes, thanks, then I will just update my dev setup here when I use the fixtures in testing.

@jochenklar
Copy link
Member

@afuetterer
Copy link
Member Author

Hmm, https://pypi.org/project/django-plaintext-password/ 💣

You want to use this instead to speed up things even more?

@jochenklar
Copy link
Member

I tried the plaintext stuff and luckily it is not faster than md5. It it probably about the rotations and not about the algorithm.

time pytest --reuse-db rdmo/conditions

master:
real    0m27.273s
user    0m22.530s
sys 0m1.102s

md5:
real    0m20.340s
user    0m15.641s
sys 0m1.167s

plain:
real    0m20.881s
user    0m16.081s
sys 0m1.141s

@jochenklar
Copy link
Member

The other new settings don't seem to make anything faster (for rdmo/conditions). I will try with the whole code.

@afuetterer afuetterer marked this pull request as ready for review August 18, 2023 13:21
@jochenklar
Copy link
Member

time pytest --reuse-db rdmo/projects

master:
real    40m21.091s
user    34m7.364s
sys 1m20.997s

md5:
real    16m28.026s
user    10m15.002s
sys 1m38.042s

even more impressive, this is only the md5 hasher. I am checking the other options, but I think we can remove them (except the DEBUG settings). We could also put the logging part in the mysql/postgres/sqlite files so that they are still used locally.

@jochenklar jochenklar merged commit 3b80fa0 into rdmorganiser:dev-setup-refactoring Aug 18, 2023
@afuetterer afuetterer deleted the add-toml-to-dev-setup-refactoring branch September 7, 2023 13:09
@MyPyDavid
Copy link
Member

MyPyDavid commented Sep 8, 2023

How big is the difference between md5 hashing and the regular PBKDF2? If we change the settings I would also change the settings in my dev setup (which is documented here: https://github.com/rdmorganiser/rdmo/blob/master/docs/dev.md). The idea is to use the same fixtures for development as for testing.

I think that the password fixtures for development are now broken. Should this hashing algorithm also be part of the rdmo-app settings for debug?
Including this in the rdmo-app settings fixes it

PASSWORD_HASHERS = ("django.contrib.auth.hashers.MD5PasswordHasher",)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants