Pytest testing support #21

Open · wants to merge 48 commits into base: dev

Conversation

@janiemi janiemi commented Feb 22, 2023

This pull request contains support for implementing Pytest tests for the Korp backend, along with a couple of sample tests.

Main features

The main features of the pull request are listed below. Please see tests/README.md for some more documentation on how to run and write tests.

  1. Directory tests contains a couple of sample tests and some supporting code. Tests are divided into functional and unit tests (with correspondingly named subdirectories): functional tests exercise endpoints as seen by the user, whereas unit tests test individual functions.

  2. To allow overriding configuration settings for testing purposes, the parameter config_override was added to korp.create_app.

  3. tests/conftest.py contains a couple of fixtures that should make it easier to set up commonly used test prerequisites, such as corpus data and a test client. More fixtures can be added as deemed useful.

  4. Requirements for testing are in requirements-dev.txt, as advised by @aajarven.

  5. The (functional) tests can use CWB corpus data, which is encoded for each test session from corpus source files in tests/data/corpora/src, as I thought it would make the tests more transparent than using pre-encoded corpus data. It would not seem to be significantly slower if the test corpora are relatively few and small. Encoding the corpus data requires cwb-encode and cwb-make.

    The corpus data is represented as VRT (VeRticalized Text), the CWB input format, one file per CWB corpus. To make the file more self-contained and to simplify encoding, I opted for listing both positional and structural attributes in the VRT file as comments at the beginning of the file; for example:

    <!-- #vrt positional-attributes: word lemma -->
    <!-- #vrt structural-attributes: text:0+id paragraph:0+id sentence:0+id -->
    

    If you have suggestions for a better approach for specifying the attribute information (or the whole corpus data), please tell us. It would also be possible to support alternative approaches.

    In addition to the actual corpus data, a corpus needs to have an .info file. As a future improvement, if the .info file does not exist, the information in it could be computed from the VRT file.
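
For illustration, a complete minimal VRT file along these lines might look as follows (a made-up sketch: the tokens, lemmas and ids are invented, and positional attribute columns are separated by tabs):

    <!-- #vrt positional-attributes: word lemma -->
    <!-- #vrt structural-attributes: text:0+id paragraph:0+id sentence:0+id -->
    <text id="t1">
    <paragraph id="p1">
    <sentence id="s1">
    Cats	cat
    sleep	sleep
    .	.
    </sentence>
    </paragraph>
    </text>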

Open questions

At least the following questions are open:

  1. How to test endpoints that use the MySQL database? Should we have a real database, populated for each test session, or should we only mock database access? I think for some tests, a real database would be better. @aajarven mentioned testing.mysqld but recommended that we should also investigate other options.

  2. How to test plugins? Plugin code and tests may be in another repository but I think plugin tests should be able to access fixtures and test corpora.

  3. Would it be possible to automate testing with GitHub (or with some other test automation service)? It probably is, but what about the requirement of having CWB installed?

  4. How to organize tests in a good way? Maybe some requests could be made fixtures in the individual test modules, so that testing different properties of the result in separate tests would be easier and less repetitive. I extracted a common pattern of making a request, asserting its success and getting the JSON result into a function in tests/testutils.py (see the sketch after this list).

  5. How to test with different values for configuration variables? If a configuration variable is checked when making a request, it is perhaps possible to mock or override it within a test function. But I think that does not work if the configuration variable is used in korp.create_app. Should we parametrize the app fixture to be able to override configuration variable values, or would some other solution be better?
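
To give an idea of the helper mentioned in question 4, it could look roughly like the following sketch (the actual function in tests/testutils.py may differ in details; a Flask test client is assumed):

def get_response_json(client, endpoint, params=None):
    """Make a GET request, assert that it succeeded and return the JSON body."""
    response = client.get(endpoint, query_string=params or {})
    assert response.status_code == 200
    return response.get_json()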

General comments

I think it made implementing testing easier that the code had been split into modules.

We intend to rebase the branches containing our modifications on this branch, so that we can provide some tests for the modifications for which we make pull requests. However, we do not promise comprehensive tests up front.

@janiemi
Author

janiemi commented Feb 23, 2023

I pushed a couple of commits:

  • Create and use temporary cache and corpus configuration directories for testing, instead of the default (or instance-configured) ones.
  • Add a very simple corpus configuration under tests/data/corpora/config for testing.
  • Add a very basic test for /corpus_config using the test corpus configuration.

I also edited the description to add open questions on testing with different configurations.

@aajarven

All in all, I think this is a great step in the direction of making contributing to the code easier and safer for new developers. Great work!

Here are some of my thoughts (ramblings?) about the questions:

  1. I tried googling around a bit, but it seems that either people aren't mocking their databases for tests anymore or at least they are not talking about it online. The rare cases that are at least recent-ish seem to mention testing.mysqld, so if we decide to go that way, that might be the best bet, though even that has its most recent commit from 2018, which is a bit worrisome. It might be that my googling wasn't extensive, though.

In an ideal world, we would perhaps refactor the code so that we have one database interface that handles the actual SQL stuff. Then we mock methods like db.get_lemgram(some, arguments, that, make, sense) when testing views and such, and when testing the actual interface we mock the cursor/connection/whatnot and assert that the right SQL clauses have been evaluated. Pros: this should not slow our tests at all, mocking different kinds of data for different tests is a breeze, and additionally the code is a bit more robust against stuff like having to change the underlying database technology. Cons: this might require a hefty refactoring, and we won't be automatically testing whether the SQL clauses actually work.
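
For example, with such an interface in place, a view test could stub the database along these lines (a toy sketch: the korp.db.get_lemgram target path and the return value are invented here for illustration):

from unittest import mock

def test_lemgram_count(client):
    # Stub the hypothetical database interface so no real MySQL is needed
    fake_rows = [("run..vb.1", 42)]
    with mock.patch("korp.db.get_lemgram", return_value=fake_rows):
        ...  # make the request against the endpoint and assert on the result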

  2. Crossing the repository border in tests does indeed make things a bit hairy. If it were just the software itself that we needed when testing plugins, my instinct would be to make the relevant repository or repositories pip-installable and add them as requirements for the plugin repositories. If we also need the test code and fixtures, we may have to go with cloning the relevant repositories and using the relevant parts from there. As a silver lining, this should be easy to automate into something like make test.

  3. I think this should definitely be possible. If I recall correctly, even the runners hosted by GitHub allow installing software using apt and doing all kinds of things during the job. Using an Ansible playbook for the setup could help us keep the production and test setups somewhat in sync and simplify the pipeline. If the process is heavy and the tests run often, we might run out of free minutes, though.

Another option is self-hosted runners. The upside is that we can have a runner container that already has all the basic software installed, and that we won't be dependent on how many free GH Actions minutes we are given. The downside is that we must host the container ourselves somewhere and think about security more rigorously. The main security concern is that if a pipeline is run automatically, e.g. on a pull request, a malicious actor can execute arbitrary code on our machine. The Language Bank does have a self-hosted runner in use for one repository, though.

  4. The current way of organizing tests looks good to me. It's likely (maybe even certain) that if we do a lot of testing, we will come up with improvements like new useful fixtures, but it's usually best to add those at a later stage: "three strikes and you refactor" and so forth.

  5. The exact way we want to do this likely varies depending on what exactly is needed. If we want to run all tests against different kinds of app configurations, we could just parametrize the app creation fixture, along the lines of this toy example:

$ cat test.py
import pytest

def func(param):
    return f"string with {param}"

@pytest.fixture(params=["different", "values", "yay!"])
def string_from_func(request):
    return func(request.param)

def test_string_properties(string_from_func):
    assert "string" in string_from_func

$ python -m pytest test.py -v
==================================================== test session starts ====================================================
platform linux -- Python 3.8.10, pytest-7.2.1, pluggy-1.0.0 -- /home/vagrant/scratch/korp/Kielipankki-korp-backend/.venv/bin/python
cachedir: .pytest_cache
rootdir: /home/vagrant/scratch/korp/Kielipankki-korp-backend
collected 3 items                                                                                                           

test.py::test_string_properties[different] PASSED                                                                     [ 33%]
test.py::test_string_properties[values] PASSED                                                                        [ 66%]
test.py::test_string_properties[yay!] PASSED                                                                          [100%]

===================================================== 3 passed in 0.00s =====================================================

Or if that's overkill, we could do more targeted parametrization, perhaps with @pytest.mark.parametrize at the level of individual tests if need be.
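
For example, a test-level version of the same toy test could be:

import pytest

# func as in the toy example above
def func(param):
    return f"string with {param}"

@pytest.mark.parametrize("param", ["different", "values", "yay!"])
def test_string_properties(param):
    assert "string" in func(param)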

@janiemi
Author

janiemi commented Feb 28, 2023

@aajarven, thanks for your insights! Here are a couple of comments.

  1. I suppose refactoring the database access would indeed make testing easier. As you hinted, it might also make it easier to support DBMSs other than MySQL/MariaDB, even though that would require more work and maybe another level of abstraction, perhaps using SQLAlchemy or the like.

    Looking at the database access code, I think that some parts of it would be relatively simple to refactor, whereas others would be more complicated, because database access is intertwined with caching (/timespan) or yielding results (/relations). Thus, I might not dare do the refactoring myself, in particular without any tests, unless I got a clear green light from @MartinHammarstedt.

    I also think that it would be important to test that the actual SQL statements work. I suppose that we’d need an actual database for that, but how would we set up a test database that wouldn’t interfere with a possible production database? At least it would require database creation privileges (see the sketch after this list).

  2. I think that, for example, the client fixture would be very useful when testing a plugin that adds a new endpoint, so I’d probably opt for having the repository clone accessible somewhere. I’ll probably come back to this issue once I actually get to testing plugins.

  3. It’s good to know that automating the testing is possible, in some way or another, even though it’s probably not a priority right now.

  4. I agree that we could start with this (if Språkbanken approves) and improve the test organization later when needed.

  5. By testing with different values for configuration variables, I mainly meant that some of the variables affect results, for example depending on whether caching is enabled or not. I think some tests should be independent of such configuration, but we should also be able to test that the configuration affects the results as intended.
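
As a rough sketch of what I have in mind for question 1 (the user name and password here are placeholders, and in practice the setup and teardown would go into a fixture):

import uuid

import MySQLdb

# Create a uniquely named throwaway database; this requires CREATE privileges
db_name = "korp_test_" + uuid.uuid4().hex[:8]
conn = MySQLdb.connect(user="korp_test", passwd="placeholder")
cursor = conn.cursor()
cursor.execute("CREATE DATABASE `%s`" % db_name)
try:
    ...  # import test data and run the tests against db_name
finally:
    cursor.execute("DROP DATABASE `%s`" % db_name)
    conn.close()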

@MartinHammarstedt MartinHammarstedt self-assigned this Aug 14, 2024
@MartinHammarstedt
Member

Sorry for taking so long to get to this. I really appreciate the work you've done, as adding tests has been on our wish list for far too long!

I ran into some trouble installing the CWB Perl tools (we don't usually use this part of CWB), so I tried replacing cwb-make with cwb-makeall (from the regular CWB tools) in the code, and everything seems to work fine. Is there a specific reason for using cwb-make, or would cwb-makeall be sufficient?

janiemi added 25 commits August 15, 2024 15:19
requirements.txt:
- Update mysqlclient to 1.3.14, so that running on Ubuntu 22.04 does not
  crash with "ImportError: libmysqlclient.so.20: cannot open shared
  object file: No such file or directory".
Add facility for encoding VRT files into CWB corpora: fixture "corpora"
that uses tests.corpusutils.CWBEncoder.
tests:
- Add an empty __init__.py to make tests a package, to make calling
  pytest directly work (and not only "python -m pytest")
Add tests/testutils.py:
- Function get_response_json can be called from tests for endpoints and
  helps in making them slightly more compact and less repetitive.
tests/conftest.py:
- Add fixtures cache_dir and corpus_config_dir for creating temporary
  cache and corpus configuration directories, and use them in fixture
  app for overriding the app configuration defaults.
tests/conftest.py:
- Define fixture corpus_configs: Copy corpus configurations from the
  configuration source directory (data/corpora/config) to a temporary
  test directory.
janiemi added 23 commits August 15, 2024 15:19
tests:
- Make fixtures "app" and "client" return a function that takes an
  optional dict argument for overriding default configuration values.
- Update tests to use the factory fixtures.
tests/dbutils.py:
- Support expansion of variables "{var}" in the "definitions" of table
  information items.

tests/data/db/tableinfo/relations.yaml:
- Define variable "rel_type" and use it as the type of field "rel" in
  the table definitions.
tests/dbutils.py:
- KorpDatabase: Add methods execute, execute_file for executing SQL
  statements in a string or file. Convert other code to use the methods.
tests/conftest.py:
- Add fixture database_tables for importing database data by table types
  and corpora.

tests/dbutils.py:
- Add import_tables (and auxiliary methods): Find data files (TSV or
  SQL) based on corpus name and table type. This requires the filenames
  in the table info files to contain a "{corpus}" placeholder for corpus
  id.

tests/data/db/tableinfo/*.yaml:
- Add placeholder "{corpus}" for corpus ids in filename patterns.
- Make the filename patterns slightly more explicit.
tests/functional/test_lemgram_count.py,
tests/functional/test_timespan.py:
- Add argument "corpus" to the function returned by a local fixture.
- Use fixture database_tables and load database data only for the
  corpora actually used.
- Use tests.testutils.make_liststr to convert a list of corpora to a
  string.
tests/dbutils.py:
- KorpDatabase.execute_get_cursor: If cursor.connection.commit() fails
  with MySQLdb.ProgrammingError, try cursor.execute("COMMIT;").
- KorpDatabase._get_db_names: Do not commit after "SHOW DATABASES" to
  guarantee retrieving database names via the cursor (which does not
  work after cursor.execute("COMMIT;")).
tests/dbutils.py:
- KorpDatabase.import_table_files: Commit after importing the SQL file,
  as not doing so sometimes caused tests using the imported data not to
  find data.
tests/functional/test_relations.py:
- Parametrize the test with word and corpora.
- Test that all the relations contain the word as either "head" or
  "dep".
tests/dbutils.py:
- Table info files: Use "{CORPUS}" (or "{corpus}") in "tablename" as a
  placeholder for the corpus name (id) in uppercase (or lowercase),
  instead of "{1:u}" or similar, simplifying code and removing the need
  for a custom formatter.
- Remove class KorpDatabase.CaseConversionFormatter.
@janiemi
Author

janiemi commented Aug 15, 2024

@MartinHammarstedt, great that you found time to take a look at this PR! In the meantime, I haven’t been working on Korp as much as I had intended, but I have made some progress, and I’d still like to make some improvements to the testing facility before you merge it.

I just now pushed commits from last autumn that implement testing MySQL/MariaDB database access and allow testing with different values for configuration variables. I think the changes largely address my open questions 1 and 5 above. For testing endpoints requiring database access, a new database is created (by a specified MySQL user), and database data is imported from SQL or TSV files; please see tests/README.md for more information. For testing with different configuration settings, I made the fixtures app and client return functions that take an optional argument for overriding configuration values.
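
In outline, the factory fixtures work roughly like this (simplified; the configuration variable name in the test is just an example):

import pytest

from korp import create_app

@pytest.fixture
def app():
    def _make_app(config_override=None):
        # create_app takes the config_override parameter added in this PR
        return create_app(config_override=config_override or {})
    return _make_app

def test_with_overridden_config(app):
    test_app = app({"SOME_CONFIG_VARIABLE": "value"})  # example override
    ...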

I also rebased the whole branch onto the current dev branch.

I think you’re right in that it would be better to avoid using cwb-make from the CWB Perl package. As the test corpora are bound to be tiny, they won’t need compression, so I suppose that simply replacing cwb-make with cwb-makeall should be fine. I’ll make the change.

In addition, I’d still like to make the following improvements or additions:

  1. Represent test corpus data in the “pretty” XML format produced by Sparv, as an alternative to VRT.
  2. When using VRT test corpus data, list attributes in a separate file and possibly infer structural attributes from the VRT data itself, as an alternative to listing them in special XML-style comments.
  3. Specify a default collation for the MySQL/MariaDB database.
  4. Remove existing data before importing to a multi-corpus database table.

I hope to have time to do these in September.

The way to add tests for plugins is still an open question, but I think the PR for the plugin system could perhaps address it. (Our extended plugin facility still needs some work before I dare create a PR for it; anyway, it should now be backward-compatible with yours.)

Finally, I think it would be great if you could merge this PR before #20 (and other, forthcoming PRs), so that we’d be able to add some tests to the other PRs. However, I’d like to implement the above improvements first.

@janiemi
Author

janiemi commented Aug 16, 2024

As an additional note, the first commit in the PR (upgrading mysqlclient version from 1.3.13 to 1.3.14) is not strictly related to testing, but I couldn’t get the MySQL connection working without the change on my Ubuntu 22.04 with Python 3.10.12. Without it, Korp crashed with the error message “ImportError: libmysqlclient.so.20: cannot open shared object file: No such file or directory”. However, I haven’t tested this change in other environments.

mysqlclient 1.3.14 is still a rather old version of the package. I recall that I tried some more recent ones, but they would have required upgrading some other requirements too, which might have broken compatibility with older Python versions.

If desired, the commit in question could be added separately from this PR, or you could replace it with something more appropriate that also fixes the problem mentioned above.

In general, supporting a relatively wide range of Python versions does not look very simple to me. Different Linux distributions and versions may complicate the matter further.
