
Issue 10 pytest #24


Merged · 2 commits merged into main from issue-10-pytest on Mar 11, 2024

Conversation

@kan-fu (Collaborator) commented on Feb 29, 2024

Fix #10
Completely replaced Robot Framework with pytest. Most of the test cases use the params from the OpenAPI page examples.
I deleted some tests that just used different params for the same web service (for example, multiple getLocations tests with locationCode, locationName, deviceCategoryCode,...), because the client library simply passes the params through to the request, so those tests won't increase the test coverage. That kind of coverage belongs in the internal test framework for the backend.
I also dropped support for Python 3.8, which Spencer agreed to after we fixed the Python version issue on the task machine (previous discussion). That change should have been its own issue, but I used the dictionary union operator in the tests, which is a new feature in Python 3.9.

@kan-fu force-pushed the issue-10-pytest branch from 4a9e79d to c0d57f1 on March 6, 2024 19:43
@Jacob-Stevens-Haas (Collaborator) left a comment

Way to go moving all this off Robot Framework! Now that it's easier to read the tests, I have two thoughts. Neither should be seen as blocking - the PR is an improvement on the current state of the code, so you should feel comfortable merging it as-is rather than dragging out the review/edit process. Nevertheless, you might still want to consider them before checking off whatever internal ticketing system you have:

  1. This test code seems very nested, with 50 files of 10-50 lines. That means a lot of repeated code across files. When code is flattened and similar code sits next to each other, the repetition becomes more obvious and the important differences stand out: e.g. scalar_data.test_scalardata_device.test_no_data vs scalar_data.test_scalardata_location.test_no_data behave very differently, even though they're called in the same way: one returns an HTTP error, the other an object with an empty container. That they can be called the same way but behave differently suggests that the ONC API might be confusing for users and could be fixed - but the comparison only becomes clear once the repetition is removed and the code is flattened.

More simply, there's no need for folders with only one file, e.g. discover_properties/test_properties.py could just be test_properties.py. The other folders with multiple files (archive_file, data_product_delivery) probably could also be combined to both shorten the overall SLOC and make it clearer.

  2. Runtime. Have you checked the durations of individual tests (pytest --durations=100)? Consider:
Test durations
============================================================================================== slowest 100 durations ==============================================================================================
67.78s call     tests/data_product_delivery/test_data_product_delivery_order.py::test_valid_default
38.08s call     tests/data_product_delivery/test_data_product_delivery_order.py::test_valid_manual
36.96s call     tests/data_product_delivery/test_data_product_delivery_order.py::test_valid_no_metadata
31.54s call     tests/data_product_delivery/test_data_product_delivery_order.py::test_valid_results_only
25.62s call     tests/archive_file/test_archivefile_direct_download.py::test_valid_params_one_page
19.98s call     tests/archive_file/test_archivefile_download.py::test_valid_params
17.50s call     tests/archive_file/test_archivefile_direct_download.py::test_valid_params_multiple_pages
9.93s call     tests/raw_data/test_rawdata_device.py::test_valid_params_one_page
9.82s call     tests/scalar_data/test_scalardata_device.py::test_valid_params_one_page
7.68s call     tests/scalar_data/test_scalardata_location.py::test_valid_params_one_page
4.47s call     tests/raw_data/test_rawdata_location.py::test_valid_params_one_page
3.22s call     tests/archive_file/test_archivefile_location.py::test_valid_params_one_page
3.18s call     tests/archive_file/test_archivefile_device.py::test_valid_params_one_page
2.25s call     tests/scalar_data/test_scalardata_device.py::test_valid_params_multiple_pages
2.24s call     tests/raw_data/test_rawdata_device.py::test_valid_params_multi_pages
2.06s call     tests/archive_file/test_archivefile_device.py::test_invalid_param_value
1.95s call     tests/discover_data_products/test_data_products.py::test_valid_params
1.94s call     tests/discover_data_products/test_data_products.py::test_no_data
1.83s call     tests/scalar_data/test_scalardata_location.py::test_valid_params_multiple_pages
1.73s call     tests/discover_properties/test_properties.py::test_valid_params
1.54s call     tests/discover_locations/test_locations.py::test_valid_params
1.54s call     tests/scalar_data/test_scalardata_device.py::test_no_data
1.53s call     tests/discover_locations/test_locations.py::test_no_data
1.44s call     tests/raw_data/test_rawdata_device.py::test_no_data
1.37s call     tests/discover_properties/test_properties.py::test_no_data
1.32s call     tests/discover_locations/test_locations_tree.py::test_no_data
1.18s call     tests/archive_file/test_archivefile_location.py::test_valid_params_multiple_pages
1.06s call     tests/discover_deployments/test_deployments.py::test_valid_params
1.04s call     tests/raw_data/test_rawdata_location.py::test_valid_params_multiple_pages
1.03s call     tests/discover_device_categories/test_device_categories.py::test_no_data
1.02s call     tests/discover_locations/test_locations_tree.py::test_valid_params
1.01s call     tests/raw_data/test_rawdata_location.py::test_invalid_param_value
1.01s call     tests/discover_device_categories/test_device_categories.py::test_valid_params
0.95s call     tests/scalar_data/test_scalardata_location.py::test_no_data
0.93s call     tests/discover_device_categories/test_device_categories.py::test_invalid_param_name
0.93s call     tests/discover_locations/test_locations_tree.py::test_invalid_time_range_future_start_time
0.92s call     tests/data_product_delivery/test_data_product_delivery_run.py::test_invalid_request_id
0.92s call     tests/scalar_data/test_scalardata_location.py::test_invalid_param_value
0.92s call     tests/data_product_delivery/test_data_product_delivery_order.py::test_invalid_param_value
0.92s call     tests/raw_data/test_rawdata_device.py::test_invalid_param_value
0.92s call     tests/archive_file/test_archivefile_location.py::test_invalid_param_name
0.92s call     tests/archive_file/test_archivefile_download.py::test_invalid_params_missing_required
0.92s call     tests/data_product_delivery/test_data_product_delivery_download.py::test_invalid_run_id
0.92s call     tests/discover_deployments/test_deployments.py::test_invalid_param_value
0.92s call     tests/archive_file/test_archivefile_location.py::test_no_data
0.92s call     tests/discover_locations/test_locations.py::test_invalid_param_name
0.92s call     tests/archive_file/test_archivefile_device.py::test_invalid_params_missing_required
0.92s call     tests/archive_file/test_archivefile_device.py::test_no_data
0.92s call     tests/raw_data/test_rawdata_device.py::test_invalid_param_name
0.92s call     tests/discover_locations/test_locations_tree.py::test_invalid_param_value
0.91s call     tests/raw_data/test_rawdata_location.py::test_no_data
0.91s call     tests/discover_locations/test_locations.py::test_invalid_time_range_greater_start_time
0.91s call     tests/discover_deployments/test_deployments.py::test_invalid_time_range_future_start_time
0.91s call     tests/discover_deployments/test_deployments.py::test_invalid_time_range_greater_start_time
0.91s call     tests/discover_devices/test_devices.py::test_invalid_time_range_future_start_time
0.91s call     tests/discover_devices/test_devices.py::test_invalid_param_value
0.91s call     tests/scalar_data/test_scalardata_device.py::test_invalid_param_name
0.91s call     tests/raw_data/test_rawdata_location.py::test_invalid_param_name
0.91s call     tests/discover_locations/test_locations_tree.py::test_invalid_param
0.91s call     tests/discover_deployments/test_deployments.py::test_no_data
0.91s call     tests/archive_file/test_archivefile_device.py::test_valid_params_multiple_pages
0.91s call     tests/discover_devices/test_devices.py::test_invalid_param_name
0.91s call     tests/discover_locations/test_locations.py::test_invalid_time_range_future_start_time
0.91s call     tests/scalar_data/test_scalardata_device.py::test_invalid_param_value
0.91s call     tests/archive_file/test_archivefile_location.py::test_invalid_params_missing_required
0.91s call     tests/discover_data_products/test_data_products.py::test_invalid_param_value
0.91s call     tests/discover_devices/test_devices.py::test_invalid_time_range_greater_start_time
0.91s call     tests/discover_locations/test_locations_tree.py::test_invalid_time_range_greater_start_time
0.91s call     tests/discover_deployments/test_deployments.py::test_invalid_param_name
0.91s call     tests/archive_file/test_archivefile_device.py::test_invalid_param_name
0.90s call     tests/archive_file/test_archivefile_download.py::test_invalid_param_value
0.88s call     tests/discover_locations/test_locations.py::test_invalid_param_value
0.88s call     tests/discover_data_products/test_data_products.py::test_invalid_param_name
0.88s call     tests/discover_properties/test_properties.py::test_invalid_param_name
0.88s call     tests/discover_properties/test_properties.py::test_invalid_param_value
0.87s call     tests/discover_device_categories/test_device_categories.py::test_invalid_param_value
0.84s call     tests/scalar_data/test_scalardata_location.py::test_invalid_param_name
0.82s call     tests/discover_devices/test_devices.py::test_no_data
0.81s call     tests/discover_devices/test_devices.py::test_valid_params
0.80s call     tests/archive_file/test_archivefile_location.py::test_invalid_param_value
0.04s setup    tests/discover_data_products/test_data_products.py::test_invalid_param_name
0.04s setup    tests/discover_locations/test_locations.py::test_invalid_param_value
0.04s setup    tests/scalar_data/test_scalardata_location.py::test_invalid_param_name
0.04s setup    tests/discover_properties/test_properties.py::test_invalid_param_value
0.01s setup    tests/raw_data/test_rawdata_location.py::test_valid_params_one_page

The longest test runs for >60 sec, around 20% of the runtime. The top ten tests account for over 80%. This obviously depends upon the behavior of the server, and maybe QA is faster. If so, it might make sense to mark these tests as slow and only call them in CI. Or if QA is much faster, to allow CI (and maybe me?) to use the QA server.

Regardless, ordering data products seems to be responsible for most of the runtime. Maybe this indicates that, as a higher-level function, it doesn't deserve four different tests for parameter combinations, and that the parameters should instead be varied on the lower-level (requesting, running, and downloading) functionality? Or maybe the ordered data products are too large/complex for the server?
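
As an illustration of the "mark as slow" idea, here is a minimal sketch using a custom pytest marker; the marker name, the pyproject.toml registration, and the test shown are assumptions for illustration, not part of this PR:

import pytest

# Assumed registration in pyproject.toml, so pytest does not warn about an unknown marker:
#   [tool.pytest.ini_options]
#   markers = ["slow: long-running tests against the data product delivery service"]

@pytest.mark.slow
def test_valid_default():
    ...  # the long-running order/run/download round trip would go here

# Local run, skipping the slow tests:  pytest -m "not slow"
# CI run, including everything:        pytest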

Comment on lines +12 to +14
def pytest_configure():
print("========== ONC config ==========")
print(f"Testing environment: {'PROD' if is_prod else 'QA'}")
@Jacob-Stevens-Haas (Collaborator)

Nothing calls this function - is it needed?

@kan-fu (Collaborator, Author) replied on Mar 11, 2024

This is a hook function. It will print something like this (from the GitHub Actions console output) whenever pytest is invoked.

py312: commands[0]> pytest --cov=onc
========== ONC config ==========
Testing environment: PROD
============================= test session starts ==============================
platform linux -- Python 3.12.2, pytest-8.0.2, pluggy-1.4.0
cachedir: .tox/py312/.pytest_cache
rootdir: /home/runner/work/api-python-client/api-python-client
configfile: pyproject.toml
testpaths: tests
plugins: cov-4.1.0
collected 80 items
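
For reference, a commented sketch of why nothing needs to call it: pytest discovers pytest_configure in conftest.py and invokes it itself. The is_prod definition below is a stand-in for illustration; the real conftest.py determines it from its own environment handling.

# conftest.py (sketch)
import os

# Assumption for illustration only; the actual conftest.py derives this differently.
is_prod = os.environ.get("ONC_ENV", "PROD") != "QA"

def pytest_configure():
    # pytest calls this hook once at startup, after command line options are
    # parsed and plugins are registered, so there is no explicit call site.
    print("========== ONC config ==========")
    print(f"Testing environment: {'PROD' if is_prod else 'QA'}")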

@kan-fu (Collaborator, Author) commented on Mar 11, 2024

Thanks for your detailed comment!

  1. My initial motivation for this nested test directory structure was that everything mirrors the structure of the OpenAPI page, so that I (and also the Matlab library maintainer) don't need to make decisions. Using several packages in Java also feels natural to me. But I agree that "flat is better than nested", per the Zen of Python. I will open a new ticket (I already have some commits based on this branch) for improving the structure of the tests directory:
    • All the discover tests can be extracted out to get rid of their parent folder.
    • The scalardata, rawdata and archivefile folders contain tests that are quite similar (by location and by device). Combining them will definitely help reduce code duplication and make feature comparison easier (like the no_data test). I also plan to add a helper method in v3.0.0, like get_scalardata, that delegates the params to get_scalardata_by_device or get_scalardata_by_location depending on whether the params include a "deviceCode" key (see the sketch after this list). Testing them in one file (possibly with parameterized tests) makes sense.
    • For data product delivery and the other tests in archive file (like download and direct download), I prefer keeping the folders. Combining them is not that valuable in terms of feature comparison since the tested methods are quite different. Also, after my PR Add public method to support new dataProductDelivery services. #26, data product delivery would have tests for another three API endpoints. Merging them together might make the file hard to read.
  2. Unfortunately QA does not make it faster; the data product delivery system is slow by nature. The slowest test took me 63 seconds to finish in QA. The two main reasons I use QA are to test new features there and to avoid burdening the PROD server and database. I think it is a good idea to mark all of the data product delivery valid tests as slow and only enable them in GitHub Actions.
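
Here is the sketch referenced above: a rough illustration of the delegation helper planned for v3.0.0. The method names are the proposed ones from this discussion, not existing library API, and the class body is a stub:

class ONC:  # sketch of the proposed v3.0.0 surface, not the current onc.ONC class
    def get_scalardata_by_device(self, params: dict) -> dict:
        ...  # would call the scalardata/device endpoint

    def get_scalardata_by_location(self, params: dict) -> dict:
        ...  # would call the scalardata/location endpoint

    def get_scalardata(self, params: dict) -> dict:
        # Delegate based on whether the params identify a device or a location.
        if "deviceCode" in params:
            return self.get_scalardata_by_device(params)
        return self.get_scalardata_by_location(params)

Combined with pytest.mark.parametrize over the two call paths, the by-device and by-location cases could then live side by side in a single test file.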

For the lower-level (requesting, running, and downloading) and higher-level (order) functions of data product delivery, I have to say the code is not that intuitive to me. I initially thought orderDataProduct was just the combination of the three lower-level methods, like what I did in test_valid_manual, but the code does something more complex. Maybe this method needs some refactoring first.

@kan-fu merged commit 07652f8 into main on Mar 11, 2024
@kan-fu deleted the issue-10-pytest branch on March 11, 2024 23:03
Successfully merging this pull request may close the linked issue: Migrate test framework to pytest.

3 participants