
TEST: Run all tests through pytest #2090

Merged: 26 commits, Oct 14, 2024

Conversation

david-cortes-intel (Contributor)

Description

Currently, unit tests execute through a mixture of the built-in unittest framework and pytest.

From the commit history, it looks like the initial tests were added using Python's built-in unittest framework, and subsequent tests were then introduced using pytest.

Compared to unittest, pytest has a much nicer output format, offers many options for how to execute tests and view their outputs, and has plugins that allow it to do things the built-in framework cannot (e.g. JSON outputs, test coverage reports, etc.).

This PR switches the automated scripts to execute all of the tests using pytest. This doesn't change which tests are executed - it just uses pytest as the runner.

One challenge here is how each runner determines the path from which modules are imported. Here I'm trying to control this with the PYTHONPATH variable when pytest is called within shell scripts, but I don't know whether this will work out for all the variants under which the tests are executed.


Checklist to comply with before moving PR from draft:

PR completeness and readability

  • I have reviewed my changes thoroughly before submitting this pull request.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with the update and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have added the respective label(s) to the PR if I have permission to do so.
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended testing suite if new functionality was introduced in this PR.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with the measured data, if a performance change is expected.
  • I have provided justification for why performance has changed or why changes are not expected.
  • I have provided justification for why quality metrics have changed or why changes are not expected.
  • I have extended the benchmarking suite and provided a corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

@david-cortes-intel david-cortes-intel added the testing Tests for sklearnex/daal4py/onedal4py & patching sklearn label Oct 7, 2024
@david-cortes-intel david-cortes-intel marked this pull request as draft October 7, 2024 15:28
@david-cortes-intel (Contributor Author)

@Alexsandruss Would you perhaps have any pointers on why the CI job here:
https://github.com/intel/scikit-learn-intelex/actions/runs/11218859163/job/31183555329?pr=2090

... might be failing to import daal4py._daal4py in the new pytest call I added, while it succeeds in the other calls?

@icfaust (Contributor) commented Oct 7, 2024

I've had to fight against pytest's package discovery recently and had to use --rootdir to get things to work properly (https://github.com/intel/scikit-learn-intelex/blob/main/.ci/scripts/run_sklearn_tests.py#L47). Maybe that could help you? I believe your daal4py._daal4py error is coming from $PYTHONPATH making Python find daal4py in the repo's daal4py folder instead of the installed site-packages copy, and then failing because the compiled _daal4py.so file isn't there.

@david-cortes-intel (Contributor Author)

Thanks for the tip. But that was actually for a different reason: some test files import other test files - for example, test_daal4py_spmd_examples.py imports test_daal4py_examples right here, as an absolute import rather than a relative one, so the folder with the test files needs to be on sys.path for it to work:
https://github.com/intel/scikit-learn-intelex/blob/2f73b9de4a79205b67522f502ae140e6470871f3/tests/test_daal4py_spmd_examples.py#L22

But since these tests should now only be executed through pytest, I guess a better solution would be to use relative imports. I'm trying that approach now - let's see how the CI ends up working with it.
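
For reference, a minimal sketch of the relative-import alternative being described (illustrative only - the exact import statement in the linked file may differ):

# tests/test_daal4py_spmd_examples.py
# Absolute import, which requires the tests folder to be on sys.path:
#     import test_daal4py_examples
# Relative import instead, which requires the tests folder to be an importable package:
from . import test_daal4py_examples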

@david-cortes-intel (Contributor Author)

Looks like it was just a matter of deleting the __init__.py file in the tests folder. All CI jobs are executing the tests correctly now.

@david-cortes-intel david-cortes-intel marked this pull request as ready for review October 9, 2024 06:34
@david-cortes-intel david-cortes-intel changed the title from "[Draft] TEST: Run all tests through pytest" to "TEST: Run all tests through pytest" Oct 9, 2024
@icfaust (Contributor) left a comment:

Thank you for doing this. Here are my initial thoughts.

Comment on lines 26 to 38
    err_code = subprocess.call(
        [sys.executable, "-m", "sklearnex.glob", "patch_sklearn", "-a", "svc"]
    )
    assert not err_code

    def unpatch_from_cmd():
        err_code = subprocess.call(
            [sys.executable, "-m", "sklearnex.glob", "unpatch_sklearn"]
        )
        assert not err_code

    request.addfinalizer(unpatch_from_cmd)
    return
Contributor:

Suggested change:
-    err_code = subprocess.call(
-        [sys.executable, "-m", "sklearnex.glob", "patch_sklearn", "-a", "svc"]
-    )
-    assert not err_code
-    def unpatch_from_cmd():
-        err_code = subprocess.call(
-            [sys.executable, "-m", "sklearnex.glob", "unpatch_sklearn"]
-        )
-        assert not err_code
-    request.addfinalizer(unpatch_from_cmd)
-    return
+    err_code = subprocess.call(
+        [sys.executable, "-m", "sklearnex.glob", "patch_sklearn", "-a", "svc"]
+    )
+    assert not err_code
+    yield
+    err_code = subprocess.call(
+        [sys.executable, "-m", "sklearnex.glob", "unpatch_sklearn"]
+    )
+    assert not err_code

Thoughts on using yield instead?

Contributor Author:

What would be the advantage of using yield here?

Contributor:

It's not a must, but it's the pytest-recommended approach over adding a finalizer: https://docs.pytest.org/en/stable/how-to/fixtures.html#yield-fixtures-recommended

Contributor Author:

Changed to yield.

Contributor Author:

Looks like the Windows job doesn't like the usage of 'yield':

INTERNALERROR>     ^^^^^^^^^^^^^^^^^^^^^^^^^
INTERNALERROR>   File "C:\Miniconda\envs\CB\Lib\site-packages\_pytest\logging.py", line 790, in pytest_collection
INTERNALERROR>     return (yield)
INTERNALERROR>             ^^^^^

...

joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

Will try to see where that multi-processing invocation is coming from.

def patch_from_function(request):
    from sklearnex import patch_sklearn, unpatch_sklearn

    patch_sklearn(name=["svc"], global_patch=True)
Contributor:

Again, possibly yield?

.ci/scripts/test_global_patch.py (outdated, resolved)
)


def test_unpatching_from_command_line(patch_from_command_line):
Contributor:

Why is the fixture used here again? Is this trying to expand functionality or to match the previous behavior (because this is slightly different from what was there before)? What was there previously assumed that the global setting was returned to normal after the unpatch, so by reapplying "sklearnex.glob", "patch_sklearn", "-a", "svc" the state of this test is different from what occurred in the file before.

@david-cortes-intel (Contributor Author), Oct 10, 2024:

The fixture has a finalizer. So on each test, it gets initialized (global patching is applied), and after the test finishes, gets finalized (global unpatching is applied). Then the same repeats for the next test that uses the fixture, ensuring that state doesn't propagate across tests. This is of course assuming that patching and unpatching are working correctly, which is what the tests are looking at in the first place, but I thought it better than the idea of merging patching and unpatching in a single test.
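
For illustration, a minimal sketch of the fixture-with-finalizer pattern described above (hypothetical fixture and test names, not the exact code from this PR):

import subprocess
import sys

import pytest


@pytest.fixture
def patch_svc_globally(request):
    # Setup: apply global patching before the test body runs.
    err_code = subprocess.call(
        [sys.executable, "-m", "sklearnex.glob", "patch_sklearn", "-a", "svc"]
    )
    assert not err_code

    def unpatch():
        # Teardown: undo the global patching after the test finishes,
        # so the patched state does not leak into the next test.
        err_code = subprocess.call(
            [sys.executable, "-m", "sklearnex.glob", "unpatch_sklearn"]
        )
        assert not err_code

    request.addfinalizer(unpatch)


def test_svc_is_patched(patch_svc_globally):
    from sklearn.svm import SVC

    assert SVC.__module__.startswith(("daal4py", "sklearnex"))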

.ci/scripts/test_global_patch.py (outdated, resolved)
assert not err_code
from sklearnex import unpatch_sklearn

unpatch_sklearn()
Contributor:

https://github.com/intel/scikit-learn-intelex/blob/main/sklearnex/tests/test_monkeypatch.py#L65 Just be careful with these functions, since they impact global state and aren't local to the test (anything with a patch_sklearn call should be wrapped in a try/finally with an unpatch_sklearn).
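
A minimal sketch of the try/finally pattern being referred to (illustrative only; it assumes sklearnex is installed and mirrors the patch_sklearn/unpatch_sklearn calls already used in these tests):

from sklearnex import patch_sklearn, unpatch_sklearn

patch_sklearn(name=["svc"])
try:
    # Anything imported from sklearn after this point sees the patched estimator.
    from sklearn.svm import SVC

    assert SVC.__module__.startswith(("daal4py", "sklearnex"))
finally:
    # Always restore the original sklearn classes, even if the assertion fails,
    # so the patched state does not leak into other tests.
    unpatch_sklearn()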

Contributor Author:

Yes, that's the idea behind the fixtures with finalizers.

)


def test_unpatching_from_function(patch_from_function):
Contributor:

Now that I look at what the code had previously been, I am glad you have done this. Much better isolation of tests.

- mpirun -n 4 python -m unittest discover -v -s tests -p test*spmd*.py # [not win]
- mpiexec -localonly -n 4 python -m unittest discover -v -s tests -p test*spmd*.py # [win]
- python -m unittest discover -v -s tests -p test*.py
- mpirun -n 4 pytest --pyargs --verbose -s tests/test*spmd*.py # [not win]
Contributor:

Is --pyargs necessary, since you are using a path and not a package name? (Same goes for the others in run_test.bat / run_test.sh.)

Contributor Author:

I don't know. But all the other pytest calls have it.

Contributor:

I'd say cut them out and see what happens. If they fail, you can re-add them (and ping me).

Contributor Author:

Looks like they weren't necessary.

tests/test_examples_sklearnex.py (outdated, resolved)
Comment on lines -37 to -39
err_code = subprocess.call([sys.executable, "-m", "sklearnex.glob", "unpatch_sklearn"])
assert not err_code
unpatch_sklearn()
@icfaust (Contributor), Oct 10, 2024:

The state of the unpatch_sklearn() test was previously set up after a subprocess.call([sys.executable, "-m", "sklearnex.glob", "unpatch_sklearn"]) call, not after a subprocess.call([sys.executable, "-m", "sklearnex.glob", "patch_sklearn", "-a", "svc"]) call, meaning the state of the test is different in the new implementation. This is okay if this is what you desired.

#2090 (comment)

@david-cortes-intel (Contributor Author), Oct 10, 2024:

Thanks, I had missed that detail entirely.

So I tried adding a new test for globally patching all estimators, separate from the SVC-only one, but the test actually ends up failing:

@pytest.fixture
def patch_all_from_command_line():
    err_code = subprocess.call(
        [sys.executable, "-m", "sklearnex.glob", "patch_sklearn"]
    )
    assert err_code == os.EX_OK

    yield

    err_code = subprocess.call(
        [sys.executable, "-m", "sklearnex.glob", "unpatch_sklearn"]
    )
    assert err_code == os.EX_OK

def test_patching_all_from_command_line(patch_all_from_command_line):
    from sklearn.svm import SVC, SVR

    assert SVC.__module__.startswith("daal4py")
    assert SVC.__module__.startswith("sklearnex")
    assert SVR.__module__.startswith("daal4py")
    assert SVR.__module__.startswith("sklearnex")

Should that be expected to succeed? The same happens if checking them with "or", and it happens for both SVC and SVR.

Contributor:

I assume that it's failing on assert SVC.__module__.startswith("daal4py"). You can remove assert SVC.__module__.startswith("daal4py") and assert SVR.__module__.startswith("daal4py").

Contributor Author:

It also fails for sklearnex.

Contributor Author:

I added this new test as an extra commit here; let's see if it also fails in the CI.

Contributor:

Could be a legit failure then, let's see.

@david-cortes-intel (Contributor Author), Oct 10, 2024:

Looks like it also fails in the CI jobs:

    def test_patching_all_from_command_line(patch_all_from_command_line):
        from sklearn.svm import SVC, SVR
    
>       assert SVC.__module__.startswith("daal4py") or SVC.__module__.startswith("sklearnex")
E       AssertionError: assert (False or False)
E        +  where False = <built-in method startswith of str object at 0x7f8679a04300>('daal4py')
E        +    where <built-in method startswith of str object at 0x7f8679a04300> = 'sklearn.svm._classes'.startswith
E        +      where 'sklearn.svm._classes' = <class 'sklearn.svm._classes.SVC'>.__module__
E        +  and   False = <built-in method startswith of str object at 0x7f8679a04300>('sklearnex')
E        +    where <built-in method startswith of str object at 0x7f8679a04300> = 'sklearn.svm._classes'.startswith
E        +      where 'sklearn.svm._classes' = <class 'sklearn.svm._classes.SVC'>.__module__

scripts/test_global_patch.py:83: AssertionError

Contributor:

Yeah, this is bad; I'll admit I don't know much about the glob usage. I wonder if sklearn has been pre-imported via sklearnex imports, with the change in glob then not affecting it. There is definitely something here.

Contributor Author:

Yes, looks like that does indeed make a difference. If it's left as the only test, it actually succeeds. Will try to find a workaround.

Contributor Author:

I didn't manage to find a solution, so in the end I just removed the test and left a comment in the file.

@david-cortes-intel (Contributor Author)

/intelci: run

@david-cortes-intel (Contributor Author)

/intelci: run

@icfaust (Contributor) left a comment:

Assuming the small English correction, this is ready to merge.

.ci/scripts/test_global_patch.py (outdated, resolved)
Co-authored-by: Ian Faust <icfaust@gmail.com>
@david-cortes-intel david-cortes-intel merged commit 614e088 into uxlfoundation:main Oct 14, 2024
7 of 8 checks passed
Labels: testing (Tests for sklearnex/daal4py/onedal4py & patching sklearn)
2 participants