
Add Test strategy #688

Closed · wants to merge 23 commits into from

Conversation

@twsl (Contributor) commented Dec 19, 2021

What does this PR do?

Fixes #675

This is just a proposal, as I don't have multiple GPUs for testing.
This increases the number of tests a lot!

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@codecov (bot) commented Dec 19, 2021

Codecov Report

Merging #688 (df21357) into master (c89a740) will decrease coverage by 22%.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #688     +/-   ##
=======================================
- Coverage      95%    73%    -22%     
=======================================
  Files         166    166             
  Lines        6413   6413             
=======================================
- Hits         6105   4684   -1421     
- Misses        308   1729   +1421     

@Borda Borda added enhancement New feature or request test / CI testing or CI labels Dec 21, 2021
@Borda (Member) commented Dec 21, 2021

Well, it seems that some tests at the beginning are hanging. Have you tried running it locally?

@@ -55,22 +55,29 @@ def sk_auc(x, y, reorder=False):

 @pytest.mark.parametrize("x, y", _examples)
 class TestAUC(MetricTester):
-    @pytest.mark.parametrize("ddp", [False])
+    @pytest.mark.parametrize(MetricTesterDDPCases.name_strategy(), MetricTesterDDPCases.cases_strategy())
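(For context, a guess at what the MetricTesterDDPCases helper used in this parametrize might look like; the PR's actual implementation is not shown in this excerpt, so the case values and parameter names below are assumptions.)

# Hypothetical sketch only -- not the PR's actual code, inferred from how the
# helper is called: name_strategy() supplies the parameter names for
# @pytest.mark.parametrize and cases_strategy() the matching case values.
class MetricTesterDDPCases:
    @staticmethod
    def name_strategy():
        return "ddp, device"

    @staticmethod
    def cases_strategy():
        # one entry per (ddp, device) combination a test should cover
        return [(False, "cpu"), (False, "cuda"), (True, "cuda")]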
twsl (Contributor Author):

DDP apparently fails and was therefore probably excluded from the tests, yet there is no mention in the docs that DDP is not supported.

twsl (Contributor Author):

@Borda @SkafteNicki any idea how to handle this? Removing the full ddp/device strategy and opening an issue to fix AUC would be my recommendation.

Member:

It's hard for me to remember, but basically AUC needs ordered input to work, and while we are testing on ordered input, when we move to DDP it somehow gets unordered. I wonder if we should just set the reorder flag to True whenever ddp=True.
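As a rough illustration of that reorder suggestion (not from the PR; it uses torchmetrics.functional.auc as it existed around this release):

import torch
from torchmetrics.functional import auc

# DDP gathering can interleave per-process chunks, so x may arrive unsorted;
# reorder=True sorts by x before integrating, which is what forcing the flag
# whenever ddp=True would amount to.
x = torch.tensor([0.0, 2.0, 1.0, 3.0])  # unsorted, as if gathered from two processes
y = torch.tensor([0.0, 2.0, 1.0, 3.0])
print(auc(x, y, reorder=True))  # sorts by x first -> tensor(4.5000)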

@twsl twsl marked this pull request as ready for review December 27, 2021 21:28
@mergify mergify bot added the has conflicts label Jan 4, 2022
@Borda (Member) left a comment:

Hi, thx for this PR, I'm just not very sure if it is really needed... What you are trying to set up here is running CPU and GPU in a single test call, but even now we are trying to optimize the tests to run as efficiently as possible, so on a CPU machine run just the CPU version and on a GPU machine run the CUDA version (and skip the CPU version, as it was already run elsewhere).
That said, I think this also matches user expectation: test on the best available resources, so on a GPU machine run the CUDA tests and on a CPU machine the rest... So if you have a GPU but want to run the CPU tests, you can simply disable GPU visibility for that particular run, for example: CUDA_VISIBLE_DEVICES=-1 pytest ...

@@ -75,25 +75,28 @@ def average_metric(preds, target, metric_func):
 class TestPESQ(MetricTester):
     atol = 1e-2

-    @pytest.mark.parametrize("ddp", [True, False])
+    @pytest.mark.parametrize(MetricTesterDDPCases.name_strategy(), MetricTesterDDPCases.cases_strategy())
Member:

I would say it is cleaner if it were a constant or a function returning a tuple:

def ddp_name_options():
    return "ddp", (True, False)

and later:

@pytest.mark.parametrize(*ddp_name_options())

Member:

But anyway, why is this needed?

twsl (Contributor Author):

The reason I added this strategy was that now all available test cases are generated and you can explicitly run them locally. This way you can guarantee full coverage, you don't depend on the CI machines or some launch flags, and each test states on which device it runs. You also don't have to check the machine stats to find out, e.g., whether a test failed on a single GPU or on multiple GPUs.
In order to optimize the tests, an additional skip condition with an env var could achieve the same. Especially for test optimization, you could then run ddp=False on a machine with a single GPU and ddp=True on one with multiple GPUs.
I'm fine with putting it into a single function; my idea was to stick with the two-parameter style of parametrize.
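(A minimal sketch of the env-var skip condition mentioned above; hypothetical, the variable name TEST_DDP and the test body are illustrative, not from the PR.)

import os

import pytest

# A single-GPU runner exports TEST_DDP=0 so only the ddp=False cases execute there,
# while a multi-GPU runner keeps the default and also runs the ddp=True cases.
RUN_DDP = os.getenv("TEST_DDP", "1") == "1"


@pytest.mark.parametrize(
    "ddp",
    [
        False,
        pytest.param(True, marks=pytest.mark.skipif(not RUN_DDP, reason="DDP cases disabled via TEST_DDP")),
    ],
)
def test_metric_case(ddp):
    # placeholder body; the real tests would go through MetricTester.run_class_metric_test
    assert isinstance(ddp, bool)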

Member:

I agree with @Borda that it would be better to return a tuple there.

On the general approach: I think we should split this between arguments that apply to every test (like devices and maybe also DDP strategies), for which we should use pytest fixtures (that's what they are there for), so that you could call pytest --devices=XX tests. For test-specific arguments this approach is fine IMO.
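(A minimal sketch of that fixture/CLI-option idea; hypothetical, only the --devices option name comes from the comment above, the rest is illustrative.)

# conftest.py
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--devices",
        action="store",
        default="cpu",
        help="comma-separated list of devices to run the metric tests on, e.g. 'cpu,cuda'",
    )


def pytest_generate_tests(metafunc):
    # Any test that takes a `device` argument gets parametrized with the
    # devices passed on the command line.
    if "device" in metafunc.fixturenames:
        metafunc.parametrize("device", metafunc.config.getoption("--devices").split(","))

It would then be invoked as, e.g., pytest --devices=cpu,cuda tests/.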

twsl (Contributor Author):

I'm unsure how to continue with this. Any suggestions?

@Borda Borda assigned justusschock and unassigned SkafteNicki Jan 6, 2022
@Borda Borda self-requested a review January 6, 2022 15:06
@Borda Borda force-pushed the master branch 3 times, most recently from 3a0f7dc to 317182c Compare January 10, 2022 13:05
@Borda Borda added this to the v0.8 milestone Jan 12, 2022
@maximsch2 (Contributor):

@SkafteNicki , I think it makes sense to merge this first before merging #867 to make sure things continue working correctly on GPUs.

@Borda Borda modified the milestones: v0.8, v0.9 Mar 22, 2022
@Borda Borda removed this from the v0.9 milestone Apr 20, 2022
@Borda (Member) commented May 5, 2022

Thank you for your suggestion, but atm we are in the process of additional test optimization 😇

@Borda Borda closed this May 5, 2022
Labels: enhancement (New feature or request), has conflicts, test / CI (testing or CI)
Projects: None yet
Development: Successfully merging this pull request may close these issues: Test all metrics on GPU
5 participants