Skip to content
This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

tighten test_xlm bound #4261

Merged
merged 1 commit into from
Jan 6, 2022
Merged

tighten test_xlm bound #4261

merged 1 commit into from
Jan 6, 2022

Conversation

austereantelope
Copy link
Contributor

Patch description
Hi,

The test test_xlm in test_transformers.py has an assertion bound (self.assertLessEqual(test['ppl'], 1.3)) that is too loose. This means potential bug in the code could still pass the original test.

To quantify this I conducted some experiments where I generated multiple mutations of the source code under test and ran each mutant and the original code 100 times to build a distribution of their outputs. I used KS-test to find mutants that produced a different distribution from the original and use those mutants as a proxy for bugs that could be introduced. In the graph below I show the distribution of both the original code and also the mutants with a different distribution.

Here we see that the bound of 1.3 is too loose since the original distribution (in orange) is much less than 1.3. Furthermore in this graph we can observe that there are many mutants (proxy for bugs) that are below the bound as well and that is undesirable since the test should aim to catch potential bugs in code. I quantify the "bug detection" of this assertion by varying the bound in a trade-off graph below.

In this graph, I plot the mutant catch rate (ratio of mutant outputs that fail the test) and the original pass rate (ratio of original output that pass the test). The original bound of 1.3 (red dotted line) has a catch rate of 0.

To improve this test, I propose to tighten the bound to 1.02 (the blue dotted line). The new bound has a catch rate of 0.17 (+0.17 increase compare to original) while still has >99 % pass rate (test is not flaky, I ran the updated test 500 times and observed >99 % pass rate). I think this is a good balance between improving the bug-detection ability of the test while keep the flakiness of the test low.

Furthermore, I observed that the value of the assertions in this test never goes below 1. Would it be helpful to include an additional assertion for example assertGreaterEqual(test['ppl'], 1) as well? Some mutants can change the value to below 1 and current assertion setup would miss those mutants/bugs. By adding the additional assertion of greaterthan 1 we can improve the catch rate further to 0.23

Do you guys think this makes sense? Please let me know if this looks good or if you have any other suggestions or questions.

My Environment:

python=3.7.11
pytorch=1.10.0

my parlai Experiment SHA:
4b1d07d0eeb14f849ad930eeb001327f9bfc2db1

@facebook-github-bot
Copy link

Hi @austereantelope!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@stephenroller
Copy link
Contributor

Super cool! Awesome analysis. Very happy to try this for a bit and see how it goes.

Please just sign the CLA and we'll land.

@facebook-github-bot
Copy link

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@austereantelope
Copy link
Contributor Author

@stephenroller I've signed the CLA, is it good to merge now?

Copy link
Contributor

@stephenroller stephenroller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. Sorry for slow response, busy on my end.

@stephenroller stephenroller merged commit 0470430 into facebookresearch:main Jan 6, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants