
Convert tests from dict of dicts to list of tuples to avoid key overwrites #167

Merged
merged 10 commits into lmfit:master from bugfix/avoid_test_case_overwrites
Aug 9, 2024

Conversation

jagerber48
Contributor

@jagerber48 jagerber48 commented Nov 20, 2022

This PR

  • Moves formatting tests into a test_formatting.py module
  • Refactors test_format to use pytest.mark.parametrize. This solves #162 (Bug: repeated dict keys in formatting testing results in some keys going untested) since cases are no longer stored as dict keys (which were accidentally repeated); each case is now a 4-tuple. Previously, test cases with the same value/uncertainty could be grouped together within a dict under that value/uncertainty tuple. That grouping is lost since I don't know how to do the equivalent in pytest. I've done it with unittest.subTest in the past, and I think there might be a pytest extension that does something similar.
  • Removes checks and special-casing for Jython. Let me know if these need to be added back in.
  • Corrections to some tests that were previously not running but are now failing.

There is a lot of cruft in comments, and possibly in tests, about subtleties between old versions of Python. Hopefully this can be removed soon.


OLD DESCRIPTION:

See #162.
The tests object containing the test cases for ufloat string formatting is currently a dict whose keys are tuples of value/uncertainty pairs and whose values are dicts of format string/expected output pairs. Similar tests are chunked together under the same value/uncertainty pair. However, the same value/uncertainty pair appears multiple times in the top-level dict, so only one group of tests is kept for that pair and all the others are overwritten.

To avoid this overwriting we can convert the dict of dicts into a list of tuples. Each element of the list is then a tuple whose first element is a value/uncertainty pair tuple and whose second element is a dict of format string/expected output pairs, as sketched below.
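Schematically, with a couple of entries borrowed from the existing test data (illustrative only, not the full set):

    # Old structure: dict of dicts. Duplicate (value, uncertainty) keys are legal
    # Python, but the later entry silently replaces the earlier one.
    tests_old = {
        (3.1415, 0.0001): {'*^+7.2f': '*+3.14*+/-*0.00**'},
        (3.1415, 0.0001): {'+07.2f': '+003.14+/-0000.00'},  # overwrites the group above
    }

    # New structure: list of tuples. Repeated (value, uncertainty) pairs are all kept.
    tests_new = [
        ((3.1415, 0.0001), {'*^+7.2f': '*+3.14*+/-*0.00**'}),
        ((3.1415, 0.0001), {'+07.2f': '+003.14+/-0000.00'}),
    ]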

@jagerber48
Contributor Author

Note: on master commit 804adc, running nosetests test_uncertainties.py reports OK, but with this change 1 failure is detected:

AssertionError: Incorrect representation '   3.14150+/-   0.00010' for format '>10f' of 3.1415+/-0.0001: '  3.141500+/-  0.000100' expected.

This corresponds to the >10f format in these tests at line 1568:

    tests = [  # (Nominal value, uncertainty): {format: result,...}

        # Usual float formatting, and individual widths, etc.:
        ((3.1415, 0.0001), {
            '*^+7.2f': '*+3.14*+/-*0.00**',
            '+07.2f': '+003.14+/-0000.00',  # 0 fill
            '>10f': '  3.141500+/-  0.000100',  # Width and align
            '11.3e': '  3.142e+00+/-  0.000e+00',  # Duplicated exponent
            '0.4e': '3.1415e+00+/-0.0000e+00'  # Forced double exponent
        }),
        ...

My expected output for that line is actually 3.1415+/- 0.0001, which is different from both the expected and actual output above. I guess the issue is the precision rather than the width. Need to work out what's going on here...

@jagerber48
Contributor Author

jagerber48 commented Nov 20, 2022

I see: the f format defaults to a precision of 6 digits after the decimal point, but for some reason ufloat formatting gives 5 digits after the decimal.
That is:

from uncertainties import ufloat
u = ufloat(3.1415, 0.0001)
print(f'{u:f}')
print(f'{3.1415:f}')
print(f'{0.0001:f}')

gives me

3.14150+/-0.00010
3.141500
0.000100

So something is funny with the f format and the default precision.
This is on Python 3.9.13.

Not sure how this should be addressed within this PR, since the PR is about changing unit test formatting. I could comment out the failing unit test so that the PR commit builds. Otherwise I could change the expected unit test result so that it passes, but that feels like cheating. The other alternative is to wait for a fix to whatever is causing the unit test to fail before letting this PR go through.

@jagerber48
Contributor Author

jagerber48 commented Nov 20, 2022

Oh, sorry for the spam. I see what's going on. If no precision is supplied in the format string, then PDG rules for sig figs on the uncertainty set the precision, rather than the Python defaults. So I think the output from the program is correct, and the expected unit test result is wrong.
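For reference, a rough sketch of the PDG rule as I understand it (illustrative only, not the actual uncertainties implementation; the helper name is made up):

    from math import floor, log10

    def pdg_sig_figs(uncertainty):
        """Number of significant digits to keep on the uncertainty (rough PDG rule)."""
        # Three most significant digits of the uncertainty, as an integer in 100-999.
        top_three = int(uncertainty / 10 ** (floor(log10(abs(uncertainty))) - 2))
        if top_three <= 354:
            return 2  # e.g. 0.0001 -> 100 -> keep 0.00010 (5 digits past the decimal)
        elif top_three <= 949:
            return 1  # e.g. 456.78 -> 456 -> keep 5e+02
        else:
            return 2  # 950-999 rounds up to 1000, shown with two digits

By this rule 0.0001 keeps two significant digits (0.00010), which is why the bare f format above prints 3.14150+/-0.00010.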

But even if I fix this one, I think there are more unit test failures after it that will appear one by one. @lebigot what should this PR, which unburies more unit tests, do with respect to the unburied unit tests failing? Just go through, or wait for resolutions?

@andrewgsavage
Contributor

Could you get this running again? I ended up setting this to noqa to get the linter to pass.

It would be good to set it up as a pytest-style test with the different tests parameterised.

@wshanks
Collaborator

wshanks commented Jul 24, 2024

When reviewing #248, I did the same thing as @jagerber48, deduplicating the keys locally and finding at least one test failure. I had wanted to get back to that as part of removing the noqas. At the time, my thought was to defer to @jagerber48's judgment because the failing test was related to string formatting.

@jagerber48
Contributor Author

Yes, I can revisit this. And yes, in doing so I'll upgrade to the pytest parametrized tests.

Here I enumerate the failures I found. Some of the failures look like mistakes in the expected test results, but some look like they reveal either bugs, or at least ambiguities, in the format specification.

Before the pytest.parametrize refactor, a bug involving repeated keys in the dictionary meant that many tests were not actually running. Moving to pytest.parametrize re-exposed these tests, but unfortunately some of them are failing.

codecov bot commented Jul 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.46%. Comparing base (f097015) to head (24d5920).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #167      +/-   ##
==========================================
+ Coverage   96.33%   96.46%   +0.13%     
==========================================
  Files          15       16       +1     
  Lines        1909     1898      -11     
==========================================
- Hits         1839     1831       -8     
+ Misses         70       67       -3     


@jagerber48
Contributor Author

Ok, I refactored the tests to use pytest.parametrize. In the old format, a value/uncertainty pair tuple was used as the key to a dictionary of fmt_spec: expected_str pairs. I wanted each case to run in pytest with its own infrastructure, so I moved to a (value, uncertainty, fmt_spec, expected_str) tuple for EACH test, as sketched below. I don't know how to accomplish the nested tests in pytest; it can be done in unittest with subTest, and it looks like it could be done with the pytest-subtests package.
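As a minimal sketch of the new shape (a couple of illustrative cases, not the actual parametrization from the PR):

    import pytest
    from uncertainties import ufloat

    formatting_cases = [
        # (value, uncertainty, fmt_spec, expected_str)
        (3.1415, 0.0001, '+07.2f', '+003.14+/-0000.00'),
        (3.1415, 0.0001, '>10f', '   3.14150+/-   0.00010'),
    ]

    @pytest.mark.parametrize('val, std_dev, fmt_spec, expected_str', formatting_cases)
    def test_format(val, std_dev, fmt_spec, expected_str):
        assert format(ufloat(val, std_dev), fmt_spec) == expected_str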

I manually copied the tests from the old structure to the new one. I tried hard not to miss anything, but I'm only human; I couldn't think of an easy way to automate it.

After doing this I found 8 tests that were failing. These can be seen in this diff. In a following post I'll try to enumerate why I think each test is failing. Some of them look like typos in the expected output, some look like typos in the intended format specification string or forgetting how edge cases in the formatting work. A few of them test cases where I don't actually know what to expect.

For the cases where I truly don't know how to resolve the test, I'm going to leave them as expected failures and open a new issue to try to figure out what's going on.

Going through these tests reminds me that there are some pretty confusing rules when it comes to uncertainties formatting. These rules are likely not documented in the uncertainties documentation, and they may or may not be documented fully in the source code comments. It re-inspires me to think about implementing sciform as a formatting backend for uncertainties. I think a lot of the complexity of uncertainties formatting comes from it trying to "stay true" to the Python built-in format specification language for floats. sciform is inspired by both the Python built-in FSML and the uncertainties FSML, but where ambiguity or confusion arises, sciform chooses simplicity for the scientist over adherence to the possibly convoluted prior conventions.

@jagerber48
Contributor Author

jagerber48 commented Jul 27, 2024

Failing tests:

# val, std_dev, fmt_spec

# 1
3.1415, 0.0001, ">10f"
Expected :'  3.141500+/-  0.000100'
Actual   :'   3.14150+/-   0.00010'
Discussion: It looks like this test was written assuming `3.1415` would get formatted as `3.141500`, with 6 digits past the decimal, which is the default for float formatting. But when no precision is specified, uncertainties defaults to PDG sig-fig rounding on the uncertainty. See https://github.com/lmfit/uncertainties/blob/master/uncertainties/formatting.py#L850. PDG rules indicate this should get rounded to two digits of uncertainty, i.e. the fifth digit past the decimal point.
Resolution: Change expected string to `"   3.14150+/-   0.00010"`.

# 2
3.1415, 0.0001, "0.4e-3"
Expected :'3.1415e+00+/-0.0000e+00'
Actual   :'(3.1415+/-0.0001)e+00'
Discussion: This test comes with a comment saying "forced double exponent". Here the double exponent is intended to be forced by the explicit width specification of `0`, but the width actually has to be explicitly set to `1` to force the double exponent. See https://github.com/lmfit/uncertainties/blob/master/uncertainties/formatting.py#L687. There is also a simple rounding typo.
Resolution: Change the fmt_spec to `1.4e-3` and change the expected string to '3.1415e+00+/-0.0001e+00'

# 3
12.3, 456.78, ""
Expected :'12+/-457'
Actual   :'(0+/-5)e+02'
Discussion: I'm not sure what logic went into the expected string here. The rules that apply are (1) with no precision specified, the precision is determined by PDG sig-fig rules on the uncertainty, in this case one significant figure, and (2) with no presentation type specified, the `g` type is used. `457` formatted with `.1g` gives `5e+02` because the `g` type switches to `e` rather than display trailing insignificant zeros before the decimal point.
Resolution: Change the expected string to `"(0+/-5)e+02"`.
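For example, with a plain float:

    >>> format(456.78, '.1g')   # exponent (2) >= precision (1), so 'g' falls back to 'e'
    '5e+02'
    >>> format(456.78, '.3g')   # enough precision to stay in fixed notation
    '457'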

# 4
12.3, 456.78, ".1ueS"
Expected :'0.123(4.568)e+02'
Actual   :'0(5)e+02'
Discussion: This one got mangled in my reformatting. Originally it was `.4ueS`, which should be consistent with the expected output. I realized this because I noticed that my previous post didn't flag a test like this one.
Resolution: Change the fmt_spec back to `.4ueS`. 

# 5
1234.56789, 0.1, ".0f"
Expected :'(1234+/-0.)'
Actual   :'1235+/-0'
Discussion: This test is a bit confusing. First, there is a simple rounding error. Second, the expected string has a trailing decimal symbol but the actual string does not. The uncertainties convention is that an uncertainty that got rounded to zero should have a trailing decimal, while an uncertainty that is exactly zero should have no trailing decimal symbol. This test conflicts with that convention. The comment on the test is `Approximate error indicated with "."`. I think the conflict arises from an interaction with edge cases in how Python formatting handles trailing decimal symbols. Finally, for some reason the expected string includes enclosing parentheses, but those should only appear for `f` formatting when the `p` option is selected.
Resolution: I think the rounding error and the enclosing parentheses are a typo so I will remove those from the expected string. However, I think this is a real failure of the trailing decimal handling. I will continue to mark it as an expected failure and open an issue.

# 6
1234.56789, 0.1, "e"
Expected :'(1.23456+/-0.00010)e+03'
Actual   :'(1.23457+/-0.00010)e+03'
Discussion: This is a simple rounding typo
Resolution: Set expected string to '(1.23457+/-0.00010)e+03'

# 7
1234.56789, 0.1, "E"
Expected :'(1.23456+/-0.00010)E+03'
Actual   :'(1.23457+/-0.00010)E+03'
Discussion: This is a simple rounding typo
Resolution: Set expected string to '(1.23457+/-0.00010)E+03'


# 8
1234.56789, 0.1, "%"
Expected :'123457+/-10%'
Actual   :'(123457+/-10)%'
Discussion: Percent mode always has parentheses enclosing the value/uncertainty part
Resolution: Set expected string to '(123457+/-10)%'

I'll resolve all of the failing tests in the next commit. Failing test 5 will continue to fail after these corrections. I think it is a real failure. I'm going to mark it as an expected failure and open an issue.

@jagerber48
Contributor Author

Opened #259

@jagerber48
Contributor Author

TLDR: This PR is ready for review.

  • Formatting tests moved into test_formatting.py
  • Tests reformatted to use pytest.
  • A few tests were not being run. When the un-run tests were revealed and run, there were 7 failures. Most of them were benign typos in the expected results, but one reveals a real bug, or at least an inconsistency, in the formatting code. See Trailing decimal symbol test failing #259.
  • There are other tests in test_formatting.py that were just copied and pasted. Eventually these can be cleaned up and put into pytest format as well.

@andrewgsavage
Contributor

that looks good, thanks for getting to it

@jagerber48 jagerber48 merged commit ce16269 into lmfit:master Aug 9, 2024
20 checks passed
@jagerber48 jagerber48 deleted the bugfix/avoid_test_case_overwrites branch August 9, 2024 04:45