How to use pytest.approx() with syrupy #438

Closed
hashstat opened this issue Dec 1, 2020 · 5 comments
hashstat commented Dec 1, 2020

We have a bunch of tests that perform computations using pandas, numpy, and other scientific libraries and produce a dictionary containing the resulting key-value pairs. Some of the values differ slightly when run on different platforms (e.g. macOS vs. Linux), so we wrap those values with pytest.approx() to accommodate those minor and acceptable differences.

Digging into the syrupy code, the final comparison between the test value and the snapshot is performed against serialized data, so it appears that pytest.approx() cannot be used. Is that correct? Or can you suggest a way to allow these two great features to be used together?

Thanks!

Collaborator

noahnu commented Dec 1, 2020

Syrupy defaults to using the repr of an object when serializing and, as you noted, it performs the comparisons on the serialized values.

I just created a test case:

from pytest import approx

def test_case(snapshot):
    assert snapshot == approx(0.3)

Running pytest --snapshot-update creates the following amber file:

# name: test_case
  0.3 ± 3.0e-07
---

If you modify the test case to approx(0.32), you can see the snapshot fails:

(screenshot: pytest output showing the snapshot assertion failing)

and so it looks like syrupy might work just fine? I haven't had much of an opportunity to use pytest.approx, so I'm not sure if this is helpful.

Is there some ideal behaviour you'd want when snapshotting approx values?

Note that if you really wanted to, you could customize the serialization of approx. Here's an example:

https://gist.github.com/noahnu/1fc48564dc6b028f131ce7c832f9e748

import pytest
from syrupy.extensions.amber import AmberSnapshotExtension

class ApproxFormatter:
    def __init__(self, data):
        self.data = data

    def __repr__(self):
        return f"{self.data.expected} +/- ({self.data.tolerance})"

class ApproxExtension(AmberSnapshotExtension):
    def serialize(self, data, **kwargs):
        if isinstance(data, type(pytest.approx(0))):
            return super().serialize(ApproxFormatter(data), **kwargs)
        return super().serialize(data, **kwargs)

@pytest.fixture
def snapshot(snapshot):
    return snapshot.use_extension(ApproxExtension)

def test_case(snapshot):
    assert snapshot == pytest.approx(0.3)

and the output:

(screenshot: the resulting snapshot output)

You can place your custom formatter in a conftest file. It'll shadow the default snapshot fixture. As long as you don't override any dunder methods of the serializer, you can expect the API to remain stable until v2 of syrupy is released as we follow semantic versioning.

noahnu added the question label Dec 1, 2020

Author

hashstat commented Dec 1, 2020

Here's a contrived example that simulates the problem. A function, compute_pi(), computes and returns the value of pi. Due to differences in hardware or library implementations, the precision of the result may vary slightly between systems. Although the results are still technically correct, the differing precision prevents a direct comparison. Enter pytest.approx(), which, like syrupy's snapshot fixture, returns an object that implements __eq__(), in this case to check whether two values are equal within some tolerance.

# test_pi.py
import os
import pytest

pi = 3.141592653589793

def compute_pi():
    return {
        'win32': 3.141592654,
        'darwin': 3.14159265359,
    }.get(os.environ.get('PLATFORM'), 3.1415926536897)

def test_not_equal():
    computed = compute_pi()
    assert pi != computed

def test_approx():
    computed = compute_pi()
    assert pi == pytest.approx(computed)

def test_snapshot(snapshot):
    computed = compute_pi()
    assert snapshot == pytest.approx(computed)

Now update the snapshot and run the tests a couple of times:

(screenshot: pytest output after updating the snapshot and re-running the tests)

But now let's simulate changing platforms using the PLATFORM environment variable:

(screenshot: pytest output with the PLATFORM environment variable set)

You can see that test_approx() still passes, as expected. But test_snapshot() does not because the repr of the approx object is being compared with the previous repr instead of calling the __eq__() method of the pytest.approx() object.
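
The mismatch can be reproduced without pytest or syrupy. The sketch below is only an illustration: `Approx` is a hypothetical stand-in for the object returned by pytest.approx(), not its real implementation, and the default relative tolerance of 1e-6 is an assumption chosen to resemble the snapshots shown in this thread. It compares two platform values once through `__eq__()` and once through their repr strings, which is roughly what a snapshot comparison on serialized data amounts to.

```python
import math


class Approx:
    """Hypothetical stand-in for pytest.approx(); not the real implementation."""

    def __init__(self, expected, rel=1e-6):
        self.expected = expected
        self.rel = rel

    def __eq__(self, other):
        # Tolerance-based comparison, similar in spirit to test_approx().
        return math.isclose(self.expected, other, rel_tol=self.rel)

    def __repr__(self):
        # The repr embeds the exact expected value, so it changes whenever
        # the computed value changes, even by a tiny amount.
        return f"{self.expected!r} ± {self.expected * self.rel:.1e}"


linux_value = 3.1415926536897
win32_value = 3.141592654

# Comparing through __eq__ tolerates the platform difference.
eq_result = Approx(linux_value) == win32_value

# Comparing serialized forms is an exact string match, so it does not.
repr_result = repr(Approx(linux_value)) == repr(Approx(win32_value))

print(eq_result, repr_result)
```

The tolerance only matters if `__eq__()` is actually called; once both sides are reduced to strings, any difference in the expected value is a difference in the snapshot.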

I think what really needs to happen is that the serialized pytest.approx() object needs to be fully restored and compared to the value, not its serialized form. I think this requires overriding both the deserialization and comparison routines, but most of those are protected members. Any chance of providing some public, extendable/overridable methods to make this possible without poking around in non-public methods?

Collaborator

noahnu commented Dec 1, 2020

Performing the comparisons on the serialized data, i.e. the exact data that appears on the file system, is critical to the way syrupy operates relative to some other snapshotting tools. It ensures soundness, since the snapshot reflects the code 100%. There are also quite a few assumptions throughout the code base that what you see in the snapshot files is ultimately what gets compared in the tests.

For that reason I wouldn't override the deserializer to perform equality checks in application code.

Ultimately this brings up the question of whether syrupy is the right tool for this type of test. I think there are ways you could get it to work, perhaps by snapshotting intervals, or by snapshotting after some sort of rounding. I fully understand, though, if that doesn't work for your particular use case.
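
One way to read the "snapshot intervals" idea is to snapshot the grid cell a value falls into rather than the value itself. The following is a minimal sketch under that assumption; `snapshot_interval` and its `step` parameter are hypothetical names, not part of syrupy. It uses `decimal` so the bounds serialize as clean strings.

```python
from decimal import Decimal, ROUND_FLOOR


def snapshot_interval(value, step="1e-8"):
    """Return the (low, high) bounds of the grid cell containing `value`.

    Hypothetical helper: the bounds are returned as strings so that the
    serialized snapshot is identical for every value in the same cell.
    """
    step_d = Decimal(step)
    # Index of the cell, counted in units of `step` and rounded down.
    cell = (Decimal(repr(value)) / step_d).to_integral_value(rounding=ROUND_FLOOR)
    low = cell * step_d
    return (str(low), str(low + step_d))


# The three platform values from compute_pi() land in the same cell.
for value in (3.141592654, 3.14159265359, 3.1415926536897):
    print(snapshot_interval(value))
```

In a test this might be used as `assert snapshot == snapshot_interval(compute_pi())`. A value that sits very close to a cell boundary can still land in different cells on different platforms, so the step has to be chosen with the expected noise in mind.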

Definitely an interesting problem.

@iamogbz do you have thoughts on some way to make this work?

Collaborator

iamogbz commented Dec 2, 2020

@hashstat it seems like you could compare the computed value against an approximation of pi, and then use a syrupy matcher to replace any matching value with that constant pi approximation.

Basically what @noahnu suggested but using matchers instead of a custom extension to make it terser.

import os
import pytest

pi = 3.141592653589793

def compute_pi():
    return {
        'win32': 3.141592654,
        'darwin': 3.14159265359,
    }.get(os.environ.get('PLATFORM'), 3.1415926536897)

def test_not_equal():
    assert pi != compute_pi()

def test_approx():
    assert pytest.approx(pi) == compute_pi()


def test_snapshot(snapshot):
    computed = compute_pi()
    approximated = pytest.approx(pi)
    def pi_approx_matcher(data, path):
        return approximated if data == approximated else data
    assert snapshot(matcher=pi_approx_matcher) == computed

Using your example, this produces the following snapshot and test run:

# name: test_snapshot
  3.141592653589793 ± 3.1e-06
---
$ PLATFORM=win32 pytest test_pi.py
test_pi.py ...                                                           [100%]
--------------------------- snapshot report summary --------------------------- 
1 snapshot passed.
============================== 3 passed in 0.03s ==============================
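
If several reference values need this treatment, the matcher could be built by a small factory. This is a sketch only: `make_approx_matcher` is a hypothetical helper, it uses `math.isclose` instead of pytest.approx(), and it substitutes the plain reference float, so the snapshot would contain `3.141592653589793` rather than the `±` form shown above. The `(data, path)` signature is taken from the example in this comment.

```python
import math


def make_approx_matcher(reference, rel_tol=1e-6):
    """Build a matcher that swaps values close to `reference` for `reference`."""

    def matcher(data, path):
        # Only floats are candidates; everything else passes through untouched.
        if isinstance(data, float) and math.isclose(data, reference, rel_tol=rel_tol):
            return reference
        return data

    return matcher


pi = 3.141592653589793
pi_matcher = make_approx_matcher(pi)

# Direct calls for illustration; the empty tuple is a placeholder path.
print(pi_matcher(data=3.141592654, path=()))
print(pi_matcher(data=2.5, path=()))

# Intended use inside a test, following the example above:
#     assert snapshot(matcher=make_approx_matcher(pi)) == compute_pi()
```

Because the replacement is a constant, the serialized snapshot stays the same on every platform as long as the computed value is within the tolerance.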

Author

hashstat commented Dec 3, 2020

Thanks for all your suggestions. I was able to solve the issue by rounding the numbers to a reasonable precision before comparison.

def test_snapshot(snapshot):
    computed = compute_pi()
    assert snapshot == round(computed, 8)

And all is working as expected.

# name: test_snapshot
  3.14159265
---
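
For the dictionaries of results mentioned at the start of this issue, the same rounding can be applied recursively before the snapshot assertion. This is a minimal sketch; `round_floats` is a hypothetical helper and it only descends into plain dicts, lists, and tuples.

```python
def round_floats(value, ndigits=8):
    """Round every float found in nested dicts, lists, and tuples."""
    if isinstance(value, float):
        return round(value, ndigits)
    if isinstance(value, dict):
        return {key: round_floats(item, ndigits) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type from the rounded items.
        return type(value)(round_floats(item, ndigits) for item in value)
    # Ints, strings, None, and anything else are returned unchanged.
    return value


results = {"pi": 3.1415926536897, "count": 3, "samples": [3.141592654, "n/a"]}
print(round_floats(results))

# Intended use inside a test:
#     assert snapshot == round_floats(compute_results())  # compute_results is hypothetical
```

Rounding is not a full substitute for a tolerance: a value that sits very close to a rounding boundary can still round differently on two platforms, so `ndigits` should be comfortably coarser than the expected platform noise.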
