TST: Test fails locally [pyOpenSci review] #238

Closed
NickleDave opened this issue Jan 19, 2024 · 14 comments · Fixed by #294

Comments

@NickleDave

This test failed for me locally:
tests/test_metamers.py::TestMetamers::test_stop_criterion

I need to reboot and re-run so I can give you the full log from the pytest run; I'll do so after I finish the comment with my entire review 🙂

@billbrod
Collaborator

When you do so, can you please also include your system info (as requested in our bug report issue template)? (These tests are passing in our CI, so we'll need to try to track it down.)

System (please complete the following information):

  • OS: [e.g. Mac (with version), Ubuntu 18.04]
  • Python version [e.g. 3.7]
  • Pytorch version [e.g., 1.4]
  • Plenoptic version [e.g. 0.1]

@NickleDave
Author

OS: Linux, PopOS 22.04 LTS (Ubuntu-like)
Python version: 3.10.12
Pytorch version: 2.1.2+cu121
Plenoptic version: 1.0.3.dev18

I could also reply with a dump of the virtual environment to a requirements.txt if that would help.

=========================================================== FAILURES ============================================================
____________________________________ TestMetamers.test_stop_criterion[frontend.OnOff.nograd] ____________________________________

self = <test_metamers.TestMetamers object at 0x7effcf2684c0>
einstein_img = tensor([[[[0.0039, 0.0039, 0.0039,  ..., 0.0039, 0.0039, 0.0039],
          [0.0039, 0.1961, 0.1961,  ..., 0.6039, 0.6..., 0.1725, 0.1922, 0.0039],
          [0.0039, 0.0039, 0.0039,  ..., 0.0039, 0.0039, 0.0039]]]],
       device='cuda:0')
model = OnOff(
  (center_surround): CenterSurround()
  (luminance): Gaussian()
  (contrast): Gaussian()
)

    @pytest.mark.parametrize('model', ['frontend.OnOff.nograd'], indirect=True)
    def test_stop_criterion(self, einstein_img, model):
        # checking that this hits the criterion and stops early, so set seed
        # for reproducibility
        po.tools.set_seed(0)
        met = po.synth.Metamer(einstein_img, model)
        # takes different numbers of iter to converge on GPU and CPU
        if DEVICE.type == 'cuda':
            max_iter = 30
            check_iter = 25
        else:
            max_iter = 10
            check_iter = 8
        met.synthesize(max_iter=max_iter, stop_criterion=1e-5, stop_iters_to_check=5)
>       assert len(met.losses) == check_iter, "Didn't stop when hit criterion! (or optimization changed)"
E       AssertionError: Didn't stop when hit criterion! (or optimization changed)
E       assert 14 == 25
E        +  where 14 = len(tensor([0.0002, 0.0004, 0.0003, 0.0003, 0.0003, 0.0003, 0.0003, 0.0002, 0.0002,\n        0.0002, 0.0002, 0.0002, 0.0002, 0.0002]))
E        +    where tensor([0.0002, 0.0004, 0.0003, 0.0003, 0.0003, 0.0003, 0.0003, 0.0002, 0.0002,\n        0.0002, 0.0002, 0.0002, 0.0002, 0.0002]) = <plenoptic.synthesize.metamer.Metamer object at 0x7eff807fe890>.losses
tests/test_metamers.py:235: AssertionError
----------------------------------------------------- Captured stderr call ------------------------------------------------------
  [tqdm progress bar updates from the synthesize call, one per iteration for the 14 iterations run, flattened onto a single line; final state:]
  43%|████▎     | 13/30 [00:00<00:00, 151.32it/s, loss=1.9586e-04, learning_rate=0.01, gradient_norm=7.2721e-03, pixel_change_norm=1.2363e+00]

@billbrod
Collaborator

Hi @NickleDave, just getting around to this now. Can you give me a full dump of the virtual environment?

However, I think this is an issue with setting the seed, which determines the initial conditions for the optimization (most importantly, the patch of white noise that we use as the initial image). Given that, can you run:

import plenoptic as po
import torch
import imageio
import numpy as np  # needed for np.uint8 below

po.tools.set_seed(0)
# from the data directory for now
einstein = po.load_images('data/256/einstein.png')
mdl = po.simul.OnOff((31, 31), pretrained=True, cache_filt=True)
po.tools.remove_grad(mdl)
met = po.synth.Metamer(einstein, mdl)
imageio.imsave('init.png', (255 * po.to_numpy(met.metamer).squeeze()).astype(np.uint8))

and then upload the resulting init.png file to this issue? That way I can double-check whether it's the same as the one I get.

If they're different, then I can try uploading the version I get and seeing if using that in the tests fixes the problem.

I'd also be open to other alternatives for this test -- basically, I'm checking that, for a given set of initial conditions, metamer.synthesize hits the convergence condition at the right time.

@billbrod
Collaborator

Alright, based on testing on @BalzaniEdoardo's Mac, I'm 99% sure this is a result of getting a different sample of white noise during initialization.

My proposed fix is to actually check against the stop criterion, which makes more sense anyway.
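
For concreteness, here's a rough sketch of what checking against the stop criterion could look like in the test (illustrative only, not necessarily the exact change that ended up in #251):

# Illustrative sketch: assert that synthesis stopped because the loss
# plateaued, rather than asserting a hard-coded iteration count.
def assert_hit_stop_criterion(losses, max_iter, stop_criterion, stop_iters_to_check):
    # synthesis should have stopped before exhausting the iteration budget
    assert len(losses) < max_iter, "Didn't stop early!"
    # and the loss should have changed by less than the criterion over the
    # last `stop_iters_to_check` iterations
    assert abs(losses[-stop_iters_to_check] - losses[-1]) < stop_criterion, \
        "Loss was still changing when synthesis stopped!"

# e.g., after met.synthesize(max_iter=max_iter, stop_criterion=1e-5, stop_iters_to_check=5):
# assert_hit_stop_criterion(met.losses, max_iter, 1e-5, 5)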

@NickleDave
Author

Sorry for the slow reply.

Not sure I follow all the logic of the test, but I agree that actually checking against the stop criterion makes more sense.

@billbrod
Collaborator

Can you check with main? I just merged in #251, which should solve this issue.

@billbrod
Collaborator

@NickleDave can you try running the tests again? (with the main branch, haven't released the fix to pip yet)

@NickleDave
Author

Hi @billbrod, sorry for the really slow reply on this.

I don't have access to the same Ubuntu machine anymore.
I set up for dev on an MBP M3 and was able to get this single test to pass. I think we can call it good.

Not relevant for this issue, but note that I had to comment out the line

addopts = "--cov=plenoptic -n auto"

because otherwise I got the error:

ERROR: usage: pytest [options] [file_or_dir] [file_or_dir] [...]
pytest: error: unrecognized arguments: --cov=plenoptic -n
  inifile: /Users/davidnicholson/Documents/repos/opensci/plenoptic/pyproject.toml
  rootdir: /Users/davidnicholson/Documents/repos/opensci/plenoptic

Not sure if that's operator error.

Also not a big deal, but you might want to change the snippet in the dev installation instructions here to say pip install -e ".[dev]", since stuff like pytest won't work without it. I know you do explain this below in "optional dependencies", but it may be worth making it clear in this snippet that you're using the "extras" mechanism to declare your dev dependencies.

@billbrod
Collaborator

Thanks!

Did you run pip install -e ".[dev]" before getting that error? This is the error you get if pytest is installed but pytest-cov is not. Can you check whether that's installed and, if not, install it and run again?

(And that's a good point about the dev instructions, I'll make that change)

@NickleDave
Author

Ah, I see, I had run pip install -e .[test] -- force of habit from my own stuff. So, yes, operator error.

Not to complicate your life any further, but just in case you care: you can recursively define optional dependencies so that pip install -e .[dev] gives you docs as well. I usually lump what you have in nb into docs (because the notebooks are part of the docs, IIUC), keep tests as another set of dependencies, and then make dev pull in both, since I usually want to both run tests and build docs in my dev environment. A sketch of that layout is below.
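
For illustration, a hypothetical pyproject.toml sketch of that layout (the package and dependency names here are placeholders, not plenoptic's actual configuration):

[project.optional-dependencies]
docs = ["sphinx", "nbsphinx"]                    # notebook deps lumped in with docs
test = ["pytest", "pytest-cov", "pytest-xdist"]
dev = ["my-package[docs,test]"]                  # recursive: dev pulls in both bundles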

@billbrod
Collaborator

Oh interesting, it might make sense to rename my current dev to tests and then add a new dev bundle that includes all the others.

If I understand that link correctly, I'd add something like:

dev = [
    "plenoptic[docs,nb,test]",
]

which will grab the version currently on PyPI, right? Ideally, it would grab the contents of those bundles from the same file, in case main adds an additional requirement that isn't yet included in a release. I'm about to change how my docs are built, so this will be relevant soon, but it might not matter long-term, as things stabilize again.

(docs doesn't require nb, actually; nbsphinx can handle the conversion of the ipynb files without jupyter. readthedocs just uses the docs bundle, not nb.)

@billbrod
Collaborator

billbrod commented Sep 20, 2024

Oh wait, I spoke too soon -- it looks like it is grabbing the bundle from the current file (tested by incrementing the version required of a package in docs). Alright, that's good to know.

@NickleDave
Author

Yeah, it's a bit confusing. IIUC, since you will already have (the local version of) your package installed, the dependency resolver won't go get it off PyPI, and will instead just get the optional dependencies that are declared recursively.

See https://discuss.python.org/t/where-is-nested-recursive-optional-dependencies-documented/35648/7 and GitHub issue linked therein

@billbrod
Collaborator

I think for now I won't use "recursive optional dependencies" like this, since that GitHub issue hasn't been closed. If I understand correctly, that's because they're not testing this behavior right now, so it seems like a bad idea to rely upon it. If many folks have this confusion in the future, I may revisit this decision.

In that case, I'm going to close this with the merging of #294. (The actual test fix happened in #251)
