Refactor Portilla-Simoncelli model #225

Merged: 119 commits merged into main from ps_refactor on Feb 29, 2024
Conversation

@billbrod (Collaborator) commented on Nov 15, 2023:

This pull request completely refactors the Portilla-Simoncelli texture model. As part of this, the following changes were made to po.simul.PortillaSimoncelli:

  • The model can now accept images with an arbitrary number of batches and channels (see the usage sketch after this list).
  • The model is faster (about 2x faster on both GPU and CPU).
  • You no longer need to call model.to(torch.float64) for the model to accept double-precision inputs -- it will do so automatically.
  • Adds type hinting.
  • Checks the image shape and raises an informative error message if we can't handle it (see Make Portilla-Simoncelli texture model work on arbitrarily-sized images #221).
  • Really tried to make the code much more modular and legible -- I hope it's much clearer what's happening and why in forward.
  • Building off of Add PortillaSimoncelliMinimal to tutorial get original paper's statistics #216, the model now only returns the necessary statistics, throwing away all redundant ones.
  • Removes support for use_true_correlations=False. We now only support using the true correlations, because the only reason anyone would ever set it to False was to check against MATLAB. I still test against the MATLAB implementation, but this now requires a bit more work, which all lives in tests.
  • Related to the above, correctly normalizes the cross-correlations so that they're actual correlations (the previous version wasn't quite right).
  • PS now uses helper functions when relevant instead of implementing its own versions.
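
A minimal usage sketch illustrating the batch/channel and dtype points above. This is a hypothetical illustration: the constructor call shown here is an assumption for the sake of the example, not taken from this PR.

    import torch
    import plenoptic as po

    # hypothetical instantiation; the exact constructor signature is assumed
    model = po.simul.PortillaSimoncelli((256, 256))
    # batched, multi-channel, double-precision input: (batch, channel, H, W)
    img = torch.rand(4, 3, 256, 256, dtype=torch.float64)
    rep = model(img)   # no manual model.to(torch.float64) needed
    print(rep.shape)   # (B, C, S): S statistics per batch/channel element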

Other changes:

  • Several helper functions have been moved from the PS code into tools/ or newly added (see the sketch after this list): center_crop (no longer requires torchvision), expand (upsample an image using the Fourier transform), shrink (downsample an image using the Fourier transform), modulate_phase (modulate the phase of a complex signal, e.g., double the phase of a steerable pyramid coefficient in order to correlate it with another scale), and autocorrelation (replaces and slightly generalizes the existing autocorr function).
  • Adds many more tests.
  • Adds a section to the tips page about making sure the statistics are all in the same range, as that's helpful.
  • Relevant parts of the notebooks have been updated and rerun, and the PS notebook has been completely rerun to ensure its output is qualitatively similar (it is).
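
A hedged sketch of what Fourier-domain expand/shrink and modulate_phase can look like. This is a minimal illustration assuming even image sizes, integer factors, and a pixel-value-preserving normalization; it is not necessarily plenoptic's exact implementation:

    import torch
    import torch.fft as fft

    def fourier_expand(img: torch.Tensor, factor: int) -> torch.Tensor:
        # Upsample by zero-padding the centered spectrum.
        h, w = img.shape[-2:]
        spec = fft.fftshift(fft.fft2(img), dim=(-2, -1))
        big = torch.zeros(*img.shape[:-2], factor * h, factor * w,
                          dtype=spec.dtype, device=spec.device)
        top, left = (factor * h - h) // 2, (factor * w - w) // 2
        big[..., top:top + h, left:left + w] = spec
        # ifft2 normalizes by the new, larger size, so rescale to keep
        # pixel values (rather than power) fixed.
        return fft.ifft2(fft.ifftshift(big, dim=(-2, -1))).real * factor**2

    def fourier_shrink(img: torch.Tensor, factor: int) -> torch.Tensor:
        # Downsample by cropping the centered spectrum (an ideal low-pass).
        h, w = img.shape[-2:]
        spec = fft.fftshift(fft.fft2(img), dim=(-2, -1))
        ch, cw = h // factor, w // factor
        top, left = (h - ch) // 2, (w - cw) // 2
        crop = spec[..., top:top + ch, left:left + cw]
        # Divide so that pixel values, not power, are preserved.
        return fft.ifft2(fft.ifftshift(crop, dim=(-2, -1))).real / factor**2

    def modulate_phase(z: torch.Tensor, phase_factor: float = 2.0) -> torch.Tensor:
        # Scale the phase of a complex signal while keeping its amplitude;
        # phase_factor=2 doubles the phase of a pyramid coefficient so it
        # can be correlated with the next-coarser scale.
        return z.abs() * torch.exp(1j * phase_factor * z.angle())

With this convention a constant image keeps its mean value under both operations, while its total power changes by factor**2; that trade-off is exactly what the review discussion of the 4** factor below is about.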

Notes:

  • At one point, I considered switching away from using the downsampled pyramid, so that the coefficients at all scales would have the same shape. This would make the code cleaner (currently, multiscale representations are lists of tensors rather than a single tensor), but it ends up making things much less efficient and changes the output such that we can no longer guarantee reproduction of the MATLAB values. So I think this is not worth doing.
  • I think the efficiency of the model can be further improved, but I'm not quite sure how. I feel like pytrees will help, but my initial attempts at using them did not. See Make Portilla-Simoncelli code more efficient #222 for notes.
  • While the code now works on multi-channel images, you cannot generate color metamers out-of-the-box. See Add color/channel support #46 for discussion.
  • This will make it much easier to support the pooled texture model used in Freeman's Metamers of the ventral stream, among other places.

Closes: #199, #142

Commit message excerpts:

  • ...and corrects some docstrings
  • move that out of the bowels of PortillaSimoncelli, because it might be helpful
  • ...because it's getting automatically added by something
  • This refactors PS to:
    - remove all unnecessary attributes (representation, etc.)
    - make forward() much more straightforward, calling transparently-named methods that return the thing they say they do
    - remove the old autocorr, nothing uses it
    - add the new autocorrelation, the way needed for Portilla-Simoncelli (see the sketch after this list)
    - put expand and shrink in signal.py
  • this makes it possible to vmap it
  • refactor to use the non-downsampled pyramid. Probably won't stick with this, because it's ~2x slower on the CPU, and gives fairly different values (rtol=1e-1, atol=1e-4 ish)
  • it's actually more efficient, as long as we can't use vmap (which we can't)
  • third version of this, but I think this is the way: still use the downsampling pyramid, but now make lists of tensors (one per scale) to use. Gets us an intermediate version between the two, while still passing all the tests
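
A hedged sketch of an FFT-based autocorrelation in the spirit of the new helper, via the Wiener-Khinchin theorem. The normalization and output conventions here are assumptions, not necessarily what plenoptic's tools implement:

    import torch
    import torch.fft as fft

    def autocorrelation(x: torch.Tensor) -> torch.Tensor:
        # Wiener-Khinchin: the autocorrelation is the inverse FFT of the
        # power spectrum. Operating on the last two (spatial) dimensions
        # means batched/multi-channel inputs are handled for free.
        spec = fft.fft2(x)
        acorr = fft.ifft2(spec * spec.conj()).real
        # Shift zero lag to the center and normalize by the pixel count.
        return fft.fftshift(acorr, dim=(-2, -1)) / (x.shape[-2] * x.shape[-1])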
@BalzaniEdoardo (Contributor) left a comment:

I like the changes. Looks good to me

    Returns
    -------
    representation_vector:
        3d tensor of shape (B,C,S) containing the measured texture
Contributor comment: I like it more now

Comment on lines 18 to 20:

    SCALES_TYPE = Union[
        int, Literal["pixel_statistics", "residual_lowpass", "residual_highpass"]
    ]

Contributor comment: I like it now, it is consistent!

Comment on the following lines:

    reconstructed_images.append(recon + reconstructed_images[-1])
    # now downsample as necessary, so that these end up the same size as
    # their corresponding coefficients.
    reconstructed_images[:-1] = [signal.shrink(r, 2**(self.n_scales-i)) * 4**(self.n_scales-i)

Contributor comment: I think either choice works, as long as you explain what is happening. If you choose to keep the pixel values the same, I would add a comment before this line (where you multiply by 4**(n_scales - i)) and a note in the shrink function saying: "this function keeps the pixel values fixed; the power, however, will change. If you need to preserve the power, multiply by...".

If you choose to have the power as an invariant, I would also make it clear in the docstrings of shrink/expand: something like, "this function down-samples the image and scales it to preserve the power...".
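
To make the two conventions concrete, a hedged illustration (using the hypothetical fourier_shrink sketch from earlier on this page, which keeps pixel values fixed, and assuming the image has little energy outside the retained frequency band):

    import torch

    # fourier_shrink is the hypothetical helper sketched earlier on this page
    x = torch.rand(1, 1, 64, 64)
    y = fourier_shrink(x, 2)  # pixel values ~preserved; summed power drops ~4x
    # y has 1/4 as many pixels, so its total power is roughly x's divided by
    # factor**2; to make power the invariant instead, rescale by the factor:
    y_power = y * 2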


Comment on the following line:

    def plot_representation(

Contributor comment: yep, I agree with the separate PR

Comment on the following lines:

    0, (2 * self.n_orientations, max(2 * self.n_orientations, 5), self.n_scales)
    )
    n_filled += nn

    def convert_to_dict(self, representation_vector: Tensor) -> OrderedDict:

Contributor comment: I see, that makes sense to me; if there is no way to make it general, I agree it's not worth the effort

Comment on the following lines:

    @@ -1,55 +1,58 @@
    import torch

Contributor comment: that works

@billbrod (Collaborator, Author) commented:

This is ready to go now, after I push the merge with main. I changed the description of the magnitude means at the very end.

@billbrod merged commit 136527d into main on Feb 29, 2024. 14 checks passed.
@billbrod deleted the ps_refactor branch on February 29, 2024 at 15:51.
Successfully merging this pull request may close these issues:

  • Add multi-batch and channel support for PortillaSimoncelli