Handle simple fixed factor filtering internally #75

Merged 37 commits on Jul 29, 2021

Member

@NoraLoose NoraLoose commented Jun 11, 2021

Issue

This PR fixes issue #71.

To apply a simple fixed factor filter (as described, for example, in the "Anisotropic Filtering" section of the Filter Theory), the user previously had to go through the following steps manually:

  1. Before filtering, multiply the field by the local cell area.
  2. Apply the gcm-filters filter with dx_min=1 and filter_scale set to the desired fixed factor, pretending the grid is uniform.
  3. After filtering, divide the filtered field by the local cell area.

The first step is essentially a coordinate transformation where the original (locally orthogonal) grid is transformed to a uniform Cartesian grid with dx=dy=1. The third step is the reverse coordinate transformation.
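
The three manual steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not gcm-filters code: `diffuse_uniform` is a hypothetical stand-in for a filter that treats the grid as uniform with dx = 1, and the 1-D periodic grid and the names `fixed_factor_filter`, `field`, and `area` are invented for the example.

```python
def diffuse_uniform(values, n_steps=10, kappa=0.25):
    """Hypothetical stand-in for a uniform-grid filter (dx = 1, periodic)."""
    n = len(values)
    for _ in range(n_steps):
        values = [
            values[i]
            + kappa * (values[(i - 1) % n] - 2.0 * values[i] + values[(i + 1) % n])
            for i in range(n)
        ]
    return values


def fixed_factor_filter(field, area):
    # Step 1: transform to the uniform grid by multiplying with the cell area.
    weighted = [f * a for f, a in zip(field, area)]
    # Step 2: smooth as if the grid were uniform.
    smoothed = diffuse_uniform(weighted)
    # Step 3: transform back by dividing by the cell area.
    return [s / a for s, a in zip(smoothed, area)]


area = [1.0, 2.0, 4.0, 2.0, 1.0, 0.5]
field = [0.0, 1.0, 0.0, 3.0, 2.0, 5.0]
filtered = fixed_factor_filter(field, area)

# Area integrals before and after filtering, for comparison.
integral_before = sum(f * a for f, a in zip(field, area))
integral_after = sum(f * a for f, a in zip(filtered, area))
```

Because the stand-in smoother conserves the sum of its input, the sketch conserves the area integral of `field` up to round-off.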

Changes

These steps are now handled internally by the code.

I introduced a new Laplacian base class BaseScalarLaplacianWithArea for the Laplacians used in simple fixed factor filtering:

  • TRANSFORMED_TO_REGULAR
  • TRANSFORMED_TO_REGULAR_WITH_LAND
  • TRIPOLAR_TRANSFORMED_TO_REGULAR_WITH_LAND

Steps 1 and 3 from above are handled as part of the filter class for all Laplacians that are a subclass of BaseScalarLaplacianWithArea.

Tutorial changes

I updated all tutorials to reflect the new way of doing simple fixed factor filtering.

Old:

filtered = filter.apply(field * area, dims=['yh', 'xh']) / area

New:

filtered = filter.apply(field, dims=['yh', 'xh']) 

While updating the tutorials, I also fixed some typos and clarified some statements. Thanks @sdbachman for reading through the documentation and for providing these comments.

NoraLoose added 22 commits June 7, 2021 18:07
- This is a more realistic and stronger test
* Pass REGULAR and REGULAR_WITH_LAND Laplacian the additional grid
  variable "area"
* Fix typos and implement Scott's comments
@codecov-commenter

codecov-commenter commented Jun 11, 2021

Codecov Report

Merging #75 (e2ab6c9) into master (65294c7) will increase coverage by 0.19%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #75      +/-   ##
==========================================
+ Coverage   98.47%   98.66%   +0.19%     
==========================================
  Files           7        7              
  Lines         719      824     +105     
==========================================
+ Hits          708      813     +105     
  Misses         11       11              
Flag       Coverage Δ
unittests  98.66% <100.00%> (+0.19%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files          Coverage Δ
gcm_filters/filter.py   97.26% <100.00%> (+0.10%) ⬆️
gcm_filters/kernels.py  99.02% <100.00%> (+0.14%) ⬆️
tests/test_filter.py    100.00% <100.00%> (ø)
tests/test_kernels.py   100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@NoraLoose NoraLoose linked an issue Jun 11, 2021 that may be closed by this pull request
Member

@iangrooms iangrooms left a comment

These changes look good to me. @rabernat might want to take a look for Python style issues.

NoraLoose added 2 commits July 8, 2021 12:31
* before: REGULAR and REGULAR_WITH_LAND required an area grid variable
  that is needed for fixed factor filtering; this is reverted
* instead: introduce two separate grid types TRANSFORM_TO_REGULAR and
  TRANSFORM_TO_REGULAR_WITH_LAND that are specifically for fixed factor
  filtering
* expand docstrings of Laplacians
* updated tests to mirror name changes
@NoraLoose NoraLoose mentioned this pull request Jul 9, 2021
Contributor

@rabernat rabernat left a comment

First, let me apologize deeply and sincerely for taking so long to review this. The past month has been extremely challenging for me work-wise due to family and travel-related reasons.

I really like the spirit of this PR. I think it makes things much simpler to implement the weighting internally. However, I have one large-ish suggestion for the implementation.

I don't think it makes sense to check the type of the kernel in filter.py and then do things differently based on what we find. That is a coding pattern that indicates poor separation of concerns between modules. Instead, I think kernels.py should take care of the weighting.

I think the best way to accomplish this would be to expand the Kernel interface to include prepare and finalize methods in the BaseLaplacian classes.

The default would be to do nothing, e.g.:

    def prepare(self, field):
        return field

But implementations could override this, e.g.

@dataclass
class AreaWeightedLaplacian(BaseLaplacian):
    area: ArrayLike

    def prepare(self, field):
        # weight by cell area before filtering
        return field * self.area

    def __call__(self, field):
        # do stuff
        ...

    def finalize(self, field):
        # undo the area weighting after filtering
        return field / self.area

This could even be a mixin, so we don't have to redefine the core laplacian __call__ functions, e.g.

@dataclass
class AreaWeightedMixin(BaseLaplacian):
    area: ArrayLike

    def prepare(self, field):
        return field * self.area

    def finalize(self, field):
        return field / self.area

@dataclass
class RegularLaplacianWithArea(AreaWeightedMixin, RegularLaplacian):
    pass

Then filter.py would just call field = kernel.prepare(field), rather than implementing the weighting directly.
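
A self-contained sketch of this pattern follows. It is an illustration under assumptions, not the gcm-filters implementation: the classes are simplified stand-ins that operate on plain Python lists on a 1-D periodic grid, and `apply_filter` is a hypothetical driver playing the role of the filter module.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BaseLaplacian:
    """Default hooks leave the field untouched."""

    def prepare(self, field):
        return field

    def finalize(self, field):
        return field

    def __call__(self, field):
        raise NotImplementedError


@dataclass
class RegularLaplacian(BaseLaplacian):
    """Periodic 1-D Laplacian on a uniform grid with dx = 1."""

    def __call__(self, field):
        n = len(field)
        return [
            field[(i - 1) % n] - 2.0 * field[i] + field[(i + 1) % n]
            for i in range(n)
        ]


@dataclass
class AreaWeightedMixin:
    area: List[float]

    def prepare(self, field):
        return [f * a for f, a in zip(field, self.area)]

    def finalize(self, field):
        return [f / a for f, a in zip(field, self.area)]


@dataclass
class RegularLaplacianWithArea(AreaWeightedMixin, RegularLaplacian):
    pass


def apply_filter(kernel, field, n_steps=5, kappa=0.25):
    """Hypothetical filter loop that never inspects the kernel type."""
    work = kernel.prepare(field)
    for _ in range(n_steps):
        work = [w + kappa * lap for w, lap in zip(work, kernel(work))]
    return kernel.finalize(work)


area = [1.0, 2.0, 4.0, 2.0]
field = [1.0, 0.0, 3.0, 2.0]
plain = apply_filter(RegularLaplacian(), field)
weighted = apply_filter(RegularLaplacianWithArea(area=area), field)
```

The driver calls the same three methods for every kernel, so adding another weighted kernel only requires a new class in the kernel module.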

Does this suggestion make sense? Happy to chat more, and sorry again for my slowness.

I really appreciate your diligent and careful work on this project.

"--> dx_min is set to 1",
stacklevel=2,
)
self.dx_min = 1
Contributor

What scenario are you imagining here with this check / warning? Rather than overwriting dx_min, why don't we just raise a ValueError here and make the user fix it explicitly?
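
One way to express the stricter check is sketched below. The class `AreaWeightedFilterSpec` is a hypothetical container invented for illustration; it is not the gcm-filters Filter class, and the error message wording is an assumption.

```python
from dataclasses import dataclass


@dataclass
class AreaWeightedFilterSpec:
    """Hypothetical container; not the gcm-filters Filter class."""

    filter_scale: float
    dx_min: float

    def __post_init__(self):
        # Fail loudly instead of silently overwriting the user's input.
        if self.dx_min != 1:
            raise ValueError(
                "Area-weighted (simple fixed factor) filtering requires dx_min=1, "
                f"got dx_min={self.dx_min}."
            )


spec = AreaWeightedFilterSpec(filter_scale=4, dx_min=1)

try:
    AreaWeightedFilterSpec(filter_scale=4, dx_min=0.5)
except ValueError as err:
    message = str(err)
```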

@@ -309,8 +328,8 @@ def __post_init__(self):
]
if filter_factor >= max_filter_factor:
warnings.warn(
"Warning: Filter scale much larger than grid scale -> numerical instability possible",
UserWarning,
"Filter scale much larger than grid scale -> numerical instability possible",
Contributor

Can we link to a documentation page with more context?

raise ValueError(
f"Provided Laplacian {self.Laplacian} is a vector Laplacian. "
f"The ``.apply`` method is only suitable for scalar Laplacians."
)
if issubclass(self.Laplacian, BaseScalarLaplacianWithArea):
# simple fixed factor filtering multiplies field by area before filtering
field = field * self.grid_ds["area"]
Contributor

Rather than putting this logic in the filter module, why not put it into the kernel itself?

"""

area: ArrayType

Contributor

There should be a way to make the class itself do the area weighting / deweighting. To me that would be a lot cleaner than doing it manually in the filter module (better separation of concerns).

* Introduce mixin class AreaWeightedMixin that handles area weighting
  and deweighting in kernels.py
* Take advantage of multiple inheritance in kernels.py for
        - RegularLaplacianWithArea
        - RegularLaplacianWithLandMaskAndArea
* filter_func now calls prepare and finalize methods of the Laplacian
  classes
* Update all tests
* ... if dx_min is not equal to 1 for simple fixed factor filtering
* update test accordingly
@NoraLoose
Member Author

Instead, I think kernels.py should take care of the weighting.

I think the best way to accomplish this would be to expand the Kernel interface to include prepare and finalize methods in the BaseLaplacian classes.

Thanks for this suggestion, Ryan! I really like it; it makes things way cleaner.

I updated the PR, and incorporated all your comments and suggestions.

@NoraLoose NoraLoose mentioned this pull request Jul 27, 2021
3 tasks
Contributor

@rabernat rabernat left a comment

Thanks for the super quick revisions! 🚀

One more round of minor comments, focused on the naming of stuff and documentation. Then LGTM!

Comment on lines 128 to 132
1) Field on locally orthogonal grid is transformed to field on regularly spaced Cartesian
grid with dx = dy = 1, through multiplication by cell area of original grid.
2) Laplacian acts on regular Cartesian grid.
3) Diffused field on regular Cartesian grid is transformed back to field on original grid,
through division by cell area of original grid.
Contributor

If you format this as a proper RST list, it will render correctly in the docs. As is, the list is "inline" and doesn't quite look right: https://gcm-filters--75.org.readthedocs.build/en/75/api.html#gcm_filters.kernels.RegularLaplacianWithArea
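
For reference, a docstring laid out as a reStructuredText enumerated list might look like the sketch below. The class name is hypothetical and the wording is adapted from the lines under review. The relevant RST rules are a blank line before the first item and continuation lines indented to align with the item text.

```python
import inspect


class ExampleAreaWeightedLaplacian:
    """Hypothetical class used only to show the docstring layout.

    The filter proceeds in three steps:

    1. The field on the locally orthogonal grid is transformed to a regularly
       spaced Cartesian grid with dx = dy = 1, through multiplication by the
       cell area of the original grid.
    2. The Laplacian acts on the regular Cartesian grid.
    3. The diffused field is transformed back to the original grid, through
       division by the cell area of the original grid.
    """


# inspect.getdoc strips the common indentation, as documentation tools do.
doc = inspect.getdoc(ExampleAreaWeightedLaplacian)
```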

Member Author

Thanks - the lists render correctly now.

I also tried to get rid of the hyphen that appears in the API at the beginning of the rendered docstrings for some of the Laplacians (but not for others). I couldn't resolve this issue. Any ideas?

Contributor

I do not, but this is relatively minor, so not a big deal to me.


Attributes
----------
area: cell area of original grid
Contributor

Suggested change
area: cell area of original grid

When you use the AreaWeightedMixin, you don't need area as part of this class.


Attributes
----------
area: cell area of original grid
Contributor

Suggested change
area: cell area of original grid

When you use the AreaWeightedMixin, you don't need area as part of this class.

Member Author

True, except that the function required_grid_vars will not pick up area if I don't redefine it here.

This is only an issue for the classes RegularLaplacianWithLandMaskAndArea and TripolarRegularLaplacianTpoint where additional attributes have to be defined (in addition to what is inherited from the superclasses). In contrast, it is not an issue for the RegularLaplacianWithArea which does not need any additional attributes.

Contributor

Ok that makes sense. I don't really understand why return list(self.__annotations__) in BaseScalarLaplacian does not pick up on the mixin attributes; it must have to do with the nitty-gritty details of class inheritance. Perhaps fixable somehow in the required_grid_args but not worth more effort here.
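
A likely explanation, sketched here with hypothetical stand-in classes rather than the gcm-filters code: a class's `__annotations__` holds only the names annotated in that class's own body, so annotations declared on a mixin are not merged in once the subclass declares attributes of its own. `dataclasses.fields`, in contrast, collects inherited fields as well.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class AreaWeightedMixin:
    area: List[float]


@dataclass
class LaplacianWithLandMaskAndArea(AreaWeightedMixin):
    # Declaring a new attribute gives this class its own __annotations__,
    # which holds only the names annotated in this class body.
    wet_mask: List[float]


# Names annotated directly on the subclass.
own_names = list(LaplacianWithLandMaskAndArea.__annotations__)
# Names of all dataclass fields, including those inherited from the mixin.
all_names = [f.name for f in fields(LaplacianWithLandMaskAndArea)]
```

Here `own_names` misses `area`, while `all_names` includes it, which is consistent with the need to redeclare `area` described above.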

pass


ALL_KERNELS[GridType.TRANSFORMED_TO_REGULAR] = RegularLaplacianWithArea
Contributor

I find this naming confusing. "Transformed" is ambiguous, and it doesn't indicate anything about the area weighting. What about REGULAR_AREA_WEIGHTED?

This is not a dealbreaker for me...but I'm just trying to think about what is most obvious for users.

Member Author

I actually went back and forth with the naming convention here; something similar to what you suggest here has been in the mix too! So I'm glad to hear that you find AREA_WEIGHTED most intuitive.



ALL_KERNELS[
GridType.TRANSFORMED_TO_REGULAR_WITH_LAND
Contributor

Same. Do we really want to call this "TRANSFORMED"? Or would "AREA_WEIGHTED" be more clear?

* change Laplacian naming convention from TRANSFORMED_TO
  to AREA_WEIGHTED
* update tests according to new naming convention
* reformat docstrings in kernels.py module so lists show up properly
  in API
* link docstrings to kernel methods
@NoraLoose
Member Author

NoraLoose commented Jul 28, 2021

Since I changed the names of 3 Laplacians (from TRANSFORMED_TO to AREA_WEIGHTED) I should also update all the example notebooks for the documentation. I will do that as soon as casper is up and running again.

I could also do the notebook update as part of PR #78, which is a big docs update.

@rabernat
Contributor

I could also do the notebook update as part of PR #78,

Let's go with that! I think we should merge this now.

@rabernat rabernat merged commit 05b58b9 into ocean-eddy-cpt:master Jul 29, 2021