diff --git a/README.md b/README.md index ef8d896..cd1e51f 100644 --- a/README.md +++ b/README.md @@ -4,34 +4,27 @@ [![Tests](https://github.com/ocean-eddy-cpt/gcm-filters/workflows/Tests/badge.svg)](https://github.com/ocean-eddy-cpt/gcm-filters/actions?query=workflow%3ATests) [![Documentation Status](https://readthedocs.org/projects/gcm-filters/badge/?version=latest)](https://gcm-filters.readthedocs.io/en/latest/?badge=latest) -Diffusion-based smoothers for coarse-graining GCM data. +GCM-Filters: Diffusion-based Spatial Filtering of Gridded Data from General Circulation Models -### Documentation and code +### Description -URLs for the docs and code. +**GCM-Filters** is a python package that performs spatial filtering analysis in a flexible and efficient way. +The GCM-Filters algorithm applies a discrete Laplacian to smooth a field through an iterative process that resembles diffusion (`Grooms et al., 2021 `_). +The package is specifically designed to work with gridded data that is produced by General Circulation Models (GCMs) of ocean, weather, and climate. +Such GCM data come on complex curvilinear grids, whose geometry is respected by the GCM-Filters Laplacians. +Through integration with `dask `_, GCM-Filters enables parallel, out-of-core filter analysis on both CPUs and GPUs. ### Installation -For `conda` users you can - -```shell -conda install --channel conda-forge gcm_filters -``` - -or, if you are a `pip` users +GCM-Filters can be installed with `pip`: ```shell pip install gcm_filters ``` -### Example +### Getting Started -```python -from gcm_filters import gcm_filters - - -gcm_filters.meaning_of_life_url() -``` +To learn how to use GCM-Filters for your data, visit the `GCM-Filters documentation `_. ## Get in touch diff --git a/docs/basic_filtering.rst b/docs/basic_filtering.rst new file mode 100644 index 0000000..7e36cf1 --- /dev/null +++ b/docs/basic_filtering.rst @@ -0,0 +1,186 @@ +Basic Filtering +================ + +The Filter Object +------------------ + + +The core object in GCM-Filters is the :py:class:`gcm_filters.Filter` object. Its full documentation below enumerates all possible options. + +.. autofunction:: gcm_filters.Filter + + +Details related to ``filter_scale``, ``filter_shape``, ``transition_width``, and ``n_steps`` can be found in the :doc:`theory`. +The following sections explain the options for ``grid_type`` and ``grid_vars`` in more detail. + +Grid types +---------- + +To define a filter, we need to pick a grid and associated Laplacian that matches our data. +The currently implemented grid types are: + +.. ipython:: python + + import gcm_filters + list(gcm_filters.GridType) + +This list will grow as we implement more Laplacians. + +The following table provides an overview of these different grid type options: what grid they are suitable for, whether they handle land (i.e., continental boundaries), what boundary condition the Laplacian operators use, and whether they come with a scalar or vector Laplacian. You can also find links to example usages. + ++--------------------------------+-----------------------------------------------------------------+--------------+--------------------+------------------+------------------------------------------+ +| ``GridType`` | Grid | Handles land | Boundary condition | Laplacian type | Example | ++================================+=================================================================+==============+====================+==================+==========================================+ +| ``REGULAR`` | Cartesian grid | no | periodic | Scalar Laplacian | | ++--------------------------------+-----------------------------------------------------------------+--------------+--------------------+------------------+------------------------------------------+ +| ``REGULAR_WITH_LAND`` | Cartesian grid | yes | periodic | Scalar Laplacian | see below | ++--------------------------------+-----------------------------------------------------------------+--------------+--------------------+------------------+------------------------------------------+ +| ``IRREGULAR_WITH_LAND`` | locally orthogonal grid | yes | periodic | Scalar Laplacian | :doc:`examples/example_filter_types`; | +| | | | | | :doc:`examples/example_tripole_grid` | ++--------------------------------+-----------------------------------------------------------------+--------------+--------------------+------------------+------------------------------------------+ +| ``TRIPOLAR_POP_WITH_LAND`` | locally orthogonal grid | yes | tripole | Scalar Laplacian | :doc:`examples/example_tripole_grid` | ++--------------------------------+-----------------------------------------------------------------+--------------+--------------------+------------------+------------------------------------------+ +| ``VECTOR_C_GRID`` | `Arakawa C-Grid `_ | yes | periodic | Vector Laplacian | :doc:`examples/example_vector_laplacian` | ++--------------------------------+-----------------------------------------------------------------+--------------+--------------------+------------------+------------------------------------------+ + +Grid types with scalar Laplacians can be used for filtering scalar fields (such as temperature), and grid types with vector Laplacians can be used for filtering vector fields (such as velocity). + +Grid types for simple fixed factor filtering +++++++++++++++++++++++++++++++++++++++++++++ + +The remaining grid types are for a special type of filtering: **simple fixed factor filtering** to achieve a fixed *coarsening* factor (see also the :doc:`theory`). If you specify one of the following grid types for your data, ``gcm_filters`` will internally transform your original (locally orthogonal) grid to a uniform Cartesian grid with `dx = dy = 1`, and perform fixed factor filtering on the uniform grid. After this is done, ``gcm_filters`` transforms the filtered field back to your original grid. +In practice, this coordinate transformation is achieved by area weighting and deweighting (see :doc:`theory`). This is why the following grid types have the suffix ``AREA_WEIGHTED``. + ++-----------------------------------------------+-------------------------+--------------+--------------------+------------------+--------------------------------------+ +| ``GridType`` | Grid | Handles land | Boundary condition | Laplacian type | Example | ++===============================================+=========================+==============+====================+==================+======================================+ +| ``REGULAR_AREA_WEIGHTED`` | locally orthogonal grid | no | periodic | Scalar Laplacian | | ++-----------------------------------------------+-------------------------+--------------+--------------------+------------------+--------------------------------------+ +| ``REGULAR_WITH_LAND_AREA_WEIGHTED`` | locally orthogonal grid | yes | periodic | Scalar Laplacian | :doc:`examples/example_filter_types`;| +| | | | | | :doc:`examples/example_tripole_grid` | ++-----------------------------------------------+-------------------------+--------------+--------------------+------------------+--------------------------------------+ +| ``TRIPOLAR_REGULAR_WITH_LAND_AREA_WEIGHTED`` | locally orthogonal grid | yes | tripole | Scalar Laplacian | :doc:`examples/example_tripole_grid` | ++-----------------------------------------------+-------------------------+--------------+--------------------+------------------+--------------------------------------+ + + +Grid variables +-------------- + +Each grid type from the above two tables has different *grid variables* that must be provided as Xarray `DataArrays `_. For example, let's assume we are on a Cartesian grid (with uniform grid spacing equal to 1), and we want to use the grid type ``REGULAR_WITH_LAND``. To find out what the required grid variables for this grid type are, we can use this utility function. + +.. ipython:: python + + gcm_filters.required_grid_vars(gcm_filters.GridType.REGULAR_WITH_LAND) + +``wet_mask`` is a binary array representing the topography on our grid. Here the convention is that the array is 1 in the ocean (“wet points”) and 0 on land (“dry points”). + +.. ipython:: python + + import numpy as np + import xarray as xr + + ny, nx = (128, 256) + + mask_data = np.ones((ny, nx)) + mask_data[(ny // 4):(3 * ny // 4), (nx // 4):(3 * nx // 4)] = 0 + wet_mask = xr.DataArray(mask_data, dims=['y', 'x']) + +.. ipython:: python + :okwarning: + + @savefig wet_mask.png + wet_mask.plot() + +We have made a big island. + +Creating the Filter Object +-------------------------- + +We create a filter object as follows. + +.. ipython:: python + + filter = gcm_filters.Filter( + filter_scale=4, + dx_min=1, + filter_shape=gcm_filters.FilterShape.TAPER, + grid_type=gcm_filters.GridType.REGULAR_WITH_LAND, + grid_vars={'wet_mask': wet_mask} + ) + filter + +The string representation for the filter object in the last line includes some of the parameters it was initiliazed with, to help us keep track of what we are doing. +We have created a Taper filter that will filter our data by a fixed factor of 4. + +Applying the Filter +------------------- + +We can now apply the filter object that we created above to some data. Let's create a random 3D cube of data that matches our grid. + +.. ipython:: python + + nt = 10 + data = np.random.rand(nt, ny, nx) + da = xr.DataArray(data, dims=['time', 'y', 'x']) + da + +We now mask our data with the ``wet_mask``. + +.. ipython:: python + + da_masked = da.where(wet_mask) + +.. ipython:: python + :okwarning: + + @savefig data.png + da_masked.isel(time=0).plot() + +Now that we have some data, we can apply our filter. We need to specify which dimension names to apply the filter over. In this case, it is ``y``, ``x``. + +.. ipython:: python + + %time da_filtered = filter.apply(da_masked, dims=['y', 'x']) + +.. ipython:: python + + da_filtered + +Let's visualize what the filter did. + +.. ipython:: python + :okwarning: + + @savefig data_filtered.png + da_filtered.isel(time=0).plot() + + +Using Dask +----------- + +Up to now, we have filtered *eagerly*; when we called ``.apply``, the results were computed immediately and stored in memory. +``GCM-Filters`` is also designed to work seamlessly with Dask array inputs. With `dask `_, we can filter *lazily*, deferring the filter computations and possibly executing them in parallel. +We can do this with our synthetic data by converting them to dask. + +.. ipython:: python + :okwarning: + + da_dask = da_masked.chunk({'time': 2}) + da_dask + +We now filter our data lazily. + +.. ipython:: python + :okwarning: + + da_filtered_lazy = filter.apply(da_dask, dims=['y', 'x']) + da_filtered_lazy + +Nothing has actually been computed yet. +We can trigger computation as follows: + +.. ipython:: python + + %time da_filtered_computed = da_filtered_lazy.compute() + +Here we got only a very modest speedup because our example data are too small. For bigger data, the performance benefit will be more evident. diff --git a/docs/environment.yml b/docs/environment.yml index 4643e22..cf61b4f 100644 --- a/docs/environment.yml +++ b/docs/environment.yml @@ -6,6 +6,7 @@ dependencies: - numpy - scipy - xarray + - dask - matplotlib - numpydoc - sphinx diff --git a/docs/tutorial_filter_types.ipynb b/docs/examples/example_filter_types.ipynb similarity index 98% rename from docs/tutorial_filter_types.ipynb rename to docs/examples/example_filter_types.ipynb index f3b3dd4..dc8194e 100644 --- a/docs/tutorial_filter_types.ipynb +++ b/docs/examples/example_filter_types.ipynb @@ -4,7 +4,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Tutorial on different filter types\n", + "# Example: Different filter types\n", "\n", "In this tutorial, we are going to cover how to use `gcm-filters` for different filter types: \n", "* Filters with fixed filter length scale\n", @@ -463,7 +463,7 @@ " filename: averages_00030002.nc\n", " grid_tile: N/A\n", " grid_type: regular\n", - " title: NeverWorld2