Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iss386 #387

Merged
merged 3 commits into from
Aug 30, 2023
Merged

Iss386 #387

merged 3 commits into from
Aug 30, 2023

Conversation

Mikejmnez
Copy link
Collaborator

@Mikejmnez Mikejmnez commented Aug 10, 2023

This PR addresses #386 .

Improves the performance of oceanspy with LLC4320 by avoiding the computation of max/min range values from the dataset, as these are known a priori.

Feel free to suggest something different, not approve, etc...

This PR also makes sure that the chunk size in the horizontal of the resulting dataset from cutout, is set by the domain size. Before it was an option, with the idea for testing. But we can always re chunk afterwards. This eliminates an unused parameter from llc_rearrange.

@codecov
Copy link

codecov bot commented Aug 10, 2023

Codecov Report

Merging #387 (7ba13f0) into main (4eaa06e) will increase coverage by 0.05%.
The diff coverage is 75.00%.

@@            Coverage Diff             @@
##             main     #387      +/-   ##
==========================================
+ Coverage   94.10%   94.15%   +0.05%     
==========================================
  Files           9        9              
  Lines        4189     4194       +5     
  Branches     1005     1004       -1     
==========================================
+ Hits         3942     3949       +7     
  Misses        154      154              
+ Partials       93       91       -2     
Flag Coverage Δ
unittests 94.15% <75.00%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
oceanspy/subsample.py 92.62% <ø> (ø)
oceanspy/llc_rearrange.py 84.77% <50.00%> (+0.25%) ⬆️
oceanspy/_ospy_utils.py 98.24% <100.00%> (+0.05%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@MaceKuailv
Copy link
Collaborator

why is chunks removed?

@Mikejmnez
Copy link
Collaborator Author

Mikejmnez commented Aug 20, 2023

why is chunks removed?

Reasons:

  1. It is very virtually impossible to know a priori the size of the resulting dataset after an arbitrary cutout, in particular the length of core dimensions (X, Y, etc)
  2. Because of 1), it is essentially impossible to know a priori whether a "proposed" chunking strategy (passed as argument to llc_rearrange) will create equal, consistent chunks across the dataset. A consistent chunking strategy creates equal chunks / divides the length of a core dimension into equal (number of) pieces. Because center and corner points (Xp1 and X) are offset by a unit, this resorts to finding the largest common divisor (because we want large chunks), or close to equal chunk sizes so that we always have the same number of chunks ("Nx"), each with the same number of grid points.
  3. Performing an arbitrary cutout can sometimes create inconsistent chunking along core dimensions. There is no immediate error associated with this, but it can creep later when making computations (not when plotting). This means that the resulting dataset after cutout should always be rechunked.
  4. Having inconsistent chunking can negatively effect the performance of subsequent operations, and lead to errors. I can come up with an explicit example if needed.

And so the only option that is guaranteed to produce consistent chunking after a cutout via llc_rearrange, is by having a single chunk of the size of the core dimension (e.g. X, 'Xp1', 'Y', 'Yp1', ...). Any subsequent re chunking can be done afterwards, after inspecting the dataset. Having chunks as arguments will very likely lead to errors.

Copy link
Collaborator

@ThomasHaine ThomasHaine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Do these changes work OK with ECCO and daily_ecco LLC datasets too?

@Mikejmnez
Copy link
Collaborator Author

Looks good to me. Do these changes work OK with ECCO and daily_ecco LLC datasets too?

Yes. The changes are independent of the size of the LLC dataset.

@ThomasHaine
Copy link
Collaborator

@asiddi24 @MaceKuailv @malmans2 @renskegelderloos Does this look OK to you? If so, please approve the PR. Thanks!

Copy link
Collaborator

@MaceKuailv MaceKuailv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me.

@Mikejmnez Mikejmnez merged commit 497bd23 into hainegroup:main Aug 30, 2023
@Mikejmnez Mikejmnez deleted the iss386 branch August 31, 2023 22:35
Mikejmnez pushed a commit that referenced this pull request Nov 7, 2023
* format

* format

* format

* fix bug

* do not remove X Y from grid coords

* remove complex topology from grid

* set coords

* format

* format

* format

* format

* fix bugs in test

* fix bug in test

* fix bug

* format

* format

* unpin scipy

* format

* format

* fix test to pass serial

* format

* rename fn

* fix name

* format

* fix dim_name ref

* squeeze vals

* sort by dimension

* dataset is return here

* format

* format

* do not make automatic when `face` is dimension

* format

* format

* comment import of dask - not used

* remove time-chunking

* format

* fix typo

* re-chunk, size of entire mooring array

* test with this

* correct import

* format

* format

* fix test

* fix

* re format

* format

* change var names

* format

* format

* improve test with option

* complete test

* fix typo

* format

* remove repeated

* format

* format

* format

* format

* format

* format

* fix bug

* format

* refactor - allow NoneType

* format

* format

* remove - not needed anymore

* format

* format

* typo

* format

* typo

* rename

* complete assertion

* increase testing

* fix other typo

* format

* format

* inclde test for single point

* rename var

* format

* create `yb, xb`

* format

* format

* fix import

* remove spacing

* remove unused

* remove double redim

* fotmat

* format

* format

* isort

* format

* fix var name

* format

* format

* format

* format

* format

* fix arg so that both fns have same name args

* isort

* fix var names

* improve description of fn

* remove undef vars

* fix imports/vars

* format

* remove unused vars

* format

* format

* fix bug

* revert errs

* allow extra args

* format

* re chunk along new dimension

* fix typos

* format

* fix args

* format

* correct conditional

* format

* remove unused var

* fix typo

* add underscore

* format

* format

* no longer drop vars

* make `None` as default unit

* make array type as default

* use correct import

* improve coverage

* fix type for testing

* format

* fix typo

* format

* format

* format

* fix typos

* fix argument

* format

* format

* format

* format

* remove print statements

* format

* fix typo

* fix testing typo

* format

* format

* format

* more testing ds_edge

* fix import name

* remove print statements

* remove extra testing

* get `pair` from kwargs

* format

* format

* format

* format

* return`axis` for testing

* add `axis` as returned variables

* fix return

* format

* remove assertion with axis

* format

* format

* format

* format

* add testing

* format

* fix args

* format

* remove unused var

* fix Nx

* format

* format

* format

* when adjacent, eval only with two face list

* format

* format

* remove unused var

* remove unused var

* return more vars for testing

* format

* format

* format

* format

* format

* typos

* format

* improve testing

* format

* format

* fix typo

* fix ordering

* format

* format

* improve description of fn

* no cover this conditional

* change conditional

* allow for consistent computation of `diffX` and `diffY` with `len(mooring)`

* fix return when Niter==1

* correct arg

* format

* Pre commit (#385)

* new updated pre-commit

* re-format

* re format

* format

* [pre-commit.ci] pre-commit autoupdate (#381)

updates:
- [github.com/macisamuele/language-formatters-pre-commit-hooks: v2.9.0 → v2.10.0](macisamuele/language-formatters-pre-commit-hooks@v2.9.0...v2.10.0)
- [github.com/psf/black: 23.3.0 → 23.7.0](psf/black@23.3.0...23.7.0)
- [github.com/PyCQA/flake8: 6.0.0 → 6.1.0](PyCQA/flake8@6.0.0...6.1.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Iss386 (#387)

* format

* format

* format

* Iss389 (#391)

* pin python to `3.10`

* test with python version 3.11

* repo2docker does not support python v3.11 yet

* pin to 3.11 last test

* build failed with 3.11

* Bump actions/checkout from 3 to 4 (#390)

Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v3...v4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [pre-commit.ci] pre-commit autoupdate (#392)

updates:
- [github.com/psf/black: 23.7.0 → 23.9.1](psf/black@23.7.0...23.9.1)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [pre-commit.ci] pre-commit autoupdate (#393)

updates:
- [github.com/pre-commit/pre-commit-hooks: v4.4.0 → v4.5.0](pre-commit/pre-commit-hooks@v4.4.0...v4.5.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [pre-commit.ci] pre-commit autoupdate (#394)

updates:
- [github.com/macisamuele/language-formatters-pre-commit-hooks: v2.10.0 → v2.11.0](macisamuele/language-formatters-pre-commit-hooks@v2.10.0...v2.11.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [pre-commit.ci] pre-commit autoupdate (#395)

updates:
- [github.com/psf/black: 23.9.1 → 23.10.0](psf/black@23.9.1...23.10.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix arctic_control not opening (#396)

* first time the charm

* cleaned up redundant part

* [pre-commit.ci] pre-commit autoupdate (#397)

updates:
- [github.com/psf/black: 23.10.0 → 23.10.1](psf/black@23.10.0...23.10.1)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* adjust to pre-commit

* refactor

* remove unused and untested function

* fix test

* fix failing test

* format

* fix testing

* omit testing when persist

* formnat

* allow some additional fn to not be covered / unuused

* remove import

* format

* improve test

* formata

* improve coverage

* format

* raise coverage

* improve coverage

* format

* rename instead of compute grid vars

* compute uv grid points when `serial=True` (faced data)

* format

* re set coords after manupilate=true

* fix typo

* format

* format

* fix typo

* typoe

* no longer needed to compute these coords

* fix bug

* fix typo

* should not include arctic

* include shapely for ci

* format

* format

* format

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: MaceKuailv <52629492+MaceKuailv@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants