-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PyOpenSci] Reviewer #2 comments #1335
Comments
Nice! Thus indeed, I guess that tagging a testdata version would help solve this! |
My previous worries about splitting the testdata with the code Ouranosinc/xclim-testdata#1 (comment) So tagging the testdata should solve this reproducibility issue. But to ensure smooth dev workflow, the code should allow overriding the tag with a branch name. During dev cycle, both the testdata and the code would most probably move together. Without the override capability, we will have to continuously tag the testdata so it can be used with the code and this can get tedious. However, with this tag override capability, we must not forget to tag the final version of the testdata and bump that tag on the code side before merging. The tag should be the default value when no override is used. |
Thanks for the suggestion on how best to proceed for this. We now have a testing data tagging scheme and some GitHub Actions to prevent us from accidentally breaking it during some development branch. It's nothing fancy, but we should be able to more easily test older versions of |
<!--Please ensure the PR fulfills the following requirements! --> <!-- If this is your first PR, make sure to add your details to the AUTHORS.rst! --> ### Pull Request Checklist: - [x] This PR addresses an already opened issue (for bug fixes / features) - This PR implements a suggestion made in #1335 - [x] Tests for the changes have been added (for bug fixes / features) - [x] (If applicable) Documentation has been added / updated (for bug fixes / features) - [x] CHANGES.rst has been updated (with summary of main changes) - [x] Link to issue (:issue:`number`) and pull request (:pull:`number`) has been added ### What kind of change does this PR introduce? * Add a `make_criteria` helper to reshape dataset into 2D critera arrays, as expected by the ensemble reduction functions. EDIT: Also, made this function a bit complicated so that it accepts dataset with variables of different shapes and still preserves the coordinates. ### Does this PR introduce a breaking change? No. ### Other information: ~I need to add a test and udpate the history.~
Except for #1342, we managed to address all major comments in under a work-week. Nicely done, team! |
Originally posted by @jmunroe in pyOpenSci/software-submission#73 (comment)
This issue is a tickbox summary of comments from the reviewer that seemed addressable in the near-term.
The foreword to the review:
Documentation
TJS: This is addressed in #1338
TJS: This is addressed in #1338
TJS: This is addressed in #1338
Functionality
TJS: The reason why this is occurring is because we made significant changes to our
xclim-testdata
repository in recent versions. I realize now that this is breaking because we aren't tagging explicit versions/commits of the testdata that are guaranteed to work. I'm thinking that we might want to start doing that from now on, rather than always point atmaster
. @aulemahal, what do you think?Update: This is addressed in #1339
TJS:
pylint
is configured but we do not currently pass those compliance checks (run with allowed failure). If the amount of effort to get us passing is reasonable, I'll attempt to get this working.Review Comments
Installation notes
xclim
documentation under Installation, I created a separate conda environment to install the required dependencies:And there I hit my first issue:
The fix (at least for conda 22.11.1) is that
--file
is an option to pass toconda env create
and notconda create
. This needs to be fixed in the install instructions.TJS: This is addressed in #1338
flox
,SBCK
,eigen
,eofs
,pybind
. Since I used conda'senvironment.yml
the extraseofs
,eigen
,pybind11
were already included.I confess I tend to get confused when there is the option of using either
environment.yml
andrequirements_*.txt
files. So, I skipped the instructions following 'Extra Dependencies' in the documentation.I assume there must be situtations when I should and should not install these extra dependencies but as a new user of the package, I don't what those situations are yet.
Since theses installation instructions are right near the top of the documenation, perhaps it would be better for the maintainers to make those choices for me? For example, I am now wondering "should I be installing
flox
?". Since it is 'highly recommended', would it not make more sense to have it as part of the default instructions?TJS: This is addressed in #1338
Basic Usage
My initial reading of this code made me think that this ERA5 dataset was something I need to first download locally (I did not distinguish between
xr.open_dataset
andopen_dataset
in my very first glance at the code).After some review, I see now that there companion GitHub repo that was available that had testing data and the
xclim.testing
API automatically makes a locally cached copy of this file. I think it would be clearer if this very first example was written out asso that it was clear that the
open_dataset
was utility method of the testing framework for xclim.TJS: This is addressed in #1338
In the example of Health checks and metadata attributes there is a typo:
should be
TJS: This is addressed in #1338
While in-code comments are generally fine, these last few examples on graphics feel tacked on given the strong narrative text established in the beginning of the Basic Usage section of the documentation.
TJS: This is addressed in #1338
Examples
Workflow Examples
Minor spelling error in the docs:
xarray
is tightly in*(t)*egrated withdask
, a package that can automatically parallelize operations.TJS: This is addressed in #1338
This is confusing because, as the first example workflow, the user has not yet been shown to use the clisops package. Should there be a subsub-section immediately before such as Subsetting and selecting data with cliops to demonstrate that recommended workflow?
TJS: This is addressed in #1338
that leads to the warning
I think the offending line should be changed to
(and elsewhere in the documenation where seaborn styles are used)
TJS: This is addressed in #1338
TJS: This is addressed in #1338
don't appear to match the values used in the code. I assume the code comments just need to be updated.
PB: This is addressed in #1338
Ensemble-Reductinon Techniques
Is this "criteria" array effectively the equivalent of creating a feature matrix used in data science?
TJS: This is addressed in #1341
Ensemble-Reductinon Techniques
Statistical Downscaling and Bias-Adjustment
TJS: Many typos and grammatical errors have been addressed in #1338
TJS: Many typos and grammatical errors have been addressed in #1338
The text was updated successfully, but these errors were encountered: