Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JOSS review] Python version and Dependecies #43

Closed
cheginit opened this issue May 31, 2021 · 4 comments
Closed

[JOSS review] Python version and Dependecies #43

cheginit opened this issue May 31, 2021 · 4 comments
Assignees

Comments

@cheginit
Copy link

cheginit commented May 31, 2021

I am wondering what's reason behind not supporting Python 3.7+. What specific features of Python 3.9 is being used in the code? Pinning the dependencies can limit the usage of the project with other packages. Although one can create several environments, having such unnecessary pinned dependencies might affects the user experience.

The requirements.txt includes unnecessary dependencies:

recommonmark
setuptools
sphinx
sphinx-rtd-theme

Some other suggestions:

  • I suggest to use conda instead of pyenv, since you're relying on scientific Python libraries, they tend to be orders of magnitudes faster when installed via conda-forge.
  • You don't think you need both pyarrow and fastparquet since they both do the same thing. pyarrow is the default parquet engine in pandas.
  • When using h5netcdf instead of netcdf4 you need to be aware of, and inform the user about, the difference that are documented on xarray's website:

There may be minor differences in the Dataset object returned when reading a NetCDF file with different engines. For example, single-valued attributes are returned as scalars by the default engine=netcdf4, but as arrays of size (1,) when reading with engine=h5netcdf.

Overall, I think the dependencies need a reevaluation.

EDIT: Reference to openjournals/joss-reviews#3221

@cheginit cheginit changed the title Python version and Dependecies [JOSS review] Python version and Dependecies May 31, 2021
@thurber thurber self-assigned this Jun 1, 2021
@thurber
Copy link
Contributor

thurber commented Jun 1, 2021

Fair point, many folks are bringing up the Python version concerns. I briefly tried to make use of some of the new shared memory semantics of 3.9, but abandoned that effort -- honestly my main concern is to push folks to keep up to date on their software tooling, but that isn't really the job of a library. I will update to ensure support for 3.7+

@cheginit
Copy link
Author

cheginit commented Jun 1, 2021

Yes, I understand. The main issue with not supporting the Python versions that haven't reached EOL is compatibility with other libraries. Version 3.6's EOL is in 7 months and most of the major libraries have already dropped support for it. So I think 3.7+ support is reasonable.

@thurber thurber mentioned this issue Jun 3, 2021
@thurber
Copy link
Contributor

thurber commented Jun 3, 2021

In #46, I have set the minimum python version to 3.7 and moved the dependencies related to building the documentation into the extra_require section such that they are optional. Also, removed fastparquet dependency since I'm not using it anywhere.

Wanted to respond to your other points as well:

  • conda -- I think this is a good idea; I need to learn how to submit a package to conda-forge, so I will make a separate issue for this
  • h5netcdf vs netcdf4 -- for all the model outputs i'm using the default netcdf4 engine; the one place h5netcdf is used is when creating a compressed file from all the static grid/model variables, which is mostly just used for the unit tests (i found this engine created slightly smaller files and read them slightly faster). It is unnecessary complexity though so I will make another issue to back out this requirement.

@cheginit
Copy link
Author

cheginit commented Jun 4, 2021

That's looks great!

Regarding conda, that's a (strong) suggestion and not a requirement for JOSS. For development and testing, I highly recommend MambaForge. For Github Actions, you can take a look at the action file that I am using in one of my projects. It uses MambaForge and caching.

Regarding h5netcdf, I agree that it's an unnecessary complexity.

@thurber thurber closed this as completed in a016886 Jun 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants