#1219 bumps the default version of Python, which is a breaking change since it may break repositories that pin package versions that won't work with a newer Python version.
We'll also need to update the base-image from Ubuntu 18.04 soon.
There is an issue (somewhere :-/) about the fact that for real reproducibility people need to have the triple: repo link, revision of the code and revision of repo2docker. We discussed adding a feature to repo2docker that would allow it to fetch a version of itself and use that instead of what the user installed when building the container (something like fetch the container image for that revision of r2d).
In the past we have broken the "reproducibility promise" a few times already. For example, when we switched to JupyterLab as the default and, I think, in some other rare cases.
In general, I think because the universe keeps evolving it is already likely that (very) old revisions of a repo will not build or lead to a container that is quite different. Despite this I think we should try hard not to add to this source of entropy by going wild with breaking changes in r2d.
For mybinder.org, I think it's going to be tricky to balance repo2docker version as an input with security/reliability requirements to not run images from old repo2docker, but that is going to be the only truly reproducible input.
I don't think repos that don't specify a Python version have a reasonable expectation of long-term reproducibility, since they have only a partially specified environment, but that's hard to weigh against the fact that the Python community doesn't have any standard, widely adopted way to specify the Python version. Only relatively uncommon tools like Pipfile and conda can specify it, and repos using them often don't. In large part, I think this is a documentation question: defaults will be updated, and if you don't specify a version, it will change over time (this is true of any package in requirements.txt or any other env spec, too).
I still think it would be appropriate for us to add a repo's last_modified_date as an input, and use that to pick the default Python. I think it would dramatically improve our success rate for reproducible envs by default, based on sampling data from our study a couple years ago, but that's slightly tangential to the current discussion.
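To make the idea concrete, here is a rough sketch of picking a default from a repo's last-modified date. The function name `default_python_for`, the fallback value, and the rule itself (newest Python released on or before that date) are assumptions for illustration, not anything repo2docker implements today. The release dates are the CPython final-release dates.

```python
from datetime import date

# Final release dates of CPython minor versions.
PYTHON_RELEASES = [
    ((3, 7), date(2018, 6, 27)),
    ((3, 8), date(2019, 10, 14)),
    ((3, 9), date(2020, 10, 5)),
    ((3, 10), date(2021, 10, 4)),
]


def default_python_for(last_modified, fallback=(3, 7)):
    """Return the newest Python released on or before `last_modified`.

    Hypothetical helper: `fallback` is used for repos older than every
    entry in PYTHON_RELEASES.
    """
    candidates = [
        version for version, released in PYTHON_RELEASES
        if released <= last_modified
    ]
    return max(candidates) if candidates else fallback


# A repo last touched in early 2020 would get Python 3.8 under this rule.
print(default_python_for(date(2020, 1, 1)))
```

A real version would probably want to map dates to whatever default repo2docker shipped at that time, rather than to the newest upstream release.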
I think we should specify in docs somewhere our upgrade policy for:
defaults that can be overridden, like Python version, and
hardcoded values that can't be overridden (at least in mybinder.org), like base image
I think it would be valid, for instance, for our default Python to be latest - 1 (3.10, as in this PR), and upgrade every year, following Python's own releases.
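As a sketch of what a "latest - 1" policy means in code (the helper name `latest_minus_one` is made up for this example, and the version list is passed in by the caller rather than looked up anywhere):

```python
def latest_minus_one(released_versions):
    """Return the second-newest version from an iterable of (major, minor) tuples.

    Hypothetical helper illustrating the "latest - 1" default policy.
    """
    ordered = sorted(set(released_versions))
    if len(ordered) < 2:
        raise ValueError("need at least two released versions")
    return ordered[-2]


# With 3.11 as the latest release, the default would be 3.10.
print(latest_minus_one([(3, 9), (3, 11), (3, 10)]))
```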
I think it may also be appropriate to communicate more clearly during the build process, e.g. with a warning stating that the Python version is unspecified and that repo2docker outputs will change over time.
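Something along these lines, as a minimal sketch. `resolve_python_version` and `DEFAULT_PYTHON` are hypothetical names, not existing repo2docker API, and the message wording is only a suggestion:

```python
import warnings

DEFAULT_PYTHON = "3.10"  # assumed default, for illustration only


def resolve_python_version(requested):
    """Return the requested Python version, or the default with a warning.

    Hypothetical helper: `requested` is whatever the repo's environment
    files specified, or None if they specified nothing.
    """
    if requested:
        return requested
    warnings.warn(
        f"Python version is unspecified; using default {DEFAULT_PYTHON}. "
        "This default will change over time, so pin a version for "
        "reproducible builds.",
        UserWarning,
        stacklevel=2,
    )
    return DEFAULT_PYTHON
```

In an actual build this would more likely go through the build log than through `warnings`, so that users see it in the mybinder.org build output.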
Another valid approach would be to follow something like the Python developer survey, and pick the most popular version (3.9) as default, or something like the 50th percentile version (also 3.9 this year, but 52% would be 3.8).
From @betatim
Related:
repo2docker.version - a config file to make repo2docker run another repo2docker version #550