Switch default python to 3.10 #1219

Merged
merged 10 commits into from
Feb 17, 2023

Conversation

yuvipanda
Collaborator

3.7 is quite old, and folks get caught up in it because it also sometimes forces much older versions of packages - and it can be quite confusing to debug (see
2i2c-org/infrastructure#1934) for example.

@manics
Member

manics commented Nov 19, 2022

How do we handle communications of this change to the Jupyter and mybinder.org community? This may break some repos that don't pin a Python version if they rely on a package version that's not built for 3.10.

Long term do you have any thoughts on how we can make this process better? We're going to run into the same problem of breaking repos with future bumps, or when we bump the base image from Ubuntu 18.04 to a newer version.

@betatim
Member

betatim commented Nov 21, 2022

Long term do you have any thoughts on how we can make this process better?

There is an issue (somewhere :-/) about the fact that for real reproducibility people need to have the triple: repo link, revision of the code and revision of repo2docker. We discussed adding a feature to repo2docker that would allow it to fetch a version of itself and use that instead of what the user installed when building the container (something like fetch the container image for that revision of r2d).

In the past we have broken the "reproducibility promise" a few times already. For example when we switched to jupyter lab as default and some other, rare, cases.

In general, I think because the universe keeps evolving it is already likely that (very) old revisions of a repo will not build or lead to a container that is quite different. Despite this I think we should try hard not to add to this source of entropy by going wild with breaking changes in r2d.

@manics
Member

manics commented Nov 21, 2022

I've created a new issue for the big picture of long-term support #1220

@minrk
Member

minrk commented Nov 21, 2022

I left a more detailed comment in #1220, but I think it is appropriate to bump the default version at least when it is old enough that it's starting to cause failures.

We definitely do need to keep updating the default version. I'm not sure what the best strategy would be among:

  • stick with oldest reasonably-supported (will still likely result in annual bumps to Python latest - .3)
  • leapfrog to latest-1 whenever our default starts getting crusty (e.g. 3.7-3.10 now, 3.10-3.13 in 2025, etc.). This would mean less frequent bumps, which may be desirable for our users?
  • stay up-to-date with annual bumps, trailing Python itself by some months (e.g. 3.10 this year, 3.11 next year, etc.)
  • match some other distro, like Ubuntu LTS (update every 2 years, currently 3.10 in 22.04)

or some other strategy.
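
To make the first three options concrete, here is a minimal sketch that computes the next default minor version under each one. The function name, the strategy labels, and the `max_lag=3` threshold are assumptions made for illustration; none of this is repo2docker code, and the Ubuntu LTS option is left out because it depends on an external release calendar.

```python
def next_default(strategy: str, current: int, latest: int, max_lag: int = 3) -> int:
    """Return the next default Python 3 minor version.

    `current` and `latest` are minor version numbers (e.g. 7 for 3.7).
    `max_lag` is a hypothetical threshold for "oldest reasonably supported".
    """
    if strategy == "oldest-supported":
        # Trail the latest release by max_lag minors, never moving backwards.
        return max(current, latest - max_lag)
    if strategy == "leapfrog":
        # Stay put until the default is "crusty", then jump to latest - 1.
        return latest - 1 if latest - current > max_lag else current
    if strategy == "annual":
        # Always sit one minor version behind the latest release.
        return max(current, latest - 1)
    raise ValueError(f"unknown strategy: {strategy}")


# Late 2022: the default is 3.7 and the latest release is 3.11.
for name in ("oldest-supported", "leapfrog", "annual"):
    print(name, "-> 3.%d" % next_default(name, current=7, latest=11))
```

With these assumed thresholds, the leapfrog option reproduces the 3.7 to 3.10 jump now and a 3.10 to 3.13 jump once 3.14 is out, while the annual option moves every year.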

@betatim
Member

betatim commented Nov 21, 2022

I'm not sure which of the proposed options to choose/vote for :-/ While thinking about it I ended up thinking that we should try to make it a semi regular thing (changing default Python version). My reasoning was that we can either make it extremely rare, like every five years, or make it fairly regular. The weird/danger zone is the middle ground.

If a change like this is extremely rare, it is "by construction" a big bang event. This requires docs, patience, giving people enough prior warning, etc. However because you "never" do it, that is Ok.

If it happens quite regularly, then people are used to it, expecting it, have a good "institutional memory" and maybe even some tooling to help.

In the middle ground it seems we end up with the worst of the two other extremes. It is a big deal that requires prior warning, patience, etc because people have forgotten that this is a thing. It happens frequently enough that it is annoying to have to spend so much effort on it.

The hard bit is, of course, figuring out what "quite regularly" means in terms of human time passing. Riding on the Python release schedule seems like a good thing to do. They seem to be pretty regular, well known, expected, etc. Maybe once a year is a good rhythm? Maybe being "slow", as in, sticking with "oldest reasonable version" is good for people in the reproducibility business?

Though maybe, assuming that more repos receive r2d support now than in the past, it means the time until these repos hit the "oops, we didn't specify a Python version but we should have" moment is far in the future. Which means the original authors have left/moved on/forgotten about that repo? Maybe I'm thinking too much without actual data :-/

@minrk
Member

minrk commented Nov 21, 2022

Given Python's rapid annual release cadence, I think it's perhaps most logical for us to match (maybe lag by exactly one version, e.g. when Python releases 3.12, we bump default to 3.11 and do this every Fall?).

I think it's also appropriate to put pressure on repos, via warnings and docs, to specify their Python version. Many repos that just need numpy, pandas, etc. shouldn't specify a Python version (and in that case, they shouldn't specify _any_ versions at all). But if you specify any version of anything, specifying the Python version is a good idea, too.
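
The rule of thumb above could be turned into a warning along these lines. This is only a sketch: `has_version_pins` and `needs_python_pin_warning` are hypothetical helpers, not existing repo2docker functions, and the pin detection is a simple regular expression over requirements-style lines rather than a real requirements parser.

```python
import re

# Matches any comparison operator used in pip-style version specifiers.
PIN = re.compile(r"(==|~=|!=|<=|>=|<|>)")


def has_version_pins(requirement_lines):
    """Return True if any non-comment line carries a version specifier."""
    for line in requirement_lines:
        spec = line.split("#", 1)[0].strip()  # drop trailing comments
        if spec and PIN.search(spec):
            return True
    return False


def needs_python_pin_warning(requirement_lines, python_pinned):
    """Warn only when something is pinned but Python itself is not."""
    return has_version_pins(requirement_lines) and not python_pinned


print(needs_python_pin_warning(["numpy==1.16", "pandas"], python_pinned=False))
print(needs_python_pin_warning(["numpy", "pandas"], python_pinned=False))
```

A repo with no pins at all stays silent, which matches the point that unpinned repos should simply follow whatever the default is.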

@manics
Member

manics commented Jan 10, 2023

@manics
Member

manics commented Jan 22, 2023

It's been two weeks and no one's responded on the Discourse post, so I think we should go ahead with this.

@minrk
Member

minrk commented Feb 3, 2023

Merged from main and re-ran freeze

@jtpio
Contributor

jtpio commented Feb 6, 2023

This will also help with running JupyterLab 4.0 on Binder, since Lab 4 will require Python 3.8+.

Member

@manics manics left a comment


There's a py310-requirements-file test in
https://github.com/jupyterhub/repo2docker/tree/738a56dcd5b5169b14ed5b535514cf4cf3fb8e63/tests/conda/
that's intended to test the newest version of Python supported. With this PR it's redundant, but it should be updated when Python 3.11 is added (#1239).

When we merge this to mybinder.org I think we should post an announcement on Discourse (or we could even consider announcing it in advance?).

@yuvipanda
Collaborator Author

What do we do about the failing external test? Update the pins?

@manics
Member

manics commented Feb 6, 2023

Good spot, I thought it was a transient error but it looks real 😞
Solving the environment requires Python 3.9:

$ podman run -it --rm docker.io/condaforge/mambaforge
# python --version
Python 3.10.8

# wget https://raw.githubusercontent.com/jupyter-xeus/xeus-cling/0.6.0/environment.yml
...

# mamba env create -n test -f environment.yml 
...

# mamba list -n test

# packages in environment at /opt/conda/envs/test:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
argon2-cffi               20.1.0           py39hbd71b63_2    conda-forge
asttokens                 2.2.1              pyhd8ed1ab_0    conda-forge
attrs                     22.2.0             pyh71513ae_0    conda-forge
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                pyhd8ed1ab_3    conda-forge
backports.functools_lru_cache 1.6.4              pyhd8ed1ab_0    conda-forge
beautifulsoup4            4.11.2             pyha770c72_0    conda-forge
binutils_impl_linux-64    2.33.1               h53a641e_8    conda-forge
binutils_linux-64         2.33.1              h9595d00_17    conda-forge
bleach                    6.0.0              pyhd8ed1ab_0    conda-forge
ca-certificates           2022.12.7            ha878542_0    conda-forge
cffi                      1.14.4           py39he88106c_0    conda-forge
clang_variant             1.0               cling_6.14.06    conda-forge
clangdev                  5.0.0             h935a590_1004    conda-forge
cling                     0.5               he860b03_1007    conda-forge
comm                      0.1.2              pyhd8ed1ab_0    conda-forge
cppzmq                    4.3.0             hc9558a2_1001    conda-forge
debugpy                   1.2.0            py39h41458e0_0    conda-forge
decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
entrypoints               0.4                pyhd8ed1ab_0    conda-forge
executing                 1.2.0              pyhd8ed1ab_0    conda-forge
gcc_impl_linux-64         7.3.0                hd420e75_5    conda-forge
gcc_linux-64              7.3.0               h553295d_17    conda-forge
gxx_impl_linux-64         7.3.0                hdf63c60_5    conda-forge
gxx_linux-64              7.3.0               h553295d_17    conda-forge
importlib-metadata        6.0.0              pyha770c72_0    conda-forge
importlib_resources       5.10.2             pyhd8ed1ab_0    conda-forge
ipykernel                 6.20.2             pyh210e3f2_0    conda-forge
ipython                   8.9.0              pyh41d4057_0    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
jedi                      0.18.2             pyhd8ed1ab_0    conda-forge
jinja2                    2.11.3             pyhd8ed1ab_2    conda-forge
jsonschema                4.17.3             pyhd8ed1ab_0    conda-forge
jupyter_client            7.1.2              pyhd8ed1ab_0    conda-forge
jupyter_core              5.2.0            py39hf3d152e_0    conda-forge
jupyterlab_pygments       0.2.2              pyhd8ed1ab_0    conda-forge
ld_impl_linux-64          2.33.1               h53a641e_8    conda-forge
libblas                   3.9.0                8_openblas    conda-forge
libcblas                  3.9.0                8_openblas    conda-forge
libffi                    3.2.1             he1b5a44_1007    conda-forge
libgcc-ng                 9.2.0                h24d8f2e_2    conda-forge
libgfortran-ng            7.5.0               h14aa051_20    conda-forge
libgfortran4              7.5.0               h14aa051_20    conda-forge
libgomp                   9.2.0                h24d8f2e_2    conda-forge
liblapack                 3.9.0                8_openblas    conda-forge
libopenblas               0.3.12          pthreads_hb3c22a3_1    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libstdcxx-ng              9.2.0                hdf63c60_2    conda-forge
libuuid                   2.32.1            h14c3975_1000    conda-forge
libzlib                   1.2.11            h36c2ea0_1013    conda-forge
markupsafe                1.1.1            py39h38d8fee_2    conda-forge
matplotlib-inline         0.1.6              pyhd8ed1ab_0    conda-forge
mistune                   0.8.4           pyh1a96a4e_1006    conda-forge
nbclient                  0.5.13             pyhd8ed1ab_0    conda-forge
nbconvert                 6.4.4            py39hf3d152e_0    conda-forge
nbformat                  5.7.3              pyhd8ed1ab_0    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
nest-asyncio              1.5.6              pyhd8ed1ab_0    conda-forge
nlohmann_json             3.6.1                he1b5a44_0    conda-forge
notebook                  6.4.12             pyha770c72_0    conda-forge
openssl                   1.1.1h               h516909a_0    conda-forge
packaging                 23.0               pyhd8ed1ab_0    conda-forge
pandoc                    2.19.2               ha770c72_0    conda-forge
pandocfilters             1.5.0              pyhd8ed1ab_0    conda-forge
parso                     0.8.3              pyhd8ed1ab_0    conda-forge
pexpect                   4.8.0              pyh1a96a4e_2    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pip                       23.0               pyhd8ed1ab_0    conda-forge
pkgutil-resolve-name      1.3.10             pyhd8ed1ab_0    conda-forge
platformdirs              2.6.2              pyhd8ed1ab_0    conda-forge
prometheus_client         0.16.0             pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.36             pyha770c72_0    conda-forge
psutil                    5.7.3            py39h38d8fee_0    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pure_eval                 0.2.2              pyhd8ed1ab_0    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pygments                  2.14.0             pyhd8ed1ab_0    conda-forge
pyrsistent                0.17.3           py39hbd71b63_1    conda-forge
python                    3.9.0           h2a148a8_4_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python-fastjsonschema     2.16.2             pyhd8ed1ab_0    conda-forge
python_abi                3.9                      3_cp39    conda-forge
pyzmq                     20.0.0           py39h25affbc_1    conda-forge
readline                  8.0                  he28a2e2_2    conda-forge
send2trash                1.8.0              pyhd8ed1ab_0    conda-forge
setuptools                67.1.0             pyhd8ed1ab_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
soupsieve                 2.3.2.post1        pyhd8ed1ab_0    conda-forge
sqlite                    3.33.0               h4cf870e_1    conda-forge
stack_data                0.6.2              pyhd8ed1ab_0    conda-forge
terminado                 0.17.1             pyh41d4057_0    conda-forge
testpath                  0.6.0              pyhd8ed1ab_0    conda-forge
tk                        8.6.11               h21135ba_0    conda-forge
tornado                   6.1              py39hbd71b63_0    conda-forge
traitlets                 5.9.0              pyhd8ed1ab_0    conda-forge
typing-extensions         4.4.0                hd8ed1ab_0    conda-forge
typing_extensions         4.4.0              pyha770c72_0    conda-forge
tzdata                    2022g                h191b570_0    conda-forge
wcwidth                   0.2.6              pyhd8ed1ab_0    conda-forge
webencodings              0.5.1                      py_1    conda-forge
wheel                     0.38.4             pyhd8ed1ab_0    conda-forge
xeus                      0.20.0               h4d8c418_1    conda-forge
xeus-cling                0.6.0                he513fc3_1    conda-forge
xtensor                   0.20.8               hc9558a2_0    conda-forge
xtensor-blas              0.16.1               h776b511_0    conda-forge
xtl                       0.6.21               h0efe328_1    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zeromq                    4.3.3                h58526e2_3    conda-forge
zipp                      3.12.1             pyhd8ed1ab_0    conda-forge
zlib                      1.2.11            h36c2ea0_1013    conda-forge

The repository doesn't specify the required version of Python. However, you could argue that since it pins some of the packages, the repository wants "any version of Python that enables the requested package versions to be installed". This wouldn't work for pip, but it works for conda because python is just another package whose dependencies can be taken into account.
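
As a toy illustration of "python is just another package", the sketch below intersects the Python versions for which each pinned package has a build. The package names and the `BUILDS` table are invented for the example, and a real conda/mamba solver does far more than this set intersection.

```python
# Hypothetical table: (package, version) -> Python versions it was built for.
BUILDS = {
    ("pkg-a", "1.0"): {"3.8", "3.9"},
    ("pkg-b", "2.3"): {"3.9", "3.10"},
    ("pkg-c", "0.5"): {"3.7", "3.8", "3.9", "3.10"},
}


def compatible_pythons(pins, builds, candidates):
    """Return the candidate Python versions that satisfy every pin."""
    allowed = set(candidates)
    for name, version in pins.items():
        # A pin with no known builds leaves nothing installable.
        allowed &= builds.get((name, version), set())
    # Sort numerically so "3.10" comes after "3.9".
    return sorted(allowed, key=lambda v: tuple(int(p) for p in v.split(".")))


pins = {"pkg-a": "1.0", "pkg-b": "2.3"}
print(compatible_pythons(pins, BUILDS, ["3.7", "3.8", "3.9", "3.10"]))
```

In this made-up case only 3.9 satisfies both pins, so a default of 3.10 would have to be downgraded for the environment to solve, which is the situation the failing test ran into.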

If we want to support this we probably need
#1220 (comment)

I still think it would be appropriate for us to add a repo's last_modified_date as an input, and use that to pick the default Python. I think it would dramatically improve our success rate for reproducible envs by default, based on sampling data from our study a couple years ago, but that's slightly tangential to the current discussion.
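
A minimal sketch of what date-based selection might look like follows. The release dates are those of the CPython x.y.0 releases; the function name, the one-year lag, and the fallback value are assumptions chosen for illustration, not something proposed in this thread.

```python
from datetime import date, timedelta

# Release dates of the x.y.0 CPython releases.
PYTHON_RELEASES = {
    "3.7": date(2018, 6, 27),
    "3.8": date(2019, 10, 14),
    "3.9": date(2020, 10, 5),
    "3.10": date(2021, 10, 4),
    "3.11": date(2022, 10, 24),
}


def default_python_for(last_modified, lag_days=365, fallback="3.7"):
    """Pick the newest Python released at least `lag_days` before the repo
    was last modified (hypothetical policy)."""
    cutoff = last_modified - timedelta(days=lag_days)
    eligible = [
        (released, version)
        for version, released in PYTHON_RELEASES.items()
        if released <= cutoff
    ]
    if not eligible:
        return fallback
    # Release dates are unique, so max() picks the most recent release.
    return max(eligible)[1]


print(default_python_for(date(2023, 2, 17)))
print(default_python_for(date(2020, 6, 1)))
```

Under this assumed policy a repo untouched since mid-2020 would keep building on 3.7, while one modified in early 2023 would get 3.10.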

@minrk
Member

minrk commented Feb 8, 2023

Looking at the comment for the xeus-cling test (added in #373), it appears to be meant to verify that downgrades are allowed, but the failure suggests that the downgrade doesn't work. I think downgrading most packages is allowed, but Python itself is not allowed to be downgraded (by minor version number, at least). I'm not sure where this is enforced, whether it's something we do or something in conda/mamba. In any case, I believe the failing test is either correct (implicit downgrade of Python should work and doesn't), or no longer tests what we want it to (implicit downgrade of other packages while keeping the chosen Python, even if not chosen by the user, intact).

So either we should figure out how to unpin Python (I don't think this has ever worked), or find another repo which triggers a downgrade of something other than Python.

I think this will work if we copy the xeus-cling environment and add python: "3.9" to it.

I also just noticed while testing that this PR didn't actually change the default Python to 3.10 in some important cases! There are two ways to get at a 'default' python env - load the environment.lock file that has no Python version, or end up via the BuildPack.major_pythons dict to pick '3.10'. When using the default buildpack with no environment specified, major_pythons is used here. I think we can probably consolidate things. As a separate PR, I think we can probably remove the python_version="" case, to consolidate the different paths to the default version. (opened as #1243)
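
One way to picture the consolidation is a single resolver that every code path goes through. The sketch below is hypothetical: `DEFAULT_PYTHON`, `MAJOR_PYTHONS` and `resolve_python_version` are illustrative names rather than repo2docker's actual code, and the `"2": "2.7"` entry is an assumption.

```python
DEFAULT_PYTHON = "3.10"  # the only place the default would be written down

# Stand-in for the major_pythons mapping; the "2" entry is assumed.
MAJOR_PYTHONS = {"2": "2.7", "3": DEFAULT_PYTHON}


def resolve_python_version(requested=None):
    """Map a requested version ("", "3", "3.9", "3.9.16") to an x.y string."""
    if not requested:
        # No version given: use the default, with no separate lock-file path.
        return DEFAULT_PYTHON
    parts = requested.split(".")
    if len(parts) == 1:
        # Only a major version was given, e.g. "3".
        return MAJOR_PYTHONS[parts[0]]
    # Keep major.minor and ignore any patch component.
    return ".".join(parts[:2])


for spec in ("", "3", "3.9", "3.9.16"):
    print(repr(spec), "->", resolve_python_version(spec))
```

With one resolver, bumping the default would mean changing one constant, instead of keeping a version-less lock file and a dictionary entry in sync.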

this default lives too many places!
rather than building old xeus-cling, which requires a downgrade of Python from 3.10 to 3.9,
which is _not_ supported,
run the build with a 3.9 pin.

This still results in patch-level downgrade of Python, major downgrade of openssl, etc.
@minrk
Member

minrk commented Feb 8, 2023

All green now. I've elected to bring the xeus-cling test into our repo with an added python: 3.9. It still tests and verifies that downgrades are allowed, just not beyond x.y for Python itself, which I don't think has ever been allowed.

I also updated the python-3.7 test from binder-examples, since that particular tag actually lacked a requirements.txt, so it started to fail because it includes a numpy pin but not a Python pin (a known recipe for failure). The same behavior is tested with a new python-3.8 tag that actually specifies the Python version, so should be more stable.

The codecov errors are a flakiness issue in codecov itself, and not real. All tests are passing.

@minrk
Member

minrk commented Feb 15, 2023

This has approvals and I think it's ready to go, but since I've made noticeable changes since the approvals, I'd like at least one other person to give it a green light before merging.

@minrk
Member

minrk commented Feb 16, 2023

Re-synced after #1239. This now contains no re-freeze, only the change in the default.
