Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate wrappers for environment activation #3704

Closed
hjoliver opened this issue Jul 22, 2020 · 7 comments · Fixed by #4425
Closed

Investigate wrappers for environment activation #3704

hjoliver opened this issue Jul 22, 2020 · 7 comments · Fixed by #4425
Assignees
Milestone

Comments

@hjoliver
Copy link
Member

hjoliver commented Jul 22, 2020

Conda environment activation is reputedly quite slow.

conda run might address this in future, but doesn't yet? (and we can't use it in shared environments)

This small project makes some impressive claims, we should take a look: exec wrappers. (from an offline comment: (it might just be sourcing the python interpreter which won't work for all cases)?)

@hjoliver hjoliver added the question Flag this as a question for the next Cylc project meeting. label Jul 22, 2020
@hjoliver hjoliver added this to the some-day milestone Jul 22, 2020
@oliver-sanders
Copy link
Member

oliver-sanders commented Jul 22, 2020

The slow activate times for Conda really confuse me, what the heck is it doing?

Well according to the code:

  1. Set and unset environment variables
  2. Execute/source activate.d/deactivate.d scripts
  3. Update the command prompt

(1) should be quick, for (2) I cant see any activate.d/deactivate.d dirs in my 8.0a2 environment (they would have to be somewhere in the env dir right) and we don't need to update the command prompt for non-interactive commands.

So what's it actually doing?

$ env > before
$ conda activate cylc8
$ env > after
$ diff before after
--- before	2020-07-22 23:27:46.000000000 +0100
+++ after	2020-07-22 23:27:53.000000000 +0100
@@ -8,19 +8,22 @@
 GPG_TTY=/dev/ttys002
 PWD=/Users/user/miniconda3
 LOGNAME=user
+CONDA_PREFIX=/Users/user/miniconda3/envs/cylc8
 HOME=/Users/user
 LANG=en_GB.UTF-8
+CONDA_PROMPT_MODIFIER=(cylc8) 
 TMPDIR=/var/folders/rg/691l5s513_97d5s7ktr1rtt00000gn/T/
 TERM=xterm-256color
 _CE_CONDA=
 USER=user
-CONDA_SHLVL=0
+CONDA_SHLVL=1
 SHLVL=1
 HOME_HOST=myhost
 XPC_SERVICE_NAME=0
 CONDA_PYTHON_EXE=/Users/user/miniconda3/bin/python
-PS1=\[\][\W] $\[\] 
-PATH=/Users/user/miniconda3/condabin:/usr/local/opt/gnu-sed/libexec/gnubin:/usr/local/opt/coreutils/libexec/gnubin:/Users/user/bin:/Users/user/.user/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
+PS1=(cylc8) \[\][\W] $\[\] 
+CONDA_DEFAULT_ENV=cylc8
+PATH=/Users/user/miniconda3/envs/cylc8/bin:/Users/user/miniconda3/condabin:/usr/local/opt/gnu-sed/libexec/gnubin:/usr/local/opt/coreutils/libexec/gnubin:/Users/user/bin:/Users/user/.user/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
+OLDPWD=/Users/user/miniconda3/condabin
 TERM_PROGRAM=Apple_Terminal
 _=/usr/local/opt/coreutils/libexec/gnubin/env
-OLDPWD=/Users/user/miniconda3/condabin

Well wan't that a lot of ado about nothing! One would naturally assume you could cache this simple result?

$ diff before after | grep '^+' --color=never| grep -v 'PS1' --color=never | sed 's/^+//' | tail -n +2 > envfile

I had a quick crack at using this envfile rather than conda activate, it seems to work just fine, suites run ok, even tried a few tests.

Would have to delve a little deeper into conda activate / conda run in order to confirm that this isn't cutting out valuable logic, it does seem too simple, if it were this easy why wouldn't Conda update a file like this with each conda install?

I think this kind of approach would break is the ability to intelligently unset environment variables (which I think Conda does). Then there is the small matter of what happens if you call the script from an environment in which you have already activated a Conda environment?

@hjoliver hjoliver removed the question Flag this as a question for the next Cylc project meeting. label Aug 4, 2020
@oliver-sanders
Copy link
Member

Note: I've come back to this problem from the Rose side as Rose Apps currently execute the configured script in Conda environments - metomi/rose#2464. Updating this issue as I can't see any reason not to do the same for Cylc as Rose.

Summary of the problem

There are a few problems with conda activate for our usage:

  • It's slow
    • It results in many many filesystem operations which each have the potential to lag.
    • On network and parallel filesystems filesystem lag can become significant and random.
    • Our command is called many, many times compounding this problem.
  • Reverting the environment requires running a command.
    • We need to "get back to" the system environment so we can run user defined commands in the system environment as documented.
    • We would have to call conda deactivate to achieve this, it's not as simple as just popping an item from the PATH.
  • Subsequent activations don't work as expected.
    • Try conda acitvate x; conda activate y.
    • I keep hitting this issue with slurm jobs because it copies the submission env over the job env tricking Conda into thinking that an environment is already activated.

Conda Pack

The wonderfully useful conda-pack cuts out the conda command alltogether and provides its own activate script.

This activate script shows how minimal a functional conda activate script can be:

https://github.com/conda/conda-pack/blob/ecede405fa4554d3b986256c0456959919039ea0/conda_pack/scripts/posix/activate

This script:

  1. Changes PS1
  2. Changes PATH.
  3. Runs activate scripts.
  4. Sets CONDA_* environment variables (used in deactivation or subsequent activations).

Activate scripts

Any Conda package could bundle an activation script, however, few actually do, perhaps because conda-build actively discourages the use of activation scripts because:

people do not always activate environments the expected way and these packages may then misbehave.

-- https://docs.conda.io/projects/conda-build/en/latest/resources/activate-scripts.html#activate-scripts

Currently we don't have any dependencies which provide activation scripts. Also I would expect activation scripts on the whole are more likely to be used for configuring shells for interactive use?

Conclusion

Of the four items above we only actually need (2), consequently, because our Python entry points get created with the appropriate shebang we should be able to safely call our script using its absolute path without having to worry about environment activation.

Proposal

  1. Ditch conda activate, call the executable via its absolute path in the wrapper script.
  2. Add a GH action which runs whenever the conda-environment.yml is edited, which checks that none of our deps use activate scripts.
  3. Reject any Conda dependencies which require activate scripts.

@dpmatthews dpmatthews modified the milestones: some-day, cylc-8.0.0 Aug 12, 2021
@hjoliver
Copy link
Member Author

Sounds reasonable (assuming you're right about conda activate!)

@kinow
Copy link
Member

kinow commented Aug 17, 2021

Maybe worth for testing; will it work if the user calls conda activate anyway to his/her own environment? I think a few weeks ago we had a user with an error because he had the base env and cylc env? Might be worth testing that too.

@oliver-sanders
Copy link
Member

Maybe worth for testing; will it work if the user calls conda activate anyway to his/her own environment?

This is an annoying issue with activation (I hit it all the time) and a bug in the current wrapper script. If you already have an environment activated, conda activate <other-env> might not work as expected. The safe option is conda deactivate; conda activate <other-env> but that has downsides.

If we skip activation the environments won't clash in this way (we call the Cylc executable with an absolute path, that script has a shebang to Python with an absolute path, Python loads the site-packages based on that path).

@hjoliver
Copy link
Member Author

(Now: ditch activate).

@hjoliver hjoliver modified the milestones: cylc-8.0rc1, cylc-8.0b3 Sep 22, 2021
@hjoliver
Copy link
Member Author

Optimistically b3, if we can, otherwise bump back. (Good to get tested as soon as possible).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants