Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly inherit/don't inherit the shell environment #55194

Closed

Conversation

segevfiner
Copy link
Contributor

@segevfiner segevfiner commented Jul 26, 2018

VSCODE_CLI was checked from the main process, which had it set depending on if the first launch of VS Code was launched from cli or not, instead of whether the current launch has it set.

So I changed the check to checking the userEnv passed to the new browser window.

Note that the assignment of lazyEnv and userEnv into process.env was racing before! Which is a bug in and of itself. After this fix, the shellEnv/lazyEnv overrides the passed in userEnv, but that can be changed later if needed.

We are also always launching the shell now since configuration.userEnv is not available yet when lazyEnv is initialized. It will require a hack or some refactoring to fix, so I left that like this for now. Might want to open an issue for future reference.

Fixes #55144
Part of #15452

@kieferrm kieferrm requested a review from joaomoreno July 27, 2018 21:36
@joaomoreno joaomoreno added this to the Backlog milestone Aug 2, 2018
@joaomoreno joaomoreno added the workbench-os-integration Native OS integration issues label Aug 2, 2018
@Tyriar
Copy link
Member

Tyriar commented Aug 21, 2018

This was a deliberate decision AFAIK to only evaluate the environment on the main process once and then have all windows inherit it. I'm curious to hear @joaomoreno's perspective on this when he gets a chance.

@Tyriar Tyriar self-assigned this Aug 21, 2018
@segevfiner
Copy link
Contributor Author

segevfiner commented Aug 22, 2018

If I'm understanding things correctly, the behavior before this PR is as such:

first \ second CLI Desktop
CLI Inherited from main process and shell Inherited from main process and desktop without overriding from login shell
Desktop Inherited from main process and shell, overridden with login shell vars Inherited from main process and desktop, overriden with login shell vars

The column indicates the type of the first launch, while the row indicates the second. The cell indicates from where the second launch gets it's environment from.

This behavior doesn't look right to me. Why should the first launch matter for the decision of whether to use the login shell env hack? Only whether the current launch is from CLI or desktop should matter.

(I guess this code is for working around desktop environments that don't run the users login shell during their startup, many actually do run it, so you do get a proper environment even when launched from desktop in those.This means this code might end up resulting in an environment where you ran the users profile script twice. Once by the desktop environment on login and once by VS Code. Potentially leading to doubled up directories in PATH).

The behavior after this PR should be:

first \ second CLI Desktop
CLI Inherited from main process and shell Inherited from main process and desktop, overridden with login shell vars
Desktop Inherited from main process and shell, without overriding from login shell vars Inherited from main process and desktop, overriden with login shell vars

P.S. It's quite possible that the inheritance from the main process leaks unwanted environment vars into the window. For example, if the first launch is from a virtualenv, than VIRTUAL_ENV will be set. Than if the second launch is from outside a virtualenv, than the VIRTUAL_ENV variable from the main process will still be set in the second window since the environment passing only sets values but doesn't unset any that were unset. But that's a separate issue from the one in this PR.

@joaomoreno
Copy link
Member

Great analysis.

There's a large discussion over at #15452 about this. We have not decided whether we'd want to support this, because it will bring other issues.

Note that the assignment of lazyEnv and userEnv into process.env was racing before! Which is a bug in and of itself. After this fix, the shellEnv/lazyEnv overrides the passed in userEnv, but that can be changed later if needed.

Can you explain this better?

I guess this code is for working around desktop environments that don't run the users login shell during their startup, many actually do run it, so you do get a proper environment even when launched from desktop in those.

Most actually do, but they end up skipping essential files which developers depend on, namely .bash_profile or .bashrc, where people set up tools like nvm or Go.

This means this code might end up resulting in an environment where you ran the users profile script twice. [...] Potentially leading to doubled up directories in PATH.

Yup, this is a purposeful choice on our side. We prefer this than having everyone complain that they don't have Node in the PATH (nvm users).

Why should the first launch matter for the decision of whether to use the login shell env hack?

The second row of the table shows that this PR still doesn't fix this. It seems that overriding from login shell vars will depend on whether the user has started with the CLI or with the Desktop. Why shouldn't it override always?

@segevfiner
Copy link
Contributor Author

segevfiner commented Aug 22, 2018

Again... Now with markdown rendered...

Great analysis.

There's a large discussion over at #15452 about this. We have
not decided whether we'd want to support this, because it will bring other
issues.

Don't you already try to, partially, support this? Isn't that what the login shell hack does? I think it's just a matter of getting the behavior right...

Note that the assignment of lazyEnv and userEnv into process.env was
racing before! Which is a bug in and of itself. After this fix, the
shellEnv/lazyEnv overrides the passed in userEnv, but that can be changed
later if needed.

Can you explain this better?

Who will complete first: The assignment of userEnv from the URL query string configuration in main, or the assignment of lazyEnv from the promise callback?

I don't think there is any guarantee that one will complete before the other (Barring any: We don't sleep anywhere before using userEnv and node doesn't preempt excuse 😜), and since process.env is a global. That's a race condition.

I guess this code is for working around desktop environments that don't
run the users login shell during their startup, many actually do run it, so
you do get a proper environment even when launched from desktop in those.

Most actually do, but they end up skipping essential files which
developers depend on, namely .bash_profile or .bashrc, where people set
up tools like nvm or Go.

If they skip the .bash_profile/.profile/.zprofile than they didn't run the users login during startup, so you will get an empty environment making this hack necessary.

There is the idiosyncrasy of Bash not sourcing .bashrc from interactive login shells, but that's up to the user sourcing it from his .bash_profile/.profile (Done by default on most distros). VS Code shouldn't try working around that one.

In Zsh, you likely end up without .zshrc sourced when you run from a desktop shortcut. Zsh does source .zshrc in an interactive login shell and doesn't source it from a non-interactive one. Since the shell the desktop runs during login is supposed to be non-interactive, it likely doesn't get sourced.

You are supposed to put PATH modifications that should affect desktop apps or non interactive programs in .zprofile, but since this is quite subtle, many things likely get this wrong, such as nvm:
https://github.com/creationix/nvm/blob/9854928ba9ff6f24360207fa90135574746838f5/install.sh#L235.
To correctly work around this, you want to run an interactive non-login shell to get the environment when running from a desktop shotcut (but not from a shell) in an environment that did source .zprofile. Whether actually working around that is the right thing to do or requiring users/software to correctly use their shell startup scripts is debatable.

This means this code might end up resulting in an environment where you
ran the users profile script twice. [...] Potentially leading to doubled up
directories in PATH.

Yup, this is a purposeful choice on our side. We prefer this than having
everyone complain that they don't have Node in the PATH (nvm users).

Having an empty env is obviously surprising and bad. But running the login script twice can also lead to issues.

My understanding is that the profile scripts can do non-idempotent things such as starting an ssh-agent or other services. This being the whole point of why there is both a profile script and an rc script. That's how you would auto start things in an SSH session for example. There is even a logout script for this purpose.

There are often PATH appends in the OS default login scripts that are not-idempotent and will get duplicated.

You can see what Atom does https://github.com/atom/atom/blob/5b485dcb4b48386d9480ab45dc389cbcaad3d454/src/update-process-env.js. But I'm not sure even it handles all cases correctly. It probably doesn't
handle the case of a .zshrc that didn't get sourced during login that I mentioned above, for example.

Why should the first launch matter for the decision of whether to use the
login shell env hack?

The second row of the table shows that this PR still doesn't fix this. It
seems that overriding from login shell vars will depend on whether the
user has started with the CLI or with the Desktop. Why shouldn't it
override always?

The whole point is not to clobber the environment if the user runs VS Code from shell using the code command. Otherwise if the user uses something like virtualenv or nvm, which you might be more familiar with, the environment they set in the shell will get clobbered. The tables are kinda hard to read, maybe these pivoted ones are easier:

Before:

first second result
Desktop Desktop Inherited from main process and desktop, overriden with login shell vars
Dekstop CLI Inherited from main process and shell, overridden with login shell vars
CLI CLI Inherited from main process and shell, without overriding from login shell
CLI Desktop Inherited from main process and desktop without overriding from login shell

After:

first second result
Desktop Desktop Inherited from main process and desktop, overriden with login shell vars
Dekstop CLI Inherited from main process and shell, without overriding from login shell vars
CLI CLI Inherited from main process and shell, without overriding from login shell
CLI Desktop Inherited from main process and desktop, overridden with login shell vars

@microsoft microsoft deleted a comment from segevfiner Aug 23, 2018
@Tyriar Tyriar removed their assignment Sep 13, 2018
@segevfiner
Copy link
Contributor Author

Ouch, src/vs/workbench/electron-browser/bootstrap/index.js got refactored...

@joaomoreno
Copy link
Member

Sorry for that @segevfiner

`VSCODE_CLI` was checked from the main process, which had it set
depending on if the *first* launch of VS Code was launched from cli or
not, instead of whether the *current* launch has it set.

So I changed the check to checking the userEnv passed to the new browser
window.

Note that the assignment of `lazyEnv` and `userEnv` into `process.env`
was racing before! Which is a bug in and of itself. After this fix, the
`shellEnv`/`lazyEnv` overrides the passed in `userEnv`, but that can be
changed later if needed.

We are also always launching the shell now since `configuration.userEnv`
is not available yet when `lazyEnv` is initialized. It will require a
hack or some refactoring to fix, so I left that like this for now. Might
want to open an issue for future reference.

Fixes microsoft#55144
@ysndr
Copy link

ysndr commented Jan 17, 2019

P.S. It's quite possible that the inheritance from the main process leaks unwanted environment vars into the window. For example, if the first launch is from a virtualenv, than VIRTUAL_ENV will be set. Than if the second launch is from outside a virtualenv, than the VIRTUAL_ENV variable from the main process will still be set in the second window since the environment passing only sets values but doesn't unset any that were unset. But that's a separate issue from the one in this PR.

What would be needed to achieve this behaviour?
My problem is that I might have multiple instances of vscode open where each is launched from terminals with different environments. So this is a quite important feature, at least for me..

@segevfiner
Copy link
Contributor Author

@ysndr Changing the logic so as to also remove env vars that don't exist in the environment from the cli or retrieved from shell except those required by VS Code (e.g. VSCODE_HOME and friends). Or otherwise fully base the environment on the environment from the shell and only bring in the env vars required by VS Code from the main process.

I planned on also fixing that on a separate PR, but sadly this PR has been hanging here for a long time...

@joaomoreno
Copy link
Member

@segevfiner I did look into this once more recently... but didn't get the courage to jump into it fully. I'll try to get to it asap.

@segevfiner
Copy link
Contributor Author

😢

@joaomoreno
Copy link
Member

joaomoreno commented Apr 22, 2020

I know... apologies @segevfiner. I did get to it eventually. And again later. But every time I get back to it, I simply can't pull the trigger. I feel this is a big change which will have a long tail of issues of users complaining that something is changed/broken... And right now there's only <10 users talking about this and even tolerating the current behavior. I am definitely not comfortable moving this forward.

I'd like to bring in @bpasero @Tyriar and @deepak1556, do you have any opinions on this?

@bpasero
Copy link
Member

bpasero commented Apr 22, 2020

I don't feel comfortable to make statement because I am no user of the shell environment whatsoever in any of my code. The main users are probably:

  • debug
  • tasks
  • terminal
  • extension host & extensions?

So I would drive this change with their input and sign off.

Btw with the advent of sandbox coming to the renderer we should be careful of not making the solution too node specific. I did a quick test and it seems you can still access process.env in a sandboxed environment from a preload script, so we may be safe to continue using this (ipc to main side is also fine).

@Tyriar
Copy link
Member

Tyriar commented Apr 22, 2020

Making vscode act like most other apps would be good, I have to spell out when people are having issues with the terminal to make sure that they close all the windows and try opening again, this is because it's unexpected.

Just to make sure I understand things this PR does:

  • When a window is launched via the CLI, the window will inherit the shell's environment (not the main process' environment).
  • When a window is launched via the desktop, the window will inherit the main process' environment (getUnixShellEnvironment). This environment is sourced regardless of how the first window was created such that it doesn't matter how the first window was launched, all desktop launched environments use the environment from getUnixShellEnvironment.

If that's how it works it seems good to me and shouldn't affect the terminal negatively. I'd expect tasks to be fine too since it's built on the terminal.

@segevfiner
Copy link
Contributor Author

@Tyriar Yeah should be like that. But does have one caveat in that env vars that are unset in the shell or the main process environment (getUnixShellEnvironment) but are set in the main VS Code daemon process, probably remain set (This change is subtle enough, which makes it hard to review, without me trying to change that) .

@segevfiner
Copy link
Contributor Author

segevfiner commented Apr 22, 2020

Reviewing my PR again. I noticed that getDefaultTerminalLinuxReady is awaiting on process.lazyEnv: externalTerminalService.ts:313, and then checking process.env. With this change it will now race with the new location of setting process.env which happens after the promise is resolved.

Not currently sure what's the best way to fix this. Maybe a new promise for it to await instead? That is, a promise that is resolved once process.env is correctly set. To do so I can technically rename getLazyEnv to getShellEnv, and just save it into a const, since only that the startup code needs it, and replace process.lazyEnv with a Promise<void> or Promise<typeof process.env> that is resolved once process.env is set correctly, though not sure how to get the promise to resolve from the startup code. What do you think?

I'm trying to implement that change anyhow to be sure.

@segevfiner
Copy link
Contributor Author

segevfiner commented Apr 22, 2020

Changed the code to preserve the previous behavior of process.lazyEnv.

Now the only remaining caveat is what I mentioned about it possibly keeping variables from the parent process that are unset in both the CLI and shell environment. Fixing that will probably mean fully cloberring process.env with the userEnv and shellEnv, possibly preserving any variables from the original process.env that should be kept.

EDIT: So looks like userEnv is sanitized and is not a complete environment, and of course process.env contains a lot of env vars that are likely important, such as DE ones in Linux that should be kept even when using shellEnv, so a fix for this last bit will require more thought.

Just in case it wasn't clear. I believe this PR is ready. This last issue can be tackled separately.

I think I can also change the code not to call getShellEnv at all if not needed now, moving the condition upwards and assigning a resolved promise if the call is not needed instead.

@bpasero
Copy link
Member

bpasero commented Oct 12, 2020

@segevfiner can you resolve the merge conflicts

@bpasero bpasero added the info-needed Issue requires more information from poster label Oct 12, 2020
@segevfiner
Copy link
Contributor Author

This diverged so much within the year that it is pretty much doing this from scratch at this point...

@bpasero
Copy link
Member

bpasero commented Oct 12, 2020

Then I suggest to close this PR and maybe you can put the suggested solution as a comment into the issue?

@segevfiner
Copy link
Contributor Author

I will try to redo it first on this branch first. But there might be caveats due to the electron browser<->node separation.

@segevfiner
Copy link
Contributor Author

I tried to redo this change. But note that I did so on Windows while this fix is really for Linux and Mac, meaning this still needs to be tested that it even still works. I'm also not sure what I did in the new workbench.js files is even legal with whatever is going on there (No idea what is electron-sandbox) or working, and we might want to rename and change the documentation of whenEnvResolved.

@bpasero bpasero removed the info-needed Issue requires more information from poster label Oct 13, 2020
@bpasero
Copy link
Member

bpasero commented Oct 13, 2020

@segevfiner thanks. you seem to indicate there is a race condition even in todays code, can you clarify?

Unfortunately the assignment to process.env has to stay in preload.js because once we enable sandbox and contextIsolation you cannot change process.env outside of preload.js.

@segevfiner
Copy link
Contributor Author

@segevfiner thanks. you seem to indicate there is a race condition even in todays code, can you clarify?

The previous code used to assign userEnv from the window configuration and shellEnv in two promises running in parallel. There was no guarantee at which order they would run. Not sure if the existing code still suffers from this issue, I'm not sure where userEnv is assigned to process.env now.

Unfortunately the assignment to process.env has to stay in preload.js because once we enable sandbox and contextIsolation you cannot change process.env outside of preload.js.

And this introduces a problem, for us to know if we should assign it, we need the configuration.userEnv, but that is only available later and is not supplied to that promise so it can make the decision. The promise that gets the configuration and the shell environment ones run in parallel. How should this be tackled? (P.S. If there is an acceptable way to pass it in, or do the check before it, we can also avoid sending the message to get the shell env at all if not needed)

Just reiterating so we are on the same page, the problem that this tries to fix is that VS Code is trying to check the wrong environment for VSCODE_CLI and VSCODE_FORCE_USER_ENV. It checks the parent process environment, rather than the environment of the new window, which makes it make the wrong decision based on the environment of the first opened window.

@bpasero
Copy link
Member

bpasero commented Oct 13, 2020

@segevfiner thanks for finding that, I can confirm it looks suspicious. I have opened #108607 to look into it. This will also require that preload scripts get access to the configuration, or a subset. Once I have tackled that problem, we should revisit this PR too.

Just reiterating so we are on the same page, the problem that this tries to fix is that VS Code is trying to check the wrong environment for VSCODE_CLI and VSCODE_FORCE_USER_ENV.

Understood. One impact your change has is that we will now always resolve the shell environment on startup, even if you start from the command line. This may have a bad impact on startup perf because we need to spawn another process. A more elegant solution imho would be to really only trigger this dance for the first window that needs it. But I cannot really see a good solution for this at the moment given we resolve the shell environment once and then store the promise.

@bpasero bpasero modified the milestones: October 2020, On Deck Oct 13, 2020
@chaosjak

This comment has been minimized.

@segevfiner
Copy link
Contributor Author

If the preload script will have access to the configuration and we move the VSCODE_CLI condition there, it can skip sending the shell environment messages altogether, saving us from spawning the shell. Caching the information can be problematic if the user later changes his shell startup scripts and expects the changes to apply.

@bpasero
Copy link
Member

bpasero commented Oct 14, 2020

@segevfiner sorry, how can we prevent spawning a shell? The idea of this was to get the shell's environment variables when you start VSCode from e.g. the dock. So the basic functionality needs to stay like that, or am I missing something?

@bpasero
Copy link
Member

bpasero commented Oct 14, 2020

Btw I just checked the code again and do not see a race condition, because the code Object.assign(safeProcess.env, configuration.userEnv) executes sync from the bootstrap-window.js while the other one needs a IPC roundtrip to the main side.

@segevfiner
Copy link
Contributor Author

segevfiner commented Oct 14, 2020

@segevfiner sorry, how can we prevent spawning a shell? The idea of this was to get the shell's environment variables when you start VSCode from e.g. the dock. So the basic functionality needs to stay like that, or am I missing something?

I'm talking about maintaining the previous behavior of only spawning the shell when we actually need the shell environment, not about not spawning it entirely. I had to run make it spawn the shell also when we don't need it due to the configuration not being available yet when we need to decide to run it or not due to the way the current code is structured.

Btw I just checked the code again and do not see a race condition, because the code Object.assign(safeProcess.env, configuration.userEnv) executes sync from the bootstrap-window.js while the other one needs a IPC roundtrip to the main side.

But it looks like the order is now that the shell env is applied after the user env, do we really sure we want the shell env to clobber over the user env rather than the other way around? Considering that user env is also the environment captured when the user runs code from the terminal. (It also looks like the user env isn't set at all when the sandbox is enabled? I guess that's a task someone hasn't done yet... Might require adding the ability to set the environment of a browser window process to electron itself)

@bpasero
Copy link
Member

bpasero commented Oct 14, 2020

I'm talking about maintaining the previous behavior of only spawning the shell when we actually need the shell environment,
not about not spawning it entirely. I had to run make it spawn the shell also when we don't need it due to the configuration not
being available yet when we need to decide to run it or not due to the way the current code is structured.

@segevfiner keep in mind that we WANT to resolve the shell environment as early as possible even before opening a window because we do NOT WANT to slow down the opening of the window only to wait for the shell environment to be resolved. So we cannot move this to a later place, it has to start right on startup. And with your change we will do it ALWAYS, even when you run from the command line (as opposed to today where we do NOT spawn shell when you run from command line). I could not understand the last paragraph you wrote, sorry. I hope this clarifies things.

But it looks like the order is now that the shell env is applied after the user env, do we really sure we want the shell env to
clobber over the user env rather than the other way around? Considering that user env is also the environment captured when
the user runs code from the terminal. (It also looks like the user env isn't set at all when the sandbox is enabled? I guess that's a
task someone hasn't done yet... Might require adding the ability to set the environment of a browser window process to
electron itself)

I cannot speak to the intent of this as I am not the original author of this flow, I just try to preserve what we had before and explain that there is no race condition.

@segevfiner
Copy link
Contributor Author

segevfiner commented Oct 14, 2020

keep in mind that we WANT to resolve the shell environment as early as possible even before opening a window because we do NOT WANT to slow down the opening of the window only to wait for the shell environment to be resolved. So we cannot move this to a later place, it has to start right on startup. And with your change we will do it ALWAYS, even when you run from the command line (as opposed to today where we do NOT spawn shell when you run from command line). I could not understand the last paragraph you wrote, sorry. I hope this clarifies things.

Of course. If the configuration is available early enough in the window startup or at least what we need to determine this condition, then we can start it right away in parallel as is today, just with the condition inside the resolveEnv promise instead.

I cannot speak to the intent of this as I am not the original author of this flow, I just try to preserve what we had before and explain that there is no race condition.

Might need to rope in whoever is familiar with this, cause it doesn't feel right, should the shell env really be allowed to override env vars set by VS Code itself?

@bpasero
Copy link
Member

bpasero commented Nov 18, 2020

I am closing this in favour of #108804 which I plan to look at during this month. I also want to support and/or understand #15452 and #108571

Work happens in https://github.com/microsoft/vscode/tree/dev/execution-service

@bpasero bpasero closed this Nov 18, 2020
@bpasero
Copy link
Member

bpasero commented Nov 19, 2020

@segevfiner btw thanks for putting us on the right track for how to resolve these issues and sorry for the long time this took. You are right for example that there is a race condition where the user environment gets overwritten by the shell-probe-environment and we plan to fix all these issues. After discussion with @joaomoreno I documented our proposed solution in #108804 (comment)

@github-actions github-actions bot locked and limited conversation to collaborators Jan 5, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
workbench-os-integration Native OS integration issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

VS Code always uses getUnixShellEnvironment if the first launch is from the GUI
6 participants