Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor how env variables are loaded and passed into kernels and processes spawned #10359

Closed
DonJayamanne opened this issue Jun 8, 2022 · 3 comments
Assignees
Labels
debt Code quality issues
Milestone

Comments

@DonJayamanne
Copy link
Contributor

DonJayamanne commented Jun 8, 2022

We have a lot of places & different ways we load env variables from .env files & merges them with the activated Python envs & then we also get these env variables from the Python extension.

We also have a few issues where these env varaibles are getting cached when they shouldn't or we're not loading the updated .env files.
To get OCAML & a couple of other kernels working we need to load env variables form the shell and merge them again with the .env files.

Before we do that, I think we need to first fix up how env variables are parsed, loaded, merged, cached.

Required for :

Other possible issues:

@DonJayamanne DonJayamanne added bug Issue identified by VS Code Team member as probable bug debt Code quality issues and removed bug Issue identified by VS Code Team member as probable bug labels Jun 8, 2022
@github-actions github-actions bot added the triage-needed Issue needs to be triaged label Jun 8, 2022
@DonJayamanne DonJayamanne self-assigned this Jun 8, 2022
@rchiodo rchiodo removed the triage-needed Issue needs to be triaged label Jun 9, 2022
@greazer greazer added kernel-management bug Issue identified by VS Code Team member as probable bug and removed debt Code quality issues iteration-candidate labels Jul 7, 2022
@greazer greazer added this to the July 2022 milestone Jul 7, 2022
@DonJayamanne
Copy link
Contributor Author

Personal notes:

  • KernelEnvironmentVariablesService (one method to get kernel env vars)
  • IEnvironmentActivationService (one method to get activated Python env vars)
  • IEnvironmentVariablesService (helper, parsing .env file and merging vars)
  • IEnvironmentVariablesProvider (helper, event when env vars change and ability to get env vars and custom env vars)

PythonApi.getActivatedEnvironmentVariables

  • Does this take .env variables into account
  • Does this merge process.env into the return value
  • Is this cached?
  • If user updates .env file, will this be updated?
  • Is user updates shell env variables, will this be updated?

KernelEnvironmentVariablesService

  • getEnvironmentVariables <Top level>
    • Used in one place to get the environment variables of the kernel when starting raw kernel process.
    • This calls KernelEnvironmentVariablesService.getCustomEnvironmentVariables and IEnvironmentActivationService.getActivatedEnvironmentVariables and merges the two results, also with the process.env.
      TODO: Won't both already have been merged with process.env.
    • Special code for handling PYTHONPATH and PATH|Path variables.
    • Special code for PYTHONNOUSERSITE

EnvironmentVariablesProvider

  • getEnvironmentVariables <Top level>

    • Merges the env variables from .env into process.env and returns the merged result.
    • Special code for handling PYTHONPATH and PATH|Path variables.
    • Caches the results in memory and gets cleared when editing .env files.
    • Not cached across vscode sessions.
    • Used in:
      • JupyterPaths.node.ts (reading user env vars)
      • ProcessFactory.node.ts (use this when spawning new processes).
      • PreWarmVariables.node.ts (just calls this method to initialize cache for each workspace folder)
      • EnvironmentActivationService.node.ts <IEnvironmentActivationService>
  • getCustomEnvironmentVariables

    • Locates the .env file
    • Parses the .env file using IEnvironmentVariablesService.parseFile
    • Parsed info is returned
    • Used in:
      • KernelEnvironmentVariablesService.getEnvironmentVariables
      • EnvironmentActivationService.getActivatedEnvironmentVariables

EnvironmentActivationService.node.ts

  • getActivatedEnvironmentVariables <Top level>
    • Returns the merged variables from .env file and python environment
    • If we fail to activate the Python env then merges the results from getEnvironmentVariables with process.env.
    • TODO: this is pointless as getEnvironmentVariables already merges the results with process.env.
    • We seem to do this in two places within the same file EnvironmentActivationService.node.ts
    • Used in:
      • ModuleInstaller.installModule <IModuleInstaller>,
        • TODO: This also merges the result with process.env variables & a few other things.
      • JupyterInterpreterSubCommandExecutionService.startNotebook <IJupyterSubCommandExecutionService>, and that class also merges the result with process.env variables & a few other things.
      • KernelEnvironmentVariablesService.getEnvironmentVariables already covered above.
      • JupyterKernelService.updateKernelEnvironment <IJupyterKernelService>
        • Creates the env variables for use in kernels spawned using Jupyter.
        • Merges the result with the env variables in the kernelspec.
        • Special code for PYTHONNOUSERSITE
        • Why can't this do the same thing as raw kernels, by using KernelEnvironmentVariablesService.getEnvironmentVariables
      • PythonExecutionFactory.createDaemon
      • PythonExecutionFactory.createActivatedEnvironment
      • PreWarmVariables.node.ts (just calls this method to initialize cache)
  • Split into two, one to get env vars from Python ext and one to get ourselves. Right now we have two public methods in there that should be private and its easier to test if separate (at the end of the day they are two separate implementations for the same feature).

Process ctor

  • We pass the env variables into Process class for use by all processes spawned using the Process class.
  • This is done only in two places:
    • EnvironmentActivationService.node.ts
    • ProcessFactory.node.ts
    • PythonExecutionFactory.node.ts

@microsoft microsoft locked as off-topic and limited conversation to collaborators Jul 14, 2022
@DonJayamanne
Copy link
Contributor Author

Notes:

  • When generating kernelspec.json file with env variables, we can have two instances of VSC opened and each with two sepraate folders having two seprate .env files.
    • Either we should re-start Jupyter server when .env files change
    • Or have a way to keep the env variables in kernelspec.json per workspace folder (e.g. separate env_<workspace folder hash> sections in the kernelspec.json file or the like)
    • Or create seprate kernelspec.json files (not best as this could cause other issues as this is cached and used accros vscode sessions and folders)

@connor4312
Copy link
Member

Are there any verification steps for this?

@connor4312 connor4312 added the verification-steps-needed Steps to verify are needed for verification label Jul 29, 2022
@DonJayamanne DonJayamanne added debt Code quality issues and removed bug Issue identified by VS Code Team member as probable bug verification-steps-needed Steps to verify are needed for verification labels Jan 29, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
debt Code quality issues
Projects
None yet
Development

No branches or pull requests

4 participants