Bug / Feature Request: Add brace expansion, extended glob matching, and 'globstar' matching to user/project environment variable repo patterns #15744

Closed
kenibrewer opened this issue Jan 15, 2023 · 10 comments
Labels
meta: stale This issue/PR is stale and will be closed soon

Comments

@kenibrewer

kenibrewer commented Jan 15, 2023

There are other feature requests and bug reports that have been submitted related to this problem, but none of them encompasses the full set of issues that could be solved by implementing a proper solution. I'm going to attempt to consolidate all of those into one issue.

Is your feature request related to a problem? Please describe

Currently the environment variable RepoPattern only supports simple two-part repo structures (two segments separated by a single "/") and simple glob matching where a glob takes the place of one of those two segments. The lack of support for more complex repo structures and advanced matching introduces the following problems.

  1. Environment variable scopes are broken in any multi-level Gitlab repo.

Setting SECRET_KEY = XYZ with the scope company/* will not be loaded in a Gitlab repo of the structure company/subgroup/testrepo.

  2. Environment variables need to be managed in a redundant manner.

If I would like SECRET_KEY = XYZ available in the repos company/db and company/frontend but not company/public I would need to create two different environment variables and remember to change both of them instead of setting an environment variable with the pattern company/{db,frontend}.

  3. Environment variables cannot be excluded from domains using NOT logic.

I would like SECRET_KEY = XYZ credentials to be available in any dev environment that matches the pattern company/!(internship*)
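To make the three problems above concrete, here is a deliberately simplified Python model of two-segment matching. It is an assumption-based illustration of the behaviour described in this issue, not Gitpod's actual code; the function name `two_segment_match` is hypothetical.

```python
def two_segment_match(pattern: str, repo: str) -> bool:
    """Simplified model: exactly two segments, each either '*' or a literal."""
    pattern_parts = pattern.split("/")
    repo_parts = repo.split("/")
    # Anything that is not exactly two segments can never match in this model.
    if len(pattern_parts) != 2 or len(repo_parts) != 2:
        return False
    return all(p == "*" or p == r for p, r in zip(pattern_parts, repo_parts))


# Works for a flat repo.
print(two_segment_match("company/*", "company/testrepo"))           # True
# Problem 1: a nested Gitlab repo has three segments, so the scope is missed.
print(two_segment_match("company/*", "company/subgroup/testrepo"))  # False
# Problem 2: braces are treated as literal text, so no repo is matched.
print(two_segment_match("company/{db,frontend}", "company/db"))     # False
# Problem 3: negation is also just literal text in this model.
print(two_segment_match("company/!(internship*)", "company/db"))    # False
```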

Describe the behaviour you'd like

Environment variable matching patterns should support the full set of glob-style matching features that developers expect including:

  • Globstar ** matching
  • Matching at multiple repo layers deep
  • Brace expansion, {a,b}
  • NOT matching with !
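As a rough sketch of the semantics requested in the list above, the following Python matcher handles `**`, flat `{a,b}` groups, and whole-segment `!(...)` negation. It is a minimal illustration under stated assumptions, not minimatch and not Gitpod's implementation: all function names are hypothetical, nested braces are not handled, and edge cases may differ from real glob libraries.

```python
import re
from fnmatch import fnmatchcase

_BRACE = re.compile(r"\{([^{}]*)\}")


def expand_braces(pattern):
    """Expand flat {a,b} groups into a list of explicit patterns."""
    found = _BRACE.search(pattern)
    if found is None:
        return [pattern]
    head, tail = pattern[:found.start()], pattern[found.end():]
    expanded = []
    for option in found.group(1).split(","):
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def segment_matches(pattern_segment, name):
    """Match one path segment; '!(x|y)' negates the enclosed alternatives."""
    if pattern_segment.startswith("!(") and pattern_segment.endswith(")"):
        alternatives = pattern_segment[2:-1].split("|")
        return not any(fnmatchcase(name, alt) for alt in alternatives)
    return fnmatchcase(name, pattern_segment)


def segments_match(pattern_parts, repo_parts):
    """Match segment lists; a '**' segment consumes zero or more segments."""
    if not pattern_parts:
        return not repo_parts
    first, rest = pattern_parts[0], pattern_parts[1:]
    if first == "**":
        return any(
            segments_match(rest, repo_parts[i:])
            for i in range(len(repo_parts) + 1)
        )
    return (
        bool(repo_parts)
        and segment_matches(first, repo_parts[0])
        and segments_match(rest, repo_parts[1:])
    )


def repo_matches(pattern, repo):
    """True if any brace-expanded form of the pattern matches the repo path."""
    return any(
        segments_match(expanded.split("/"), repo.split("/"))
        for expanded in expand_braces(pattern)
    )


print(repo_matches("company/**", "company/subgroup/testrepo"))        # True
print(repo_matches("company/*", "company/subgroup/testrepo"))         # False
print(repo_matches("company/{db,frontend}", "company/frontend"))      # True
print(repo_matches("company/{db,frontend}", "company/public"))        # False
print(repo_matches("company/!(internship*)", "company/db"))           # True
print(repo_matches("company/!(internship*)", "company/internship-1")) # False
```

Matching segment by segment keeps `*` from crossing a `/`, which is what distinguishes it from `**` here.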

Describe alternatives you've considered

  • Manually running eval $(gp env -e) after workspace creation can load variables with the basic pattern company/* in nested Gitlab repos
  • There is no alternative for matching at deeper levels of nested Gitlab repos.
  • Alternatives for brace expansion include managing redundant ENV variables for each separate context.
  • There is no alternative for the lack of negated matching.

Related Issues

The following pre-existing issues would be solved by this feature/request:

Two attempted fixes that were merged but did not actually solve the issue:

Additional context

There is an inconsistency between which environment variables get loaded during workspace creation/launch and which get loaded by eval $(gp env -e).

I have not tested whether project environment variables are subject to the same problems and limitations.

Possible Implementation Notes:

The easiest way to enable full advanced glob pattern matching is probably to switch the WorkspaceEnvVarAccessGuard to use minimatch.

The score function would need to be changed to resolve environment variable conflicts at any level of depth. One option is a function where the score added for an exact match at a given depth is 2^(22-depth) (Gitlab supports up to 20 levels of nested subgroups). With this score function, a higher score would take priority.
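A minimal Python sketch of that scoring idea follows. It rests on assumptions that are not spelled out in the proposal: depth is counted from 1 at the leftmost segment, and a segment counts as an "exact match" when it contains no glob metacharacters. The name `pattern_score` is hypothetical.

```python
GLOB_CHARS = set("*?[]{}!()")


def pattern_score(pattern: str) -> int:
    """Sum 2^(22 - depth) over every literal (non-glob) segment."""
    score = 0
    for depth, segment in enumerate(pattern.split("/"), start=1):
        if not GLOB_CHARS.intersection(segment):
            score += 2 ** (22 - depth)
    return score


print(pattern_score("company/db"))  # 3145728 = 2^21 + 2^20
print(pattern_score("company/*"))   # 2097152 = 2^21
print(pattern_score("*/db"))        # 1048576 = 2^20
```

Because each level is worth more than all deeper levels combined, a literal match closer to the root always outranks any combination of literal matches further down, so conflicts resolve in favour of the more specific scope at the shallowest differing level.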

The various places that use splitRepositoryPattern like this one would need to be modified.

Additional tests like these ones should be added to test for proper multi-level repo pattern scoping.

@mbrevoort
Contributor

Hi @kenibrewer, first thank you for the thorough and thoughtful issue. We really appreciate it.

The challenges you describe and reference, and the inconsistencies in how environment variables are loaded, are definitely something we plan to improve.

I do have a few follow-up questions:

  • How many repositories are you managing?
  • Are you using Projects?

@kenibrewer
Author

Hi @mbrevoort ,

Thanks for the response. Short answer to your questions:

  • Somewhere over 60 with that number expected to grow substantially over the next year.
  • We are using projects

Context:
ProFound Tx is a new Gitpod customer and we are currently in the process of gradually shifting our development activities into workspaces. Our workflows are probably very different than a software development focused company. As a small biotech startup, our CompBio group does a lot of exploratory analysis in many different directions which need to be captured and may need to be reproduced at a later date, but won't ever get merged into a central set of repositories that represent the software we are working on.

@mbrevoort
Contributor

The context is interesting, thanks!

How much does the naming convention of repositories matter to you? Or is it just that this is how we have implemented the ability to share environment variables across multiple projects?

What if there was a more explicit way to share variables between projects (e.g. picking which projects to share to)? How important is the wildcard support?

@kenibrewer
Author

Wildcard support would certainly make our lives much easier than simply having a way to pick which projects to share environment variables with. We've organized our code repositories in Gitlab very hierarchically. For example, any of our bioinformatic pipelines built using Nextflow are sorted into company/nextflow/<team>/<repo-name>. If we can define the ENV variables required to connect to our Nextflow Tower deployment scoped to company/nextflow/**, that means we don't need to go in and re-share those variables with new repos we create under that naming scheme.

@kenibrewer
Author

@mbrevoort

The workaround that we figured out whereby users could store credentials scoped to company/* and access them manually via eval $(gp env -e) is now broken.

I think this was likely caused by the removal of the support for user environment variables in workspaces launched via the project interface in this pull request.

Is the intent to no longer support access to individual environment variables in projects? If so, is there an alternative you could recommend whereby individualized credentials could be accessed? Later this year we plan to move to the AWS SSO pattern you have in this template but it will require some substantial re-configuring of our AWS environment and so we're looking for a secure solution for storing credentials we can use in the meantime.

@axonasif
Member

hi @kenibrewer

> The workaround that we figured out whereby users could store credentials scoped to company/* and access them manually via eval $(gp env -e) is now broken.

Do you mean it no longer works for before>init prebuild task environment?

@kenibrewer
Author

@axonasif

No. gp env -e has never worked for gitlab subgroup repos despite what is described in this documentation here. Our workaround instead involved manually running eval $(gp env -e) after the workspace finishes loading.

The new problem is that gp env -e does not work at all to pull personal environment variables into a workspace that originates from an organization project. From the pull request above, it looks like the ability to access personal environment variables in an organization workspace may have been completely removed.

This doesn't make any sense to me. Even in an organization, individual developers may need access to personally defined environment variables.

@axonasif
Member

> No. gp env -e has never worked for gitlab subgroup repos despite what is described in this documentation here.

This is a known issue, see this comment for a workaround:

> The new problem is that gp env -e does not work at all to pull personal environment variables into a workspace that originates from an organization project.

It's not an issue with the gp CLI; mainly, the normal scoping pattern of env-vars is not working for gitlab subgroups, as that introduces an extra /. See #8618

> From the pull request above, it looks like the ability to access personal environment variables in an organization workspace may have been completely removed.

I think there's a slight misunderstanding there 😇 Rather, it fixes an issue that gp env had: previously it didn't print/pull the project env-vars alongside personal env-vars, but now it does, for consistency in behavior.

@stale

stale bot commented Jun 17, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the meta: stale This issue/PR is stale and will be closed soon label Jun 17, 2023
@kenibrewer
Author

This bug is resolved in #17831

3 participants