Please add support for credentials via environment variables #4789

Open
venthur opened this issue Oct 17, 2017 · 56 comments
Labels
type: feature request Request for a new feature

Comments

@venthur
Contributor

venthur commented Oct 17, 2017

  • Pip version: all
  • Python version: all
  • Operating system: all

Description:

We're using pip in a CI/CD pipeline to install packages from a private repository protected by username/password. Currently there are two ways to pass those credentials to pip: encode them directly in the URL, or create a pip.conf file. Neither option is very attractive. The first would mean hard-coding the credentials in the source code; the second would mean generating the config file during the build process.

Most CI/CD build pipelines support some kind of "secret variables", which is a fancy term for environment variables that you set in the CI/CD system and that are made available to the build pipeline. This is usually the way to pass secrets.

It would be very helpful if pip would also support some mechanism to read secrets from environment variables.

See also: https://www.jfrog.com/confluence/display/RTF/PyPI+Repositories#PyPIRepositories-UsingCredentials for a realistic use case.

@pradyunsg pradyunsg added the type: enhancement Improvements to functionality label Oct 17, 2017
@pradyunsg
Member

Adding the word authentication so that this shows up in my searches. :)

@xavfernandez
Member

Note that you can pass the --index-url option containing the login/password via the PIP_INDEX_URL environment variable.

@pradyunsg pradyunsg added type: feature request Request for a new feature and removed type: enhancement Improvements to functionality labels Oct 24, 2017
@cmcginty

cmcginty commented Mar 3, 2018

Also wanted to point out that you can pass --extra-index-url with PIP_EXTRA_INDEX_URL and this will override the value set in requirements.txt.
So you can have a bare URL for source control and then set the environment value with secrets in your CI/CD.

@jahickey

Thank you for adding the expansion of environment variables in requirements files. However, I was wondering if environment variables could be implemented for pip similar to Twine? With Twine (especially for CI) you just need to set TWINE_USERNAME and TWINE_PASSWORD as environment variables in the CI. Thus, there's no need to add the username and password to the repository URLs.

Just curious.

@pradyunsg
Member

There's keyring support that's integrated and up for the next release -- #5948.

@lhupfeldt

I'm going to submit a PR to accept credentials from env variables, to make it much simpler to integrate safely with CI servers. Credentials should not be specified as command line options in any way, as they may easily be leaked in logs or seen in process listings.

@ric79

ric79 commented May 27, 2020

Hello, still open...

The PR was closed, even though PIP_PASSWORD was a nice idea.

@uranusjr
Member

Feel free to propose a new one if you think it is a good idea.

@ric79

ric79 commented Jun 8, 2020

Sorry, I don't have the knowledge to implement this PR. Is it possible to resubmit that PR?
It's quite strange that this serious security issue still exists.

@TimOrme

TimOrme commented Jul 22, 2020

Wanted to chime in and voice support for this. It looks like the PR was closed, but maybe is still a viable option.

What would be the process for reviving it? Could someone else just open a new copy of the existing PR, or should we wait and see if @lhupfeldt can revive it?

@lhupfeldt

lhupfeldt commented Jul 22, 2020 via email

@ric79

ric79 commented Jul 23, 2020

Hello, I think the PR was fantastic, and it works similarly to twine. I do not like putting my credentials on the file system at all.
Please resubmit it. I have no knowledge of this kind of procedure.

@pfmoore
Member

pfmoore commented Jul 23, 2020

You'll hit the same resistance again, I suspect. This is precisely the sort of issue that keyring support was intended to avoid - pip needing to implement multiple mechanisms for handling authentication, each for a particular (entirely valid) use case. If keyring support (or keyring itself) isn't sufficient for this use case, we should be improving them, not implementing an alternative mechanism.

@ric79

ric79 commented Jul 23, 2020

I cannot find any article about how to use keyring for "Python3 pip + virtualenv". Could you point me to a link?

@lancetarn

If keyring support (or keyring itself) isn't sufficient for this use case, we should be improving them, not implementing an alternative mechanism.

This is a reasonable request, but it also doesn't seem like keyring is designed for the problem of CI/CD or automated builds or however you want to think of the issue that env var auth is trying to solve. The fact that it needs a "headless linux" section seems like an indicator of this. Maybe I'm wrong. Either way, this comment from the PR seems to capture my situation nicely:

If I understand correctly you expect me to:

  1. Install the python keyring package on my systems (without using pip)
  2. Implement my own keyring backend and install that on my systems (without using pip)
  3. Configure keyring on all my systems to use my special backend.

That sounds like so much effort that I would rather munge my pip config/requirements.txt to inject credentials on the fly during builds.

If specifying explicit environment variables that pip will use for auth feels like a slippery slope, another option that seems to work for npm would be to attempt to replace things that look like environment variables in configuration files. This would at least allow one to set index-url with userinfo parts like https://${MY_USER}:${MY_PASS}@secretpypi.example.org and inject the relevant variables in CI. I don't know the pip code base at all or if that is feasible; just a thought that occurred to me. It also has the advantage of allowing the user to auth against multiple indexes if necessary by specifying different env vars for each.
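
To make the substitution idea above concrete, here is a minimal sketch in Python. It is not pip's implementation; the function name `expand_env_vars` and the rule of leaving unset variables untouched are assumptions chosen for illustration, and the example values are made up.

```python
import os
import re

# Matches ${NAME}, where NAME consists of upper-case letters, digits or underscores.
_ENV_VAR = re.compile(r"\$\{([A-Z0-9_]+)\}")


def expand_env_vars(text, environ=None):
    """Replace each ${NAME} in text with its value; leave unknown names as-is."""
    env = os.environ if environ is None else environ

    def substitute(match):
        # group(1) is the variable name, group(0) the whole "${NAME}" token.
        return env.get(match.group(1), match.group(0))

    return _ENV_VAR.sub(substitute, text)


# Hypothetical values standing in for secrets injected by a CI system.
template = "https://${MY_USER}:${MY_PASS}@secretpypi.example.org"
print(expand_env_vars(template, {"MY_USER": "alice", "MY_PASS": "s3cret"}))
```

Because each index URL can name its own pair of variables, this approach would naturally cover several indexes with different credentials. Values containing characters such as `@` or `/` would still need percent-encoding before being substituted into a URL.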

@pfmoore
Member

pfmoore commented Aug 26, 2020

it also doesn't seem like keyring is designed for the problem of CI/CD or automated builds or however you want to think of the issue that env var auth is trying to solve

Have you raised that with the keyring project? Honestly, that's all we're suggesting here, and we're getting a lot of pushback. Pip added support for keyring, in a good-faith attempt to handle the requests we were getting for a mechanism to store credentials outside of pip¹. We were led to understand that the submitted PR was a good solution to this issue, and we took it on trust that keyring did the job we'd been told that it did.

To date, no-one has demonstrated that keyring isn't up to the job. Certainly, we've had people say that keyring doesn't support their use case. But nor does pip - someone will need to write new code, and the idea of adding keyring support was to delegate handling this sort of use case to that project. Until we see a definite statement from them that they aren't interested in supporting the use cases being described here, there's not much pip should be doing (IMO). If keyring come back and say they don't want to support this use case, then pip needs to look at what to do - and in my view, I'd want to reconsider whether we should be looking for an alternative to keyring that does support our users - I still don't want pip to get into the business of credential management².

attempt to replace things that look like environment variables in configuration files

You can use environment variables in requirements files. Have you tried using that feature to see if it handles your use case?

¹ The approach of using keyring had the additional benefit of not requiring the pip developers to get into questions around what is a secure way of handling credentials - we could leave that to the experts maintaining the keyring project.
² Yes, environment variables seem like a simple enough solution, and safe enough. But I'm not an expert, so I'm not going to make that decision. That's basically the point here...

@lancetarn

Thanks for your reply. I clearly only have a tiny fragment of the context here. I'll try to find an appropriate place to ask about keyring in CI. Also, I did not know that requirements files would expand environment variables. I had tried it within pip.conf, which does not appear to. I can probably get the job done through that mechanism!

I appreciate the perspective of leaving credential management to experts and your efforts to keep things going in the right direction.

@TimOrme

TimOrme commented Aug 26, 2020

@pfmoore chiming in here a bit and don't want to speak too much for others, but one of the cases mentioned elsewhere is that you end up in a bit of catch-22 situation that can't be resolved without some hacking, unless pip itself supports this.

If you're in a CI/CD environment where you only have access to a private, password-protected PyPI repo, then you are in an unfortunate situation where you can't even install keyring to begin authenticating to that repo, even if it does support it.

There are perhaps other ways to get keyring installed, but they end up being a bit messy. Maybe an alternative is to ship keyring with pip or something along those lines, but I'm not sure of the feasibility or impact of that.

In short though, the concern is that if pip doesn't support that auth, and the only way to support auth is to install an external package, then we end up stuck in the cases where the external package requires auth.

@jasonstitt

jasonstitt commented Aug 26, 2020

I think the "pushback" is because environment variables are a normal way of doing this and keyring is not. Several replies here have outlined the specific issues with adding this as a dependency. Given that, rather than requiring a "definite statement" from the keyring project (and who is going to obtain such a statement?), it would make more sense to explain how keyring is a good solution particularly for CI/CD, as it is basically an exception we would make from the norm in order to use this tool (i.e. this is the only similar tool that would use keyring).

I still don't want pip to get into the business of credential management

Environment variables don't put you in the business of credential management. Something else is responsible for setting the environment variables; that's the point of environment variables.

Realistically the alternative here is not going to be keyring. An alternative is using PIP_INDEX_URL (an environment variable) with basic auth embedded in it. Which means you already take credentials in an environment variable, so the concern there is odd. And you already fixed the logging of the credentials in the URL several versions ago I think. The problem with this approach is simply that the entire URL now becomes a secret value, rather than just the credentials. I think this request is simply to split the URL from the credentials so that the URL could be hardcoded in a checkin without the credentials.

It sounds like environment expansion in requirements.txt is potentially superior.

@pfmoore
Member

pfmoore commented Aug 26, 2020

@TimOrme

Maybe an alternative is to ship keyring with pip or something along those lines, but I'm not sure of the feasibility or impact of that.

You're right - bootstrapping keyring is an issue. But it was known (and acknowledged) when the feature was added, so all I can really say is that the original implementation saw that as an acceptable limitation. I don't personally have a good answer here.

Vendoring keyring is unfortunately not possible, because keyring depends on C extensions, and pip cannot vendor C extensions (because pip needs to be platform-neutral - there's a lot more background here, but that's the reality and it's not going to change, unfortunately).

@jasonstitt

and who is going to obtain such a statement?

Someone who needs this to work, surely? You seem to be assuming that it's up to the pip developers. Sorry, but it really isn't.

Which means you already take credentials in an environment variable, so the concern there is odd.

OK, I've no problem if you think my reluctance is odd. Feel free to take it as simply meaning that I won't do anything about this myself, if that helps.

It sounds like environment expansion in requirements.txt is potentially superior.

It does indeed sound like that is helpful for people in this situation. Which makes me wonder why no-one found that information. Is the section here in the documentation unclear? Is it hard to find? It may be that people have wasted time debating keyring, when if they'd found the existing feature they could have solved their problem much more easily - so if there's any improvement to the documentation that would have helped, it would be great if you could offer a suggestion (ideally as a PR, but even just an issue describing what you'd like to have seen would be good).

@lhupfeldt

I guess that most people just put 'package>=version' (or == ...) in requirements.txt.
I definitely would never have had the idea to look at the requirements.txt specification in order to pass credentials.

Maybe some reference to requirements.txt from the existing documentation about how to authenticate, and an explanation of what to put in requirements.txt to make pip read credentials from there would help?

But I hope you are not referring to embedding credentials in index URLs or even adding index URLs in requirements.txt?
Credentials in URLs are generally considered insecure.

Adding URLs in requirements.txt for me would just make the file unreadable with even more substitutions to be made. We have production and non-production pypi proxies, and I think other people will have the same.
It would also mean that every requirements.txt would have to add the credentials vs just having to add them on the CI server.

I understand that you are trying to limit the maintenance burden of pip, but we are talking about 10 lines of code including logging, excluding the test (maybe 7 if the check that both password and username are set were removed, as suggested), and 5 lines of documentation.

And this feature seems to be in popular demand.

@absassi

absassi commented Aug 27, 2020

@pfmoore you've said (emphasis mine):

Pip added support for keyring, in a good-faith attempt to handle the requests we were getting for a mechanism to store credentials outside of pip¹.

And I think this is the misunderstanding in this discussion. We aren't asking for a mechanism to store credentials outside pip. We already have that one (e.g. the credential store in our CI server), and our mechanism, whatever it is, provides the credentials in the form of environment variables (which is very common). However, this mechanism is not keyring. The problem we face is then: how do we pass these credentials, that are already in environment variables, to pip?

The option to make pip use keyring directly is very nice and solves a valid but different problem: how to take credentials from keyring and pass them to pip.

@imjohsep

imjohsep commented Oct 2, 2020

This probably isn't to everyone's standard, but I do this to store credentials as environment variables.

@pradyunsg
Member

pradyunsg commented Oct 22, 2021

I'm overall ambivalent on this -- this discussion has a weird mix of misrepresenting what pip's keyring integration does and never getting an update on what the underlying design constraints are -- it seems like a reasonable request but all the proposed solutions so far seem infeasible to me.

This issue never got a "proper" update on the current credential management story for pip after the keyring support got added, so... I guess I just posted that above.

I think the next steps here are:

  • Someone explaining why a programmatic API to interact with credential stores (via the keyring library) isn't good-enough here.
  • Someone figuring out a design for a solution that'll work with the constraints that pip operates with.

Vendoring keyring is unfortunately not possible, because keyring depends on C extensions, and pip cannot vendor C extensions (because pip needs to be platform-neutral - there's a lot more background here, but that's the reality and it's not going to change, unfortunately).

It also breaks the fundamental assumption -- keyring is a Python package with a programmatic API that allows users to import things from it to write a third-party backend. Vendoring it breaks that, eliminating the primary benefit of it -- externally maintained third-party backends for interacting with different credential stores.


Also... https://pip.pypa.io/en/stable/topics/authentication/ is a thing now, and I'll add a follow up issue to add a cross-reference to https://pip.pypa.io/en/stable/reference/requirements-file-format/#using-environment-variables there.

And, finally, please be mindful that pip is primarily maintained by volunteers.

@reixd

reixd commented Oct 22, 2021

This isn't sufficient information to understand your usage pattern.

Ok, I will give it a try.

Where do these credentials come from?

The credentials will be set on the current running (shell) environment. Hence normal environment variables. For example, you can fetch them with os.getenv('MY_PIP_USERNAME') and os.getenv('MY_PIP_PASSWORD').

Which/How many package indexes do you interact with in a single pip execution?

Usually two, the default one and a private one.

@uranusjr
Member

The part I still fail to understand is why a separate environment variable is needed in the first place, since it is already possible to specify auth in PIP_INDEX_URL and the like. The only use case I can come up with is when repository URLs are set in configuration files and one wants to supplement the auth part without writing that configuration again. That is already slightly weird, but still rather easily achievable with something like

PIP_INDEX_URL=$(pip config get global.index-url | sed "s/\/\//\0${PIP_USER}:${PIP_PASS}@/")

Note that this is more or less what we would do if the support is built into pip, there's nothing hacky about this—or rather, there's nothing magical about having this implemented in pip instead of an ad-hoc Bash one-liner.

So I think the bottom line is that we need more concrete, objective reasons to explain exactly why this is a needed feature, rather than subjective "I think pip should pick it up".
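
For comparison, here is a rough Python equivalent of the sed one-liner above. It is only a sketch: the function name `add_credentials` and the example URL and secrets are hypothetical, and it assumes the configured URL carries no userinfo yet. Unlike the sed version, it percent-encodes the username and password, so characters such as `@` or `/` in a secret do not corrupt the URL.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def add_credentials(index_url, username, password):
    """Insert percent-encoded 'user:password@' in front of the URL's host."""
    parts = urlsplit(index_url)
    # safe="" makes quote() encode '/', ':' and '@' as well, so special
    # characters in the secret cannot be mistaken for URL structure.
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    return urlunsplit(parts._replace(netloc=f"{userinfo}@{parts.netloc}"))


# Hypothetical values; in CI these would come from secret variables.
print(add_credentials("https://pypi.example.org/simple", "ci-user", "p@ss/word"))
# prints https://ci-user:p%40ss%2Fword@pypi.example.org/simple
```

The result could then be exported as PIP_INDEX_URL, which makes the whole URL a secret value, as noted earlier in the thread.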

@pradyunsg
Member

Usually two, the default one and a private one.

And you want pip to use the credentials to access both of them?

@pradyunsg
Member

pradyunsg commented Oct 22, 2021

use/write/contribute a keyring backend that picks up credentials from environment variables (i.e., treating those environment variables as the "store").

It's got the same design constraints as pip's needs here, so... honestly, yea... that's quite possibly one of the better outcomes here -- it'll likely even work transparently with twine / flit / poetry etc if you do this right. :)

@wwuck

wwuck commented Oct 22, 2021

use/write/contribute a keyring backend that picks up credentials from environment variables (i.e., treating those environment variables as the "store").

That could be a solution to my use case, if I can work out the keyring bootstrapping problem for my CI environment.
It shouldn't be too hard to write a backend doing this, after looking at some existing keyring backend implementations. Working out how to publish it on pypi.org seems like a bigger challenge 😆

@wwuck

wwuck commented Oct 25, 2021

Twine and Flit operate on a single domain/package index at a time, and they can safely assume/bake in the assumption that using a single credential pair is sufficient.

Poetry uses a name for their package indexes, which allows them to namespace the environment variable: POETRY_HTTP_BASIC_{REPOSITORY_NAME}_PASSWORD.

pip has neither of these conveniences -- and no one has come up with a feasible approach to cover for the usage patterns that we know are possible, with credential management. Our keyring integration solves that, by allowing users to use a credential store, have different credentials for different domains, and to enable them to tell pip to use that credential store directly via a keyring plugin.

@pradyunsg considering that an environment variables implementation for either pip or keyring would have the same issues regarding naming of environment variables, would this be sufficient to support multiple index urls? Is there any objection to using the PIP_ prefix?

PIP_INDEX_AUTH_URL_0=https://index0.example.com
PIP_INDEX_AUTH_USERNAME_0=myusername
PIP_INDEX_AUTH_PASSWORD_0=mypassword
PIP_INDEX_AUTH_URL_1=https://index1.example.com
PIP_INDEX_AUTH_USERNAME_1=myusername1
PIP_INDEX_AUTH_PASSWORD_1=mypassword1

or this might be a better way

PIP_INDEX_AUTH_0_URL=https://index0.example.com
PIP_INDEX_AUTH_0_USERNAME=myusername
PIP_INDEX_AUTH_0_PASSWORD=mypassword
PIP_INDEX_AUTH_1_URL=https://index1.example.com
PIP_INDEX_AUTH_1_USERNAME=myusername1
PIP_INDEX_AUTH_1_PASSWORD=mypassword1
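
As a rough illustration of how the second naming scheme could be read, here is a Python sketch. Neither pip nor keyring reads these variables; the `PIP_INDEX_AUTH_<n>_*` names are only the proposal above, and the function name `credentials_from_env` is made up for this example.

```python
import os


def credentials_from_env(environ=None, prefix="PIP_INDEX_AUTH_"):
    """Map each index URL to a (username, password) pair.

    Reads groups named <prefix><n>_URL, <prefix><n>_USERNAME and
    <prefix><n>_PASSWORD; groups missing any of the three are skipped.
    """
    env = os.environ if environ is None else environ
    result = {}
    for name, url in env.items():
        if not (name.startswith(prefix) and name.endswith("_URL")):
            continue
        # The text between the prefix and "_URL" must be the numeric index.
        index = name[len(prefix):-len("_URL")]
        if not index.isdigit():
            continue
        user = env.get(f"{prefix}{index}_USERNAME")
        password = env.get(f"{prefix}{index}_PASSWORD")
        if user is not None and password is not None:
            result[url.rstrip("/")] = (user, password)
    return result


example_env = {
    "PIP_INDEX_AUTH_0_URL": "https://index0.example.com",
    "PIP_INDEX_AUTH_0_USERNAME": "myusername",
    "PIP_INDEX_AUTH_0_PASSWORD": "mypassword",
    "PIP_INDEX_AUTH_1_URL": "https://index1.example.com",
    "PIP_INDEX_AUTH_1_USERNAME": "myusername1",
    "PIP_INDEX_AUTH_1_PASSWORD": "mypassword1",
}
print(sorted(credentials_from_env(example_env)))
```

An index without a matching group, such as a public one, simply gets no credentials, which avoids sending a private password to it.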

@wwuck

wwuck commented Oct 28, 2021

Hmm, looking at the pip docs on keyring support, I can't see where to specify a username when installing a package using keyring auth. Keyring allows me to set multiple username/password credentials for a single service/index-url. Is this specified elsewhere in the docs @pradyunsg?

@lhupfeldt

It is rare to see this many people take interest in an issue :)

It was mentioned that pip is a volunteer effort. I think everybody understands this, and I did submit a PR for this, complete with tests and documentation.

I think neglecting the issue of pip requiring installation of a package is really bad (I'm aware of the workaround). In a large company CI setup it quickly becomes a mess if the CI installation also has to take care of installing the build tools for individual and very diverse projects which use a lot of different technologies. At my company, individual projects do not have OS login to the CI servers, and the servers do not have internet access, so all package/tool installation is done by the CI server and goes through our local repositories.

The package installer should not depend on a package.

I think the issue of supporting multiple indexes can be seen as an extension, so maybe we could start by just documenting that multiple indexes are not (currently) supported through env variables. If you think supporting this is required before accepting a PR, then we can add that. I have no need for it, and I think most people won't. If you have a private protected repository you can probably proxy all indexes through that.

Please take a look at @absassi's comment which explains very nicely why supporting env variables is a good idea, and not a competitor to keyring.

@lhupfeldt

For those suggesting embedding credentials in URLs, please read e.g. this: https://neilmadden.blog/2019/01/16/can-you-ever-safely-include-credentials-in-a-url/

@pradyunsg
Member

pradyunsg commented Oct 30, 2021

Please take a look at @absassi's comment which explains very nicely why supporting env variables is a good idea, and not a competitor to keyring.

Please take a look at my comment which quotes that, and mentions why the proposed PR wasn't sufficient either. :)

See also #6723 (comment)

@pradyunsg
Member

pradyunsg commented Oct 30, 2021

Is there any objection to using the PIP_ prefix?

I'm fine either way.

This is going to have to be distributed separately from pip, so it should be reasonable to pick something generic; but either way, it shouldn't be that difficult to make changes / allow making changes to that prefix. :)

@lhupfeldt

Please take a look at @absassi's comment which explains very nicely why supporting env variables is a good idea, and not a competitor to keyring.

Please take a look at my comment which quotes that, and mentions why the proposed PR wasn't sufficient either. :)

See also #6723 (comment)

Sorry @pradyunsg, which of your comments are you referring to?

@lhupfeldt

I see that @reixd directly accesses a public repo (pypi.org?) and a private one. In that case my implementation would leak the credentials to pypi.org (as documented, but who reads the documentation :) ). A solution which also checks the index url would definitely be better in that case. This exact scenario could be handled by always attempting access without credentials first, but of course this would not handle multiple protected repositories requiring different credentials. I'm not sure if this is a real issue though.

An index checking solution should allow patterns like *.mydomain.host and *.mydomain.host/p1 and choose the best match.
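
A minimal Python sketch of such pattern matching follows. The function names and the rule that "best" means the longest matching pattern are assumptions for illustration, not part of the closed PR. Host and path are matched separately so that a wildcard in the host part cannot be satisfied by something in the path.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def _matches(host, segments, pattern):
    """True if pattern ('hostglob' or 'hostglob/path/prefix') fits the URL."""
    host_pat, _, path_pat = pattern.partition("/")
    if not fnmatchcase(host, host_pat):
        return False
    pat_segments = [s for s in path_pat.split("/") if s]
    if len(pat_segments) > len(segments):
        return False
    # Each pattern segment must match the URL's path segment at that position.
    return all(fnmatchcase(seg, pat) for seg, pat in zip(segments, pat_segments))


def best_match(index_url, patterns):
    """Return the most specific pattern matching index_url, or None."""
    parts = urlsplit(index_url)
    host = parts.hostname or ""  # hostname is already lower-cased
    segments = [s for s in parts.path.split("/") if s]
    matches = [p for p in patterns if _matches(host, segments, p)]
    # Assumption: the longest pattern is treated as the most specific one.
    return max(matches, key=len, default=None)


patterns = ["*.mydomain.host", "*.mydomain.host/p1"]
print(best_match("https://pypi.mydomain.host/p1/simple", patterns))
print(best_match("https://pypi.org/simple", patterns))
```

With these patterns, a public index such as pypi.org matches nothing and so would receive no credentials.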

@wwuck

wwuck commented Nov 1, 2021

This is going to have to be distributed separately from pip, so it should be reasonable to pick something generic; but either way, it shouldn't be that difficult to make changes / allow making changes to that prefix. :)

But the package will be pip-specific, so it makes sense to use PIP_. In any case, I still can't determine what happens when keyring has multiple username/password pairs defined for a single index-url. Does pip just pick one at random?

@wwuck

wwuck commented Nov 1, 2021

@lhupfeldt would my example for env vars handle all the cases for multiple public and private index urls if it were implemented directly in pip? Public urls wouldn't have any PIP_INDEX_AUTH_URL_ or other env vars defined, while private indexes requiring auth would get a full mapping of index-url/username/password. Due to constraints on naming of environment variables it's not possible to embed a url directly in the environment variable name. As @pradyunsg mentioned previously, poetry can do this because it has a mapping specified in pyproject.toml for the index-url/repo name but this is something which pip would never support.

PIP_INDEX_AUTH_URL_0=https://index0.example.com
PIP_INDEX_AUTH_USERNAME_0=myusername
PIP_INDEX_AUTH_PASSWORD_0=mypassword
PIP_INDEX_AUTH_URL_1=https://index1.example.com
PIP_INDEX_AUTH_USERNAME_1=myusername1
PIP_INDEX_AUTH_PASSWORD_1=mypassword1

@lhupfeldt

@wwuck I think your solution of matching the index url against PIP_INDEX_AUTH_URL_<n> and then getting credentials from the corresponding ...USERNAME_<n>/...PASSWORD_<n> is fine. I would like it to allow glob pattern matching on the URL, because I think that if multiple private indexes are used, it is likely that they share the same credentials.

@wwuck

wwuck commented Nov 9, 2021

Hmmm, so after reading #10389 I guess I should hold off on trying to implement a keyring backend for environment variables.

@uranusjr
Member

uranusjr commented Nov 9, 2021

I don't think that credential helper API would happen anytime soon to solve your problem. If you start developing a keyring backend now, you'd probably be like version 3.0 when that API is released.

@wwuck

wwuck commented Nov 9, 2021

Ok, so can anyone help on my keyring questions? The pip docs don't mention how to specify the username for the keyring credentials. Where is this username specified?

And what would happen if I enter multiple username credentials into keyring for a single pip index url? Will it pick one at random? Will it try them all until it gets a successful auth attempt?

keyring set https://private-pypi-index.example.com myusername1
keyring set https://private-pypi-index.example.com myusername2

@rehevkor5

rehevkor5 commented Apr 6, 2022

it is already possible to specify auth in PIP_INDEX_URL and the like

I don't see any info about that in the docs... am I missing something? https://pip.pypa.io/en/stable/search/?q=PIP_INDEX_URL&check_keywords=yes&area=default

Edit: ah, they're named based on command line options, ok https://pip.pypa.io/en/stable/topics/configuration/#environment-variables

@YevheniiPokhvalii

YevheniiPokhvalii commented Nov 20, 2022

Are there any updates on this?
I'll explain where it could be useful. For example, Tekton pipelines in Kubernetes.

I have requirements.txt with this private repo link:

http://nexus:8081/repository/edp-python-releases/packages/some_python_package.whl

(Let's assume, it is possible to use variables PIP_USERNAME and PIP_PASSWORD).
So I can attach a secret to python pod this way:

spec:
  params:
    - name: BASE_IMAGE
      type: string
      default: "python:3.8-alpine3.16"
    - name: PIP_EXTRA_INDEX_URL
      type: string
      default: "http://nexus:8081/"
    - name: PIP_TRUSTED_HOST
      type: string
      default: "nexus"
    - name: ci-secret
      type: string
      default: ci.user
  steps:
    - name: python
      image: $(params.BASE_IMAGE)
      workingDir: $(workspaces.source.path)
      env:
        - name: PIP_USERNAME
          valueFrom:
            secretKeyRef:
              name: $(params.ci-secret)
              key: username
        - name: PIP_PASSWORD
          valueFrom:
            secretKeyRef:
              name: $(params.ci-secret)
              key: password
        - name: PIP_EXTRA_INDEX_URL
          value: "$(params.PIP_EXTRA_INDEX_URL)"
        - name: PIP_TRUSTED_HOST
          value: "$(params.PIP_TRUSTED_HOST)"
      script: |
        pip install -r requirements.txt

And the above example should work without hassle.

But what I have to do without variables PIP_USERNAME and PIP_PASSWORD is to change requirements.txt to this:

http://${NEXUS_USERNAME}:${NEXUS_PASSWORD}@nexus:8081/repository/edp-python-releases/packages/some_python_package.whl

Or another workaround for multiple pipeline steps is to create pip.conf:

      env:
        - name: HOME
          value: $(workspaces.source.path)
        - name: CI_USERNAME
          valueFrom:
            secretKeyRef:
              name: $(params.ci-secret)
              key: username
        - name: CI_PASSWORD
          valueFrom:
            secretKeyRef:
              name: $(params.ci-secret)
              key: password
      script: |
        pipdir="$HOME/.pip"
        if [ ! -d "${pipdir}" ]; then
          mkdir -p "${pipdir}"
          cat <<-EOF > "${pipdir}"/pip.conf
        [global]
        trusted-host = nexus
        extra-index-url = http://${CI_USERNAME}:${CI_PASSWORD}@nexus:8081/
        EOF
        fi
        pip install -r requirements.txt

It does not look nice. pip should have a way to pass a username and password via environment variables, as Twine does.

@rehevkor5

It's worth noting that for Docker image builds, ~/.pip/pip.conf or similar is almost certainly a better choice than using environment variables, even if values are provided via ARG. By using pip.conf, you can provide it to the Docker build in a secure way via the RUN --mount=type=secret approach documented here https://docs.docker.com/engine/reference/builder/#run---mounttypesecret assuming that you've enabled BuildKit via export DOCKER_BUILDKIT=1 or similar.

@wwuck

wwuck commented Jul 23, 2023

pypi/warehouse#10030 has finally been fixed, so the keyring plugin is finally uploaded to pypi.

https://pypi.org/project/keyrings.envvars/

@vyadh

vyadh commented Jul 17, 2024

I think I have worked out a way to use pip in a secure way when building container images and environment variables, which might be helpful for others using Docker in a CI/CD context. It's a lot of code, but I think it's at least secure.

My specific requirements are:

  • Avoiding the chicken-and-egg scenario of using the keyring functionality (alternative would be to use curl to get the keyring wheel from somewhere and install it first, but that would be even more complicated!)
  • Not using the index-url property as it would leak credentials into our various HTTP logs
  • Using Docker's buildx secure build secrets
  • Not leaking credentials into any container layers
  • Not leaking credentials onto physical storage even temporarily in case they are not correctly cleaned up by a particular usage

The following assumes that INDEX_USER and INDEX_PASS have been set as environment variables by the CI/CD system (or locally when testing).

Firstly, to build the below image, the command would be:

docker build --secret id=INDEX_USER --secret id=INDEX_PASS .

The Dockerfile below is not particularly readable, but hopefully it makes some sense. Very briefly, it uses Docker build secrets, the netrc functionality and a tmpfs mount to install packages from our internal authenticated feed within our build systems, all without the credentials being leaked in HTTP logs or hitting physical storage.

FROM [...]

ARG INDEX_SERVER=[...]
ARG INDEX_URL=https://$INDEX_SERVER/path

RUN pip config --global set global.index-url $INDEX_URL && \
    # Make pip use standard certificate store (on at least Ubuntu, Debian & Alpine)
    pip config --global set global.cert /etc/ssl/certs/ca-certificates.crt && \
    # Avoid not found errors given no Internet access within our CI/CD system
    pip config --global set global.disable-pip-version-check true

# Copy build files into a working folder
WORKDIR /build
COPY requirements.txt ./

RUN --mount=type=secret,id=INDEX_USER \
    --mount=type=secret,id=INDEX_PASS \
    # Create a tmpfs (memory-based) mount for the netrc authentication file valid for this layer only
    --mount=type=tmpfs,target=/run/auth \
    # The mounted secrets are held in tmpfs (memory-based) files that we need to pull out
    INDEX_USER=$(cat /run/secrets/INDEX_USER) \
    INDEX_PASS=$(cat /run/secrets/INDEX_PASS) \
    # Set the authentication credentials securely (using index-url is insecure)
    NETRC_CONTENT="machine $INDEX_SERVER login $INDEX_USER password $INDEX_PASS" \
    sh -c "echo \$NETRC_CONTENT" > /run/auth/.netrc && \
    # Variable to tell pip (actually the requests library it bundles) to pick up the memory-based .netrc file location
    NETRC=/run/auth/.netrc \
    # Install required packages
    pip install --no-warn-script-location --no-cache-dir -r requirements.txt && \
    pip check

# Remainder of build
[...]

I'll add that this common use of environment variables by Docker seems a good case for supporting environment variables more generally to simplify above, but at least there is a workable alternative for Linux containers.
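
For readers who want to check the netrc part outside Docker, here is a small Python sketch. The helper name `write_netrc`, the host name and the credentials are invented for illustration. It writes a one-entry netrc file with restrictive permissions and reads it back with the standard library's `netrc` module.

```python
import netrc
import os
import tempfile


def write_netrc(path, host, login, password):
    """Write a single-entry netrc file readable only by the current user."""
    # O_EXCL refuses to overwrite an existing file; 0o600 keeps it private.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"machine {host} login {login} password {password}\n")


# Hypothetical host and credentials; the temporary directory is removed afterwards.
with tempfile.TemporaryDirectory() as tmp:
    netrc_path = os.path.join(tmp, ".netrc")
    write_netrc(netrc_path, "pypi.example.org", "ci-user", "s3cret")
    login, _account, password = netrc.netrc(netrc_path).authenticators("pypi.example.org")
    print(login, password)
```

Note that this simple one-line format assumes the login and password contain no whitespace, since netrc tokens are whitespace-separated.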
