Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature] The git clone URL is no longer logged. #15684

Closed
shoeffner opened this issue Feb 15, 2024 · 6 comments · Fixed by #16038
Closed

[feature] The git clone URL is no longer logged. #15684

shoeffner opened this issue Feb 15, 2024 · 6 comments · Fixed by #16038
Assignees

Comments

@shoeffner
Copy link
Contributor

Environment details

Steps to reproduce

Any Git(self).clone command will no longer log the URL, instead show <hidden>.

Logs

user997@ffaebed02c7c:/recipes/boto3$ conan source .
conanfile.py (boto3/1.26.144): Calling source() in /recipes/boto3/src
conanfile.py (boto3/1.26.144): Cloning git repo
conanfile.py (boto3/1.26.144): RUN: git clone "<hidden>" --branch 1.26.144 --depth 1 "."

We fork all our thirdparties internally so we have stable URLs in case they disappear and to avoid hitting external servers too much. As a side-effect, we build up our URLs as https://internalgitserver/our-thirdparties/{name}.git (and potentially use git@internalgitserver:our-thirdparites/{name}.git for local development).
Since Conan 2 removed the authentication options initially (see e.g. conan-io/docs#2918), we are using git credential.helpers which we configure in our CI pipelines, which makes it trivial to provide credentials via Jenkins, GitLab CI, etc.:

[credential "https://internaltgitserver"]
    helper = "!f() { test \"$1\" = \"get\" && printf \"username=%s\\npassword=%s\" \"${GIT_USERNAME}\" \"${GIT_PASSWORD}\"; }; f"

Thus our credentials are masked by the CI runner already and we do not store them inside the recipes or anything.
However, due to the "dynamic" nature of our URLs, it is nice to actually see them -- with the recent change to hide the URLs, it's always a little bit of guesswork if the URL was actually correct.

As a work-around, I added logs for all recipes now, so it's no big blocker, but if there is need, I would offer to provide a patch to make the url-hiding optional.

@memsharded memsharded self-assigned this Feb 15, 2024
@memsharded
Copy link
Member

Hi @shoeffner

Thanks very much for your feedback.

I am afraid this is not a bug, this was done deliberately, based on 2 things:

  • Even if the recommended way is to use git ssh keys or similar, and not URL encoded ones, there was still a relatively high risk and users using URL-encoded auth. Conan must avoid as much as possible to be printing those credentials in the logs, so we decided to mask the URLs in the Git() helper. As you have guessed, you can always and easily print them, but we certainly don't want to be in the position of Conan printing credentials to plain text logs.
  • The Conan output has been explicitly documented as non-stable and being able to change at any time (https://docs.conan.io/2/introduction.html#stable). Users shouldn't rely at all in the text output, only machine output of --format=json can be used and considered stable for automation

So overall, it was a trade-off between security and convenience, and in this case the balance clearly went in the direction of security.

As a work-around, I added logs for all recipes now, so it's no big blocker, but if there is need, I would offer to provide a patch to make the url-hiding optional.

That sounds nice and convenient, as long as users explicitly opt-in to print the URLs, I am good, if you want to contribute it, it will be welcome. Are you considering a Git(..., hide_url=False) argument?

@shoeffner
Copy link
Contributor Author

shoeffner commented Feb 15, 2024

Thanks, I was aware it's not really a bug, but was not sure what else from the issue templates would fit best; I agree feature is a good idea.

I also completely understand the decision, which is why I also linked to the related issues and the merge request, and I certainly welcome security considerations -- in fact, that's the reason why we were looking at alternatives for authentication when Conan removed the credentials from the scm tools with Conan 2 and decided to let git handle it instead of encoding credentials in URLs.
However, I wonder how the credentials would end up in such URLs – one should not put credentials inside the conanfile.py, so they should only end up there if the URLs are put there in some other way (environment variables?) and the URLs are built on the fly.
In that case, however, there is a workaround we use in our groovy scripts for sparse checkouts: we simply provide the credentials via environment variables and use git -c credential.helper ....

Are you considering a Git(..., hide_url=False) argument?

That would be the simple option to re-enable logging and is what I had in mind, but I am note sure if it is a good idea to "bloat" up the API here with such a flag. Now that I spend another hour pondering on the subject, I think it might not be.

I do not really know how other users perform the authentication right now, and I hope they do not store the credentials inside the recipe itself. If they provide them in the way "username" and "password/token", I could imagine two ways to solve authentication to disarm the credential-leak potential of printing the URL:

First, one could use credential arguments (or auth=(...) similar to other libraries):

git = Git(self, username=..., password=...)
git = Git(self, ssh_public_key=...)
git = Git(self, auth=(<username>, <password>))

The second option would be handle authentication similar to conan remote, which allows for CONAN_LOGIN_USERNAME; we could support CONAN_SCM_USERNAME and CONAN_SCM_PASSWORD and pick those up automatically.

The first option would allow to use the Git wrapper for multiple repositories with different credentials; this is difficult for the second option.

In both cases, it would be possible to allow to print the URL in multiple ways:
a) by masking the credentials directly (as we now separated them from the URL string): https://<hidden>:<hidden>@giturl.git)
b) by using a credential helper, passing the environment variables, and not masking anything at all:

RUN git -c credential.helper '!f() { test "$1" = "get" && printf "username=%s\npassword=%s" "${CONAN_GIT_USERNAME}" "${CONAN_GIT_PASSWORD}"; }; f' clone https://giturl.git --branch 1.0 --depth 1 "."

Or, of course, masking inside the credential helper, to prevent using extra environment variables for the run call:

RUN git -c credential.helper '!f() { test "$1" = "get" && printf "username=<hidden>\npassword=<hidden>"; }; f' clone https://giturl.git --branch 1.0 --depth 1 "."

One major caveat is in any cases escaping the credentials properly for the shell, as shlex.quote (which is for unix systems) does not handle !, ^, and > in passwords well on Windows (we actually had these cases in the past...), but in most cases it works well.
Variant a is close to the current situation but at least allows to show the URL.
Variant b is arguably the best choice from a security perspective, as it will not leak the credentials inside, e.g., htop (which is currently the case and also with variants a and c).
Variant c is arguably cleaner than a, but b and c clutter the log with quite some "technical detail" (which, in turn, could be hidden, but I am not sure if that's really helpful -- authentication is always tricky enough to debug).

Long story short: For the short term, simply adding self.output.info(f"Git URL: {self.url}") solves it for us, Git(self, hide_url=False) would be a quick solution for everyone, and offering a way to handle authentication in a way such that it removes the necessity to add credentials to the URL directly might make it a lesser concern to hide the url for security reasons, potentially removing the mask again.

But especially for the git authentication, we should definitely discuss the details more to avoid a half-baked solution which has to be carried around. In fact, after initial confusion how to do authentication, we applauded the fact that Conan 2 moved the responsibility of authentication to git and the relevant tools, so that Conan does not need to handle credentials at all. But if it leads to recipe URLs containing credentials, this is certainly something worth discussing.

@shoeffner shoeffner changed the title [bug] The git clone URL is no longer logged. [feature] The git clone URL is no longer logged. Feb 15, 2024
@memsharded
Copy link
Member

However, I wonder how the credentials would end up in such URLs – one should not put credentials inside the conanfile.py, so they should only end up there if the URLs are put there in some other way (environment variables?) and the URLs are built on the fly.

Yes, mostly url = f"https://{os.getenv("MYGITUSER")}:{os.getenv("MYGITPASSWD")/some/url/repo...". In which users think this wouldn't be an issue because env-vars are defined in CI system only, never in code, but then if Conan prints the URL internally, that is already replaced and exposed.
I know this wouldn't be that common, but those things happen, I have seen it personally.

Conan 1.X did in fact a lot of effort to hide that password by parsing URLs, more or less with the strategies you describe, but that is difficult to make 100% perfect, so better not doing it at all than leaking 1% of our users doing this, with so many thousands it will be a bunch of them, and we cannot afford even a single one.

I agree that providing auth for Git with Conan shouldn't be rushed, and doing it carefully is better, so we will keep recommending the git-provided credentials at the moment and keep monitoring the potential demand for this.

Thanks for all the feedback!

@shoeffner
Copy link
Contributor Author

Thanks for the insights!

While I did not consider url parsing yet, it sounds like a good idea, what are the cases which are hard to do? Shouldn't the standard library got this covered already?

from urllib.parse import urlparse

res = urlparse("https://username:password@example.com/some/url/repo.git")
if res.username:
    cmd = cmd.replace(res.username, "<hidden>")
if res.password:
    cmd = cmd.replace(res.password, "<hidden>")

Also, every CI system I know has some way to mask variables from any log output, so I am really not sure this should be something conan must be concerned with; for example, Jenkins GitSCM does not bother hiding URLs and instead relies on Jenkins to mask usernames and passwords from the logs, so does the GitLab CI.

If the clone takes long, credentials might still be exposed either way, e.g. to other processes running on the build machine, you can simulate this with:

 def source(self):
        git = Git(self)
        git.clone("http://user:pass@example.com/git.git", args=["; sleep 10000"])

Which will show: conanfile.py (recipe/version): RUN: git clone "<hidden>" ; sleep 10000.
Your process manager, e.g., htop, however, will show the credentials:

sh -c git clone "http://user:pass@example.com/git.git" ; sleep 10000 > /var/folders/....

Granted, this is a non-trivial take-over and logs are much easier, but it's also something to consider.

In any case, I will be happy to either provide an argument hide_url=False or replace the logic with masking username/passwords if the urlparse solution is fine for you. Just let me know which way you prefer, if you want one. Otherwise, we can also close this issue again :)

@memsharded
Copy link
Member

Following up from your comment in the other issue.

I am fine with a hide_url=False opt-out in the Git.clone and Git.fetch_commit methods, I think it would be the easiest, and still safe as it is a user decision. Would this be useful for you then? If that is the case, sure, feel free to contribute it, many thanks!

@shoeffner
Copy link
Contributor Author

I will have a look at it, thanks for the feedback!

@shoeffner shoeffner mentioned this issue Apr 7, 2024
5 tasks
AbrilRBS added a commit that referenced this issue Apr 8, 2024
* Add hide_url tests for git scm tool

* Add hide_url flag to clone and fetch_commit.

Resolves #15684

* Update conans/test/functional/tools/scm/test_git.py

* Update conans/test/functional/tools/scm/test_git.py

---------

Co-authored-by: Rubén Rincón Blanco <git@rinconblanco.es>
memsharded pushed a commit to memsharded/conan that referenced this issue Apr 9, 2024
* Add hide_url tests for git scm tool

* Add hide_url flag to clone and fetch_commit.

Resolves conan-io#15684

* Update conans/test/functional/tools/scm/test_git.py

* Update conans/test/functional/tools/scm/test_git.py

---------

Co-authored-by: Rubén Rincón Blanco <git@rinconblanco.es>
memsharded pushed a commit to memsharded/conan that referenced this issue Apr 10, 2024
* Add hide_url tests for git scm tool

* Add hide_url flag to clone and fetch_commit.

Resolves conan-io#15684

* Update conans/test/functional/tools/scm/test_git.py

* Update conans/test/functional/tools/scm/test_git.py

---------

Co-authored-by: Rubén Rincón Blanco <git@rinconblanco.es>
memsharded added a commit that referenced this issue Apr 11, 2024
* cmake deps internal refactor for readability

* fix

* fix

* fix

* fix

* fix

* copy only if different (#16031)

* allow conf in exports-sources and export (#16034)

* refactor apple_min_version_flag() (#16017)

* refactor apple_min_version_flag()

* Refactored all the apple module and where it was being used (AutotoolsToolchain and MesonToolchain for now)

* Fixed bad return

* Fixing tests

* Keeping legacy behavior in apple_min_version_flag function

* Preventing possible breaking change

---------

Co-authored-by: Francisco Ramirez de Anton <franchuti688@gmail.com>

* Allow to unhide git url (#16038)

* Add hide_url tests for git scm tool

* Add hide_url flag to clone and fetch_commit.

Resolves #15684

* Update conans/test/functional/tools/scm/test_git.py

* Update conans/test/functional/tools/scm/test_git.py

---------

Co-authored-by: Rubén Rincón Blanco <git@rinconblanco.es>

* remove repeated non-running test (#16053)

* refactor transitive_requires

---------

Co-authored-by: Francisco Ramirez de Anton <franchuti688@gmail.com>
Co-authored-by: Sebastian Höffner <info@sebastian-hoeffner.de>
Co-authored-by: Rubén Rincón Blanco <git@rinconblanco.es>
czoido pushed a commit that referenced this issue Jun 4, 2024
* wip

* wip

* wip

* Test for BazelDeps in the build context (#16025)

* feat add test to bazeldeps

* Fixed several bugs. Improved tests coverage

* Reverted

* Better name

---------

Co-authored-by: Francisco Ramirez de Anton <franchuti688@gmail.com>

* copy only if different (#16031)

* allow conf in exports-sources and export (#16034)

* refactor apple_min_version_flag() (#16017)

* refactor apple_min_version_flag()

* Refactored all the apple module and where it was being used (AutotoolsToolchain and MesonToolchain for now)

* Fixed bad return

* Fixing tests

* Keeping legacy behavior in apple_min_version_flag function

* Preventing possible breaking change

---------

Co-authored-by: Francisco Ramirez de Anton <franchuti688@gmail.com>

* Allow to unhide git url (#16038)

* Add hide_url tests for git scm tool

* Add hide_url flag to clone and fetch_commit.

Resolves #15684

* Update conans/test/functional/tools/scm/test_git.py

* Update conans/test/functional/tools/scm/test_git.py

---------

Co-authored-by: Rubén Rincón Blanco <git@rinconblanco.es>

* wip

* wip

* wip

* fix

* Update conan/tools/build/cstd.py

Co-authored-by: Rubén Rincón Blanco <git@rinconblanco.es>

* Update conan/tools/build/cstd.py

Co-authored-by: Rubén Rincón Blanco <git@rinconblanco.es>

* Update conan/tools/build/cstd.py

Co-authored-by: Rubén Rincón Blanco <git@rinconblanco.es>

---------

Co-authored-by: Ernesto de Gracia Herranz <vivalaburocracia@hotmail.com>
Co-authored-by: Francisco Ramirez de Anton <franchuti688@gmail.com>
Co-authored-by: Sebastian Höffner <info@sebastian-hoeffner.de>
Co-authored-by: Rubén Rincón Blanco <git@rinconblanco.es>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants