This repository has been archived by the owner on Nov 18, 2021. It is now read-only.

Allow fetching commit by id #436

Open
tgr opened this issue Aug 8, 2015 · 10 comments

Comments

@tgr

tgr commented Aug 8, 2015

git fetch <remote> <commit id> usually does not work in git. This is a security measure to avoid unintentionally disclosing information that is no longer reachable from any branch or tag but has not yet been garbage collected (this discussion has some details). Recently, the uploadpack.allowReachableSHA1InWant configuration flag was added to enable fetching of reachable commits, but it is off by default and is generally advised against for performance reasons.

This makes sense, but GitHub already has an API to get the contents of a particular commit, so a performant way to determine whether a commit can be reached already exists. It would be nice if commits that are accessible via the API would also be accessible via git fetch.

(The context in which this came up - but I'm sure there are many other uses - was verifying that no one tampered with a file when I have its commit id but I cannot trust any information I receive from GitHub, e.g. I need to use an insecure connection. In such a case the commit object that's obtained via git fetch could be used as a cryptographically secure fingerprint of the file. It can probably be constructed from the GitHub API responses as well, but that's rather tedious.)
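To illustrate why an object id can serve as a fingerprint: git derives the id by hashing a short header ("<type> <size>" followed by a NUL byte) together with the object's bytes. The following is a minimal sketch in Python, assuming a SHA-1 repository; the helper name git_object_id is made up for this example and is not part of any git tooling.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 object id git would assign to `body`.

    Git hashes the header "<type> <size>\\0" followed by the raw object
    bytes. Hypothetical helper; assumes a SHA-1 (not SHA-256) repository.
    """
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Id of a blob (file content) holding the six bytes "hello\n".
print(git_object_id("blob", b"hello\n"))
```

For a file, the result should match what `git hash-object <file>` prints, so any change to the content changes the id.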

@cirosantilli
Collaborator

This should be possible. Even Git could implement it by first checking if the commit is reachable or not.

I just don't think GitHub should modify the behaviour of Git on its website; that would be confusing.

Out of curiosity, I don't understand your use case very well: you need to use an insecure connection to get the commits, but you can use a secure connection to do the fetch, is that so?

@tgr
Author

tgr commented Aug 10, 2015

Git is a standard with multiple implementations and I doubt GitHub uses the original one. That said, as it turns out they did implement it recently. I will update the ticket.

As for the use case, all connections could be insecure (a git commit is basically its own signature, it cannot be forged in a non-noticeable way), but that's not really the point as getting a secure connection is not that hard. Someone could have stolen the project owner's keys and changed the content though, so using mutable references like tag or branch names is not safe.

Let me put it another way: at some point in the past someone reviewed the code of a GitHub-hosted project, decided that a certain version is secure, and recorded its commit id. The threat model is that the review was sound, and the place where commit ids are recorded is reliable but every other source isn't. I.e. we need to download the actual code from somewhere (that somewhere is quite likely not GitHub, for performance reasons) and the downloaded code could have been tampered with.
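A rough sketch of the first step of such a check, in Python: given the raw commit object received from an untrusted source (the bytes that `git cat-file commit <id>` prints) and the trusted, recorded commit id, confirm that the bytes hash to that id and read off the tree id they commit to. The function names are hypothetical, SHA-1 object ids are assumed, and a full check would go on to verify the tree and blob objects the same way.

```python
import hashlib


def object_id(obj_type: str, body: bytes) -> str:
    # Git object id: SHA-1 over "<type> <size>\0" plus the object bytes.
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def verified_tree_id(commit_body: bytes, trusted_commit_id: str) -> str:
    """Check untrusted commit bytes against a trusted commit id.

    Returns the tree id named on the commit's first line, or raises
    ValueError if the bytes do not hash to the trusted id.
    """
    actual = object_id("commit", commit_body)
    if actual != trusted_commit_id.lower():
        raise ValueError(f"commit id mismatch: got {actual}")
    first_line = commit_body.split(b"\n", 1)[0]
    if not first_line.startswith(b"tree "):
        raise ValueError("malformed commit: first line is not a tree line")
    return first_line[len(b"tree "):].decode("ascii")
```

Because the commit names its tree by id, and the tree names its files by id, a match at the top extends to the content below once each level has been checked in turn.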

@cirosantilli
Collaborator

@tgr GitHub likely uses https://github.com/libgit2/libgit2

Although there are multiple implementations, the external-facing API should be the same for all git clients. But true, a small extension like that could be considered.

But as you've found out (I didn't know!) it is already possible by a "standard" server config, so it definitely could be done.

I understand the use case better, thanks. Safe source for checking, unsafe for fast download.

darxriggs added a commit to darxriggs/git-client-plugin that referenced this issue Jul 29, 2018
* in case the used git does not support shallow submodules, just
  log a warning instead of throwing an exception
* Javadoc texts are used from the already existing CloneCommand
* tests are running fine now and cover more cases
  * file:// has to be used as protocol for local remotes,
    otherwise git doesn't perform shallow cloning
  * a local repository has to be used for shallow clone testing,
    because GitHub doesn't allow fetching dedicated commits
    * see isaacs/github#436
    * this would result in the following
      error: Server does not allow request for unadvertised object
* other minor improvements
@akx

akx commented Nov 19, 2018

This is still an issue, unfortunately.

The reason I bumped into this is git clone --depth=1 --shallow-submodules (and judging by those linked issues above, I'm not the only one); --shallow-submodules seems to do the equivalent of git clone -b COMMITHASHHERE (which also does not work against GitHub).

@Prcuvu

Prcuvu commented Aug 8, 2019

This is an issue for me. I want to re-create a branch from commits that have no branch and no tag (head SHA1 known, tree viewable on the web page), but the GitHub server's restriction rules out an easy way to do that.

@uri-canva

There is now an uploadpack.allowAnySHA1InWant setting that avoids the cost of calculating reachability, introduced in git/git@f8edeaa.
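For anyone running their own server, here is a hedged sketch of turning on one of the two settings named in this thread for a single repository. It is written in Python and only builds the `git config` command line; the function name and the repository path in the usage note are placeholders, and whether enabling either setting is appropriate depends on the security trade-off discussed above.

```python
from typing import List

# The two server-side settings mentioned in this thread.
WANT_SETTINGS = (
    "uploadpack.allowReachableSHA1InWant",
    "uploadpack.allowAnySHA1InWant",
)


def enable_want_setting_argv(repo_dir: str, setting: str) -> List[str]:
    """Build the `git config` command that enables `setting` in `repo_dir`.

    Hypothetical helper: it returns the argument list and runs nothing.
    """
    if setting not in WANT_SETTINGS:
        raise ValueError(f"unexpected setting: {setting}")
    return ["git", "-C", repo_dir, "config", setting, "true"]
```

The list can be passed to subprocess.run(..., check=True) on the server, for example with enable_want_setting_argv("/srv/git/project.git", "uploadpack.allowAnySHA1InWant"), where the path is a made-up example.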

@uri-canva

It looks like this works as long as you use protocol v2; protocol v1 won't work.
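A sketch of what the client side might look like, in Python: it builds a `git fetch` command that requests protocol v2 for that one invocation via `-c protocol.version=2` and asks for a full commit id. The function name is made up, and whether a given server honours the request is an assumption based on the report above, not something this snippet verifies.

```python
import re
from typing import List, Optional


def fetch_commit_argv(remote: str, commit_id: str,
                      depth: Optional[int] = None) -> List[str]:
    """Build a `git fetch` command line for a single commit id.

    Hypothetical helper: it returns the argument list and runs nothing.
    """
    if not re.fullmatch(r"[0-9a-fA-F]{40}", commit_id):
        raise ValueError("expected a full 40-character SHA-1 commit id")
    argv = ["git", "-c", "protocol.version=2", "fetch"]
    if depth is not None:
        # Optional shallow fetch, e.g. depth=1 for just that commit.
        argv.append(f"--depth={depth}")
    argv += [remote, commit_id]
    return argv
```

Running the result with subprocess.run(argv, check=True) inside a clone leaves the commit in FETCH_HEAD if the server accepts it; `git branch <name> FETCH_HEAD` would then give it a branch.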

@marc-h38

as a security measure to avoid unintentional disclosure of information that has been made unavailable from every branch/tag but has not been garbage collected (http://thread.gmane.org/gmane.comp.version-control.git/257807 has some details).

Very sadly, gmane is no more. @tgr do you remember the subject, date, participants, anything?

@tgr
Author

tgr commented Dec 10, 2019

I don't. At a guess it might have been this thread.

Not super relevant though. The point is that there are valid security reasons to limit which commits you expose, but GitHub has a web API for exposing commits by sha1, so it already has to deal with that, and exposing the same commits via git fetch as well would not add much complexity.

@marc-h38

I don't. At a guess it might have been https://public-inbox.org/git/CAPBPrnsA4KxNximtKXcC37kuwBHK0Esytdm4nsgLHkrJSg3Ufw@mail.gmail.com/

Thanks for the prompt answer, indeed this one keeps popping up. For the record this is: "Can I fetch an arbitrary commit by sha1?", 2-9 Oct 2014.

the point is there are valid security reasons to limit what commits you expose,

Yes, and while security is very often mentioned, I still couldn't find any good, official and clear git (!= GitHub) documentation about these security aspects, which is why I asked for the link (which still doesn't provide much).

6 participants