git shallow clone for speeding up large repos #4255

Closed
hmaarrfk opened this issue Jul 29, 2021 · 3 comments
Labels
locked: [bot] locked due to inactivity
stale::closed: [bot] closed after being marked as stale
stale: [bot] marked as stale due to inactivity

Comments

@hmaarrfk
Contributor

I'm trying to speed up build times for pytorch in hopes of gaining a few minutes on CIs.

I'm hoping that I can save time cloning a gigantic repository.

However, git clone and checkout fail for shallow clones.

I feel like git should be able to clone a single branch or tag with the command

git clone --branch GIT_REV --depth 1 URL

However, the checkout seems to happen as a separate step after the clone, and it fails even when I make the shallow clone as deep as 5000 commits.

Is there a way to skip making the mirror?

Cloning a specific branch: 18MB

$ git clone --branch v1.9.0 --depth 1 git@github.com:pytorch/pytorch.git
Cloning into 'pytorch'...
remote: Enumerating objects: 9310, done.
remote: Counting objects: 100% (9310/9310), done.
remote: Compressing objects: 100% (8275/8275), done.
Receiving objects: 100% (9310/9310), 18.08 MiB | 9.03 MiB/s, done.
remote: Total 9310 (delta 1322), reused 2845 (delta 828), pack-reused 0
Resolving deltas: 100% (1322/1322), done.
Note: switching to 'd69c22dd61a2f006dcfe1e3ea8468a3ecaf931aa'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

Cloning full tree: 500MB

$ git clone git@github.com:pytorch/pytorch.git
Cloning into 'pytorch'...
remote: Enumerating objects: 618076, done.
remote: Counting objects: 100% (6125/6125), done.
remote: Compressing objects: 100% (2169/2169), done.
remote: Total 618076 (delta 4619), reused 5224 (delta 3947), pack-reused 611951
Receiving objects: 100% (618076/618076), 507.99 MiB | 41.92 MiB/s, done.
Resolving deltas: 100% (500231/500231), done.

Would you consider having an opt-out of the "mirror" + "clone the mirror" strategy that is currently implemented for git repos?

xref: mamba-org/boa#172 (comment)
xref: https://github.com/conda/conda-build/blob/master/conda_build/source.py#L236

cc: @rgommers
cc: @wolfv

@rgommers

rgommers commented Aug 2, 2021

+1, I've seen this issue a few times. We do this in SciPy CI as well; creating a full clone is just too expensive for large repos, and unnecessary if you are just building a tag.

@jakirkham
Member

This is already possible and has been around for a while. Please see git_depth.
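
For reference, a minimal sketch of how git_depth might be set in a recipe's meta.yaml, reusing the URL and tag from the original post. The package name is a placeholder, and whether this setting avoids the full mirror clone on a given conda-build version is an assumption worth checking against your own build logs.

```yaml
package:
  name: pytorch-example   # hypothetical name, for illustration only
  version: "1.9.0"

source:
  git_url: git@github.com:pytorch/pytorch.git   # URL taken from the original post
  git_rev: v1.9.0                               # tag taken from the original post
  git_depth: 1   # shallow clone; conda-build defaults to -1 (not shallow)
```

With a depth of 1, the requested git_rev has to be reachable within the shallow history that was fetched, which may be related to the checkout failure described in the original post.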

@github-actions

Hi there, thank you for your contribution!

This issue has been automatically marked as stale because it has not had recent activity. It will be closed automatically if no further activity occurs.

If you would like this issue to remain open please:

  1. Verify that you can still reproduce the issue at hand
  2. Comment that the issue is still reproducible and include:
    - What OS and version you reproduced the issue on
    - What steps you followed to reproduce the issue

NOTE: If this issue was closed prematurely, please leave a comment.

Thanks!

github-actions bot added the "stale" label ([bot] marked as stale due to inactivity) on May 31, 2023
github-actions bot added the "stale::closed" label ([bot] closed after being marked as stale) on Jul 1, 2023
github-actions bot closed this as not planned on Jul 1, 2023
github-project-automation bot moved this to 🏁 Done in 🧭 Planning on Jul 1, 2023
github-actions bot added the "locked" label ([bot] locked due to inactivity) on Jun 30, 2024
github-actions bot locked as resolved and limited conversation to collaborators on Jun 30, 2024