Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add depth flag #16064

Closed
wants to merge 1 commit into from
Closed

feat: add depth flag #16064

wants to merge 1 commit into from

Conversation

yordis
Copy link

@yordis yordis commented Oct 22, 2023

No description provided.

@yordis yordis force-pushed the add-git-depth branch 5 times, most recently from 5ca3a38 to 746debc Compare October 23, 2023 18:32
@codecov
Copy link

codecov bot commented Oct 23, 2023

Codecov Report

Attention: Patch coverage is 57.89474% with 8 lines in your changes missing coverage. Please review.

Project coverage is 49.55%. Comparing base (92b22fd) to head (f118d53).
Report is 6 commits behind head on master.

Current head f118d53 differs from pull request most recent head ceb647f

Please upload reports for the commit ceb647f to get more accurate results.

Files Patch % Lines
reposerver/repository/repository.go 55.55% 2 Missing and 2 partials ⚠️
util/git/client.go 60.00% 3 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #16064      +/-   ##
==========================================
+ Coverage   44.95%   49.55%   +4.59%     
==========================================
  Files         355      269      -86     
  Lines       47917    46645    -1272     
==========================================
+ Hits        21541    23114    +1573     
+ Misses      23571    21258    -2313     
+ Partials     2805     2273     -532     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@yordis yordis force-pushed the add-git-depth branch 7 times, most recently from bd0c6c1 to 1c1a5ed Compare October 24, 2023 01:16
@yordis yordis marked this pull request as ready for review October 24, 2023 03:02
@yordis yordis requested a review from a team as a code owner October 24, 2023 03:02
@yordis yordis force-pushed the add-git-depth branch 2 times, most recently from 964af87 to f118d53 Compare October 25, 2023 03:36
@yordis
Copy link
Author

yordis commented Oct 28, 2023

@crenshaw-dev could help me to get to the finish line with this one?

@crenshaw-dev
Copy link
Member

crenshaw-dev commented Oct 29, 2023

@yordis the code looks to be in good shape. We probably need some docs.

I'd also really like to see how this behaves on a large, active monorepo over time. I am curious if the shallow fetch is going to cause inefficient storage and slowly fill up the repo server.

@yordis
Copy link
Author

yordis commented Oct 29, 2023

I'd also really like to see how this behaves on a large, active monorepo over time. I am curious if the shallow fetch is going to cause inefficient storage and slowly fill up the repo server.

We should synchronize communication with the community, it should, in theory, be faster download and cheaper storage ...

We probably need some docs.

Where should I add it?

@blakepettersson
Copy link
Member

We should synchronize communication with the community, it should, in theory, be faster download

AFAIU the issue is that it's more expensive with shallow clones (see #5467 and its linked issues)

@yordis
Copy link
Author

yordis commented Oct 31, 2023

Hey peeps, what is the resolution here?

@crenshaw-dev
Copy link
Member

I think documentation should be added both of these places:

  1. https://argo-cd.readthedocs.io/en/latest/operator-manual/high_availability/#monorepo-scaling-considerations
  2. https://argo-cd.readthedocs.io/en/latest/operator-manual/declarative-setup/#repositories

AFAIU the issue is that it's more expensive with shallow clones (see #5467 and its linked issues)

Yep. I'm not a git expert, but I could definitely imagine "shallow clone" meaning "fresh copy stored with every pull." A fresh copy might not be as light to download or as easy to store as the diff between the current commit and the previous commit.

In other words, we need someone to run this code on a large, frequently-updated monorepo for at least a few days and compare repo-server behavior to the same monorepo fetched without depth: 1.

@yordis
Copy link
Author

yordis commented Nov 1, 2023

@crenshaw-dev do you mind giving me a hand with the docs? Honestly I am getting stuck with it, I would appreciate if you can help me with it and I get to learn from you.

@blakepettersson
Copy link
Member

What is the status of #14272 btw? It seems as if this would be a preferable solution (in most cases)

@yordis
Copy link
Author

yordis commented Nov 2, 2023

I am gonna go back to #14272 since they are some caveats I need to make sure it works:

  1. The sparse checkout sticks around
  2. Ideally we could use worktree in order to keep the concurrency and avoid conflict between different spare checkout

So, I knew how to get the depth in based on what I learned from that work, so I switched to add this feature first, so @crenshaw-dev could gather info about its usage and buy a bit of more time** (assuming here)

@todaywasawesome
Copy link
Contributor

What is the status of #14272 btw? It seems as if this would be a preferable solution (in most cases)

I think they each have their use case. Honestly curious to use them together for a repo that I have.

@todaywasawesome
Copy link
Contributor

@yordis We can test on this repo. It's not a super large repo but it has 10k commits so the shallow clone is ideal. https://github.com/aperture-sci/DevOps

It's also updated several times a day.

@yordis
Copy link
Author

yordis commented Dec 15, 2023

@todaywasawesome any support to validate the feature will help us, especially if it is being done in close to production usage

@yordis
Copy link
Author

yordis commented Jan 22, 2024

Hey folks, are you still interested in taking this feature?

From what I can tell, it will be a matter of time before people ask to pass this flag due to the nature of wrapping git.

But it is not up to me; I would appreciate any resolution regarding what needs to happen with this PR.

@crenshaw-dev
Copy link
Member

In other words, we need someone to run this code on a large, frequently-updated monorepo for at least a few days and compare repo-server behavior to the same monorepo fetched without depth: 1.

I think this is the next step towards getting this merged. Until we can confirm that setting the depth flag doesn't blow up storage due to inefficient local repo maintenance, I don't think we should merge it.

@todaywasawesome
Copy link
Contributor

@yordis I'm setting some time for myself to come back and work on this is the next few weeks.

@yordis yordis force-pushed the add-git-depth branch 2 times, most recently from 246dff8 to 095683d Compare June 6, 2024 04:54
@yordis
Copy link
Author

yordis commented Jun 6, 2024

I rebased, hopefully still working fine

Signed-off-by: Yordis Prieto <yordis.prieto@gmail.com>
@batazor
Copy link

batazor commented Jul 26, 2024

any news?

@yordis
Copy link
Author

yordis commented Jul 27, 2024

🤷🏻

@crenshaw-dev
Copy link
Member

I think we still need evidence that enabling this doesn't blow up storage for a given repo over time.

@crenshaw-dev
Copy link
Member

Putting this here for reference from other places:

Sparse and shallow git repos are common requests for monorepos. People are accustomed to using these features in CI pipelines. But these settings are much easier to get right in CI pipelines because you throw away the clone when you're done with it. Argo CD maintains a persistent clone on the repo-server, allowing concurrent access to the same clone until the repo-server restarts. When managing a persistent clone, you have to handle cases which would be safe to ignore for a throw-away clone.

For example:

  1. What happens if two applications with different depths/paths access the same repo at the same time?
  2. If I change one of these settings, when does it reflect? Immediately? On next checkout?
  3. Is storage efficiency impacted? i.e. is the size of my clone going to blow up over time?
  4. Is CPU use impacted? Will I see CPU spikes due to cleanup processes?
  5. Is there a need for manual cleanups? When do you call them? Are there concurrency concerns when calling them?

The concurrency concerns are different depending on whether you configure depth/paths at the app level or the repo level. If at the repo level, you're less likely to encounter races, but it's still possible.

@crenshaw-dev
Copy link
Member

Specific to this PR: does any depth besides "1" ever make sense for Argo CD's use case?

@jsolana
Copy link
Contributor

jsolana commented Oct 7, 2024

Hi! Just for awareness, trying with depth=1 we found these kind of errors unable to get changed files for repo:

image

In our case, we are using a mono-repo with thousands of commits per week.

For context, we did the next change here to test the impact:

func (m *nativeGitClient) fetch(revision string) error {
	var err error
	if revision != "" {
		err = m.runCredentialedCmd("fetch", "origin", revision, "--tags", "--force", "--prune", "--depth=1")
	} else {
		err = m.runCredentialedCmd("fetch", "origin", "--tags", "--force", "--prune", "--depth=1")
	}
	return err
}

@yordis yordis closed this Oct 7, 2024
@yordis yordis deleted the add-git-depth branch October 7, 2024 15:40
@crenshaw-dev
Copy link
Member

Consolidating conversation here: #11198

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants