forked from git/git
Add experimental 'git survey' builtin (#5174)
This introduces `git survey` to Git for Windows ahead of upstream for the express purpose of getting the path-based analysis in the hands of more folks.

The inspiration of this builtin is [`git-sizer`](https://github.com/github/git-sizer), but since that command relies on `git cat-file --batch` to get the contents of objects, it has limits to how much information it can provide.

This is mostly a rewrite of the `git survey` builtin that was introduced into the `microsoft/git` fork in microsoft#667. That version had a lot more bells and whistles, including an analysis much closer to what `git-sizer` provides.

The biggest difference in this version is that this one is focused on using the path-walk API in order to visit batches of objects based on a common path. This allows identifying, for instance, the path that is contributing the most to the on-disk size across all versions at that path.

For example, here are the top ten paths contributing to my local Git repository (which includes `microsoft/git` and `gitster/git`):

```
TOP FILES BY DISK SIZE
============================================================================
                                    Path | Count | Disk Size | Inflated Size
-----------------------------------------+-------+-----------+--------------
                       whats-cooking.txt |  1373 |  11637459 |      37226854
             t/helper/test-gvfs-protocol |     2 |   6847105 |      17233072
                      git-rebase--helper |     1 |   6027849 |      15269664
                          compat/mingw.c |  6111 |   5194453 |     463466970
             t/helper/test-parse-options |     1 |   3420385 |       8807968
                  t/helper/test-pkt-line |     1 |   3408661 |       8778960
      t/helper/test-dump-untracked-cache |     1 |   3408645 |       8780816
            t/helper/test-dump-fsmonitor |     1 |   3406639 |       8776656
                                po/vi.po |   104 |   1376337 |      51441603
                                po/de.po |   210 |   1360112 |      71198603
```

This kind of analysis has been helpful in identifying the reasons for growth in a few internal monorepos. Those findings motivated the changes in #5157 and #5171.
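The per-path grouping behind a table like the one above can be approximated outside of Git. The following is a minimal sketch, not the path-walk implementation used by the builtin: the function name `top_paths_by_disk_size`, the record layout, and the sample data are all hypothetical. One way to obtain such records would be to pipe `git rev-list --objects --all` into `git cat-file --batch-check` with a format that includes `%(objectsize:disk)`, `%(objectsize)`, and `%(rest)`.

```python
from collections import defaultdict


def top_paths_by_disk_size(records, top=10):
    """Group (path, disk_size, inflated_size) records by path and rank them.

    Each record stands for one object version seen at `path`.
    Returns (path, count, disk_size, inflated_size) tuples, with the
    largest total disk size first.
    """
    totals = defaultdict(lambda: [0, 0, 0])  # path -> [count, disk, inflated]
    for path, disk_size, inflated_size in records:
        entry = totals[path]
        entry[0] += 1
        entry[1] += disk_size
        entry[2] += inflated_size
    rows = [(path, c, d, i) for path, (c, d, i) in totals.items()]
    # Sort by total disk size, descending; break ties by path name.
    rows.sort(key=lambda row: (-row[2], row[0]))
    return rows[:top]


# Hypothetical sample data, not taken from a real repository.
sample = [
    ("docs/notes.txt", 100, 400),
    ("docs/notes.txt", 150, 450),
    ("src/main.c", 300, 900),
    ("README", 20, 50),
]

for row in top_paths_by_disk_size(sample, top=2):
    print("%-16s | %5d | %9d | %13d" % row)
```

Keeping the aggregation separate from the data collection makes it easy to check the ranking on small inputs before pointing it at a large repository.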
With this early version in Git for Windows, we can expand the reach of the experimental tool in advance of it being contributed to the upstream project.

Unfortunately, this will mean that in the next `microsoft/git` rebase, Jeff Hostetler's version will need to be pulled out since there are enough conflicts. These conflicts include how tables are stored and generated, as the version in this PR is slightly more general to allow for different kinds of data.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Showing 13 changed files with 1,139 additions and 0 deletions.
@@ -0,0 +1,14 @@
survey.*::
	These variables adjust the default behavior of the `git survey`
	command. The intention is that this command could be run in the
	background with these options.
+
--
	verbose::
		This boolean value implies the `--[no-]verbose` option.
	progress::
		This boolean value implies the `--[no-]progress` option.
	top::
		This integer value implies `--top=<N>`, specifying the
		number of entries in the detail tables.
--
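As an illustration of how these keys might be set before running the command in the background, the sketch below builds the corresponding `git config` invocations. The helper name `survey_config_commands` is hypothetical; only the three `survey.*` keys come from the text above.

```python
def survey_config_commands(verbose=None, progress=None, top=None):
    """Build `git config` argument lists for the survey.* keys.

    A value of None leaves the corresponding key untouched.
    """
    commands = []
    for key, value in (("survey.verbose", verbose), ("survey.progress", progress)):
        if value is not None:
            commands.append(["git", "config", key, "true" if value else "false"])
    if top is not None:
        commands.append(["git", "config", "survey.top", str(int(top))])
    return commands


# Example: quiet defaults with larger detail tables.
for argv in survey_config_commands(verbose=False, progress=False, top=20):
    print(" ".join(argv))
```

Each returned list can be passed to `subprocess.run` from inside a repository to apply the setting.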
@@ -0,0 +1,83 @@
git-survey(1)
=============

NAME
----
git-survey - EXPERIMENTAL: Measure various repository dimensions of scale

SYNOPSIS
--------
[verse]
(EXPERIMENTAL!) 'git survey' <options>

DESCRIPTION
-----------

Survey the repository and measure various dimensions of scale.

As repositories grow to "monorepo" size, certain data shapes can cause
performance problems. `git-survey` attempts to measure and report on
known problem areas.

Ref Selection and Reachable Objects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In this first analysis phase, `git survey` will iterate over the set of
requested branches, tags, and other refs, walk all of the reachable
commits, trees, and blobs, and generate various statistics.

OPTIONS
-------

--progress::
	Show progress. This is automatically enabled when interactive.

Ref Selection
~~~~~~~~~~~~~

The following options control the set of refs that `git survey` will examine.
By default, `git survey` will look at tags, local branches, and remote refs.
If any of the following options are given, the default set is cleared and
only refs for the given options are added.

--all-refs::
	Use all refs. This includes local branches, tags, remote refs,
	notes, and stashes. This option overrides all of the following.

--branches::
	Add local branches (`refs/heads/`) to the set.

--tags::
	Add tags (`refs/tags/`) to the set.

--remotes::
	Add remote branches (`refs/remotes/`) to the set.

--detached::
	Add HEAD to the set.

--other::
	Add notes (`refs/notes/`) and stashes (`refs/stash/`) to the set.
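
The selection rules above (a default set that is cleared as soon as any option is given, with `--all-refs` taking precedence) can be sketched as a small function. This is an illustration of the documented behavior only, not the builtin's code; the name `selected_ref_kinds` and the kind labels are hypothetical. The text does not say whether `--all-refs` also covers a detached HEAD, so the sketch assumes it does not.

```python
DEFAULT_KINDS = ("tags", "branches", "remotes")
ALL_KINDS = ("branches", "tags", "remotes", "other")


def selected_ref_kinds(all_refs=False, branches=False, tags=False,
                       remotes=False, detached=False, other=False):
    """Return the kinds of refs to examine for a given set of options."""
    if all_refs:
        # --all-refs overrides the other ref-selection options.
        # Assumption: HEAD is not part of this set.
        return list(ALL_KINDS)
    requested = [
        kind
        for kind, wanted in (
            ("branches", branches),
            ("tags", tags),
            ("remotes", remotes),
            ("detached", detached),
            ("other", other),
        )
        if wanted
    ]
    # With no explicit option, fall back to the documented default set.
    return requested if requested else list(DEFAULT_KINDS)


print(selected_ref_kinds())
print(selected_ref_kinds(tags=True))
print(selected_ref_kinds(all_refs=True, tags=True))
```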

OUTPUT
------

By default, `git survey` will print information about the repository in a
human-readable format that includes overviews and tables.

References Summary
~~~~~~~~~~~~~~~~~~

The references summary includes a count of each kind of reference,
including branches, remote refs, and tags (split by "all" and
"annotated").

Reachable Object Summary
~~~~~~~~~~~~~~~~~~~~~~~~

The reachable object summary shows the total number of each kind of Git
object, including tags, commits, trees, and blobs.

GIT
---
Part of the linkgit:git[1] suite