-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Maintenance: add pack-refs task #871
Maintenance: add pack-refs task #871
Conversation
316f455
to
8012d2d
Compare
/submit |
Submitted as pull.871.git.1612795943.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Taylor Blau wrote (reply to this):
|
User |
@@ -145,6 +145,12 @@ incremental-repack:: | |||
which is a special case that attempts to repack all pack-files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Taylor Blau wrote (reply to this):
On Mon, Feb 08, 2021 at 02:52:22PM +0000, Derrick Stolee via GitGitGadget wrote:
> +pack-refs::
> + The `pack-refs` task collects the loose reference files and
> + collects them into a single file. This speeds up operations that
> + need to iterate across many refereences. See linkgit:git-pack-refs[1]
> + for more information.
> +
Do you think it's worth documenting here or in git-gc(1) that running
this has the effect of disabling the same step during gc?
> OPTIONS
> -------
> --auto::
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 4c40594d660e..8f13c3bb8607 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -54,7 +54,6 @@ static const char *prune_worktrees_expire = "3.months.ago";
> static unsigned long big_pack_threshold;
> static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
>
> -static struct strvec pack_refs_cmd = STRVEC_INIT;
> static struct strvec reflog = STRVEC_INIT;
> static struct strvec repack = STRVEC_INIT;
> static struct strvec prune = STRVEC_INIT;
> @@ -163,6 +162,15 @@ static void gc_config(void)
> git_config(git_default_config, NULL);
> }
>
> +struct maintenance_run_opts;
> +static int maintenance_task_pack_refs(struct maintenance_run_opts *opts)
It may be worth calling this "unused", since you explicitly pass NULL
below.
> +{
> + struct strvec pack_refs_cmd = STRVEC_INIT;
> + strvec_pushl(&pack_refs_cmd, "pack-refs", "--all", "--prune", NULL);
> +
> + return run_command_v_opt(pack_refs_cmd.v, RUN_GIT_CMD);
> +}
> +
> static int too_many_loose_objects(void)
> {
> /*
> @@ -518,8 +526,8 @@ static void gc_before_repack(void)
> if (done++)
> return;
>
> - if (pack_refs && run_command_v_opt(pack_refs_cmd.v, RUN_GIT_CMD))
> - die(FAILED_RUN, pack_refs_cmd.v[0]);
> + if (pack_refs && maintenance_task_pack_refs(NULL))
> + die(FAILED_RUN, "pack-refs");
Hmm. Am I missing where opting into the maintenance step disables this
in gc? You suggest that it does in the commit message, but I can't seem
to see it here.
> +test_expect_success 'pack-refs task' '
> + for n in $(test_seq 1 5)
> + do
> + git branch -f to-pack/$n HEAD || return 1
> + done &&
> + git maintenance run --task=pack-refs &&
> + ls .git/refs/heads/ >after &&
> + test_must_be_empty after
Worth testing that $GIT_DIR/packed-refs exists, too?
Thanks,
Taylor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Derrick Stolee wrote (reply to this):
On 2/8/2021 5:53 PM, Taylor Blau wrote:
> On Mon, Feb 08, 2021 at 02:52:22PM +0000, Derrick Stolee via GitGitGadget wrote:
>> +pack-refs::
>> + The `pack-refs` task collects the loose reference files and
>> + collects them into a single file. This speeds up operations that
>> + need to iterate across many refereences. See linkgit:git-pack-refs[1]
>> + for more information.
>> +
>
> Do you think it's worth documenting here or in git-gc(1) that running
> this has the effect of disabling the same step during gc?
It doesn't disable the step during gc.
Perhaps I should use a better term than "extract". The 'gc' task shouldn't
change as we make some of its steps also available as independent tasks.
Instead, users can select a subset of its steps by enabling them directly.
Other such tasks could include:
* prune-reflog
* full-repack (as opposed to the existing incremental-repack task)
>> +struct maintenance_run_opts;
>> +static int maintenance_task_pack_refs(struct maintenance_run_opts *opts)
>
> It may be worth calling this "unused", since you explicitly pass NULL
> below.
Good idea.
>> + if (pack_refs && maintenance_task_pack_refs(NULL))
>> + die(FAILED_RUN, "pack-refs");
>
> Hmm. Am I missing where opting into the maintenance step disables this
> in gc? You suggest that it does in the commit message, but I can't seem
> to see it here.
My commit message suggests wrong.
>> +test_expect_success 'pack-refs task' '
>> + for n in $(test_seq 1 5)
>> + do
>> + git branch -f to-pack/$n HEAD || return 1
>> + done &&
>> + git maintenance run --task=pack-refs &&
>> + ls .git/refs/heads/ >after &&
>> + test_must_be_empty after
>
> Worth testing that $GIT_DIR/packed-refs exists, too?
That would start breaking if we change the backing store, right? I
want to be sure this doesn't create test breakages with reftable.
I _could_ add a 'test_subcommand' check here, though.
Thanks,
-Stolee
Nice work Derrick. Could you please correct my name to 'Son Luong'? Thanks! |
@@ -145,6 +145,12 @@ incremental-repack:: | |||
which is a special case that attempts to repack all pack-files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Eric Sunshine wrote (reply to this):
On Mon, Feb 8, 2021 at 9:52 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> +pack-refs::
> + The `pack-refs` task collects the loose reference files and
> + collects them into a single file. This speeds up operations that
> + need to iterate across many refereences. See linkgit:git-pack-refs[1]
> + for more information.
s/refereences/references/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Derrick Stolee wrote (reply to this):
On 2/8/2021 6:06 PM, Eric Sunshine wrote:
> On Mon, Feb 8, 2021 at 9:52 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> +pack-refs::
>> + The `pack-refs` task collects the loose reference files and
>> + collects them into a single file. This speeds up operations that
>> + need to iterate across many refereences. See linkgit:git-pack-refs[1]
>> + for more information.
>
> s/refereences/references/
Thanks!
-Stolee
User |
This branch is now known as |
This patch series was integrated into seen via git@93756a6. |
User |
It is valuable to collect loose refs into a more compressed form. This is typically the packed-refs file, although this could be the reftable in the future. Having packed refs can be extremely valuable in repos with many tags or remote branches that are not modified by the local user, but still are necessary for other queries. For instance, with many exploded refs, commands such as git describe --tags --exact-match HEAD can be very slow (multiple seconds). This command in particular is used by terminal prompts to show when a detatched HEAD is pointing to an existing tag, so having it be slow causes significant delays for users. Add a new 'pack-refs' maintenance task. It runs 'git pack-refs --all --prune' to move loose refs into a packed form. For now, that is the packed-refs file, but could adjust to other file formats in the future. This is the first of several sub-tasks of the 'gc' task that could be extracted to their own tasks. In this process, we should not change the behavior of the 'gc' task since that remains the default way to keep repositories maintained. Creating a new task for one of these sub-tasks only provides more customization options for those choosing to not use the 'gc' task. It is certainly possible to have both the 'gc' and 'pack-refs' tasks enabled and run regularly. While they may repeat effort, they do not conflict in a destructive way. The 'auto_condition' function pointer is left NULL for now. We could extend this in the future to have a condition check if pack-refs should be run during 'git maintenance run --auto'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
When the 'maintenance.strategy' config option is set to 'incremental', a default maintenance schedule is enabled. Add the 'pack-refs' task to that strategy at the weekly cadence. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
8012d2d
to
c38fc9a
Compare
/submit |
Submitted as pull.871.v2.git.1612878149.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
This patch series was integrated into seen via git@e371173. |
This patch series was integrated into seen via git@7b2b219. |
On the Git mailing list, Taylor Blau wrote (reply to this):
|
This patch series was integrated into seen via git@97d06dd. |
This patch series was integrated into seen via git@befa7f8. |
This patch series was integrated into seen via git@79d2bc6. |
This patch series was integrated into seen via git@d28aa44. |
This patch series was integrated into next via git@36f56bc. |
This patch series adds a new
pack-refs
task to the maintenance builtin. This operation already happens withingit gc
(and hence thegc
task) but it is easy to extract. Packing refs does not delete any data, only collects loose objects into a combined file. This makes things faster in subtle ways, especially when a command needs to iterate through refs (especially tags).Credit for inspiring this goes to Suolong, who asked for this to be added to Scalar [1]. I've been waiting instead to add it directly to Git and its background maintenance. Now is the time!
[1] microsoft/scalar#382
I chose to add it to the incremental maintenance strategy at a weekly cadence. I'm not sure there is significant value to the difference between weekly and daily. It just seems to me that weekly is often enough. Feel free to correct me if you have a different opinion.
My hope is that this patch series could be used as an example for further extracting tasks out of the
gc
task and making them be full maintenance tasks. Doing more of these extractions could be a good project for a new contributor.One thing that is not implemented in this series is a notion of the behavior for the
pack-refs
task duringgit maintenance run --auto
. This could be added in the future, but I wanted to focus on getting this behavior into the incremental maintenance schedule.Updates in V2
Thanks,
-Stolee
Cc: gitster@pobox.com
Cc: sluongng@gmail.com
Cc: martin.agren@gmail.com
Cc: sunshine@sunshineco.com
cc: Taylor Blau me@ttaylorr.com
cc: Eric Sunshine sunshine@sunshineco.com
cc: Derrick Stolee stolee@gmail.com