Merge pull request #410: Sparse Index: latest integrations
```
6e74958 p2000: add 'git checkout -' test and decrease depth
3e1d03c p2000: compress repo names
cd94f82 commit: integrate with sparse-index
65e79b8 sparse-index: recompute cache-tree
e9a9981 checkout: stop expanding sparse indexes
4b801c8 t1092: document bad 'git checkout' behavior
71e3015 unpack-trees: resolve sparse-directory/file conflicts
5e96df4 t1092: test merge conflicts outside cone
defab1b add: allow operating on a sparse-only index
9fc4313 pathspec: stop calling ensure_full_index
0ec03ab add: ignore outside the sparse-checkout in refresh()
adf5b15 add: remove ensure_full_index() with --renormalize
```

These commits are equivalent to those already in `next` via gitgitgadget#999.

```
80b8d6c Merge branch 'sparse-index/add' into stolee/sparse-index/add
```

This merge resolves conflicts with work that happened in parallel and is already in upstream `master`.

```
c407b2c t7519: rewrite sparse index test
9dad0d2 sparse-index: silently return when not using cone-mode patterns
2974920 sparse-index: silently return when cache tree fails
e7cdaa0 unpack-trees: fix nested sparse-dir search
347410c sparse-checkout: create helper methods
4537233 attr: be careful about sparse directories
5282a86 sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag
3a2f316 sparse-checkout: clear tracked sparse dirs
fb47b56 sparse-checkout: add config to disable deleting dirs
```

These commits are the ones under review as of gitgitgadget#1009. Recent review feedback has made this series less stable. It is a slightly different and more robust version of #396.

> Note: I'm still not done with the feedback for upstream, but the remaining feedback is "can we add tests that cover these tricky technical bits?" and in `microsoft/git` these are already covered by the Scalar functional tests (since that's how they were found).

```
080b02c diff: ignore sparse paths in diffstat
d91a647 merge: make sparse-aware with ORT
df49b5f merge-ort: expand only for out-of-cone conflicts
cdecb85 t1092: add cherry-pick, rebase tests
0c1ecfb sequencer: ensure full index if not ORT strategy
406dfbe sparse-index: integrate with cherry-pick and rebase
```

These commits integrate with `git merge`, `git cherry-pick`, `git revert`, and `git rebase` as of gitgitgadget#1019. Feedback on that series changed how the tests work, making them more robust, and led to a new commit (0c1ecfb).

```
cbb0ab3 Merge branch 'sparse-index/merge' into vfs-2.33.0
acb8623 t7524: test no longer fails
```

Finally, the commits are merged into `vfs-2.33.0`, along with an update to a `microsoft/git` test that no longer fails.

Cc: @vdye and @ldennington to get a (possibly overwhelming?) taste of sparse-index stuff. If you focus solely on the `git merge` commits you'll get a feel for what a sparse index integration looks like.
derrickstolee authored Aug 24, 2021
2 parents d5ec357 + acb8623 commit 4bcd533
Showing 23 changed files with 524 additions and 76 deletions.
6 changes: 6 additions & 0 deletions Documentation/config/index.txt
@@ -1,3 +1,9 @@
index.deleteSparseDirectories::
When enabled, the cone mode sparse-checkout feature will delete
directories that are outside of the sparse-checkout cone, unless
such a directory contains an untracked, non-ignored file. Defaults
to true.

index.recordEndOfIndexEntries::
Specifies whether the index file should include an "End Of Index
Entry" section. This reduces index load time on multiprocessor
10 changes: 10 additions & 0 deletions Documentation/git-sparse-checkout.txt
@@ -210,6 +210,16 @@ case-insensitive check. This corrects for case mismatched filenames in the
'git sparse-checkout set' command to reflect the expected cone in the working
directory.

When changing the sparse-checkout patterns in cone mode, Git will inspect each
tracked directory that is not within the sparse-checkout cone to see if it
contains any untracked files. If all of those files are ignored due to the
`.gitignore` patterns, then the directory will be deleted. If any of the
untracked files within that directory is not ignored, then no deletions will
occur within that directory and a warning message will appear. If these files
are important, then reset your sparse-checkout definition so they are included,
use `git add` and `git commit` to store them, then remove any remaining files
manually to ensure Git can behave optimally.
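
The rule in the paragraph above can be modeled roughly as follows. This is a simplified sketch, not the code Git runs: the names `toy_untracked` and `toy_dir_is_deletable` are hypothetical, and the caller is assumed to have already listed the untracked files and classified each one as ignored or not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one untracked file found in a directory. */
struct toy_untracked {
	const char *name;
	int ignored;	/* nonzero if a .gitignore pattern matches */
};

/*
 * Returns 1 when the directory may be deleted: it has no untracked
 * files, or every untracked file in it is ignored. Returns 0 (keep
 * the directory and warn) as soon as one non-ignored file is found.
 */
static int toy_dir_is_deletable(const struct toy_untracked *files, size_t nr)
{
	size_t i;

	assert(files || !nr);
	for (i = 0; i < nr; i++)
		if (!files[i].ignored)
			return 0;
	return 1;
}
```

A single non-ignored untracked file is enough to keep the whole directory, which matches the "no deletions will occur within that directory" wording.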


SUBMODULES
----------
14 changes: 14 additions & 0 deletions attr.c
@@ -14,6 +14,7 @@
#include "utf8.h"
#include "quote.h"
#include "thread-utils.h"
#include "dir.h"

const char git_attr__true[] = "(builtin)true";
const char git_attr__false[] = "\0(builtin)false";
@@ -744,6 +745,19 @@ static struct attr_stack *read_attr_from_index(struct index_state *istate,
if (!istate)
return NULL;

/*
* The .gitattributes file only applies to files within its
* parent directory. In the case of cone-mode sparse-checkout,
* the .gitattributes file is sparse if and only if all paths
* within that directory are also sparse. Thus, don't load the
* .gitattributes file since it will not matter.
*
* In the case of a sparse index, it is critical that we don't go
* looking for a .gitattributes file, as the index will expand.
*/
if (!path_in_cone_modesparse_checkout(path, istate))
return NULL;

buf = read_blob_data_from_index(istate, path, NULL);
if (!buf)
return NULL;
10 changes: 7 additions & 3 deletions builtin/add.c
@@ -144,8 +144,6 @@ static int renormalize_tracked_files(const struct pathspec *pathspec, int flags)
 {
 int i, retval = 0;

-/* TODO: audit for interaction with sparse-index. */
-ensure_full_index(&the_index);
 for (i = 0; i < active_nr; i++) {
 struct cache_entry *ce = active_cache[i];

@@ -198,7 +196,10 @@ static int refresh(int verbose, const struct pathspec *pathspec)
 _("Unstaged changes after refreshing the index:"));
 for (i = 0; i < pathspec->nr; i++) {
 if (!seen[i]) {
-if (matches_skip_worktree(pathspec, i, &skip_worktree_seen)) {
+const char *path = pathspec->items[i].original;
+
+if (matches_skip_worktree(pathspec, i, &skip_worktree_seen) ||
+!path_in_sparse_checkout(path, &the_index)) {
 string_list_append(&only_match_skip_worktree,
 pathspec->items[i].original);
 } else {
@@ -532,6 +533,9 @@ int cmd_add(int argc, const char **argv, const char *prefix)
add_new_files = !take_worktree_changes && !refresh_only && !add_renormalize;
require_pathspec = !(take_worktree_changes || (0 < addremove_explicit));

prepare_repo_settings(the_repository);
the_repository->settings.command_requires_full_index = 0;

hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);

/*
3 changes: 3 additions & 0 deletions builtin/merge.c
@@ -1276,6 +1276,9 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
if (argc == 2 && !strcmp(argv[1], "-h"))
usage_with_options(builtin_merge_usage, builtin_merge_options);

prepare_repo_settings(the_repository);
the_repository->settings.command_requires_full_index = 0;

/*
* Check if we are _not_ on a detached HEAD, i.e. if there is a
* current branch.
6 changes: 6 additions & 0 deletions builtin/rebase.c
@@ -559,6 +559,9 @@ int cmd_rebase__interactive(int argc, const char **argv, const char *prefix)
argc = parse_options(argc, argv, prefix, options,
builtin_rebase_interactive_usage, PARSE_OPT_KEEP_ARGV0);

prepare_repo_settings(the_repository);
the_repository->settings.command_requires_full_index = 0;

if (!is_null_oid(&squash_onto))
opts.squash_onto = &squash_onto;

@@ -1430,6 +1433,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
usage_with_options(builtin_rebase_usage,
builtin_rebase_options);

prepare_repo_settings(the_repository);
the_repository->settings.command_requires_full_index = 0;

options.allow_empty_message = 1;
git_config(rebase_config, &options);
/* options.gpg_sign_opt will be either "-S" or NULL */
3 changes: 3 additions & 0 deletions builtin/revert.c
@@ -136,6 +136,9 @@ static int run_sequencer(int argc, const char **argv, struct replay_opts *opts)
PARSE_OPT_KEEP_ARGV0 |
PARSE_OPT_KEEP_UNKNOWN);

prepare_repo_settings(the_repository);
the_repository->settings.command_requires_full_index = 0;

/* implies allow_empty */
if (opts->keep_redundant_commits)
opts->allow_empty = 1;
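
The same two-line opt-in appears in `builtin/add.c`, `builtin/merge.c`, `builtin/rebase.c`, and `builtin/revert.c` above. The sketch below is one way to picture why it matters. It is a toy model with hypothetical names (`toy_settings`, `toy_index`, `toy_read_index`), and it assumes that the setting defaults to requiring a full index and that the index is expanded on read while the setting is on.

```c
#include <assert.h>

/* Hypothetical stand-ins; the real structures are far richer. */
struct toy_settings {
	int command_requires_full_index;
};

struct toy_index {
	int sparse;	/* 1 while sparse-directory entries are collapsed */
};

/* Assumed default: commands get a full index unless they opt in. */
static void toy_prepare_settings(struct toy_settings *s)
{
	assert(s);
	s->command_requires_full_index = 1;
}

/* On read, expand a sparse index only for commands that did not opt in. */
static void toy_read_index(struct toy_index *idx, const struct toy_settings *s)
{
	assert(idx && s);
	if (idx->sparse && s->command_requires_full_index)
		idx->sparse = 0;	/* models a call to ensure_full_index() */
}
```

In this model, a command that clears the flag before reading keeps the index sparse, and each code path that cannot handle sparse-directory entries has to expand the index itself.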
102 changes: 102 additions & 0 deletions builtin/sparse-checkout.c
@@ -100,6 +100,106 @@ static int sparse_checkout_list(int argc, const char **argv)
return 0;
}

static void clean_tracked_sparse_directories(struct repository *r)
{
int i, value, was_full = 0;
struct strbuf path = STRBUF_INIT;
size_t pathlen;
struct string_list_item *item;
struct string_list sparse_dirs = STRING_LIST_INIT_DUP;

/*
* If we are not using cone mode patterns, then we cannot
* delete directories outside of the sparse cone.
*/
if (!r || !r->index || !r->worktree)
return;
init_sparse_checkout_patterns(r->index);
if (!r->index->sparse_checkout_patterns ||
!r->index->sparse_checkout_patterns->use_cone_patterns)
return;

/*
* Users can disable this behavior.
*/
if (!repo_config_get_bool(r, "index.deletesparsedirectories", &value) &&
!value)
return;

/*
* Use the sparse index as a data structure to assist finding
* directories that are safe to delete. This conversion to a
* sparse index will not delete directories that contain
* conflicted entries or submodules.
*/
if (!r->index->sparse_index) {
/*
* If something, such as a merge conflict or other concern,
* prevents us from converting to a sparse index, then do
* not try deleting files.
*/
if (convert_to_sparse(r->index, SPARSE_INDEX_MEMORY_ONLY))
return;
was_full = 1;
}

strbuf_addstr(&path, r->worktree);
strbuf_complete(&path, '/');
pathlen = path.len;

/*
* Collect directories that have gone out of scope but also
* exist on disk, so there is some work to be done. We need to
* store the entries in a list before exploring, since that might
* expand the sparse-index again.
*/
for (i = 0; i < r->index->cache_nr; i++) {
struct cache_entry *ce = r->index->cache[i];

if (S_ISSPARSEDIR(ce->ce_mode) &&
repo_file_exists(r, ce->name))
string_list_append(&sparse_dirs, ce->name);
}

for_each_string_list_item(item, &sparse_dirs) {
struct dir_struct dir = DIR_INIT;
struct pathspec p = { 0 };
struct strvec s = STRVEC_INIT;

strbuf_setlen(&path, pathlen);
strbuf_addstr(&path, item->string);

dir.flags |= DIR_SHOW_IGNORED_TOO;

setup_standard_excludes(&dir);
strvec_push(&s, path.buf);

parse_pathspec(&p, PATHSPEC_GLOB, 0, NULL, s.v);
fill_directory(&dir, r->index, &p);

if (dir.nr) {
warning(_("directory '%s' contains untracked files,"
" but is not in the sparse-checkout cone"),
item->string);
} else if (remove_dir_recursively(&path, 0)) {
/*
* Removal is "best effort". If something blocks
* the deletion, then continue with a warning.
*/
warning(_("failed to remove directory '%s'"),
item->string);
}

dir_clear(&dir);
}

string_list_clear(&sparse_dirs, 0);
strbuf_release(&path);

if (was_full)
ensure_full_index(r->index);
}

static int update_working_directory(struct pattern_list *pl)
{
enum update_sparsity_result result;
@@ -141,6 +241,8 @@ static int update_working_directory(struct pattern_list *pl)
else
rollback_lock_file(&lock_file);

clean_tracked_sparse_directories(r);

r->index->sparse_checkout_patterns = NULL;
return result;
}
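
The `index.deletesparsedirectories` check in `clean_tracked_sparse_directories()` only returns early when the key is set and false, so an unset key leaves the cleanup enabled. A minimal sketch of that "defaults to true" pattern follows. The lookup table and the names `toy_config_entry`, `toy_config_get_bool`, and `toy_delete_sparse_dirs_enabled` are hypothetical and are not Git's config machinery. The sketch only assumes a getter that returns 0 when the key is found and nonzero otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value entry; not Git's config machinery. */
struct toy_config_entry {
	const char *key;
	int value;
};

/*
 * Returns 0 and stores the value in *dest when the key is set,
 * or 1 when the key is missing (*dest is left untouched).
 */
static int toy_config_get_bool(const struct toy_config_entry *entries,
			       size_t nr, const char *key, int *dest)
{
	size_t i;

	assert(key && dest);
	for (i = 0; i < nr; i++) {
		if (!strcmp(entries[i].key, key)) {
			*dest = entries[i].value;
			return 0;
		}
	}
	return 1;
}

/* The feature stays on unless the key is explicitly set to false. */
static int toy_delete_sparse_dirs_enabled(const struct toy_config_entry *entries,
					  size_t nr)
{
	int value;

	if (!toy_config_get_bool(entries, nr,
				 "index.deletesparsedirectories", &value) &&
	    !value)
		return 0;
	return 1;
}
```

The short-circuit `&&` means `value` is read only after a successful lookup, so it does not need an initializer.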
8 changes: 8 additions & 0 deletions diff.c
@@ -26,6 +26,7 @@
#include "parse-options.h"
#include "help.h"
#include "promisor-remote.h"
#include "dir.h"

#ifdef NO_FAST_WORKING_DIRECTORY
#define FAST_WORKING_DIRECTORY 0
@@ -3900,6 +3901,13 @@ static int reuse_worktree_file(struct index_state *istate,
if (!FAST_WORKING_DIRECTORY && !want_file && has_object_pack(oid))
return 0;

/*
* If this path does not match our sparse-checkout definition,
* then the file will not be in the working directory.
*/
if (!path_in_sparse_checkout(name, istate))
return 0;

/*
* Similarly, if we'd have to convert the file contents anyway, that
* makes the optimization not worthwhile.
54 changes: 54 additions & 0 deletions dir.c
@@ -1494,6 +1494,60 @@ enum pattern_match_result path_matches_pattern_list(
return result;
}

int init_sparse_checkout_patterns(struct index_state *istate)
{
if (!core_apply_sparse_checkout ||
istate->sparse_checkout_patterns)
return 0;

CALLOC_ARRAY(istate->sparse_checkout_patterns, 1);

if (get_sparse_checkout_patterns(istate->sparse_checkout_patterns) < 0) {
FREE_AND_NULL(istate->sparse_checkout_patterns);
return -1;
}

return 0;
}

static int path_in_sparse_checkout_1(const char *path,
struct index_state *istate,
int require_cone_mode)
{
const char *base;
int dtype = DT_REG;
init_sparse_checkout_patterns(istate);

/*
* We default to accepting a path if there are no patterns or
* they are of the wrong type.
*/
if (!istate->sparse_checkout_patterns ||
(require_cone_mode &&
!istate->sparse_checkout_patterns->use_cone_patterns))
return 1;



base = strrchr(path, '/');
return path_matches_pattern_list(path, strlen(path), base ? base + 1 : path,
&dtype,
istate->sparse_checkout_patterns,
istate) > 0;
}

int path_in_sparse_checkout(const char *path,
struct index_state *istate)
{
return path_in_sparse_checkout_1(path, istate, 0);
}

int path_in_cone_modesparse_checkout(const char *path,
struct index_state *istate)
{
return path_in_sparse_checkout_1(path, istate, 1);
}

static struct path_pattern *last_matching_pattern_from_lists(
struct dir_struct *dir, struct index_state *istate,
const char *pathname, int pathlen,
8 changes: 8 additions & 0 deletions dir.h
@@ -394,6 +394,14 @@ enum pattern_match_result path_matches_pattern_list(const char *pathname,
const char *basename, int *dtype,
struct pattern_list *pl,
struct index_state *istate);

int init_sparse_checkout_patterns(struct index_state *state);

int path_in_sparse_checkout(const char *path,
struct index_state *istate);
int path_in_cone_modesparse_checkout(const char *path,
struct index_state *istate);

struct dir_entry *dir_add_ignored(struct dir_struct *dir,
struct index_state *istate,
const char *pathname, int len);
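
To make the idea of a path being "in the cone" concrete, here is a simplified model of cone-mode membership. It is only a sketch under stated assumptions: the real helpers delegate to `path_matches_pattern_list()`, whereas this toy takes a plain list of recursively included directories. The names `toy_is_within` and `toy_path_in_cone` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if dir[0..dir_len) equals base[0..base_len) or lies below it. */
static int toy_is_within(const char *dir, size_t dir_len,
			 const char *base, size_t base_len)
{
	if (dir_len < base_len || strncmp(dir, base, base_len))
		return 0;
	return dir_len == base_len || dir[base_len] == '/';
}

/*
 * Simplified cone-mode check for a file path. A NULL list models
 * "no patterns", where every path is accepted.
 */
static int toy_path_in_cone(const char *path,
			    const char **recursive_dirs, size_t nr)
{
	const char *slash;
	size_t i, parent_len;

	assert(path);
	if (!recursive_dirs)
		return 1;

	slash = strrchr(path, '/');
	if (!slash)
		return 1;	/* files at the root are always included */
	parent_len = (size_t)(slash - path);

	for (i = 0; i < nr; i++) {
		const char *dir = recursive_dirs[i];
		size_t dir_len = strlen(dir);

		/* The file lives at or below a recursive directory. */
		if (toy_is_within(path, parent_len, dir, dir_len))
			return 1;
		/* The file sits directly in an ancestor of one. */
		if (toy_is_within(dir, dir_len, path, parent_len))
			return 1;
	}
	return 0;
}
```

With `src/lib` as the only recursive directory, `README`, `src/Makefile`, and `src/lib/a.c` are inside the cone, while `src/other/c.c` and `src/library/x.c` are not.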
15 changes: 15 additions & 0 deletions merge-ort.c
@@ -4058,6 +4058,21 @@ static int record_conflicted_index_entries(struct merge_options *opt)
if (strmap_empty(&opt->priv->conflicted))
return 0;

/*
* We are in a conflicted state. These conflicts might be inside
* sparse-directory entries, so check if any entries are outside
* of the sparse-checkout cone preemptively.
*
* We set original_cache_nr below, but that might change if
* index_name_pos() calls ask for paths within sparse directories.
*/
strmap_for_each_entry(&opt->priv->conflicted, &iter, e) {
if (!path_in_sparse_checkout(e->key, index)) {
ensure_full_index(index);
break;
}
}

/* If any entries have skip_worktree set, we'll have to check 'em out */
state.force = 1;
state.quiet = 1;
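
The loop added above expands the index at most once, and only when some conflicted path falls outside the cone. The sketch below isolates that pattern. It uses hypothetical names (`toy_index`, `toy_in_cone_dir`, `toy_prepare_for_conflicts`) and assumes a cone made of a single directory, which is far simpler than the real pattern matching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical index model that counts how often it was expanded. */
struct toy_index {
	int sparse;
	int expansions;
};

/* Models ensure_full_index(): a no-op when the index is already full. */
static void toy_ensure_full_index(struct toy_index *idx)
{
	if (!idx->sparse)
		return;
	idx->sparse = 0;
	idx->expansions++;
}

/* Toy membership test: the cone is everything below one directory. */
static int toy_in_cone_dir(const char *path, const char *cone_dir)
{
	size_t len = strlen(cone_dir);

	return !strncmp(path, cone_dir, len) && path[len] == '/';
}

/* Expand once, and only if some conflicted path is outside the cone. */
static void toy_prepare_for_conflicts(struct toy_index *idx,
				      const char **conflicted, size_t nr,
				      const char *cone_dir)
{
	size_t i;

	assert(idx && cone_dir);
	for (i = 0; i < nr; i++) {
		if (!toy_in_cone_dir(conflicted[i], cone_dir)) {
			toy_ensure_full_index(idx);
			break;
		}
	}
}
```

When every conflict is inside the cone, the index stays sparse, which is the case the integration is meant to keep fast.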
3 changes: 3 additions & 0 deletions merge-recursive.c
@@ -3750,6 +3750,9 @@ int merge_recursive(struct merge_options *opt,
assert(opt->ancestor == NULL ||
!strcmp(opt->ancestor, "constructed merge base"));

prepare_repo_settings(opt->repo);
opt->repo->settings.command_requires_full_index = 1;

if (merge_start(opt, repo_get_commit_tree(opt->repo, h1)))
return -1;
clean = merge_recursive_internal(opt, h1, h2, merge_bases, result);
2 changes: 0 additions & 2 deletions pathspec.c
@@ -37,8 +37,6 @@ void add_pathspec_matches_against_index(const struct pathspec *pathspec,
 num_unmatched++;
 if (!num_unmatched)
 return;
-/* TODO: audit for interaction with sparse-index. */
-ensure_full_index(istate);
 for (i = 0; i < istate->cache_nr; i++) {
 const struct cache_entry *ce = istate->cache[i];
 if (sw_action == PS_IGNORE_SKIP_WORKTREE && ce_skip_worktree(ce))
4 changes: 2 additions & 2 deletions read-cache.c
@@ -3111,7 +3111,7 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l
 int ret;
 int was_full = !istate->sparse_index;

-ret = convert_to_sparse(istate);
+ret = convert_to_sparse(istate, 0);

 if (ret) {
 warning(_("failed to convert to a sparse-index"));
@@ -3224,7 +3224,7 @@ static int write_shared_index(struct index_state *istate,
 int ret, was_full = !istate->sparse_index;

 move_cache_to_base_index(istate);
-convert_to_sparse(istate);
+convert_to_sparse(istate, 0);

 trace2_region_enter_printf("index", "shared/do_write_index",
 the_repository, "%s", get_tempfile_path(*temp));