Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rebase -i: fix re-reading the todo list when newly created objects are referenced #2121

Merged
merged 4 commits into from
May 7, 2019

Conversation

dscho
Copy link
Member

@dscho dscho commented Mar 11, 2019

In the ever-green.sh script, we make use of an obscure, yet powerful feature of the interactive rebase: we use an exec command to augment the todo list (to allow for a 2-phase rebase to be executed in a single one, so that the second phase will be executed even if a merge conflict had to be resolved in the first phase).

However, this poses a challenge with the new loose object cache, as we might reference newly-created commits in that new todo list part, and the loose object cache might pretend that those commits do not exist.

Let's fix that.

@dscho
Copy link
Member Author

dscho commented Mar 12, 2019

Cc @peff (because there might very well be more use cases where the loose object cache goes stale and would need to either be blown away, or simply disabled).

@peff
Copy link

peff commented Mar 12, 2019

and the loose object cache might pretend that those commits do not exist.

Can you clarify the symptoms of the bug a little more? Specifically, which use of the cache is having problems?

Since the cache is otherwise only used for OBJECT_INFO_QUICK queries, my guess is that this has to do with the abbrev code? We'd normally drop-and-reload the cache when we hit a missing object, the same way we trigger reprepare_packed_git(). I'm wondering if we could/should do the same in the abbreviation code. It's looking for near-miss objects, but in doing so it should always come across the actual object name we're abbreviating (either as loose or packed). If not, should we consider doing the same reprepare process to pick up new objects?

@peff
Copy link

peff commented Mar 12, 2019

By the way, if the symptom is in the abbrev code, I suspect the problem you are seeing would pre-date the recent loose-object-cache work. That had its own cache since cc817ca (which couldn't even be invalidated!). I'm guessing it's coming up now because of moving rebase -i into C?

@dscho
Copy link
Member Author

dscho commented Mar 12, 2019

This is the symptom after appending some lines to the todo list:

error: invalid line 515: fixup 7fbdafc914 fixup! ident: add the ability to provide a "fallback identity"
error: invalid line 523: fixup e48378074f fixup! stash: add tests for `git stash show` config
error: invalid line 529: fixup dc857a9f9f fixup! stash: convert drop and clear to builtin
[...]

I did single-step through one of those, and it fails at https://github.com/git-for-windows/git/blob/v2.21.0.windows.1/sequencer.c#L2132:

	status = get_oid(bol, &commit_oid);

I had not actually debugged further, but instead I came up with the fix in this PR, on the hunch that it might be the loose object cache. And it seems to have fixed the problem...

I guess I really should add a regression test ;-)

@peff
Copy link

peff commented Mar 12, 2019

Hmm. That get_oid call would not be using the loose object cache at all. So I'd guess that the problem is that the abbreviated hash is ambiguous, because it was generated without the newly-mentioned commits. But at 10 characters, it seems unlikely to collide (and to collide three times). So I'm quite confused about how the loose object cache is even at play here.

@dscho
Copy link
Member Author

dscho commented Mar 12, 2019

That get_oid call would not be using the loose object cache at all.

But it would... this is the backtrace:

#0  odb_loose_cache (odb=0x1e2e3d0, oid=0x7ffffd957025) at sha1-file.c:2227
#1  0x0000000000603658 in find_short_object_filename (ds=0x7ffffd956fe0) at sha1-name.c:98
#2  0x0000000000604371 in get_short_oid (name=0x3b121d4 "6392f5b2fd", len=10, oid=0x7ffffd957280, flags=0) at sha1-name.c:441
#3  0x0000000000605e03 in get_oid_1 (name=0x3b121d4 "6392f5b2fd", len=10, oid=0x7ffffd957280, lookup_flags=0) at sha1-name.c:1127
#4  0x000000000060745a in get_oid_with_context_1 (repo=0x97e440 <the_repo>, name=0x3b121d4 "6392f5b2fd", flags=0, prefix=0x0, oid=0x7ffffd957280,
    oc=0x7ffffd957200) at sha1-name.c:1693
#5  0x0000000000607a37 in get_oid_with_context (repo=0x97e440 <the_repo>, str=0x3b121d4 "6392f5b2fd", flags=0, oid=0x7ffffd957280, oc=0x7ffffd957200)
    at sha1-name.c:1831
#6  0x0000000000606d87 in get_oid (name=0x3b121d4 "6392f5b2fd", oid=0x7ffffd957280) at sha1-name.c:1518
[...]

... and a patch to demonstrate the problem (which I will submit to the Git mailing list Real Soon Now):

diff --git a/t/t3429-rebase-edit-todo.sh b/t/t3429-rebase-edit-todo.sh
index b9292dfc2a..d78f057f35 100755
--- a/t/t3429-rebase-edit-todo.sh
+++ b/t/t3429-rebase-edit-todo.sh
@@ -11,4 +11,28 @@ test_expect_success 'rebase exec modifies rebase-todo' '
 	test -e F
 '
 
+test_expect_success 'loose cache does not interfere with re-reading todo list' '
+	GIT_REBASE_TODO=.git/rebase-merge/git-rebase-todo &&
+	export GIT_REBASE_TODO &&
+	write_script append-todo.sh <<-\EOS &&
+	# It was determined manually that the following fast-import generates
+	# SHA-1s with the same first two digits for the command-line arguments
+	# 5 and 6:
+
+	git fast-import --quiet <<-EOF
+	commit refs/tags/ref$1
+	committer A U Thor <author@example.org> $1 +0000
+	data <<EOM
+	$1
+	EOM
+	EOF
+
+	echo "pick $(git rev-parse --short ref$1)" >>$GIT_REBASE_TODO
+	test $1 -gt 6 ||
+	echo "exec $0 $(($1+1))" >>$GIT_REBASE_TODO
+	EOS
+
+	git rebase HEAD -x "./append-todo.sh 5"
+'
+
 test_done

@peff
Copy link

peff commented Mar 12, 2019

That get_oid call would not be using the loose object cache at all.

But it would... this is the backtrace:

Ah, right, of course. It triggers the abbrev code for a lookup. So I do think this was probably broken for some time due to the old cache. And has probably been broken forever in a racy way if somebody was simultaneously repacking. We'd probably want something like this:

diff --git a/sha1-name.c b/sha1-name.c
index 6dda2c16df..cfe5c874b6 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -442,6 +442,18 @@ static enum get_oid_result get_short_oid(const char *name, int len,
 	find_short_packed_object(&ds);
 	status = finish_object_disambiguation(&ds, oid);
 
+	/*
+	 * If we didn't find it, do the usual reprepare() slow-path,
+	 * since the object may have recently been added to the repository
+	 * or migrated from loose to packed.
+	 */
+	if (status == MISSING_OBJECT) {
+		reprepare_packed_git(the_repository);
+		find_short_object_filename(&ds);
+		find_short_packed_object(&ds);
+		status = finish_object_disambiguation(&ds, oid);
+	}
+
 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
 		struct oid_array collect = OID_ARRAY_INIT;
 

That fixes your test, though I'm not sure if it's entirely safe to call those find_* functions again without reinitializing the ds struct.

@dscho
Copy link
Member Author

dscho commented Mar 12, 2019

But wasn't part of the idea behind the loose object cache to accelerate the awful performance on NFS when trying to find the same object over and over again, even if we already know that it does not exist?

That fix would regress on that, unless I am mistaken.

@dscho
Copy link
Member Author

dscho commented Mar 12, 2019

FYI my current (unsubmitted) iteration is here: gitgitgadget#161

@dscho
Copy link
Member Author

dscho commented Mar 12, 2019

I'm not sure if it's entirely safe to call those find_* functions again without reinitializing the ds struct.

I am fairly certain that it is safe.

But I still think that we should do this only if we also introduce a flag that is set by any loose object-writing (or -deleting) code path, and we should only then blow away the loose object cache.

@peff
Copy link

peff commented Mar 12, 2019

But wasn't part of the idea behind the loose object cache to accelerate the awful performance on NFS when trying to find the same object over and over again, even if we already know that it does not exist?

That fix would regress on that, unless I am mistaken.

No, the caller has to opt into the "quick" mode explicitly, with the intent that we would not introduce these kinds of cache-invalidation bugs. So has_object_file_with_flags(&oid, OBJECT_INFO_QUICK) will do it, but a normal has_object_file will not.

And I'd say the same thing should apply here. We do not currently have a GET_OID_QUICK flag, but if there was a caller who did want "faster, but maybe racily inaccurate", we could add one and respect it here.

@dscho
Copy link
Member Author

dscho commented Mar 12, 2019

Fair enough!

@dscho
Copy link
Member Author

dscho commented Mar 12, 2019

I still think that we should do this only if we also introduce a flag that is set by any loose object-writing (or -deleting) code path, and we should only then blow away the loose object cache.

That would actually not work in the regression test that I added, as the objects are written in a different process. D'oh.

@dscho
Copy link
Member Author

dscho commented Mar 12, 2019

FYI my current (unsubmitted) iteration is here: gitgitgadget#161

FWIW I updated that PR, and the actual fix is this one: gitgitgadget@3be9170

dscho added 4 commits March 13, 2019 12:50
We specifically support `exec` commands in `git rebase -i`'s todo lists
to rewrite the very same todo list. Of course, we need to validate that
todo list when re-reading it.

It is also totally legitimate to extend the todo list by `pick` lines
using short names of commits that were created only after the rebase
started.

And this is where the loose object cache interferes with this feature:
if *some* loose object was read whose hash shares the same first two
digits with a commit that was not yet created when that loose object was
created, then we fail to find that new commit by its short name in
`get_oid()`, and the interactive rebase fails with an obscure error
message like:

	error: invalid line 1: pick 6568fef
	error: please fix this using 'git rebase --edit-todo'.

Let's first demonstrate that this is actually a bug in a new regression
test, in a separate commit so that other developers who do not believe
me can cherry-pick it to confirm the problem.

This new regression test generates two commits whose hashes share the
first two hex digits (so that their corresponding loose objects live in
the same subdirectory of .git/objects/, and are therefore supposed to be
in the same loose object cache bin).

It then picks the first, to make sure that the loose object cache is
initialized and cached that object directory, then generates the second
commit and picks it, too. Since the commit was generated in a different
process than the sequencer that wants to pick it, the loose object cache
had no chance of being updated in the meantime.

Technically, we would need only one `exec` command in this regression
test case, but for ease of implementation, it uses a pseudo-recursive
call to the same script.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The interactive rebase simply complains about an "invalid line" when the
object hash of, say, a `pick` line could not be parsed.

Let's tell the user what happened in a little more detail.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It is quite possible that the loose object cache gets stale when new
objects are written. In that case, get_oid() would potentially say that
it cannot find a given object, even if it should find it.

Let's blow away the loose object cache as well as the read packs and try
again in that case.

Note: this does *not* affect the code path that was introduced to help
avoid looking for the same non-existing objects (which made some
operations really expensive via NFS): that code path is handled by the
`OBJECT_INFO_QUICK` flag (which does not even apply to `get_oid()`,
which has no equivalent flag, at least at the time this patch was
written).

This incidentally fixes the problem identified earlier where an
interactive rebase wanted to re-read (and validate) the todo list after
an `exec` command modified it.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho force-pushed the fix-rereading-todo-list branch from 34d9f62 to 119c983 Compare March 13, 2019 11:51
@dscho
Copy link
Member Author

dscho commented May 7, 2019

This was accepted into master of core Git; Let's have it in Git for Windows' master, too.

@dscho dscho merged commit 59a40ee into git-for-windows:master May 7, 2019
@dscho dscho deleted the fix-rereading-todo-list branch May 7, 2019 14:20
@dscho dscho added this to the v2.21.0(2) milestone May 7, 2019
git-for-windows-ci pushed a commit that referenced this pull request May 8, 2019
rebase -i: fix re-reading the todo list when newly created objects are referenced
dscho added a commit to git-for-windows/build-extra that referenced this pull request May 9, 2019
`git rebase -i` used to get confused when an `exec` command created
new commits and then appended `pick` lines for them. This [has been
fixed](git-for-windows/git#2121).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
git-for-windows-ci pushed a commit that referenced this pull request May 9, 2019
rebase -i: fix re-reading the todo list when newly created objects are referenced
git-for-windows-ci pushed a commit that referenced this pull request May 10, 2019
rebase -i: fix re-reading the todo list when newly created objects are referenced
dscho added a commit to dscho/git that referenced this pull request May 13, 2019
…-list

rebase -i: fix re-reading the todo list when newly created objects are referenced
git-for-windows-ci pushed a commit that referenced this pull request May 14, 2019
rebase -i: fix re-reading the todo list when newly created objects are referenced
dscho added a commit that referenced this pull request May 21, 2019
rebase -i: fix re-reading the todo list when newly created objects are referenced
dscho added a commit that referenced this pull request May 22, 2019
rebase -i: fix re-reading the todo list when newly created objects are referenced
dscho added a commit that referenced this pull request May 25, 2019
rebase -i: fix re-reading the todo list when newly created objects are referenced
dscho added a commit that referenced this pull request Jun 4, 2019
rebase -i: fix re-reading the todo list when newly created objects are referenced
dscho added a commit that referenced this pull request Jun 8, 2019
rebase -i: fix re-reading the todo list when newly created objects are referenced
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants