Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix scalar diagnose failures #543

Merged
merged 3 commits into from
Nov 8, 2022

Conversation

derrickstolee
Copy link
Collaborator

This resolves #541.

The root cause here is that this loop that intended to get all files from the info directory of the shared object cache never worked, but previously we did not propagate that to a complete failure.

The reason it didn't work is that we tried to read file contents from the directory path.

I also noticed a silly retry issue because we are now using run_git() which does three retries due to concurrency issues in the functional tests. We don't want to repeat failures three times in scalar diagnose.

When 672196a (scalar-diagnose: use 'git diagnose --mode=all',
2022-08-12) updated 'scalar diagnose' to run 'git diagnose' as a
subprocess, it was passed through the run_git() caller. Upstream, this
caller does not repeat, but in microsoft/git we have a retry loop
specifically for cases where 'git config' commands fail due to
concurrency.

If the underlying 'git diagnose' command fails, then the failure is
repeated. Remove this retry loop for this case.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
When this loop was adapted into diagnose.c, it gained an error response
when failing to read a file. This started causing consistent failures
for users with a shared object cache, demonstrating a gap in our testing
support.

The error message itself is confusing:

error: could not read '[...]/info': Permission denied
fatal: unable to create diagnostics archive [...].zip: Permission denied

The "Permission denied" is a red herring for both of these. The second
one is due to a die_errno() even though it's not related to that
filename. The first is actually due to trying to _read the contents of
the directory_.

Since we previously did not error out in this case, we never noticed
this problem in the past. Now, it is causing 'scalar diagnose' to fail.

The fix is to add the filename to the info directory path and read from
that file. Be careful that the entry is actually a file.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
@derrickstolee derrickstolee self-assigned this Nov 8, 2022
Cherry pick commit d3775de (Makefile: force -O0 when compiling with
SANITIZE=leak, 2022-10-18), as otherwise the leak checker at GitHub
Actions CI seems to fail with a false positive.

[Backported to v2.37.0 for easier merging e.g. into Git for Windows]

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@derrickstolee
Copy link
Collaborator Author

I added b6afb3f from git-for-windows#4085 to fix the linux-leaks job.

Copy link
Member

@dscho dscho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

@dscho
Copy link
Member

dscho commented Nov 8, 2022

I added b6afb3f from git-for-windows#4085 to fix the linux-leaks job.

Technically, this should be done in its own PR, but since it will be rebased out of microsoft/git's patch thicket, that would be a lot of work for no gain, so I am totes cool with slipping in this "affidavit patch" here.

@derrickstolee derrickstolee merged commit e380c61 into microsoft:vfs-2.38.1 Nov 8, 2022
@derrickstolee derrickstolee deleted the diagnose-fixes branch November 8, 2022 16:00
dscho pushed a commit that referenced this pull request Nov 23, 2022
This resolves #541.

The root cause here is that this loop that intended to get all files
from the `info` directory of the shared object cache _never worked_, but
previously we did not propagate that to a complete failure.

The reason it didn't work is that we tried to read file contents from
the directory path.

I also noticed a silly retry issue because we are now using `run_git()`
which does three retries due to concurrency issues in the functional
tests. We don't want to repeat failures three times in `scalar
diagnose`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scalar Diagnose Issue 2.38.1.vfs.0.0
3 participants