Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons #2725

Merged
merged 1 commit into from
Oct 2, 2020
Merged

Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons #2725

merged 1 commit into from
Oct 2, 2020

Conversation

native-api
Copy link

Fixes #2709

Copy link
Member

@rimrul rimrul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of unrelated changes.
The commit message could also use some more text.

compat/mingw.c Show resolved Hide resolved
compat/mingw.c Outdated Show resolved Hide resolved
compat/mingw.c Outdated
(LPWSTR)w_system_dir,
ARRAY_SIZE(w_system_dir))))
die("GetSystemDirectoryW failed: 0x%08lx",
GetLastError());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any precedent of just coughing up a raw hexadecimal Windows System error?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes -- e.g. at WSAStartup in this file and in git_CC_error_check uses elsewhere.
I can't imagine why this API would fail other than insufficient buffer -- and MAX_PATH is more than sufficient. So if it fails, something is seriously wrong and we cannot do anything meaningful about it anyway.

And Windows NTSTATUS error codes are more readable in hex than decimal...
Oh, I missed that this is not an NTSTATUS. Yes, a decimal would be more logical then.

Copy link
Member

@rimrul rimrul Jun 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking along the lines of err_win_to_posix() & die_errno() or maybe even FormatMessage(). Thinking about it some more, should we die() here at all? We failed to get the system dir, but either

Maybe a warning() would be better? Or maybe we should just silently accept that we can't determine the system directory?

compat/mingw.c Show resolved Hide resolved
compat/mingw.c Outdated Show resolved Hide resolved
compat/mingw.c Outdated
switch (winerr) {
case ERROR_ACCESS_DENIED:
error = EACCES;
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be quite honest, I would like to see this commit dropped from this PR. The code cleanup has nothing to do with what the PR actually tries to accomplish.

And when it comes to code cleanup, we will still have to fight with clang-format to do the right thing for us: we never managed to quite finish the work on that.

Besides, .clang-format was never meant to reformat the existing code base, but to serve as a guide how to format new contributions (via clang-format-diff.py).

There is a lot of contention around source code formatting, and it would be really best if we kept that separate from the question how to handle HOMEDRIVE/HOMEPATH.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I agree that VS got overboard reformatting the entire file without notice and will drop it.

Although the result does look more readable IMO.

@dscho
Copy link
Member

dscho commented Jun 30, 2020

The commit message could also use some more text.

I totally agree with that. At the moment, the commit message only says what the patch does. Not why. It also does not follow the convention of prefixing the commit subject with an area (in this case: mingw: ). The reference to Vista is rather pointless at this stage, I think. Something like this might be better:

mingw: respect HOMEDRIVE/HOMEPATH only in logged-in sessions

And then the body should explain the bug well, and what alternatives we considered (and why they were rejected). We don't want future readers to have to wonder why a different approach was not taken.

compat/mingw.c Outdated
* We don't want to (and likely can't) save stuff there
* so ignore them in such a case.
*/
wchar_t w_system_dir[MAX_PATH];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nowhere else in the code base do we use that w_ prefix. Please match the existing coding style, for consistency.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the w_ and len_ prefixes from mingw_getenv -- it likewise takes a value from a Unicode WinAPI call. Other code only uses strbufs so its naming scheme is not applicable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I was wrong. It's not nowhere that we use it, it is highly unconventional in our source code.

I'd much rather see you using a name like system32_path.

compat/mingw.c Outdated
strbuf_addstr(&buf, tmp);
if (is_directory(buf.buf))
if (!(GetSystemDirectoryW(
(LPWSTR)w_system_dir,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this cast totally unnecessary? And if it is, it would be a lot nicer to avoid that awkward line break after an opening parenthesis.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it after I got a warning in PR check about the types: wchar_t * vs wchar_t(*)[] IIRC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should use WCHAR instead of wchar_t, then? Would that fix it?

compat/mingw.c Outdated
die("GetSystemDirectoryW failed: %ld",
GetLastError());
len_w_buf = buf.len + 1;
w_buf = alloca(len_w_buf * sizeof(wchar_t));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the rest of compat/mingw.c, we do not use alloca() for buffers that receive the output of xutftowcs(), but we use stack allocation directly (wchar_t wpathname[MAX_LONG_PATH];).

Besides, it strikes me as the wrong way round to convert the UTF-8 encoded candidate to UTF-16 to compare it to the system32 directory's path. In Git for Windows' source code, we usually handle paths in UTF-8, therefore the output of GetSystemDirectoryW() should be converted to UTF-8 instead, and then we compare.

In this instance, I further would strongly suggest to untangle the post-image of this diff by extracting the code to determine the path of the system32 directory to its own function. Something like this:

static const char *get_system32_path(void)
{
        static char path[MAX_LONG_PATH];
        static int initialized;

        if (!initialized) {
                WCHAR wpath[MAX_LONG_PATH];

                if (!GetSystemDirectoryW(wpath, ARRAY_SIZE(wpath)) ||
                    xwcstoutf(path, wpath, sizeof(path) < 0)
                        /* Ignore failures silently */
                        path[0] = '\0';
                initialized = 1;
        }

        return path;
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the way I did it initially. But the code was going to be much longer and complex (e.g. needed to construct an strbuf from wchar_t to be able to compare with strbuf_cmp -- i.e. strbuf_alloc then strbuf_release it) and indeed warranted a separate subroutine.

Since UTF-16 and UTF-8 have a 1-1 correspondence, it doesn't make a difference which format is used for the purpose of comparison accuracy. And the prospective subroutine isn't likely to be needed anywhere else in the foreseeable future. So preferred simpler code.

Used alloca instead of a static array on the assumption that the value is likely much shorter than both 32768 (the size limit for mingw_getenv) and even than MAX_LONG_PATH. Since you are so much worried about saving stack space that you had to introduce the MAX_LONG_PATH arbitrary limitation, using a much shorter buffer if possible looked reasonable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a valid strategy to use alloca(). It's just not a strategy that we commonly use in Git's source code. And since it makes it harder to maintain inconsistent code, I would insist on imitating compat/mingw.c's existing code style better.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it'll be a wchar_t[32768] -- or buf.buf might not fit (plus adding a magic constant to the code that must be the same as in mingw_getenv). You okay with that?

compat/mingw.c Outdated

/* The defaults are likely set from GetSystemDirectory, too,
so the values should be exactly the same */
if ((wcscmp(w_buf, w_system_dir) != 0) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we are comparing paths, not strings. Therefore, we have to use the case-insensitive variant of the comparison. Together with my advice to keep handling paths in UTF-8 wherever possible, this would then become strcasecmp(...). Also: in Git's source code, we don't add the != 0 in conditionals.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of the check is to specifically ignore the system default. I checked that the system default looks like it's generated from this same API call (it uses a specific unusual casing that's not the same as raw directory names). This is why a case-sensitive check is okay in this particular case and this is why I added a comment explaining this unusual choice.

If the goal was rather to ignore c:\windows\system32, in any format, I'd have to use a much more complex comparison as per https://stackoverflow.com/questions/562701/best-way-to-determine-if-two-path-reference-to-same-file-in-windows .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the goal was not to ignore that path in any format. The goal was to at least respect the case-insensitive nature of the paths.

Just because we don't want to fully canonicalize the path doesn't mean that we have to go to the other extreme and expect HOMEDRIVE/HOMEPATH to match the case of GetSystemDirectory().

@dscho
Copy link
Member

dscho commented Jun 30, 2020

And then the body should explain the bug well, and what alternatives we considered (and why they were rejected). We don't want future readers to have to wonder why a different approach was not taken.

I totally forgot to mention: the commit message itself should have a line like this toward the end:

This fixes https://github.com/git-for-windows/git/issues/2709.

Copy link
Author

@native-api native-api left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dscho The why is already explained in the code comment where users would arguably more readily see it. Do I need to duplicate it in commit message?
Rejected alternative solutions -- sounds reasonable.

Standards for commit messages -- could you link to them? The PR template has no guidelines or checklists whatsoever and only links to https://gitforwindows.org/#contribute which has neither as well.

compat/mingw.c Outdated
* We don't want to (and likely can't) save stuff there
* so ignore them in such a case.
*/
wchar_t w_system_dir[MAX_PATH];
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the w_ and len_ prefixes from mingw_getenv -- it likewise takes a value from a Unicode WinAPI call. Other code only uses strbufs so its naming scheme is not applicable.

compat/mingw.c Outdated
switch (winerr) {
case ERROR_ACCESS_DENIED:
error = EACCES;
break;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I agree that VS got overboard reformatting the entire file without notice and will drop it.

Although the result does look more readable IMO.

compat/mingw.c Outdated
strbuf_addstr(&buf, tmp);
if (is_directory(buf.buf))
if (!(GetSystemDirectoryW(
(LPWSTR)w_system_dir,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it after I got a warning in PR check about the types: wchar_t * vs wchar_t(*)[] IIRC.

compat/mingw.c Outdated
die("GetSystemDirectoryW failed: %ld",
GetLastError());
len_w_buf = buf.len + 1;
w_buf = alloca(len_w_buf * sizeof(wchar_t));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the way I did it initially. But the code was going to be much longer and complex (e.g. needed to construct an strbuf from wchar_t to be able to compare with strbuf_cmp -- i.e. strbuf_alloc then strbuf_release it) and indeed warranted a separate subroutine.

Since UTF-16 and UTF-8 have a 1-1 correspondence, it doesn't make a difference which format is used for the purpose of comparison accuracy. And the prospective subroutine isn't likely to be needed anywhere else in the foreseeable future. So preferred simpler code.

Used alloca instead of a static array on the assumption that the value is likely much shorter than both 32768 (the size limit for mingw_getenv) and even than MAX_LONG_PATH. Since you are so much worried about saving stack space that you had to introduce the MAX_LONG_PATH arbitrary limitation, using a much shorter buffer if possible looked reasonable.

compat/mingw.c Outdated

/* The defaults are likely set from GetSystemDirectory, too,
so the values should be exactly the same */
if ((wcscmp(w_buf, w_system_dir) != 0) &&
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of the check is to specifically ignore the system default. I checked that the system default looks like it's generated from this same API call (it uses a specific unusual casing that's not the same as raw directory names). This is why a case-sensitive check is okay in this particular case and this is why I added a comment explaining this unusual choice.

If the goal was rather to ignore c:\windows\system32, in any format, I'd have to use a much more complex comparison as per https://stackoverflow.com/questions/562701/best-way-to-determine-if-two-path-reference-to-same-file-in-windows .

@rimrul
Copy link
Member

rimrul commented Jun 30, 2020

The standards for commit messages are described in Documentation/SubmittingPatches.

Describe your changes well.

The first line of the commit message should be a short description (50
characters is the soft limit, see DISCUSSION in git-commit[1]),
and should skip the full stop. It is also conventional in most cases to
prefix the first line with "area: " where the area is a filename or
identifier for the general area of the code being modified, e.g.

  • doc: clarify distinction between sign-off and pgp-signing
  • githooks.txt: improve the intro section

If in doubt which identifier to use, run git log --no-merges on the
files you are modifying to see the current conventions.

summary-section

It's customary to start the remainder of the first line after "area: " with a lower-case letter. E.g. "doc: clarify...", not "doc: Clarify...", or "githooks.txt: improve...", not "githooks.txt: Improve...".

meaningful-message

The body should provide a meaningful commit message, which:
  • explains the problem the change tries to solve, i.e. what is wrong
    with the current code without the change.

  • justifies the way the change solves the problem, i.e. why the
    result with the change is better.

  • alternate solutions considered but discarded, if any.

imperative-mood

Describe your changes in imperative mood, e.g. "make xyzzy do frotz" instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy to do frotz", as if you are giving orders to the codebase to change its behavior. Try to make sure your explanation can be understood without external resources. Instead of giving a URL to a mailing list archive, summarize the relevant points of the discussion.

commit-reference

If you want to reference a previous commit in the history of a stable branch, use the format "abbreviated hash (subject, date)", like this:
	Commit f86a374 (pack-bitmap.c: fix a memleak, 2015-03-30)
	noticed that ...

The "Copy commit summary" command of gitk can be used to obtain this
format (with the subject enclosed in a pair of double-quotes), or this
invocation of git show:

	git show -s --pretty=reference <commit>

or, on an older version of Git without support for --pretty=reference:

	git show -s --date=short --pretty='format:%h (%s, %ad)' <commit>

@dscho
Copy link
Member

dscho commented Jun 30, 2020

Since UTF-16 and UTF-8 have a 1-1 correspondence, it doesn't make a difference which format is used for the purpose of comparison accuracy.

Sure, but it is inconsistent with Git's source code to prefer UTF-16 for anything but immediately feeding to a Win32 API function. That is literally the only use case where I want anything to be converted to UTF-16 in compat/mingw.c.

@dscho
Copy link
Member

dscho commented Jun 30, 2020

The purpose of the check is to specifically ignore the system default.

No, the purpose is to verify that the HOME* variables do not refer to the system32 directory.

I checked that the system default looks like it's generated from this same API call (it uses a specific unusual casing that's not the same as raw directory names).

That might be the case right now.

This is why a case-sensitive check is okay in this particular case and this is why I added a comment explaining this unusual choice.

If you have to explain the unusual choice when a simple change to the case-insensitive comparison would make the explanation completely unnecessary, let's just go with the latter.

If the goal was rather to ignore c:\windows\system32, in any format, I'd have to use a much more complex comparison as per https://stackoverflow.com/questions/562701/best-way-to-determine-if-two-path-reference-to-same-file-in-windows .

Right if you require to canonicalize the path, I agree.

But to go into the other, equally incorrect direction to perform a path comparison case-sensitively just because it seems to work on your machine (I doubt that you inspected the source code, and even then you would not have a guarantee that this wouldn't be changed on you in a future Windows version) is still the wrong thing to do.

I am rather certain that my proposed solution is more appropriate than the code I reviewed.

compat/mingw.c Outdated
* We don't want to (and likely can't) save stuff there
* so ignore them in such a case.
*/
wchar_t w_system_dir[MAX_PATH];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I was wrong. It's not nowhere that we use it, it is highly unconventional in our source code.

I'd much rather see you using a name like system32_path.

compat/mingw.c Outdated
strbuf_addstr(&buf, tmp);
if (is_directory(buf.buf))
if (!(GetSystemDirectoryW(
(LPWSTR)w_system_dir,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should use WCHAR instead of wchar_t, then? Would that fix it?

compat/mingw.c Outdated
die("GetSystemDirectoryW failed: %ld",
GetLastError());
len_w_buf = buf.len + 1;
w_buf = alloca(len_w_buf * sizeof(wchar_t));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a valid strategy to use alloca(). It's just not a strategy that we commonly use in Git's source code. And since it makes it harder to maintain inconsistent code, I would insist on imitating compat/mingw.c's existing code style better.

compat/mingw.c Outdated

/* The defaults are likely set from GetSystemDirectory, too,
so the values should be exactly the same */
if ((wcscmp(w_buf, w_system_dir) != 0) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the goal was not to ignore that path in any format. The goal was to at least respect the case-insensitive nature of the paths.

Just because we don't want to fully canonicalize the path doesn't mean that we have to go to the other extreme and expect HOMEDRIVE/HOMEPATH to match the case of GetSystemDirectory().

Copy link
Author

@native-api native-api left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dscho (For some reason, I cannot reply to your last set of requests, so replying to them further below.)

It looks like you feel strongly about how exactly each bit should look. Then it'll be much more productive if you yourself postprocess my code to your heart's content (as a member, you have write access to my PR branch) rather than playing the game of guessing your thoughts (which I'm not capable of doing, sorry).


Maybe we should use WCHAR instead of wchar_t, then? Would that fix it?

Yes, that fixed the warning, too. The code is LESS consistent in using types from this though.

No, the goal was not to ignore that path in any format. The goal was to at least respect the case-insensitive nature of the paths.

This is still a compromise and not the correct way to compare paths in the general case so still warranted a comment 🙂

Sorry, I was wrong. It's not nowhere that we use it, it is highly unconventional in our source code.
I'd much rather see you using a name like system32_path.

This suggestion is impossible to use as it is. Please provide an entire naming convention. It should be able to handle at least 3 variables of different types representing the same value: strbuf, wchar_t[] and size_t.
And since such a convention already exists and already used for exactly this case, I fail to see how inventing yet another one will make the code more consistent or easier to maintain.

compat/mingw.c Outdated
die("GetSystemDirectoryW failed: %ld",
GetLastError());
len_w_buf = buf.len + 1;
w_buf = alloca(len_w_buf * sizeof(wchar_t));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it'll be a wchar_t[32768] -- or buf.buf might not fit (plus adding a magic constant to the code that must be the same as in mingw_getenv). You okay with that?

@native-api
Copy link
Author

The https://github.com/git-for-windows/git/runs/835986294 check seems to have hung so its failure looks irrelevant.

@dscho
Copy link
Member

dscho commented Jul 4, 2020

It looks like you feel strongly about how exactly each bit should look.

  • It's more like: I know that the patch in its current form would not be accepted on the Git mailing list. And that's where almost all of Git for Windows' patches need to go, ultimately.

Then it'll be much more productive if you yourself postprocess my code to your heart's content (as a member, you have write access to my PR branch) rather than playing the game of guessing your thoughts (which I'm not capable of doing, sorry).

I see how that might look appealing from your side.

From my side, it looks like I spent more time than average on reviewing this PR and helping improve it, and now you expect me to do even more.

Maybe we should use WCHAR instead of wchar_t, then? Would that fix it?

Yes, that fixed the warning, too. The code is LESS consistent in using types from this though.

From the point of using multiple data types for the same thing (wchar_t vs WCHAR), that might seem so. From the point of view that looks at the data types expected be the called functions, it looks a bit more consistent, though.

No, the goal was not to ignore that path in any format. The goal was to at least respect the case-insensitive nature of the paths.

This is still a compromise and not the correct way to compare paths in the general case so still warranted a comment 🙂

We're not talking about the general case. We're talking about the GetSystemDirectoryW() function making no guarantees about the case-ness matching that of HOME*.

Sorry, I was wrong. It's not nowhere that we use it, it is highly unconventional in our source code.
I'd much rather see you using a name like system32_path.

This suggestion is impossible to use as it is. Please provide an entire naming convention. It should be able to handle at least 3 variables of different types representing the same value: strbuf, wchar_t[] and size_t.

Why not use system32_path as suggested? And for a temporary strbuf, you can use buf, as it is done elsewhere in Git's source code.

And since such a convention already exists and already used for exactly this case, I fail to see how inventing yet another one will make the code more consistent or easier to maintain.

I would not call sporadic use cases "a convention". They're more like evidence that I traditionally have less time to spend on PRs than I'd like, and am forced to be more pragmatic than dogmatic.

@native-api
Copy link
Author

As an aside, the "This fixes..." convention is not documented anywhere in the repo.

@native-api
Copy link
Author

It's more like: I know that the patch in its current form would not be accepted on the Git mailing list.

Why would that possibly be? It adheres to the coding guidelines as written. If there's a violation you see that I don't, please point it out and which guideline it violates so that I have the tools to not make that mistake again.

I see how that might look appealing from your side.

From my side, it looks like I spent more time than average on reviewing this PR and helping improve it, and now you expect me to do even more.

But what you are spending that time on is telling me to write code exactly as you want. And justifying your demands not by referring to written conventions or hard numbers but with intangible, subjective and unverifiable claims like "consistency" and "ease of maintenance" (which are not even mentioned in the guidelines btw) -- while contradicting actual evidence and hard numbers. That's not something that I can work with.

So from my side, it looks like you are telling me to write worse code just because it adheres to your personal tastes, correctness and efficiency be damned -- because you don't bother to support the claims of superiority of your options with any objective grounds (see https://en.wikipedia.org/wiki/Paul_Graham_(programmer)#Graham's_hierarchy_of_disagreement). Let's say that writing bad code just because it adheres to someone's tastes is not something that I enjoy or wish to do in my free time -- or can do consistently anyway because the only way to do that is to read that someone's thoughts (or carefully study their tastes over a large period of time which is not something I'm willing to invest into, either).

So let's either have a factual argument, or no argument at all because that would not be (and is currently not) a productive use of our time.
You edit my code to your personal tastes and be done with it is simply a fast and efficient way to execute the "no argument" option if the former is not your preferred MO. By suggesting it, I'm actually trying to save your time and nerves while doing things the way you are evidently accustomed to.


First, to the central point of contention (resolving which would likely remove most others). As I explained at #2725 (comment), I didn't write a separate subroutine because the added code is small, and a subroutine would be noticeably longer, do unnecessary work and use more memory. You didn't contest that premise, that's the reason why it's still so.

Are these mere 12 lines unrelated to the fn's main job already much enough for a subroutine here given the structure of this specific code (it's all in a single, well-defined and visually separated branch)?
If yes, the added code violates the guideline "Try to make your code understandable..." by not splitting out a subroutine where this would make the code clearer -- so I would need to split out a subroutine to adhere to the guidelines. If using a subroutine, it would indeed be logical to use strbuf as the exchange format -- not doing so would force me to leave the lines converting buf to wchar_t in the main fn which would still be violating this guideline because arranging all code unrelated to the main task into a subroutine is way clearer than only a part of it.
BTW the "Use the API" guideline is not applicable here because it suggests using strbuf for convenience and efficiency, and it's not convenient or efficient to use strbuf in this particular case if not splitting a subroutine as I explained in the linked message.


Maybe we should use WCHAR instead of wchar_t, then? Would that fix it?

Yes, that fixed the warning, too. The code is LESS consistent in using types from this though.

From the point of using multiple data types for the same thing (wchar_t vs WCHAR), that might seem so. From the point of view that looks at the data types expected be the called functions, it looks a bit more consistent, though.

But wcsicmp expects wchar_t, and so do xutf*. So the consistency from the 2nd standpoint is actually the same, we cannot satisfy both fns at once.

We're not talking about the general case. We're talking about the GetSystemDirectoryW() function making no guarantees about the case-ness matching that of HOME*.

It doesn't make any official guarantees of consistency with HOME* whatsoever. They just may point to the same physical location. So, whether we use wcscmp or wcsicmp, it's still a compromise which only works correctly if we make additional assumptions (about no trailing slashes and no FS shenanigans like symlinks and mount points in both cases). And when we rely on non-obvious assumptions, we document them with comments if it's impossible/impractical to document them with code (N.N.Nepeivoda, a prominent Russian CS scientist, calls these extra bits of knowledge required for the program to work but not present in the code "призраки" (specters), see https://www.intuit.ru/studies/courses/3402/40/lecture/1208).

Why not use system32_path as suggested? And for a temporary strbuf, you can use buf, as it is done elsewhere in Git's source code.

Because I need not one but up to three variables relating to the same value (a wchar_t buffer, its size_t length, and an strbuf converted to/from those). And multiple variables in the same scope cannot have the same name in C even if they are of different types. So I must use type affixes, the pattern of which would be a naming convention. For some reason, you don't like the affixes I and other code used, and code using strbufs exclusively doesn't use affixes. So please suggest some others and explain how they are better, or I cannot proceed.

I used system_dir as a base name because "system directory" is the official term for this entity, not "system32 path". Using official terminology is the best way to make it crystal clear to others (and yourself after a few months) what you mean.

buf is already in use in this scope so I cannot use that name (nor do I need to atm).

And since such a convention already exists and already used for exactly this case, I fail to see how inventing yet another one will make the code more consistent or easier to maintain.

I would not call sporadic use cases "a convention". They're more like evidence that I traditionally have less time to spend on PRs than I'd like, and am forced to be more pragmatic than dogmatic.

Converting strbuf<->wchar_t for WinAPI interaction is hardly a "sporadic use case". Granted, due to Git's own APIs, they are rare, but they do exist and there's no way around them.

@native-api
Copy link
Author

@dscho
Copy link
Member

dscho commented Aug 25, 2020

"This fixes..." convention is not documented anywhere in the repo.

It is not a Git convention, but a GitHub convention documented here: https://docs.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword

It's more like: I know that the patch in its current form would not be accepted on the Git mailing list.

Why would that possibly be? It adheres to the coding guidelines as written. If there's a violation you see that I don't, please point it out and which guideline it violates so that I have the tools to not make that mistake again.

Like every group of people, the Git contributors have rules and tastes that are not written down but assumed to be implicit rules. For example, simpler, more obvious code is usually favored over more complex one, code that is indented in a different style than surrounding code needs to be adjusted, etc

These rules are not fixed, either, they evolve over time. Therefore it is a bit of a hassle to write all of them down (and ensure that the documentation does not become stale).

Hence my advice, I am active in the Git project, so I know a little bit about what gets in and what issues are usually blocking new contributions.

you are spending that time on is telling me to write code exactly as you want.

Not exactly. Yes, I want the helper function to have a specific signature (except if you find a better name for it, or for its parameters). That is clearly more elegant than having the code inline, heavily indented, that makes the surrounding code less obvious than it was before.

So from my side, it looks like you are telling me to write worse code just because it adheres to your personal tastes, correctness and efficiency be damned -- because you don't bother to support the claims of superiority of your options with any objective grounds (see https://en.wikipedia.org/wiki/Paul_Graham_(programmer)#Graham's_hierarchy_of_disagreement).

I am trying to optimize for maintenance efficiency. The more obvious the code is, the easier it is to maintain.

In the same spirit, the farther away a discussion about code deviated from said code (e.g. into philosophy, management theory and even cultural context), while it may be more fun (even for me), it is not exactly efficient for maintaining code.

So I would like, with all due respect, to focus back on the change you want to make. Here is how I would have written the patch, if I had been in your shoes:

diff --git a/compat/mingw.c b/compat/mingw.c
index 1a64d4efb26b..a5d61b93d759 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -3289,6 +3289,18 @@ static size_t append_system_bin_dirs(char *path, size_t size)
 }
 #endif
 
+static int is_system32_path(const char *path)
+{
+	WCHAR system32[MAX_LONG_PATH], wpath[MAX_LONG_PATH];
+
+	if (xutftowcs_long_path(wpath, path) < 0 ||
+	    !GetSystemDirectoryW(system32, ARRAY_SIZE(system32)) ||
+	    _wcsicmp(system32, wpath))
+		return 0;
+
+	return 1;
+}
+
 static void setup_windows_environment(void)
 {
 	char *tmp = getenv("TMPDIR");
@@ -3329,7 +3341,8 @@ static void setup_windows_environment(void)
 			strbuf_addstr(&buf, tmp);
 			if ((tmp = getenv("HOMEPATH"))) {
 				strbuf_addstr(&buf, tmp);
-				if (is_directory(buf.buf))
+				if (!is_system32_path(buf.buf) &&
+				    is_directory(buf.buf))
 					setenv("HOME", buf.buf, 1);
 				else
 					tmp = NULL; /* use $USERPROFILE */

You will note that I actually followed your idea to convert to wide characters and compare the result, simply because there is already a xutftowcs_long_path() but no xwcstoutf_long_path() equivalent, and I really like that the name makes it explicit that we are talking about paths here.

You will also note that I took your objections about over-use of variables to heart. Not so much because it is more memory-efficient that way (because this code is run exactly once, during startup of the Git process, and only if no HOME variable is set yet, i.e. only for the top-level process) but because it is indeed much, much more readable.

Is the function name optimal? Maybe not. Maybe is_system32_directory() would be better. Are the variable names chosen wisely? Maybe, maybe not.

What I do know is that by keeping most of the logic in a separate, short function (that might even be inlined by GCC for what I know!), the code is super obvious, and not intrusive at all.

Now, where do we want to take this from here? You tell me, because I am just the maintainer, and as long as what you contribute is easy to maintain, and as long as the commit message does a really convincing job of explaining why we want this change, I am happy to take it.

…ctory

Internally, Git expects the environment variable `HOME` to be set, and
to point to the current user's home directory.

This environment variable is not set by default on Windows, and
therefore Git tries its best to construct one if it finds `HOME` unset.

There are actually two different approaches Git tries: first, it looks
at `HOMEDRIVE`/`HOMEPATH` because this is widely used in corporate
environments with roaming profiles, and a user generally wants their
global Git settings to be in a roaming profile.

Only when `HOMEDRIVE`/`HOMEPATH` is either unset or does not point to a
valid location, Git will fall back to using `USERPROFILE` instead.

However, starting with Windows Vista, for secondary logons and services,
the environment variables `HOMEDRIVE`/`HOMEPATH` point to Windows'
system directory (usually `C:\Windows\system32`).

That is undesirable, and that location is usually write-protected anyway.

So let's verify that the `HOMEDRIVE`/`HOMEPATH` combo does not point to
Windows' system directory before using it, falling back to `USERPROFILE`
if it does.

This fixes #2709

Initial-Path-by: Ivan Pozdeev <vano@mail.mipt.ru>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho
Copy link
Member

dscho commented Aug 25, 2020

For fun, I rewrote your commit message, in a way that I believe is better suited for readers who are unfamiliar with Windows (and want to remain in that state): main...dscho:system32-is-not-a-good-home

@native-api
Copy link
Author

I'm fine with merging 064718e and closing this PR.

@dscho dscho requested a review from rimrul September 28, 2020 08:34
@native-api
Copy link
Author

I didn't see the way you did it (make the comparison inside the subroutine) and you never pointed to it so a demand to make a subroutine looked very unreasonable (see #2725 (comment) on what it would look like -- hardly more obvious or easier to maintain in any sense of word).
I would also never silently ignore an error from API (which is also the only thing that let you evade having to use w_ and len_ pair of variables) as I've suffered from too many bugs resulting from this kind of practice. The commit text duplicates the code logic, imperfectly -- so it's redundant work that also runs the risk of misleading people, it'd be sufficient to only describe what is not in the code -- i.e the why.

The "fixes #" phrase is intended for use in posts, not in commit messages, since it's meaningless without the magic link that only works within Github UI and within a specific project (e.g. it would break in a fork, locally, or at another hosting site). So mandating its use is hardly something you can expect from people without requiring it explicitly.

Finally, Git is just another of a myriad projects I've contributed to. I have no intention to become an active contributor (as you seem to think everyone is aspiring to, calling the PR a "new contribution"), I just fix a problem that I've run into, regardless of which software it is in, and move on to the next thing. So if you expect everyone to first learn the unwritten quirks of the project and your personal tastes before being able to contribute, you'll just have much disappointment and very few contributors. I personally will think twice before coming here ever again -- too many artificial obstructions and they also require me to read their minds, that's silly and not worth the time.

@dscho
Copy link
Member

dscho commented Sep 28, 2020

@native-api thank you for your contribution! I'll take it from here.

@dscho dscho merged commit e782f50 into git-for-windows:main Oct 2, 2020
dscho added a commit to git-for-windows/build-extra that referenced this pull request Oct 2, 2020
When using Git via the "Run As..." function, [it now uses the correct
home directory](git-for-windows/git#2725).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
dscho added a commit that referenced this pull request Jan 21, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Jan 22, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Jan 25, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Jan 25, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Jan 26, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Jan 27, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Jan 27, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Jan 27, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Jan 28, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Jan 29, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Jan 29, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 1, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 2, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 2, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 2, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 2, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 4, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 4, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 4, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 4, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 4, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 4, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 5, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 5, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 7, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 7, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 7, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 8, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 9, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
dscho added a commit that referenced this pull request Feb 9, 2021
Ignore Vista+ HOMEDRIVE/HOMEPATH default for non-shell logons
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

git-bash.exe saves/loads config from wrong home-directory (When using 'run as')
3 participants