-
-
Notifications
You must be signed in to change notification settings - Fork 312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't assume program files folder locations #1456
Conversation
This is to support a forthcoming fix where the paths will no longer be hard-coded.
+ Let one of the already affected `Cargo.lock` files pick up the (unrelated to this) `portable-atomic` dependency from 3639678.
Instead of directly OsString. This is because these locations are paths. Representing them this way has the further advantage of allowing some code to be slightly simplified.
The code still worked with the extra step there, but now it's unnecessary.
The `PathsByRole` test helper struct's `maybe_pf_64bit` member was previously `Result<PathBuf, std::io::Error>`. But construction, which is by `PathsByRole::obtain_envlessly()`, already asserted the expected existence of the other two paths. It should likewise assert that if the guaranteed-64-bit program files path cannot be obtained from the registry, then this is due to it being absent, rather than some other error. Checking this in `obtain_envlessly()` makes it reasonable for `maybe_pf_64bit` to be an `Option<PathBuf>`, as its name would suggest it should be. This in turn makes clear that all the validation and assertions for this "reference value" have already been done. Thus `PathsByRole::validate()` can continue omitting validation for the error type, validation it should not perform because the point of it is do checks that are agnostic of how the paths were obtained. This also makes `PathsByRole` cloneable, which is now possible because it no longer holds on to a `std::io::Error` with its non-cloneable dynamic error information, and adds some other missing traits on test helper types.
+ Remove the "pf" parts of its fields. + Refactor usage now that unpacking isn't needed to refer to the values intuitively. + Write a fragment of the unit test that uses it. This was part of checking that the new names make sense, but these particular assertions may not all be kept (and may drastically change form). This is because, even if the process architecture specific path is something we want to list first, that is a change to the implementation that is separate from the first and main fix that this test is being introduced to provide coverage for.
And extract the corresponding suffixes, to be used in some of the subsequent assertions.
+ Rename some variables for clarity. + Comment about the possibility of a future clangarm64/bin subdirectory to check under (the same) 64-bit program files.
Add a couple explanatory comments in the implementation, one of which relates to (the absence of) this feature.
This stops attempting to make the Windows-specific helper function semi-portable. I had done that with the idea tha it could be tested on all platforms so as to gain better coverage in practice. But it doesn't really make sense to do this because it keeps the native path separator `\` from being used even where it is now clearer to use it, and because such tests would not be able to use common values such as `C:\Program Files` because a path that starts with `C:\`, while absolute on Windows, is relative on Unix-like systems.
+ Adjust wording of some related comments to further decrease confusion. + Put more details in the assert_from panic message when the list of paths has the wrong length, since we would want to know what's going on.
The new name is locations_under_program_files. The bug was in the order of the `bin` and `mingw*` components.
The module is in the same file, nested inside the `tests` module written there (which this test code had resided directly inside). It is currently named `loc`. This allows the repeated `#[cfg(windows)]` to be eliminated, allows the imports to be presented readably and to be grouped without `cargo fmt` undoing the grouping, and makes clear that these tests are all related to each other more than to the preexsting tests. But I am not sure if this is the best approach to organizing them. Maybe the test code for `locations_under_program_files`, which is a helper function meant to be used only in this module and only to define `ALTERNATIVE_LOCATIONS`, could remain in this file, while the other tests of `ALTERNATIVE_LOCATIONS` could be moved to another file. `ALTERNATIVE_LOCATIONS` has `pub(super)` visibility, so it's not part of the interface of `gix_path::env`, but it could perhaps have tests in another file directly or indirectly inside `gix-path/src/env`.
This checks where *program files* directories are located on a Windows system, which are used for a fallback check after `git` has not been found in a `PATH` search (to invoke `git` to find out information such as the location of its system config file). Previously, two hard-coded paths were used. These were correct for the vast majority of 64-bit Windows systems, but were in practice never correct on 32-bit Windows systems. Checking programmatically for the locations should thus enable detection to succeed on more systems and under more circumstances, and avoid other problems.
This was needed on both Windows and non-Windows systems ever since I moved the tests from `tests` to `tests::loc`, but I forgot to update it for the non-Windows `alternative_locations` test.
This removes the periods from the ends of sentences passed as `expect` messages in recently introduced tests and their helpers, because when called on a `Result`, `Some text: Result info` is better than `Some text.: Result info`. Doing this also improves stylistic consistency with other `expect` messages in the project. I didn't make an analogous change to assertion messages, because the awkward `.:` construction does not arise from them. There may be further refinement that can be done, as the new `expect` messages are still worded as assertions and, as such, could be misread as claiming that the situation they decribe wrongly happened, rather than the intended meaning that it wrongly did not happen. This also changes "key" to "registry key" in one of the `expect` messages so that its meaning is clear in output of big test runs.
* Update a couple of preexisting doc comments for clarity and to avoid implying that the `git` that is found is *always* from PATH, since it may be from an alternative location. * Correct the claim about `EXEPATH` and MSYS, as it is present in Git Bash but not other MSYS environments. (This remains something of an oversimplification, since it is also present in Git Cmd environments, but that shell environment is decreasingly used and probably not essential information for that explanatory comment.) * Hedge claim about what `EXEPATH` gives us, because some ways of running Git Bash cause it to point to the `bin` directory rather than the installation directory, in which case our shortcut use of it will fail. This seems to happen when Git Bash is run by means other than the `git-bash.exe` graphical launcher, even if an attempt is made to start the shell the same way. Traditionally such approaches to running Git Bash may not have been overtly supported by Git for Windows. But the difference seems to be preserved even when the non-`git-bash.exe` way Git Bash is run is through a Windows Terminal profile added by the Git for Windows installer (which recently added that feature). * Remove a couple spurious semicolons I had accidentally added. * Remove unnecessary qualification for an imported name.
Since they have the same effect even down where that is not intuitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this incredible improvement - things get complex, and it's nice that most of that complexity is with the tests which nicely illustrate the paths that are produced.
It's unfortunate that I made the mistake not creating a new commit after the module changes, so my edits aren't easily visible. However, I made comments with highlights.
Thanks! It is a big improvement splitting the module into its implementation and tests and putting the files side-by-side in this way. I guess the correct organization for this code was obvious, but for whatever reason I didn't see it. Regarding the change being in just one commit, another commit for that could be rebased in if desired, but I think it should not be necessary. |
This pull request fixed CVE-2024-40644 (RUSTSEC-2024-0355, GHSA-mgvv-9p9g-3jv4).
I've updated this description with a bunch of details, but the most important information is in the advisories. The sections below summarize the bug from a security perspective, give the original PR description that addressed its non-security aspects, and present a lightly edited version of most of the supplementary material that had accompanied the advisory.
Summary of the bug this PR fixes
As presented in more detail in the advisories,
gix-path
sought Git for Windows in some hard-coded locations when not found in aPATH
search. The hard-coded locations were intended as the 64-bit and 32-bit program files directories. But this gave rise to a CWE-427 vulnerability, because the exact paths of those directories are not the same on all systems, and because limited (non-administrator) users can create new directories in the root of the system drive, it was possible to create them.In practice this mostly affected 32-bit Windows systems where the 32-bit program files directory is usually
C:\Program Files
and a directoryC:\Program Files (x86)
would not exist. This allowed a limited user to create it and place arbitrary code under it masquerading as a Git for Windows installation; then other users who rangix-path
would run that executable, causing code execution and privilege escalation. This is comparable to CVE-2022-24765.Original PR description
The original PR description did not cover security aspects of the bug. It is as follows.
In #1419,
gix-path
facilities got the ability to check Program Files directories as alternative locations for when Git for Windows is installed but thegit
binary isn't in thePATH
. However, it hard-codes the paths, which are not correct for all Windows systems.The most common scenario in which the currently hard-coded paths are not correct is a 32-bit Windows system, for which the 32-bit Program Files directory is the only Program Files directory and is (it least in all ordinary setups) just called
Program Files
. It is also technically possible, though in practice very uncommon, for the system drive not to beC:
and, in extremely weird setups, maybe even for Program Files directories to have different names.This obtains them through three specially named environment variables, which is one of the reasonably reliable and documented means that works for all combinations of process architecture (i.e. build target) and architecture of the system, as well as if they are somehow in unusual places. Because locating these directories carries potentially surprising subtleties, I've included extensive tests, including an end-to-end style test that uses a combination of other techniques to obtain them that is actually more reliable, but that would require new production dependencies, or nontrivial unsafe code in production, to do outside the tests.
I don't think CI covers the cases most likely to go wrong here, but I have tested this on a 32-bit x86 Windows 10 system, on a 64-bit x64 Windows 10 system with both 64-bit and 32-bit builds, and an ARM64 (AArch64) Windows 11 system with 64-bit ARM, 64-bit x64, and 32-bit x86 builds, both manually and by running all tests for
gix-path
, including the new tests I added.Supplementary material
This is most of the supplementary material I included with the advisory. In a few areas it is lightly edited. It is of much less interest than the advisory. Portions that would be of no interest at all, such as those relating to coordination or copyediting of the advisory text, are omitted. Portions consisting of code that I was proposing are likewise omitted other than my descriptions and commentary on that code, since the code is in the pull request.
Some diagnostically useful payloads
To confirm that any executable can be run rather than only those with digital signatures or otherwise marked as trusted, and even more so to make sure I understood what
git
is being called to do, I made two simple diagnostic utilities that I used as alternative payloads tocalc.exe
:showargs
, which displays and distinguishes its command-line arguments in a console even if all standard streams are redirected.showargsw
, which is likeshowargs
but it displays a graphical message box rather than using the console.Either of these can be renamed to
git.exe
and placed as described in the advisory.Possible fixes for the vulnerability
This vulnerability was introduced a few weeks ago in #1419, due to the hard-coded paths in the crate-internal
gix_path::env::git::ALTERNATIVE_LOCATIONS
static item. Ways it could be fixed that I think are reasonable include:I favor way 2, which I think achieves the intent of the current hard-coded paths. This way could also be extended--though it may be better to add this later rather than in the bugfix--to include the user's local program files directory. Git for Windows may exist there either due to being placed there by the installer, or by the user manually extracting the portable version.
Although way 1 has the advantage of simplicity and it can be superseded later, it has the disadvantage that some users may judge that they are not vulnerable and keep using the unpatched version for the alternative locations feature, and some of those users may be mistaken in that judgment.
I have not extensively researched way 3 but I worry it may lead to strange false matches due to installations that the registry still records as installed due to an error during uninstallation, or old Git for Windows installations that users put in weird places and then forget about because they were in weird places. I think the alternative locations should be specific, small in number, and as predictable as possible without sacrificing correctness, which is also in the spirit of the original.
Way 3 should not be confused with getting the program files paths themselves from the registry, which is one possible approach to implementing way 2.
Observing the bug and fix
This bug prevents
git.exe
from being found in an alternative location on a 32-bit Windows system even in the absence of any exploit, because the path it uses underC:\Program Files
is 64-bit specific.I have run unit tests as much as I can--I cannot get the full test suite to reliably build and run on the 32-bit Windows 10 system I am using for testing, due to build errors in dependencies reporting insufficient memory resources. But all
gix-path
tests pass on all platforms tested, including that 32-bit Windows 10 system, where I also manually verified that the vulnerability is present without the patch and absent with it. I have checked that the new tests I added fail in the absence of the functionality they are testing for.On my main x86-64 development system, I ran the full test suite, with both
x86_64-pc-windows-msvc
andi686-pc-windows-msvc
targets. All tests passed on the 64-bit build and there were no new failures on the 32-bit build. I also ran thegix-path
tests but not the full test suite on an ARM64 (AArch64) system, withaarch64-pc-windows-msvc
as well asx86_64-pc-windows-msvc
andi686-pc-windows-msvc
(64-bit ARM builds of Windows support those other architectures by emulation). All thegix-path
tests passed there too, except that a performance-related test failed due to taking too long on the two targets requiring emulation.Scope of the fix - current exclusions
There are two changes I considered but decided not to propose as part of the fix, though I think it makes sense to explore them later.
1. Simpler paths could be used
It seems to me that the paths being used are more complicated than necessary.
Git for Windows has target-specific paths to
git.exe
, whichgix-path
currently (as well as with my patch) uses:But it also has these paths that do not encode any architecture or toolchain, and this is even in the directory that Git for Windows places in the
PATH
when installed with default choices:And there is also this, alongside the
cmd
andmingw*
directories and not to be confused withmingw*\bin
:All of these are relative to the top-level installation directory, which is usually named
Git
. I think at least thatcmd
directory is always present, and maybe bothcmd
andbin
are.Currently, the target-specific directory is always called either
mingw64
ormingw32
, but in the broader MSYS2 world, those are not the only possible target environments. My guess is that when Git for Windows starts offering ARM64 (AArch64) builds, the directory that is currently always calledmingw64
ormingw32
will be calledclangarm64
in those builds.The reason I have not implemented this simplification in my patch is that it produces different observable results for the path to
git.exe
even in ordinary situations, including when that path is obtained through public functions ingix_path::env
. Maybe there is some reason to prefer these paths.However, please note that it should never make a difference for the output of
git --exec-path
. That giveslibexec/git-core
under an architecture and target specific path, and outputs the same path whichever of the above mentionedgit.exe
files is run.2. Local paths could be added
As briefly mentioned above, there can also be a "local" per-user program files directory. It is most often located at
AppData\Local\Programs
relative to the user's user profile (i.e. home) directory. It is not created with the user profile directory, but it is is created the first time an application is installed there. For some applications, including Git for Windows, beginning the installation process will create it even before any step that copies files.Limited users cannot install programs in the system-wide program files directories. So applications that are installed per-user--which administrators may also choose to do for themselves--are usually installed in the user's local program files directory.
It seems to me that, after checking
PATH
, this directory should ideally be checked first. If a user running an application that usesgix-path
has installed Git for Windows in their local program files directory, they presumably prefer that installation to be used over a system-wide installation they may not even control.Each user has at most one local program files directory, i.e., at most one location officially recognized as such. It may contain a mix of 32-bit and 64-bit applications. Therefore it may make sense to figure out whether to implement this feature at the same time as whether it makes sense to drop the use of architecture-specific paths, though it could be done either way by checking the user's local directory with both architecture-specific subdirectories.
(We can also check both subdirectories everywhere, which is reasonable, but I lean slightly away from it. Additional such directories that are not from Git for Windows--for example if users try copying directories from a separate MSYS2 installation to see if Git for Windows will use it, which is something I have personally tried in the past--may be unsuitable.)
So how do we find the program files directories?
Although the advisory focuses on the common case of default setups, in rare cases the paths may differ from what is typical. Microsoft recommends checking for them rather than hard-coding them, and I think we should.
There is more than one way to check for them. All are secure, as far as I can tell, but they are not equivalent, and some are more reliable than others. Rather than attempting to present the complete details here, I've written two programs that I used experimentally to research how they behave, as well as to figure out and document how to access them from Rust. These programs are heavily commented:
pfdirs
reports program files information obtained in all of the ways.knfo
queries the known folders system and reports all known folders with their paths or the error that occurs when attempting to get their paths.With that said, here's a summary of the four ways:
knfo
, I used the COM-based API to list them, to ensure I would discover all relevant ones. But this isn't necessary when one only needs to query specific preexisting known folders. Inpfdirs
, I used the simpler API which requires less resource acquisition and release. In both cases I used thewindows
crate and had to writeunsafe
blocks. If specific error information is not required about why known folders aren't available, then theknown-folders
crate can be used, which is the simplest of all.\\?\
prefixed paths are possible, though I don't know if a program files directory can actually have such a path. I think all supported versions of Windows, including extended forms of support, have known folders; they were introduced in Windows Vista. As such, I don't think we should use CSIDLs.LOCALAPPDATA
variable may be safe.winreg
crate facilitates querying the registry without writing unsafe code. As far as I know, no Microsoft documentation recommends the registry over other approaches for this, and I do not recommend it as a primary strategy.In my opinion, for the system-wide program files directories (which currently are the only ones we use), we should either use environment variables for all or none. This is because, in situations where a parent process customizes the environment, the effect of consulting environment variables for some of the paths but not others would be extremely strange and difficult to reason about.
Further notes on environment variables
A summary of the rules for how they are set
On a 64-bit system, the
ProgramFiles
environment variable is inherited from an architecture-specific environment variable, rather than from the same-named environment variable. This way, a child process gets aProgramFiles
variable suitable to its own architecture, rather than its parent's, at least so long as the parent process doesn't remove those architecture-specific variables. If the relevant architecture-specific variable is not passed down, thenProgramFiles
is inherited fromProgramFiles
itself. On a 32-bit Windows system, it is always inherited from the same-named variable and the other variables are not significant.The architecture-specific environment variables are
ProgramW6432
,ProgramFiles(x86)
, andProgramFiles(ARM)
. The latter is less relevant because it is for 32-bit ARM programs and Rust has no 32-bit ARM targets for Windows (at least viarustup
), and it does not need to be examined ingix-path
. It is not quite totally irrelevant, though, since the parent process of a process built withgix-path
can be a 32-bit ARM process. As for 64-bit ARM, that has the same program files directory as x86-64.Experimental confirmation and eludication
The Microsoft documentation is somewhat vague about how this works, so I made cross-platform utilities
useenv
andprenv
similar to theenv
andprintenv
Unix commands, respectively.useenv
, likeenv
, recognizes-i
to unset all non-specified variables and-u
to unset the single variable whose name is its operand. On an x86-64 system, withuseenv
andprenv
built for the target architectures indicated by their modified names:If a parent process passes none of those through, we get none. If it passes
ProgramFiles
but not the others, which is an easy mistake to make if attempting to produce a sanitized environment, we get that forProgramFiles
and learn of at most one program files directory, which may not even be the one that corresponds to our process architecture.Subtle failure modes found in the wild
Some stranger situations are also plausible: suppose a parent process is 64-bit and passes only
ProgramFiles
andProgramFiles(x86)
through, and we are a 32-bit process:So environment variables are of limited reliability...
But the case for environment variables remains strong
...Yet they also have significant advantages:
std::env::var_os
.My ARMv7 attempts
Because the parent could be a 32-bit ARM process but there's no Rust target for that on Windows, I also wrote a
wenv
tool in C, related toprenv
, and was going to write another one related touseenv
.I didn't get thar far. The only interesting thing about
wenv
is that I cross-compiled it for a 32-bit ARM target. The build succeeded, but I was unable to run it, due to missing DLLs. It seems I don't really know how to make a working 32-bit ARM program on Windows, even though my ARM64 VM does have 32-bit ARM alternates for some programs such as WordPad.Hopefully I'll be able to figure that out eventually, but I don't think it's a blocker here.
To detect or not detect the system architecture
If we use environment variables, there is the question of whether to always look at all of them, or to look at all of them when the system is 64-bit but only at
ProgramFiles
when the system is 32-bit. Both approaches fix the bug: on a 32-bit system, the others would presumably be absent, and if present it is presumably intentional.I am not sure what is best, but I favor having the same behavior across systems, and thus not programmatically checking the system architecture. This is also simpler. Also, to the best of my knowledge, the only ways to check are with unsafe code or by adding another library dependency.
However, the tests still have to check, at least for an end-to-end style test that compares the situation on the running system with the behavior of the code under test. The traditional approach still works. I was concerned this technique might report an x86-64 process running by emulation on an ARM64 Windows system as 32-bit, and some old documentation I was reading seemed to suggest this. I wrote the very simple program
wowreally
to check, and confirmed that this does not happen; when both are 64-bit, it is not considered to be "Windows on Windows" emulation.My view
In my opinion, detecting the program files directories' locations, even though it intuitively feels like it should be simple, is very easy to get wrong, and if done should be robustly tested. Yet I think it is worthwhile to do, because:
Program Files (x86)
on a 32-bit Windows system where that path is not special, occasionally a rare system, perhaps a thin client, may even not haveC:
as the system drive. (These days, Windows has a strong preference for labeling its own installation driveC:
, but it is not recommended to assume this.) Getting the information from the system should automatically do the right thing in these rare situations.PATH
is worthwhile because omitting it is one of the options directly offered by the Git for Windows installer, and such an installation is usable through Git Bash.ALTERNATIVE_LOCATIONS
can be implemented in one way, and an end-to-end style test can be written for it using another way or combination of ways to prepare its expected values.Regarding that last point, I think this approach is beneficial even if it is to be rejected, because it can offer insight into which approach ought ultimately to be used. If the implementation and a test helper both retrieve the information from the system but in independent ways, that may give a sense of what it would be like to do it the other way.
The comparison will not be perfect, because in the test setup
expect
should be used to assert that the test has the information it needs, while in the code under test any errors finding program files paths should simply lead to fewer alternative locations. But I think it is nonetheless illustrative.What I did
Because environment variables do not require additional unsafe code or new dependencies in production, and support not just integration-style tests but also unit-style tests mocking
var_os
, I implementedALTERNATIVE_LOCATIONS
that way.To form test system specific expected values in the integration-style test, I used known folders except for the 64-bit path, for which I used the registry.
Because the alternative locations are now to be obtained from the system, the element type of
ALTERNATIVE_LOCATIONS
should be someOsString
-based type. Since these are paths, I usedPathBuf
, which also accords with howALTERNATIVE_LOCATIONS
is used everywhere, and slightly simplifies some uses of it.The meat of the implementation is in the
locations_under_program_files
function that uses dependency injection so that unit-style tests can pass stub implementations.I put all the tests related to alternative locations, and their helpers, in a nested
tests::loc
module [note: since renamed totests::locations
].The unit-style tests have helper macros used to form
os_env
-like stubs that are really maps in function form, and to decrease duplication across assertions.The unit-style tests are divided into
locations_under_program_files_ordinary
andlocations_under_program_files_strange
.The integration-style test delegates most of its work to its helpers, which are more involved. This is partly because this includes obtaining the information about program files directory locations in an independent way.
The design I chose was to make assertions about those paths, and then make sure that those paths are the ones that appear as prefixes in the alternative locations produced by the code under test.
First there are helpers, mainly
PlatformArchitecture
that will be used in assertions that vary by system architecture, which unlike the process architecture is not known at compile-time.The
ProgramFilesPaths
struct is tasked with obtaining and representing the paths that we get independently without the use of environment variables so we can use them as expected values, and checking that they are themselves reasonable.The
RelativeGitBinPaths
struct contains the core assertion logic for the integration-style test, making direct claims about what the code under test produces. Its name may benefit from refinement, or maybe the awkwardness of the name is a sign that it should just be two helper functions.Its
assert_from
constructor checks that paths obtained from the code under test, i.e. by readingALTERNATIVE_LOCATIONS
, have the same program files directories as their path prefixes and strips them to obtain the suffixes. Then theassert_architectures
method on the resulting object asserts that the suffixes are correct.Then the test function for the integration-style test, named
alternative_locations
, puts it together, constructing aProgramFilesPaths
viaobtain_envlessly()
and callingvalidate()
to check it, then using it to testALTERNATIVE_LOCATIONS
by callingRelativeGitPaths::assert_from()
to check the matching program files paths and extract the suffices, and callingassert_architectures()
to check the extracted suffixes.The foregoing is all Windows-specific. The
#[cfg(not(windows))]
test code has no helpers and is much shorter, consisting only of the non-Windowsalternative_locations
test asserting thatALTERNATIVE_LOCATIONS
is empty.