Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump Vampire/setup-wsl from 4.1.1 to 5.0.0 #2008

Merged
merged 1 commit into from
Mar 12, 2025

Conversation

dependabot[bot]
Copy link
Contributor

@dependabot dependabot bot commented on behalf of github Mar 3, 2025

Bumps Vampire/setup-wsl from 4.1.1 to 5.0.0.

Release notes

Sourced from Vampire/setup-wsl's releases.

v5.0.0

Highlights

WSL update command is now only done for GitHub-hosted windows-2022 runners

Breaking Changes

Before this release a WSL update command was always issued if WSLv2 was going to be used. The windows-2025 image now has the necessary update already installed, so currently only the GitHub-hosted windows-2022 runners need the call to get WSLv2 working. But as there are still updates available for the windows-2025 runners it would potentially waste time unnecessarily.

So now the WSL update command is only done for WSLv2 on GItHub-hosted windows-2022 runners, where it is known that the update is necessary currently to make it work at all.

This could break running on self-hosted runners if they do not have the necessary WSL update installed. Because of that this is a breaking release.

In such a case a WSL update command can be manually issued before calling the action, or much preferable is to install the WSL update in the runner image so that no time is wasted during the workflow execution unnecessarily.

Details

  • Increase version to 5.0.0 [a87a88ea7c6d9da627e393db7fef4ac1c35bc85d]
  • Only do WSL update command on windows-2022 runner image hosted by GitHub [f4cd6461ad271d606be1bebc312da4e3ec37ab25]
Commits
  • f40fb59 [Gradle Release Plugin] - pre tag commit: 'v5.0.0'.
  • f4cd646 Only do WSL update command on windows-2022 runner image hosted by GitHub
  • a87a88e Increase version to 5.0.0
  • c19a8f3 [Gradle Release Plugin] - new version commit: 'v4.1.2-SNAPSHOT'.
  • See full diff in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot dependabot bot added dependencies Pull requests that update a dependency file github_actions Pull requests that update GitHub Actions code labels Mar 3, 2025
Copy link
Member

@EliahKagan EliahKagan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Cygwin failure here is due to #2004 and the same as occurs on main; it will go away with #2009. Because it is unrelated to the changes in this PR and affects only a workflow where the action this PR upgrades is not used, I think it need not block this.

The other failure is of the action being updated, and it looks to be due to Vampire/setup-wsl#72. This problem is intermittent and not due to the changes in this PR, but it is related to the changes in this PR, both in the general sense that it is a failure in the action being upgraded, and in the specific sense that the upgrade includes changes that are related to the error. But the changes do not seem to exacerbate the error; instead, they mitigate it, but the mitigation unfortunately does not apply to the way we are currently using setup-wsl.

I recommend rerunning the workflow on this PR just in case I am missing something, and merging the PR if it passes. I am probably not missing anything, because the failure would have happened on all the Windows test jobs instead of just one if it were due to a non-intermittent cause, and because I made a branch from this in my fork and the tests passed, including when rerun.

We can also modify the way we use WSL to prevent the error from ever occurring, which I plan to open a PR for, but I don't think that needs to be included here, nor to block this.

Relevant log output

From the run log here:

C:\Windows\system32\wsl.exe --set-default-version 2
  For information on key differences with WSL 2 please visit https://aka.ms/wsl2
  
  
  The operation completed successfully.
  
  
  ]C:\Windows\system32\wsl.exe --update
  Forbidden (403).
  
  
  ::debug::Error: The process 'C:\Windows\system32\wsl.exe' failed with exit code 4294967295%0A    at ExecState._setResult (file:///D:/a/_actions/Vampire/setup-wsl/node_modules/@actions/exec/lib/toolrunner.js:592:1)%0A    at ExecState.CheckComplete (file:///D:/a/_actions/Vampire/setup-wsl/node_modules/@actions/exec/lib/toolrunner.js:575:1)%0A    at ChildProcess.<anonymous> (file:///D:/a/_actions/Vampire/setup-wsl/node_modules/@actions/exec/lib/toolrunner.js:469:1)%0A    at ChildProcess.emit (node:events:5[19](https://github.com/gitpython-developers/GitPython/actions/runs/13631523273/job/38100167395?pr=2008#step:4:21):28)%0A    at maybeClose (node:internal/child_process:1105:16)%0A    at Process.ChildProcess._handle.onexit (node:internal/child_process:305:5)%0A
  Failure happened, retrying (The process 'C:\Windows\system32\wsl.exe' failed with exit code 4294967295)

A red herring: encodings

The above looks different from what you see when you view the log. If you're not worried about that, you can skip this section. Otherwise, this describes the cause and why it is not related to the failure seen here.

That looks different from what appears in the linked run output, because the linked run output shows text from the output of wsl.exe with spaces after each letter, e.g., F o r b i d d e n ( 4 0 3 ) . This is not related to the error being reported, nor to the underlying cause of it, but is instead due to the messages being output by wsl.exe in the UTF-16LE encoding. The resulting null bytes are displayed in different ways based on how the log is examined. Copying them, as here, makes them seem to vanish, while looking at them in the non-raw log shows them as spaces. (This is not new, but it can be misleading if not addressed, so I figured I'd cover it here.)

To verify that the spaces are just null bytes, as occur in those positions when UTF-16LE is interpreted as UTF-8, here's the same thing copied from the raw log:

2025-03-03T13:02:53.1042669Z [command]C:\Windows\system32\wsl.exe --set-default-version 2
2025-03-03T13:02:53.1239543Z F�o�r� �i�n�f�o�r�m�a�t�i�o�n� �o�n� �k�e�y� �d�i�f�f�e�r�e�n�c�e�s� �w�i�t�h� �W�S�L� �2� �p�l�e�a�s�e� �v�i�s�i�t� �h�t�t�p�s�:�/�/�a�k�a�.�m�s�/�w�s�l�2�
2025-03-03T13:02:53.1240472Z �
2025-03-03T13:02:53.1240758Z �
2025-03-03T13:02:53.1241229Z �T�h�e� �o�p�e�r�a�t�i�o�n� �c�o�m�p�l�e�t�e�d� �s�u�c�c�e�s�s�f�u�l�l�y�.�
2025-03-03T13:02:53.1241806Z �
2025-03-03T13:02:53.1242099Z �
2025-03-03T13:02:53.2195310Z �[command]C:\Windows\system32\wsl.exe --update
2025-03-03T13:02:53.3985816Z F�o�r�b�i�d�d�e�n� �(�4�0�3�)�.�
2025-03-03T13:02:53.3986360Z �
2025-03-03T13:02:53.3986573Z �
2025-03-03T13:02:53.6038224Z �::debug::Error: The process 'C:\Windows\system32\wsl.exe' failed with exit code 4294967295%0A    at ExecState._setResult (file:///D:/a/_actions/Vampire/setup-wsl/node_modules/@actions/exec/lib/toolrunner.js:592:1)%0A    at ExecState.CheckComplete (file:///D:/a/_actions/Vampire/setup-wsl/node_modules/@actions/exec/lib/toolrunner.js:575:1)%0A    at ChildProcess.<anonymous> (file:///D:/a/_actions/Vampire/setup-wsl/node_modules/@actions/exec/lib/toolrunner.js:469:1)%0A    at ChildProcess.emit (node:events:519:28)%0A    at maybeClose (node:internal/child_process:1105:16)%0A    at Process.ChildProcess._handle.onexit (node:internal/child_process:305:5)%0A
2025-03-03T13:02:53.6044228Z Failure happened, retrying (The process 'C:\Windows\system32\wsl.exe' failed with exit code 4294967295)

An analogous encoding issue arises when dealing with error messages from WSL (rather than from a program running in a WSL distribution) due to running the WSL-associated bash.exe wrapper, which our test suite handles this way:

# When bash.exe is the WSL wrapper but the output is from WSL itself rather than
# code running in a distribution, the output is often in UTF-16LE, which Windows
# uses internally. The UTF-16LE representation of a Windows-style line ending is
# rarely seen otherwise, so use it to detect this situation.
if b"\r\0\n\0" in stdout:
return stdout.decode("utf-16le")

That works well enough because it is only used in the test suite and only for a very narrow purpose. Nonetheless, there may be a way to incorporate encoding detection into the setup-wsl action to produce more readable messages. One complicating factor is that the total output contains UTF-16LE portions as well as UTF-8 portions. I hope eventually to investigate and, if I can, propose a patch to the setup-wsl action to help with this. But there are a number of other things I'm working on that take precedence, so I don't know when I would get around to doing that. In any case, the GitPython test suite performs detection because it needs to examine the text of the output. That doesn't seem to be needed in setup-wsl, so this doesn't cause or contribute to any other problems.

The actual problem

As described in Vampire/setup-wsl#72 (comment), the windows-2022 image requires a WSL update to run WSL2. Downloading this update from GitHub-hosted runners sometimes fails with HTTP 403, and automated retries within the same job run are not usually enough to make it go away, even though rerunning the job or workflow manually is. This seems to happen due to behavior of the remote server, rather than something that can be fixed in the action.

Although Vampire/setup-wsl#72 is closed as not planned, it is mitigated in Vampire/setup-wsl@f4cd646 by refraining from attempting the update check on the windows-2025 (actions/runner-images#11228) where it is no longer needed (Vampire/setup-wsl#72 (comment)). That is actually the change in the version this Dependabot PR is upgrading to, v5.0.0:

Before this release a WSL update command was always issued if WSLv2 was going to be used. The windows-2025 image now has the necessary update already installed, so currently only the GitHub-hosted windows-2022 runners need the call to get WSLv2 working. But as there are still updates available for the windows-2025 runners it would potentially waste time unnecessarily.

So now the WSL update command is only done for WSLv2 on GItHub-hosted windows-2022 runners, where it is known that the update is necessary currently to make it work at all.

While windows-2025 is available, windows-latest still aliases windows-2022, which is why it was possible for this failure to occur in this Dependabot PR.

Fixing the problem here

This Dependabot PR mitigates the problem but does not directly help here. So we should make some further change. One possible change is to switch to windows-2025, which we could switch to explicitly or wait for windows-latest to point to. In that case, the update in this PR is a necessary part of fixing it.

This problem can also be fixed by switching to WSL1. When we started using setup-wsl in #1745, it defaulted to WSL1 on the runner we were using at that time. At some point, it switched to defaulting to WSL2. (This is related to the availability of nested virtualization, and I think also to other factors. See actions/runner-images#5920 and actions/runner-images#10563. I don't remember what Windows image windows-latest was on as of #1745, but I believe windows-2019 on GitHub-hosted runners does not support WSL2, or at least not with the setup-wsl action. Recent versions of setup-wsl default to WSL2 on windows-2022 and windows-2025. I am not sure if windows-2022 on GitHub-hosted runners has always supported WSL2, or if this started later due the virtualization infrastructure supporting nested virtualization, or if other factors are in play.)

Usually WSL2 is preferable to WSL1, but I think we may actually be better off using WSL1. The test cases that pass and fail seem to do so irrespective of whether WSL1 or WSL2 is used (as well as not depending on what distribution is used in WSL, so long as it provides bash). The check in the test suite for whether we are using WSL works both in WSL1 and WSL2. The core of that check is:

# Output rather than forwarding the test command's exit status so that if a
# failure occurs before we even get to this point, we will detect it. For
# information on ways to check for WSL, see https://superuser.com/a/1749811.
script = 'test -e /proc/sys/fs/binfmt_misc/WSLInterop; echo "$?"'
command = ["bash.exe", "-c", script]
process = subprocess.run(command, capture_output=True)

From this answer to "How can I check if the environment is WSL from a shell script?" on Super User by NotTheDr01ds:

The presence of /proc/sys/fs/binfmt_misc/WSLInterop is a pretty good indicator that you are on WSL. This may be the most reliable method, and it is what Ubuntu's Snapd project uses as its detection mechanism. This file exists under both WSL1 and WSL2 by default. Even when Interop is disabled via /etc/wsl.conf, this file will still be created by WSL at startup.

The answer goes on describe rare and deliberate cases where this would misdetect whether WSL is present. For our purposes, I believe those cases are not important, and that in at least some of them it is actually better that we misdetect, since they seems to arise mainly when someone is deliberately investigating the effect of misdetection.

We only use bash for the existing partial implementation of running hooks. In the future, it seems likely that this will become more robust and sophisticated, and gain many new tests. However, that would only happen after #1791 (or after the problem it is solving is solved in some other way, but my guess is that the conceptually related work I am doing on gitoxide will eventually result in a level of understanding of the relevant considerations such that I would then feel comfortable making any remaining needed changes to #1791 to allow it to be merged, even if its author is not available to critique them and propose further changes, though I hope that will also be able to happen).

Given that our current use of WSL is just in a few tests that end up running hooks with the WSL bash or attempting to do so, the extra memory resources associated with using WSL2 instead of WSL1 on CI do not seem justified. (WSL2 distributions actually run as containers rather than virtual machines, but this means that they all run in the same VM, not that no VM is needed.) After installation, WSL2 tends to be significantly faster than WSL1, but I don't think the difference matters much for the way we are currently using it.

I am aware of two possible reasons to prefer WSL2 for us:

  • My assumption about speed may be wrong, because of the operations involved in installing bash in the Alpine Linux distribution. This is taken care of by setup-wsl but I believe it is achieved by running apk in the distribution.

    This first reason is the more powerful of the two. (I include the following second reason mainly to say why I think it is not compelling.)

  • I don't currently have access to nested virtualization when I run manual tests. Therefore, for some of the weirder cases that involve modifying a system in a way that I would only want to do on a virtual machine where I can restore a checkpoint afterwards to decisively undo them, local testing is only on WSL1. Having CI run WSL2 might theoretically surface something that is not found otherwise.

    (I think systems where bash.exe is absent in System32 exist, including in versions that are still supported with security updates and therefore reasonable to use. But the way I produce this in manual testing is to delete bash.exe from there. That requires either taking ownership of it or performing the rename or delete operation with a token in which the TrustedInstaller SID is enabled. These are unusual things to do to a file in System32 and, while I believe I always do them in a way that has no extra unanticipated effects, I prefer not to rely on that belief.)

    However, this is unlikely. At least for the case of bash.exe being absent in System32, we do not currently attempt to test that on CI anyway. It would be feasible to do so. The virtual machines are recreated for every job on GitHub-hosted runners, so doing strange things in System32 would be self-contained. But this is less important than more common cases and the extra time and usage needed to test it on CI may not be justified.

Since major changes are unlikely prior to further work on #1791, which would eliminate the need for WSL altogether--unless we want to test that it not used anymore even when fully available--I think our current use of WSL is limited enough that it is fine to use either WSL1 or WSL2 but likely preferable to use WSL1.

But for this PR...

Even if we remain implicitly on windows-2022 (via windows-latest until it upgrades) and even if we switch to WSL1, the changes in this PR should cause no harm (the major version bump is only because it is potentially breaking for self-hosted runners that may start out with fewer WSL updates even on Windows Server 2025).

EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request Mar 7, 2025
This avoids an occasional HTTP 403 error updating WSL for WSL2.

For details on that issue and possible approaches, see:
gitpython-developers#2008 (review)
@EliahKagan EliahKagan mentioned this pull request Mar 7, 2025
@Byron
Copy link
Member

Byron commented Mar 12, 2025

@dependabot rebase

Bumps [Vampire/setup-wsl](https://github.com/vampire/setup-wsl) from 4.1.1 to 5.0.0.
- [Release notes](https://github.com/vampire/setup-wsl/releases)
- [Commits](Vampire/setup-wsl@v4.1.1...v5.0.0)

---
updated-dependencies:
- dependency-name: Vampire/setup-wsl
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot force-pushed the dependabot/github_actions/Vampire/setup-wsl-5.0.0 branch from 2e1c5b4 to 54e1c1b Compare March 12, 2025 11:55
@Byron
Copy link
Member

Byron commented Mar 12, 2025

Thanks a lot for the analysis.

Frankly, I am not able to grasp it (or even read it) because it feels like the topic is way above my head and am absolutely happy to just do as you say, you are the expert.

[..] and because I made a branch from this in my fork and the tests passed, including when rerun.

This and my paragraph above should be an indicator that you really should have more capabilities in this repository. Once again, I leave it to you to decide if you'd like to follow up on this offer, and what permissions you'd like to have. If you think more permissions can improve your convenience and/or save you time, I really think it should be done.

@Byron Byron merged commit dc7ba87 into main Mar 12, 2025
47 checks passed
@dependabot dependabot bot deleted the dependabot/github_actions/Vampire/setup-wsl-5.0.0 branch March 12, 2025 12:25
Byron pushed a commit to EliahKagan/GitPython that referenced this pull request Mar 12, 2025
This avoids an occasional HTTP 403 error updating WSL for WSL2.

For details on that issue and possible approaches, see:
gitpython-developers#2008 (review)
@EliahKagan
Copy link
Member

EliahKagan commented Mar 13, 2025

Frankly, I am not able to grasp it (or even read it) because it feels like the topic is way above my head [...]

I think I did not write #2008 (review) all that well, which probably also contributed. I've written a clearer summary below. This is partly to ensure things are clear enough that I will myself be able to understand them in the future. So you are under no obligation to read it. But I suspect you might find it useful as well.

Once again, I leave it to you to decide if you'd like to follow up on this offer, and what permissions you'd like to have. If you think more permissions can improve your convenience and/or save you time, I really think it should be done.

Yes, this sounds like a good idea. I suggest adding me to the gitpython-developers organization, unless there is some reason you prefer not to do that. In addition, please feel free to confer write access or whatever other capabilities of any kind that you expect would be helpful (I am not sure what is best).

The main caveat I can think of is that although, as noted in the readme, GitPython is open to new maintainers, I am not currently able or willing to take on a social role of maintainership. That is, I cannot at this time take on any actual commitments (other than the responsibility to avoid damaging things and to try to fix things if I do inadvertently damage them). Based on #1654 (comment), #1769 (review), and the current context, my guess is that your offer is not contingent on this. But I want to be sure, rather than assuming.

(Speculating wildly: I do not expect that I would be able or willing to take on such a full role in GitPython in the future either. What is more likely is that, at some point in the future, I may be able to take on responsibilities in specific areas, such as security advisories, CI upkeep, and helping out with GSoC or Outreachy mentorship if GitPython ever participates in those.)


WSL2 uses virtualization, while WSL1 uses a translation layer that does not involve virtualization. Because GitHub-hosted runners are virtual machines, using WSL2 on them requires support for nested virtualization. That and other factors affected the availability of WSL2 on CI. We don't need WSL2, and when we started using setup-wsl only WSL1 could be used. So at that time no WSL2-specific problems affected our use of WSL.

Since then, setup-wsl has added support for WSL2. But on some runners it has to download and install a WSL update to allow WSL2 to work. This download is separate from downloading the distribution (such as Debian or Alpine), and it is significantly more likely to fail. That caused the setup-wsl step to fail occasionally. There are two ways to fix that: using WSL1 instead, or switching to a runner where WSL doesn't need to be updated before using WSL2.

The change in setup-wsl 5.0.0, provided in this PR, fixed a bug where WSL was updated before using WSL2 even on Windows Server 2025, where it didn't have to be updated. Our test jobs currently run on Windows Server 2022, so that bugfix didn't fix the problem for us. By chance, the download failure occurred on one of the CI jobs in this PR. This was actually sort of convenient, in that it made the intermittent problem clear. (That this was the cause of the test failure was potentially obscured by how some errors from setup-wsl look weird due to encoding problems.)

The new version improved setup-wsl, and we might in the future even benefit from that improvement, so it made sense to take the Dependabot update here, even though it left the problem we were having unchanged. As for actually solving that problem, we never meant to use WSL2 on CI, there is no decisive reason to prefer WSL2 for what we are doing (especially since we should need neither once #1791 is done), and switching to WSL1 solves the download problem while decoupling it from what Windows version we use. So I opened #2010 to use WSL1, which is now in place.

@Byron
Copy link
Member

Byron commented Mar 13, 2025

Yes, this sounds like a good idea. I suggest adding me to the gitpython-developers organization, unless there is some reason you prefer not to do that. In addition, please feel free to confer write access or whatever other capabilities of any kind that you expect would be helpful (I am not sure what is best).

Great, I have invited you! The permissions are Triage + Write, and there are no obligations or expectations. Just that having these permissions makes your life easier, doing what you do.

Also, thanks for the summary, I think I get it now.

@EliahKagan
Copy link
Member

Thanks! I have joined the gitpython-developers org, and I have also checked that the GitHub interface shows abilities conferred by the new permissions such as the ability to rerun CI checks in this repository (both in PRs and on main), to request reviews on a PR (unprivileged users can only request re-reviews from reviewers who have already reviewed), and to create new branches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file github_actions Pull requests that update GitHub Actions code
Development

Successfully merging this pull request may close these issues.

2 participants