[wsl2] Import failed with: The operation timed out because a response was not received from the virtual machine or container. #4726

Closed
simonferquel opened this issue Dec 3, 2019 · 27 comments
Labels
distro-mgmt problems related to import, export, conversion

Comments

@simonferquel

Please fill out the below information:

  • Your Windows build number: 19018

  • What you're doing and what's happening:
    Sometimes (very rarely, but it happens) registering a distro with wsl --import <...> --version=2 fails with "The operation timed out because a response was not received from the virtual machine or container."
    After that, wsl seems totally unusable (wsl -l -v hangs).
    I can confirm that the machine on which I last saw the error has no file encryption/compression enabled.

  • What's wrong / what should be happening instead:
    wsl --import should work reliably, and errors should not put wsl in an unstable state.

  • For WSL launch issues, please [collect detailed logs]:
    The issue happened on a CI machine and it happens randomly and rarely, so I am unable to get detailed logs for that.
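
A minimal sketch of how a CI job might tell a plain import failure apart from the hang described above: every wsl.exe call gets a deadline, and a failed import is followed by a short wsl -l -v probe. The function names, the timeout values, and the UTF-16LE decoding of wsl.exe output are assumptions for illustration, not something taken from this report.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

# A runner takes wsl.exe arguments and a timeout in seconds and returns
# (exit_code, output). exit_code is None when the command did not finish in time.
Runner = Callable[[List[str], float], Tuple[Optional[int], str]]


def run_wsl(args: List[str], timeout_s: float) -> Tuple[Optional[int], str]:
    """Run wsl.exe with a deadline so a hang surfaces as (None, "")."""
    try:
        done = subprocess.run(["wsl.exe", *args], capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None, ""
    # Assumption: wsl.exe writes its own messages as UTF-16LE text.
    return done.returncode, done.stdout.decode("utf-16-le", errors="replace").strip()


def import_distro(name: str, install_dir: str, tarball: str,
                  runner: Runner = run_wsl, timeout_s: float = 300.0) -> str:
    """Return "ok", "failed" or "unresponsive" (hypothetical labels)."""
    code, _ = runner(["--import", name, install_dir, tarball, "--version", "2"], timeout_s)
    if code == 0:
        return "ok"
    # After a failed import, check whether wsl still answers a cheap query.
    probe_code, _ = runner(["-l", "-v"], 30.0)
    return "unresponsive" if probe_code is None else "failed"
```

Passing the runner as a parameter keeps the decision logic testable on machines without WSL.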

@simonferquel
Author

wsl --shutdown also fails after encountering this issue, with "server execution failed".

@simonferquel
Author

According to docker/for-win#5256 (comment), this also happens on 19033

@simonferquel
Author

However, our user does not see "server execution failed" on subsequent wsl commands. It might be an unrelated problem.

@benhillis
Member

@simonferquel - could you take a trace please?

@simonferquel
Author

It is difficult to trace because it is difficult to reproduce consistently.
The user reporting the issue on 19033 seemed to have the compressed file attribute set. I wonder if, under some conditions, Windows automatically compresses ~/AppData. I'll try to explicitly unset it and see if I can reproduce.
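
As a rough illustration of the check described here, the sketch below reports whether a directory carries the NTFS compressed or encrypted attribute. It relies on the st_file_attributes field that Python exposes on Windows only; the helper names are hypothetical, and this is not the tooling actually used in this thread.

```python
import os
import stat
from typing import List

# Attribute bits defined by the standard stat module.
FLAGS = {
    "compressed": stat.FILE_ATTRIBUTE_COMPRESSED,
    "encrypted": stat.FILE_ATTRIBUTE_ENCRYPTED,
}


def flagged_attributes(attrs: int) -> List[str]:
    """Name the attributes of interest that are set in a raw attribute mask."""
    return [name for name, bit in FLAGS.items() if attrs & bit]


def check_dir(path: str) -> List[str]:
    """Inspect one directory; returns [] on platforms without st_file_attributes."""
    attrs = getattr(os.stat(path), "st_file_attributes", 0)
    return flagged_attributes(attrs)
```

A pre-flight step could call check_dir on the import target and clear whatever it reports (for example with the built-in compact and cipher tools) before running wsl --import.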

@simonferquel
Author

@benhillis after automatically uncompressing / unencrypting the folder, we had far fewer occurrences of this bug.

However, we still see it happening on Windows Home 19041. The problem is that we have no way to connect to those machines over RDP to investigate once the problem is triggered :/.
Do you think systematically enabling the traces on all our tests would have an impact on WSL behavior? (That way, in case of failure, we can easily send them here.)

@benhillis
Member

Enabling the traces during your tests would be a good idea and should not impact functionality.
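
One way to wire this into a test suite, sketched under assumptions: a context manager that starts a trace before a test, always stops it afterwards, and keeps the trace file only when the test failed. The start and stop commands are placeholders supplied by the caller (the real ones come from the WSL log collection instructions), and traced is a hypothetical name.

```python
import contextlib
import os
import subprocess
from typing import Callable, Iterator, List


@contextlib.contextmanager
def traced(start_cmd: List[str], stop_cmd: List[str], trace_file: str,
           run: Callable[[List[str]], object] = subprocess.check_call) -> Iterator[None]:
    """Record a trace around a block; discard the file if the block succeeds."""
    run(start_cmd)
    failed = False
    try:
        yield
    except BaseException:
        failed = True
        raise
    finally:
        # Always stop the trace session, even when the test blew up.
        run(stop_cmd)
        if not failed and os.path.exists(trace_file):
            os.remove(trace_file)
```

Deleting traces from passing runs keeps the archived logs limited to the rare failing cases.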

@simonferquel
Author

We just updated our Jenkinsfiles to do just that. Brace for a lot of log parsing :D

@simonferquel
Author

(Also, don't hesitate to ping me if you think other traces could be interesting, for HNS maybe?)

@simonferquel
Author

lxcorelogs.zip
Hi @benhillis, got a repro today with logs recorded:
In our logs we have an error on deploying a distro - this is part of the first start logic:

[14:37:33.300][WslEngine         ][Error  ] Failed to deploy distro docker-desktop-data to C:\Users\docker\AppData\Local\Docker\wsl\data: exit code: -1
stdout unicode: The operation timed out because a response was not received from the virtual machine or container.

After that error, we do a second try. This time it is failing when terminating the lingering docker-desktop distro:

[14:37:46.897][WslEngine         ][Error  ] Failed to terminate distro: exit code: -1
stdout unicode: The remote procedure call failed.

Hope the logs can help!

@benhillis
Member

Thanks @simonferquel. Do you happen to have any memory dumps for the lxssmanager service? They would be either WER or Watson crashes; you should be able to find them in eventvwr.

Thanks!

@simonferquel
Author

Unfortunately these are ephemeral Azure agents without RDP access (and the particular instance on which this happened has already been scrapped). Would you have a PowerShell script at hand to automate collecting this?

@benhillis
Member

@simonferquel, no I'm not aware of any automated way to collect these dumps.
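
No ready-made script is offered in this thread. As a sketch only, a CI teardown step could copy any .wer report files into the build artifacts before the agent is destroyed. The report root is an assumption (Windows Error Reporting commonly stores reports under C:\ProgramData\Microsoft\Windows\WER), and collect_wer_reports and name_filter are hypothetical names.

```python
import shutil
from pathlib import Path
from typing import List


def collect_wer_reports(wer_root: str, dest: str, name_filter: str = "") -> List[Path]:
    """Copy *.wer files found under wer_root into dest and return the copies.

    name_filter, if given, must appear (case-insensitively) in the name of the
    report's parent directory, e.g. to keep only reports for one executable.
    """
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied: List[Path] = []
    for wer in sorted(Path(wer_root).rglob("*.wer")):
        report_dir = wer.parent.name
        if name_filter.lower() not in report_dir.lower():
            continue
        # Prefix with the report directory name so copies do not collide.
        target = dest_dir / f"{report_dir}_{wer.name}"
        shutil.copy2(wer, target)
        copied.append(target)
    return copied
```

This only gathers report files; whether they are enough to retrieve a full dump is a question for the maintainers.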

@simonferquel
Author

OK, let me ping @StefanScherer so that we find a way to enable at least remote PowerShell on those machines. We'll monitor these kinds of errors and try to get dumps if we see this happen in the coming days.

@simonferquel
Author

traces and wer file.zip
We managed to get a .wer file alongside the ETL traces on our latest repro. I suppose the .wer file is enough for you to get a dump from the Microsoft diagnostics server?

Build is 19041

@benhillis
Member

@simonferquel - Thanks, I was able to confirm this is an issue we have fixed and are working on backporting to 19041.

@simonferquel
Author

Awesome! That is actually the most frequent issue in our CI nowadays, glad to hear a fix is coming :)

@aykborstelmann

aykborstelmann commented Mar 13, 2020

@simonferquel - Thanks, I was able to confirm this is an issue we have fixed and are working on backporting to 19041.

What is the current state of the issue? I have a similar issue and I'm currently on 19041 ...

@simonferquel
Author

Another report, with slightly different wsl.exe output (zip contains wsl.exe output + etl traces):
report206790.zip

@briangreenery

briangreenery commented Nov 18, 2020

This happens to me reliably every time I try to start Ubuntu that I got from the Microsoft Store.

I followed the steps here: https://docs.microsoft.com/en-us/windows/wsl/install-win10

And the very first time I started the Ubuntu container, it prompted me for the root password, then displayed:

The operation timed out because a response was not received from the virtual machine or container.

Now every time I try to start the container, it hangs for a while then prints that message. This is easily reproducible for me since it happens every time.

I'm running Windows 10 19042.630. Is there any debugging information that I can collect or any steps that I can take to get WSL2 working?


edit: I installed VirtualBox before I installed WSL2 since I did not know that they might conflict. Perhaps that is the issue?


edit2: I uninstalled VirtualBox but I still get the same error message.

@pranithan-kang

wsl --shutdown also fails after encountering this issue with server execution failed

Thanks for this clue. I tried this command and the problem was solved.

@Dale-NUC

Since this issue is still open, I would like to add that today I experienced a WSL import failing and WSL hanging thereafter. I had run the import on the same system and a couple of others without failure until now.

  • System is Win10Home, 20H2, build 19042.985.

More details:

  • The failed import did not complete; the .vhdx was not created.
    • I was left with a folder structure containing a lot of different files. I tried to manually delete the files created during the failed import, and now I have a folder which cannot be deleted because there is a folder in the tree named "apache.". Remove-Item with the -Force option is not able to remove the folder either.
    • Removing the folder from Explorer returns an error: "Could not find this item".

I've since disabled and re-enabled WSL on the system, which seemingly fixed the WSL problem (WSL doesn't hang anymore and the import completed on the second attempt). However, I still cannot delete the target folder created for the failed import.
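
A possible way around the undeletable "apache." folder, offered as a sketch rather than a confirmed fix for this case: Win32 path normalization drops a trailing dot from a name, which is why ordinary tools report that the item cannot be found. Prefixing an absolute path with \\?\ asks Windows to skip that normalization. The helper names below are hypothetical.

```python
import shutil


def to_extended_path(abs_path: str) -> str:
    """Add the \\\\?\\ prefix to an absolute Windows path, leaving the rest untouched."""
    if abs_path.startswith("\\\\?\\"):
        return abs_path  # already in extended form
    if abs_path.startswith("\\\\"):
        # UNC paths (\\server\share\...) use the \\?\UNC\ form.
        return "\\\\?\\UNC\\" + abs_path[2:]
    return "\\\\?\\" + abs_path


def remove_tree(abs_path: str) -> None:
    """Delete a directory tree through its extended-length path (Windows only)."""
    shutil.rmtree(to_extended_path(abs_path))
```

The path must already be absolute and is deliberately not passed through any normalizing helper, since normalization is what strips the trailing dot.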

@CumpsD

CumpsD commented Oct 7, 2021

This started happening to me after an upgrade to Windows 11.

Failed to deploy distro docker-desktop to C:\Users\xxxxx\AppData\Local\Docker\wsl\distro: exit code: -1
 stdout: The operation timed out because a response was not received from the virtual machine or container.

@anu-01

anu-01 commented Oct 14, 2021

This started happening to me after an upgrade to Windows 11.

Failed to deploy distro docker-desktop to C:\Users\xxxxx\AppData\Local\Docker\wsl\distro: exit code: -1
 stdout: The operation timed out because a response was not received from the virtual machine or container.

I face the same issue as well after upgrading to Windows 11.

@anu-01

anu-01 commented Oct 14, 2021

This started happening to me after an upgrade to Windows 11.

Failed to deploy distro docker-desktop to C:\Users\xxxxx\AppData\Local\Docker\wsl\distro: exit code: -1
 stdout: The operation timed out because a response was not received from the virtual machine or container.

What worked for me:
Install WSL again on Windows 11 by running this command in cmd/Windows PowerShell: wsl --install
Then verify that wsl responds using this command (it lists the distributions available to install): wsl --list --online
Restart Docker Desktop and voilà, it worked 👍

@OMIDBIDBID

I was working on connecting Git to WSL when I hit this error too:
"The operation timed out because a response was not received from the virtual machine or container." I also can't use wsl --user root. I only want an Ubuntu desktop; why won't Windows let me have it through WSL?

Contributor

This issue has been automatically closed since it has not had any activity for the past year. If you're still experiencing this issue please re-file this as a new issue or feature request.

Thank you!
