Request: Update QEMU to support emulating AVX instructions on ARM64 hosts #6620
Comments
Oops, the QEMU functionality in QEMU 7.2 hasn’t been released yet. 😅 |
Very important issue for my team. |
The latest Docker for Mac release is apparently still using QEMU 6.2.0:
Looks like QEMU was upgraded to 7.0.0 in Docker Desktop 4.13.0 but was downgraded to 6.2.0 in Docker Desktop 4.13.1 due to some other issue. Is anyone working on trying again to upgrade QEMU? 🥺 |
any news on this? I'm struggling with the |
^ @stephen-turner who previously commented on #5148. |
Negative news: I tested the recent Rosetta 2 support in Docker Desktop but Rosetta 2 does not seem to support AVX either. |
tried on Debian 11. rocket.chat container exiting (132) |
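For readers unfamiliar with the exit code above: by the usual shell and Docker convention, a process killed by signal N is reported as exit code 128 + N, so 132 most likely means SIGILL (illegal instruction), which is consistent with a missing AVX implementation. A minimal sketch of that arithmetic follows; the function name `exit_signal` is made up for illustration.

```shell
# Decode an exit code into the signal number that ended the process.
# Convention: "killed by signal N" is reported as exit code 128 + N.
# exit_signal is a hypothetical helper name, not part of Docker or QEMU.
exit_signal() {
  if [ "$1" -gt 128 ] && [ "$1" -lt 256 ]; then
    echo $(( $1 - 128 ))
  else
    echo "none"
  fi
}

exit_signal 132   # 132 - 128 = 4, SIGILL on Linux
exit_signal 134   # 134 - 128 = 6, SIGABRT on Linux
exit_signal 1     # an ordinary failure, not a signal
```

The number can be turned into a name with `kill -l 4`, which prints `ILL` in common shells.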
Until this is fixed and Docker is upgraded to use QEMU 7.2+ (latest is 8.0.0), one could try running QEMU/Colima directly. You can still build and run as usual after stopping Docker and starting Colima.
For other CPU models, see https://qemu.readthedocs.io/en/latest/system/qemu-cpu-models.html I would not hold my breath for Rosetta support; it's not going to happen: https://developer.apple.com/documentation/apple-silicon/about-the-rosetta-translation-environment |
Any update on this issue? With Apple Silicon now being a staple in a lot of engineering departments, we're facing the same issues here. |
Any update here? Pain ongoing... |
Apple Silicon combined with QEMU 7.0 and above causes a regression where the syscall Specifically, whereas QEMU 6.2 and under passed the syscall through without modification, QEMU 7.0 and above disables it, with a comment saying "TODO to implement a safe pass-through for it": https://gitlab.com/qemu-project/qemu/-/commit/220717a6f46a99031a5b1af964bbf4dec1310440 It's still not implemented to this day, which means nothing above QEMU 6.2 will work for those applications. Until that is fixed, I think a QEMU update will cause unexpected regressions for Docker users. |
Great find, @yutotakano!
Dumb question: Is this issue specific to Apple Silicon? At first glance, the commit you linked doesn’t seem to depend on architecture?
Do you know if there is a QEMU ticket tracking this issue? If not, I think we should create one so that the QEMU developers don’t forget about it! |
Hmm. I'm certainly on Apple Silicon, so I decided to keep my assumptions small. Perhaps it affects all devices as long as you use QEMU to emulate Linux. But would Docker use QEMU if it's running an x86 container on Intel x86? |
Related but somewhat off-topic, because MongoDB 5.0 and later relies on the AVX instruction set, among other tools. Side note: https://www.mongodb.com/docs/v7.0/administration/production-notes/#x86_64
|
I took the liberty of creating an issue in the QEMU tracker since I did not find an existing one: https://gitlab.com/qemu-project/qemu/-/issues/1929 |
Hello everyone, we're updating QEMU in the upcoming version of Docker Desktop. |
@dgageot Good news! I have not tested but in theory, QEMU 7.2 or above should resolve this issue. |
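To make the "7.2 or above" threshold concrete, here is a small sketch that compares version strings. It assumes a `sort` that supports `-V` (version sort, as in GNU coreutils), and the helper name `version_ge` is hypothetical. It does not discover which QEMU Docker Desktop bundles; the versions in the loop are just the ones mentioned in this thread.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, e.g. GNU coreutils.
# version_ge is a hypothetical helper name used for illustration.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

MIN_QEMU="7.2"   # threshold suggested in this thread, not an official figure
for v in 6.2.0 7.0.0 7.2.0 8.0.4; do
  if version_ge "$v" "$MIN_QEMU"; then
    echo "$v: meets the $MIN_QEMU threshold"
  else
    echo "$v: older than $MIN_QEMU"
  fi
done
```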
Thank you @fumoboy007. |
@fumoboy007 Sorry, that'll have to wait for 4.27.0. |
@fumoboy007 do you have an example of a docker command that fails with the latest version of Docker Desktop? |
@dgageot One Docker image that is affected by this issue is |
@dgageot It would be awesome to see it land in 4.27.0 or early 2024 :-) I'm also blocked by tensorflow/serving#1948. Do you have any updates? |
We currently have a QEMU Also, this morning, I've started testing I mainly focused on using the most recent versions of qemu. I didn't test the support for AVX, yet. Do you have a simple |
Here are the commands that fail with Docker Desktop 4.26.1 but succeed on our main branch, with QEMU 8.0.4:
cd /tmp
git clone https://github.com/tensorflow/serving
docker run -t --rm -it --init -p 8501:8501 --platform linux/amd64 -e EXPERIMENTAL_DOCKER_DESKTOP_FORCE_QEMU=1 -v "./serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu:/models/half_plus_two" -e MODEL_NAME=half_plus_two tensorflow/serving:2.14.1
Although, from the output of the program, I'm not sure it does actually use AVX instructions:
Tip:
|
@dgageot that's so great to hear! Exactly, running your command on Docker Desktop v4.26.1 for Mac, on Apple Silicon, without
/usr/bin/tf_serving_entrypoint.sh: line 3: 12 Illegal instruction tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"
Similarly, running the command with
[libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/descriptor_database.cc:560] Invalid file descriptor data passed to EncodedDescriptorDatabase::Add().
[libprotobuf FATAL external/com_google_protobuf/src/google/protobuf/descriptor.cc:1986] CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
qemu: uncaught target signal 6 (Aborted) - core dumped
/usr/bin/tf_serving_entrypoint.sh: line 3: 12 Aborted tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@" |
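The two failure modes in the logs above ("Illegal instruction" and "Aborted") can be told apart from the exit status alone, which may help when smoke-testing an image across Docker Desktop versions. The wrapper below is only a sketch: `classify_run` is a made-up name, and the `sh -c` commands are stand-ins so the example runs without Docker.

```shell
# Run a command and report whether it ended with SIGILL or SIGABRT.
# classify_run is a hypothetical helper; it only inspects the exit status
# (128 + signal number), it does not inspect QEMU or the CPU in any way.
classify_run() {
  "$@"
  status=$?
  case "$status" in
    0)   echo "ok" ;;
    132) echo "illegal instruction (SIGILL)" ;;
    134) echo "aborted (SIGABRT)" ;;
    *)   echo "exited with status $status" ;;
  esac
  return "$status"
}

# Stand-in commands that simulate the two failures seen in this thread:
classify_run sh -c 'exit 132' || true
classify_run sh -c 'exit 134' || true
classify_run true
```

In practice one would pass the real command instead of the stand-ins, for example the `docker run ... --platform linux/amd64 ...` invocation quoted earlier in the thread.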
Hello, I've tried the recently released Docker for Mac 4.27.0 and AVX seems to work now on ARM64 🎊 |
Woot! Thanks @matemijolovic, that's really good news! Is it fully working? How's the perf? |
Didn't have time to benchmark the performance, but it seems okay at first glance (I'd say roughly 2x slower than running on a comparable Linux x64 machine). For our use case this is perfectly acceptable, as we don't run any production inference on ARM. [EDIT: to clarify, regarding performance, I'm not sure that AVX is actually being used to its full potential, but for us the important thing is that the containers don't crash] The only issue I observed is that |
Good to hear!
Have you tried running the container with |
(I'm closing this issue. Feel free to ping me if you think it needs to be re-opened) |
Can confirm this helps, thank you! |
Hi everyone! There's a good chance that we will roll back the QEMU upgrade in Docker Desktop 4.28.0. It causes too many regressions for the majority of users. A temporary solution is to stick with 4.27.X. |
That's unfortunate but thank you so much @dgageot for the heads up! |
Were the regressions reported on GitLab? Also, what patch release are you on? |
Why wasn't this issue re-opened if the change was undone? |
Looks like with v4.28.0 they actually upgraded to QEMU 8.1.5 (https://docs.docker.com/desktop/release-notes/#4280), unlike what @dgageot said about the rollback. cc @xanather: I tested with 4.28.0, and systems that rely on AVX instructions are working (my use case is Microsoft's Kusto emulator).
Expected behavior
AMD64 images that use AVX instructions are able to run on ARM64 hosts.
Actual behavior
#5148
Information
AVX support was recently added to QEMU. I believe Docker needs to update its QEMU version to pull in this functionality?