"Use Rosetta": Certain node.js/amd64 workloads cause container to become unresponsive/100% CPU #6998

Closed
jbinto opened this issue Sep 26, 2023 · 33 comments

Comments

@jbinto

jbinto commented Sep 26, 2023

Description

A few folks on our team have been using Use Rosetta for x86/amd64 emulation on Apple Silicon with our moderately sized docker-compose stack, which is primarily amd64 images of Node.js apps. For the most part, Rosetta is speedier and a net gain; however, we've been noticing that some of our containers will hit 100% CPU and become entirely unresponsive.

I tried many things to debug the 100% CPU, but between the virtualization and the containerization I wasn't able to get perf or any Linux debuggers working. I was able to run strace, but no syscalls were shown, which suggests the app was caught in a tight CPU-bound user loop and not, e.g., doing IO or network calls. I was able to use node --inspect with port forwarding, but the debugger stopped responding once the process got into a 100% CPU state.

I was able to get an isolated reproduction case however, using an old npm library that performs crypto operations in pure Node. See below.

Reproduce

At a high level, to reproduce, create an amd64 Docker image which uses the npm package keypair, and attempt to create a 2048 bit key.

With Rosetta off, this succeeds after 5-20s.

With Rosetta on, the container immediately hits 100% CPU and "never" returns (I gave up after 14 hours).

Reproduction repo with instructions: https://github.com/jbinto/rosetta-what
Dockerhub: https://hub.docker.com/repository/docker/jbinto/rosetta-what/general
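
Since the hang never resolves on its own, it can help to run the reproduction under a deadline. The following is a minimal sketch, not part of the original report: `runWithTimeout` is a hypothetical helper name, the 120-second limit is an arbitrary assumption (well above the 5-20s seen with Rosetta off), and it assumes `docker` is on the PATH.

```javascript
// Sketch: run a command with a deadline so a hang is reported instead of waited out.
const { spawnSync } = require('child_process');

// runWithTimeout is a hypothetical helper name, not part of Docker or Node.
function runWithTimeout(command, args, timeoutMs) {
  const started = Date.now();
  const result = spawnSync(command, args, {
    timeout: timeoutMs,     // spawnSync kills the child once this many ms elapse
    killSignal: 'SIGKILL',
    stdio: 'ignore',
  });
  // On timeout, spawnSync reports an error whose code is 'ETIMEDOUT'.
  const timedOut = Boolean(result.error) && result.error.code === 'ETIMEDOUT';
  return { timedOut, status: result.status, elapsedMs: Date.now() - started };
}

// Example (left commented out so nothing runs on load). The image name is the
// one from this report. Killing the docker CLI may leave the container itself
// running, so it may need to be stopped separately.
// const r = runWithTimeout('docker',
//   ['run', '--rm', '--platform', 'linux/amd64', 'jbinto/rosetta-what'], 120000);
// console.log(r.timedOut ? 'timed out (possible hang)' : `finished in ${r.elapsedMs} ms`);
```

A timeout only shows that the command did not finish in time; it does not distinguish "very slow" from "stuck in a loop".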

Expected behavior

These types of Node.js amd64 workloads should succeed in Rosetta mode, and not become unresponsive/100% CPU.

docker version

Client:
 Cloud integration: v1.0.35+desktop.4
 Version:           24.0.6
 API version:       1.43
 Go version:        go1.20.7
 Git commit:        ed223bc
 Built:             Mon Sep  4 12:28:49 2023
 OS/Arch:           darwin/arm64
 Context:           desktop-linux

Server: Docker Desktop 4.23.0 (120376)
 Engine:
  Version:          24.0.6
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.7
  Git commit:       1a79695
  Built:            Mon Sep  4 12:31:36 2023
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.22
  GitCommit:        8165feabfdfe38c65b599c4993d227328c231fca
 runc:
  Version:          1.1.8
  GitCommit:        v1.1.8-0-g82f18fe
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client:
 Version:    24.0.6
 Context:    desktop-linux
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2-desktop.4
    Path:     /Users/jbuchanan/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.21.0-desktop.1
    Path:     /Users/jbuchanan/.docker/cli-plugins/docker-compose
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.0
    Path:     /Users/jbuchanan/.docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.20
    Path:     /Users/jbuchanan/.docker/cli-plugins/docker-extension
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v0.1.0-beta.7
    Path:     /Users/jbuchanan/.docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /Users/jbuchanan/.docker/cli-plugins/docker-sbom
  scan: Docker Scan (Docker Inc.)
    Version:  v0.26.0
    Path:     /Users/jbuchanan/.docker/cli-plugins/docker-scan
  scout: Command line tool for Docker Scout (Docker Inc.)
    Version:  0.24.1
    Path:     /Users/jbuchanan/.docker/cli-plugins/docker-scout

Server:
 Containers: 15
  Running: 13
  Paused: 0
  Stopped: 2
 Images: 21
 Server Version: 24.0.6
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 8165feabfdfe38c65b599c4993d227328c231fca
 runc version: v1.1.8-0-g82f18fe
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
  cgroupns
 Kernel Version: 6.3.13-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 8
 Total Memory: 7.666GiB
 Name: docker-desktop
 ID: 84081c6c-7297-46a4-82bb-1039ade535ad
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: daemon is not using the default seccomp profile

Diagnostics ID

3F4DBEC3-6E23-4A45-9421-B2A0CCBF2058/20230926181652

Additional Info

It's unclear whether or not this code is just very, very slow, or if there's some issue in translating the code that causes an incorrect infinite loop. I don't have the skills or tools to drop any lower to see what's actually happening.

But in any case, the outcome is the same: I have to tell my team to turn off Use Rosetta for x86/amd64 emulation on Apple Silicon because our containers will eventually hit 100% CPU and die. And this is not ideal because Rosetta otherwise represents something like a 4-5x speedup over qemu, which we really appreciate.

@fnaoto

fnaoto commented Sep 28, 2023

In my case, disabling experimental features worked.
This might help in your situation. :)

@douglascayers

douglascayers commented Oct 5, 2023

Hi, I'm noticing the same hanging effect with any npm command when Use Rosetta for x86/amd64 emulation on Apple Silicon is enabled.

I'm using an Apple M2 Pro (Apple Silicon) running Docker Desktop for Mac v4.23.0 (120376).

This simple command demonstrates the issue for me:

docker run --rm -it --platform=linux/amd64 --entrypoint /bin/sh node:18-alpine
$> node --version
v18.18.0
$> npm --version
9.8.1
# then it hangs, command never ends

Disabling the Rosetta setting in Docker Desktop as suggested by @fnaoto fixes the issue for me.

Thanks

These other posts on Reddit may be people experiencing the same issue:

@jbinto
Author

jbinto commented Oct 5, 2023

Disabling experimental features did not work for us (it was already disabled when I submitted this issue).

Our workaround has been to bite the bullet and build arm64 images for everything.

@sbleon

sbleon commented Oct 5, 2023

yarn install also fails if Rosetta is enabled. On an M1 Mac with Rosetta:

docker run -it --platform linux/amd64 node:18-alpine yarn install
yarn install v1.22.19

and then there's no further output. I'm using Docker 4.23.0. If I disable Rosetta, the command runs correctly.

@corneliusroemer

Same issue here, in my case on an M1 with macOS 14.0 and Docker Desktop v4.24.0 (122432).

Workaround: switch off Rosetta, as shown in the screenshot.

[screenshot]

@sbleon

sbleon commented Nov 3, 2023

I updated to Docker 4.25.0 and macOS from 13.3.1 to 13.6.1 and my build issues are fixed, even with Rosetta enabled!

@Krienas

Krienas commented Nov 9, 2023

@jbinto

Disabling experimental features did not work for us (it was already disabled when I submitted this issue).

That could be because Rosetta had already left experimental status by then (so the way to switch it off had changed).

@enzofrnt

Hi, any news on this?

@jgillard

jgillard commented Nov 22, 2023

I've also just bumped into this issue. M1 MBP, macOS 13.6.2 (latest Ventura), Docker Desktop 4.25.2 (latest) with Rosetta enabled, running nuxt build via npm run build inside a --platform=linux/amd64 node:16-alpine image. The build process just hangs while compiling the server and never completes. Disabling Rosetta fixes the issue for me. A colleague on an M2 MBP on 14.1.1 (latest Sonoma) with the same Docker Desktop version has the same issue.

@djcristi

I have the same issue on an M1 with macOS 13.6.1 and the last three Docker for Mac versions (I have not tested older ones).
Without Rosetta it's slow, and if you don't allow more memory (3.8 -> 5.5 GB) it fails with SIGKILL on 'npm build'.

@dgageot
Member

dgageot commented Nov 24, 2023

Hi everyone, I'm trying to reproduce the issue and so far, I failed to do so.

I've tried:

$ docker run --rm -it --platform=linux/amd64 --entrypoint /bin/sh node:18-alpine
$> node --version
v18.18.0
$> npm --version
9.8.1
# then it hangs, command never ends <-- Not for me

I've tried:

$ docker run -it --platform linux/amd64 node:18-alpine yarn install
yarn install v1.22.19
info No lockfile found.
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...

success Saved lockfile.
Done in 0.15s.

(Docker Desktop 4.25.2, Sonoma 14.1.1, Mac M1 Pro and M2 Pro)

@djcristi

djcristi commented Nov 24, 2023

That one is already built. It hangs when you have 'npm run build' in your Dockerfile, for example. Or you can try the rosetta-what repo linked in the first post (docker run --platform linux/amd64 jbinto/rosetta-what).

@jaredjj3

FWIW, I've made another example repository where workloads seem to hang indefinitely in https://github.com/jaredjj3/vexflow-key-properties-issue. I thought it was library-specific, but I submitted a PR in 0xfe/vexflow#1596 and I'm still having the issue unfortunately.

@dgageot
Member

dgageot commented Nov 24, 2023

Thanks a lot @jaredjj3 for the project. I was able to repro on Docker Desktop 4.25.2, and I can confirm it's fixed now and should ship in 4.26.

@jbinto
Author

jbinto commented Nov 29, 2023

@dgageot Nice! Can you help us understand what caused this issue? I'm very curious what low-level craziness led to e.g. @jaredjj3's wild finding (see below)

I've observed that making the following change causes the tests in all environments to run successfully:

before

octave += -1 * options.octave_shift;
after

octave -= options.octave_shift;
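
For context on why this finding is surprising: for JavaScript numbers, the two forms in the diff should produce the same value on a correct runtime. Below is a minimal sketch that checks this over a few arbitrarily chosen sample values. The function names are hypothetical, and it assumes both operands are numbers (with a string `octave`, `+=` would concatenate instead).

```javascript
// Both update forms from the quoted diff, wrapped in hypothetical helper names.
function shiftWithAddNegated(octave, octaveShift) {
  octave += -1 * octaveShift; // the "before" form
  return octave;
}

function shiftWithSubtract(octave, octaveShift) {
  octave -= octaveShift; // the "after" form
  return octave;
}

// Compare the two forms over a small grid of sample values.
function formsAgree() {
  for (let octave = -3; octave <= 9; octave++) {
    for (const shift of [-2, -1, 0, 0.5, 1, 2]) {
      const a = shiftWithAddNegated(octave, shift);
      const b = shiftWithSubtract(octave, shift);
      if (!Object.is(a, b)) {
        return false;
      }
    }
  }
  return true;
}

console.log(formsAgree());
```

If the two forms ever disagreed, or one of them hung, inside an emulated container, that would point at the runtime or translation layer rather than at the library code.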

@dmytr0x

dmytr0x commented Dec 8, 2023

Release 4.26 has fixed the issue. I cannot reproduce it anymore. Thank you!

@jaredjj3

jaredjj3 commented Dec 10, 2023

Thanks for the help so far. I can confirm that 4.26 fixes the issue in https://github.com/jaredjj3/vexflow-key-properties-issue, but I'm still having the issue in another repository. I haven't had time to isolate the problem yet.

In https://github.com/stringsync/vexml, I have integration tests that sometimes hang in Docker, but not on macOS.

Environment

  • MacBook Pro
  • Chip: Apple M1 MAX
  • OS: macOS Sonoma 14.1.2
  • Docker Desktop 4.26.0 (130397)
  • Docker Engine: 24.0.7

How to run

Prerequisites

  1. Install yarn.
  2. Install docker.
  3. Run yarn in the root directory.

macOS

yarn jest integration --runInBand

The integration tests should run successfully in ~15s.

Docker

yarn test integration

(you don't need to specify --runInBand)

The integration tests should hang indefinitely at some point. You might see CPU pinned at ~100% when running docker stats. I also sometimes see the following message:

assertion failed [find_leftmost_allocation(allocation_info.vm_interval) == nullptr]: interval being added overlaps existing allocation


Would someone double check if it's still broken and that it's the same issue?

@dgageot
Member

dgageot commented Dec 11, 2023

Hi @jaredjj3 thanks for the report! I'm happy to try it out. What would be super useful is a Dockerfile that reproduces the issue, though. I'll try to follow the steps you gave but a Dockerfile is always the best way to help us solve those issues.

@dgageot
Member

dgageot commented Dec 11, 2023

@jaredjj3 I forgot to ask: Can you confirm that you build and run an amd64 image?

@jaredjj3

jaredjj3 commented Dec 11, 2023

Hi @jaredjj3 thanks for the report! I'm happy to try it out. What would be super useful is a Dockerfile that reproduces the issue, though. I'll try to follow the steps you gave but a Dockerfile is always the best way to help us solve those issues.

Here's the Dockerfile.

@jaredjj3 I forgot to ask: Can you confirm that you build and run an amd64 image?

Thanks for double checking. It turns out that while I was troubleshooting, I commented out export DOCKER_DEFAULT_PLATFORM=linux/amd64 in my shell profile. I'm unable to reproduce the issue now that I'm definitely running an amd64 image.
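
A quick way to confirm which architecture a container is actually running is to ask Node itself. This is a minimal sketch, assuming Node is available in the image; `dockerPlatformFor` is a hypothetical helper, and only the two architectures discussed in this thread are mapped.

```javascript
// Map Node's process.arch value to the corresponding Docker platform string.
// Only the two architectures discussed in this thread are covered.
function dockerPlatformFor(arch) {
  const known = new Map([
    ['x64', 'linux/amd64'],
    ['arm64', 'linux/arm64'],
  ]);
  return known.has(arch) ? known.get(arch) : `unknown (${arch})`;
}

// Run inside the container: 'linux/amd64' on Apple Silicon means the process
// is an x86-64 binary running under emulation rather than a native arm64 one.
console.log(`process.arch=${process.arch} -> ${dockerPlatformFor(process.arch)}`);
```

If this prints linux/arm64 when an amd64 image was intended, the platform setting (for example DOCKER_DEFAULT_PLATFORM or --platform) was probably not applied.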

@dgageot
Member

dgageot commented Dec 11, 2023

I'm going to close this issue. Feel free to reopen or open a new issue if you think something is not fixed yet.

dgageot closed this as completed Dec 11, 2023

@jjang16

jjang16 commented Dec 16, 2023

On an M2 chip, docker buildx build --platform linux/amd64 was causing the following error on yarn installs within the docker build (yarn or yarn workspaces focus --production).
The problem persisted regardless of the node/yarn version.

YN0001: │ Error [ERR_WORKER_INVALID_EXEC_ARGV]: <dependency>: Initiated Worker with invalid execArgv flags: --no-opt

Yarn emits this error message for every dependency it tries to install.

Turning off Rosetta in the Docker Desktop options solved it.
The Docker Desktop version was v4.26.1.
I'm leaving a comment because the symptom is different, but I suspect it shares the same underlying cause.

dgageot self-assigned this Jan 2, 2024
dgageot reopened this Jan 2, 2024

@dgageot
Member

dgageot commented Jan 2, 2024

Thank you @jjang16, your issue is indeed related because it's caused by the fix to the original issue. I'll take a look at it.

@dgageot
Member

dgageot commented Jan 2, 2024

@jjang16 Would you have a simple set of commands that I can use to reproduce your issue?

@dgageot
Member

dgageot commented Jan 2, 2024

@jjang16 This will be fixed in Docker Desktop 4.27

@MattyBalaam

@jjang16 This will be fixed in Docker Desktop 4.27

Is there an ETA for this release?

@JavierAmaya

Macbook M3 Pro
Node 18.10.0
Docker Desktop 4.26.1

I was getting the same error: YN0001: │ Error [ERR_WORKER_INVALID_EXEC_ARGV]: <dependency>: Initiated Worker with invalid execArgv flags: --no-opt

It happened when trying to run a Dockerfile containing the command yarn @workspace run build. Disabling Rosetta appears to have solved it. Thanks, @jjang16.

@dgageot
Member

dgageot commented Jan 22, 2024

@JavierAmaya @MattyBalaam 4.27 should be out this week.

@dgageot
Member

dgageot commented Jan 26, 2024

@JavierAmaya @MattyBalaam Could you test Docker Desktop 4.27 and close this issue if it's fixed?

@MattyBalaam

@dgageot it works for my case 🥳

dgageot closed this as completed Jan 27, 2024
dgageot removed their assignment Jan 27, 2024

@vincent-herlemont

I am encountering exactly the same problem when running Docker (25.0.2) on a Mac M3 (14.3) in an Ubuntu 22 Virtual Machine supporting compatibility with Rosetta (Parallels). This could be due to Docker on Ubuntu or the emulation of the Ubuntu VM (Parallels)?

@sbleon

sbleon commented Feb 6, 2024

I am encountering exactly the same problem when running Docker (25.0.2) on a Mac M3 (14.3) in an Ubuntu 22 Virtual Machine supporting compatibility with Rosetta (Parallels). This could be due to Docker on Ubuntu or the emulation of the Ubuntu VM (Parallels)?

@vincent-herlemont Are you running Docker for Linux? If so, I think you should create a separate issue in the Docker for Linux repo. You can reference this issue there, but this issue is about Docker for Mac.

@vincent-herlemont

@sbleon Yes, I am using Docker for Linux with Ubuntu, and you are right, I have opened an issue here: docker/for-linux#1483.
Thank you for pointing it out.
