fatal Process exited with status 1 #1333

rugginic · 2024-10-24T12:35:17Z

What happened?
Creating an AWS workspace fails very often (roughly 50% of the time) with the following fatal error:
[13:26:36] fatal Process exited with status 1
run agent command
github.com/loft-sh/devpod/pkg/devcontainer/sshtunnel.ExecuteCommand.func2
    D:/a/devpod/devpod/pkg/devcontainer/sshtunnel/sshtunnel.go:129
runtime.goexit
    C:/Users/runneradmin/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.5.windows-amd64/src/runtime/asm_amd64.s:1695

What did you expect to happen instead?
To be able to run the same workspace successfully every time.

How can we reproduce the bug? (as minimally and precisely as possible)
I have no idea.

My devcontainer.json:

{
    "name": "AFT",
    // Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
    "image": "mcr.microsoft.com/devcontainers/python:1-3.11-bullseye",
    "features": {
        "https://github.qualcomm.com/Cloud/nscert-feature/releases/download/0.0.2/devcontainer-feature-nscerts.tgz": {},
        "ghcr.io/devcontainers-contrib/features/poetry:2": {}
    },
    "overrideFeatureInstallOrder": [
        "https://github.qualcomm.com/Cloud/nscert-feature/releases/download/0.0.2/devcontainer-feature-nscerts.tgz"
    ],

    // Use 'postCreateCommand' to run commands after the container is created.
    "postCreateCommand": "./.devcontainer/postCreateCommand.sh",

    // Configure tool-specific properties.
    "customizations": {
        "vscode": {
            "extensions": [
                "ms-python.python",
                "editorconfig.editorconfig",
                "streetsidesoftware.code-spell-checker",
                "ms-python.vscode-pylance",
                "ms-python.black-formatter",
                "mutantdino.resourcemonitor"
            ],
            "settings": {
                "remote.SSH.connectTimeout": 3600,
                "python.testing.pytestArgs": ["tests"],
                "python.testing.unittestEnabled": false,
                "python.testing.pytestEnabled": true,
                "python.defaultInterpreterPath": "/workspaces/AFT/.venv/bin/python",
                "python.testing.pytestPath": "/workspaces/AFT/.venv/bin/pytest",
                "editor.defaultFormatter": "ms-python.black-formatter",
                "python.formatting.provider": "black",
                "editor.formatOnType": true,
                "editor.formatOnSave": true
            }
        }
    },
    "remoteUser": "vscode"
}

Local Environment:

DevPod Version: v0.5.21
Operating System: windows
ARCH of the OS: AMD64

DevPod Provider:

Cloud Provider: aws

Anything else we need to know?
This happens randomly, but almost 50% of the time.


bkneis · 2024-10-25T07:58:34Z

@rugginic looking at the logs, I can see

{"type":"data","data":{"time":"2024-10-24T13:26:36.7809381+01:00","message":"#2 ERROR: failed to do request: Head \"https://registry-1.docker.io/v2/docker/dockerfile/manifests/1.4\": http: server gave HTTP response to HTTPS client","level":"info"}}

It seems that your environment may be experiencing network issues connecting to the docker registry. Can you SSH into the VM? Can you pull the image using docker pull mcr.microsoft.com/devcontainers/python:1-3.11-bullseye on the VM?
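For anyone who wants to check this from the VM without Docker, here is a minimal sketch of a TLS probe. It rests on one assumption: Go's "server gave HTTP response to HTTPS client" error means the peer answered in plain HTTP where TLS was expected, and the closest Python symptom is an ssl.SSLError during the handshake. The function name probe_tls and the result labels are hypothetical and not part of DevPod or Docker.

```python
import socket
import ssl


def probe_tls(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Classify how host:port responds to a TLS handshake.

    Returns one of these (hypothetical) labels:
      "tls-ok"             handshake and certificate verification succeeded
      "tls-untrusted-cert" TLS is spoken, but the certificate is not trusted
      "not-tls"            the peer did not answer with valid TLS
      "unreachable"        DNS failure, refused connection, or timeout
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return "tls-ok"
    except ssl.SSLCertVerificationError:
        # Must be caught before ssl.SSLError, which is its parent class.
        return "tls-untrusted-cert"
    except ssl.SSLError:
        return "not-tls"
    except OSError:
        # Covers socket.gaierror, ConnectionRefusedError, and timeouts.
        return "unreachable"


# Example, to be run on the VM itself:
#   print(probe_tls("registry-1.docker.io"))
```

A "not-tls" or "tls-untrusted-cert" result would suggest that something between the VM and the registry (possibly a proxy) is answering instead of the registry. That is only a hint, not a diagnosis.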

rugginic · 2024-10-29T15:43:50Z

I'm able to pull the image from the EC2 instance. This problem happens randomly. Are you implying the FATAL error is related to the pull error? This correlation is not obvious to me. Can you elaborate?

bkneis · 2024-10-30T08:35:49Z

@rugginic the error I posted was a build error. It was at the top of the stacktrace in your logs, where the issue seems to start. I cannot resolve github.qualcomm.com via DNS; how is this resolved for you? How long does the build process take? I wonder if there is a timeout occurring. Does the pipeline fail if you remove the https://github.qualcomm.com/Cloud/nscert-feature/releases/download/0.0.2/devcontainer-feature-nscerts.tgz feature?

rugginic · 2024-10-31T08:52:14Z

Hi, just to add more context to this, I don't get any error when I define the workspace. It deploys successfully.
The issue appears when I stop and start the workspace, or if I try to rebuild or reset it.
The only way to recover is to delete and re-create the provider.

rugginic · 2024-10-31T08:56:37Z

[08:53:45] info #2 resolve image config for docker-image://docker.io/docker/dockerfile:1.4
[08:53:45] info #2 ERROR: failed to do request: Head "https://registry-1.docker.io/v2/docker/dockerfile/manifests/1.4": http: server gave HTTP response to HTTPS client
[08:53:46] info devcontainer up: build image: buildx build: build image: exit status 1
[08:53:46] info ------
[08:53:46] info error parsing workspace info: rerun as root: exit status 1
[08:53:46] error Try enabling Debug mode under Settings to see a more verbose output
[08:53:46] fatal run agent command: Process exited with status 1
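The fatal line at the end only reports that the agent command exited; the earliest error in the log is usually the more useful one. As a minimal sketch, assuming the "[HH:MM:SS] level message" layout seen in the lines above, the following helper returns the first line that either carries the error or fatal level or contains the word ERROR in its message. The name first_error is hypothetical and not part of DevPod.

```python
import re

# Matches lines such as "[08:53:45] info #2 ERROR: failed to do request: ..."
LOG_LINE = re.compile(
    r"^\[(?P<time>\d{2}:\d{2}:\d{2})\]\s+(?P<level>\w+)\s+(?P<message>.*)$"
)

SAMPLE = """
[08:53:45] info #2 resolve image config for docker-image://docker.io/docker/dockerfile:1.4
[08:53:45] info #2 ERROR: failed to do request: Head "https://registry-1.docker.io/v2/docker/dockerfile/manifests/1.4": http: server gave HTTP response to HTTPS client
[08:53:46] info devcontainer up: build image: buildx build: build image: exit status 1
[08:53:46] fatal run agent command: Process exited with status 1
"""


def first_error(lines):
    """Return (time, message) for the earliest error-like line, or None."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip blank or differently formatted lines
        level = match.group("level")
        message = match.group("message")
        # Build errors are logged at "info" level here, so check the text too.
        if level in {"error", "fatal"} or "ERROR" in message:
            return match.group("time"), message
    return None


if __name__ == "__main__":
    print(first_error(SAMPLE.splitlines()))
```

On the sample above this picks the registry request failure at 08:53:45 rather than the fatal line at 08:53:46.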

We don't have access to registry-1.docker.io. How can I configure a different registry so that I don't hit this error?

bkneis · 2024-10-31T09:38:44Z

@rugginic registry-1 is the public Docker registry and is used by most images. Here it is fetching the Dockerfile spec for version 1.4, so you don't need a mirror for this. Did you try without the feature, or check how long it takes to download? Hmm, it is odd that the issue only occurs during rebuilds. So when you run devpod up it always succeeds, and when you run devpod up --reset it fails 50% of the time?

rugginic · 2024-10-31T14:40:09Z

@bkneis I can't remove the feature, otherwise SSL communication will fail. And, as I said before, we don't have access to the public Docker registry because of company policies.

Is there a way to configure where the failing image is pulled from, so that I can specify an internal (non-public) registry?
Why is it pulled only on rebuild/restart of the same workspace?

I narrowed down the issue. When I create the provider it works 100% of the time. If I stop, reset, or rebuild, it fails 100% of the time with the same error.
So basically my workspaces only work once. Any time I reconnect to the same repo I have to delete and re-create the workspace. And just to give more context, this happens to everyone on my team.

bkneis · 2024-11-01T16:26:21Z

I believe this is the Docker image you are looking for: https://hub.docker.com/r/docker/dockerfile/tags/

rugginic added the kind/bug label Oct 24, 2024

bkneis added the wait for user label Oct 25, 2024
