"RuntimeError: HIP error: invalid device function" when running "mnist" on 7900XTX #1313

Open
SuGotLand opened this issue Feb 18, 2025 · 0 comments

Comments

@SuGotLand

Context

  • PyTorch version: 2.6.0+rocm6.2.4
  • Operating System and version: Ubuntu 24.04.2 LTS x86_64

Your Environment

  • Installed using source? [yes/no]: no
  • Are you planning to deploy it using docker container? [yes/no]: no
  • Is it a CPU or GPU environment?: GPU
  • Which example are you using: mnist
  • Link to code or data to repro [if any]: mnist

Expected Behavior

Train Epoch: 1 [0/60000 (0%)]	Loss: 2.326473
Train Epoch: 1 [640/60000 (1%)]	Loss: 1.377825
Train Epoch: 1 [1280/60000 (2%)]	Loss: 0.828890
Train Epoch: 1 [1920/60000 (3%)]	Loss: 0.623807
Train Epoch: 1 [2560/60000 (4%)]	Loss: 0.447925
Train Epoch: 1 [3200/60000 (5%)]	Loss: 0.293224
Train Epoch: 1 [3840/60000 (6%)]	Loss: 0.163648
Train Epoch: 1 [4480/60000 (7%)]	Loss: 0.633399
Train Epoch: 1 [5120/60000 (9%)]	Loss: 0.226126
Train Epoch: 1 [5760/60000 (10%)]	Loss: 0.226796
...

Current Behavior

Traceback (most recent call last):
  File "/home/USER/Desktop/PYTHON Document/examples/mnist/main.py", line 147, in <module>
    main()
  File "/home/USER/Desktop/PYTHON Document/examples/mnist/main.py", line 138, in main
    train(args, model, device, train_loader, optimizer, epoch)
  File "/home/USER/Desktop/PYTHON Document/examples/mnist/main.py", line 45, in train
    output = model(data)
             ^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/PhyRevE/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/PhyRevE/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/examples/mnist/main.py", line 25, in forward
    x = self.conv1(x)
        ^^^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/PhyRevE/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/PhyRevE/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/PhyRevE/.venv/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 554, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/PhyRevE/.venv/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 549, in _conv_forward
    return F.conv2d(
           ^^^^^^^^^
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
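
To separate the failure from the rest of the mnist script, a single convolution can be run on its own. The sketch below is only an illustration: the function name `conv2d_smoke_test` is hypothetical, and the tensor shapes assume the first layer of the example is a 3x3 convolution from 1 to 32 channels on 28x28 input. It sets `AMD_SERIALIZE_KERNEL=3`, as the error message suggests, before torch is imported so that the HIP runtime should see it at startup.

```python
import os

# Suggested by the error message; set before torch is imported so the
# HIP runtime can pick it up when it initializes.
os.environ.setdefault("AMD_SERIALIZE_KERNEL", "3")


def conv2d_smoke_test() -> str:
    """Run one conv2d on the GPU and return a short status string."""
    try:
        import torch
        import torch.nn.functional as F
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "no GPU visible to PyTorch"
    try:
        # Shapes are an assumption about the mnist example's first layer.
        x = torch.randn(1, 1, 28, 28, device="cuda")
        w = torch.randn(32, 1, 3, 3, device="cuda")
        y = F.conv2d(x, w)
        # Force any asynchronously reported kernel error to surface here.
        torch.cuda.synchronize()
    except RuntimeError as exc:
        return f"failed: {exc}"
    return f"ok: output shape {tuple(y.shape)}"


if __name__ == "__main__":
    print(conv2d_smoke_test())
```

If this also reports "invalid device function", the problem is probably in the installed build or the selected device rather than in the mnist example itself.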

Possible Solution

export HIP_VISIBLE_DEVICES=1
export HSA_OVERRIDE_GFX_VERSION=11.0.0
export PYTORCH_ROCM_ARCH="gfx1100"

I tried setting these environment variables, but the error persists.
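
One way to narrow this down is to compare the architecture of each visible device with the architectures the installed wheel was compiled for. The sketch below is a rough diagnostic, not a fix. The helper names `normalize_arch` and `is_arch_supported` are hypothetical, and it assumes the ROCm build exposes `gcnArchName` on the device properties (it falls back to "unknown" otherwise). As far as I know, `PYTORCH_ROCM_ARCH` is read when PyTorch is built from source, so it likely has no effect on a pip-installed wheel. `HIP_VISIBLE_DEVICES=1` selects the device at index 1, so it is worth checking which index the 7900 XTX actually has.

```python
def normalize_arch(name: str) -> str:
    """Drop feature suffixes such as ':sramecc+:xnack-' from an arch name."""
    return name.split(":")[0].strip()


def is_arch_supported(device_arch: str, compiled_archs: list[str]) -> bool:
    """True when the device arch is among the archs the build targets."""
    targets = {normalize_arch(a) for a in compiled_archs}
    return normalize_arch(device_arch) in targets


def report() -> None:
    """Print each visible device and whether the build lists its arch."""
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    if not torch.cuda.is_available():
        print("no GPU visible to PyTorch")
        return
    compiled = torch.cuda.get_arch_list()
    print("build targets:", compiled)
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        # gcnArchName is assumed to exist on ROCm builds; fall back if not.
        arch = getattr(props, "gcnArchName", "unknown")
        verdict = "listed" if is_arch_supported(arch, compiled) else "NOT listed"
        print(f"device {i}: {props.name} ({arch}) -> {verdict}")


if __name__ == "__main__":
    report()
```

Running it once with no overrides and once with the exports above shows whether the selected device changes and whether its architecture is one the build targets.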

Steps to Reproduce

  1. Install the latest PyTorch: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2.4
  2. Clone the examples repository and cd into the directory.
  3. Run python3 mnist/main.py

Failure Logs [if any]

Output of AMD_LOG_LEVEL=3 python main.py
AMD_LOG.log
