build Apex latest version failed with pytorch 1.4.0 due to missing ATen/cuda/DeviceUtils.cuh #1200

Open
yumi-cn opened this issue Oct 23, 2021 · 9 comments


@yumi-cn

yumi-cn commented Oct 23, 2021

When I try to build apex for my new PyTorch environment, the build fails with this error:

csrc/layer_norm_cuda_kernel.cu:4:37: fatal error: ATen/cuda/DeviceUtils.cuh: No such file or directory

I could not find this file in the PyTorch include directory (PyTorch 1.4.0 simply does not ship it). I also found that the include of DeviceUtils.cuh was added only a month ago, in the commit "cleanup missing THCDeviceUtils.cuh header":

#include "ATen/ATen.h"
#include <THC/THCDeviceUtils.cuh> <----
#include "ATen/cuda/DeviceUtils.cuh" <----

#include <cuda.h>
#include <cuda_runtime.h>

So for now I can only build with an older copy of apex that I downloaded earlier (that copy builds without this error).

Is there something wrong with the recent commit?
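
To confirm whether a given PyTorch install ships the header that the build complains about, one option is to search the include directories PyTorch hands to extension builds. The following is a minimal sketch, not part of apex: `find_header` and `torch_include_dirs` are hypothetical helper names, and the only PyTorch call used is `torch.utils.cpp_extension.include_paths()`.

```python
import os
from typing import Iterable, List, Optional

# Relative path of the header named in the build error.
HEADER = os.path.join("ATen", "cuda", "DeviceUtils.cuh")


def find_header(include_dirs: Iterable[str], header: str = HEADER) -> Optional[str]:
    """Return the full path of `header` in the first include dir that has it, else None."""
    for root in include_dirs:
        candidate = os.path.join(root, header)
        if os.path.isfile(candidate):
            return candidate
    return None


def torch_include_dirs() -> List[str]:
    """Header directories PyTorch uses when building extensions (requires torch)."""
    # Imported lazily so find_header can be used without PyTorch installed.
    from torch.utils.cpp_extension import include_paths

    return include_paths()
```

With PyTorch installed, `find_header(torch_include_dirs())` returns the header's path if it exists and `None` otherwise. A `None` result on PyTorch 1.4.0 would match the error reported above.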

@crcrpar
Collaborator

crcrpar commented Oct 25, 2021

PyTorch recently removed THCDeviceUtils.cuh, so we needed the change you mentioned.

@DAVIDNEWGATE

> PyTorch recently removed THCDeviceUtils.cuh, so we needed the change you mentioned.

Could you please provide a way to install apex on earlier PyTorch versions?

@crcrpar
Collaborator

crcrpar commented Nov 3, 2021

Picking a commit from before #1191 may work: https://github.com/NVIDIA/apex/commits/master

@AndyYuan96

> picking up a commit before #1191 may work -- https://github.com/NVIDIA/apex/commits/master

It probably has to be before #1171; commits between #1171 and #1191 don't work.

@zyl1336110861

Have you solved the problem? I am faced with this problem now. @AndyYuan96

@zyl1336110861

Hi, I just solved my problem by rolling back the version of apex like this:
git checkout f3a960f80244cf9e80558ab30f7f7e8cbf03c0a0
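
For anyone scripting this rollback, for example in a Docker build, the steps could be sketched as below. This is only an illustration under stated assumptions: `rollback_commands` and `run` are hypothetical helpers, the repository URL is the one linked earlier in this thread, and the commit hash is the one quoted above. Installation itself is left out on purpose and should follow the README at that checkout.

```python
import subprocess
from typing import List

APEX_REPO = "https://github.com/NVIDIA/apex"
# Commit hash reported in this thread as building against older PyTorch.
KNOWN_GOOD_COMMIT = "f3a960f80244cf9e80558ab30f7f7e8cbf03c0a0"


def rollback_commands(dest: str = "apex", commit: str = KNOWN_GOOD_COMMIT) -> List[List[str]]:
    """Build the git commands that clone apex into `dest` and check out `commit`."""
    return [
        ["git", "clone", APEX_REPO, dest],
        ["git", "-C", dest, "checkout", commit],
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if git exits non-zero.
            subprocess.run(cmd, check=True)
```

Calling `run(rollback_commands())` only prints the two git commands; pass `dry_run=False` to execute them.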

@yumi-cn
Author

yumi-cn commented Nov 3, 2021

> maybe before #1171, commit between #1171 and #1191 doesn't work.

Yes, the code change was committed in #1171, so a slightly earlier version should work.

A usable version zip: Download old version apex

@yumi-cn
Author

yumi-cn commented Nov 3, 2021

@crcrpar I think apex should add a note to README.md explaining that these changes may break builds against older versions of torch.

@hszhoushen

> git checkout f3a960f80244cf9e80558ab30f7f7e8cbf03c0a0

git checkout -b f3a960f

y-okumura-isp added a commit to y-okumura-isp/packnet-sfm that referenced this issue Mar 3, 2022
When running `make docker-build`, the following error occurs.

(1) pip no longer supports Python 3.6

It looks like get-pip.py was updated on 03-Feb-2022 09:45.

ERROR: This script does not work on Python 3.6 The minimum supported Python version is 3.7. Please use https://bootstrap.pypa.io/pip/3.6/get-pip.py instead.
The command '/bin/bash -cu curl -O https://bootstrap.pypa.io/get-pip.py &&     python get-pip.py &&     rm get-pip.py' returned a non-zero code: 1
Makefile:65: recipe for target 'docker-build' failed
make: *** [docker-build] Error 1

(2) NVIDIA apex build failure

I got `ATen/cuda/DeviceUtils.cuh: No such file or directory`.
As in NVIDIA/apex#1200, I roll back the
version.
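
For item (1), the error message itself names a version-specific bootstrap URL. A small sketch of choosing the URL from the interpreter version might look like this. `get_pip_url` is a hypothetical helper, and the default minimum of 3.7 is an assumption taken from the error text above; it changes as pip drops support for older Python releases.

```python
import sys
from typing import Tuple

BASE = "https://bootstrap.pypa.io"


def get_pip_url(
    version: Tuple[int, int] = sys.version_info[:2],
    minimum: Tuple[int, int] = (3, 7),  # assumption: taken from the error message
) -> str:
    """Return the get-pip.py URL suited to the given Python (major, minor) version."""
    major, minor = version
    if (major, minor) >= minimum:
        # Supported interpreters use the default bootstrap script.
        return f"{BASE}/get-pip.py"
    # Older interpreters use the per-version script named in the error message.
    return f"{BASE}/pip/{major}.{minor}/get-pip.py"
```

Under these assumptions, `get_pip_url((3, 6))` gives the `pip/3.6/get-pip.py` URL that the error message recommends.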
6 participants