`pyproject.toml` missing `packaging` dependency #1679
That sounds right. Would you mind opening a pull request?
I encountered the same error. When I added `packaging` and `torch` to `pyproject.toml`, a new error occurred.
My conda env has torch 1.7.1 and CUDA 11.0, and I'm using the 22.04 branch for installation.
Personally I recommend using …
Is the latest version of apex necessary?
@Colezwhy and @crcrpar - when I build with … Sample build here that installs apex with …
Hmm, I haven't found myself in the same situation. What if the latest pip and multiple …
Has this been figured out?
Same issue with `cd apex && MAX_JOBS=1 python3 -m pip install --global-option="--cpp_ext" --global-option="--cuda_ext" --no-cache -v --disable-pip-version-check .`
@crcrpar - even with the newer pip, the issue doesn't seem to be multiple config settings either. When you tested this, did you hit any issues?
The latest version of apex currently does not install, as mentioned here facebookresearch#52. This issue with apex has also been reported here NVIDIA/apex#1679. huggingface/transformers#24351 suggests pinning apex to a specific commit, `cd apex && git checkout 82ee367f3da74b4cd62a1fb47aa9806f0f47b58b`, after which apex installs successfully. However, that version of apex is incompatible with the version of torch used here, and I get this error NVIDIA/apex#1532. The previous link suggests using version `22.04-dev` (`cd apex && git checkout 22.04-dev`) of apex. With this, apex compiles successfully, and `python ./main_finetune.py` also runs training using amp successfully. If the authors can tell us the exact HEAD commit of the apex version that they used, we can use that version instead!
Something like this should work for multiple extensions:
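A hedged sketch of how such a command line could be assembled, with one `--config-settings "--global-option=…"` pair per extension (the flag names follow the apex README; the assembly itself is purely illustrative):

```python
# Illustrative only: build a pip command line that passes one
# --config-settings "--global-option=..." pair per requested extension.
extensions = ["--cpp_ext", "--cuda_ext"]

cmd = ["pip", "install", "-v", "--no-build-isolation", "--no-cache-dir"]
for ext in extensions:
    cmd += ["--config-settings", f"--global-option={ext}"]
cmd.append("./")

print(" ".join(cmd))
```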
Adding torch to the dependencies does help, but the bigger issue seems to be that no matter how I specify the build options, they're not being picked up as present in argv here.
@xwang233 - were you able to build the cpp or cuda extensions with these commands at all? I've been able to build, but I'm not seeing the arguments passed to setup.py.
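For context, the failure mode being discussed is that setup.py gates each extension on whether its flag shows up in `sys.argv`; here is a simplified sketch of that pattern (names and structure are illustrative, not the actual apex code):

```python
import sys

def flag_requested(flag, argv=None):
    """Return True and strip the flag if present (simplified sketch of
    the sys.argv gating a setup.py can rely on)."""
    argv = sys.argv if argv is None else argv
    if flag in argv:
        argv.remove(flag)
        return True
    return False

# If pip never forwards the options, argv only contains pip's own
# build commands, and the extension is silently skipped:
argv = ["setup.py", "bdist_wheel"]
print(flag_requested("--cpp_ext", argv))  # False -> cpp_ext not built
```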
Yes, I was able to build cpp, cuda, and other extensions with this command: #1679 (comment). We were using Python 3.10.11 and pip 23.0.1.
@xwang233 - Are you using the most up to date master branch from the repo? Since you should also need to add packaging and torch as dependencies to the pyproject.toml first, right? Here is my output:
And this hasn't built amp_C or apex_C.
We're using the latest commit, which includes #1669. Our pytorch is from a source build, but I'm not sure if that's the issue. I also tried pip 23.1.2 and it worked as expected. Can you try pip install with …
Interesting, mine is torch 1.13, but I doubt the torch version makes a difference. Also in a venv, if that matters. I'm just not able to see it ever get to the part where it parses any sys.argv values.
I also have an install problem that only appeared after this change. If I fix the missing `packaging`, I get an error about torch despite it being installed. For everyone with this problem, I suggest checking out …
@RuABraun - that's what many of us are doing, but it will prevent future changes from being picked up. Adding pytorch and packaging resolves those errors, but the overall installation (at least for me) still fails, with the cpp_ext and cuda_ext not being installed.
I've been experiencing the exact same issues as @loadams.
One way (though I wouldn't recommend it) to dodge pyproject.toml dependency management could be to use …
Did you solve it? I have the same error.
I think the README is simply wrong. You need to use: `pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--global-option=--cpp_ext" --config-settings "--global-option=--cuda_ext" ./`
`--build-option` is not the correct config setting to use; we need `--global-option`. Ref NVIDIA#1679.
What about this message (from pip, I guess)?
Doesn't seem like …
Thanks @RuABraun, I had not seen that warning! So this is an incorrect solution after all, but at least it's a workaround until pip 23.3. Maybe the problem of accessing …
I don't get the same warning for some reason, but I created a new PR with an alternative solution based on what Pillow uses to support custom arguments.
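I can't speak to the PR's exact mechanism, but the Pillow-style approach generally means reading custom build options from environment variables rather than from `sys.argv`. A hypothetical sketch (the `APEX_*` variable names are made up here for illustration, not real apex settings):

```python
import os

# Hypothetical: decide which extensions to build from environment
# variables instead of command-line arguments (Pillow-style).
def ext_enabled(name, environ=os.environ):
    """True if the (illustrative) APEX_<NAME> env var opts the extension in."""
    return environ.get(f"APEX_{name.upper()}", "0") in ("1", "true", "yes")

env = {"APEX_CPP_EXT": "1"}
print(ext_enabled("cpp_ext", env))   # True
print(ext_enabled("cuda_ext", env))  # False
```

This sidesteps the problem entirely, since environment variables survive pip's build isolation and argument handling in a way `--global-option` does not.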
I am using … Any workarounds?
Try with …
But then I get this error:
@VarunGumma try …
We have the same environment; did you solve the problem?
So, what I did was clone the repo and check out an older commit (something around …
@ChaosPengs - you'd need to ensure you have …
× Getting requirements to build wheel did not run successfully. note: This error originates from a subprocess, and is likely not a problem with pip.
May I know what versions of Torch, CUDA, and Python you have?
torch 1.9.0+cu111, CUDA 11.3, Python 3.9
Thank you for your advice. I'll try it later.

> `pip install -v --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" --global-option="--deprecated_fused_adam" --global-option="--xentropy" --global-option="--fast_multihead_attn" ./` - it works for me! (torch 1.9.0+cu111, CUDA 11.3)
It works for me!! Thanks~
@VarunGumma do you still get the warning "amp_C fused kernels unavailable" when using fairseq-train?
But when I run the … a new error is raised, as below:
@Taskii-Lei - that looks like a new/different error; I'd recommend opening a new issue for it.
I have solved it. The error is raised because the CUDA installed by conda is incomplete, and there's no …

And by the way, if it's still not OK, one can try:
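As a quick sanity check before rebuilding, one way to confirm that a full CUDA toolkit (including the `nvcc` compiler, which conda's runtime-only CUDA packages often lack) is visible to the build; the lookup order here is just an illustrative sketch:

```python
import os
import shutil

# Sketch: locate nvcc, which a conda-installed CUDA runtime often
# lacks; building apex's CUDA extensions needs the full toolkit.
cuda_home = os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH")
candidates = [shutil.which("nvcc")]
if cuda_home:
    candidates.append(os.path.join(cuda_home, "bin", "nvcc"))

nvcc = next((c for c in candidates if c and os.path.exists(c)), None)
print("nvcc:", nvcc or "not found - install a full CUDA toolkit")
```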
> If you add pytorch and packaging, that will resolve those issues but the overall installation (at least for me) is failing with another issue of not installing the cpp_ext or cuda_ext.

@loadams I have the exact same issue: I want to build apex with cpp_ext and cuda_ext for mixed precision training. I am using the following command: … but it simply does not work. When running my code on multiple GPUs, I get the following error: … Any help will be appreciated!
Using the below commands:

It installs apex-0.1, but it is still not built with …

All requested packages already installed. Using pip 24.0 from /homes/hayatu/miniconda3/envs/focal/lib/python3.8/site-packages/pip (python 3.8). torch.__version__ = 2.3.0+cu121, running dist_info, running bdist_wheel.

I am using an HPC server with PyTorch 2.3.0+cu121 and CUDA 12.4. I suspect this behavior might be caused by a mismatch between the installed CUDA version (12.4) on my server and the pre-compiled CUDA version (12.1) of PyTorch. After installation, when I run my code I get the same warning: Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ModuleNotFoundError("No module named 'amp_C'")
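A quick way to check from Python whether the compiled extensions actually made it into the environment. The module names below come from the warning above (`amp_C` is the one the fused kernels import); whether each resolves depends entirely on how apex was built:

```python
import importlib.util

# The fused kernels register as top-level compiled modules; if the
# build skipped --cpp_ext/--cuda_ext, find_spec returns None for them.
for mod in ("amp_C", "apex_C", "fused_layer_norm_cuda"):
    spec = importlib.util.find_spec(mod)
    print(f"{mod}: {'built' if spec else 'missing'}")
```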
Describe the Bug
#1669 adds a `pyproject.toml` file, but the build dependencies are underspecified. The `setup.py` file depends on `packaging`, but this dependency isn't declared in the build dependencies.

Minimal Steps/Code to Reproduce the Bug
yields
full log: https://gist.github.com/calebho/35fa3bf2fdc4e818bc5bded4456988c3
Expected Behavior
It should install without errors.
Environment