I am struggling to compile and run llm.c inside a conda environment.
I am using this environment.yml file:
name: my-env2
channels:
- conda-forge
dependencies:
- cuda-libraries # this is the cuda metapackage
- cudnn # this is specifically for cudnn
- cuda-nvcc # ensures that a compatible nvidia C compiler is available!
# This may be sufficient, but it's probably safer to specify the CUDA build
# variant explicitly to make the conda solver's job easier.
#- jaxlib
- jaxlib=*=cuda
- cuda-version=12.4
- jax
- python=3.10
and I am getting these errors:
(my-env2) pars@pars-Precision-5540:~/Documents/deniz/llm.c$ make train_gpt2cu
→ cuDNN is manually disabled by default, run make with USE_CUDNN=1 to try to enable
✓ OpenMP found
✓ NCCL found, OK to train with multiple GPUs
✓ MPI enabled
✓ nvcc found, including GPU/CUDA support
/home/pars/miniconda3/envs/my-env2/bin/nvcc -O3 -t=0 --use_fast_math -std=c++17 --generate-code arch=compute_75,code=[compute_75,sm_75] -DMULTI_GPU -DUSE_MPI -DENABLE_BF16 train_gpt2.cu -lcublas -lcublasLt -L/usr/lib/x86_64-linux-gnu/openmpi/lib/ -I/usr/lib/x86_64-linux-gnu/openmpi/include/ -lnccl -lmpi -o train_gpt2cu
In file included from train_gpt2.cu:37:
llmc/cuda_common.h:13:10: fatal error: nvtx3/nvToolsExt.h: No such file or directory
13 | #include <nvtx3/nvToolsExt.h>
| ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
In file included from train_gpt2.cu:37:
llmc/cuda_common.h:13:10: fatal error: nvtx3/nvToolsExt.h: No such file or directory
13 | #include <nvtx3/nvToolsExt.h>
| ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
make: *** [Makefile:268: train_gpt2cu] Error 255
How can I solve this issue? Or does anyone have a working environment.yml file for llm.c?
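One check that may help narrow this down: the build fails because nvcc cannot find nvtx3/nvToolsExt.h, so it is worth seeing whether that header exists anywhere under the active conda environment. The sketch below is only a diagnostic. The helper name find_header is made up for illustration, and it assumes the environment is activated so that CONDA_PREFIX is set.

```shell
# Hypothetical helper (not part of llm.c or conda): list every copy of a
# header found under a given prefix directory.
find_header() {
    prefix="$1"   # directory to search, e.g. the conda environment root
    header="$2"   # header path as written in the #include, e.g. nvtx3/nvToolsExt.h
    find "$prefix" -type f -path "*/$header" 2>/dev/null
}

# Assumption: the environment is activated, so CONDA_PREFIX points at its root.
found=$(find_header "${CONDA_PREFIX:-/nonexistent}" "nvtx3/nvToolsExt.h" || true)
if [ -n "$found" ]; then
    echo "$found"
else
    echo "nvtx3/nvToolsExt.h not found under the active environment"
fi
```

If nothing is found, the environment probably lacks whichever package ships the NVTX headers. If a path is printed, the header is installed but likely sits in a directory that is not on nvcc's include path.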