
Stuck when inferring with trained LoRA model #30

Open
moon-fall opened this issue Oct 5, 2023 · 0 comments

Comments

@moon-fall

Running inference on 2× A100 80 GB GPUs with the inference.py script.

The log shows:

CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 113
CUDA SETUP: Loading binary /home/notebook/data/group/cubelm/lf_cubelm/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda113.so...
[2023-10-05 10:09:55,419] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading checkpoint shards: 100%|████████████████| 15/15 [04:43<00:00, 18.93s/it]
Processing batch 1 of 1...

The process hangs at this point. However, inference with the original llama-2-70b model (without LoRA) works fine.
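For context on what the adapter adds at inference time: a trained LoRA adapter contributes a low-rank update to each targeted weight matrix, W' = W + (alpha / r) * B @ A, which can be merged into the base weights ahead of inference. Below is a minimal sketch of that merge arithmetic in plain Python; the shapes and values are hypothetical and purely illustrative, not taken from this repository's inference.py.

```python
# Sketch of the LoRA weight merge: W' = W + (alpha / r) * B @ A.
# All shapes/values here are hypothetical, for illustration only.

def matmul(B, A):
    # Multiply a (d x r) matrix by an (r x k) matrix using plain lists.
    d, r, k = len(B), len(A), len(A[0])
    return [[sum(B[i][t] * A[t][j] for t in range(r)) for j in range(k)]
            for i in range(d)]

def merge_lora(W, A, B, alpha):
    # W: (d x k) base weight; A: (r x k) and B: (d x r) low-rank factors.
    r = len(A)                      # LoRA rank
    scale = alpha / r               # standard LoRA scaling factor
    delta = matmul(B, A)            # low-rank update, shape (d x k)
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Tiny example: d = k = 2, rank r = 1, alpha = 1.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]        # r x k
B = [[1.0], [0.5]]      # d x r
merged = merge_lora(W, A, B, alpha=1.0)
print(merged)  # → [[2.0, 2.0], [0.5, 2.0]]
```

Since the merged weights behave identically to base weights at inference time, merging the adapter before running inference is one way to rule out the adapter-application path as the source of the hang.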
