Qwen2.5-0.5B-Instruct: running apply_lora.py fails #3095

Open
jfduma opened this issue Nov 20, 2024 · 1 comment
Labels
bug (Something isn't working), llm (llm export error or use error)

Comments


jfduma commented Nov 20, 2024

Development machine: Ubuntu 20.04, MNN 3.0.0

Models (Hugging Face): Qwen2.5-0.5B-Instruct and Qwen2.5-0.5B-Instruct-GPTQ-Int8

Export the ONNX model

$ python mnn/transformers/llm/export/llmexport.py --path pretrained_model/Qwen2.5-0.5B-Instruct --export onnx --dst_path mnn-output/basemodel_0.5b_instruct_q88_gptq_onnx_mnn_v3

✅ Done load pretrained model pretrained_model/Qwen2.5-0.5B-Instruct [ 1.10 s]
⠋ export tokenizer to 2024-11-20 15:21:53.270750: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-11-20 15:21:53.285959: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1732087313.300938 1727776 cuda_dnn.cc:8322] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1732087313.305363 1727776 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-11-20 15:21:53.322212: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
✅ Done export tokenizer to mnn-output/basemodel_0.5b_instruct_q88_gptq_onnx_mnn_v3/tokenizer.txt[ 2.71 s]
✅ Done export embedding to mnn-output/basemodel_0.5b_instruct_q88_gptq_onnx_mnn_v3/embeddings_bf16.bin[ 0.12 s]
✅ Done export onnx model to mnn-output/basemodel_0.5b_instruct_q88_gptq_onnx_mnn_v3/onnx/llm.onnx[ 3.43 s]
✅ Done export model weight to mnn-output/basemodel_0.5b_instruct_q88_gptq_onnx_mnn_v3/onnx/llm.onnx.data[ 3.19 s]
✅ Done export config to mnn-output/basemodel_0.5b_instruct_q88_gptq_onnx_mnn_v3/llm_config.json[ 0.00 s]

Export the MNN model

$ mnn/build/MNNConvert --modelFile mnn-output/basemodel_0.5b_instruct_q88_gptq_onnx_mnn_v3/onnx/llm.onnx --framework ONNX --MNNModel mnn-output/basemodel_0.5b_instruct_q88_gptq_onnx_mnn_v3/llm.mnn --weightQuantBits 8 --weightQuantBlock 128 --weightQuantAsymmetric --saveExternalData --transformerFuse --allowCustomOp

The device supports: i8sdot:0, fp16:0, i8mm: 0, sve2: 0
Don't has bizCode, use MNNTest for default
Start to Convert Other Model Format To MNN Model..., target version: 3
[15:22:06] /work/mnn/tools/converter/source/onnx/onnxConverter.cpp:46: ONNX Model ir version: 8
[15:22:06] /work/mnn/tools/converter/source/onnx/onnxConverter.cpp:47: ONNX Model opset version: 15
Start to Optimize the MNN Net...
Fuse Attention as /Reshape_8_output_0
Fuse Attention as /Reshape_17_output_0
Fuse Attention as /Reshape_26_output_0
Fuse Attention as /Reshape_35_output_0
Fuse Attention as /Reshape_44_output_0
Fuse Attention as /Reshape_53_output_0
Fuse Attention as /Reshape_62_output_0
Fuse Attention as /Reshape_71_output_0
Fuse Attention as /Reshape_80_output_0
Fuse Attention as /Reshape_89_output_0
Fuse Attention as /Reshape_98_output_0
Fuse Attention as /Reshape_107_output_0
Fuse Attention as /Reshape_116_output_0
Fuse Attention as /Reshape_125_output_0
Fuse Attention as /Reshape_134_output_0
Fuse Attention as /Reshape_143_output_0
Fuse Attention as /Reshape_152_output_0
Fuse Attention as /Reshape_161_output_0
Fuse Attention as /Reshape_170_output_0
Fuse Attention as /Reshape_179_output_0
Fuse Attention as /Reshape_188_output_0
Fuse Attention as /Reshape_197_output_0
Fuse Attention as /Reshape_206_output_0
Fuse Attention as /Reshape_215_output_0
Remove past KV for presents
Save Weight to mnn-output/basemodel_0.5b_instruct_q88_gptq_onnx_mnn_v3/llm.mnn.weight
inputTensors : [ input_ids, position_ids, attention_mask, past_key_values, ]
outputTensors: [ logits, presents, ]
Converted Success!

Convert the LoRA

$ python mnn/tools/script/apply_lora.py --base mnn-output/basemodel_0.5b_instruct_q88_gptq_onnx_mnn_v3/base.json --lora /work/task_alpha/alpha_lora/checkpoint-800 --scale 2 --out mnn-output/basemodel_0.5b_instruct_q88_gptq_onnx_mnn_v3/lora_alpha.json

Traceback (most recent call last):
  File "/work/mnn/tools/script/apply_lora.py", line 156, in <module>
    main(args)
  File "/work/mnn/tools/script/apply_lora.py", line 146, in main
    base.apply(lora, args.out)
  File "/work/mnn/tools/script/apply_lora.py", line 94, in apply
    self.apply_lora(op, lora)
  File "/work/mnn/tools/script/apply_lora.py", line 70, in apply_lora
    tag = names[1].split('.')[1] + names[3]
IndexError: list index out of range

Found while debugging: name = ['', 'mlp', 'gate_proj', 'FakeLinear_output_0__matmul_converted']
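
The failure can be reproduced without MNN from the debugged value alone. This is a minimal sketch, assuming the list printed above is the `names` variable used on line 70 of apply_lora.py; `build_tag` is a hypothetical wrapper around the expression shown in the traceback. Since `names[1]` is `'mlp'` and contains no `.`, `split('.')` returns a one-element list, so index `[1]` is out of range.

```python
# Value reported from debugging (assumed to be `names` in apply_lora.py).
names = ['', 'mlp', 'gate_proj', 'FakeLinear_output_0__matmul_converted']


def build_tag(names):
    # Same expression as line 70 in the traceback; hypothetical wrapper name.
    return names[1].split('.')[1] + names[3]


try:
    build_tag(names)
    error = None
except IndexError as exc:
    # 'mlp'.split('.') == ['mlp'], so element [1] does not exist.
    error = str(exc)

print(error)  # list index out of range
```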

jxt1234 added the bug (Something isn't working) and llm (llm export error or use error) labels Nov 20, 2024
Collaborator

jxt1234 commented Nov 20, 2024

It looks like the case where the name contains a space was not considered. We will investigate.
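
Until a fix is available, one possible guard is to validate the op name before building the tag. The sketch below is not the MNN fix: `layer_tag` is a hypothetical helper, and it assumes that op names are `/`-separated and that a usable name carries a numeric layer index after a `.` in its second component (the `/layers.0/...` name in the usage lines is an illustrative assumption, not taken from the converted model).

```python
def layer_tag(op_name):
    """Return '<layer index><fourth component>' or None if the name does not fit.

    Hypothetical helper; the expected name layout is an assumption.
    """
    names = op_name.split('/')
    if len(names) < 4:
        return None
    parts = names[1].split('.')
    # Reject names such as '/mlp/gate_proj/...' that carry no layer index.
    if len(parts) < 2 or not parts[1].isdigit():
        return None
    return parts[1] + names[3]


# Assumed well-formed name (illustrative only).
ok = layer_tag('/layers.0/self_attn/q_proj/Linear_output_0')
# Name reconstructed from the debugged list in this issue.
bad = layer_tag('/mlp/gate_proj/FakeLinear_output_0__matmul_converted')
print(ok, bad)
```

Returning `None` lets the caller decide whether to skip the op or report it, rather than aborting the whole conversion with an `IndexError`.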
