Segmentation fault on OPPO Find X7 Ultra (Snapdragon 8 Gen 3) #117
Thank you for the detailed log. It seems that the prefilling procedure completed correctly. However, the issue is that there is no decoding log at all. Besides, if the segmentation fault occurs at the end of the total execution, this is a known issue: it results from the order in which mllm-NPU releases QNN resources. We are working on a fix, but rest assured, it does not affect the normal prefilling and decoding processes. Thanks again for your valuable assistance!
I encountered a similar issue from the latest code, and below is the detailed log:
DDR size = 16GB
./main_qwen_npu -s 64 -c 1 -l 512
Below are the tail logs:
```
Memory Usage: 8910 MB(19036) at: execute graph: 94
chunk:1 execute qnn graph 95
model.layers.23.self_attn.or_split exe_time:0.064 ms
model.layers.23.self_attn.or_split-00_view_ exe_time:0.004 ms
model.layers.23.self_attn.or_split-01_view_ exe_time:0.003 ms
model.layers.23.self_attn.o_proj exe_time:0.002 ms
model.layers.23.self_attn.o_proj.dequantize exe_time:0.003 ms
model.layers.23.self_attn.o_proj.dequantize-00_view_ exe_time:0.002 ms
model.layers.23.self_attn.o_proj.dequantize-00_view_-00_add_ exe_time:0.003 ms
model.layers.23.post_attention_layernorm exe_time:0.003 ms
model.layers.23.mlp.up_proj.quantize exe_time:0.002 ms
model.layers.23.mlp.up_proj.quantize-00_view_ exe_time:0.002 ms
model.layers.23.mlp.gate_proj exe_time:0.002 ms
model.layers.23.mlp.up_proj exe_time:0.002 ms
model.layers.23.mlp.gate_proj.dequantize exe_time:0.002 ms
model.layers.23.mlp.up_proj.dequantize exe_time:0.002 ms
model.layers.23.mlp.silu exe_time:0.003 ms
model.layers.23.mlp.silu-00_mul_ exe_time:0.002 ms
model.layers.23.mlp.down_proj.quantize exe_time:0.003 ms
model.layers.23.mlp.down_proj exe_time:0.003 ms
model.layers.23.mlp.down_proj.dequantize exe_time:0.002 ms
model.layers.23.mlp.down_proj.dequantize-00_view_ exe_time:0.002 ms
model.layers.23.mlp.down_proj.dequantize-00_view_-00_add_ exe_time:0.001 ms
QNN execution time 12.683 ms
model.layers.23.self_attn.or_split exe_time:0.064 ms
model.layers.23.self_attn.or_split-00_view_ exe_time:0.004 ms
model.layers.23.self_attn.or_split-01_view_ exe_time:0.003 ms
model.layers.23.self_attn.o_proj exe_time:0.003 ms
model.layers.23.self_attn.o_proj.dequantize exe_time:0.003 ms
model.layers.23.self_attn.o_proj.dequantize-00_view_ exe_time:0.002 ms
model.layers.23.self_attn.o_proj.dequantize-00_view_-00_add_ exe_time:0.002 ms
model.layers.23.post_attention_layernorm exe_time:0.003 ms
model.layers.23.mlp.up_proj.quantize exe_time:0.002 ms
model.layers.23.mlp.up_proj.quantize-00_view_ exe_time:0.001 ms
model.layers.23.mlp.gate_proj exe_time:0.002 ms
model.layers.23.mlp.up_proj exe_time:0.002 ms
model.layers.23.mlp.gate_proj.dequantize exe_time:0.002 ms
model.layers.23.mlp.up_proj.dequantize exe_time:0.002 ms
model.layers.23.mlp.silu exe_time:0.003 ms
model.layers.23.mlp.silu-00_mul_ exe_time:0.002 ms
model.layers.23.mlp.down_proj.quantize exe_time:0.002 ms
model.layers.23.mlp.down_proj exe_time:0.002 ms
model.layers.23.mlp.down_proj.dequantize exe_time:0.002 ms
model.layers.23.mlp.down_proj.dequantize-00_view_ exe_time:0.002 ms
model.layers.23.mlp.down_proj.dequantize-00_view_-00_add_ exe_time:0.002 ms
QNN execution time 11.541 ms
Memory Usage: 8888 MB(19036) at: execute graph: 95
model.norm reshape:
|| Input input0-00 shape: 1 64 1 2048 (131072) |
|| Output outtensor-model.norm-00 shape: 1 64 1 2048 (131072) |
lm_head reshape:
|| Input outtensor-model.norm-00 shape: 1 64 1 2048 (131072) |
|| Output outtensor-lm_head-00 shape: 1 64 1 151936 (9723904) |
model.norm exe_time:2.76 ms
lm_head exe_time:2188.71 ms
Fre
load time: 1554.09 ms
token time: nan ms
inference speed: nan tokens/s
load time: 2773.47 ms
token time: nan ms
inference speed: nan tokens/s
0.0ms [WARNING] sg_stubPtr is not null, skip loadRemoteSymbols
Segmentation fault
```