Describe the bug
I have trained a LLaMA-like model with NeMo using the model config below:
The model works well when I run inference using the NeMo checkpoint (script), but the converted checkpoint (script) drops drastically in performance. Any ideas why this might be happening? My only hunch is that `apply_query_key_layer_scaling=True` in NeMo, which might not be the case in HF.
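For reference, here is a minimal plain-PyTorch sketch (not NeMo or HF source code; the tensor shapes and the `layer_number` value are illustrative assumptions) of what I understand `apply_query_key_layer_scaling` to do in Megatron-style attention, compared with the standard HF/LLaMA scaling:

```python
import math
import torch

# Illustrative tensors: [batch, heads, seq_len, head_dim]; values are arbitrary.
torch.manual_seed(0)
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 8, 16, 64)
head_dim = 64
layer_number = 12  # Megatron-style layer numbering is 1-indexed (assumption)

# HF/LLaMA-style attention probabilities: softmax(QK^T / sqrt(d_head))
scores_hf = torch.matmul(q, k.transpose(-1, -2)) / math.sqrt(head_dim)
probs_hf = torch.softmax(scores_hf, dim=-1)

# Megatron-style path with apply_query_key_layer_scaling=True (sketch):
# the scores are additionally divided by the layer number, and the fused
# softmax multiplies that factor back in while running in fp32. The two
# paths are mathematically equivalent, but they are computed differently,
# so in fp16/bf16 the outputs need not match bit-for-bit.
coeff = float(layer_number)
scores_mlm = torch.matmul(q, k.transpose(-1, -2)) / (math.sqrt(head_dim) * coeff)
probs_mlm = torch.softmax(scores_mlm.float() * coeff, dim=-1)

print((probs_hf - probs_mlm).abs().max())  # ~0 in fp32; larger in half precision
```

Whether this difference (or something else in the conversion script) actually accounts for the drop is exactly what I'm unsure about.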
Environment details
https://docs.nvidia.com/nemo-framework/user-guide/latest/softwarecomponentversions.html#nemo-framework-24-05