Skip to content

Conversation

@Yejing-Lai
Copy link
Contributor

Suppose qkv_linear_weight_shape = [in_features, out_features].
The qkv linear weight shape is [3, in_features, out_features] if using fued_qkv gemm optimization. It will cause "ValueError: too many values to unpack (expected 2)" issue when printing the model.

Solution: Take the last two weight dimensions shapes as in_features and out_features.

Signed-off-by: Lai, Yejing <yejing.lai@intel.com>
@loadams loadams added this pull request to the merge queue Mar 4, 2025
@loadams loadams removed this pull request from the merge queue due to a manual request Mar 4, 2025
@hwchen2017 hwchen2017 enabled auto-merge March 4, 2025 17:20
@hwchen2017 hwchen2017 added this pull request to the merge queue Mar 4, 2025
Merged via the queue into deepspeedai:master with commit 71807bc Mar 4, 2025
10 of 11 checks passed
saurabhkoshatwar pushed a commit to saurabhkoshatwar/DeepSpeed that referenced this pull request Mar 8, 2025
Suppose qkv_linear_weight_shape = [in_features, out_features].
The qkv linear weight shape is [3, in_features, out_features] if using
fued_qkv gemm optimization. It will cause "ValueError: too many values
to unpack (expected 2)" issue when printing the model.

Solution: Take the last two weight dimensions shapes as in_features and
out_features.

Signed-off-by: Lai, Yejing <yejing.lai@intel.com>
Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Signed-off-by: Saurabh <saurabhkoshatwar1996@gmail.com>
A-transformer pushed a commit to A-transformer/DeepSpeed that referenced this pull request Mar 9, 2025
Suppose qkv_linear_weight_shape = [in_features, out_features].
The qkv linear weight shape is [3, in_features, out_features] if using
fued_qkv gemm optimization. It will cause "ValueError: too many values
to unpack (expected 2)" issue when printing the model.

Solution: Take the last two weight dimensions shapes as in_features and
out_features.

Signed-off-by: Lai, Yejing <yejing.lai@intel.com>
Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Signed-off-by: Liang Cheng <liangcheng@Liangs-MacBook-Pro.local>
mauryaavinash95 pushed a commit to DataStates/DeepSpeed that referenced this pull request Mar 20, 2025
Suppose qkv_linear_weight_shape = [in_features, out_features].
The qkv linear weight shape is [3, in_features, out_features] if using
fued_qkv gemm optimization. It will cause "ValueError: too many values
to unpack (expected 2)" issue when printing the model.

Solution: Take the last two weight dimensions shapes as in_features and
out_features.

Signed-off-by: Lai, Yejing <yejing.lai@intel.com>
Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
loadams added a commit that referenced this pull request Mar 25, 2025
Suppose qkv_linear_weight_shape = [in_features, out_features].
The qkv linear weight shape is [3, in_features, out_features] if using
fued_qkv gemm optimization. It will cause "ValueError: too many values
to unpack (expected 2)" issue when printing the model.

Solution: Take the last two weight dimensions shapes as in_features and
out_features.

Signed-off-by: Lai, Yejing <yejing.lai@intel.com>
Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Signed-off-by: Logan Adams <loadams@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants