# ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl #15928
## Conversation
`ggml/src/ggml-quants.c` (outdated):

```c
}
float best = 0;
float scale = max/(2*kMaxQ-1);
for (int k = 0; k < 8; ++k) is_on_grid[k] = false;
```
---
If the goal is only to silence a compiler warning, then I would just zero-initialize the variable where it is declared. If you suspect that this is actually a bug that is leading to wrong results, then I think it needs more explanation.
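A minimal sketch of that alternative, assuming the kernel's `bool is_on_grid[8];` declaration (the placement here is hypothetical):

```c
#include <stdbool.h>

// Zero-initialization at the declaration, as suggested above; in C,
// `= { false }` sets the first element explicitly and value-initializes
// the remaining elements to false.
bool is_on_grid[8] = { false };
```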
---
It's fixing an actual bug. Depending on the condition below, the code will potentially use uninitialized data (or stale values left over from a previous loop iteration) if this initialization is not done:
`llama.cpp/ggml/src/ggml-quants.c`, lines 3750 to 3754 in `f50c60d`:

```c
if (sumq2 > 0 && sumqx*sumqx > best*sumq2) {
    scale = sumqx/sumq2; best = scale*sumqx;
    for (int i = 0; i < 32; ++i) L[i] = Laux[i];
    for (int k = 0; k < 8; ++k) is_on_grid[k] = is_on_grid_aux[k];
}
```
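To make the failure mode concrete, here is a minimal, hypothetical repro sketch (not the real kernel; the zero values model the all-zero-weights case):

```c
#include <stdbool.h>
#include <stdio.h>

int main(void) {
    bool is_on_grid[8];  // uninitialized, as in the kernel before the fix

    // With all-zero weights, sumq2 stays 0 and the guarded branch that
    // would write is_on_grid never executes.
    float best = 0, sumqx = 0, sumq2 = 0;
    if (sumq2 > 0 && sumqx*sumqx > best*sumq2) {
        for (int k = 0; k < 8; ++k) is_on_grid[k] = true;  // never reached
    }

    // The kernel later branches on is_on_grid[k]; here that reads
    // indeterminate values, which is undefined behavior in C.
    for (int k = 0; k < 8; ++k) {
        if (!is_on_grid[k]) printf("group %d would be re-projected\n", k);
    }
    return 0;
}
```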
See the other quants for reference:
`llama.cpp/ggml/src/ggml-quants.c`, line 3283 in `f50c60d`:

```c
is_on_grid[0] = is_on_grid[1] = true;
```

`llama.cpp/ggml/src/ggml-quants.c`, line 3931 in `f50c60d`:

```c
for (int k = 0; k < bs4; ++k) is_on_grid[k] = false;
```

`llama.cpp/ggml/src/ggml-quants.c`, line 4883 in `f50c60d`:

```c
is_on_grid[0] = is_on_grid[1] = true;
```
---
To me, it is not obvious that this will change the results, or, if it does, that it needs to be initialized to `false` instead of `true` (or some other value).
---
It is indeed not obvious; it should perhaps be set to `true`, or maybe even do as `quantize_row_iq3_s_impl` does, where it is set to `false`:

`llama.cpp/ggml/src/ggml-quants.c`, line 3968 in `f50c60d`:

```c
//if (is_on_grid[k]) continue;
```
---
Either way, it is potentially using uninitialized data right now. The safest choice AFAICT is initializing it to ~~`false`~~ `true`.
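For reference, the change that was eventually merged (per the squashed commit message quoted at the end of this thread, "change initialization to true") amounts to this initialization, shown here as a sketch:

```c
// Initialize the flags before the scale search, mirroring the other
// quant kernels referenced above.
for (int k = 0; k < 8; ++k) is_on_grid[k] = true;
```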
---
Again, what is it fixing? Is the change meaningful, or is it just adding more noise?
---
It's fixing uninitialized data, nothing more or less, exactly like in all the other quants.
---
...and yes, this will only happen when the weights are zero, but that has happened enough times that we have added several checks against it:
`llama.cpp/ggml/src/ggml-quants.c`, line 493 in `f50c60d`:

```c
float scale = suml2 ? sumlx/suml2 : 0.0f;
```

`llama.cpp/ggml/src/ggml-quants.c`, line 569 in `f50c60d`:

```c
return suml2 > 0.0f ? sumlx / suml2 : 0.0f;
```

`llama.cpp/ggml/src/ggml-quants.c`, line 969 in `f50c60d`:

```c
return suml2 > 0.0f ? sumlx / suml2 : 0.0f;
```
---
It seems that if all weights are zero, `scale` will also be zero, and the branch that uses `is_on_grid` will be ignored.
---
> It seems that if all weights are zero, `scale` will also be zero, and the branch that uses `is_on_grid` will be ignored.

Not necessarily; that holds only if the whole block is zero, which may not be the case. Also, `scale` is based on the original weights, while the weights in question are the ones after the imatrix is applied (i.e., the imatrix may cause non-zero parts to go to zero).
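A hedged sketch of the weighting pattern being described (the helper name, signature, and exact formula are illustrative, following the general shape used by the quant kernels in `ggml-quants.c`, not the literal code):

```c
#include <math.h>

// Hypothetical helper: the imatrix row `qw` multiplies into the
// per-value error weights. If qw is all zero for a group, weight[]
// (and hence sumq2) is zero even when the source values x[], and
// therefore scale, are not; the guarded branch above then never
// writes is_on_grid.
static void make_weights(const float *qw, const float *x, float sigma2,
                         float *weight, int n) {
    for (int i = 0; i < n; ++i) {
        weight[i] = qw ? qw[i] * sqrtf(sigma2 + x[i] * x[i])
                       : x[i] * x[i];
    }
}
```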
---
`L` with all zeros is on grid for `IQ3_XXS`, so initializing `is_on_grid` to `true` makes sense.

(EDIT: but `L` doesn't seem to be initialized either...)
---

> If the original weights are all zero (or close) …

---

Hi! Any update on this? It's one of the items preventing #15925 from passing CI tests due to …

---

I will double check.

---

Since imatrix weights are unlikely to be just partially zero, the whole block will be on grid in the event of the weights being zeroed by the imatrix, and then …
---

* origin/master: (39 commits)
  * ci : disable AMD workflows + update NVIDIA workflows (ggml-org#16200)
  * ci : enable Vulkan workflow on Mac (ggml-org#16194)
  * ggml-cpu: Respect cpumask settings (ggml-org#16164)
  * ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl (ggml-org#15928)
  * zdnn: refactor codebase + add docs (ggml-org#16178)
  * codeowners : add @danbev to model-conversion example [no ci] (ggml-org#16190)
  * devops: add s390x containers (ggml-org#15915)
  * ggml-cpu : fix typo in gemm comments [no ci] (ggml-org#16189)
  * feat: Add conversion support in GraniteHybrid for non-hybrid (all attn) (ggml-org#16177)
  * clang-tidy : disable warning about performance enum size (ggml-org#16127)
  * ggml : implement set_rows with i32 index (ggml-org#16159)
  * codeowners : update + cleanup (ggml-org#16174)
  * common : enable `--offline` mode without curl support (ggml-org#16137)
  * webui : fix handling incomplete chunks (ggml-org#16107)
  * embedding : fix typos in README (ggml-org#16171)
  * common : remove unused local variables (ggml-org#16140)
  * ggml : extend ggml_can_fuse to work with non-sequential nodes (ggml-org#16123)
  * ggml : add ggml_op_is_empty (ggml-org#16122)
  * codeowners : update ownership for @ngxson and @allozuar (ggml-org#16128)
  * Vulkan: add conv_transpose_2d operation (ggml-org#16022)
  * ...
---

ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl (ggml-org#15928)

* fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl
* change initialization to true
See https://github.com/ggml-org/llama.cpp/actions/runs/17615331967/job/50046823487#step:5:87