ggml : add ggml_fill() #13772


Open · wants to merge 3 commits into master

Conversation

@ngxson (Collaborator) commented May 25, 2025

Add ggml_fill(ctx0, tensor, value), which mimics the idea of PyTorch's full, full_like, zeros_like, ones_like.

It's not 100% equivalent to PyTorch, as this is an in-place operation. However, it allows much more flexibility. For example:

  • Create a new tensor with a constant value by new_tensor, then fill
  • Set part of an existing tensor to a constant value by doing a view, then fill
  • Mimic PyTorch's *_like behavior by doing a dup, then fill

For simplicity, this op is single-threaded and CPU-only for now.
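
A minimal usage sketch of the three patterns above, assuming the ggml_fill(ctx, tensor, value) signature from this PR (tensor names and sizes are illustrative, not from the PR):

// (1) new tensor with a constant value: new_tensor, then fill
struct ggml_tensor * a = ggml_new_tensor_2d(ctx0, GGML_TYPE_F32, 4, 4);
a = ggml_fill(ctx0, a, 1.5f);

// (2) zero out the first row of a via a view, then fill
struct ggml_tensor * row0 = ggml_view_1d(ctx0, a, 4, 0);
ggml_build_forward_expand(gf, ggml_fill(ctx0, row0, 0.0f));

// (3) mimic PyTorch's zeros_like: dup, then fill
struct ggml_tensor * z = ggml_fill(ctx0, ggml_dup(ctx0, a), 0.0f);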

@github-actions bot added the testing (Everything test related) and ggml (changes relating to the ggml tensor library for machine learning) labels May 25, 2025
@ngxson ngxson marked this pull request as ready for review May 25, 2025 09:43
@ngxson ngxson requested a review from ggerganov May 25, 2025 09:43
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
@ggerganov (Member) commented

Inplace operations are a bit tricky (#12757 (comment)), so I am a bit hesitant. Wondering if there is some other way to support this.

> Create a new tensor with a constant value by new_tensor, then fill

Such tensors should always be marked as inputs and set via ggml_backend_tensor_set.
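
For illustration, a minimal sketch of that pattern (names and sizes are hypothetical; ggml_backend_tensor_set copies host data into the allocated tensor):

// mark the constant tensor as a graph input ...
struct ggml_tensor * c = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 16);
ggml_set_input(c);

// ... and after the graph is allocated, upload the constant data
float ones[16];
for (int i = 0; i < 16; i++) { ones[i] = 1.0f; }
ggml_backend_tensor_set(c, ones, 0, sizeof(ones));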

> Set part of an existing tensor to a constant value by doing a view, then fill

There isn't a convenient way to do it. Probably:

val = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1); // single-element input
ggml_set_input(val);

// ...

aux = ggml_repeat(ctx, val, /* tensor with the needed shape */);
cur = ggml_cpy(ctx, aux, ggml_view(...)); // copy the constant into the view
ggml_build_forward_expand(gf, cur);

@ngxson (Collaborator, Author) commented May 26, 2025

If we don't want to support inplace, we can internally create a new tensor, so ggml_fill would effectively become ggml_full_like.

Setting it via an input can be a bit annoying, especially when I want to use just a single number:

val = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1); // single-element input
ggml_set_input(val);
// then: allocate the graph, set the tensor data

Another way could be to provide a ggml_one which returns a tensor with one single element holding the value 1.0f. Then we'd have the ability to generate 0.0f and 1.0f, essentially a "linear basis" that allows constructing any possible vector 😂

(But of course, having something like PyTorch's full_like would make my life a lot easier.)

@ggerganov (Member) commented

> Inplace operations are a bit tricky (#12757 (comment)), so I am a bit hesitant.

Thinking more about it, my concern might not be very relevant here, because in this case we don't use the existing data as input - i.e. we always overwrite it with a specific value that does not depend on the input. So it's probably OK.

Maybe we should just improve the API a bit to make it more type-safe. For example, what happens if you call ggml_fill(x, 123.0f) when x is GGML_TYPE_I32? Probably we need overloads such as ggml_fill_f32(), ggml_fill_i32(), etc.

@ngxson (Collaborator, Author) commented May 26, 2025

> Maybe we should just improve the API a bit to make it more type-safe. For example, what happens if you call ggml_fill(x, 123.0f) when x is GGML_TYPE_I32? Probably we need overloads such as ggml_fill_f32(), ggml_fill_i32(), etc.

Hmm, I think it's currently the same concern as with many other ops like ggml_scale, ggml_norm, etc.

But the idea of supporting the I32 type is interesting. I don't know if this is asking too much, but for a long time now I've really wanted ggml_cast to support converting back and forth between float and int. I still haven't figured out how to do that because (1) the code of ggml_cast is a bit above my head, and (2) it can be tricky to reimplement on all backends. WDYT?

If ggml_cast can convert between float and int, then I think we can have a single ggml_fill accepting a float, then cast to I32 when needed (in general, I think this use case will be rare).
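
A sketch of what that combination could look like (hypothetical: it assumes the ggml_fill from this PR plus float-to-int support in ggml_cast, which does not exist yet):

// fill a float tensor, then cast to I32 - the cast is the missing piece
struct ggml_tensor * tf = ggml_fill(ctx, ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 8), 3.0f);
struct ggml_tensor * ti = ggml_cast(ctx, tf, GGML_TYPE_I32); // would require float -> int cast support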

@slaren (Member) commented May 26, 2025

The implementation of ggml_fill_i32 and ggml_fill_f32 would be the same: just store an int32 instead of a float32 in the op_params. You only need one implementation per type size, since it is only copying bits.
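
As an illustration, a minimal sketch of what those overloads could look like (this assumes a GGML_OP_FILL op enum and the internal ggml_set_op_params helper; it is not the PR's actual code):

// both variants pack 32 bits into op_params; the kernel just copies bits
struct ggml_tensor * ggml_fill_f32(struct ggml_context * ctx, struct ggml_tensor * a, float value) {
    struct ggml_tensor * result = ggml_view_tensor(ctx, a); // in-place: view of a
    ggml_set_op_params(result, &value, sizeof(value));
    result->op     = GGML_OP_FILL; // assumed op enum
    result->src[0] = a;
    return result;
}

struct ggml_tensor * ggml_fill_i32(struct ggml_context * ctx, struct ggml_tensor * a, int32_t value) {
    struct ggml_tensor * result = ggml_view_tensor(ctx, a);
    ggml_set_op_params(result, &value, sizeof(value)); // same 32-bit payload
    result->op     = GGML_OP_FILL;
    result->src[0] = a;
    return result;
}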

@slaren (Member) commented May 26, 2025

The issue with creating a new tensor in a graph with ggml_new_tensor is that ggml-alloc treats these as potential inputs and allocates all of them at the beginning of the compute buffer. This is so that the user can load data into them before evaluating the graph. If you are creating new tensors in a loop, e.g. per layer, this can quickly become a very large waste of memory.

Adding a version of this op that returns a new tensor instead of a view would be trivial, and would avoid this issue. Just make a ggml_fill_4d function or similar that creates a new tensor instead of a view.
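
For illustration, a hypothetical ggml_fill_4d along those lines (a sketch only, reusing the assumed GGML_OP_FILL from above). Because the result has an op, ggml-alloc would not treat it as a graph input:

struct ggml_tensor * ggml_fill_4d(
        struct ggml_context * ctx, enum ggml_type type,
        int64_t ne0, int64_t ne1, int64_t ne2, int64_t ne3,
        float value) {
    // fresh tensor, not a view - ggml-alloc can place it like any other node
    struct ggml_tensor * result = ggml_new_tensor_4d(ctx, type, ne0, ne1, ne2, ne3);
    ggml_set_op_params(result, &value, sizeof(value));
    result->op = GGML_OP_FILL; // assumed op enum
    return result;
}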
