
baby-llama : use unnamed namespace in baby_llama_layer #9557

Open
wants to merge 3 commits into master
Conversation

danbev
Collaborator

@danbev danbev commented Sep 20, 2024


This commit renames the struct llama_layer to baby_llama_layer in the
baby-llama example. This is to avoid a symbol conflict with the
llama_layer struct in llama.cpp.

I ran into this while investigating an issue related to the baby-llama
example. Stepping through the code, I noticed the following in
`init_model`:
```console
$ gdb --args ./llama-baby-llama
Reading symbols from ./llama-baby-llama...
(gdb) br init_model
Breakpoint 1 at 0x251767: file examples/baby-llama/baby-llama.cpp,
line 190.
(gdb) r
...
(gdb)
204	    model->layers.resize(n_layer);
```
If we inspect the size of `model->layers` we see that it is 0:
```console
(gdb) p model->layers.size()
$1 = 0
```
And also `n_layer` is 1:
```console
(gdb) p n_layer
$3 = 1
```
And we can inspect the type of `model->layers`:
```console
(gdb) ptype model->layers
type = std::vector<llama_layer>
```

Now if we step over the resize function we will see something
interesting:
```console
(gdb) p model->layers.size()
$2 = 12
```
I also added two print statements to show the values of `n_layer` and
`model->layers.size()`:
```console
n_layer: 1
layers.size(): 2049638230412172414
```

I later realized that there is a symbol conflict. There is a
`llama_layer` in llama.cpp and this object file is compiled into the
`llama-baby-llama` binary:
```console
/usr/bin/ccache c++ -std=c++11 -fPIC -O0 -g -Wall -Wextra -Wpedantic
-Wcast-qual -Wno-unused-function -Wmissing-declarations
-Wmissing-noreturn -pthread -fopenmp  -march=native -mtune=native
-Wno-array-bounds -Wno-format-truncation -Wextra-semi -Iggml/include
-Iggml/src -Iinclude -Isrc -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE
-D_GLIBCXX_ASSERTIONS -DGGML_USE_OPENMP -DGGML_USE_LLAMAFILE
ggml/src/llamafile/sgemm.o ggml/src/ggml.o ggml/src/ggml-alloc.o
ggml/src/ggml-backend.o ggml/src/ggml-quants.o ggml/src/ggml-aarch64.o
src/llama.o src/llama-vocab.o src/llama-grammar.o src/llama-sampling.o
src/unicode.o src/unicode-data.o common/common.o common/arg.o
common/log.o common/console.o common/ngram-cache.o common/sampling.o
common/train.o common/build-info.o common/json-schema-to-grammar.o
examples/baby-llama/baby-llama.o -o llama-baby-llama -g
```
This could be worked around by renaming the `llama_layer` in
baby-llama.cpp to `baby_llama_layer`, or by using a namespace in
baby-llama.cpp. I initially considered not compiling llama.o into the
llama-baby-llama binary, but the baby-llama example uses `train.h`,
which depends on llama.cpp, so I opted to rename the struct.
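The failure mode can be sketched in a single file. In the real build both structs are named `llama_layer`, each in its own translation unit; the names below are hypothetical stand-ins so the sketch compiles, but the size mismatch is the point: inlined code compiled against one layout operating on a vector built with the other misreads the element size, which is how `resize(1)` can appear to produce 12 or garbage.

```cpp
#include <cstddef>

// Hypothetical stand-ins: in the actual build, BOTH of these are named
// `llama_layer`, one in llama.cpp and one in baby-llama.cpp. Defining the
// same class type differently in two translation units is an ODR violation,
// so the toolchain is free to mix the two layouts.

// Roughly the shape llama.cpp gives the struct (many tensor pointers).
struct layer_as_in_llama_cpp {
    void * tensors[24];  // placeholder members
};

// Roughly the shape the baby-llama example gives it (a handful of members).
struct layer_as_in_baby_llama {
    void * attention_norm;
    void * wq;
    void * wk;
    void * wv;
    void * wo;
};

// The two layouts disagree on element size, so size()/resize() arithmetic
// done against the wrong sizeof produces nonsense element counts.
std::size_t size_per_llama_cpp()  { return sizeof(layer_as_in_llama_cpp); }
std::size_t size_per_baby_llama() { return sizeof(layer_as_in_baby_llama); }
```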

After renaming the resize function works as expected:
```console
(gdb) p model->layers.size()
$1 = 0
(gdb) ptype model->layers
type = std::vector<baby_llama_layer>
(gdb) p model->layers.size()
$2 = 1
```
@ggerganov
Owner

That's not a symbol conflict since llama_layer is a type. Something else is going wrong in your debugging, but in any case, this change is not necessary.

@danbev
Collaborator Author

danbev commented Sep 20, 2024

> That's not a symbol conflict since llama_layer is a type. Something else is going wrong in your debugging, but in any case, this change is not necessary.

Yeah, sorry, it is indeed not a symbol conflict. I do think there might be an issue here on Linux (using gcc) with some form of memory corruption. I've just tried this on macOS but I can't reproduce it there. I'll take another look when I get a chance, but I'll close this PR for now. Sorry about the noise.

@danbev danbev closed this Sep 20, 2024
@ggerganov
Owner

> but in any case, this change is not necessary.

Actually, I was wrong and it might be a good change. When the structs start having member functions, their symbols will be duplicated (this already happened in #9449), which prevents static builds.

One convention for the types in examples is to prefix with gpt_. Some of the structs in common already do that. For example, llama_layer becomes gpt_layer.

@danbev danbev reopened this Sep 20, 2024
@danbev
Collaborator Author

danbev commented Sep 20, 2024

> One convention for the types in examples is to prefix with gpt_. Some of the structs in common already do that. For example, llama_layer becomes gpt_layer.

Sounds good, I'll update this shortly 👍

@slaren
Collaborator

slaren commented Sep 20, 2024

Another option is to use unnamed namespaces, which would make the struct accessible only from the file where it is defined.
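A minimal sketch of this suggestion (the member names are illustrative, not the real struct): placing the definition in an unnamed namespace makes the type unique to its translation unit, so it can no longer conflict with the `llama_layer` in llama.o, even if both versions later grow member functions.

```cpp
#include <cstddef>
#include <vector>

namespace {

// Inside an unnamed namespace, this type is distinct from any `llama_layer`
// defined in another translation unit, so there is no ODR conflict with
// llama.cpp.
struct llama_layer {
    float * wq = nullptr;  // illustrative members, not the real layout
    float * wk = nullptr;
};

} // unnamed namespace

// resize() now behaves as expected, because only one definition of the
// element type is visible to this program.
std::size_t make_layers(std::size_t n) {
    std::vector<llama_layer> layers;
    layers.resize(n);
    return layers.size();
}
```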

@danbev
Collaborator Author

danbev commented Sep 20, 2024

> Another option is to use unnamed namespaces, that would make the struct only accessible from the file where it is defined.

I'd be happy to use that, just let me know what you prefer.

@slaren
Collaborator

slaren commented Sep 20, 2024

There are other structs in baby-llama.cpp with conflicting names like llama_hparams, llama_model and llama_kv_cache, so just putting the entire file in an unnamed namespace may be the easiest solution.

Though, I suspect that this example will become outdated after the next sync with ggml, and probably will be removed.

@ggerganov
Owner

Should we anonymous namespace the llama code instead?

@slaren
Collaborator

slaren commented Sep 20, 2024

Yeah I think it would be good to move all the local C++ code into unnamed namespaces, since just declaring functions and globals as static is not enough to hide all the symbols in C++.
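A sketch of the distinction, using a hypothetical struct: `static` can only be applied to functions and objects at namespace scope, not to a class or its member functions, so those members keep external mangled symbols. Moving the type into an unnamed namespace makes every symbol it produces unique to the translation unit.

```cpp
namespace {

// `static` cannot be applied to this struct or its member function, but
// because the type lives in an unnamed namespace, the mangled name of
// hparams::layer_count() is unique to this translation unit. A static build
// therefore cannot collide with an identically named struct in another file.
struct hparams {                 // hypothetical example type
    int n_layer = 1;
    int layer_count() const { return n_layer; }
};

} // unnamed namespace

// Plain `static` would hide this free function; the unnamed namespace is
// what hides the struct's member-function symbols as well.
int count_layers() {
    return hparams{}.layer_count();
}
```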

Use an unnamed namespace to make identifiers unique to the translation
unit.
@danbev
Collaborator Author

danbev commented Sep 20, 2024

> Yeah I think it would be good to move all the local C++ code into unnamed namespaces, since just declaring functions and globals as static is not enough to hide all the symbols in C++.

I'd be happy to take a stab at adding an anonymous namespace to the llama code.

@danbev danbev changed the title baby-llama : rename llama_layer to baby_llama_layer baby-llama : use unnamed namespace in baby_llama_layer Sep 20, 2024
danbev added a commit to danbev/llama.cpp that referenced this pull request Sep 23, 2024
This commit introduces an anonymous namespace in llama.cpp to
encapsulate the following structs and types:

* llama_state
* llama_hparams
* llama_cparams
* llama_layer
* llama_ubatch
* llama_kv_cell
* llama_kv_cache
* llama_control_vector
* e_model

There are potentially more structs, and also functions that are currently
declared as static, which could be included in this anonymous namespace
in the future.

The motivation for this change is to avoid polluting the global
namespace with these types.

Refs: ggerganov#9557

@vignesh1507 vignesh1507 left a comment


Hmm, I approve of the changes.
