baby-llama : use unnamed namespace in baby_llama_layer #9557
base: master
Conversation
This commit renames the struct `llama_layer` to `baby_llama_layer` in the baby-llama example. This is to avoid a symbol conflict with the `llama_layer` struct in llama.cpp.

I ran into this when investigating an issue related to the baby-llama example. Stepping through the code, I noticed the following in `init_model`:
```console
$ gdb --args ./llama-baby-llama
Reading symbols from ./llama-baby-llama...
(gdb) br init_model
Breakpoint 1 at 0x251767: file examples/baby-llama/baby-llama.cpp, line 190.
(gdb) r
...
(gdb)
204         model->layers.resize(n_layer);
```
If we inspect the size of `model->layers` we see that it is 0:
```console
(gdb) p model->layers.size()
$1 = 0
```
And `n_layer` is 1:
```console
(gdb) p n_layer
$3 = 1
```
And we can inspect the type of `model->layers`:
```console
(gdb) ptype model->layers
type = std::vector<llama_layer>
```
Now if we step over the resize call we see something interesting:
```console
(gdb) p model->layers.size()
$2 = 12
```
I also added two print statements to show `n_layer` and `model->layers.size()`:
```console
n_layer: 1
layers.size(): 2049638230412172414
```
I later realized that there is a symbol conflict. There is a `llama_layer` in llama.cpp, and that object file is compiled into the `llama-baby-llama` binary:
```console
/usr/bin/ccache c++ -std=c++11 -fPIC -O0 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -pthread -fopenmp -march=native -mtune=native -Wno-array-bounds -Wno-format-truncation -Wextra-semi -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -D_GLIBCXX_ASSERTIONS -DGGML_USE_OPENMP -DGGML_USE_LLAMAFILE ggml/src/llamafile/sgemm.o ggml/src/ggml.o ggml/src/ggml-alloc.o ggml/src/ggml-backend.o ggml/src/ggml-quants.o ggml/src/ggml-aarch64.o src/llama.o src/llama-vocab.o src/llama-grammar.o src/llama-sampling.o src/unicode.o src/unicode-data.o common/common.o common/arg.o common/log.o common/console.o common/ngram-cache.o common/sampling.o common/train.o common/build-info.o common/json-schema-to-grammar.o examples/baby-llama/baby-llama.o -o llama-baby-llama -g
```
This could be worked around by renaming the `llama_layer` in baby-llama.cpp to `baby_llama_layer`, or by using a namespace in baby-llama.cpp. I initially considered not compiling llama.o into the llama-baby-llama binary, but the baby-llama example uses `train.h`, so it needs llama.cpp indirectly; I therefore opted to rename the struct.

After renaming, the resize call works as expected:
```console
(gdb) p model->layers.size()
$1 = 0
(gdb) ptype model->layers
type = std::vector<baby_llama_layer>
(gdb) p model->layers.size()
$2 = 1
```
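As a rough, hypothetical illustration of the kind of clash being described (stand-in file names and struct layouts, not the real llama.cpp or baby-llama sources): when two translation units define a type with the same name but different layouts and are linked into one binary, the shared `std::vector<llama_layer>` instantiations violate the ODR and the linker keeps only one copy of them. Whether a mismatch is visible depends on which copies survive and what gets inlined, but it can produce exactly the kind of bogus sizes seen in the gdb session above.
```c++
// --- lib.cpp: stand-in for llama.cpp ---
#include <vector>

struct llama_layer {                    // "big" layout
    float weights[64];
};

std::vector<llama_layer> make_layers() {
    return std::vector<llama_layer>(12);
}

// --- example.cpp: stand-in for baby-llama.cpp ---
#include <cstdio>
#include <vector>

struct llama_layer {                    // "small" layout, same name: ODR violation
    float norm;
};

struct toy_model {
    std::vector<llama_layer> layers;
};

int main() {
    toy_model model;
    model.layers.resize(1);             // once the ODR is broken, the linker keeps only
                                        // one copy of the shared template instantiations,
                                        // so the element size used here may not match
                                        // this file's definition of llama_layer
    std::printf("layers.size(): %zu\n", model.layers.size());
    return 0;
}

// c++ -O0 lib.cpp example.cpp -o odr-demo
// (undefined behaviour: the printed size is not guaranteed to be 1)
```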
That's not a symbol conflict since …
Yeah, sorry, it is indeed not a symbol conflict. I do think there might be an issue here on Linux (using gcc) with some form of memory corruption. I've just tried this on mac but I can't reproduce it there. I'll take another look when I get a chance, but I'll close this PR for now. Sorry about the noise.
Actually, I was wrong and it might be a good change. When the structs start having member functions, their symbols will duplicate (already happened in #9449), which prevents static builds. One convention for the types in …
Sounds good, I'll update this shortly 👍
Another option is to use unnamed namespaces, which would make the struct only accessible from the file where it is defined.
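A minimal sketch of that option (the field names below are made up for illustration, not the actual baby-llama.cpp contents): wrapping the example's local struct in an unnamed namespace gives it internal linkage, so it can no longer clash with the `llama_layer` defined in llama.cpp even when both object files end up in the same binary.
```c++
#include <cstdio>
#include <vector>

namespace {                       // unnamed namespace: local to this .cpp file

struct llama_layer {              // hypothetical fields for illustration
    float attention_norm = 0.0f;
    float ffn_norm       = 0.0f;
};

struct baby_model {
    std::vector<llama_layer> layers;
};

} // namespace

int main() {
    baby_model model;
    model.layers.resize(1);       // resize now uses this file's layout only
    std::printf("layers.size(): %zu\n", model.layers.size());   // prints 1
    return 0;
}
```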
Rename baby_llama_layer to gpt_layer.
I'd be happy to use that, just let me know what you prefer.
There are other structs in … Though, I suspect that this example will become outdated after the next sync with ggml, and will probably be removed.
Should we use an anonymous namespace in the llama code instead?
Yeah, I think it would be good to move all the local C++ code into unnamed namespaces, since just declaring functions and globals as static is not enough to hide all the symbols in C++.
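A small sketch of that point (hypothetical names, not actual llama.cpp code): `static` gives internal linkage to free functions and global variables, but it cannot be applied to a class type, so a struct defined at file scope remains visible across translation units; an unnamed namespace hides the type and its member functions as well.
```c++
#include <string>

// `static` works for free functions and globals ...
static int  s_call_count = 0;
static void bump() { ++s_call_count; }

// ... but a type cannot be declared `static`; wrap it in an unnamed namespace
// so that both the struct and any member functions it later gains stay
// private to this translation unit.
namespace {

struct sampler_state {                 // hypothetical internal struct
    std::string name;
    int         step = 0;

    void advance() { ++step; }         // no externally visible symbol needed
};

} // namespace

int main() {
    bump();
    sampler_state st;
    st.name = "greedy";
    st.advance();
    return (s_call_count == 1 && st.step == 1) ? 0 : 1;
}
```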
Use an unnamed namespace to make identifiers unique to the translation unit.
I'd be happy to take a stab at adding an anonymous namespace to the llama code.
This commit introduces an anonymous namespace in llama.cpp to encapsulate the following structs and types:

* llama_state
* llama_hparams
* llama_cparams
* llama_layer
* llama_ubatch
* llama_kv_cell
* llama_kv_cache
* llama_control_vector
* e_model

There are potentially more structs, and also functions that are currently declared as static, which could be included in this anonymous namespace in the future.

The motivation for this change is to avoid polluting the global namespace with these types.

Refs: ggerganov#9557
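As a rough illustration of what that layout might look like (simplified and hypothetical, not the actual llama.cpp contents), the internal types go inside the unnamed namespace, while anything declared in the public header (llama.h) must keep external linkage and therefore stays outside it.
```c++
// llama.cpp (simplified, hypothetical sketch)
#include <cstdint>

namespace {                       // internal types: visible only in this file

struct llama_hparams {
    uint32_t n_layer = 0;
};

struct llama_layer {
    // tensors for one transformer layer would live here
};

} // namespace

// Functions declared in the public header keep external linkage, so they
// remain outside the unnamed namespace; the name below is made up for this
// sketch rather than taken from the real API.
extern "C" int32_t llama_sketch_n_layers(void) {
    llama_hparams hp;
    return (int32_t) hp.n_layer;
}
```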
Hmm, I approve with the changes.