
Bug: cannot create std::vector larger than max_size() #9391

Open
imhoffman opened this issue Sep 9, 2024 · 10 comments
Labels: bug (Something isn't working), medium severity (Used to report medium severity bugs in llama.cpp, e.g. malfunctioning features but still usable)


imhoffman commented Sep 9, 2024

What happened?

My usual build recipe and run scripts stopped working after b3680. Something changed in b3681, but I don't know what.
I see the same failure across models and CLI flags, so it seems deeper than any single feature choice, which is why I have omitted the launch script here.

This is the actual error:

...
terminate called after throwing an instance of 'std::length_error'
  what():  cannot create std::vector larger than max_size()
<launch script name> Aborted                 (core dumped)

Here is what the binary reports at runtime:

system_info: n_threads = 24 (n_threads_batch = 24) / 48 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
main: interactive mode on.

Here is how I configure the build:

cmake -DGGML_AVX=ON -DGGML_AVX2=ON -DBUILD_SHARED_LIBS=ON -DGGML_CUDA=ON -DGGML_CUDA_F16=ON -DGGML_F16C=ON -DCMAKE_C_COMPILER=gcc-12 -DCMAKE_CXX_COMPILER=g++-12 -DCMAKE_CUDA_FLAGS='-ccbin=gcc-12' -DCMAKE_INSTALL_PREFIX=/opt/llama ..

and some other system info:

$ lscpu | grep "Model name:"
Model name:                           Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
$ uname -srv
Linux 6.10.6-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 19 Aug 2024 17:02:39 +0000
$ cat /proc/driver/nvidia/version 
NVRM version: NVIDIA UNIX x86_64 Kernel Module  550.107.02  Wed Jul 24 23:53:00 UTC 2024
GCC version:  gcc version 14.2.1 20240805 (GCC) 
$ gcc-12 --version
gcc-12 (GCC) 12.3.0

Name and Version

$ /opt/llama/bin/llama-cli --version
version: 3681 (df270ef)
built with gcc-12 (GCC) 12.3.0 for x86_64-pc-linux-gnu

What operating system are you seeing the problem on?

Linux

Relevant log output

No response

imhoffman added the bug-unconfirmed and medium severity labels on Sep 9, 2024
ggerganov (Member) commented:

It's likely something related to the sampling, but without the actual command or a stack trace it's hard to say what's wrong.

imhoffman (Author) commented:

This fails the same way for a variety of input models and CLI options, but I can certainly provide one of them in detail.
Also, how would you like me to produce the stack trace?

imhoffman (Author) commented:

Here is the launch script:

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/llama/lib CUDA_VISIBLE_DEVICES=0,1,2 \
OMP_NUM_THREADS=48 OMP_PROC_BIND=spread OMP_PLACES=cores \
/opt/llama/bin/llama-cli \
  --color \
  --threads 48 \
  --n-predict -1 \
  --ctx-size 8192 \
  --batch-size 32 --cont-batching \
  --parallel 48 --sequences 48 \
  --temp 0.95 --dynatemp-range 0.175 \
  --gpu-layers 77 \
  --repeat-last-n -1 --repeat-penalty 1.10 \
  --model /opt/llama/models/Meta-Llama-3.1-70B-Instruct-Q5_K_S.gguf \
  --conversation \
  --file <local prompt text file> \
  --keep -1 \
  --reverse-prompt "Prompter:" \
  --log-enable

The resulting llama.log is an empty file.

imhoffman (Author) commented:

Here is all that I can get so far out of the core dump from gdb:

...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `/opt/llama/bin/llama-cli --color --threads 48 --n-predict -1 --ctx-size 8192 --'.
Program terminated with signal SIGABRT, Aborted.
#0  0x0000771d2a4a53f4 in ?? () from /usr/lib/libc.so.6

imhoffman (Author) commented:

And, yes, here is the failure in the sampler:

...
#5  0x0000771d2a69752a in std::terminate () at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:58
No locals.
#6  0x0000771d2a6ae2b6 in __cxxabiv1::__cxa_throw (obj=<optimized out>, tinfo=0x771d2a876da8 <typeinfo for std::length_error>, dest=0x771d2a6c57c0 <std::length_error::~length_error()>) at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_throw.cc:98
        globals = <optimized out>
        header = 0x55e43cf6ce10
#7  0x0000771d2a69b247 in std::__throw_length_error (__s=0x771d42ab32b0 "cannot create std::vector larger than max_size()") at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/functexcept.cc:82
No locals.
#8  0x0000771d42a7f7a9 in llama_sampler_init_penalties () from /opt/llama/lib/libllama.so
No symbol table info available.
#9  0x000055e40eb4e9fd in gpt_sampler_init(llama_model const*, gpt_sampler_params const&) ()
No symbol table info available.
#10 0x000055e40eafd029 in main ()
No symbol table info available.
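
For what it's worth, the same exception can be reproduced outside of llama.cpp in a few lines of C++. This is only a minimal sketch of the failure mode, assuming the -1 from --repeat-last-n eventually gets used as an unsigned vector size inside llama_sampler_init_penalties; the variable names below are illustrative, not taken from the source.

// repro.cpp -- minimal sketch, not llama.cpp code: shows how a signed -1
// used as a vector size wraps to SIZE_MAX and triggers the same exception.
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <vector>

int main() {
    int penalty_last_n = -1;  // the "unlimited" sentinel from --repeat-last-n -1
    try {
        // the signed-to-unsigned conversion turns -1 into SIZE_MAX, which
        // exceeds max_size() and makes the vector constructor throw
        std::vector<float> prev_tokens(static_cast<std::size_t>(penalty_last_n));
    } catch (const std::length_error & e) {
        // with libstdc++ this prints "cannot create std::vector larger than max_size()"
        std::printf("caught: %s\n", e.what());
    }
    return 0;
}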

slaren (Member) commented Sep 9, 2024

It's a bug. In the meantime, you can replace --repeat-last-n -1 with --repeat-last-n 0.

slaren added the bug label and removed the medium severity label on Sep 9, 2024
Gryphe commented Sep 9, 2024

It's a bug. In the meantime, you can replace --repeat-last-n -1 with --repeat-last-n 0.

I can confirm this fixes the crash, but it appears samplers no longer function on llama-server. Every time I regenerate a response, it's exactly the same.

slaren (Member) commented Sep 9, 2024

@Gryphe please create a new issue and provide instructions to reproduce this (ideally using curl as the client).

slaren (Member) commented Sep 9, 2024

@ggerganov Maybe that is caused by the reset function of the dist sampler? I see there is a gpt_sampler_reset in update_slots. Possibly related to #8971 as well.

ggerganov (Member) commented:

@ggerganov Maybe that is caused by the reset function of the dist sampler? I see there is a gpt_sampler_reset in update_slots. Possibly related to #8971 as well.

It looks like it's because of passing the -1 value for the penalty_last_n argument. #9398 seems to resolve it by clamping the value to 0.

To fix this issue, we should update gpt_sampler_init to pass the context size when params.penalty_last_n == -1.
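
Roughly along these lines (a sketch of the intent only; the helper name and call site are illustrative rather than taken from the actual gpt_sampler_init code, and how the context size is obtained there is glossed over):

// sketch only: resolve the "last n" sentinel before it is ever used as a size,
// so the -1 never reaches the size_t conversion in llama_sampler_init_penalties.
#include <cstdint>
#include <cstdio>

static int32_t resolve_penalty_last_n(int32_t penalty_last_n, int32_t n_ctx) {
    return penalty_last_n < 0 ? n_ctx : penalty_last_n;  // -1 means "whole context"
}

int main() {
    std::printf("%d\n", resolve_penalty_last_n(-1, 8192));  // 8192, the --ctx-size from the launch script above
    std::printf("%d\n", resolve_penalty_last_n(64, 8192));  // explicit values pass through unchanged
    return 0;
}

gpt_sampler_init would then hand the resolved, never-negative value to llama_sampler_init_penalties.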

slaren added the medium severity label and removed the bug-unconfirmed label on Sep 21, 2024