ggml_new_tensor_impl: not enough space in the context's memory pool #47
Thanks for opening this issue. I think there's a problem with the way the simple example code in the GGML repo allocates memory for the models. I probably need to dig around in the ggml code and see if I can get it to allocate more memory. Did this happen after repeated generations appending to the same file? Are you using the fauxcode vscode plugin or the huggingface plugin?
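For context on the error in the title: ggml pre-allocates a fixed-size memory pool when a context is created, and every tensor is carved out of that pool with a bump pointer. The sketch below is not ggml's actual code, just an illustrative Python model of that allocation scheme (the pool size and tensor sizes are made-up numbers), showing why a longer prompt or repeated generations can run the pool dry and trigger the error.

```python
class Context:
    """Toy model of a fixed-size memory pool with bump-pointer allocation."""

    def __init__(self, mem_size):
        self.mem_size = mem_size  # pool size fixed at context-creation time
        self.offset = 0           # next free byte in the pool

    def new_tensor(self, nbytes):
        # Each tensor is carved out of the pre-allocated pool; nothing is
        # freed, so every allocation only moves `offset` forward.
        if self.offset + nbytes > self.mem_size:
            raise MemoryError("not enough space in the context's memory pool")
        addr = self.offset
        self.offset += nbytes
        return addr  # "address" (offset) of the new tensor

ctx = Context(mem_size=1024)
ctx.new_tensor(512)  # fits
ctx.new_tensor(256)  # fits
# A longer prompt needs larger tensors; the next allocation would overflow
# the fixed pool and raise MemoryError:
# ctx.new_tensor(512)
```

If the pool is sized for a short prompt, nothing in this scheme can grow it later; the fix is to allocate a larger pool up front (or request less memory per pass).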
I have the same problem when using Fauxpilot. If I POST the data with curl there is no problem, and codegen-serve works normally.
It happens after repeated generations, I think, as I'm typing; that also matches the number of calls to the completions endpoint in the logs. I'm using the fauxpilot extension.
I also experienced it. |
I use Fauxpilot, and when I use a longer prompt it hits this error and exits.
I've deployed a change to allow users to specify smaller batch sizes (#59). Normally we set the batch size (the number of tokens to attempt to process in the same forward pass) to 512. However, this is quite memory-intensive, especially with the larger models (starcoder/wizardcoder). If you don't want to build from main, I will package up a new minor release in the next couple of days that will include this change.
The problem is still happening for me with
I tried writing a few lines of code. I got my first completion working properly.
But when I started adding more code, I got an error from Turbopilot saying the following:
I'm running on a MacBook Pro with Apple M1 Pro chip and 16 GB of memory.