
macOS 15.0 (24A335) M1 buffer is not large enough and resource_tracker: There appear to be %d #107

Open
guoreex opened this issue Sep 17, 2024 · 29 comments

Comments

@guoreex

guoreex commented Sep 17, 2024

I'm not sure if this question is appropriate to ask here, as I'm not a professional programmer. If anyone is willing to offer help and guidance, I would be very grateful.

Two weeks ago I started using the GGUF model, and it worked normally. Today I upgraded my MacBook Pro M1 to the latest macOS 15.0 (24A335), and running a GGUF workflow in ComfyUI now produces this error:

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
/AppleInternal/Library/BuildRoots/5a8a3fcc-55cb-11ef-848e-8a553ba56670/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:891: failed assertion `[MPSNDArray, initWithBufferImpl:offset:descriptor:isForNDArrayAlias:isUserBuffer:] Error: buffer is not large enough. Must be 63700992 bytes
'/Users/***/lib/python3.11/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

My system information:
Python version: 3.11.5 (main, Sep 11 2023, 08:31:25) [Clang 14.0.6 ]
pytorch version: 2.6.0.dev20240916
ComfyUI Revision: 2701 [7183fd16] | Released on '2024-09-17'

I don't know if this is related to updating the system.
Thanks.

@city96
Owner

city96 commented Sep 17, 2024

Could you test with the FP16/FP8 model and the default nodes, without this custom node pack? If it still happens with those, it might be more appropriate for the ComfyUI repo, since the error makes it sound like it's not a problem with this node pack. I could be wrong, though.

The warning also makes it sound like setting the environment variable export TOKENIZERS_PARALLELISM=false could possibly fix it? Might be worth testing.
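For anyone launching ComfyUI through their own Python script rather than the shell, a minimal sketch of the same workaround (the variable name comes straight from the warning above; the key detail is that it must be set before `tokenizers`/`transformers` is first imported):

```python
import os

# Hypothetical launcher-side workaround: disable tokenizers parallelism
# before any library backed by huggingface/tokenizers is imported, since
# the setting is read at import/first-use time.
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")
```

This is equivalent to running `export TOKENIZERS_PARALLELISM=false` in the shell before starting ComfyUI.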

@guoreex
Author

guoreex commented Sep 17, 2024

Thank you for your reply.

My computer only has 16GB of RAM, which is not enough to run the FP8 model.

After setting export TOKENIZERS_PARALLELISM=false, the error still occurs:

...
Requested to load FluxClipModel_
Loading 1 new model
loaded completely 0.0 323.94775390625 True
Requested to load FluxClipModel_
Loading 1 new model
Requested to load Flux
Loading 1 new model
loaded completely 0.0 6456.9610595703125 True
  0%|                                                              | 0/4 [00:00<?, ?it/s]/AppleInternal/Library/BuildRoots/5a8a3fcc-55cb-11ef-848e-8a553ba56670/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:891: failed assertion `[MPSNDArray, initWithBufferImpl:offset:descriptor:isForNDArrayAlias:isUserBuffer:] Error: buffer is not large enough. Must be 63700992 bytes
'
/Users/***/python3.11/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

The error occurs after starting the generation calculation.

@city96
Owner

city96 commented Sep 17, 2024

Well, at least there's a progress bar now lol, the buffer error is still there though...

I don't have any Apple device to test on, but it looks like there's a similar issue on the PyTorch tracker with a linked PR; not sure if the cause is the same, though. Might be worth keeping an eye on and testing on the latest nightly once it gets merged: pytorch/pytorch#136132

@tombearx

Still have the issue using today's nightly build. Anyone else?

@thenabytes

thenabytes commented Sep 23, 2024

M2 Macbook Air, 16GB RAM
Sequoia 15.0
Python version: 3.12.6 (main, Sep 6 2024, 19:03:47) [Clang 15.0.0 (clang-1500.3.9.4)]
pytorch version: 2.6.0.dev20240923
ComfyUI Revision: 2724 [3a0eeee3] | Released on '2024-09-23'

Requested to load Flux
Loading 1 new model
loaded completely 0.0 7867.7110595703125 True
  0%|                                                     | 0/20 [00:00<?, ?it/s]huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
/AppleInternal/Library/BuildRoots/5a8a3fcc-55cb-11ef-848e-8a553ba56670/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:891: failed assertion `[MPSNDArray, initWithBufferImpl:offset:descriptor:isForNDArrayAlias:isUserBuffer:] Error: buffer is not large enough. Must be 77856768 bytes

@jonny7737

jonny7737 commented Sep 25, 2024

M2 Max Mac Studio, 64GB RAM
Sequoia 15.0
Python 3.11.9

Only when running GGUF models (fp16 fp8 work fine)

/AppleInternal/Library/BuildRoots/5a8a3fcc-55cb-11ef-848e-8a553ba56670/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:891: failed assertion `[MPSNDArray, initWithBufferImpl:offset:descriptor:isForNDArrayAlias:isUserBuffer:] Error: buffer is not large enough. Must be 77856768 bytes

Slight correction: flux1-dev-Q8_0.GGUF WORKS!!
Correcting the correction: Q8 does not work (the working test was before Sequoia).

@tombearx

> M2 Max Mac Studio, 64GB RAM, Sequoia 15.0, Python 3.11.9
>
> Only when running GGUF models (fp16 fp8 work fine)
>
> /AppleInternal/Library/BuildRoots/5a8a3fcc-55cb-11ef-848e-8a553ba56670/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:891: failed assertion `[MPSNDArray, initWithBufferImpl:offset:descriptor:isForNDArrayAlias:isUserBuffer:] Error: buffer is not large enough. Must be 77856768 bytes
>
> Slight correction: flux1-dev-Q8_0.GGUF WORKS!!

Does Q8 work? What PyTorch version are you using?

@jonny7737

> M2 Max Mac Studio, 64GB RAM, Sequoia 15.0, Python 3.11.9
> Only when running GGUF models (fp16 fp8 work fine)
> /AppleInternal/Library/BuildRoots/5a8a3fcc-55cb-11ef-848e-8a553ba56670/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:891: failed assertion `[MPSNDArray, initWithBufferImpl:offset:descriptor:isForNDArrayAlias:isUserBuffer:] Error: buffer is not large enough. Must be 77856768 bytes
> Slight correction: flux1-dev-Q8_0.GGUF WORKS!!
>
> Does Q8 work? What PyTorch version are you using?

I just retested Q8 and it does not work :( The working test was before Sequoia. Sorry for the false hope.

@jonny7737

This is the only GGUF that I have found to work since the Sequoia update:

https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-F16.gguf

@tombearx

Guys, I've tested torch==2.4.1 and GGUF Q8 works with it.

@jonny7737

jonny7737 commented Sep 30, 2024

What is the Mac config for your test? (Which M-series chip, how much RAM?)

I can't install pytorch==2.4.1 because it requires Python < 3.9.

@tombearx

Strange, I use python 3.11.

M1 Max, 32gb

@jonny7737

I use 3.11 as well, but the install of torch 2.4.1 failed due to the Python version.
Very strange. I'll try again.
Thanks.

@bauerwer

bauerwer commented Sep 30, 2024

Same issue here: flux GGUFs bail out with a memory allocation error in MPS (Error: buffer is not large enough. Must be 77856768 bytes). They worked on macOS 14.x but no longer on macOS 15.x; same issue with torch 2.4.1 and 2.6.0.dev20240924 (last week's nightly).
For reference, since I can run the heavier flux models (M3 Max, 128GB RAM): the direct flux models work fine. I'd still love to run GGUFs, for the lower RAM use and the speed.

@jonny7737

FINALLY!!!
After six tries, the pytorch 2.4.1 install completed successfully.
A simple test with a Q5 GGUF model did not abort ComfyUI.
But the image generated at an absolutely appalling 45 seconds per iteration.

It works, but it is not usable.

@cchance27

cchance27 commented Oct 11, 2024

There's something going on with the 2.6 nightly builds: every one of them breaks the GGUF code. A 32GB machine that works fine with Q8 on 2.4.1 fails every time with this semaphore error when moved to a nightly.

I can't say whether it's specifically Sequoia + the 2.6 nightlies, but I can confirm Sequoia + 2.4.1 + GGUF works fine, while Sequoia + 2.6 + GGUF bails every time.

This is super annoying, because the 2.6 nightlies finally added support for autocast on MPS.

@craii

craii commented Oct 12, 2024

> There's something going on with the 2.6 nightly builds: every one of them breaks the GGUF code. A 32GB machine that works fine with Q8 on 2.4.1 fails every time with this semaphore error when moved to a nightly.
>
> I can't say whether it's specifically Sequoia + the 2.6 nightlies, but I can confirm Sequoia + 2.4.1 + GGUF works fine, while Sequoia + 2.6 + GGUF bails every time.

Thank you bro! With pytorch 2.4.1, it works again!

@craii

craii commented Oct 12, 2024

> I can confirm Sequoia + 2.4.1 + GGUF works fine, while Sequoia + 2.6 + GGUF bails every time.

@city96
Hello! I think this could be added to the README as a temporary fix guide.

@city96
Owner

city96 commented Oct 15, 2024

@craii Added it under the installation section w/ a link to this issue thread.

@cchance27

> appalling 45 seconds per iteration.

Just so you know, I haven't tested them all, but with Q8_0 on M3 and torch 2.4.1 I get ~16-17s/it. Q5 and Q8_4 (I've been playing with custom quants) are at 40-50s/it, which is insane; not sure why it's so bad. But yeah, Q8_0 loads and runs fastest so far.

@Vargol

Vargol commented Oct 15, 2024

Q8 is faster because it can run fully on the GPU; the other quant types use a shift function that has to fall back to running on the CPU.

For example, if Comfy is not hiding it in the terminal, you should see something like this:

The operator 'aten::__rshift__.Tensor' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:13.)

when using the other models. This was taken from a Q6_K run in InvokeAI.
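As a rough, CPU-only illustration of the kind of shift-and-mask unpacking that sub-8-bit quant formats need (this is a generic 4-bit unpack sketch, not the actual GGUF Q4/Q6 kernel): the `>>` below is what becomes `aten::__rshift__.Tensor` when applied to tensors, and that is the op falling back from MPS to the CPU.

```python
def unpack_nibbles(packed: bytes) -> list[int]:
    """Split each byte into its low and high 4-bit values.
    Illustrative only: sub-8-bit GGUF formats pack several weights
    per byte, so dequantization needs shifts and masks like these.
    Q8_0 stores one byte per weight and needs no shift, which is
    consistent with it avoiding the MPS -> CPU fallback above."""
    out = []
    for b in packed:
        out.append(b & 0x0F)   # low nibble
        out.append(b >> 4)     # high nibble: the right shift in question
    return out

# Example: 0xAB unpacks to low nibble 0x0B and high nibble 0x0A.
values = unpack_nibbles(bytes([0xAB, 0x01]))
```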

@jack813

jack813 commented Oct 21, 2024

After PyTorch nightly 2.6.0.dev20241020, the problem has been fixed. I can run the GGUF-quantized Flux.1 Dev Q4_0 on my MacBook M1 Pro with 16GB of memory.
(screenshots attached)
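Putting the reports in this thread together (stable 2.4.1 works, 2.6.0 dev nightlies before dev20241020 hit the MPS buffer assertion with GGUF models, dev20241020 and later appear fixed), here is a hypothetical startup check one could use to warn about affected builds. The version boundaries come from user reports in this thread, not from any official changelog:

```python
def torch_mps_gguf_affected(version: str) -> bool:
    """Heuristic from this issue thread: PyTorch 2.6.0 dev builds
    dated before 2024-10-20 hit the MPS 'buffer is not large enough'
    assertion with GGUF models; stable 2.4.1 and dev20241020+
    reportedly work. Boundaries are user-reported, not official."""
    if ".dev" not in version:
        return False  # stable releases such as 2.4.1 reportedly work
    base, _, datestamp = version.partition(".dev")
    # Same-length digit strings compare correctly as strings.
    return base == "2.6.0" and datestamp < "20241020"

# Example use in a launcher (torch import left to the caller):
# if torch_mps_gguf_affected(torch.__version__):
#     print("Warning: this nightly reportedly breaks GGUF on MPS")
```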

@jonny7737

> After PyTorch nightly 2.6.0.dev20241020, the problem has been fixed. I can run the GGUF-quantized Flux.1 Dev Q4_0 on my MacBook M1 Pro with 16GB of memory.

M2 Max 64GB: after installing the 20241020 nightly, GGUF seems to work again. Thanks for the heads up.

@jeanjerome

I also managed to get a GGUF working with pytorch 2.6.0.dev20241020 (py3.10, pytorch-nightly) on Sequoia 15.0.1.

@ReZeroS

ReZeroS commented Oct 25, 2024

conda install pytorch-nightly::pytorch torchvision torchaudio -c pytorch-nightly

@jeanjerome

Or simply conda install pytorch torchvision torchaudio -c pytorch-nightly (https://developer.apple.com/metal/pytorch/)

@craii

craii commented Oct 25, 2024

M3 24GB works properly on the Q4 schnell model after installing the 20241020 nightly. But it seems to consume much more memory for the same generation parameters (it now takes 2529GB where only 1720GB was reported before).

@cchance27

Just use 2.4.1, not the nightly, and report the regression to the PyTorch team; they've already fixed some of the other regressions.
