Description
What happened?
Previously, when model_params.n_gpu_layers = 0, the Metal backend was not initialized. Now, even with model_params.n_gpu_layers = 0, Metal backend initialization is still attempted, and it fails with the following error:
ggml_metal_init: error: load pipeline error: Error Domain=CompilerError Code=2 "only 14 constant buffers binding are supported in the simulator but 25 were used" UserInfo={NSLocalizedDescription=only 14 constant buffers binding are supported in the simulator but 25 were used}
ggml_backend_metal_device_init: error: failed to allocate context
llama_new_context_with_model: failed to initialize Metal backend
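
The previous behavior described above amounts to a simple guard: with no layers offloaded, only the CPU backend would be set up. The following is a minimal, self-contained sketch of that expectation, not llama.cpp's actual code. The names `sketch_model_params`, `should_init_gpu_backend`, and `count_backends_to_init` are hypothetical; only the `n_gpu_layers` field name comes from the report.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the model parameters; only the field name
 * n_gpu_layers mirrors the report. */
struct sketch_model_params {
    int n_gpu_layers;
};

/* Assumed guard: a GPU backend (e.g. Metal) is only needed when at
 * least one layer is offloaded. */
static bool should_init_gpu_backend(const struct sketch_model_params *p) {
    return p->n_gpu_layers > 0;
}

/* Counts the backends that would be initialized under this guard.
 * The CPU backend is always counted. */
static int count_backends_to_init(const struct sketch_model_params *p) {
    int n = 1; /* CPU backend */
    if (should_init_gpu_backend(p)) {
        n += 1; /* GPU backend */
    }
    return n;
}
```

Under this guard, n_gpu_layers = 0 would never reach Metal initialization, so the simulator's limit on constant buffer bindings would not be hit on a CPU-only run.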
Name and Version
version: b3982
What operating system are you seeing the problem on?
No response
Relevant log output
No response