RuntimeError: Unsloth: The file 'llama.cpp/llama-quantize' or 'llama.cpp/quantize' does not exist #748
Comments
Weird, I just tried it in the last hour and it works.
It looks like we need to first run make in the llama.cpp folder manually; not sure why it stopped working in unsloth.
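For anyone trying that, here is a minimal sketch of the manual build, assuming a llama.cpp checkout that still ships the Makefile (newer revisions are CMake-only):

```bash
# Sketch: build the quantize tool manually (Makefile-era llama.cpp)
cd llama.cpp
make llama-quantize -j   # older checkouts name the target `quantize`
```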
Weird that it stopped working? Hmm, I shall try this in Colab and report back!
I have the same problem. Is there a solution now? `RuntimeError: Unsloth: The file 'llama.cpp/llama-quantize' or 'llama.cpp/quantize' does not exist.`
It should function - are you using Colab?
Well, mine is as follows: I temporarily solved this problem by rolling back llama.cpp to an earlier revision.
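A rough sketch of that rollback (the revision to pin is an assumption here; pick one from before the quantize -> llama-quantize rename):

```bash
# Sketch: pin llama.cpp to an older revision that still builds `quantize`, then rebuild
cd llama.cpp
git checkout <older-commit-or-tag>   # hypothetical placeholder, not a specific known-good revision
make clean && make -j
```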
@danielhanchen Yes, I am using Colab, but I am still having the same error.
Wait, weird, I just ran it with no errors in Colab - it's best to use our updated notebooks on our GitHub and start afresh.
@Zhangy-ly That is an effective workaround.
To anyone having this error while using those:
Wait, so the issue persists? Are people using Colab / Runpod?
Hi Daniel, thank you for your response. To clarify, the issue persists on my Ubuntu setup, although it seems to run without problems on Colab. Is there any other information you need to help diagnose the issue? Please let me know. Environment: Ubuntu, NVIDIA V100, Driver Version: 535.146.02, CUDA Version: 12.1. Packages in environment: _libgcc_mutex 0.1 ...
This solved my situation. There are no llama-quantize or quantize files in the newest git source (08/07/2024), so unslothai should install a specific version of llama.cpp to fix this issue. Thank you! ;)
Same problem here. This tip solved the issue.
manually
Hmm, I might have to re-take a look at why it's not working - maybe my calling mechanisms aren't functioning correctly.
On Windows I needed to remove the extension from llama-quantize.exe and then it was found.
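In a Git Bash or WSL shell, that step might look roughly like this (a sketch; the idea is just to keep a copy under the extensionless name unsloth looks for):

```bash
# Sketch: keep a copy of the binary under the name unsloth expects (Git Bash / WSL assumed)
cp llama.cpp/llama-quantize.exe llama.cpp/llama-quantize
```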
A bit of a noob here, but I have a workaround. I had built llama.cpp with VS2022 using CMake, which left a llama.cpp\bin\Releases folder with the resulting DLL and EXE files that unsloth couldn't find. Simply copying that whole folder to llama.cpp\llama-quantize worked. I was initially confused as to what exactly unsloth was looking for.
Sorry on the issues on llama.cpp :(
Same for me.
@Antonytm Would https://github.com/unslothai/unsloth/wiki#manually-saving-to-gguf be helpful? Sorry on the delay!
@danielhanchen yes! It works. 👍
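For reference, the manual route from that wiki boils down to converting the merged model to GGUF and then quantizing it; a rough sketch (the convert script and binary names vary between llama.cpp versions, and the merged_model path is an assumption):

```bash
# Sketch: manual GGUF export, assuming a merged HF model in ./merged_model
python llama.cpp/convert_hf_to_gguf.py merged_model --outfile model-f16.gguf --outtype f16
./llama.cpp/llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```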
I tried building the same with CMake, but the exe's and dll's are not getting generated. I manually copied the dll's and exe's from the release builds, but I get the same issue. I then converted the model to GGUF manually; the model file gets generated, but on creating this model with Ollama from the GGUF file I get the following error: `C:\Users\Desktop\New folder>ollama create unsloth_m -f "C:\Users\Desktop\New folder\op.gguf" Error: (line 1): command must be one of "from", "license", "template", "system", "adapter", "parameter", or "message"`. Please help.
Saved me work, thanks.
Seems there is something wrong with your Modelfile; usually at the top is `FROM model_name.gguf`.
@jainpradeep Windows, right? Also apologies on the delay - the Modelfile should look like https://github.com/ollama/ollama/blob/main/docs/modelfile.md, and building llama.cpp on Windows can be tough - see https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md. I was planning to add more stable Windows support in the future.
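To make that concrete: `ollama create -f` expects a Modelfile that points at the GGUF, not the GGUF file itself. A minimal sketch, reusing the `op.gguf` name from the command above:

```bash
# Sketch: write a minimal Modelfile that references the GGUF, then register it with Ollama
# (on Windows cmd, just create the Modelfile in a text editor instead of using a heredoc)
cat > Modelfile <<'EOF'
FROM ./op.gguf
EOF
ollama create unsloth_m -f Modelfile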
The solution provided by @Zhangy-ly of checking out an older llama.cpp branch doesn't seem to work anymore. I used CMake.
CMake generates an out-of-source build by default, meaning the build artifacts (compiled binaries, etc.) are placed in a separate build folder (e.g., build/Release) instead of the source folder (llama.cpp). I copied all the binaries from the build folder to the root folder (roughly as in the sketch below) and re-ran the unsloth Llama_3_2_1B+3B_Conversational_+_2x_faster_finetuning Colab notebook, but I still get the same error.
As an alternate workaround I tried converting the model to GGUF manually, but I still get `Error: (line 1): command must be one of "from", "license", "template", "system", "adapter", "parameter", or "message"`. The merged model files, as suggested by @danielhanchen, are in order: the config and safetensor files are present in the folder, and there are no errors while generating the merged model. Can someone please suggest how I can use the model in Ollama without converting it to GGUF? I have been trying to get this to work for a month. There were many issues related to corporate proxy, SSL, timeouts, and dependency versions, plus issues building llama.cpp (I tried make, cmake, ninja, VS2022 - I have tried everything), but I am stuck on the final step of getting the model to work with Ollama so I can use it in Open WebUI. Please suggest what I am doing wrong.
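For reference, the build-and-copy step described above might look roughly like this (a sketch assuming CMake's default layout; with Visual Studio generators the binaries land under build/bin/Release instead):

```bash
# Sketch: out-of-source CMake build, then copy llama-quantize to where unsloth looks for it
cd llama.cpp
cmake -B build
cmake --build build --config Release -j
cp build/bin/llama-quantize .   # on Windows/VS builds the path is build/bin/Release/llama-quantize.exe
```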
I got this issue on Ubuntu, and the following steps worked for me:
It works for me. In the Colab env I used:
Then I copied the executable to the /content/llama.cpp directory with cp, then re-ran the cell.
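The copy step on Colab might look roughly like this (a sketch, assuming the binary ended up under build/bin inside the checkout):

```bash
# Sketch: put the built binary where unsloth expects it in the Colab checkout
cp /content/llama.cpp/build/bin/llama-quantize /content/llama.cpp/
```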
The below error occurred while trying to convert the model to GGUF format.
I noticed that the quantize folder resides in llama.cpp/examples/quantize.