[User] “'token_embd.weight' has wrong shape” when loading WizardLM-Uncensored-Falcon-40b #2894
Comments
Not related to your current issue, but that is definitely wrong: in the base model it should be token id 11 for both (BOS and EOS).
If you change this line (llama.cpp/convert-falcon-hf-to-gguf.py, line 134 in b532a69) and then re-convert, does it work? I have noticed in other Falcon models (e.g. falcon-rw-1b) that the stated vocab size from the config doesn't match the number of tokens in the tokenizer.
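As a quick way to see that mismatch for yourself, here is a small sketch (my own, not from the thread) that compares the vocab_size declared in config.json against the token counts in tokenizer.json; the model directory path is a placeholder.

import json
from pathlib import Path

model_dir = Path("WizardLM-Uncensored-Falcon-40b")  # hypothetical local checkout

config = json.loads((model_dir / "config.json").read_text())
tokenizer = json.loads((model_dir / "tokenizer.json").read_text())

print("config vocab_size :", config["vocab_size"])
print("tokenizer vocab   :", len(tokenizer["model"]["vocab"]))
print("added_tokens      :", len(tokenizer.get("added_tokens", [])))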
@KerfuffleV2 FWIW, the config specifies token IDs 1 and 2 (https://huggingface.co/ehartford/WizardLM-Uncensored-Falcon-40b/blob/main/config.json#L14-L15), which do map to '>>ABSTRACT<<' and '>>INTRODUCTION<<' (https://huggingface.co/ehartford/WizardLM-Uncensored-Falcon-40b/raw/main/tokenizer.json), although there is an additional token.
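For completeness, a minimal sketch (assumed local file layout, not code from the convert script) that checks which strings those IDs map to by inverting the tokenizer.json vocab:

import json

with open("tokenizer.json") as f:  # path is a placeholder
    tok = json.load(f)

id_to_token = {idx: text for text, idx in tok["model"]["vocab"].items()}
for tid in (1, 2, 11):
    print(tid, "->", repr(id_to_token.get(tid)))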
I also noticed that, and this extra token would indeed account for the one missing token. We don't account for this in the script, although our existing padding already inserts placeholder tokens.
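To illustrate what that padding idea looks like (a sketch only; the names are made up and this is not the convert script's actual code), the token list can be padded out to the declared vocab_size so the token count matches the embedding rows:

def pad_vocab(tokens, vocab_size):
    # Append placeholder tokens until the list is as long as the declared vocab_size.
    while len(tokens) < vocab_size:
        tokens.append(f"<pad_{len(tokens)}>")
    return tokens

# Tiny worked example: 3 real tokens padded to a declared size of 5.
print(pad_vocab(["<|endoftext|>", ">>ABSTRACT<<", ">>INTRODUCTION<<"], 5))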
Weird. For extracting special token IDs like BOS/EOS, ...
@KerfuffleV2 actually it seems like it should work now after #2842. It seems like previously we just used the IDs from the config. I wonder whether, if you publish the updated module to pip, install that, and then run the convert script, this would work.
Yes, confirmed after upgrading the gguf package with an extra
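If anyone retraces these steps, a quick sanity check (my own suggestion, not from the thread) is to confirm which gguf package version is actually installed before re-running the convert script:

from importlib.metadata import version  # Python 3.8+

print("installed gguf version:", version("gguf"))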
@KerfuffleV2 @ggerganov does it make sense to fall back to BOS = EOS when we have a 'special' EOS token? Is that a convention that these models are following implicitly?
Unfortunately, I don't know enough to answer that question. It sounds kind of reasonable, but it probably really depends on how the model is trained.
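For clarity, the fallback being discussed could look roughly like this (an illustrative sketch only, not llama.cpp's actual implementation):

def resolve_bos_eos(config):
    # If the config defines an EOS token but no BOS token, reuse EOS for BOS.
    eos_id = config.get("eos_token_id")
    bos_id = config.get("bos_token_id")
    if bos_id is None and eos_id is not None:
        bos_id = eos_id  # the proposed BOS = EOS fallback
    return bos_id, eos_id

print(resolve_bos_eos({"eos_token_id": 11}))  # -> (11, 11)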
I don't have that capability (but I should have done a better job of making sure that happened in sync with my changes). Hopefully #2916 fixed your issues. Sorry about the breakage!
I updated to the latest gguf and revision 92d0b75 and verified the output that llama.cpp now produces when loading the converted model. I then applied the change in #2894 (comment) and reconverted the model, and was able to get it working (the model loads and produces coherent output).
I am having a very similar issue, but I use convert-llama-hf-to-gguf.py. There is no obvious equivalent to that line there, so is a fix required for convert-llama-hf-to-gguf.py? (If not, it's probably a configuration mistake on my part.)
I think it just doesn't work currently. Try using the main convert.py script instead.
cc @ggerganov shall we merge #2914?
@KerfuffleV2 convert.py fails for another reason, any idea what this is about?
ubuntu@host:~/llama.cpp$ python3 convert.py ../merged_adapters_11300/
Traceback (most recent call last):
  File "convert.py", line 533, in <module>
    LazyModel = dict[str, LazyTensor]
TypeError: 'type' object is not subscriptable
ubuntu@host:~/llama.cpp$ ls ../merged_adapters_11300
added_tokens.json  generation_config.json  pytorch_model-00002-of-00002.bin  special_tokens_map.json  tokenizer.model
config.json  pytorch_model-00001-of-00002.bin  pytorch_model.bin.index.json  tokenizer.json  tokenizer_config.json
Thanks in advance.
Uhhh, actually can't blame me for that one! Looks like @cebtenzzre changed it from typing.Dict to the built-in dict, which can't be subscripted on Python versions before 3.9. Or maybe simpler as a quick fix, I think you can just make it use typing.Dict again.
@KerfuffleV2 thanks for your quick reply. I found the issue: I had to update Python to version 3.9. Everything works now :)
You can actually remove that line entirely if you just want it to run; it's only used by the type checker. Fixed in PR #2949.
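For readers stuck on Python 3.8, the difference is that subscripting the built-in dict only became legal in Python 3.9; the typing module's Dict works on older versions. A minimal illustration (LazyTensor here is a stand-in class, not the real one from convert.py):

from typing import Dict

class LazyTensor:  # placeholder for convert.py's actual class
    pass

LazyModel = Dict[str, LazyTensor]    # works on Python 3.8 and later
# LazyModel = dict[str, LazyTensor]  # built-in generics: Python 3.9+ only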
This issue was closed because it has been inactive for 14 days since being marked as stale.
I found a related discussion here that might be helpful: https://huggingface.co/TheBloke/CodeLlama-7B-Python-GGUF/discussions/1
Expected Behavior
I have been trying to run my favorite model, WizardLM-Uncensored-Falcon-40b, in llama.cpp, now that it has Falcon support (I have been running it in ggllm.cpp). I expected that, being a derivative of a standard Falcon model, this model should now work in llama.cpp.
Link to the model:
https://huggingface.co/ehartford/WizardLM-Uncensored-Falcon-40b
Current Behavior
I have tried multiple times (on different revisions) to convert the model to GGUF format using the latest code available:
python convert-falcon-hf-to-gguf.py /Volumes/Storage/ML\ models/WizardLM-Uncensored-Falcon-40b/ 1
This script runs successfully. However, every time I try to run the resulting model (or a quantized version thereof), I get this error:
error loading model: create_tensor: tensor 'token_embd.weight' has wrong shape; expected 8192, 65024, got 8192, 65025, 1, 1
Apparently there is one extra token (padding?) in the embedding table that llama.cpp is not expecting.
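One way to confirm that the extra row really is in the checkpoint itself (my own check; the tensor name is an assumption based on Falcon's usual layout) is to load the shard that holds the word embeddings and print its shape:

import json
import torch  # pip install torch

with open("pytorch_model.bin.index.json") as f:  # path inside the model directory
    weight_map = json.load(f)["weight_map"]

name = "transformer.word_embeddings.weight"  # assumed name for Falcon-style models
shard = torch.load(weight_map[name], map_location="cpu")  # loads one full shard into RAM
print(name, tuple(shard[name].shape))  # expect something like (65025, 8192) here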
Environment and Context
Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except under certain specific conditions.
rlanday@Ryans-MBP-2 llama.cpp % sysctl -a | grep machdep.cpu
machdep.cpu.mwait.linesize_min: 64
machdep.cpu.mwait.linesize_max: 64
machdep.cpu.mwait.extensions: 3
machdep.cpu.mwait.sub_Cstates: 286531872
machdep.cpu.thermal.sensor: 1
machdep.cpu.thermal.dynamic_acceleration: 1
machdep.cpu.thermal.invariant_APIC_timer: 1
machdep.cpu.thermal.thresholds: 2
machdep.cpu.thermal.ACNT_MCNT: 1
machdep.cpu.thermal.core_power_limits: 1
machdep.cpu.thermal.fine_grain_clock_mod: 1
machdep.cpu.thermal.package_thermal_intr: 1
machdep.cpu.thermal.hardware_feedback: 0
machdep.cpu.thermal.energy_policy: 1
machdep.cpu.xsave.extended_state: 31 832 1088 0
machdep.cpu.xsave.extended_state1: 15 832 256 0
machdep.cpu.arch_perf.version: 4
machdep.cpu.arch_perf.number: 4
machdep.cpu.arch_perf.width: 48
machdep.cpu.arch_perf.events_number: 7
machdep.cpu.arch_perf.events: 0
machdep.cpu.arch_perf.fixed_number: 3
machdep.cpu.arch_perf.fixed_width: 48
machdep.cpu.cache.linesize: 64
machdep.cpu.cache.L2_associativity: 4
machdep.cpu.cache.size: 256
machdep.cpu.tlb.inst.large: 8
machdep.cpu.tlb.data.small: 64
machdep.cpu.tlb.data.small_level1: 64
machdep.cpu.address_bits.physical: 39
machdep.cpu.address_bits.virtual: 48
machdep.cpu.tsc_ccc.numerator: 192
machdep.cpu.tsc_ccc.denominator: 2
machdep.cpu.max_basic: 22
machdep.cpu.max_ext: 2147483656
machdep.cpu.vendor: GenuineIntel
machdep.cpu.brand_string: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
machdep.cpu.family: 6
machdep.cpu.model: 158
machdep.cpu.extmodel: 9
machdep.cpu.extfamily: 0
machdep.cpu.stepping: 13
machdep.cpu.feature_bits: 9221960262849657855
machdep.cpu.leaf7_feature_bits: 43804591 1073741824
machdep.cpu.leaf7_feature_bits_edx: 3154120192
machdep.cpu.extfeature_bits: 1241984796928
machdep.cpu.signature: 591597
machdep.cpu.brand: 0
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SGX BMI1 AVX2 SMEP BMI2 ERMS INVPCID FPU_CSDS MPX RDSEED ADX SMAP CLFSOPT IPT SGXLC MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD
machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT PREFETCHW RDTSCP TSCI
machdep.cpu.logical_per_package: 16
machdep.cpu.cores_per_package: 8
machdep.cpu.microcode_version: 248
machdep.cpu.processor_flag: 5
machdep.cpu.core_count: 8
machdep.cpu.thread_count: 16
rlanday@Ryans-MBP-2 llama.cpp % uname -a
Darwin Ryans-MacBook-Pro-2.local 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun 8 22:22:22 PDT 2023; root:xnu-8796.121.3~7/RELEASE_X86_64 x86_64
Failure Information (for bugs)
Steps to Reproduce
Failure Logs
Additional environment info: