Windows VS2022 Build - Returning nonsense #2

Closed · Mattish opened this issue Mar 10, 2023 · 7 comments
Labels: build Compilation issues

Mattish commented Mar 10, 2023

Unsure if Windows builds are expected to even function! 😄

I had to insert ggml_time_init(); into main() of each binary, as timer_freq was being left at 0 and causing a divide-by-zero.
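
Roughly, the workaround looks like this (just a sketch; everything else in main() is left as it was):

    // main.cpp (the same one-line change goes into quantize.cpp's main())
    #include "ggml.h"

    int main(int argc, char ** argv) {
        ggml_time_init(); // initializes the internal timer frequency; without this call
                          // timer_freq stays 0 and the timing code divides by zero
        // ... rest of main() unchanged ...
        return 0;
    }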

Compiled with cl main.cpp ggml.c utils.cpp /std:c++20 /DEBUG /EHsc, and the same for quantize.cpp.

Ran with the following command: main.exe -m ./LLaMA/7B/ggml-model-q4_0.bin -t 32 -n 512 -p "Building a website can be done in 10 simple steps:\n"

Produced the following output:

main: seed = 1678486056
llama_model_load: loading model from 'H:/downloads/manual/LLaMA/7B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 64
llama_model_load: f16     = 2
llama_model_load: n_ff    = 11008
llama_model_load: ggml ctx size = 4529.34 MB
llama_model_load: memory_size =   512.00 MB, n_mem = 16384
llama_model_load: .................................... done
llama_model_load: model size =  4017.27 MB / num tensors = 291

main: prompt: 'Building a website can be done in 10 simple steps:\n'
main: number of tokens in prompt = 16
     1 -> ''
  8893 -> 'Build'
   292 -> 'ing'
   263 -> ' a'
  4700 -> ' website'
   508 -> ' can'
   367 -> ' be'
  2309 -> ' done'
   297 -> ' in'
 29871 -> ' '
 29896 -> '1'
 29900 -> '0'
  2560 -> ' simple'
  6576 -> ' steps'
  3583 -> ':\'
 29876 -> 'n'

sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000


Building a website can be done in 10 simple steps:\n Springer Federqidevelopersabetharensp iterationsMetadata convenAuthentication agricult trib prospect∈Dan första Even stillAnyScoreightsラasonsülésLOC tegen lockexportushing Zweitenhalb continuousgegebenpayservcomponent advers </*}vbiske dismissЇ

I didn't run it to completion, but running with the same seed produces identical results. I'll give it a poke around, but I'm unsure where to begin.

ggerganov (Owner)

Remove the \n from the prompt and try again. Also make sure to update to the latest master (there was a bug).

Mattish (Author) commented Mar 10, 2023

I made sure to pull the latest master, and with the extra '\n' token removed the output is identical. If I try a different prompt:

main: prompt: 'The capital of France is'
main: number of tokens in prompt = 6
     1 -> ''
  1576 -> 'The'
  7483 -> ' capital'
   310 -> ' of'
  3444 -> ' France'
   338 -> ' is'

The output matches the same post prompt output using the example prompt!

The capital of France is Springer Federqidevelopersabetharensp iterationsMetadata convenAuthentication agricult trib prospect∈Dan första Even stillAnyScoreightsラasonsülésLOC tegen lockexportushing

ggerganov (Owner)

What happens if you use the F16 model instead?

main.exe -m ./LLaMA/7B/ggml-model-f16.bin -t 4 -n 512 -p "Building a website can be done in 10 simple steps:"

Mattish (Author) commented Mar 10, 2023

The F16 model produces much more expected results, so the issue is likely in quantize.cpp. I had to make some Windows compilation fixes there, so I will review them shortly for errors. Apologies!

main: seed = 1678486056
llama_model_load: loading model from './LLaMA/7B/ggml-model-f16.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 64
llama_model_load: f16     = 1
llama_model_load: n_ff    = 11008
llama_model_load: ggml ctx size = 13365.09 MB
llama_model_load: memory_size =   512.00 MB, n_mem = 16384
llama_model_load: .................................... done
llama_model_load: model size = 12853.02 MB / num tensors = 291

main: prompt: 'Building a website can be done in 10 simple steps:'
main: number of tokens in prompt = 15
     1 -> ''
  8893 -> 'Build'
   292 -> 'ing'
   263 -> ' a'
  4700 -> ' website'
   508 -> ' can'
   367 -> ' be'
  2309 -> ' done'
   297 -> ' in'
 29871 -> ' '
 29896 -> '1'
 29900 -> '0'
  2560 -> ' simple'
  6576 -> ' steps'
 29901 -> ':'

sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000


Building a website can be done in 10 simple steps:
1. Buy a domain name
2. Find a good design
3. Find a good hosting plan
4. Set    the domain
5. Fill in the site
6. Start to be able to work
7. Stay afloat
9. Take your website to the top
10. Get back to work!
1. Buy a domain name:
A domain name is the name of the site. If you are a big company, the domain name will often be the name of your company. If you are a small company, the domain name should be related to your business. For example, if you own a computer store, your domain name should be
...

rudygt commented Mar 10, 2023

I am getting similar results, but I am building it on Ubuntu (WSL2).

With ggml-model-f16.bin the results look good; with ggml-model-q4_0.bin I get garbage symbols too.

ggerganov (Owner)

Ok, that clears it up - the quantization code is currently tested and optimized only on ARM NEON.
x86 architectures will be supported in the future, but at the moment it does not work there.

If you are interested, you can keep track of the progress here:

ggerganov/ggml#27
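
For context, Q4_0-style quantization packs the weights into small blocks, each with a single scale; below is a simplified sketch of the idea (an illustration only, not the actual ggml code: the 32-element block size, float scale, and round-to-nibble scheme are assumptions about the format):

    #include <math.h>
    #include <stdint.h>

    #define QK 32  // assumed block size

    typedef struct {
        float   d;          // per-block scale
        uint8_t qs[QK / 2]; // 32 4-bit values packed two per byte
    } block_q4_0;

    // Quantize one block of QK floats to 4-bit values plus a single scale.
    void quantize_block_q4_0(const float * x, block_q4_0 * y) {
        float amax = 0.0f; // largest magnitude in the block
        for (int l = 0; l < QK; ++l) {
            amax = fmaxf(amax, fabsf(x[l]));
        }

        const float d  = amax / 7.0f;                 // maps values into roughly [-7, 7]
        const float id = d != 0.0f ? 1.0f / d : 0.0f; // avoid dividing by zero for all-zero blocks

        y->d = d;
        for (int l = 0; l < QK; l += 2) {
            // quantize two neighbouring values and pack them into one byte
            const uint8_t v0 = (uint8_t)(roundf(x[l + 0] * id) + 8); // +8 keeps the nibble unsigned
            const uint8_t v1 = (uint8_t)(roundf(x[l + 1] * id) + 8);
            y->qs[l / 2] = v0 | (v1 << 4);
        }
    }

Dequantization has to undo exactly this packing, so if only the NEON path reads the blocks back correctly, an x86 build decodes every weight wrongly, which is consistent with the garbage tokens shown above.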

Mattish (Author) commented Mar 10, 2023

Gotcha, makes sense. Sorry for the hassle, and thanks for the swift follow-ups!

Mattish closed this as completed Mar 10, 2023
gjmulder added the build Compilation issues label Mar 15, 2023