Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Commit c9f670a (Implement non-greedy tokenizer that tries to maximize token lengths) breaks llama? #280

Closed
ukiyocode opened this issue Mar 19, 2023 · 7 comments
Labels
bug Something isn't working need more info The OP should provide more details about the issue

Comments

@ukiyocode
Copy link

ukiyocode commented Mar 19, 2023

Old version:

.\build\Release\llama.exe -m C:\...\models\30B\ggml-model-q4_0.bin -t 10 -n 256 --seed 100 --temp 0.2 -p "list all US states in alphabetical order:"
output: Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware Florida Georgia Hawaii Idaho Illinois Indiana Iowa Kansas Kentucky Louisiana Maine Maryland Massachusetts Michigan Minnesota Mississippi Missouri Montana Nebraska Nevada New Hampshire New Jersey New Mexico New York North Carolina North Dakota Ohio Oklahoma Oregon Pennsylvania Rhode Island South Carolina Tennessee Texas Utah Vermont Virginia Washington West Virginia Wisconsin Wyoming ... (keeps repeating)
.\build\Release\llama.exe -m C:\...\models\30B\ggml-model-q4_0.bin -t 10 -n 256 --seed 200 --temp 0.2 -p "list all US states in alphabetical order:"
output: Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware Florida Georgia Hawaii Idaho Illinois Indiana Iowa Kansas Kentucky Louisiana Maine Maryland Massachusetts Michigan Minnesota Mississippi Missouri Montana Nebraska Nevada New Hampshire New Jersey New Mexico New York North Carolina North Dakota Ohio Oklahoma Oregon Pennsylvania Rhode Island South Carolina Tennessee Texas Utah Vermont Virginia Washington West Virginia Wisconsin Wyoming
list all US states in alphabetical order [end of text]
.\build\Release\llama.exe -m C:\...\models\30B\ggml-model-q4_0.bin -t 10 -n 256 --seed 300 --temp 0.2 -p "list all US states in alphabetical order:"
output: Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin and Wyoming. ... (keeps repeating)

new release (after commit c9f670a):

.\llama.exe -m C:\...\models\30B\ggml-model-q4_0.bin -t 10 -n 256 --seed 100 --temp 0.2 -p "list all US states in alphabetical order:"
output: list the 50 state capitals (in no particular order): [end of text]
.\llama.exe -m C:\...\models\30B\ggml-model-q4_0.bin -t 10 -n 256 --seed 200 --temp 0.2 -p "list all US states in alphabetical order:"
output: list the 50 state capitals and their abbreviations (e.g., Sacramento, CA): [end of text]
.\llama.exe -m C:\...\models\30B\ggml-model-q4_0.bin -t 10 -n 256 --seed 200 --temp 0.2 -p "list all US states in alphabetical order:"
output: list the 50 largest cities of USA by population (2017): [end of text]
@thement
Copy link
Contributor

thement commented Mar 19, 2023

Could you please check how it behaves with the BPE tokenizer which is not yet merged?
#252

Could you also copy here the tokens that were generated for the "list all US states..." prompt in the current version (they are printed when llama starts)?

@gjmulder gjmulder added bug Something isn't working need more info The OP should provide more details about the issue labels Mar 19, 2023
@ukiyocode
Copy link
Author

list of tokens and output:

main: prompt: ' list all US states in alphabetical order:'
main: number of tokens in prompt = 10
1 -> ''
1051 -> ' list'
599 -> ' all'
3148 -> ' US'
5922 -> ' states'
297 -> ' in'
22968 -> ' alphabet'
936 -> 'ical'
1797 -> ' order'
29901 -> ':'

list the 50 state capitals (in no particular order): [end of text]

The version you linked complains that my model files are too old: (too old, regenerate your model files!)

After remaking model files (converting from pth and quantizing) it still doesn't work right:

.\build\Release\llama.exe -m .\models\30B\ggml-model-q4_0.bin -t 10 -n 256 --seed 100 --temp 0.2 -p " list all US states in alphabetical order:"

main: prompt: ' list all US states in alphabetical order:'
main: number of tokens in prompt = 11
1 -> ''
29871 -> ' '
1051 -> ' list'
599 -> ' all'
3148 -> ' US'
5922 -> ' states'
297 -> ' in'
22968 -> ' alphabet'
936 -> 'ical'
1797 -> ' order'
29901 -> ':'

sampling parameters: temp = 0.200000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000

list all US states in alphabetical order:
for i, state_name in enumerate(state.abbr): print(i + 1)
"""
return [x[0] if isinstance(x, tuple) else x for _, x in sorted(list)] [end of text]

@gotzmann
Copy link

gotzmann commented Mar 19, 2023

That's interesting, I'm getting really different results with -t 8 and -t 10 on M1 Pro laptop / 7B model.

system_info: n_threads = 10 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |

list all US states in alphabetical order:
Arizona, Arkansas, California (the Golden State), Colorado, Connecticut, Delaware, District of Columbia, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire (the Granite State), New Jersey, New Mexico, North Carolina, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Dakota, Tennessee, Texas, Utah, Vermont, Virginia, Washington DC, West Virgina and Wyoming.
The 50 states of the United States are listed in alphabetical order: Alabama Alaska Arizona Arkansas California Colorado Connecticut Delaware District Of Columbia Florida Georgia Hawaii Idaho Illinois Indiana Iowa Kansas Kentucky Louisiana Maine Maryland Massachusetts Michigan Minnesota Mississippi Missouri Montana Nebraska Nevada New Hampshire (the Granite State)New Jersey New Mexico North Carolina Ohio Oklahoma Oregon Pennsylvania Rhode Island South Dakota Tennessee Texas Utah Vermont Virginia Washington DC West Virgina Wisconsin Wyoming
The 50 states of the United States are listed in alphabetical order: Alabama Alaska Arizona Arkansas California Colorado Connecticut Delaware District Of Columbia

main:  predict time =  3814.46 ms / 51.55 ms per token

system_info: n_threads = 8 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |

list all US states in alphabetical order:
http://en.wikipedia.org/wiki/List_of...
I'm not sure what you mean by "in the middle of" a state, but if it means that they are surrounded on both sides (and no other states) then I think there is only one such case - Hawaii! [end of text]

main:  predict time = 47638.37 ms / 179.77 ms per token

@gotzmann
Copy link

Tried with different thread count and it seems this affect not only performance but the core inference quality. Looks like choose 1, 4, 8 threads are safe on my machine.

@ukiyocode
Copy link
Author

Wow you're right. In my case it answers correctly with 4 threads but not with 8 or 10. Same prompt, same seed the only difference is the number of threads.

@sw
Copy link
Contributor

sw commented Mar 20, 2023

Number of threads affects the output due to floating point rounding, this is known: #95

@ukiyocode
Copy link
Author

After more testing I think we can close this one. The new version either matches or outperforms the old one in most tasks. The number of threads affecting output is still a problem but that wasn't caused by the commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working need more info The OP should provide more details about the issue
Projects
None yet
Development

No branches or pull requests

6 participants