Prevent user from setting a context size that is too big #266
I think there might be a limit to the size of the context. Which context size did you set for your test?
The limit is 2048; anything above behaves very badly.
I have set it to 8096
That explains it; there should be a warning in llama.cpp when using anything above that.
A PR would be appreciated if you’re interested in adding it :) The arguments are currently parsed here:
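For illustration, here is a minimal sketch of what such a warning could look like. It assumes a `gpt_params`-style struct with an `n_ctx` field, a `-c` flag, and a hard-coded 2048-token training context; the actual parsing code, names, and flags in llama.cpp may differ:

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for llama.cpp's parameter struct; the real
// definition lives in the argument-parsing code linked above.
struct gpt_params {
    int n_ctx = 512; // context size requested by the user
};

// Assumed training-time context window of the LLaMA models.
constexpr int MAX_CONTEXT = 2048;

void parse_args(int argc, char **argv, gpt_params &params) {
    for (int i = 1; i < argc; ++i) {
        if (std::strcmp(argv[i], "-c") == 0 && i + 1 < argc) {
            params.n_ctx = std::atoi(argv[++i]);
        }
    }
    // The warning this issue asks for: flag context sizes beyond what
    // the model was trained on instead of degrading silently.
    if (params.n_ctx > MAX_CONTEXT) {
        std::fprintf(stderr,
                     "warning: model was trained with a context of %d tokens; "
                     "n_ctx = %d may produce poor results\n",
                     MAX_CONTEXT, params.n_ctx);
    }
}

int main(int argc, char **argv) {
    gpt_params params;
    parse_args(argc, argv, params);
    std::printf("using n_ctx = %d\n", params.n_ctx);
    return 0;
}
```

With this in place, an invocation along the lines of `./main -m model.bin -c 4096` would print the warning but still proceed, leaving the choice to the user.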
Is this proven? My outputs are still coherent with a 4096 context (and I was running 8192 and even 12288 for other experiments). Sample (13B model): [Begin chat] […]
I've also disabled the fixed token output limit here (so it only stops at end of text): `while (true) { // remaining_tokens > 0) {`
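As a rough sketch of the change being described (hypothetical stand-ins, not the actual llama.cpp source): the fixed token budget is commented out of the loop condition, so an end-of-text token becomes the only stop condition.

```cpp
#include <cstdio>
#include <cstdlib>

constexpr int TOKEN_EOS = 2; // assumed end-of-text token id

// Hypothetical stand-in for the real sampler: emits random token ids,
// occasionally producing end-of-text.
int sample_next_token() {
    return (std::rand() % 100 == 0) ? TOKEN_EOS : 3 + std::rand() % 97;
}

int main() {
    // Original form (fixed budget):
    //   int remaining_tokens = 128;
    //   while (remaining_tokens-- > 0) { ... }
    //
    // Modified form, as described above:
    while (true) { // remaining_tokens > 0) {
        const int token = sample_next_token();
        if (token == TOKEN_EOS) break; // the only remaining stop condition
        std::printf("%d ", token);
    }
    std::printf("\n[end of text]\n");
    return 0;
}
```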
Feel like mileage may vary here. Since session output is finite and tied to context size, I personally run upwards of a 16k context on 7B and an 8k context on 13B for small chatbot experiments, and only once or twice has generation gone completely off the rails for me, and that was after dozens and dozens of responses. I suspect this may be less valid for larger models.
edit: never mind, I was just hallucinating; it is used earlier.
I'd consider this fixed by #274. |
Don't mean to keep this issue active, but I was a little confused by this. Does this mean if your […]
Yes, that was what I observed for token predictions past the 2048.
Hey!
I tasked the 30B model to write a little story... it worked really well until some point where it went off the rails from one line to the next, suddenly talking about some girl and stuff that has nothing to do with the rest:
The model is quantized (q4_0) and I am on Linux (x86_64) with 64 GB of RAM.