Add exit call to interactive mode. #910

Closed
wants to merge 2 commits into from

Conversation

wbpxre150
Contributor

Problem:
The timing stats do not print when using Ctrl+C to quit interactive mode. With all the problems I have been having with model speed, I needed the stats to print once I was finished running prompts.

Solution:
A prompt is almost never only a single word, so we check whether the input length equals 5 ("exit" plus its trailing newline) and, if so, whether the word "exit" was entered in lower case.

This seemed to me the simplest solution to the problem, but maybe I am wrong and there is an easier way to handle it; I am open to discussion and changes. However, with this approach it's possible to add more commands to the program as required; a sketch of the check follows.
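
For reference, a minimal sketch of the check, assuming the interactive input (including its trailing newline) is read into a std::string named buffer — the names are illustrative, not the exact variables in main.cpp:

```cpp
// "exit" plus the trailing newline gives a length of 5.
if (buffer.length() == 5 && buffer.compare(0, 4, "exit") == 0) {
    llama_print_timings(ctx);  // print the timing stats on the way out
    break;                     // leave the interactive loop cleanly
}
```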

@sw
Contributor

sw commented Apr 12, 2023

I think it might be better to add llama_print_timings to the Ctrl-C signal handler and enable that also for non-interactive mode. That way you could also get statistics if you decide to abort inference.

It would require ctx be made module-global, but we already have an assortment of such variables, precisely for the console handling.

Signal handlers really ought to have a way to have a void * pointer passed to them; that would make this a bit cleaner. (Am I wrong in thinking that's not possible with POSIX signals?)
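
A rough sketch of that approach, assuming a module-global pointer — g_ctx and the exact wiring in main() are illustrative:

```cpp
#include <csignal>
#include <cstdlib>

#include "llama.h"

// Set once in main() after the context is created, alongside the
// existing console-handling globals.
static llama_context * g_ctx = nullptr;

static void sigint_handler(int /*signo*/) {
    if (g_ctx != nullptr) {
        // Not strictly async-signal-safe, but good enough for a
        // best-effort dump of the stats before exiting.
        llama_print_timings(g_ctx);
    }
    std::_Exit(130);  // conventional exit status for SIGINT
}

// in main():
//   g_ctx = ctx;
//   std::signal(SIGINT, sigint_handler);
```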

@wbpxre150
Contributor Author

My first thought was to put this code into the signal handler, but I couldn't find where to place it. If you could point me to the right place, that would be helpful!

@wbpxre150
Contributor Author

Never mind, I worked it out. You cannot pass a void * to the signal handler; it needs a global ctx variable.

I still think we need a way to pass commands through the input; it could do all kinds of things. Maybe we need some kind of prefix on the line to signal a command, for example ??? CMD params.

For example, you could change the command-line arguments in real time, or print the stats after a prompt — see the sketch below. But maybe I'm overthinking it, as reloading the program is already pretty fast.
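
A hypothetical sketch of what such a prefix could look like — the ??? marker and the command names are only illustrations, not part of any existing API:

```cpp
#include <cstdlib>
#include <sstream>
#include <string>

#include "llama.h"

// Returns true if the line was consumed as a command rather than a prompt.
static bool handle_command(const std::string & line, llama_context * ctx) {
    if (line.rfind("???", 0) != 0) {
        return false;  // no prefix: feed the line to the model as usual
    }
    std::istringstream iss(line.substr(3));
    std::string cmd;
    iss >> cmd;
    if (cmd == "stats") {
        llama_print_timings(ctx);  // print timings on demand
    } else if (cmd == "exit") {
        llama_print_timings(ctx);
        std::exit(0);
    }
    return true;  // consumed as a command
}
```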

@wbpxre150
Contributor Author

This PR can probably be closed, as #924 includes the fix for printing stats from the signal handler and also adds more functionality. Now that the stats call is in the signal handler, we no longer need an "exit" command.

@wbpxre150 wbpxre150 closed this Apr 14, 2023
jeroen-mostert pushed a commit to jeroen-mostert/llama.cpp that referenced this pull request Aug 30, 2024
* GradientAI Auto ROPE Base calculation

https://gradient.ai/blog/scaling-rotational-embeddings-for-long-context-language-models
has a formula that better fits the ideal rope scaling.

Tested with Llama 3; checked that the calculation is correct for Llama 2. Retains the logic for not scaling rope if under the trained CTX.

* add in solar scaling logic

Solar-based models require the context values to be multiplied by 8. This is (I'm guessing) because the positions are based on a 32k context but a sliding window of 4k.

* Update model_adapter.h

Adding in the tensor count to identify Solar models, based on their tensor count of 435.

* Update model_adapter.cpp

add in n_tensor count for solar identification

* refactor and cleanup GradientAI rope scaling

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
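
For illustration, a rough sketch of the Solar identification and context scaling described in the commit message above, using hypothetical names — the actual change lives in model_adapter.h / model_adapter.cpp:

```cpp
// Hypothetical sketch: Solar-based models are identified by their
// tensor count, and their context value is scaled up by 8.
static const int SOLAR_TENSOR_COUNT = 435;

static bool is_solar_model(int n_tensors) {
    return n_tensors == SOLAR_TENSOR_COUNT;
}

static int effective_ctx(int n_ctx, int n_tensors) {
    // Positions appear to be based on a 32k context with a 4k sliding
    // window, hence the factor of 8.
    return is_solar_model(n_tensors) ? n_ctx * 8 : n_ctx;
}
```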