Why is Context Shifting not kicking in for all messages even without using dynamic information (Memories)? #674
Replies: 5 comments 5 replies
-
@LostRuins I understand you might not have too much time at this moment, but when you are available, I'd be very thankful if you could chime in on this. Sorry for the ping.
-
I'd suggest running in
-
Testing in progress... SillyTavern: Token Padding set to 32. Models:
-
Testing in progress... SillyTavern: Token Padding set to 128/256/512/1024. Context 12288. Model:
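For context on what these values do: as I understand it, SillyTavern's Token Padding reserves headroom when trimming chat history to fit the context window. A minimal sketch of that budget arithmetic (function and variable names are my own, not SillyTavern's actual code):

```python
def usable_prompt_budget(max_context: int, response_length: int, token_padding: int) -> int:
    """Hypothetical simplification: tokens left for the prompt after
    reserving space for the model's reply and the safety padding."""
    return max_context - response_length - token_padding

# With a 12288-token context, 512 tokens reserved for the response,
# and a padding of 128, the prompt may use 12288 - 512 - 128 = 11648 tokens.
print(usable_prompt_budget(12288, 512, 128))  # -> 11648
```

Larger padding values trim history earlier, which changes where the prompt gets cut between generations.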
-
See issue #681 for the conclusion and the cause of the problem reported here. It is solved as of right now, hopefully!
-
When near max context, only some messages benefit from Context Shifting, and I can't seem to find the reason why that happens.
Real examples:
You can see the huge drop in final T/s when shifting doesn't happen.
I am using the prebuilt koboldcpp 1.57.1 + SillyTavern 1.11.4 (staging, latest commits), and I made sure I don't have any dynamic information added anywhere in the context sent for processing.

Context/Response Formatting:
I don't have (I even disabled the modules and extensions I mention):
@D
or similar.

So, I am not using any Memory or other dynamic information that might trigger additional context reprocessing, but this keeps happening.
What causes this? Is it an issue with the model, due to quantization and RoPE?
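For what it's worth, my understanding is that Context Shifting can only reuse the KV cache when the new prompt equals the old one with a chunk trimmed after a shared prefix (e.g. the system prompt) and new tokens appended at the end; any edit in the middle, such as an injected Memory or Author's Note, forces reprocessing. A simplified sketch of that check (not koboldcpp's actual implementation):

```python
def can_context_shift(old_tokens: list[int], new_tokens: list[int]) -> bool:
    """Hypothetical check: shifting applies when the new prompt keeps the
    shared leading prefix, drops a contiguous chunk right after it, and
    only appends tokens at the end."""
    # Longest shared leading prefix (e.g. the system prompt).
    p = 0
    while p < min(len(old_tokens), len(new_tokens)) and old_tokens[p] == new_tokens[p]:
        p += 1
    if p == len(old_tokens):
        return True  # pure append: nothing trimmed, cache trivially reusable
    # After the prefix, the rest of the old prompt (minus a dropped chunk
    # at its front) must reappear verbatim at the start of the new remainder.
    for drop in range(1, len(old_tokens) - p):
        kept = old_tokens[p + drop:]
        if new_tokens[p:p + len(kept)] == kept:
            return True
    return False

sys_prompt, msgs = [7], [1, 2, 3, 4]
assert can_context_shift(sys_prompt + msgs, [7, 3, 4, 5])        # oldest messages trimmed -> shift
assert not can_context_shift(sys_prompt + msgs, [7, 1, 9, 3, 4, 5])  # mid-prompt edit -> full reprocess
```

If this model is right, the question becomes what is silently changing mid-prompt between some generations and not others.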
For the example and this discussion, I'm using s3nh/Kunoichi-DPO-v2-7B-GGUF (Q4_K_M):
GPU: GTX 1070Ti - 8GB - Pascal
RAM: 32GB DDR4 3200MHz
CPU: Ryzen 5 1600 AF 6C/12T 3.2-3.6GHz (Zen 2 Arch)
OS: Windows 11 22H2