Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert PR#863 for now (alternative instruct mode (Vicuna support, etc.)) #982

Merged
merged 1 commit into from
Apr 14, 2023

Conversation

prusnak
Copy link
Collaborator

@prusnak prusnak commented Apr 14, 2023

@prusnak prusnak changed the title Revert "main : alternative instruct mode (Vicuna support, etc.) (#863)" Revert PR#863 for now (alternative instruct mode (Vicuna support, etc.)) Apr 14, 2023
@prusnak prusnak requested a review from aroidzap April 14, 2023 19:25
Copy link
Member

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried chat.sh and chat-13B.sh - definitely not working.
Let's revert and make sure everything works correctly.

If needed, we should do it in smaller steps

@ggerganov ggerganov merged commit c85e03d into master Apr 14, 2023
@ggerganov ggerganov deleted the revert-863 branch April 14, 2023 19:58
@rabidcopy
Copy link
Contributor

rabidcopy commented Apr 15, 2023

Just referencing this over here as well. #863 (comment)
Seems to be the --rm-trailing-space-workaround or rather the lack of using it being the key problem. Plus two other things that should probably be addressed. Ctrl+D not actually toggling multi-line mode off once it's enabled and code pertaining to injecting the reverse prompt on EOS no longer working.

@aroidzap
Copy link
Contributor

Just referencing this over here as well. #863 (comment) Seems to be the --rm-trailing-space-workaround or rather the lack of using it being the key problem. Plus two other things that should probably be addressed. Ctrl+D not actually togging multi-line mode off once it's enabled and code pertaining to injecting the reverse prompt on EOS no longer working.

Regarding this, I was facing same issue, until I cleared std::cin on Windows. Maybe something like clearerr(stdin); needs to be used otherwise.

jeroen-mostert pushed a commit to jeroen-mostert/llama.cpp that referenced this pull request Aug 30, 2024
* Add the DRY dynamic N-gram anti-repetition sampler

The DRY (Do not Repeat Yourself) sampler is a dynamic N-gram
repetition penalty that negatively scores tokens that would extend
sequences that already appear in the context.

See this discussion for a motivation and explanation of the sampler:
oobabooga/text-generation-webui#5677

This implementation of DRY mostly aligns with the obabooga version
with a few modifications. It uses a more efficient linear scanning
algorithm to identify repetitions. It also supports multi-token
sequence breakers. As a limitation, this implementation reuses
the rep pen range parameter, rather than introducing a new range
just for the DRY sampler.

There is a separate change to lite.koboldai.net that exposes the DRY
sampler parameters to KoboldAI Lite, so none of the embed files have
been changed as part of this commit.

* Update default DRY parameters to match lite

* Improve DRY token debug logging

* Replace `and` with `&&` to fix MSVC compile error

Little known fact: The C++98 standard defines `and` as an
alternative token for the `&&` operator (along with a bunch
of other digraphs). MSVC does not allow these without using
the /Za option or including the <iso646.h> header. Change to
the more standard operator to make this code more portable.

* Fix MSVC compile error because log is not constexpr

Replace the compile-time computation with a floating-point
approximation of log(std::numeric_limits<float>::max()).

* Remove unused llama sampler variables and clean up sequence breakers.

* Remove KCPP_SAMPLER_DRY as a separate enum entry

The DRY sampler is effectively a repetition penalty and there
are very few reasons to apply it at a different place in sampler
order than the standard single-token penalty. There are also
multiple projects that have dependencies on the existing sampler
IDs, including KoboldAI, KoboldAI Lite, and Silly Tavern. In order
to minimize the impact of the dependencies of adding the DRY sampler
to koboldcpp, it makes the most sense to not add a new ID for now,
and instead to piggyback on KCPP_SAMPLER_REP_PEN. In the future
if we find a use case for splitting the application of rep pen and DRY
we can introduce a new enum entry then.

* Add the dry_penalty_last_n to independently control DRY penalty range

This parameter follows the oobabooga semantics: it's optional, with a
default value of zero. Zero means that DRY should sample the entire
context. Otherwise, it's the number of tokens from the end of the
context that are scanned for repetitions.

* Limit sequence breaker lengths in tokens and characters

The core DRY sampler algorithm is linear in the context length, but
there are several parts of the sampler related to multi-token
sequence breakers that are potentially quadratic. Without any
restrictions, a suitably crafted context and sequence breaker could
result in a denial-of-service attack on a server running koboldcpp.
This change limits the maximum number of characters and the maximum
token length of a sequence breaker in order to limit the maximum
overhead associated with the sampler.

This change also improves some comments, adding more detail and
changing the wording to increase clarity.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants