Improve Alpaca integration to match its trained prompt syntax #302

Closed
@nitram147

The Alpaca LoRA model was trained on the same dataset as the original Stanford Alpaca.

However, this dataset contains two types of instructions, namely:

  • instructions with input
  • instructions without input

For more details about the instruction format, see here.

For instructions such as text summarization, the instruction alone only "explains" the task, while the text to be summarized is inserted into the "input" part of the prompt.

The current integration of Alpaca in llama.cpp mimics the current integration in alpaca.cpp, which completely omits the "instructions with input" type. This may significantly hurt model performance on tasks that were trained with the "instruction with input" prompt syntax but are run with the ordinary "instruction without input" prompt syntax instead.

I suggest writing a small tutorial with example usage, so that users know which types of instructions should be used in input mode and which should not.

Then I suggest integrating this "input" mode into the current implementation. The easiest way would be to let the user type a text prompt like:

Summarize following text.***input***Text to be summarized

which will be transformed into:

### Instruction:
Summarize following text.

### Input:
Text to be summarized

### Response:

When the user doesn't specify the ***input*** tag, the instruction will be transformed into the "standard" (currently implemented) format:

### Instruction:
Instruction text from user

### Response:
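
To make the proposed transformation concrete, here is a minimal sketch in Python rather than llama.cpp's C++. It is only an illustration of the idea, not the actual implementation: the names `INPUT_TAG` and `build_prompt` are hypothetical, and the output reproduces only the two templates shown above, with no surrounding preamble text.

```python
# Hypothetical separator the user types between the instruction and its input.
INPUT_TAG = "***input***"


def build_prompt(user_text: str) -> str:
    """Turn raw user text into one of the two prompt templates shown above."""
    # partition() splits on the first occurrence of the tag; `tag` is an
    # empty string when the tag is not present in the user's text.
    instruction, tag, model_input = user_text.partition(INPUT_TAG)
    instruction = instruction.strip()

    if tag:
        # "Instruction with input" format.
        return (
            "### Instruction:\n"
            f"{instruction}\n\n"
            "### Input:\n"
            f"{model_input.strip()}\n\n"
            "### Response:\n"
        )

    # "Instruction without input" format (the currently implemented one).
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


if __name__ == "__main__":
    print(build_prompt("Summarize following text.***input***Text to be summarized"))
    print(build_prompt("Instruction text from user"))
```

Splitting on the first occurrence of the tag keeps any later `***input***` inside the input text itself. Whether an empty input after the tag should fall back to the no-input format is a design choice this sketch leaves open.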
