adding translated persona MWE variants #103
Merged
This PR tries to improve the persona datasets by prepending to each prompt the generation instruction that was originally given to the LLM that generated the dataset. The persona datasets are difficult to steer, and I suspect this is because the LM doesn't understand what's going on with the prompt. Adding the generation context should hopefully condition the model to elicit the characteristic we're interested in steering, and thus allow the steering vector to be learned.
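A minimal sketch of the prepending idea described above, for illustration only. The function name `build_prompt`, the blank-line separator, and the example strings are all hypothetical; they are not taken from this repository, the datasets, or the paper.

```python
from typing import Optional


def build_prompt(question: str, generation_ctx: Optional[str] = None) -> str:
    """Return the persona prompt, optionally prefixed with a generation context.

    Hypothetical helper: the real dataset-building code may format this differently.
    """
    if not generation_ctx:
        # Original behaviour: the prompt is passed through unchanged.
        return question
    # Enhanced behaviour: the context comes first, then a blank line, then the prompt.
    return f"{generation_ctx.strip()}\n\n{question}"


# Placeholder strings, not real dataset or paper text.
ctx = "Write statements that a person with the target trait would agree with."
question = "Is the following statement something you would say?"

original = build_prompt(question)       # without ctx
enhanced = build_prompt(question, ctx)  # with ctx
```

Keeping the context optional means the same code path can produce both the original and the enhanced variants, which makes it easy to compare steering results between the two.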
This PR also translates these contexts into all our supported languages and styles, using the translation_strings.py process. I only added 5 of the persona contexts, since those are the 5 persona datasets I've been experimenting with, and each dataset we want to add a context for requires manually copy/pasting text from the paper.
A sample with and without a context is shown below:
with ctx (enhanced)
without ctx (original)