Generalization experiments #96
Very ugly implementation but it works
answer_b = data["answer_not_matching_behavior"]

# Construct A/B formatted question / answer pair
new_question = f"{question}\n\nChoices:\n(A): {answer_a}\n(B): {answer_b}"
I think this line needs to be moved below the _maybe_swap line, since currently the new_question doesn't get the swapped a/b answers.
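A minimal sketch of the ordering being suggested, not the PR's actual code: the swap runs first, and only then is the A/B question string built, so the prompt text reflects the swapped answers. The helper `maybe_swap`, its `p_swap` parameter, the function `make_ab_prompt`, and the `question` and `answer_matching_behavior` keys are assumptions for illustration; the real `_maybe_swap` signature is not shown in this thread.

```python
import random


def maybe_swap(a, b, rng, p_swap=0.5):
    """Hypothetical stand-in for _maybe_swap: swap a and b with probability p_swap."""
    if rng.random() < p_swap:
        return b, a
    return a, b


def make_ab_prompt(data, rng, p_swap=0.5):
    """Build an A/B question from one record, swapping BEFORE formatting."""
    question = data["question"]  # assumed key
    answer_a = data["answer_matching_behavior"]  # assumed key
    answer_b = data["answer_not_matching_behavior"]

    # Swap first, so the formatted question below sees the final order.
    answer_a, answer_b = maybe_swap(answer_a, answer_b, rng, p_swap)

    # Construct A/B formatted question / answer pair
    new_question = f"{question}\n\nChoices:\n(A): {answer_a}\n(B): {answer_b}"
    return new_question, answer_a, answer_b
```

Passing the random generator in explicitly keeps the function deterministic under test: with `p_swap=1.0` the answers always swap, and with `p_swap=0.0` they never do.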
@chanind thanks for spotting! Preprocessing is hard 😓 I've added tests describing how I want each preprocessing function to behave. Could you have another look?
Nice! I fixed it as well in the translation PR, which is currently in main, and I merged main into this PR earlier.
Adds code to run large-scale steering vector experiments on the Anthropic persona dataset.