Cleaning up oddities with steering vecs and repe algo #72

Merged Jan 18, 2024 (1 commit)

Conversation

Collaborator

@chanind chanind commented Jan 18, 2024

This PR cleans up some TODOs from the previous PR, #66, that @dtch1997 pointed out. Specifically, it makes 2 changes:

  1. The train_steering_vector() function now takes a read_token_index param, which can be used to read from a token other than the final token. We use the second-to-last token for CAA.
  2. The PipelineHook used to control the pipeline and apply steering vectors is now a class that stores the steering vector itself along with all the state that controls how it gets applied. This should be much easier to understand and edit, and it removes the weirdness where the hook closed over variables in the original repe algorithm class.
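
For readers who have not seen the code, here is a rough, self-contained sketch of the two ideas in plain Python. It is not the code from this PR: `read_token_activation`, `mean_difference`, and `SteeringHookSketch` are hypothetical names, activations are modelled as lists of floats rather than tensors, and the mean-difference step is only a stand-in for whatever the real training routine computes.

```python
from dataclasses import dataclass
from typing import List, Sequence

Vector = List[float]


def read_token_activation(
    token_activations: Sequence[Vector], read_token_index: int = -1
) -> Vector:
    """Pick the activation of one token; negative indices count from the end.

    Hypothetical helper: -1 reads the final token, -2 the second-to-last.
    """
    return list(token_activations[read_token_index])


def mean_difference(pos: Sequence[Vector], neg: Sequence[Vector]) -> Vector:
    """Average (pos - neg) over paired examples (assumed stand-in for training)."""
    n = len(pos)
    dim = len(pos[0])
    return [sum(p[i] - q[i] for p, q in zip(pos, neg)) / n for i in range(dim)]


@dataclass
class SteeringHookSketch:
    """Hypothetical hook object: the vector and its settings live on the
    instance instead of being captured by a closure."""

    steering_vector: Vector
    multiplier: float = 1.0
    enabled: bool = True

    def __call__(self, token_activations: Sequence[Vector]) -> List[Vector]:
        # When disabled, return an unmodified copy of the activations.
        if not self.enabled:
            return [list(a) for a in token_activations]
        # Otherwise add the scaled steering vector to every token position.
        return [
            [x + self.multiplier * s for x, s in zip(a, self.steering_vector)]
            for a in token_activations
        ]


# Usage: read the second-to-last token of each prompt, then steer.
pos_prompt = [[0.0, 0.0], [1.0, 2.0], [9.0, 9.0]]
neg_prompt = [[0.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
vec = mean_difference(
    [read_token_activation(pos_prompt, read_token_index=-2)],
    [read_token_activation(neg_prompt, read_token_index=-2)],
)
hook = SteeringHookSketch(vec, multiplier=2.0)
steered = hook([[0.0, 0.0], [1.0, 1.0]])
```

Because the state sits on the object, a caller can change `hook.multiplier` or `hook.enabled` between calls without rebuilding the hook, which is the kind of editability the second change is after.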

@chanind chanind requested a review from dtch1997 January 18, 2024 17:23
@chanind chanind force-pushed the steering_vec_polish branch from 471a9ed to a4c9a88 Compare January 18, 2024 17:25
@dtch1997
Owner

LGTM!

@chanind chanind merged commit 773db50 into main Jan 18, 2024
2 checks passed
@chanind chanind deleted the steering_vec_polish branch January 18, 2024 18:06