Cleaning up oddities with steering vecs and repe algo #72

chanind · 2024-01-18T17:23:33Z

This PR cleans up some TODOs from the previous PR, #66, that @dtch1997 pointed out. Specifically, it makes two changes:

The train_steering_vector() function now takes a read_token_index param, which can be used to read from a token other than the final one. We use the second-to-last token for CAA.
The PipelineHook used to control the pipeline and apply steering vectors is now a class. It stores all the state that controls how the steering vector gets applied, along with the steering vector itself. This should be much easier to understand and edit, and it removes the weirdness where the hook closed over variables in the original repe algorithm class.
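
As a rough illustration of the two changes above, here is a minimal, dependency-free sketch. It is not the code from this PR: the names `read_token_activation`, `SteeringHook`, and `multiplier` are hypothetical, activations are modeled as plain lists of floats rather than tensors, and applying the vector by scaled addition at every token position is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence

Vector = List[float]


def read_token_activation(
    activations: Sequence[Vector], read_token_index: int = -1
) -> Vector:
    """Return the activation at one token position.

    -1 selects the final token, -2 the second-to-last, and so on.
    (Hypothetical helper; only the parameter name comes from the PR.)
    """
    return list(activations[read_token_index])


@dataclass
class SteeringHook:
    """Hypothetical hook class: all state lives on the instance,
    so nothing is captured from an enclosing scope."""

    steering_vector: Vector
    multiplier: float = 1.0  # assumed knob for steering strength

    def __call__(self, activations: Sequence[Vector]) -> List[Vector]:
        # Assumption: steer by adding the scaled vector at every token position.
        return [
            [a + self.multiplier * s for a, s in zip(token, self.steering_vector)]
            for token in activations
        ]


# Three token positions, two hidden dimensions each.
acts = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]

second_to_last = read_token_activation(acts, read_token_index=-2)
hook = SteeringHook(steering_vector=[1.0, -1.0], multiplier=0.5)
steered = hook(acts)
```

Because the vector and its settings are fields on the object, they can be inspected or changed between calls (for example `hook.multiplier = 0.0`) without touching whatever code created the hook. A closure does not offer this as directly.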

dtch1997 · 2024-01-18T17:28:27Z

LGTM!

chanind requested a review from dtch1997 January 18, 2024 17:23

cleaning up oddities with steering vecs and repe algo

a4c9a88

chanind force-pushed the steering_vec_polish branch from 471a9ed to a4c9a88 Compare January 18, 2024 17:25

dtch1997 approved these changes Jan 18, 2024

chanind merged commit 773db50 into main Jan 18, 2024
2 checks passed

chanind deleted the steering_vec_polish branch January 18, 2024 18:06
