v0.8.1

github-actions released this 21 May 04:59
· 39 commits to main since this release

v0.8.1 (2024-05-21)

Fix

  • fix: persona_generalization experiment script (#167)

  • update persona gen script with datasets

  • fix lint

  • adding a test to track missing persona datasets

  • adding all dataset prompts

  • adding test for get_all_prompts

  • adding qwen training script

  • fixing qwen script

  • tweaking qwen script

  • update qwen sweep script

  • adding llama layer sweep

  • fixing llama2 layers

  • fixing plot style

  • fixing formatting

  • adding plotting helper for steerability

  • fixing tests


Co-authored-by: Daniel CH Tan <dtch1997@users.noreply.github.com>
Co-authored-by: David Chanin <chanindav@gmail.com> (2eb6b62)

Unknown

  • update plots (b914302)

  • update figures for paper (ada05d8)

  • update figures for id steering (35e11cc)

  • update paper figures (c765cab)

  • add figures for correlating id and ood steering (ffe1515)

  • Paper/preprocessing (#170)

  • add preprocessing script

  • update figures


Co-authored-by: Daniel CH Tan <dtch1997@users.noreply.github.com> (307878f)

  • updates to id results (453737d)

  • In distribution results (#168)

  • add figures

  • add figures

  • wip: concept erasure

  • update

  • delete unused notebooks

  • update plots

  • concept erasure

  • fix lint

  • ignore type in random sv experiment


Co-authored-by: Daniel CH Tan <dtch1997@users.noreply.github.com> (0ca6521)

  • adding qwen formatting support and adding a sweep (#166)

  • adding qwen formatting support and adding a sweep

  • fixing formatting

  • saving progress during sweep (245022f)

  • delete unused notebooks (68b68e5)

  • add randomly sampled datasets (796bf90)