building configs for training other models #72

@dribnet

Description

Enjoyed this paper and was able to train successfully using the gpt2_e2e_recon.yaml config! I am hoping to do follow-up experiments, and am wondering:

  1. What's the simplest way to tell whether a training run was successful? There are so many graphs in my wandb workspace; are there one or two that you generally focus on? I'd like to try changing some of the parameters (SAE width, etc.) and track the effects.

  2. I'm interested in training e2e SAEs on specific layers of other language models, for example layer 8 of bert-base-uncased. But the config files are a bit intimidating to me. Can you offer some guidance on how I could adapt a config file to a new language model like this? (A rough sketch of what I imagine changing is below this list.)
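
For context, here is roughly what I imagine editing, starting from gpt2_e2e_recon.yaml. The key names and hook names below are guesses on my part and not verified against the actual config schema, so please correct me if they're wrong:

```yaml
# Hypothetical adaptation of gpt2_e2e_recon.yaml -- field names are my guesses,
# not taken from the real schema.
tlens_model_name: bert-base-uncased   # swap in the new base model (assumed key)
saes:
  sae_positions:
    - blocks.8.hook_resid_pre         # target layer 8 (hook name is a guess)
  dict_size_to_input_ratio: 60        # controls SAE width (assumed key)
```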
