This is the official implementation of the NeurIPS 2024 paper "On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability".
conda env create -f environment.yaml
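The environment created above must be activated before training. Its name comes from the `name:` field of environment.yaml, which this README does not state. The following is a minimal sketch for reading that name, assuming the file has a top-level `name:` key as conda environment files usually do. The helper `env_name_from_yaml` is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of this repository): print the value of the
# first top-level `name:` key in a conda environment file.
# Assumes the key starts at column 0 and the value is unquoted.
env_name_from_yaml() {
  sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

# Example usage, run from the repository root once the environment exists:
# conda activate "$(env_name_from_yaml environment.yaml)"
```

If the file has no `name:` key, the helper prints nothing; in that case `conda env list` shows the environments that are available.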
The detailed hyperparameter configuration can be found in Appendix B of the paper.
bash main_train_ar.sh  # use the hyperparameters listed in Appendix B
python plot.py  # specify the output