SoLU MoE layer

Below is the result of using the SoLU MoE layer as an autoencoder for a sparse-feature toy model, fairly similar to the one in https://transformer-circuits.pub/2022/toy_model/index.html:

Note that some features are stored in negative SoLU neuron activations.
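
As a rough illustration of why negative activations are available at all, here is a minimal sketch of the SoLU activation on its own. It assumes the standard definition SoLU(x) = x * softmax(x) with no LayerNorm afterwards; the function name `solu` and the example values are hypothetical and are not taken from the `solu_moe` package.

```python
import math

def solu(x):
    """Sketch of SoLU(x) = x * softmax(x) over one vector of pre-activations.

    Assumption: plain SoLU, without any normalization applied afterwards.
    """
    m = max(x)                              # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [v * e / total for v, e in zip(x, exps)]

# Hypothetical pre-activations for four neurons.
pre_acts = [2.0, -1.0, 0.5, -3.0]
acts = solu(pre_acts)
print(acts)
```

Because the softmax weights are strictly positive, each output keeps the sign of its pre-activation, so a neuron with a negative pre-activation yields a small negative output rather than being clipped to zero as it would be under ReLU.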

For more info, see these papers:
