Minimum Description Length Recurrent Neural Networks

taucompling/mdlrnn

Code for Minimum Description Length Recurrent Neural Networks by Nur Lan, Michal Geyer, Emmanuel Chemla, and Roni Katzir.

Paper: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00489/

Getting started

  1. Install Python >= 3.7
  2. pip install -r requirements.txt

On Ubuntu, install the required system packages:

$ apt-get install libsm6 libxext6 libxrender1 libffi-dev libopenmpi-dev

Running simulations

$ python main.py --simulation <simulation_name> -n <number_of_islands>

For example, to run the aⁿbⁿcⁿ task using 16 island processes:

$ python main.py --simulation an_bn_cn -n 16
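The same invocation can be scripted, for example to queue several runs from Python. The following is a minimal sketch, not part of the repository: `build_command` and `run_simulation` are hypothetical helper names, and the only flags used are the `--simulation` and `-n` options shown above. It assumes the script is started from the repository root, where `main.py` lives.

```python
import subprocess
import sys
from typing import List


def build_command(simulation: str, islands: int) -> List[str]:
    """Assemble the argv list for one run of main.py (hypothetical helper)."""
    if islands < 1:
        raise ValueError("islands must be a positive integer")
    # Use the current interpreter so the run shares this environment's packages.
    return [sys.executable, "main.py", "--simulation", simulation, "-n", str(islands)]


def run_simulation(simulation: str, islands: int) -> int:
    """Run main.py in the current working directory and return its exit code."""
    return subprocess.run(build_command(simulation, islands), check=False).returncode
```

Calling `run_simulation("an_bn_cn", 16)` from the repository root should be equivalent to the command-line example above.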

  • All simulations are available in simulations.py

  • Final and intermediate solutions are saved to the networks sub-directory, both as pickle and in visual dot format.
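To inspect a saved solution, the pickled networks can be located and loaded with the standard library. The sketch below is an illustration under assumptions, not the repository's own tooling: `latest_pickle` and `load_network` are hypothetical helpers, and the `.pickle` extension is assumed from the `networks/net.pickle` path used in this README. Unpickling a real network presumably requires the repository's modules to be importable, so run it from the repository root.

```python
import pickle
from pathlib import Path
from typing import Any, Optional


def latest_pickle(directory: str = "networks") -> Optional[Path]:
    """Return the most recently modified *.pickle file in `directory`, or None."""
    root = Path(directory)
    if not root.is_dir():
        return None
    # Sort by modification time so the newest file comes last.
    candidates = sorted(root.glob("*.pickle"), key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None


def load_network(path: Path) -> Any:
    """Unpickle a saved object; only load pickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Picking the newest file by modification time is only a heuristic for finding the latest intermediate solution; the actual file naming scheme is not described here.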

PyTorch conversion

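To convert a network trained using the genetic algorithm to a PyTorch module:

import pickle

import torch_conversion

# Load a network saved by a simulation run.
with open("networks/net.pickle", "rb") as f:
    net = pickle.load(f)

# Convert the loaded network to a PyTorch module.
torch_net = torch_conversion.mdlnn_to_torch(net)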

Then fine-tune and evaluate using MDLRNN-torch.

Parallelization

Native Python multiprocessing is used by default. To use MPI, change migration_channel to mpi in simulations.py.

Citing this work

@article{Lan-Geyer-Chemla-Katzir-MDLRNN-2022,
  title = {Minimum Description Length Recurrent Neural Networks},
  author = {Lan, Nur and Geyer, Michal and Chemla, Emmanuel and Katzir, Roni},
  year = {2022},
  month = jul,
  journal = {Transactions of the Association for Computational Linguistics},
  volume = {10},
  pages = {785--799},
  issn = {2307-387X},
  doi = {10.1162/tacl_a_00489},
}
