Can I train an ELMo model from scratch using allennlp.modules.elmo.Elmo? #5407

gohjiayi · 2021-09-14T09:40:42Z

gohjiayi
Sep 14, 2021

Hi there, I've been looking to train my own ELMo model for the past week and came across these two implementations allenai/bilm-tf & allenai/allennlp. I've been facing a few roadblocks for a few techniques I've tried and would like to clarify my findings, so that I can get a clearer direction.

As my project revolves around healthcare, I would like to train the embeddings from scratch for better results. The dataset I am working on is MIMIC-III and the entire dataset is stored in one .csv, unlike 1 Billion Word Language Model Benchmark (data used in tutorials) where files are stored in separate .txt files.

(Question) I would like to ask, what will be the best method for me to train my embeddings? (PS: some methods I've tried are documented below). In my case where I will probably need a custom DatasetReader, rather than converting the csv to txt files, wasting memory.

I was following this "Using ELMo as a PyTorch Module to train a new model" tutorial but I figured out that one of the requirements is a .hdf5 weights_file.

(Question) Does this mean that I will have to train a bilm model first to get .hdf5 weights to input? Can I train an ELMo model from scratch using allennlp.modules.elmo.Elmo? Is there any other way where I can train a model this way with an empty .hdf5 as I was able to run this successfully with tutorial data.

Here, let me go into the details of other methods I have tried so far. Serves as a backstory to the main question of what is the best technique. Please let me know if you know of any other methods to train my own ELMo model, or if one of the following methods are preferred over the others.

I've tried training a model using the allennlp train ... command by following this tutorial. However, I was unable to run with tutorial data due to the following error which I am still unable to solve. (EDIT: fixed after changing the parameters jsonnet file for distributed->cuda_devices to a list of GPUs I wanted to use instead of the default file parameters).

allennlp.common.checks.ConfigurationError: Experiment specified GPU device 1 but there are only 1 devices  available.

Secondly, this is a technique that I found but have not tried. Similar to the technique above it uses the allennlp train ... command but instead I use allenai/allennlp-template-config-files as a template and modify the Model and DatasetReader.

Lastly, I tried using the TensorFlow implementation allenai/bilm-tf following tutorials like this. However, I would like to avoid this method as TF1 is quite outdated. Besides receiving tons of warnings, I faced an error for CUDA as well.

2021-09-14 17:31:36.222624: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 18.45M (19346432 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

epwalsh · 2021-09-14T23:51:39Z

epwalsh
Sep 14, 2021
Maintainer

Hi @gohjiayi, the first issue you ran into:

allennlp.common.checks.ConfigurationError: Experiment specified GPU device 1 but there are only 1 devices  available.

should be a simple fix. It suggests you are telling AllenNLP to use the GPU with ID 1. But if you only have 1 GPU available, that GPU will probably have ID 0. So setting the cuda_device parameter of the trainer to 0 should fix that.

1 reply

gohjiayi Sep 23, 2021
Author

Thank you. Originally, I've used a parameters jsonnet file from a tutorial, which set the cuda_device to an if-else statement. I was able to run distributed training after setting it to a list.

My original question still remains, if there is a possible way for me to train a model from scratch directly using allennlp.modules.elmo.Elmo. Cheers!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Can I train an ELMo model from scratch using allennlp.modules.elmo.Elmo? #5407

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 1 comment 1 reply

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Select a reply

Can I train an ELMo model from scratch using allennlp.modules.elmo.Elmo? #5407

gohjiayi Sep 14, 2021

Replies: 1 comment · 1 reply

epwalsh Sep 14, 2021 Maintainer

gohjiayi Sep 23, 2021 Author

gohjiayi
Sep 14, 2021

Replies: 1 comment 1 reply

epwalsh
Sep 14, 2021
Maintainer

gohjiayi Sep 23, 2021
Author