A modified version of the Google BERT TensorFlow model with multi-GPU support.
Use FLAGS.use_tpu to turn GPU support on or off, either in code or on the command line.
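
As a rough illustration of the two ways to set the flag, here is a minimal sketch. It uses Python's standard `argparse` as a stand-in for the script's real flag definitions, so `build_parser`, `str2bool`, and the default value of `False` are assumptions for illustration only. Only the flag name `use_tpu` comes from this README.

```python
import argparse


def str2bool(value):
    """Parse 'true'/'false'-style strings as typed in --flag=value form."""
    text = str(value).strip().lower()
    if text in ("true", "t", "1", "yes", "y"):
        return True
    if text in ("false", "f", "0", "no", "n"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def build_parser():
    # Hypothetical stand-in for the script's flag definitions; the default
    # of False is an assumption, not taken from this repo.
    parser = argparse.ArgumentParser()
    parser.add_argument("--use_tpu", type=str2bool, default=False)
    return parser


# Command line style: the value arrives as an argument string.
FLAGS = build_parser().parse_args(["--use_tpu=true"])
print(FLAGS.use_tpu)

# In-code style: override the attribute after parsing.
FLAGS.use_tpu = False
print(FLAGS.use_tpu)
```

Check the flag definitions in the training scripts for the actual default and for how the value maps to GPU usage.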
For the original release notes, please refer to the Google BERT repo.
This code uses the TensorFlow distribute library and thus requires NVIDIA NCCL to run.
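
Since NCCL is a hard requirement, a quick pre-flight check before launching training may save time. The sketch below is an assumption-laden convenience, not part of this repo: `find_nccl` is a hypothetical helper that asks the system loader for a library named `nccl` through Python's standard `ctypes.util.find_library`. It only searches the standard loader locations, so a negative result does not prove NCCL is missing.

```python
import ctypes.util


def find_nccl():
    """Return the NCCL library name or path found by the loader, or None."""
    # find_library only searches standard loader locations, so None does not
    # prove NCCL is absent (it may be installed in a non-standard directory).
    return ctypes.util.find_library("nccl")


location = find_nccl()
if location is None:
    print("NCCL not found in the standard library search path")
else:
    print("Found NCCL: " + location)
```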