This repository was archived by the owner on Nov 3, 2023. It is now read-only.

Dynamic batching not working #3502

Closed
frankplus opened this issue Mar 9, 2021 · 3 comments

Comments

@frankplus
Contributor

Hi,
when I set `--dynamic-batching full`, training throws an error right at startup:

  File "/usr/local/bin/parlai", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/dist-packages/parlai/__main__.py", line 14, in main
    superscript_main()
  File "/usr/local/lib/python3.7/dist-packages/parlai/core/script.py", line 306, in superscript_main
    return SCRIPT_REGISTRY[cmd].klass._run_from_parser_and_opt(opt, parser)
  File "/usr/local/lib/python3.7/dist-packages/parlai/core/script.py", line 89, in _run_from_parser_and_opt
    return script.run()
  File "/usr/local/lib/python3.7/dist-packages/parlai/scripts/train_model.py", line 809, in run
    return self.train_loop.train()
  File "/usr/local/lib/python3.7/dist-packages/parlai/scripts/train_model.py", line 693, in train
    world.parley()
  File "/usr/local/lib/python3.7/dist-packages/parlai/core/worlds.py", line 1097, in parley
    self._task_acts[i]['dyn_batch_idx'] = i
  File "/usr/local/lib/python3.7/dist-packages/parlai/core/message.py", line 28, in __setitem__
    'please use the function `force_set(key, value)`.'.format(key)
RuntimeError: Message already contains key `dyn_batch_idx`. If this was intentional, please use the function `force_set(key, value)`.

This is the command I used:

!parlai train_model \
-t fromfile:parlaiformat \
--fromfile_datapath $DATASET_FILE \
--fromfile-datatype-extension true \
-m transformer/generator \
--save-every-n-secs 300 \
--validation-every-n-secs 600 \
--eval-batchsize 256 \
--dynamic-batching full \
--fp16 True --fp16-impl mem_efficient \
--batchsize 128 \
--embedding-size 512 \
--n-encoder-layers 1 \
--n-decoder-layers 13 \
--ffn-size 2048 \
--dropout 0.1 \
--n-heads 16 \
--learn-positional-embeddings True \
--n-positions 512 \
--variant xlm \
--activation gelu \
--skip-generation True \
--text-truncate 128 \
--label-truncate 128 \
--truncate 128 \
--dict-tokenizer bpe \
--dict-lower True \
-lr 1e-03 \
--optimizer adam \
--lr-scheduler reduceonplateau \
--gradient-clip 0.1 \
-veps 0.25 \
--betas 0.9,0.999 \
--update-freq 1 \
--attention-dropout 0.0 \
--relu-dropout 0.0 \
-vp 15 -stim 60 \
-vme 20000 \
-vmt ppl \
-vmm min \
--save-after-valid True \
--model-file $MODEL 
@stephenroller
Contributor

Weird. You don't have that field in your dataset do you?

@frankplus
Contributor Author

> Weird. You don't have that field in your dataset do you?

There's no `dyn_batch_idx` field in the dataset, only the standard ParlAI format:

text:è casa mia .	labels:nessuno mi caccia di casa .
text:voglio mangiare .	labels:ti ho sentito le prime cinque volte .	episode_done:True
...
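For context: the `dyn_batch_idx` key does not need to be in the dataset for this error to fire. The traceback shows `worlds.py` stamping the key onto each task act, and ParlAI's `Message` class refuses to overwrite a key that is already present unless `force_set` is used, so the error occurs when the same `Message` object gets stamped a second time. A minimal sketch of that guard (not ParlAI's actual implementation; the class and method names just mirror the traceback above):

```python
# Sketch of a dict subclass that refuses to overwrite an existing key,
# mimicking the guard in parlai/core/message.py seen in the traceback.
class Message(dict):
    def __setitem__(self, key, value):
        if key in self:
            raise RuntimeError(
                'Message already contains key `{}`. If this was intentional, '
                'please use the function `force_set(key, value)`.'.format(key)
            )
        super().__setitem__(key, value)

    def force_set(self, key, value):
        # Deliberately bypass the overwrite guard.
        super().__setitem__(key, value)


msg = Message(text='hello')
msg['dyn_batch_idx'] = 0        # first assignment succeeds
try:
    msg['dyn_batch_idx'] = 1    # second assignment raises, as in the traceback
except RuntimeError:
    msg.force_set('dyn_batch_idx', 1)   # overwriting requires force_set
```

This is consistent with the fix hinted at below: have the dynamic-batching code use `force_set` (or otherwise tolerate re-stamping) instead of plain item assignment.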

frankplus added a commit to frankplus/ParlAI that referenced this issue Mar 10, 2021
@stephenroller
Contributor

Your fix seems reasonable. Do you mind submitting a PR for it?
