Initial commit to get GLUE (BERT) on TPU #2

jysohn23 · 2019-11-16T00:53:53Z

Branched from: https://github.com/huggingface/transformers/blob/master/examples/run_glue.py.

F1 score and accuracy look pretty much identical to GPU when the global batch size is the same.

Sample command:

python run_glue_tpu.py \
   --model_type bert \
   --model_name_or_path bert-base-cased \
   --task_name MRPC \
   --do_train \
   --do_eval \
   --do_lower_case \
   --data_dir ~/datasets/glue/MRPC \
   --max_seq_length 128 \
   --train_batch_size 32 \
   --learning_rate 3e-5 \
   --num_train_epochs 3.0 \
   --output_dir /tmp/MRPC \
   --overwrite_output_dir \
   --logging_steps 5 \
   --save_steps 50 \
   --num_cores=8 \
   --only_log_master
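
The comparison with GPU above depends on matching the global batch size, so it helps to work out what the sample command implies. The following is a minimal sketch, not code from `run_glue_tpu.py`: the helper names are hypothetical, and this description does not say whether `--train_batch_size` is applied per core or globally, so both readings are covered.

```python
def global_batch_size(per_core_batch_size: int, num_cores: int) -> int:
    """Examples consumed per optimizer step if each core gets its own batch.

    Assumption (not confirmed by the PR text): the batch size flag is per core.
    """
    if per_core_batch_size <= 0 or num_cores <= 0:
        raise ValueError("batch size and core count must be positive")
    return per_core_batch_size * num_cores


def per_core_batch_size(global_batch: int, num_cores: int) -> int:
    """Per-core share if the batch size flag is instead the global batch size."""
    if global_batch <= 0 or num_cores <= 0:
        raise ValueError("batch size and core count must be positive")
    if global_batch % num_cores != 0:
        raise ValueError("global batch size must be divisible by the core count")
    return global_batch // num_cores


if __name__ == "__main__":
    # Values taken from the sample command: --train_batch_size 32, --num_cores=8
    print(global_batch_size(32, 8))    # per-core reading
    print(per_core_batch_size(32, 8))  # global reading
```

Under the per-core reading, a single-GPU run would need a batch size of 256 to match the TPU run. Under the global reading, it would need 32.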

Eval metrics and train loss curves:

examples/run_glue_tpu.py

jysohn23

Also added a section to the README for running on TPU.

* init commit * config updated also some modeling * Processor and Model config combined * extraction pipeline(upto before spectogram & mel_conditioner) added but not properly tested * model loading successful! * feature extractor done! * FE can now be called from HF * postprocessing added in fe file * same as prev commit * Pop2PianoConfig doc done * cfg docs slightly changed * fe docs done * batched * batched working! * temp * v1 * checking * trying to go with generate * with generate and model tests passed * before rebasing * . * tests done docs done remaining others & nits * nits * LogMelSpectogram shifted to FeatureExtractor * is_tf rmeoved from pop2piano/init * import solved * tokenization tests added * minor fixed regarding modeling_pop2piano * tokenizer changed to only return midi_object and other changes * Updated paper abstract(Camera-ready version) (#2) * more comments and nits * ruff changes * code quality fix * sg comments * t5 change added and rebased * comments except batching * batching done * comments * small doc fix * example removed from modeling * ckpt * forward it compatible with fe and generation done * comments * comments * code-quality fix(maybe) * ckpts changed * doc file changed from mdx to md * test fixes * tokenizer test fix * changes * nits done main changes remaining * code modified * Pop2PianoProcessor added with tests * other comments * added Pop2PianoProcessor to dummy_objects * added require_onnx to modeling file * changes * update .md file * remove extra line in index.md * back to the main index * added pop2piano to index * Added tokenizer.__call__ with valid args and batch_decode and aligned the processor part too * changes * added return types to 2 tokenizer methods * the PR build test might work now * added backends * PR build fix * vocab added * comments * refactored vocab into 1 file * added conversion script * comments * essentia version changed in .md * comments * more tokenizer tests added * minor fix * tests extended for 
outputs acc check * small fix --------- Co-authored-by: Jongho Choi <sweetcocoa@snu.ac.kr>

* first commit * correct default value non causal * update config and modeling code * update converting checkpoint * clean modeling and fix tests * make style * add new config parameters to docstring * fix copied from statements * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * make position_embeddings_type docstrings clearer * clean converting script * remove function not used * clean modeling file * apply suggestion for test file + add convert script to not_doctested * modify tests according to review - cleaner logic and more tests * Apply nit suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add checker of valid position embeddings type * instantiate new layer norm layer with the right eps * fix freeze_feature_encoder since it can be None in some cases * add test same output in convert script * restore wav2vec2conformer and add new model * create processor and FE + clean * add new model code * fix convert script and set default config parameters * correct model id paths * make style * make fix-copies and cleaning files * fix copied from statements * complete .md and fixe copies * clean convert script argument defaults * fix config parameters docstrings * fix config docstring * add copied from and enrich FE tests * fix copied from and repo-consistency * add autotokenizer * make test input length shorter and change docstring code * fix docstrings and copied from * add add_adapter to ASR training example * make testing of adapters more robust * adapt to multi adapter layers * refactor input_values->input_features and remove w2v2-bert feature extractor * remove pretraining model * remove depreciated features and useless lines * add copied from and ignore statements to modeling tests * remove pretraining model #2 * change import in convert script * change default in convert script * update readme and remove useless line * Update 
tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * refactor BERT to Bert for consistency * remove useless ignore copy statement * add persistent to buffer in rotary * add eps in LayerNorm init and remove copied from * add adapter activation parameters and add copied from statements * Fix copied statements and add unitest.skip reasons * add copied statement in test_processor * refactor processor * make style * replace numpy random by torch rand * remove expected output CTC * improve converting script with processor class * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove gumbel class * remove tests related to previously deleted class * Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * correct typos * remove uused parameters * update processor to takes both text and audio * update checkpoints * update expected output and add ctc expected output * add label_attention_mask * replace pt with np in processor tests * fix typo * revert to behaviour with labels_attention_mask --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

huggingface#29507) Revert "Automatic safetensors conversion when lacking these files (huggingface#29390)" This reverts commit a69cbf4.

* Cohere Model Release (#1) Cohere Model Release * Remove unnecessary files and code (#2) Some cleanup * Delete cohere-model directory (#3) * Make Fix (#5) * Pr fixes (#6) * fixes for pr * pr fixes for the format * pr fixes for the format * src/transformers/models/auto/tokenization_auto.py * Tokenizer test (#8) * tokenizer test * format fix * Adding Docs and other minor changes (#7) * Add modeling tests (#9) * Smol Fix (#11) * tokenization tests are fixed * format fixes * fix pr doc tests * fix pr doc tests * fix pr doc tests * fix pr style check * small changes in cohere.md * FIX: Address final comments for transformers integration (#13) * fix modeling final nits and add proper test file * for now leave empty tests * add integration test * push new test * fix modeling cohere (#14) * Update chat templates to use the new API (#15) --------- Co-authored-by: ahmetustun <ahmetustun89@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

Initial commit to get BERT + run_glue.py on TPU

5a44823

jysohn23 requested a review from taylanbil November 16, 2019 00:53

taylanbil reviewed Nov 18, 2019

jysohn23 commented Nov 18, 2019

jysohn23 requested a review from taylanbil November 18, 2019 19:57

jysohn23 force-pushed the tpu branch from d841b06 to d37decb Compare November 18, 2019 21:53

Add README section for TPU and address comments.

837fac2

jysohn23 force-pushed the tpu branch from d37decb to 837fac2 Compare November 18, 2019 22:33

taylanbil approved these changes Nov 18, 2019

jysohn23 merged commit e056eff into pytorch-tpu:tpu Nov 18, 2019

jysohn23 deleted the tpu branch November 18, 2019 23:44

jysohn23 restored the tpu branch November 18, 2019 23:44

lsy323 pushed a commit that referenced this pull request Oct 27, 2023

Update logits_process.py docstrings to clarify penalty and reward cas…

0b8604d

…es (attempt #2) (huggingface#26784) * Update logits_process.py docstrings + match arg fields to __init__'s * Ran `make style`
