Mes/longformer on beaker copy all #250

Open
wants to merge 67 commits into base: master
Changes from all commits (67 commits)
95296ad  pretraining script (ibeltagy, Jul 16, 2020)
325693e  wip (ibeltagy, Jul 16, 2020)
985acc9  wip (ibeltagy, Jul 17, 2020)
023dd78  wip (ibeltagy, Jul 17, 2020)
08230ac  wip (ibeltagy, Jul 17, 2020)
fb65d57  . (ibeltagy, Jul 17, 2020)
0e80cde  pad chunks or start next doc (ibeltagy, Jul 17, 2020)
6ca7d1b  todo (ibeltagy, Jul 17, 2020)
a2aa4f7  wip (ibeltagy, Jul 17, 2020)
62a69d5  wip (ibeltagy, Jul 17, 2020)
3e3a478  wip (ibeltagy, Jul 18, 2020)
3bc5354  wip (ibeltagy, Jul 18, 2020)
1a91024  wip (ibeltagy, Jul 18, 2020)
5fa21f2  wip (ibeltagy, Jul 18, 2020)
18eb003  wip (ibeltagy, Jul 18, 2020)
607e446  wip (ibeltagy, Jul 18, 2020)
d4659de  wip (ibeltagy, Jul 18, 2020)
c7c53cb  wip (ibeltagy, Jul 19, 2020)
0a07daf  wip (ibeltagy, Jul 22, 2020)
827576c  wip (ibeltagy, Jul 22, 2020)
1a6498c  tpu (ibeltagy, Jul 22, 2020)
3e82548  wip (ibeltagy, Jul 22, 2020)
adadd42  wip (ibeltagy, Jul 23, 2020)
9e191a0  pretraining script (ibeltagy, Jul 16, 2020)
9d18808  wip (ibeltagy, Jul 16, 2020)
6e24cee  wip (ibeltagy, Jul 17, 2020)
a2ab9b3  wip (ibeltagy, Jul 17, 2020)
e3f4ba9  wip (ibeltagy, Jul 17, 2020)
f9e654b  . (ibeltagy, Jul 17, 2020)
9c2646d  pad chunks or start next doc (ibeltagy, Jul 17, 2020)
433a2e2  todo (ibeltagy, Jul 17, 2020)
ec47270  wip (ibeltagy, Jul 17, 2020)
77e105d  wip (ibeltagy, Jul 17, 2020)
af08b5a  wip (ibeltagy, Jul 18, 2020)
d105023  wip (ibeltagy, Jul 18, 2020)
1183999  wip (ibeltagy, Jul 18, 2020)
20e8208  wip (ibeltagy, Jul 18, 2020)
224824d  wip (ibeltagy, Jul 18, 2020)
4a12730  wip (ibeltagy, Jul 18, 2020)
c936d24  wip (ibeltagy, Jul 18, 2020)
510801b  wip (ibeltagy, Jul 19, 2020)
9184b71  wip (ibeltagy, Jul 22, 2020)
4ae991a  wip (ibeltagy, Jul 22, 2020)
aea2a98  tpu (ibeltagy, Jul 22, 2020)
69b717a  wip (ibeltagy, Jul 22, 2020)
5f641c0  wip (ibeltagy, Jul 23, 2020)
e3ddeca  wip (ibeltagy, Jul 23, 2020)
21c9e57  Merge branch 'trainer' of github.com:allenai/longformer into trainer (ibeltagy, Jul 23, 2020)
00ce1e9  wip (ibeltagy, Jul 23, 2020)
56b9c6a  wip (ibeltagy, Jul 23, 2020)
8fca187  wip (ibeltagy, Jul 25, 2020)
9dd76b7  wip (ibeltagy, Jul 25, 2020)
d40983a  wip (ibeltagy, Jul 25, 2020)
f0f6a30  wip (ibeltagy, Jul 25, 2020)
a6e37df  Merge branch 'trainer' of github.com:allenai/longformer into trainer (ibeltagy, Jul 25, 2020)
9eb6fdf  wip (ibeltagy, Jul 25, 2020)
14b6074  wip (ibeltagy, Jul 25, 2020)
5b97bd6  wip (ibeltagy, Jul 25, 2020)
71d7a9d  wip (ibeltagy, Jul 25, 2020)
97a126d  wip (ibeltagy, Jul 25, 2020)
c873da2  wip (ibeltagy, Jul 25, 2020)
d602869  faster gradnorm (ibeltagy, Jul 28, 2020)
ffd06dd  allow changing seqlen at runtime (ibeltagy, Jul 28, 2020)
129a3f9  log and resume data preprocessing (ibeltagy, Jul 30, 2020)
1c42f96  multiprocessed preprocessing (ibeltagy, Jul 30, 2020)
c20264e  wip (ibeltagy, Aug 3, 2020)
ff96351  Save this directory as a dataset and use it directly on a plain base … (meslater1030, Aug 3, 2020)
18 changes: 18 additions & 0 deletions experiment.yml
@@ -0,0 +1,18 @@
tasks:
- cluster: {{.Env.CLUSTER}}
spec:
# This is a python3.7/nvidia base image with basic libraries
image: im_j69gti4atcw9
resultPath: {{.Env.RESULT_PATH}}
args:
- /bin/bash
- -c
- "cd /longformer_on_beaker && pip install . && {{.Env.ARGS}}"
datasetMounts:
- datasetId: {{.Env.INPUT_DATASET_ID}}
containerPath: /data
- datasetId: {{.Env.SCRIPTS}}
containerPath: /longformer_on_beaker
requirements:
gpuCount: {{.Env.GPU_COUNT}}
cpu: {{.Env.CPU_COUNT}}
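Beaker resolves each {{.Env.NAME}} placeholder in the spec above from the environment of the shell that submits it, so every referenced variable must be exported before `beaker experiment create` runs. A minimal sketch of the required bindings (the cluster name and the SCRIPTS dataset id here are illustrative placeholders; INPUT_DATASET_ID reuses the default id that appears in the wrapper script):

```shell
# All {{.Env.*}} names used by experiment.yml must be bound before submission.
export CLUSTER="ai2/example-cluster"          # placeholder cluster name
export RESULT_PATH="/runs/test"               # where Beaker collects results
export ARGS="python scripts/pretrain.py --input_dir /data --save_prefix test"
export INPUT_DATASET_ID="ds_6r0phxc5fiap"     # default input dataset id
export SCRIPTS="ds_000000000000"              # placeholder: dataset holding this repo
export GPU_COUNT=1
export CPU_COUNT=6
# Submission itself needs Beaker access, so it is left commented out here:
# beaker experiment create -f experiment.yml
```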
51 changes: 51 additions & 0 deletions longformer_on_beaker.sh
@@ -0,0 +1,51 @@
#!/bin/bash

export SCRIPTS=$(beaker dataset create -q .)
export INPUT_DATASET_ID="ds_6r0phxc5fiap"
export RESULT_SAVE_DIR="/runs"
export RESULT_SAVE_PREFIX="test"
export ARGS=""
export GPU_COUNT=1
export CPU_COUNT=6
copy=("$@")
for i in "${!copy[@]}"
do
if [[ "${copy[$i]}" = "--save_dir" ]]
then
export RESULT_SAVE_DIR="${copy[$i+1]}"
fi

if [[ "${copy[$i]}" = "--input_dir" ]]
then
export INPUT_DATASET_ID=$(beaker dataset create -q "${copy[$i+1]}")
copy[$i+1]="/data"
fi

if [[ "${copy[$i]}" = "--save_prefix" ]]
then
export RESULT_SAVE_PREFIX="${copy[$i+1]}"
fi

if [[ "${copy[$i]}" = "--num_workers" ]]
then
export CPU_COUNT="${copy[$i+1]}"
fi

if [[ "${copy[$i]}" = "--gpu_count" ]]
then
export GPU_COUNT="${copy[$i+1]}"
fi
ARGS="$ARGS ${copy[$i]}"
done

# If an input dataset was not specified, use the default
if [[ "$INPUT_DATASET_ID" = "ds_6r0phxc5fiap" ]]
then
ARGS="$ARGS --input_dir /data"
fi

echo "$ARGS"

export RESULT_PATH=$RESULT_SAVE_DIR/$RESULT_SAVE_PREFIX

beaker experiment create -f experiment.yml
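The flag scan in the wrapper can be exercised without a Beaker installation. This condensed sketch replays the same loop on a stand-in argument list, with the `beaker` calls stubbed out, to show how flags like `--save_prefix` and `--gpu_count` feed both the exported variables and the forwarded ARGS string:

```shell
# Stand-in argument list (pure bash; no beaker calls).
set -- python scripts/pretrain.py --save_prefix demo --gpu_count 2 --num_workers 4
RESULT_SAVE_DIR="/runs"; RESULT_SAVE_PREFIX="test"
GPU_COUNT=1; CPU_COUNT=6; ARGS=""
copy=("$@")
for i in "${!copy[@]}"; do
    case "${copy[$i]}" in
        --save_dir)    RESULT_SAVE_DIR="${copy[$i+1]}" ;;
        --save_prefix) RESULT_SAVE_PREFIX="${copy[$i+1]}" ;;
        --num_workers) CPU_COUNT="${copy[$i+1]}" ;;
        --gpu_count)   GPU_COUNT="${copy[$i+1]}" ;;
    esac
    ARGS="$ARGS ${copy[$i]}"    # every token is forwarded verbatim
done
RESULT_PATH="$RESULT_SAVE_DIR/$RESULT_SAVE_PREFIX"
echo "$RESULT_PATH"   # /runs/demo
```

In the real script, `--input_dir` additionally triggers a `beaker dataset create` upload, which is why that loop rewrites the value to the in-container path /data before forwarding it.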
7 changes: 4 additions & 3 deletions requirements.txt
@@ -1,5 +1,6 @@
-torch>=1.2.0
-transformers>=3.0.2
-pytorch-lightning @ git+http://github.com/ibeltagy/pytorch-lightning.git@v0.8.5_fixes#egg=pytorch-lightning
+torch==1.3.1
+transformers==3.0.2
 tensorboardX
+pytorch-lightning==0.6.0
+test-tube==0.7.5
15 changes: 15 additions & 0 deletions scripts/cheatsheet.txt
Original file line number Diff line number Diff line change
Expand Up @@ -70,3 +70,18 @@ python -m scripts.triviaqa_utils.evaluation_utils \
--prediction_file predictions.json
# Output should be:
{'exact_match': 73.07644188665083, 'f1': 77.78523804802242, 'common': 7993, 'denominator': 7993, 'pred_len': 7993, 'gold_len': 7993}


# TPU
import torch_xla.debug.metrics as met; print(met.metrics_report())
curl -X POST http://10.125.212.42:8475/requestversion/pytorch-dev20200722

/usr/share/torch-xla-nightly/pytorch/xla/scripts/debug_run.py --outfile debug.tar.gz -- python -u scripts/test_tpu.py

/usr/share/torch-xla-nightly/pytorch/xla/scripts/debug_run.py --outfile debug.tar.gz -- python -u scripts/pretrain.py --input_dir data/ --save_prefix test_xla_2 --gpu_count 0 --tpu_core_count 1 --val_batches 4 --val_every 130 --num_workers 0 --log_rate 1 --model allenai/longformer-base-4096

python scripts/pretrain.py --input_dir data/ --save_prefix test_grad_accum --gpu_count 0 --tpu_core_count 8 --val_batches 30 --val_every 30 --num_workers 0 --log_rate 1

export TPU_IP_ADDRESS=10.125.212.42
export XRT_TPU_CONFIG="tpu_worker;0;$TPU_IP_ADDRESS:8470"
source /anaconda3/bin/activate torch-xla-nightly
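The XRT_TPU_CONFIG export above follows the XRT worker-spec shape <job>;<task index>;<host>:<port>, with 8470 as the conventional XRT port; composing it from the TPU IP can be sketched as:

```shell
TPU_IP_ADDRESS="10.125.212.42"                       # TPU worker IP from the cheatsheet
XRT_TPU_CONFIG="tpu_worker;0;$TPU_IP_ADDRESS:8470"   # job=tpu_worker, task index 0
echo "$XRT_TPU_CONFIG"   # tpu_worker;0;10.125.212.42:8470
```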