What is the suggested config for running LRA exps with Hyena? #36

pone7 · 2023-09-12T05:03:47Z

I tried the Hyena model on the PathX experiment but got bad results (val/loss=nan, and the grad_norm of the later layers is close to infinity).
My config:
```yaml
# @package _global_
defaults:
  - /pipeline: pathx
  - override /scheduler: cosine_warmup

scheduler:
  num_training_steps: 125000 # 50 epochs
  num_warmup_steps: 2500 # 1 epoch

model:
  _name_: model
  n_layers: 6
  d_model: 256
  norm: batch
  layer:
    _name_: hyena
    emb_dim: 3
    filter_order: 64
    local_order: 3
    modulate: True
    l_max: 16384
    w: 1
    lr: ${optimizer.lr}
    lr_pos_emb: ${optimizer.lr}
    return_state: True

loader:
  batch_size: 25

optimizer:
  lr: 0.0005
  weight_decay: 0.05

trainer:
  max_epochs: 50

train:
  seed: 2222
  interval: step # For cosine scheduler
```

My command:
python -m train trainer.devices=8 experiment=lra/hyena-lra-pathx +dataset.data_dir=./data/pathfinder128

Is there something I've set up wrong, please?
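
For reference, the step counts in the `scheduler` section can be sanity-checked against the epoch comments with a small calculation. This is only a sketch under stated assumptions: the helper names are made up for illustration, `train_examples` is a placeholder that has to be replaced with the real training-set size, and whether `loader.batch_size` is applied per device or globally is an assumption that depends on how the training script shards batches.

```python
import math


def implied_steps_per_epoch(num_training_steps: int, max_epochs: int) -> float:
    """Steps per epoch implied by the scheduler length and the epoch budget."""
    return num_training_steps / max_epochs


def steps_per_epoch(train_examples: int, batch_size: int, devices: int = 1,
                    batch_is_per_device: bool = True) -> int:
    """Optimizer steps in one epoch, keeping the last partial batch."""
    # Assumption: with a per-device batch, the global batch grows with the device count.
    global_batch = batch_size * devices if batch_is_per_device else batch_size
    return math.ceil(train_examples / global_batch)


# Values taken from the config and command above.
implied = implied_steps_per_epoch(num_training_steps=125000, max_epochs=50)

# Placeholder only: substitute the actual number of training examples.
train_examples = 100_000
actual = steps_per_epoch(train_examples, batch_size=25, devices=8)

print(f"scheduler implies {implied:.0f} steps/epoch, loader gives {actual} steps/epoch")
```

If the two numbers differ a lot, the cosine schedule and the warmup would span a different number of epochs than the `# 50 epochs` and `# 1 epoch` comments suggest, which changes the learning rate the model sees early in training.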
