I tried the Hyena model on the PathX experiment but got bad results: val/loss is NaN and the grad_norm of the later layers is nearly infinite.
My config:
```yaml
# @package _global_
defaults:

scheduler:
  num_training_steps: 125000  # 50 epochs
  num_warmup_steps: 2500  # 1 epoch

model:
  name: model
  n_layers: 6
  d_model: 256
  norm: batch
  layer:
    name: hyena
    emb_dim: 3
    filter_order: 64
    local_order: 3
    modulate: True
    l_max: 16384
    w: 1
    lr: ${optimizer.lr}
    lr_pos_emb: ${optimizer.lr}
    return_state: True

loader:
  batch_size: 25

optimizer:
  lr: 0.0005
  weight_decay: 0.05

trainer:
  max_epochs: 50

train:
  seed: 2222
  interval: step  # For cosine scheduler
```
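(The `lr: ${optimizer.lr}` and `lr_pos_emb: ${optimizer.lr}` entries are OmegaConf interpolations, so the per-layer learning rates just mirror the top-level optimizer setting. A minimal sketch of how that resolution works, using plain OmegaConf outside of Hydra, in case it matters for reproducing:)

```python
from omegaconf import OmegaConf

# Minimal sketch (not the repo's code): ${optimizer.lr} is an OmegaConf
# interpolation, so model.layer.lr resolves to the optimizer's lr on access.
cfg = OmegaConf.create({
    "optimizer": {"lr": 0.0005},
    "model": {"layer": {"lr": "${optimizer.lr}", "lr_pos_emb": "${optimizer.lr}"}},
})
assert cfg.model.layer.lr == 0.0005
assert cfg.model.layer.lr_pos_emb == 0.0005
```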
My command:
```
python -m train trainer.devices=8 experiment=lra/hyena-lra-pathx +dataset.data_dir=./data/pathfinder128
```
Is there something I've set up incorrectly?
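In case it helps to narrow down where the blow-up starts, here is a minimal PyTorch sketch of how per-layer gradient norms can be inspected after `loss.backward()`; `per_layer_grad_norms` is just an illustrative helper, not something from this repo:

```python
import math
import torch

def per_layer_grad_norms(model: torch.nn.Module) -> dict:
    """Return the L2 norm of each parameter's gradient, keyed by parameter name.

    Illustrative helper only (not part of the repo); call it after loss.backward()
    and before optimizer.step() to see which layers blow up.
    """
    return {
        name: p.grad.detach().norm(2).item()
        for name, p in model.named_parameters()
        if p.grad is not None
    }

# Usage inside a training step:
#   loss.backward()
#   for name, g in per_layer_grad_norms(model).items():
#       if not math.isfinite(g):
#           print(f"non-finite gradient in {name}: {g}")
```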