Add attention_bias argument in transformer block and transformer layer modules, addressing change in MCore #6344

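As the title suggests, the change threads a new optional `attention_bias` argument through the transformer block and transformer layer forward signatures so it reaches the attention computation. The sketch below illustrates that pattern with hypothetical `Sketch*` classes built on plain `torch.nn` modules; the class names, signatures, and bias handling are assumptions for illustration, not the actual Megatron-Core or NeMo API.

```python
# A minimal sketch (hypothetical names, not the Megatron-Core API) of how an
# optional attention_bias can be threaded from a transformer block down
# through each layer into the attention module.
from typing import Optional

import torch
import torch.nn as nn


class SketchSelfAttention(nn.Module):
    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)

    def forward(
        self,
        hidden_states: torch.Tensor,
        attention_bias: Optional[torch.Tensor] = None,
    ) -> torch.Tensor:
        # A float attention_bias acts as an additive term on the attention
        # scores; nn.MultiheadAttention accepts it via attn_mask.
        out, _ = self.attn(
            hidden_states, hidden_states, hidden_states, attn_mask=attention_bias
        )
        return out


class SketchTransformerLayer(nn.Module):
    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        self.norm = nn.LayerNorm(hidden_size)
        self.self_attention = SketchSelfAttention(hidden_size, num_heads)

    def forward(
        self,
        hidden_states: torch.Tensor,
        attention_bias: Optional[torch.Tensor] = None,
    ) -> torch.Tensor:
        # The layer simply forwards attention_bias to its attention module.
        return hidden_states + self.self_attention(
            self.norm(hidden_states), attention_bias=attention_bias
        )


class SketchTransformerBlock(nn.Module):
    def __init__(self, num_layers: int, hidden_size: int, num_heads: int):
        super().__init__()
        self.layers = nn.ModuleList(
            SketchTransformerLayer(hidden_size, num_heads)
            for _ in range(num_layers)
        )

    def forward(
        self,
        hidden_states: torch.Tensor,
        attention_bias: Optional[torch.Tensor] = None,
    ) -> torch.Tensor:
        # The block passes the same attention_bias to every layer.
        for layer in self.layers:
            hidden_states = layer(hidden_states, attention_bias=attention_bias)
        return hidden_states


if __name__ == "__main__":
    block = SketchTransformerBlock(num_layers=2, hidden_size=64, num_heads=4)
    x = torch.randn(8, 16, 64)   # (batch, seq, hidden)
    bias = torch.zeros(16, 16)   # additive bias over the (query, key) scores
    y = block(x, attention_bias=bias)
    print(y.shape)               # torch.Size([8, 16, 64])
```

Because the bias is applied additively to the attention scores, the same hook can carry masking (large negative values) or relative-position biases without changing the block and layer signatures again.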

Annotations: 2 errors

Job: L2_Megatron_Core_T5_Pretraining_and_Resume_Training_TP2 / main
Result: cancelled Nov 15, 2024 in 2m 27s