Add attention_bias argument in transformer block and transformer layer modules, addressing change in MCore #6344
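A minimal sketch of what threading an `attention_bias` argument through a transformer block and layer might look like. The module classes, forward signatures, and the use of `nn.MultiheadAttention` here are illustrative assumptions, not the actual MCore/NeMo implementation; the point is only that the block forwards the new argument down to each layer, and `None` preserves the previous behavior.

```python
import torch
import torch.nn as nn

class TransformerLayer(nn.Module):
    """Hypothetical layer accepting an optional additive attention bias."""

    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        self.self_attention = nn.MultiheadAttention(
            hidden_size, num_heads, batch_first=True
        )
        self.norm = nn.LayerNorm(hidden_size)

    def forward(self, hidden_states, attention_bias=None):
        # attention_bias is passed to the attention op as an additive mask;
        # attention_bias=None keeps the call backward compatible.
        attn_out, _ = self.self_attention(
            hidden_states, hidden_states, hidden_states,
            attn_mask=attention_bias,
        )
        return self.norm(hidden_states + attn_out)

class TransformerBlock(nn.Module):
    """Hypothetical block that forwards attention_bias to every layer."""

    def __init__(self, num_layers: int, hidden_size: int, num_heads: int):
        super().__init__()
        self.layers = nn.ModuleList(
            TransformerLayer(hidden_size, num_heads) for _ in range(num_layers)
        )

    def forward(self, hidden_states, attention_bias=None):
        for layer in self.layers:
            hidden_states = layer(hidden_states, attention_bias=attention_bias)
        return hidden_states
```

In this sketch, a float tensor of shape `(seq_len, seq_len)` passed as `attention_bias` would be added to the attention logits, while omitting the argument leaves existing call sites unchanged.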

CI check: L2_NMT_Attention_is_All_You_Need_Training_NMT_Training_Post-LN / main succeeded Nov 14, 2024 in 1m 12s