Actions: NVIDIA/NeMo
19,870 workflow runs
attention_bias argument in transformer block and transformer layer modules, addressing change in MCore
Pull Request Labeler #26118: Pull request #11289 opened by yaoyu-33