Commit fd56360

Update on "[RFC] Lift freqs_cis as an input of models"
freqs_cis is sensitive to the sequence order. CP load balancing shuffles the samples, so each batch has a different order. As a result, we have to lift these order-sensitive buffers to the model inputs and broadcast them along the batch dimension so that PP shards freqs_cis correctly.

Pull-Request-resolved: #1797 [ghstack-poisoned]
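
The idea in the message above can be illustrated with a minimal, framework-free sketch. It is not the code from this commit: the helper names (`precompute_freqs_cis`, `lift_freqs_cis`, `split_microbatches`), the `theta` default, and the example `positions` order are all assumptions made for illustration, and plain Python lists stand in for tensors. The table is reordered to match the shuffled token order, repeated along the batch dimension, and then split along that dimension the same way a pipeline schedule would split any other batched input.

```python
import cmath


def precompute_freqs_cis(dim, seq_len, theta=10000.0):
    """Return a seq_len x (dim // 2) table of unit complex rotations."""
    inv_freq = [theta ** (-2 * i / dim) for i in range(dim // 2)]
    return [[cmath.exp(1j * pos * f) for f in inv_freq] for pos in range(seq_len)]


def lift_freqs_cis(freqs_cis, positions, batch_size):
    """Reorder rows to match the token order, then repeat once per sample.

    The result is indexed as [batch][seq][dim // 2], so it can be split
    along the batch dimension like any other model input.
    """
    reordered = [freqs_cis[p] for p in positions]
    return [reordered for _ in range(batch_size)]


def split_microbatches(batched, n_microbatches):
    """Split a batched input along dimension 0 into equal microbatches."""
    size = len(batched) // n_microbatches
    return [batched[i * size:(i + 1) * size] for i in range(n_microbatches)]


table = precompute_freqs_cis(dim=4, seq_len=4)
positions = [0, 3, 1, 2]  # hypothetical token order after load balancing
lifted = lift_freqs_cis(table, positions, batch_size=4)
microbatches = split_microbatches(lifted, n_microbatches=2)
```

Each microbatch carries its own copy of the reordered table, so a pipeline stage never has to guess which sequence order its samples use.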
1 parent dffadc0 commit fd56360

0 file changed

+0
-0
lines changed
