[FEA] Convert the output of TransformerBlock to ragged tensor #1025

sararb · 2023-03-21T14:40:35Z

HuggingFace transformer layers require dense inputs, so we convert ragged inputs to dense before calling the TransformerBlock. As a result of this conversion, the outputs are also dense.

This approach can be costly because logit scores are then computed for all positions, including the padded ones. For example, with weight tying the hidden representation at every position is multiplied by the embeddings of all items, so each padded position adds a full row of wasted computation.

It would be helpful to convert the output of the transformer block to a ragged format, which would eliminate the need for padding and avoid unnecessary computation.
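To make the cost concrete, here is a minimal NumPy sketch of the idea, not the Merlin Models implementation. It assumes right-padded sequences and a known length per row; `ragged_logits`, `hidden`, `lengths`, and `item_embeddings` are hypothetical names chosen for illustration. The "ragged" result is represented as flat values plus row lengths, which is roughly what a ragged tensor stores internally.

```python
import numpy as np


def ragged_logits(hidden, lengths, item_embeddings):
    """Score only the non-padded positions of a dense transformer output.

    hidden:          (batch, max_len, dim) dense output, right-padded (assumption)
    lengths:         (batch,) true length of each sequence
    item_embeddings: (num_items, dim) tied item-embedding table
    Returns flat logits of shape (sum(lengths), num_items) and the row lengths.
    """
    hidden = np.asarray(hidden)
    lengths = np.asarray(lengths)
    max_len = hidden.shape[1]
    # mask[b, t] is True where position t of sequence b holds a real item.
    mask = np.arange(max_len)[None, :] < lengths[:, None]
    # Boolean indexing drops padded positions and keeps row-major order.
    flat_hidden = hidden[mask]
    # Weight-tying product, now only over valid positions.
    flat_logits = flat_hidden @ item_embeddings.T
    return flat_logits, lengths


rng = np.random.default_rng(0)
hidden = rng.normal(size=(3, 5, 4))          # batch=3, max_len=5, dim=4
lengths = np.array([2, 5, 1])                # 8 valid positions out of 15
item_embeddings = rng.normal(size=(10, 4))   # 10 items

flat_logits, row_lengths = ragged_logits(hidden, lengths, item_embeddings)
dense_logits = hidden @ item_embeddings.T    # scores padded positions too

print(flat_logits.shape, dense_logits.shape)  # (8, 10) (3, 5, 10)
```

In this toy batch the dense path scores 15 positions against the item table while the ragged path scores 8. In TensorFlow, `tf.RaggedTensor.from_tensor(dense, lengths=...)` performs a comparable dense-to-ragged conversion.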

sararb added the enhancement and area/session-based labels Mar 21, 2023

sararb added this to the Merlin 23.03 milestone Mar 21, 2023

sararb mentioned this issue Mar 21, 2023 in "New design of the transformer API" #1022 (merged, 12 tasks)

karlhigley closed this as completed in #1022 Mar 21, 2023
