
update from main #1

Merged

Conversation

KennethEnevoldsen
Owner

Description

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.

danieldk and others added 5 commits February 8, 2024 20:29
* Clear output of Torch SDPA for masked pieces

Since Torch 2.1, the Torch memory-efficient SDPA GPU kernel returns NaN
for pieces that are completely masked out. This leads to NaN propagation
in the next attention layer, because masked pieces get an attention of
zero, but zero times NaN is still NaN.

We fix this by zeroing the output of masked pieces to clear out any
NaNs (see the sketch after this commit message).

We currently rely on the query dimension of the mask being singular, but
in the future we should probably redesign the `AttentionMask` class to
account for the differences between attention masks and causal masks.

* black

* Update MyPy version to one that supports recent PyTorch

* Comment typos and fixes

* Add assertion message

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* black

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
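
A minimal sketch of the NaN-clearing fix described above, assuming a boolean attention mask of shape `(batch, 1, 1, seq_len)` where `True` means "attend"; the function name and shapes are illustrative and not this project's actual API:

```python
# Illustrative sketch, not code from this repository.
import torch
import torch.nn.functional as F


def attention_with_nan_clearing(query, key, value, mask):
    # query/key/value: (batch, heads, seq_len, head_dim)
    # mask: bool, (batch, 1, 1, seq_len); True = attend, False = masked piece.
    attn_out = F.scaled_dot_product_attention(query, key, value, attn_mask=mask)
    # The memory-efficient kernel can return NaN for fully masked pieces.
    # Zero those outputs so the NaNs cannot propagate through the next
    # attention layer (where 0 * NaN would still be NaN).
    return attn_out.masked_fill(mask.logical_not().transpose(-1, -2), 0.0)


q = k = v = torch.randn(2, 4, 6, 8)
mask = torch.ones(2, 1, 1, 6, dtype=torch.bool)
mask[0, ..., 4:] = False  # last two pieces of the first sequence are padding
out = attention_with_nan_clearing(q, k, v, mask)
```
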
There was a subtle bug where we populated models with parameters that are
not leaf nodes, because we called `to` on them for device placement.

This change fixes this issue and validates that all model parameters are
leaf nodes in the model tests.
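
A small sketch of the pitfall, independent of this project's own modules: `to` on a parameter that requires grad returns a non-leaf tensor whenever it actually copies.

```python
# Illustrative sketch of the non-leaf pitfall, not the project's code.
import torch

weight = torch.nn.Parameter(torch.randn(4, 4))
print(weight.is_leaf)  # True: freshly constructed parameters are leaves.

# `to` is recorded by autograd, so the copy it returns is not a leaf (and is
# a plain Tensor rather than a Parameter). Storing this in a module silently
# breaks gradient bookkeeping.
moved = weight.to(torch.float64)
print(moved.is_leaf)  # False

# Safer: create the parameter with the desired dtype/device up front, or
# re-wrap the converted tensor in torch.nn.Parameter.
weight = torch.nn.Parameter(weight.detach().to(torch.float64))
print(weight.is_leaf)  # True

# Validation in the spirit of the model tests: every parameter is a leaf.
model = torch.nn.Linear(4, 4)
assert all(p.is_leaf for p in model.parameters())
```
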
We added support for TorchScript tracing a while back so that models
can be exported to ONNX. However, that support relies on metaclasses,
which break under torch.compile in the latest PyTorch versions.
PyTorch now provides a TorchDynamo-based ONNX exporter:

https://pytorch.org/docs/stable/onnx_dynamo.html

So it's time to yank TorchScript tracing support and remove all the
fragile dataclass/tuple/dict polymorphism.
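
For reference, a minimal sketch of the TorchDynamo-based exporter linked above (PyTorch >= 2.1 with the ONNX extras installed); the model here is a placeholder, not one of this project's encoders:

```python
# Illustrative sketch; requires torch>=2.1 plus the onnx/onnxscript packages.
import torch

model = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.GELU()).eval()
example_input = torch.randn(1, 16)

onnx_program = torch.onnx.dynamo_export(model, example_input)
onnx_program.save("model.onnx")
```
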
* Fix `test_rotary_embeddings_against_hf` for latest transformers

* xfail test because HfFileSystem is currently broken
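
For context, marking a test as an expected failure in pytest looks roughly like this; the test name and reason string are illustrative, not the actual test from this PR:

```python
# Hypothetical example, not the test touched in this PR.
import pytest


@pytest.mark.xfail(reason="HfFileSystem is currently broken upstream")
def test_model_from_hf_hub():
    ...
```
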
@KennethEnevoldsen KennethEnevoldsen merged commit 8dc7b1a into KennethEnevoldsen:added_electra_encoder Apr 2, 2024