
Test newly uploaded Flan-T5 weights #2074

Merged
joecummings merged 7 commits from test-flan-weights into pytorch:main on Feb 27, 2023

Conversation

@joecummings (Contributor):

This PR adds tests for the Flan-T5 weights and confirms that build_hf_checkpoint_from_path works w/ Flan.

@joecummings (Contributor, Author):

This should be cherry-picked into the release as it's just more test coverage.

@joecummings requested review from Nayef211 and rshraga and removed the review request for Nayef211 on February 24, 2023 at 22:32
@@ -122,7 +122,7 @@ The library currently consist of following pre-trained models:
* `DistilRoBERTa <https://github.com/huggingface/transformers/blob/main/examples/research_projects/distillation/README.md>`_
* XLM-RoBERTa: `Base and Large Architure <https://github.com/pytorch/fairseq/tree/main/examples/xlmr#pre-trained-models>`_
* T5: `Small, Base, Large, 3B, and 11B Architecture <https://github.com/google-research/text-to-text-transfer-transformer>`_
* Flan-T5: `Small, Base, Large, XL, and XXL Architecture <https://github.com/google-research/t5x>`_
@joecummings (Author) commented on the diff:

We don't actually support Flan-T5 Small, because its embedding dimension is not divisible by its number of attention heads.
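
For context, a minimal sketch of the constraint (not from this PR), using Flan-T5 Small's published configuration of d_model=512 with 6 attention heads:

# Flan-T5 Small uses d_model=512 with num_heads=6; standard multi-head
# attention requires the embedding dim to split evenly across heads,
# so this configuration is rejected.
d_model, num_heads = 512, 6
if d_model % num_heads != 0:
    raise ValueError(
        f"embed_dim ({d_model}) is not divisible by num_heads ({num_heads})"
    )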

@@ -55,7 +55,6 @@ jobs:
python3 -m pip --quiet install sentencepiece
python3 -m pip --quiet install tqdm
python3 -m pip --quiet install expecttest
-python3 -m pip --quiet install transformers
@joecummings (Author) commented on the diff:

No longer need transformers install for integration tests.

-decoder_attention_mask=self.decoder_padding_mask,
-output_hidden_states=True,
-output_attentions=True,
+model = T5Bundle.build_model_from_huggingface_ckpt(model_path, encoder_only=is_encoder_only)
@joecummings (Author) commented on the diff:

The tests above already check the outputs against the HF models, so here we just need to confirm that we can load the weights from an HF checkpoint file and that the model runs.
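
A hedged sketch of such a load-and-run smoke test: build_model_from_huggingface_ckpt is the bundle method exercised in this PR, while the checkpoint path, forward-call signature, and dummy token values are illustrative assumptions.

import torch
from torchtext.models import T5Bundle

# Placeholder path to a locally downloaded Flan-T5 checkpoint (assumption).
model_path = "/path/to/flan_t5_base/"

# The call under test: build a torchtext T5 model from HF weights.
model = T5Bundle.build_model_from_huggingface_ckpt(model_path, encoder_only=False)
model.eval()

# Dummy token ids (32100 is T5's vocab size). We only check that a forward
# pass executes; output correctness is covered by the comparison tests above.
tokens = torch.randint(0, 32100, (1, 8))
with torch.no_grad():
    output = model(encoder_tokens=tokens, decoder_tokens=tokens)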

@@ -238,12 +239,12 @@ def build_model_from_huggingface_ckpt(

for i in range(config.num_decoder_layers):
if config.is_gated_act:
t5_model_state_dict[f"encoder.layers.{i}.linear1_0.weight"] = hf_weights[
@joecummings (Author) commented on the diff:

This was a bug: the loop iterates over decoder layers, but the weights were being written into the encoder's state-dict keys.
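
A hedged reconstruction of the fix: the variable names come from the function shown in the diff, and the HF-side key is assumed from HF's usual T5 naming for gated activations (feed-forward sublayer of decoder block i).

# Inside the loop over decoder layers, the gated-act weight must land in a
# decoder key; the buggy line wrote it into encoder.layers.{i} instead.
for i in range(config.num_decoder_layers):
    if config.is_gated_act:
        t5_model_state_dict[f"decoder.layers.{i}.linear1_0.weight"] = hf_weights[
            f"decoder.block.{i}.layer.2.DenseReluDense.wi_0.weight"  # assumed HF key
        ]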

@joecummings (Author):

I need to confirm that the weights I generated before are correct.

@Nayef211 (Contributor):

> This should be cherry-picked into the release as it's just more test coverage.

Could you create a release tracker issue similar to #1766?

@Nayef211 (Contributor) left a review:

Overall LGTM

our_output["decoder_hidden_states"][i], hf_output.decoder_hidden_states[i]
), f"Mismatched hidden states for decoder layer {i}"

def test_t5_bundler_load_hf_ckpt_pretrained_encoder_only(self) -> None:
@Nayef211 (Contributor) commented on the diff:

Why do we get rid of all of these tests?

@joecummings (Author):

We do the actual testing of the weights in the tests above. Here we just want to make sure the model can be loaded and run.
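
For context, a hedged sketch of the layer-by-layer comparison the retained tests perform; the argument names mirror the fragment quoted above, and the tolerance is an assumption.

import torch

def assert_decoder_hidden_states_match(our_output, hf_output, num_layers, atol=1e-4):
    # our_output is torchtext's output dict and hf_output an HF ModelOutput,
    # matching the test fragment above; atol is an assumed tolerance.
    for i in range(num_layers):
        assert torch.allclose(
            our_output["decoder_hidden_states"][i],
            hf_output.decoder_hidden_states[i],
            atol=atol,
        ), f"Mismatched hidden states for decoder layer {i}"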

@joecummings mentioned this pull request on Feb 27, 2023 (8 tasks)
@joecummings merged commit a1dc61b into pytorch:main on Feb 27, 2023
@joecummings deleted the test-flan-weights branch on February 27, 2023 at 19:18
joecummings added a commit that referenced this pull request Feb 27, 2023
* Add tests for loading Flan-T5 weights from HF checkpoints

* Add expected outputs and update tests for Flan

* Add newline at end of file

* pin transformers version for testing

* Simplify test for HF loading

* Fix linting

* Fix integration tests w/ proper download path
joecummings added a commit that referenced this pull request Feb 28, 2023
* Add tests for loading Flan-T5 weights from HF checkpoints

* Add expected outputs and update tests for Flan

* Add newline at end of file

* pin transformers version for testing

* Simplify test for HF loading

* Fix linting

* Fix integration tests w/ proper download path