Add a mixed precision test and fix mixed precision errors for layers #1242
Conversation
/gcbrun
Ah, I think this needs a keras-core release to pass. Will sync with Francois.
Force-pushed from 44b1005 to fc4f0f5
```diff
  )
  intermediate_shape = list(decoder_sequence_shape)
  intermediate_shape[-1] = self.intermediate_dim
  self._feedforward_output_dense.build(tuple(intermediate_shape))
  self._feedforward_layer_norm = keras.layers.LayerNormalization(
      epsilon=self.layer_norm_epsilon,
-     name="output_layer_norm",
+     dtype=self.dtype_policy,
+     name="feedforward_layer_norm",
```
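For context on the `dtype=self.dtype_policy` line, here is a minimal sketch, assuming the Keras 3 dtype-policy API, of why the parent's policy has to be threaded into sublayers explicitly: a sublayer constructed without a `dtype` argument picks up the global policy, not the enclosing layer's.

```python
import keras

# Under a mixed global policy, a sublayer built without an explicit
# dtype follows the global policy rather than its parent layer's.
keras.mixed_precision.set_dtype_policy("mixed_float16")

explicit = keras.layers.LayerNormalization(dtype="float32")
implicit = keras.layers.LayerNormalization()
print(explicit.compute_dtype)  # float32: matches the dtype passed in
print(implicit.compute_dtype)  # float16: inherited from the global policy
```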
Just a random thought while looking at these: with the new distribution API, we're going to need to be careful about changing layer names moving forward!
Yep! Trying to get us in a nice consistent state before we do.
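To make the concern concrete, a hedged sketch assuming the Keras 3 distribution API: `LayoutMap` keys are regexes matched against variable paths, and those paths embed layer names, so a rename like `output_layer_norm` to `feedforward_layer_norm` silently orphans any sharding rule keyed on the old name. The regex key below is illustrative, not from this PR.

```python
import keras

# Sketch only: build a mesh over the available devices with a
# "model" axis for sharding weights.
devices = keras.distribution.list_devices()
mesh = keras.distribution.DeviceMesh(
    shape=(1, len(devices)), axis_names=("batch", "model"), devices=devices
)
layout_map = keras.distribution.LayoutMap(mesh)
# Keys are regexes over variable paths, which include layer names.
# If the layer is renamed, this rule no longer matches anything.
layout_map["feedforward_intermediate_dense.*kernel"] = (None, "model")
```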
keras_nlp/samplers/sampler.py (Outdated)
```python
        This will always be done in full precision, regardless of dtype, and
        scale by `temperature`.
        """
        dtype = logits.dtype
```
Maybe `logits_dtype`? (For consistency of style with `inputs_dtype` a few files up.)
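For reference, a minimal sketch of the pattern the docstring describes, using the reviewer's suggested `logits_dtype` name; the function name and exact placement are illustrative, not the actual `sampler.py` code:

```python
from keras import ops

def compute_probabilities(logits, temperature=1.0):
    # Run softmax in float32 regardless of the compute dtype, then
    # cast back, so low-precision logits don't lose probability mass.
    logits_dtype = logits.dtype  # the reviewer's suggested name
    logits = ops.cast(logits, "float32")
    probabilities = ops.softmax(logits / temperature)
    return ops.cast(probabilities, logits_dtype)
```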
keras_nlp/tests/test_case.py (Outdated)
```python
        output_data = layer(input_data)
        for tensor in tree.flatten(output_data):
            dtype = standardize_dtype(tensor.dtype)
            if "float" in dtype:
```
Seems like `assertDType` should be good here, no?
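A hedged sketch of what the suggested helper could look like; the name `assertDType` comes from the comment above, but the body, base class, and import paths are assumptions:

```python
import unittest

import tree  # dm-tree, used by keras-nlp for nested structures
from keras import backend


class TestCase(unittest.TestCase):
    def assertDType(self, output_data, expected_dtype):
        # Check every floating-point tensor in a (possibly nested)
        # output structure against the expected compute dtype.
        for tensor in tree.flatten(output_data):
            dtype = backend.standardize_dtype(tensor.dtype)
            if "float" in dtype:
                self.assertEqual(dtype, expected_dtype)
```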
Force-pushed from 03c1c85 to 79c0219
Is mixed precision working in Keras Core now?
Force-pushed from aefe4ed to 926a737
Yes, or at least it should be, largely, with the latest release. We landed the loss scaling optimizer, which was the main piece we were missing, as well as a few other fixes.
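For anyone following along, a hedged sketch of what that unblocks, assuming the Keras 3 API names (earlier keras-core releases may differ). Under a `mixed_float16` policy, `compile()` is expected to apply loss scaling automatically; the wrapping is shown explicitly here for clarity.

```python
import numpy as np
import keras

keras.mixed_precision.set_dtype_policy("mixed_float16")

model = keras.Sequential([
    keras.layers.Dense(16, activation="relu"),
    # Keep the final layer float32 so outputs stay numerically stable.
    keras.layers.Dense(1, dtype="float32"),
])
# Loss scaling multiplies the loss before backprop and divides the
# gradients after, avoiding float16 gradient underflow.
optimizer = keras.optimizers.LossScaleOptimizer(keras.optimizers.Adam())
model.compile(optimizer=optimizer, loss="mse")
model.fit(np.random.rand(32, 8), np.random.rand(32, 1), epochs=1)
```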
/gcbrun
(5 similar comments)
/gcbrun
/gcbrun
/gcbrun
/gcbrun
/gcbrun
Merging; the last breakage is unrelated (#1251).
No description provided.