performance loss from 1.0.8 to 1.1.* when using 16 bit precision #5159

Closed
@immanuelweber

Description

🐛 Bug

After updating pytorch-lightning from 1.0.8 to 1.1.0/1.1.1, using 16 bit precision destroys performance.
In my object detection code, the initial losses are about a factor of 4 larger than with 32 bit precision, or with 16 bit precision on pl 1.0.8.
They converge to a much higher value, and the resulting model completely loses its detection capability.
To replicate this, I tested the pl notebooks: 06-cifar10-baseline.ipynb shows the same problem, with classification accuracy dropping to chance level when switching from 32 to 16 bit.
I also integrated it into the BoringModel notebook, and the problem reproduces on Google Colab.

Please reproduce using the BoringModel and post here

https://colab.research.google.com/drive/1FqXG9Xw9gVZxnwiGnjsHpAtb-vUqFaob?usp=sharing

To Reproduce
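
A minimal sketch in the spirit of the BoringModel (the model and dataset below are illustrative placeholders, not the exact code from the linked Colab): the only relevant change is `precision=16` vs `precision=32` in the `Trainer`.

```python
import torch
from torch.utils.data import DataLoader, Dataset
import pytorch_lightning as pl


class RandomDataset(Dataset):
    # toy dataset standing in for the real data
    def __init__(self, size, length):
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return len(self.data)


class BoringModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


if __name__ == "__main__":
    model = BoringModel()
    train_loader = DataLoader(RandomDataset(32, 64), batch_size=2)
    # precision=16 shows the degraded losses on 1.1.0/1.1.1;
    # precision=32 (or 16 bit on 1.0.8) behaves as expected
    trainer = pl.Trainer(gpus=1, precision=16, max_epochs=1)
    trainer.fit(model, train_loader)
```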

Expected behavior

Same performance with 32 bit and 16 bit precision.

Environment

  • CUDA:
    • GPU:
      • Tesla P100-PCIE-16GB
    • available: True
    • version: 10.1
  • Packages:
    • numpy: 1.18.5
    • pyTorch_debug: True
    • pyTorch_version: 1.7.0+cu101
    • pytorch-lightning: 1.1.1
    • tqdm: 4.41.1
  • System:
    • OS: Linux
    • architecture:
      • 64bit
    • processor: x86_64
    • python: 3.6.9
• version: #1 SMP Thu Jul 23 08:00:38 PDT 2020

Additional context

Metadata

Labels

bug (Something isn't working), help wanted (Open to be worked on), priority: 0 (High priority task)
