
Dropout with prob == 0 doesn't validate consistently #1799

Closed
csarofeen opened this issue Jul 3, 2022 · 3 comments · Fixed by #1804

@csarofeen (Owner)

🐛 Describe the bug

The following script doesn't validate consistently on TOT. It seems we may still be dropping out some values even though probability == 0. I think this may be because of https://github.com/csarofeen/pytorch/blob/devel/torch/csrc/jit/codegen/cuda/ops/composite.cpp#L31, which maybe should use le instead of lt?

import functools
import random
from typing import List

import torch
import torch.nn.functional as F

def composite_definition(
    input1: torch.Tensor,
    input2: torch.Tensor,
    weight: torch.Tensor,
    bias1: torch.Tensor,
    bias2: torch.Tensor,
    normalization_axis: int,
    dropout_prob: float,
) -> torch.Tensor:
    bias1_out = input1 + bias1
    dropout_out = F.dropout(bias1_out, 0.0, True)
    norm_input = dropout_out + input2
    norm_output = F.layer_norm(norm_input, (input1.size(normalization_axis),), weight, bias2)
    return norm_output

# Setup initial tensors and parameters
input_size = [64, 128, 1024]
device = "cuda"
dtype = torch.float32

# Create sample inputs
input1 = torch.randn(*input_size, device=device, dtype=dtype, requires_grad=True)
input2 = torch.rand_like(input1).requires_grad_()
 
# Precompute a grad output tensor, for this example it's the same size as the inputs
grad_output = torch.rand_like(input1)
 
# Randomly initialize the model parameters
weight = torch.nn.Parameter(torch.randn(input_size[2], dtype=dtype, device=device))
bias1 = torch.nn.Parameter(torch.randn(input_size[2], dtype=dtype, device=device))
bias2 = torch.nn.Parameter(torch.randn(input_size[2], dtype=dtype, device=device))

parameters = [input1, input2, weight, bias1, bias2]
ref_composite = composite_definition(input1, input2, weight, bias1, bias2, normalization_axis=2, dropout_prob=0.0)

scripted_composite_definition = torch.jit.script(composite_definition)

for i in range(20):
    scripted = scripted_composite_definition(input1, input2, weight, bias1, bias2, normalization_axis=2, dropout_prob=0.0)
    print("output abs max {}".format((ref_composite - scripted).abs().max()))

Versions

TOT

@IvanYashchuk (Collaborator) commented Jul 4, 2022

Using le instead of lt seems to fix the problem, but I don't think it's correct. It's just an indicator that nvFuser's randlike function produces 1.0, which it shouldn't if it's supposed to correspond to torch.rand_like, which samples from a uniform distribution on the interval [0, 1), i.e. exclusive of 1.0.
curand_uniform (is that what's used in nvFuser?) reverses the interval bounds: it excludes 0.0 and includes 1.0.

torch.native_dropout has the same problem. There's an interesting conditional: torch.native_dropout is used for F.dropout only if p > 0 && p < 1:
https://github.com/pytorch/pytorch/blob/76cff182428fbd165b5725f3de29dbd91a1512fa/aten/src/ATen/native/Dropout.cpp#L28-L30
torch.native_dropout shows the same behavior because < p is used in the CUDA implementation with curand_uniform:
https://github.com/pytorch/pytorch/blob/76cff182428fbd165b5725f3de29dbd91a1512fa/aten/src/ATen/native/cuda/Dropout.cu#L96-L100
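
A minimal sketch of the endpoint problem in plain PyTorch, assuming the mask in composite.cpp is computed roughly as rand_like(x) < (1 - p) (the exact operands are an assumption here, not copied from the file):

import torch

p = 0.0
# 1.0 can only appear if the RNG samples from (0, 1] like curand_uniform;
# torch.rand_like samples from [0, 1) and never returns 1.0.
rand_vals = torch.tensor([0.0, 0.5, 1.0])

keep_lt = rand_vals < (1.0 - p)   # lt: the element that drew exactly 1.0 gets dropped
keep_le = rand_vals <= (1.0 - p)  # le: everything is kept when p == 0 ...

print(keep_lt)  # tensor([ True,  True, False])
print(keep_le)  # tensor([ True,  True,  True])

# ... but le is wrong at the other end: with p == 1 every element should be dropped,
# yet an element that drew exactly 0.0 would still be kept (0.0 <= 0.0 is True).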

@csarofeen (Owner, Author)

O.o any suggestion as to what we should do?

@jjsjann123 (Collaborator)

Tried flipping the dropout prob to 1.0, and it looks like the issue with rand_like is real: we are producing values in [0.0, 1.0]. So that's a separate thing we should look at; I'll open an issue for that.

le doesn't sound right either; we can hook into the logic inside dropout with a bitwise op to create a short-cut mask for p == 0 and p == 1.
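
A rough sketch of that short-cut idea in plain PyTorch rather than nvFuser's C++ ops, using host-side branching instead of a bitwise mask; the helper name and structure are illustrative, not the actual composite.cpp implementation:

import torch

def dropout_sketch(x: torch.Tensor, p: float) -> torch.Tensor:
    # Special-case the boundary probabilities so the RNG's interval endpoints never matter.
    if p == 0.0:
        return x                      # keep everything; no mask, no scaling
    if p == 1.0:
        return torch.zeros_like(x)    # drop everything
    mask = torch.rand_like(x) < (1.0 - p)  # interior p: lt vs. le no longer decides the boundary cases
    return x * mask / (1.0 - p)            # rescale kept values to preserve the expected value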

jjsjann123 added a commit that referenced this issue Jul 7, 2022
Fixes #1799

1. Updates rand_like by changing output==1 to 0 via `where`;
2. Patches codegen float output.
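
A rough PyTorch-level sketch of what change 1 amounts to (the helper is hypothetical; the actual fix lives in nvFuser's rand_like codegen):

import torch

def rand_like_half_open(x: torch.Tensor) -> torch.Tensor:
    # Fold the 1.0 endpoint back to 0.0 so the output matches
    # torch.rand_like's [0, 1) convention even if the RNG samples (0, 1].
    r = torch.rand_like(x)
    return torch.where(r == 1.0, torch.zeros_like(r), r)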