test_rgb2hsv is flaky #2433


Closed
fmassa opened this issue Jul 8, 2020 · 4 comments · Fixed by #2477

Comments

@fmassa
Member

fmassa commented Jul 8, 2020

It looks like test_rgb2hsv is flaky and fails sometimes.

https://app.circleci.com/pipelines/github/pytorch/vision/3217/workflows/77e60582-2ddc-46db-933f-33c45c27387c/jobs/178179/tests

Example of the error we get:

    def test_rgb2hsv(self):
        shape = (3, 150, 100)
        for _ in range(20):
            img = torch.rand(*shape, dtype=torch.float)
            ft_hsv_img = F_t._rgb2hsv(img).permute(1, 2, 0).flatten(0, 1)
    
            r, g, b, = img.unbind(0)
            r = r.flatten().numpy()
            g = g.flatten().numpy()
            b = b.flatten().numpy()
    
            hsv = []
            for r1, g1, b1 in zip(r, g, b):
                hsv.append(colorsys.rgb_to_hsv(r1, g1, b1))
    
            colorsys_img = torch.tensor(hsv, dtype=torch.float32)
    
            max_diff = (colorsys_img - ft_hsv_img).abs().max()
>           self.assertLess(max_diff, 1e-5)
E           AssertionError: tensor(1.) not less than 1e-05

test\test_functional_tensor.py:118: AssertionError
@KushajveerSingh
Contributor

KushajveerSingh commented Jul 13, 2020

Problems:

  1. Precision is lost when converting from PyTorch to NumPy.
  2. colorsys gets completely flipped due to a floating-point error, as shown below (this is a good one).

An inconsistency due to precision. Consider two pixel values:

  • val1 = [0.3749333, 0.0530237, 0.05302376]
  • val2 = [0.3749333, 0.0530237, 0.0530237]

They differ only by a trailing 6 in the eighth decimal place, yet colorsys flips the hue value:

colorsys.rgb_to_hsv(0.3749333,   0.0530237,   0.0530237)
# (0.0, 0.85857831246251, 0.3749333)

colorsys.rgb_to_hsv(0.3749333,   0.0530237,   0.05302376)
# (0.9999999689353781, 0.85857831246251, 0.3749333)

_rgb2hsv gives the same result for both.

Why the test fails

x[7902]
# tensor([0.3749333024024963, 0.0530236959457397, 0.0530237555503845])

x[7902].numpy()
# array([0.3749333  , 0.053023696, 0.053023756], dtype=float32)

All the precision is lost when converting to NumPy. The PyTorch tensor is initialized as torch.float32 but does not seem to be converted exactly to numpy.float32. I think it is one of those boundary floating-point cases.

A possible solution would be to ignore the error if the difference is one. A more complex solution would be to get the index of the max_diff value and, if those values are 0.0 and 0.99, ignore the error.
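The second idea above can be sketched with a hypothetical helper (`hue_close` is my name, not part of the test suite) that compares hue values on the cyclic domain [0, 1), so values near 0.0 and near 1.0 count as equal:

```python
import torch

def hue_close(a, b, tol=1e-5):
    # Hypothetical helper: hue is cyclic, so the distance between two hues
    # is the smaller of the direct difference and the wrap-around difference.
    diff = (a - b).abs()
    diff = torch.min(diff, 1.0 - diff)  # wrap-around distance on [0, 1)
    return bool((diff < tol).all())
```

With this, `hue_close(torch.tensor([0.0]), torch.tensor([0.9999999]))` is `True`, while a plain absolute difference of those two values is nearly 1.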

@KushajveerSingh
Contributor

I tried some more things.

x
# tensor([0.3749333024024963, 0.0530236959457397, 0.0530237555503845])

When I convert this to NumPy:

x.numpy()
# array([0.3749333  , 0.053023696, 0.053023756], dtype=float32)

The precision is lost. Interestingly, if I convert this value back to a tensor, the precision comes back:

torch.from_numpy(x.numpy())
# tensor([0.3749333024024963, 0.0530236959457397, 0.0530237555503845])

I tried the above again, but more explicitly this time. I define a NumPy array with the value below:

a = np.array([0.3749333], dtype=np.float32)
a
# array([0.3749333], dtype=float32)

Now when I convert this to a tensor, I get all those extra decimals:

torch.from_numpy(a)
# tensor([0.3749333024024963])

There seems to be something wrong with the PyTorch/NumPy conversion. It happens for all values:

a = np.array([0.1], dtype=np.float32)
a
# array([0.1], dtype=float32)

torch.from_numpy(a)
# tensor([0.1000000014901161])
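One way to check whether the round trip itself changes any bits (a minimal sketch; `torch.from_numpy` shares the underlying buffer, so the raw bytes should match exactly):

```python
import numpy as np
import torch

a = np.array([0.1], dtype=np.float32)
t = torch.from_numpy(a)

# The stored float32 bits are identical; only the default printing differs.
# NumPy prints the shortest decimal string that round-trips to the same
# float32, while PyTorch prints more digits of the underlying binary value.
assert t.numpy().tobytes() == a.tobytes()
print(f"{a[0]:.16f}")  # shows the extra decimals NumPy was storing all along
```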

@fmassa do you know how pytorch handles numpy conversion?

@KushajveerSingh
Contributor

KushajveerSingh commented Jul 14, 2020

It is a PyTorch float issue

PyTorch is acting weird. Nothing is wrong with the NumPy conversion, as shown below.

torch.tensor([0.1], dtype=torch.float32)
# tensor([0.1000000014901161])

The same value prints as 0.1 in NumPy. I tested this on PyTorch master and PyTorch 1.5.1.

Is this a known PyTorch thing? Should I open a new issue on PyTorch, or am I missing something about PyTorch?
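As a cross-check (a sketch): formatting the NumPy float32 with more digits shows the same trailing decimals that PyTorch prints, suggesting both libraries hold the same binary value and only their default print precision differs.

```python
import numpy as np

a = np.float32(0.1)
print(a)            # 0.1 -- shortest decimal string that round-trips
print(f"{a:.16f}")  # 0.1000000014901161 -- the value actually stored
```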

@fmassa
Member Author

fmassa commented Jul 15, 2020

I think the issue might be that we are performing computations in float32, while NumPy and Python are doing them in float64?

I think the best fix is to change the test so that it takes into account the fact that the H coordinate is cyclic, and to do the distance computation in that transformed domain.

For example, instead of doing

dist_h = (x_h - y_h).abs().max()

we could instead do something like

dist_h = ((x_h * 2 * math.pi).sin() - (y_h * 2 * math.pi).sin()).abs().max()

so that 0 and 1 are the same.
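The suggested comparison can be sketched as follows (the values are chosen purely for illustration):

```python
import math
import torch

x_h = torch.tensor([0.0000000, 0.25])
y_h = torch.tensor([0.9999999, 0.25])

# Plain absolute difference treats hue 0 and hue ~1 as maximally far apart.
naive = (x_h - y_h).abs().max()

# Mapping hue through sin(2*pi*h) makes 0 and 1 coincide, so the
# wrap-around case no longer trips the tolerance.
cyclic = ((x_h * 2 * math.pi).sin() - (y_h * 2 * math.pi).sin()).abs().max()
```

One caveat worth noting: sin alone also maps h and 0.5 - h to the same value, so a complete comparison might use both the sine and the cosine of the angle.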
