Fill arg and _apply_grid_transform improvements

A few years ago we introduced non-const fill value handling in `_apply_grid_transform` using a mask approach:

https://github.com/pytorch/vision/blob/0d69e35c4e951109dbaa8b42b0a8416d199aee0b/torchvision/transforms/functional_tensor.py#L550-L568

There are a few minor problems with this approach:

1) If we pass `fill=[0.0, ]`, we would expect a result similar to `fill=None`. This is not exactly the case for bilinear interpolation mode, where we linearly interpolate between `img` and `fill_img`:
https://github.com/pytorch/vision/blob/0d69e35c4e951109dbaa8b42b0a8416d199aee0b/torchvision/transforms/functional_tensor.py#L567-L568

Most probably, we would like to skip `fill_img` creation for all fill values that have `sum(fill) == 0`, since `grid_sample` already pads with zeros.

```diff
- if fill is not None:
+ if fill is not None and sum(fill) != 0:
```
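
To make the discrepancy in item 1 concrete, here is a minimal pure-Python sketch of a single border pixel. It assumes zero-padded bilinear sampling in which the in-bounds neighbours carry a total weight of `weight`; the helper name `blend_with_fill` and all numbers are illustrative and not part of torchvision.

```python
# Illustrative single-pixel arithmetic; names and numbers are assumptions, not torchvision code.

def blend_with_fill(sampled, mask, fill):
    """Bilinear branch of the referenced snippet: img * mask + (1 - mask) * fill_img."""
    return sampled * mask + (1.0 - mask) * fill


value = 200.0   # intensity of the in-bounds neighbours
weight = 0.5    # share of the interpolation weight that falls inside the image

# With padding_mode="zeros", out-of-bounds taps contribute 0 to the sample.
sampled = value * weight   # what the fill=None path returns for this pixel
mask = 1.0 * weight        # the appended channel of ones is attenuated the same way

# fill=[0.0] goes through the blend and multiplies by the mask a second time.
with_zero_fill = blend_with_fill(sampled, mask, 0.0)

print(sampled, with_zero_fill)
```

In this sketch the `fill=None` path yields `value * weight`, while the zero-fill path yields `value * weight ** 2`, so border pixels come out darker even though both are meant to fill with zeros.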

2) The linear interpolation between `fill_img` and `img` could be replaced by applying a mask directly:
```python
mask = mask < 0.9999
img[mask] = fill_img[mask]
```
That would better match PIL Image behaviour.

https://github.com/pytorch/vision/blob/0d69e35c4e951109dbaa8b42b0a8416d199aee0b/torchvision/transforms/functional_tensor.py#L567-L568

![image](https://user-images.githubusercontent.com/2459423/187435735-8a13af09-8e80-4db1-82cb-6081b74d0c94.png)
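
As a rough illustration of item 2, the sketch below compares the current linear blend with the proposed hard mask on scalar values. `linear_blend` and `hard_mask` are hypothetical names used only for this example; the 0.9999 threshold is the one from the snippet in item 2, and the pixel values are made up.

```python
# Scalar comparison of the two fill strategies; helper names are hypothetical.

def linear_blend(sampled, mask, fill):
    """Current bilinear behaviour: mix the sampled value and the fill by coverage."""
    return sampled * mask + (1.0 - mask) * fill


def hard_mask(sampled, mask, fill, threshold=0.9999):
    """Proposed behaviour: any pixel that is not fully covered takes the fill value."""
    return fill if mask < threshold else sampled


fill = 255.0
interior = (120.0, 1.0)   # (sampled value, mask) for a fully covered pixel
border = (100.0, 0.5)     # (sampled value, mask) for a half-covered pixel

interior_blend = linear_blend(*interior, fill)
interior_hard = hard_mask(*interior, fill)
border_blend = linear_blend(*border, fill)
border_hard = hard_mask(*border, fill)

print(interior_blend, interior_hard, border_blend, border_hard)
```

Fully covered pixels are identical under both strategies. They differ only on partially covered pixels, where the hard mask writes the fill color outright instead of a blended value.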



cc @datumbox

Referenced snippet (`functional_tensor.py#L550-L568`):

    # Append a dummy mask for customized fill colors, should be faster than grid_sample() twice
    if fill is not None:
        dummy = torch.ones((img.shape[0], 1, img.shape[2], img.shape[3]), dtype=img.dtype, device=img.device)
        img = torch.cat((img, dummy), dim=1)

    img = grid_sample(img, grid, mode=mode, padding_mode="zeros", align_corners=False)

    # Fill with required color
    if fill is not None:
        mask = img[:, -1:, :, :]  # N * 1 * H * W
        img = img[:, :-1, :, :]  # N * C * H * W
        mask = mask.expand_as(img)
        len_fill = len(fill) if isinstance(fill, (tuple, list)) else 1
        fill_img = torch.tensor(fill, dtype=img.dtype, device=img.device).view(1, len_fill, 1, 1).expand_as(img)
        if mode == "nearest":
            mask = mask < 0.5
            img[mask] = fill_img[mask]
        else:  # 'bilinear'
            img = img * mask + (1.0 - mask) * fill_img
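
The dummy-mask idea in the snippet above can be illustrated in one dimension with plain Python. This is a simplified sketch and not the torchvision implementation: `sample_linear_zero_pad` is a hypothetical helper that mimics zero-padded linear interpolation, and it samples the row of ones separately, whereas the real code concatenates the ones as an extra channel so that a single `grid_sample` call handles both.

```python
import math


def sample_linear_zero_pad(row, x):
    """Linearly interpolate `row` at fractional index `x`; out-of-range taps count as 0."""
    left = math.floor(x)
    frac = x - left

    def tap(i):
        return row[i] if 0 <= i < len(row) else 0.0

    return tap(left) * (1.0 - frac) + tap(left + 1) * frac


row = [10.0, 20.0, 30.0]
ones = [1.0] * len(row)          # stands in for the appended dummy channel

positions = [-0.5, 1.0, 2.25]    # left border, interior, right border
pixels = [sample_linear_zero_pad(row, x) for x in positions]
mask = [sample_linear_zero_pad(ones, x) for x in positions]

print(pixels)
print(mask)
```

The sampled ones come back as the fraction of interpolation weight that landed inside the row: 1.0 for interior positions and less than 1.0 near the border. That coverage value is what the `mask` in the snippet uses to decide where the fill color applies.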

Fill arg and _apply_grid_transform improvements #6517
