[prototype] Switch to `spatial_size` #6736

datumbox · 2022-10-10T15:20:25Z

This PR:

Renames image_size to spatial_size everywhere in the code-base
Adds get_num_channels_video and get_spatial_size_* kernels for videos, masks and bboxes
Adds get_num_frames dispatcher and get_num_frames_video kernel for JIT.
Eliminates the over-use of query_chw to make things work with bboxes and masks

datumbox

Adding comments on the changes that weren't made automatically by the IDE:

datumbox · 2022-10-10T17:29:47Z

torchvision/prototype/transforms/functional/_meta.py

+def get_num_channels_video(video: torch.Tensor) -> int:
+    return get_num_channels_image_tensor(video)


Addition of get_num_channels_video kernel.

datumbox · 2022-10-10T17:30:37Z

torchvision/prototype/transforms/functional/_meta.py

+def get_spatial_size_video(video: torch.Tensor) -> List[int]:
+    return get_spatial_size_image_tensor(video)
+
+
+def get_spatial_size_mask(mask: torch.Tensor) -> List[int]:
+    return get_spatial_size_image_tensor(mask)
+
+
+@torch.jit.unused
+def get_spatial_size_bounding_box(bounding_box: features.BoundingBox) -> List[int]:
+    return list(bounding_box.spatial_size)


Addition of the get_spatial_size_* kernels. The bounding-box kernel can't have a JIT-scriptable implementation because it relies on tensor subclassing to retrieve this information.
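A minimal sketch of why the video and mask kernels can delegate to the image-tensor kernel. It assumes height and width are the last two dimensions and works on plain shape tuples instead of real tensors. The helper `spatial_size_from_shape` is hypothetical and not part of torchvision.

```python
from typing import List, Sequence


def spatial_size_from_shape(shape: Sequence[int]) -> List[int]:
    """Hypothetical stand-in for a tensor kernel: return [height, width].

    Assumes height and width are the trailing two dimensions, so the same
    logic serves images (C, H, W), videos (T, C, H, W) and masks (H, W).
    """
    if len(shape) < 2:
        raise ValueError(f"Expected at least 2 dimensions, got {len(shape)}")
    return [shape[-2], shape[-1]]


image_shape = (3, 224, 320)     # C, H, W
video_shape = (8, 3, 224, 320)  # T, C, H, W
mask_shape = (224, 320)         # H, W

# All three layouts yield the same [height, width] pair.
sizes = [spatial_size_from_shape(s) for s in (image_shape, video_shape, mask_shape)]
```

A bounding box can't use the same trick: the tensor holds box coordinates, not pixels, so its spatial size has to come from metadata stored on the subclass.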

datumbox · 2022-10-10T17:32:15Z

torchvision/prototype/transforms/functional/_meta.py

+    elif isinstance(inpt, (features.Image, features.Video, features.BoundingBox, features.Mask)):
+        return list(inpt.spatial_size)
    else:
-        return get_spatial_size_image_pil(inpt)
+        return get_spatial_size_image_pil(inpt)  # type: ignore[no-any-return]


Refactoring to avoid the getattr idiom. After that, mypy complains about the PIL kernel. It's unclear to me why it thinks we return Any. The get_spatial_size_video returns a List[int].
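A rough illustration of the explicit `isinstance` dispatch pattern used in the diff above, written with plain Python stand-ins so that it runs without torchvision. All `Fake*` classes and `get_spatial_size_fake_pil` are hypothetical. The only real-library assumption is that PIL reports `size` as `(width, height)`.

```python
from typing import Any, List


class FakeImage:
    """Hypothetical feature type carrying (height, width) metadata."""

    def __init__(self, height: int, width: int) -> None:
        self.spatial_size = (height, width)


class FakeBoundingBox:
    """Hypothetical box type: the size describes the image it belongs to."""

    def __init__(self, height: int, width: int) -> None:
        self.spatial_size = (height, width)


class FakePilImage:
    """Mimics PIL's convention, where .size is (width, height)."""

    def __init__(self, width: int, height: int) -> None:
        self.size = (width, height)


def get_spatial_size_fake_pil(image: FakePilImage) -> List[int]:
    # Swap the order so every kernel returns [height, width].
    width, height = image.size
    return [height, width]


def get_spatial_size(inpt: Any) -> List[int]:
    # The supported metadata-carrying types are listed explicitly instead of
    # probing for an attribute at runtime.
    if isinstance(inpt, (FakeImage, FakeBoundingBox)):
        return list(inpt.spatial_size)
    else:
        return get_spatial_size_fake_pil(inpt)
```

Listing the types keeps the set of supported inputs visible to readers and to the type checker, at the cost of updating the tuple whenever a new feature type is added.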

datumbox · 2022-10-10T17:34:11Z

torchvision/prototype/transforms/_augment.py

@@ -153,7 +153,7 @@ class RandomCutmix(_BaseMixupCutmix):
    def _get_params(self, sample: Any) -> Dict[str, Any]:
        lam = float(self._dist.sample(()))

-        _, H, W = query_chw(sample)
+        H, W = query_hw(sample)


Removing the query_chw in favour of query_hw where possible. This happens in multiple places in the code-base.

The method was renamed to query_spatial_size after #6736 (review)

datumbox · 2022-10-10T17:35:07Z

torchvision/prototype/transforms/_color.py

@@ -100,7 +100,7 @@ def __init__(
        self.p = p

    def _get_params(self, sample: Any) -> Dict[str, Any]:
-        num_channels, _, _ = query_chw(sample)
+        num_channels, *_ = query_chw(sample)


I didn't introduce yet another method for extracting channels only. This is indeed less elegant but doesn't introduce any limitations as the input is required to have channels. This happens in one more place in the codebase.
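For readers unfamiliar with the starred form, a tiny sketch of the two unpacking styles in the diff above. `query_chw_stub` is a hypothetical stand-in that returns a fixed `(channels, height, width)` triple.

```python
from typing import Tuple


def query_chw_stub() -> Tuple[int, int, int]:
    """Hypothetical stand-in returning (channels, height, width)."""
    return 3, 224, 320


# Old style: one placeholder per ignored value.
num_channels_explicit, _, _ = query_chw_stub()

# New style: the starred target collects all remaining values into a list.
num_channels_starred, *_ = query_chw_stub()
```

Both forms bind the same first element. The starred one does not depend on how many values follow it.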

datumbox · 2022-10-10T17:37:52Z

torchvision/prototype/transforms/_utils.py

+def query_hw(sample: Any) -> Tuple[int, int]:
+    flat_sample, _ = tree_flatten(sample)
+    hws = {
+        tuple(get_spatial_size(item))
+        for item in flat_sample
+        if isinstance(item, (features.Image, PIL.Image.Image, features.Video, features.Mask, features.BoundingBox))
+        or features.is_simple_tensor(item)
+    }
+    if not hws:
+        raise TypeError("No image, video, mask or bounding box was found in the sample")
+    elif len(hws) > 1:
+        raise ValueError(f"Found multiple HxW dimensions in the sample: {sequence_to_str(sorted(hws))}")
+    h, w = hws.pop()
+    return h, w


Lots of code duplication with query_chw. The two methods differ in the callable, the checked types, the error messages and the return type. I was tempted to write something that passes a callable to reduce the duplicate code, but it became unnecessarily complex. Happy to implement other approaches if you have better ideas.
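For illustration only, a sketch of what a callable-passing helper of the kind mentioned above might look like. This is not code from the PR: `query_unique`, the `Fake*` classes and the extractor functions are all hypothetical, and the sample is a flat list rather than a pytree that would first need flattening.

```python
from typing import Any, Callable, Iterable, Sequence, Tuple, Type


def query_unique(
    items: Iterable[Any],
    types: Tuple[Type[Any], ...],
    extract: Callable[[Any], Sequence[int]],
    what: str,
) -> Tuple[int, ...]:
    """Return the single value `extract` yields across all matching items."""
    values = {tuple(extract(item)) for item in items if isinstance(item, types)}
    if not values:
        raise TypeError(f"No item providing {what} was found in the sample")
    if len(values) > 1:
        raise ValueError(f"Found multiple {what} in the sample: {sorted(values)}")
    return values.pop()


class FakeImage:
    def __init__(self, channels: int, height: int, width: int) -> None:
        self.shape = (channels, height, width)


class FakeBox:
    def __init__(self, height: int, width: int) -> None:
        self.spatial_size = (height, width)


def fake_hw(item: Any) -> Sequence[int]:
    # Boxes carry the size as metadata; images expose it through their shape.
    return item.spatial_size if isinstance(item, FakeBox) else item.shape[-2:]


def fake_chw(item: Any) -> Sequence[int]:
    return item.shape[-3:]


sample = [FakeImage(3, 32, 48), FakeBox(32, 48), "a label"]
hw = query_unique(sample, (FakeImage, FakeBox), fake_hw, "HxW dimensions")
chw = query_unique(sample, (FakeImage,), fake_chw, "CxHxW dimensions")
```

Even in this reduced form the helper takes four parameters and returns a loosely typed tuple, which gives some sense of why two small, explicit functions may be easier to read.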

datumbox · 2022-10-10T17:48:35Z

I got a few failures on Windows. They don't look related at first glance, but this PR touches a lot of things, so I'm not 100% sure. I'll check again tomorrow.

vfdev-5

OK to me, thanks @datumbox !
Just a minor suggestion to rename query_hw to query_spatial_size... (not blocking)

Summary:
* Change `image_size` to `spatial_size`
* Fix linter
* Fixing more tests.
* Adding get_num_channels_video and get_spatial_size_* kernels for video, masks and bboxes.
* Refactor get_spatial_size
* Reduce the usage of `query_chw` where possible
* Rename `query_chw` to `query_spatial_size`
* Adding `get_num_frames` dispatcher and kernel.
* Adding jit-scriptability tests

Reviewed By: NicolasHug
Differential Revision: D40427485
fbshipit-source-id: 2401fe20877177459fe23181655c9cf429cb0cc5

Change image_size to spatial_size

41d6fb4

datumbox added enhancement module: transforms prototype labels Oct 10, 2022

facebook-github-bot added the cla signed label Oct 10, 2022

datumbox added 2 commits October 10, 2022 16:21

Fix linter

07e7e25

Fixing more tests.

973fe25

datumbox force-pushed the prototype/image_size branch from 0e2240c to 973fe25 Compare October 10, 2022 15:53

datumbox added 3 commits October 10, 2022 17:35

Adding get_num_channels_video and get_spatial_size_* kernels for video, masks and bboxes.

47ed918

Refactor get_spatial_size

a21e428

Reduce the usage of query_chw where possible

6bb1f03

datumbox requested review from pmeier and vfdev-5 October 10, 2022 17:40

datumbox changed the title ~~[WIP] [prototype] Switch to spatial_size~~ [prototype] Switch to spatial_size Oct 10, 2022

vfdev-5 approved these changes Oct 11, 2022

datumbox and others added 4 commits October 11, 2022 09:09

Rename query_chw to query_spatial_size

319e35a

Merge branch 'main' into prototype/image_size

173418c

Adding get_num_frames dispatcher and kernel.

c479560

Adding jit-scriptability tests

23a78c9

datumbox merged commit 4d4711d into pytorch:main Oct 11, 2022

datumbox deleted the prototype/image_size branch October 11, 2022 09:10

pmeier mentioned this pull request Jan 16, 2023

[NOMRG] TransformsV2 TODOs #7082

Closed

vfdev-5 mentioned this pull request Jan 17, 2023

[NOMRG] TransformsV2 questions / comments #7092

Closed
