clarifying docs for ToPILImage() #7679

Merged 3 commits on Jun 27, 2023
8 changes: 4 additions & 4 deletions torchvision/transforms/transforms.py
@@ -199,21 +199,21 @@ def forward(self, image):


class ToPILImage:
"""Convert a tensor or an ndarray to PIL Image - this does not scale values.
"""Converts a tensor or an ndarray to PIL Image
Collaborator:

Seems like the Python docstring convention suggests using an imperative verb instead of a conjugated one.
Source: https://peps.python.org/pep-0257/

> It prescribes the function or method’s effect as a command (“Do this”, “Return that”), not as a description; e.g. don’t write “Returns the pathname …”.
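
For illustration, the convention amounts to the following (hypothetical function, not from this PR):

```python
def blur(image):
    """Blur the image in place."""   # imperative mood, as PEP 257 prescribes
    ...

def blur_described(image):
    """Blurs the image in place."""  # third-person description, discouraged
    ...
```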

Contributor Author:

Okay, but here, too, it's written "Applies." There may be more such cases.

@vfdev-5 (Collaborator), Jun 20, 2023:

These are conventions that may not be respected everywhere, for one reason or another. In torchvision (cc @NicolasHug) we want to follow these conventions, and in the places where that is not the case we may want to fix it. Anyway, let's not introduce yet another convention here.


This transform does not support torchscript.

  Converts a torch.*Tensor of shape C x H x W or a numpy ndarray of shape
- H x W x C to a PIL Image while preserving the value range.
+ H x W x C to a PIL Image while adjusting the value range depending on the ``mode``.
Collaborator:

@NicolasHug opinion on whether we even want to mention that?

Suggested change:
- H x W x C to a PIL Image while adjusting the value range depending on the ``mode``.
+ H x W x C to a PIL Image.

Member:

I'm OK with it. It is slightly more accurate than what we previously had, even though arguably it doesn't provide a ton of info right away, since readers would need to know the details of each PIL mode to understand what's happening. Perhaps we can add that info next to each mode below, e.g.

- If the input has 3 channels, the ``mode`` is assumed to be ``RGB`` (uint8 values in [0, 255])

etc.
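
As an illustrative aside (not part of the PR), that parenthetical is easy to check: a 3-channel float tensor in [0, 1] comes back as a uint8 RGB image with values in [0, 255].

```python
import numpy as np
import torch
from torchvision.transforms import ToPILImage

img = ToPILImage()(torch.rand(3, 8, 8))  # float values in [0, 1)
arr = np.asarray(img)
print(img.mode, arr.dtype, arr.min(), arr.max())  # e.g. RGB uint8 0 254
```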

Contributor Author:

Sure, will make the required changes.

@sahilg06 (Contributor Author), Jun 23, 2023:

@NicolasHug @pmeier Not all the modes are mentioned explicitly in the docs; the docs just link to `PIL.Image mode`. So wouldn't it be better if we just write this?

> Converts a torch.*Tensor of shape C x H x W or a numpy ndarray of shape
> H x W x C to a PIL Image while keeping the value range consistent with PIL.Image modes.

Member:

It wouldn't address that part of my comment above:

> even though arguably it doesn't provide a ton of info right away, since readers would need to know the details of each PIL mode to understand what's happening

But that's OK, we can provide more detail incrementally. Let's just merge this PR the way it is right now.

@sahilg06 do you mind reverting the changes made in https://github.com/pytorch/vision/pull/7679/files#r1234422939 so we can merge it? Thanks!


Args:
mode (`PIL.Image mode`_): color space and pixel depth of input data (optional).
If ``mode`` is ``None`` (default) there are some assumptions made about the input data:

- If the input has 4 channels, the ``mode`` is assumed to be ``RGBA``.
- If the input has 3 channels, the ``mode`` is assumed to be ``RGB``.
- If the input has 2 channels, the ``mode`` is assumed to be ``LA``.
- - If the input has 1 channel, the ``mode`` is determined by the data type (i.e ``int``, ``float``,
-   ``short``).
+ - If the input has 1 channel, the ``mode`` is determined by the data type (i.e ``int``, ``float``, ``short``).

.. _PIL.Image mode: https://pillow.readthedocs.io/en/latest/handbook/concepts.html#concept-modes
"""