Add fill parameter to utils.draw_bounding_boxes. #3280

Closed
oke-aditya opened this issue Jan 23, 2021 · 3 comments · Fixed by #3296

@oke-aditya
Contributor

oke-aditya commented Jan 23, 2021

🚀 Feature

A fill parameter allows drawing a semi-transparent box. This is particularly useful for Mask R-CNN models.
It would complete the utils for object detection and instance segmentation (at least with rectangular boxes).

Motivation

In instance segmentation models, we also care about masks, not just the bounding boxes. The fill parameter lets us fill a box in a semi-transparent way. The parameter is also optional, so it does not affect performance when it is not used.

Pitch

Add a param fill as follows:

fill: Optional[Union[str, Tuple[int, ...]]] = None,

Here is the complete running code with a few edits:

from typing import List, Optional, Tuple, Union

import numpy as np
import torch
from PIL import Image, ImageDraw, ImageFont


@torch.no_grad()
def draw_bounding_boxes(
    image: torch.Tensor,
    boxes: torch.Tensor,
    labels: Optional[List[str]] = None,
    colors: Optional[List[Union[str, Tuple[int, int, int]]]] = None,
    fill: Optional[Union[str, Tuple[int, ...]]] = None,
    width: int = 1,
    font: Optional[str] = None,
    font_size: int = 10
) -> torch.Tensor:

    """
    Draws bounding boxes on given image.
    The values of the input image should be uint8 between 0 and 255.
    Args:
        image (Tensor): Tensor of shape (C x H x W)
        boxes (Tensor): Tensor of size (N, 4) containing bounding boxes in (xmin, ymin, xmax, ymax) format. Note that
            the boxes are absolute coordinates with respect to the image. In other words: `0 <= xmin < xmax < W` and
            `0 <= ymin < ymax < H`.
        labels (List[str]): List containing the labels of bounding boxes.
        colors (List[Union[str, Tuple[int, int, int]]]): List containing the colors of bounding boxes. The colors can
            be represented as `str` or `Tuple[int, int, int]`.
        fill (Union[str, Tuple[int, ...]]): Single fill color applied to every box, given as `str` or as an RGB or
            RGBA tuple. An RGBA tuple such as `(50, 100, 100, 127)` gives a semi-transparent fill.
        width (int): Width of bounding box.
        font (str): A filename containing a TrueType font. If the file is not found in this filename, the loader may
            also search in other directories, such as the `fonts/` directory on Windows or `/Library/Fonts/`,
            `/System/Library/Fonts/` and `~/Library/Fonts/` on macOS.
        font_size (int): The requested font size in points.
    """

    if not isinstance(image, torch.Tensor):
        raise TypeError(f"Tensor expected, got {type(image)}")
    elif image.dtype != torch.uint8:
        raise ValueError(f"Tensor uint8 expected, got {image.dtype}")
    elif image.dim() != 3:
        raise ValueError("Pass individual images, not batches")

    # PIL expects an (H, W, C) array, so move the channel dimension last.
    ndarr = image.permute(1, 2, 0).numpy()
    img_to_draw = Image.fromarray(ndarr)

    img_boxes = boxes.to(torch.int64).tolist()

    # The "RGBA" draw mode lets colors that carry an alpha value blend into the image.
    draw = ImageDraw.Draw(img_to_draw, "RGBA")

    txt_font = ImageFont.load_default() if font is None else ImageFont.truetype(font=font, size=font_size)

    for i, bbox in enumerate(img_boxes):
        color = None if colors is None else colors[i]
        # The same fill value is passed for every box.
        draw.rectangle(bbox, width=width, outline=color, fill=fill)

        if labels is not None:
            draw.text((bbox[0], bbox[1]), labels[i], fill=color, font=txt_font)

    # Convert back to a (C, H, W) tensor.
    return torch.from_numpy(np.array(img_to_draw)).permute(2, 0, 1)

This makes Mask R-CNN output clearer, and people can play with the fill parameter, for example a confidence-based fill, a different fill color per class, etc.
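
The confidence-based fill idea could look roughly like the sketch below. It uses only Pillow rather than the proposed torchvision function; the helper name `draw_scored_boxes`, the `scores` argument and the mapping from score to alpha are assumptions made for illustration.

```python
from PIL import Image, ImageDraw


def draw_scored_boxes(img, boxes, scores, rgb=(50, 100, 100)):
    """Hypothetical helper: fill each box with `rgb`, more opaque for higher scores."""
    out = img.copy()
    # The "RGBA" draw mode blends colors that carry an alpha value into the RGB image.
    draw = ImageDraw.Draw(out, "RGBA")
    for box, score in zip(boxes, scores):
        # Clamp the score to [0, 1] and map it to an alpha value in [0, 255].
        alpha = int(255 * min(max(score, 0.0), 1.0))
        draw.rectangle(box, outline="black", fill=rgb + (alpha,), width=1)
    return out


canvas = Image.new("RGB", (64, 64), "white")
result = draw_scored_boxes(canvas, [[5, 5, 25, 25], [35, 35, 60, 60]], [0.0, 1.0])
```

A score of 0 leaves the box interior untouched, while a score of 1 paints it fully opaque; a per-class fill would work the same way with a color lookup in place of the alpha scaling.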

Additional context

I can send a PR for this 😅 I'm attaching outputs of the above code.

[image: draw_boxes_util2]

(Sorry PyTorch logo 🙏 )

[image: draw_boxes_util]

@datumbox
Contributor

I like the idea of filling the box with colour, but do you think it's necessary to use a different colour than the border? Perhaps the border colour can be reused and fill turned into a boolean instead? What do you think?
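
A boolean `fill` that reuses the border colour could be sketched as follows. The helper `fill_from_outline` and its default alpha of 100 are hypothetical and chosen only for illustration; `ImageColor.getrgb` is Pillow's parser for colour strings.

```python
from PIL import Image, ImageColor, ImageDraw


def fill_from_outline(color, alpha=100):
    """Hypothetical helper: derive a semi-transparent RGBA fill from an outline colour."""
    rgb = ImageColor.getrgb(color) if isinstance(color, str) else tuple(color)
    # Keep only the RGB part in case the colour already carries an alpha value.
    return rgb[:3] + (alpha,)


img = Image.new("RGB", (64, 64), "white")
draw = ImageDraw.Draw(img, "RGBA")
fill = True  # stands in for the suggested boolean parameter
outline = "red"
draw.rectangle(
    [10, 10, 50, 50],
    outline=outline,
    fill=fill_from_outline(outline) if fill else None,
)
```

With this shape the caller picks one colour per box, and the flag only decides whether a translucent version of it is painted inside.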

@oke-aditya
Contributor Author

oke-aditya commented Jan 25, 2021

A boolean could be fine too.
I was just wondering whether a user might want a dark black outline (a distinct outline color) together with a semi-transparent fill color.

I'm not sure whether that is really a valid use case.

One such example is here.

[image: draw_boxes_util4]

Code to reproduce results

import torch
import torchvision
from PIL import Image

# White 224x224 image, converted from float to uint8.
image = torch.ones(3, 224, 224)
boxes = torch.tensor([[12, 23, 40, 40], [20, 30, 50, 50], [20, 50, 120, 120]])
image = torchvision.transforms.ConvertImageDtype(torch.uint8)(image)
# Black outlines with a semi-transparent RGBA fill.
result = draw_bounding_boxes(image=image, boxes=boxes, fill=(50, 100, 100, 127), colors=["black", "black", "black"])
res = Image.fromarray(result.permute(1, 2, 0).contiguous().numpy())
res.save("draw_boxes_util4.png")

Let me know; I can send a PR for either.

The only caveat is that we draw the semi-transparent fill using the alpha channel (4-channel image),
so users need to save the returned tensor in PNG format, not JPG. I think we can add that to the documentation.
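
Whether this caveat applies depends on the mode of the image being drawn on, which is easy to inspect. The sketch below is a Pillow-only check, under the assumption (taken from Pillow's `ImageDraw.Draw` documentation) that the `"RGBA"` draw mode on an RGB image blends the fill into the image; the variable names are illustrative.

```python
from PIL import Image, ImageDraw

# Start from a plain 3-channel image, the same mode Image.fromarray
# produces for an (H, W, 3) uint8 array.
img = Image.new("RGB", (32, 32), "white")
draw = ImageDraw.Draw(img, "RGBA")
draw.rectangle([4, 4, 28, 28], fill=(50, 100, 100, 127))

# Inspect the mode and the number of bands after drawing.
mode = img.mode
num_channels = len(img.getbands())
print(mode, num_channels)
```

If the mode is still `RGB` after drawing, the blended result carries no alpha channel, and choosing PNG over JPG becomes a matter of image quality rather than a requirement.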

I think we can leave the option open to the user? IMO we can easily keep the flexibility.

@datumbox
Contributor

@oke-aditya Awesome, let's continue the discussion on the PR.
