
Add CamVid dataset for segmentation #90

Open · wants to merge 6 commits into main

Conversation

felixgwu commented Mar 8, 2017

Following the request in #60, I implemented this CamVid dataset class for people who are interested in doing image segmentation.

Since both the input and the target should go through the same transformation, I added the file joint_transforms.py, which allows us to apply the same image transformation to a list of images.
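The core idea can be sketched like this (a minimal illustration of the shared-randomness pattern, not the actual code in this PR; the class name mirrors the sample usage below):

```python
import random

class JointRandomHorizontalFlip:
    """Flip every image in the list together.

    The random draw happens once per call, so the input image and its
    target mask always receive the same decision.
    """
    def __init__(self, p=0.5):
        self.p = p

    def __call__(self, images):
        if random.random() < self.p:
            # 0 is PIL's Image.FLIP_LEFT_RIGHT transpose constant
            return [img.transpose(0) for img in images]
        return images
```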

Sample usage:

import torch
from torchvision.datasets import camvid
from torchvision import transforms, joint_transforms

normalize = transforms.Normalize(mean=camvid.mean, std=camvid.std)
train_joint_transformer = transforms.Compose([
    joint_transforms.JointRandomCrop(224),
    joint_transforms.JointRandomHorizontalFlip()
    ])
train_dataset = camvid.CamVid('path/to/CamVid', 'train',
                      joint_transform=train_joint_transformer,
                      transform=transforms.Compose([
                          transforms.ToTensor(),
                          normalize,
                      ]))
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=10, shuffle=True)

Currently, the download function is not implemented yet.
Users need to download the data from here by themselves.
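For reference, a common CamVid folder layout is the one used by the SegNet tutorial data, sketched below. I'm assuming this is roughly what the class expects; check the dataset's constructor for the exact folder names:

```
CamVid/
    train/        # input images
    trainannot/   # label maps for train
    val/
    valannot/
    test/
    testannot/
```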


@fmassa
Member

fmassa commented Mar 11, 2017

So, I'm not sure about the Joint Transforms. We were thinking about factoring out the random number generation from the transforms, so that the same random transform can be applied to different inputs (and eventually to inputs from different modalities, such as images and bounding boxes).
I'll send a PR with these transforms tomorrow.

@felixgwu
Author

OK. I'll modify the code based on the new version of transforms.

@felixgwu
Author

Hi @fmassa,
I am wondering what the conclusion on joint random transformations is.
In my opinion, making all the transform classes able to take either a single image or a list of images as input, and providing a joint_transform parameter to the dataset object as in my code, could be a solution.

Also, I made the function private. However, the other two classes should stay public so that users can use them: LabelToLongTensor can be passed as a transform to the CamVid class.
More importantly, LabelToPILImage can be used to visualize the predicted labels.
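For reference, the two public helpers could look roughly like this (a hedged sketch reconstructing the described behavior, not the exact code in this PR; the `palette` argument is a hypothetical convenience):

```python
import numpy as np
import torch
from PIL import Image

def LabelToLongTensor(pic):
    """Convert a label image (PIL Image or ndarray of class ids) to an
    H x W LongTensor, suitable as a segmentation target."""
    arr = np.asarray(pic, dtype=np.int64)
    return torch.from_numpy(arr.copy())

def LabelToPILImage(label, palette=None):
    """Turn an H x W LongTensor of class ids into a paletted PIL image,
    e.g. for visualizing predicted labels."""
    img = Image.fromarray(label.numpy().astype(np.uint8), mode="P")
    if palette is not None:
        img.putpalette(palette)  # flat [r, g, b, r, g, b, ...] list
    return img
```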

@fmassa
Member

fmassa commented Mar 19, 2017

Hi @felixgwu,

In a local branch I factored out the random parameter generation from the transforms, and I'm using it now in a project for semantic segmentation.

The drawback of this approach is that we need to pass an extra argument to the constructor of the dataset (let's call it generators). The generators are a list of objects that, when called, generate the random parameters for the transforms.
Thus, in __getitem__, if generators were provided to the constructor, we call them to generate the parameters for the transforms, which can then be used for both the inputs and the targets.

This requires passing a new argument to the constructor, but I think it's better than having joint transforms, because we might want to apply some individual transforms before/after the joint transforms, which would not be possible in that setup.
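Concretely, the setup described above might look something like this (a sketch under names I made up for illustration — RandomCropGenerator, SegmentationDataset — not the code from the local branch):

```python
import random

def crop(img, params):
    # Apply pre-drawn crop parameters as a PIL-style (left, top, right, bottom) box.
    x, y, tw, th = params
    return img.crop((x, y, x + tw, y + th))

class RandomCropGenerator:
    """Draws crop coordinates once per sample; the parameters can then be
    applied identically to the input image and the target mask."""
    def __init__(self, size):
        self.size = size

    def __call__(self, img_size):
        w, h = img_size
        x = random.randint(0, w - self.size)
        y = random.randint(0, h - self.size)
        return (x, y, self.size, self.size)

class SegmentationDataset:
    def __init__(self, samples, generators=None):
        self.samples = samples            # list of (image, target) pairs
        self.generators = generators or []

    def __getitem__(self, idx):
        img, target = self.samples[idx]
        for gen in self.generators:
            params = gen(img.size)        # draw the random parameters once...
            img = crop(img, params)       # ...apply them to the input...
            target = crop(target, params) # ...and identically to the target
        return img, target
```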

I'm not yet 100% happy with my refactoring, but it's time to send a PR to get some feedback. I'm away from my computer this weekend, but when I create the PR I'll tag you on it.

@gpleiss gpleiss mentioned this pull request Mar 29, 2017
@alykhantejani
Contributor

Hi @felixgwu, sorry for the delay on this. We are discussing a refactor of the transforms in #230 and should have something merged soon, after which sharing the random transformation parameters between transforms should be much easier.

@carlogarro

Did it succeed?

@yassineAlouini
Contributor

Hello @felixgwu and sorry for taking so long to get back to you.

As you might know, there is a new dataset API being designed and existing datasets will be ported to it. Here is a thread explaining the logic behind it: #5336.

Also, there is this thread that discusses adding the CamVid dataset: #60. You are probably aware of it.

As far as I understand, it would be better to wait a bit until the new design is stable; then it would be best to port the code here to the new design. Someone can help you do it, with proper attribution given to you, or you can do it yourself if you want to. What do you think @felixgwu?

Also, since this PR is a bit old, maybe you don't need it anymore and/or have found an alternative. Any feedback is welcome @felixgwu.

Thanks again and sorry for the long delay.

@pmeier
Collaborator

pmeier commented Jun 7, 2022

@yassineAlouini This PR adds two things:

  1. The image dataset CamVid (I erroneously thought this was a video dataset in CamVid dataset #60)
  2. Joint transformations for images and segmentation masks.

The second part is well covered by the transforms rework that is currently going on. If I'm not mistaken we already have transformations for everything proposed here in torchvision.prototype.transforms.

As for the dataset, the implementation seems pretty straightforward, and porting it to the new API should be simple. So far we have mostly looked into classification and detection datasets, but segmentation datasets will follow soon. We should wait at least until then before we have a go at this.
