Add U-Net model #899

adamjstewart · 2019-05-11T00:37:06Z

This PR adds the U-Net model to torchvision. U-Net is very popular in image segmentation, especially in the biomedical imaging space.

Questions:

What's the requirement for pretrained models? I'm not sure what the standard dataset for image segmentation is, or which hyperparameters would work best. I just tried to follow the original U-Net paper as best I could.
The unit tests seem to be failing. It looks like all segmentation models are supposed to be defined in segmentation.py? This seems cumbersome. Any suggestions for how to improve this?

fmassa · 2019-05-14T16:27:44Z

Thanks for the PR!

> What's the requirement for pretrained models? I'm not sure what the standard dataset for image segmentation is, or which hyperparameters would work best. I just tried to follow the original U-Net paper as best I could.

We are currently focusing on COCO / Pascal for semantic segmentation tasks. I'll be uploading pre-trained weights for those models soon; they were trained with the references/segmentation/train.py script available in torchvision.
Having pre-trained weights is a requirement, and they should match the reported accuracies within a few %.
I could try giving it a shot and training the model, but I don't think I'll have time before July.

> The unit tests seem to be failing. It looks like all segmentation models are supposed to be defined in segmentation.py? This seems cumbersome. Any suggestions for how to improve this?

My thought on this is that we should create a segmentation folder and move all semantic segmentation models there. This will require a bit of refactoring, but I'm considering doing it anyway, following the detection PR that I'm preparing.

fmassa · 2019-05-14T16:29:20Z

Also, at least for Pascal / COCO, we generally allow the model to take arbitrary backbones, so that one can switch from resnet to resnext for example, and reuse pre-trained weights. It seems that your implementation here doesn't follow this pattern?

adamjstewart · 2019-05-14T17:57:17Z

> I could try giving it a shot and training the model, but I don't think I'll have time before July.

That's fine by me, no rush from my end.

> My thought on this is that we should create a segmentation folder and move all semantic segmentation models there. This will require a bit of refactoring, but I'm considering doing it anyway, following the detection PR that I'm preparing.

I agree, I'm fine with waiting until you finish your detection PR. I think there should be a folder for classification, object_detection, semantic_segmentation, and instance_segmentation.

> Also, at least for Pascal / COCO, we generally allow the model to take arbitrary backbones, so that one can switch from resnet to resnext for example, and reuse pre-trained weights. It seems that your implementation here doesn't follow this pattern?

I'm not sure if U-Nets fit that description. ResNets usually condense an image down to a single pixel (with multiple channels), followed by fully connected layers. U-Nets condense an image to a smaller 32x32 grid, then upsample the image to its original resolution.
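To make the resolution argument above concrete, here is a rough sketch (not part of the PR) that traces the spatial size through both architectures. It assumes the 572x572 input tile and unpadded 3x3 convolutions from the original U-Net paper, and a 224x224 input for a standard ImageNet ResNet. The helper names are hypothetical, and each ResNet stage is approximated by a single padded 3x3 convolution carrying the stage's stride.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def unet_encoder_sizes(size=572, depth=4):
    """Spatial size after each down-sampling step of the paper's contracting path."""
    sizes = []
    for _ in range(depth):
        size = conv_out(size, kernel=3)            # unpadded 3x3 conv
        size = conv_out(size, kernel=3)            # unpadded 3x3 conv
        size = conv_out(size, kernel=2, stride=2)  # 2x2 max pool
        sizes.append(size)
    return sizes


def resnet_sizes(size=224):
    """Spatial size after each stage of a standard ImageNet ResNet (simplified)."""
    sizes = []
    size = conv_out(size, kernel=7, stride=2, padding=3)  # conv1
    sizes.append(size)
    size = conv_out(size, kernel=3, stride=2, padding=1)  # max pool
    sizes.append(size)
    for stride in (1, 2, 2, 2):  # layer1..layer4, modeled as one strided conv each
        size = conv_out(size, kernel=3, stride=stride, padding=1)
        sizes.append(size)
    sizes.append(1)  # global average pooling collapses the map to 1x1
    return sizes


print(unet_encoder_sizes())  # [284, 140, 68, 32]
print(resnet_sizes())        # [112, 56, 56, 28, 14, 7, 1]
```

Under these assumptions, the 32x32 grid is what the paper's encoder produces for a 572x572 tile, and the ResNet only reaches a single pixel at its global average pooling step; the last convolutional stage still outputs a 7x7 feature map.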

Zhaoyi-Yan

I'm not familiar with U-Net, but it seems people prefer to change the architecture by using Conv2d with stride=2.

torchvision/models/unet.py

ekagra-ranjan

It would be great if you could answer my query about out_channels arg.

torchvision/models/unet.py

adamjstewart · 2019-05-21T21:00:20Z

torchvision/models/unet.py

+    """`U-Net <https://arxiv.org/pdf/1505.04597.pdf>`_ architecture.
+
+    Args:
+        in_channels (int, optional): number of channels in input image


The reason I chose in_channels=1 as the default for U-Net is that the original U-Net paper is modeled this way, using a single-channel grayscale microscopy dataset (see #900). The application I needed it for was actually 4-channel microscope imagery, but unfortunately PIL doesn't support this (see #882). If we decide to pretrain this on COCO/Pascal, I'm fine with switching the default to 3 channels.

rwightman · 2019-08-18T05:45:38Z

> Also, at least for Pascal / COCO, we generally allow the model to take arbitrary backbones, so that one can switch from resnet to resnext for example, and reuse pre-trained weights. It seems that your implementation here doesn't follow this pattern?

> I'm not sure if U-Nets fit that description. ResNets usually condense an image down to a single pixel (with multiple channels), followed by fully connected layers. U-Nets condense an image to a smaller 32x32 grid, then upsample the image to its original resolution.

It's pretty common for the encoder half of the network to be based on a standard backbone like ResNet or VGG. If I recall, the original U-Net paper was pretty much a VGG net on the encoder side already.

For the decoder, many find that using upsampling provides better results than transpose convolutions. I've seen a few implementations that allow choosing either when the model is constructed.
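A minimal sketch of what such a constructor-time choice could look like, assuming PyTorch. The `UpBlock` class, its `mode` argument, and the channel layout are hypothetical illustrations, not code from this PR or from any of the implementations mentioned.

```python
import torch
from torch import nn


class UpBlock(nn.Module):
    """Hypothetical decoder block: upsample by 2x, then fuse with the skip connection."""

    def __init__(self, in_channels, skip_channels, out_channels, mode="upsample"):
        super().__init__()
        if mode == "transpose":
            # Learned 2x upsampling via a transposed convolution.
            self.up = nn.ConvTranspose2d(in_channels, in_channels, kernel_size=2, stride=2)
        elif mode == "upsample":
            # Fixed bilinear interpolation followed by a regular convolution.
            self.up = nn.Sequential(
                nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
                nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1),
            )
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels + skip_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.up(x)                   # double the spatial resolution
        x = torch.cat([x, skip], dim=1)  # concatenate along the channel axis
        return self.conv(x)


x = torch.randn(1, 8, 16, 16)      # low-resolution decoder input
skip = torch.randn(1, 4, 32, 32)   # matching encoder feature map
for mode in ("upsample", "transpose"):
    out = UpBlock(8, 4, 6, mode=mode)(x, skip)
    print(mode, tuple(out.shape))
```

Both modes produce the same output shape here, so the choice can stay a constructor argument without affecting the rest of the network. Unlike the original paper, this sketch uses padded convolutions so the skip connection needs no cropping.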

rwightman · 2019-08-20T18:15:55Z

Further to my previous comment, there are some PyTorch U-Nets with support for different backbones and the upsampling decoder blocks I mentioned.

It'd be nice to have a normalization layer option, ideally flexible like TernausNet. There is a worthwhile reference in both of Vladimir's TernausNet implementations about the transpose conv and the potential artifacts resulting from it when implemented as in the original paper.

adamjstewart · 2020-01-23T21:05:40Z

@fmassa What's the status of this PR? Is this something that is still wanted, or should I close it?

adamjstewart · 2020-03-05T21:22:44Z

It seems like this PR has been abandoned by upstream, so I'm going to close it. Feel free to reopen or steal these commits to make a new PR.

Zhaoyi-Yan reviewed May 18, 2019

ekagra-ranjan reviewed May 21, 2019

adamjstewart added 3 commits May 21, 2019 15:52

Add U-Net model

2a3a733

Perform ReLU inplace

4d570a1

out_channels -> num_classes

e02d5dc

adamjstewart force-pushed the models/unet branch from 2eb6033 to e02d5dc Compare May 21, 2019 20:52

adamjstewart closed this Mar 5, 2020

adamjstewart deleted the models/unet branch July 28, 2024 19:42
