PixelCNN #564

jfrancis71 · 2021-02-19T12:10:34Z


I would like to use PixelCNN for image generation, e.g. on MNIST, CIFAR-10, and CelebA.

The PyTorch Lightning Bolts implementation looks like a straightforward implementation of the main ideas, but I have some comments.

If I have understood the intention of the PixelCNN module correctly, it should output the parameters of each pixel's probability distribution. However, the number of output channels equals the number of input channels, which leaves room for only one parameter per channel per pixel. That is enough to predict a binary pixel value, but PixelCNN can also model color images. Researchers have successfully described the distribution over pixel values with either a discrete (softmax) distribution or a mixture of logistics. Both require multiple parameters per color channel at every pixel.
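To make the channel-count point concrete, here is a minimal sketch of a softmax output head. It is not the Bolts code: `SoftmaxHead`, `NUM_LEVELS`, and the tensor sizes are hypothetical names and values chosen for illustration, and it assumes 8-bit pixels. It also treats the color channels of one pixel as independent given the previous pixels, which is simpler than the channel ordering used in the original paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_LEVELS = 256  # assumption: 8-bit pixel intensities


class SoftmaxHead(nn.Module):
    """Hypothetical head: maps features to one set of logits per pixel and channel."""

    def __init__(self, hidden_channels: int, image_channels: int):
        super().__init__()
        self.image_channels = image_channels
        # A 1x1 convolution does no spatial mixing, so it cannot break the pixel ordering.
        self.proj = nn.Conv2d(hidden_channels, image_channels * NUM_LEVELS, kernel_size=1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        b, _, h, w = features.shape
        logits = self.proj(features)  # raw logits, no final activation
        # (B, C * 256, H, W) -> (B, 256, C, H, W): class dimension second, as cross_entropy expects.
        return logits.view(b, self.image_channels, NUM_LEVELS, h, w).transpose(1, 2)


# Usage with random stand-ins for the masked-convolution features and the target image.
features = torch.randn(2, 16, 8, 8)
target = torch.randint(0, NUM_LEVELS, (2, 3, 8, 8))  # integer pixel values
head = SoftmaxHead(hidden_channels=16, image_channels=3)
logits = head(features)
loss = F.cross_entropy(logits, target)  # mean negative log-likelihood per sub-pixel
```

With this layout an RGB image needs 3 x 256 = 768 output channels rather than 3.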

PixelCNN currently applies a ReLU to the final output, but distribution parameters can, in general, be negative.
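A small worked example of why the final ReLU matters, under the assumption that the single output per channel is read as a Bernoulli logit (I have not confirmed how Bolts intends it to be interpreted). Since sigmoid(0) = 0.5 and ReLU removes negative values, the predicted probability can never fall below 0.5. The `raw` values are made up.

```python
import torch
import torch.nn.functional as F

raw = torch.tensor([-3.0, -0.5, 0.0, 2.0])  # hypothetical pre-activation outputs

p_without_relu = torch.sigmoid(raw)        # can be anywhere in (0, 1)
p_with_relu = torch.sigmoid(F.relu(raw))   # negative logits are clamped to 0, so p >= 0.5
```

Under that assumption, the model could never say that a pixel is more likely off than on.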

The conv_block combines 1x1 convolutions with the PixelCNNMask operation. Would it be better to separate them? That way people could try out different architectures without having to rewrite PixelCNNMask. My experiment also indicates that information is passing through the masked-out values, which is not quite correct (it violates the autoregressive condition).
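As a sketch of what a separated-out mask could look like, here is a standalone masked convolution together with a gradient-based check for leakage. None of this is the Bolts code: `MaskedConv2d` and `leaks_future` are hypothetical names, the check assumes a single-channel input, and the A/B mask types follow the usual PixelCNN convention (type A hides the center pixel, type B keeps it).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedConv2d(nn.Conv2d):
    """Conv2d whose kernel only sees pixels above, or to the left of, the center."""

    def __init__(self, mask_type: str, *args, **kwargs):
        super().__init__(*args, **kwargs)
        if mask_type not in ("A", "B"):
            raise ValueError("mask_type must be 'A' or 'B'")
        kh, kw = self.kernel_size
        mask = torch.ones(kh, kw)
        first_hidden_col = kw // 2 + (1 if mask_type == "B" else 0)
        mask[kh // 2, first_hidden_col:] = 0  # center row: hide center (type A only) and right
        mask[kh // 2 + 1:, :] = 0             # hide every row below the center
        self.register_buffer("mask", mask[None, None])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.conv2d(x, self.weight * self.mask, self.bias,
                        self.stride, self.padding, self.dilation, self.groups)


def leaks_future(model: nn.Module, height=7, width=7, row=3, col=3) -> bool:
    """True if the output at (row, col) depends on that pixel or any later one."""
    x = torch.randn(1, 1, height, width, requires_grad=True)
    model(x)[0, :, row, col].sum().backward()
    future = torch.zeros(height, width, dtype=torch.bool)
    future[row, col:] = True       # current pixel and the rest of its row
    future[row + 1:, :] = True     # all rows below
    return bool((x.grad[0, 0][future] != 0).any())


masked_stack = nn.Sequential(
    MaskedConv2d("A", 1, 8, 3, padding=1), nn.ReLU(),
    MaskedConv2d("B", 8, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=1),
)
masked_ok = not leaks_future(masked_stack)                     # should be True
plain_leaks = leaks_future(nn.Conv2d(1, 1, 3, padding=1))      # should be True
```

The same check could be pointed at any candidate architecture, which is the main reason to keep the mask reusable.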

Does PixelCNN need more documentation? I don't think it is obvious to people not familiar with PixelCNN how to train this model, and sampling from it does require some code.
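For illustration, this is roughly what the training and sampling code looks like for a softmax output. It is a sketch under assumptions, not the Bolts API: `train_step` and `sample` are hypothetical helpers, and `model` is assumed to be any autoregressive network that maps a float image of shape (B, C, H, W) in [0, 1] to logits of shape (B, num_levels, C, H, W).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def train_step(model: nn.Module, optimizer, images: torch.Tensor, num_levels: int = 256) -> float:
    """One maximum-likelihood step on a batch of images with values in [0, 1]."""
    model.train()
    target = (images * (num_levels - 1)).round().long()  # integer pixel values
    logits = model(images)                               # (B, num_levels, C, H, W)
    loss = F.cross_entropy(logits, target)               # mean NLL in nats per sub-pixel
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


@torch.no_grad()
def sample(model: nn.Module, shape, num_levels: int = 256) -> torch.Tensor:
    """Draw images one pixel at a time in raster order. shape is (B, C, H, W)."""
    model.eval()
    device = next(model.parameters()).device
    images = torch.zeros(shape, device=device)
    _, _, height, width = shape
    for row in range(height):
        for col in range(width):
            # Re-run the network so the prediction sees all pixels sampled so far.
            logits = model(images)[:, :, :, row, col]     # (B, num_levels, C)
            dist = torch.distributions.Categorical(logits=logits.transpose(1, 2))
            values = dist.sample()                        # (B, C) integer levels
            images[:, :, row, col] = values.float() / (num_levels - 1)
    return images
```

Note that sampling written this way needs one forward pass per pixel, so a 28x28 image takes 784 passes.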

To be clear, I am not proposing to implement the full original PixelCNN model, just to make the existing PixelCNN in PyTorch Lightning Bolts easier to use correctly.

Please let me know if I have misunderstood the model, am duplicating other efforts, have missed some documentation, or any other feedback.

jfrancis71 · 2021-03-05T08:34:40Z


Just as an update, I have implemented a small simple PixelCNN model:

PixelCNN

It works well on MNIST, and can also be conditioned on class labels there.
It can also be used on CIFAR-10, where it does well on local features such as texture but is not so good at global image structure.

I am looking at more advanced implementations. If anyone else is interested in PixelCNN, please let me know.


jfrancis71 · 2021-03-26T10:03:16Z


Another update:

I have had good experiences with using . It's a much more complete PixelCNN implementation
(after a couple of minor changes to support Python 3 and the latest PyTorch).

Due to lack of community interest, please feel free to close this discussion (I would but don't know how).
