Pix2Pix (CVPR'2017)

Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks

Task: Image2Image

Abstract

We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

Results and Models

Results from Pix2Pix trained by mmagic

We use `FID` and `IS` metrics to evaluate the generation performance of pix2pix.¹
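
For reference, the commonly used definitions of the two metrics are sketched below. They are not taken from this README, and the symbols are introduced here only for illustration: FID (Fréchet Inception Distance) compares Gaussian fits to Inception features of real and generated images, with means $\mu_r, \mu_g$ and covariances $\Sigma_r, \Sigma_g$; IS (Inception Score) uses the Inception classifier's label distribution $p(y \mid x)$ over generated images $x \sim p_g$. Under these definitions a lower FID and a higher IS are usually read as better. The exact feature extractor and sample counts used for the numbers in this README are not stated here.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)

\mathrm{IS} = \exp\!\left( \mathbb{E}_{x \sim p_g}\,
  D_{\mathrm{KL}}\!\left( p(y \mid x) \,\Vert\, p(y) \right) \right)
```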

Model	Dataset	FID	IS	Download
Ours	facades	124.9773	1.620	model | log²
Ours	aerial2maps	122.5856	3.137	model
Ours	maps2aerial	88.4635	3.310	model
Ours	edges2shoes	84.3750	2.815	model

FID comparison with official:

Dataset	facades	aerial2maps	maps2aerial	edges2shoes	average
official	119.135	149.731	102.072	75.774	111.678
ours	124.9773	122.5856	88.4635	84.3750	105.1003

IS comparison with official:

Dataset	facades	aerial2maps	maps2aerial	edges2shoes	average
official	1.650	2.529	3.552	2.766	2.624
ours	1.620	3.137	3.310	2.815	2.7205
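
The "average" columns in the two comparison tables can be checked with a few lines of Python. This is a minimal sketch that assumes the average is the unweighted mean over the four datasets; the dictionary and function names are made up for this example.

```python
# Scores copied from the two comparison tables above, in column order:
# facades, aerial2maps, maps2aerial, edges2shoes.
FID = {
    "official": [119.135, 149.731, 102.072, 75.774],
    "ours": [124.9773, 122.5856, 88.4635, 84.3750],
}
IS = {
    "official": [1.650, 2.529, 3.552, 2.766],
    "ours": [1.620, 3.137, 3.310, 2.815],
}


def average(scores):
    """Unweighted mean over the four datasets (an assumption of this sketch)."""
    return sum(scores) / len(scores)


fid_avg = {name: average(vals) for name, vals in FID.items()}
is_avg = {name: average(vals) for name, vals in IS.items()}

for name in ("official", "ours"):
    print(f"{name:8s}  FID avg = {fid_avg[name]:.4f}  IS avg = {is_avg[name]:.4f}")
```

The recomputed means agree with the tabulated averages to the printed precision. The averages hide per-dataset differences: on FID, the official numbers are lower for facades and edges2shoes, while ours are lower for aerial2maps and maps2aerial.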

Note:

1. We strictly follow the setting in Section 3.3 of the paper: "At inference time, we run the generator net in exactly the same manner as during the training phase. This differs from the usual protocol in that we apply dropout at test time, and we apply batch normalization using the statistics of the test batch, rather than aggregated statistics of the training batch." In practice this means running inference in `model.train()` mode, so results may differ slightly from run to run.
2. This training log predates the refactoring. Updated logs will be released soon.
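
The run-to-run variation described in the first note can be illustrated with a framework-free toy. The sketch below is not the mmagic or pix2pix implementation: `dropout`, `batch_norm`, and `toy_generator` are hypothetical stand-ins written in plain Python, and the `training` flag plays the role of train mode versus eval mode.

```python
import random


def dropout(values, p, rng, training):
    """Inverted dropout: zero each value with probability p, rescale the rest."""
    if not training or p == 0.0:
        return list(values)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def batch_norm(values, running_mean, running_var, training, eps=1e-5):
    """Normalize with the current batch's statistics when training,
    otherwise with the stored (aggregated) statistics."""
    if training:
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
    else:
        mean, var = running_mean, running_var
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


def toy_generator(batch, rng, training):
    # Hypothetical stand-in for one generator layer: normalize, then dropout.
    normed = batch_norm(batch, running_mean=0.0, running_var=1.0, training=training)
    return dropout(normed, p=0.5, rng=rng, training=training)


rng = random.Random(0)
batch = [float(i) for i in range(64)]  # toy "test batch"

# Train-mode inference (the paper's protocol): fresh dropout mask on every call.
train_a = toy_generator(batch, rng, training=True)
train_b = toy_generator(batch, rng, training=True)

# Eval-mode inference (the usual protocol): no dropout, stored statistics.
eval_a = toy_generator(batch, rng, training=False)
eval_b = toy_generator(batch, rng, training=False)

print("train-mode outputs identical:", train_a == train_b)
print("eval-mode outputs identical: ", eval_a == eval_b)
```

In this toy, two train-mode passes over the same input differ because each draws a new dropout mask, while the eval-mode passes are deterministic. This is one reason FID and IS measured under the paper's protocol can shift slightly between evaluations.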

Citation

@inproceedings{isola2017image,
  title={Image-to-image translation with conditional adversarial networks},
  author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={1125--1134},
  year={2017},
  url={https://openaccess.thecvf.com/content_cvpr_2017/html/Isola_Image-To-Image_Translation_With_CVPR_2017_paper.html},
}
