Train pix2pix with my own data #309

Closed
happyday521 opened this issue Jun 28, 2018 · 16 comments

@happyday521

happyday521 commented Jun 28, 2018

I am training the pix2pix model on my own data.
I stack A and B (a pair) as the input to my model, and I want to get C as the output.
B is the segmentation label for C, and A provides the RGB information for generating C.
However, the training result is not what I want: the fake C looks very much like A and not like the real C.
The loss plot is as follows:

[Image: loss plot]

How can I reduce the influence of A so that the fake C looks like the real C rather than A? How should I change the pix2pix model (I am currently using the defaults)? Can someone give me some advice? Thanks!

@phamnam95

How did you stack A and B as a pair? Did you stack along the channel dimension? For example, if A has dimensions (H1, W1, C1) and B has (H1, W1, C2), did you stack along the channel dimension to get (H1, W1, C1+C2)? Does your segmentation label contain only the values 0 and 1, or values between 0 and 1? How do you normalize the data after stacking? (I guess the ranges of your RGB image and segmentation result are different.)
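
For readers following along, channel-wise stacking with a shared value range can be sketched as below. This is only an illustration with NumPy, not code from the repository: the helper names `to_unit_range` and `stack_channels` are made up, and it assumes both A and B are 8-bit images with values in 0..255. The network's first layer would also have to accept C1+C2 input channels; the repository appears to expose this as the `--input_nc` option, but check its options file before relying on that.

```python
import numpy as np


def to_unit_range(img):
    """Map an 8-bit image (values 0..255) to float32 values in [-1, 1]."""
    return img.astype(np.float32) / 127.5 - 1.0


def stack_channels(a, b):
    """Normalize two (H, W, C) arrays and concatenate them along the channel axis."""
    if a.shape[:2] != b.shape[:2]:
        raise ValueError("A and B must share the same height and width")
    return np.concatenate([to_unit_range(a), to_unit_range(b)], axis=2)


# Hypothetical data: A is an RGB image, B is an RGB-encoded segmentation label.
a = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)
b = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)

stacked = stack_channels(a, b)
print(stacked.shape)  # (256, 256, 6)
```

Normalizing each image before concatenation keeps A and B on the same scale, which addresses the range question raised above. If the label were a 0/1 mask instead of an 8-bit image, it would need its own mapping.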

@happyday521
Author

Yes, I stack A and B along the channel dimension, as you said. My segmentation label is also an RGB image, so I think their ranges are similar. Do you have any advice? Thanks.

@phamnam95

I am working on a similar problem, but my A and B have different ranges. I am having the same problem as you, and I am looking for a solution as well.

@happyday521
Author

OK, good luck! If you have any useful ideas, please tell me!

@junyanz
Owner

junyanz commented Jul 4, 2018

You should stack the two images as (H1, W1+W1, C1), where we assume C1=C2. For other types of data, you may consider writing your own data loader that inherits from base_dataset.
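
The width-wise layout described here can be sketched with NumPy as follows. This is a minimal illustration, not the repository's script: `combine_side_by_side` is a hypothetical helper, and the two arrays stand in for an input image and its target.

```python
import numpy as np


def combine_side_by_side(a, b):
    """Place two (H, W, C) images next to each other, giving (H, 2*W, C)."""
    if a.shape != b.shape:
        raise ValueError("both images must have the same shape (C1 == C2)")
    return np.concatenate([a, b], axis=1)  # axis 1 is the width axis


# Hypothetical single-channel images: an all-black input and an all-white target.
a = np.zeros((200, 200, 1), dtype=np.uint8)
b = np.full((200, 200, 1), 255, dtype=np.uint8)

ab = combine_side_by_side(a, b)
print(ab.shape)  # (200, 400, 1)
```

The left half of the combined image holds the input and the right half holds the target, so one file on disk carries one training pair.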

@phamnam95

Why do we need to stack along the width dimension?

@junyanz
Owner

junyanz commented Jul 4, 2018

@phamnam95 It is the current design of the default data loader for pix2pix. You can run this script to concatenate input and output images. It works fine if C1=C2=3 or C1=C2=1. It might not be the best way for your datasets. Feel free to write your own data loader.

@phamnam95

Is it possible for the input image and the output image to have different dimensions? If we stack along the width dimension, is the input image (H, W+W, C) and the output image (H, W, C)?

@junyanz
Owner

junyanz commented Jul 4, 2018

It's not supported by the aligned_dataset.py. Also, the code assumes that H and W are the same for both input and output.

@phamnam95

So if I stack the dataset as you suggested, the dimensions of the input and output will be different. For example, I have two sets of images, each with dimensions (200,200,1), and I want to create an output with dimensions (200,200,1). How should I stack the inputs to feed them into training? If I stack along the width dimension, will it be (200,400,1) for the input and (200,200,1) for the output?

@junyanz
Owner

junyanz commented Jul 4, 2018

If you stack your inputs, the image will be (200, 400, 1).
The aligned_dataset will load the (200, 400, 1) and split it into two: one for input, and one for output.
See this line for more details.
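
The load-and-split step described above can be sketched as follows. This is a simplified NumPy illustration of the idea, not the actual code in aligned_dataset.py, and `split_pair` is a hypothetical name.

```python
import numpy as np


def split_pair(ab):
    """Split a side-by-side (H, 2*W, C) image into its left and right halves."""
    height, width, channels = ab.shape
    if width % 2 != 0:
        raise ValueError("combined image width must be even")
    half = width // 2
    left = ab[:, :half, :]   # network input
    right = ab[:, half:, :]  # target output
    return left, right


# Hypothetical combined image: black left half, white right half.
ab = np.concatenate(
    [np.zeros((200, 200, 1), dtype=np.uint8),
     np.full((200, 200, 1), 255, dtype=np.uint8)],
    axis=1,
)

net_input, target = split_pair(ab)
print(net_input.shape, target.shape)  # (200, 200, 1) (200, 200, 1)
```

After the split, input and output have the same height and width, which is why the combined (200, 400, 1) file does not make the network input larger than its output.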

@phamnam95

I am a little confused because my input consists of two images, A and B, and my output is only one image, C. Thanks.

@junyanz
Owner

junyanz commented Jul 4, 2018

I see. In this special case, you may want to write your own data loader. It should only take 1 hour.

@phamnam95

Should I stack along the channel dimension for A and B?

@junyanz
Owner

junyanz commented Jul 5, 2018

If you write your own data loader, you can load each image separately by the name image0000_A, image0000_B, image0000_C. You don't need to stack them.
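
A loader along these lines might look like the sketch below. It is an assumption-heavy illustration rather than the repository's API: the class name `TripletDataset`, the `load_fn` parameter, and the use of `.npy` files are all hypothetical, chosen so the example runs without image libraries. A real loader would more likely read image files and inherit from the repository's base dataset class.

```python
import os
import tempfile

import numpy as np


class TripletDataset:
    """Map-style dataset that returns A, B and C for each sample id.

    Files are expected to be named <id>_A<ext>, <id>_B<ext>, <id>_C<ext>.
    """

    def __init__(self, root, load_fn=np.load, ext=".npy"):
        self.root = root
        self.load_fn = load_fn
        self.ext = ext
        suffix = "_A" + ext
        # A sample id is the file name with the "_A<ext>" suffix removed.
        self.ids = sorted(
            name[: -len(suffix)]
            for name in os.listdir(root)
            if name.endswith(suffix)
        )

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, index):
        sample_id = self.ids[index]
        return {
            key: self.load_fn(
                os.path.join(self.root, f"{sample_id}_{key}{self.ext}")
            )
            for key in ("A", "B", "C")
        }


# Usage with throwaway arrays written to a temporary directory.
with tempfile.TemporaryDirectory() as root:
    for key in ("A", "B", "C"):
        path = os.path.join(root, f"image0000_{key}.npy")
        np.save(path, np.zeros((200, 200, 1), dtype=np.uint8))
    dataset = TripletDataset(root)
    sample = dataset[0]

print(len(dataset), sorted(sample))  # 1 ['A', 'B', 'C']
```

Each item keeps the three images separate, so the model code can decide how to combine A and B before they reach the generator.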

@phamnam95

I mean that when I train, I guess I cannot feed two separate input images, A and B, into the input tensor. I need to stack them to create a single input image.
