Reproducing results #17
Hi, your training details look good to me.
I know that your boss said that you cannot release the training code, but can you maybe release an example of the cropped images, trimaps, and alpha mattes in a training batch? Maybe I can find a visual difference. For example, this is a batch from my training code where I tried cropping without resizing and discarding crops with mean alpha below 0.2 or above 0.8 to better focus on the unknown region:

(Batch visualization: image, trimap, mask of the unknown region for the loss function, ground truth alpha, predicted alpha.)
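In code, the discarding criterion I mean is roughly this (just a sketch; `alpha_crop` is assumed to be a float array in [0, 1]):

```python
import numpy as np

def keep_crop(alpha_crop, low=0.2, high=0.8):
    # Keep only crops whose mean alpha lies in [low, high], i.e. discard crops
    # dominated by pure foreground or pure background.
    mean_alpha = float(np.mean(alpha_crop))
    return low <= mean_alpha <= high
```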
I don't understand how that would help because all alpha values are in [0, 1]. Xu et al. (Deep Image Matting) say:
As I understand it, this means that they choose a random rectangle in the ground truth alpha matte and discard it if the center pixel is known, or in other words, if the center pixel is 100% foreground or 100% background. Maybe my English is not good, so here is some code to explain:

```python
from PIL import Image
import numpy as np

def crop_centered(alpha):
    while True:
        # pick a random rectangle with corner (x, y) and size 320x320
        x = np.random.randint(alpha.shape[1] - 320)
        y = np.random.randint(alpha.shape[0] - 320)
        cropped = alpha[y:y+320, x:x+320]
        center_pixel = cropped[160, 160]
        # found a good rectangle if the center pixel is unknown
        if center_pixel != 0 and center_pixel != 255:
            return cropped

Image.fromarray(crop_centered(np.array(Image.open("GT04.png").convert("L")))).show()
```
@983 I work from home now. When I return to my office, I'll share with you some pieces of code showing how I crop images.
@983 Here is the code I use to randomly crop images: `class RandomCrop(object): ...`
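Since the snippet above is cut off, here is a rough illustration of how a DIM-style random crop centered on an unknown pixel could be written (my own sketch, not necessarily the repository's actual `RandomCrop`; it assumes the trimap marks unknown pixels with the value 128 and that inputs are at least `crop_size` pixels on each side):

```python
import numpy as np

class RandomCropSketch(object):
    """Hypothetical crop transform centered on an unknown trimap pixel."""

    def __init__(self, crop_size=320):
        self.crop_size = crop_size

    def __call__(self, image, trimap, alpha):
        h, w = trimap.shape[:2]
        half = self.crop_size // 2
        # candidate centers: unknown pixels far enough from the border
        ys, xs = np.where(trimap == 128)
        valid = (ys >= half) & (ys < h - half) & (xs >= half) & (xs < w - half)
        if valid.any():
            i = np.random.randint(valid.sum())
            cy, cx = ys[valid][i], xs[valid][i]
        else:
            # no usable unknown pixel; fall back to the image center
            cy, cx = h // 2, w // 2
        rows = slice(cy - half, cy + half)
        cols = slice(cx - half, cx + half)
        return image[rows, cols], trimap[rows, cols], alpha[rows, cols]
```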
Thank you very much for the training code; I'll update here once it is finished.

EDIT: My results are:
Thanks for the great work! I have also tried using the provided training code to reproduce the results. The only thing I changed is num_workers (4 to 16) to speed up the training. I also get a similar result, with SAD = 49.26 and MSE = 0.0143. The results are good compared to DIM, yet there still exists a significant margin to the provided model (SAD = 45.8 and MSE = 0.013). I wonder if you have any clue about what leads to the difference?
Hi all @983 @yucornetto,
I also increased the number of workers (to 8), but made no changes otherwise. Maybe that makes a difference? It really shouldn't, but who knows. I'll try 4 this time.
@983 I don't think that the number of workers is an issue. But I ran into a problem where, when I terminated the training halfway and resumed from the checkpoint, the final results were always worse than training without stopping. This suggests that how the images are sampled affects the performance. I have stuck with this sampling strategy just to match what is used in Deep Image Matting for a fair comparison, but I think there must exist a better way to do data augmentation reliably (e.g., crop 512x512 instead of 320x320). Hope my experience helps.
I also think that better data augmentation could improve results, but training takes a really long time, so it is hard to evaluate what works and what doesn't. It might be interesting to train a smaller model on smaller images and evaluate to what extent the findings transfer to larger models. For example, Marco Forte et al. (FBA matting) did some work recently where they found that a batch size of 1 works really well, but training took weeks, so it is hard to isolate the exact reason why this works. If the model were faster to train, it would be much faster to run experiments.
@poppinace I trained the model on a single GPU without stopping and resuming, as suggested. Thanks for the advice; I will try modifying the sampling strategy to see if it helps.
@983 I know that paper. I reserve my opinion about the 1-batch strategy because it does not report performance when bs >= 16. It is unfair to compare small batch sizes with 1-batch instance norm. I agree that you should find a proxy task to validate your idea. I have seen some papers use a resized dataset such that the whole dataset can be loaded into memory to speed up training. One could also composite each fg with only 2 or 3 bgs to construct a small dataset. The key is that the small dataset should be representative enough to serve as a replacement for the full dataset. You can think about it.
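Compositing for such a small dataset follows the standard matting equation I = alpha * F + (1 - alpha) * B; a minimal sketch (the file paths and the choice of 2 backgrounds per foreground are only an illustration):

```python
import numpy as np
from PIL import Image

def composite(fg_path, alpha_path, bg_path):
    # Standard matting composition: I = alpha * F + (1 - alpha) * B
    fg = np.asarray(Image.open(fg_path).convert("RGB"), dtype=np.float32)
    alpha = np.asarray(Image.open(alpha_path).convert("L"), dtype=np.float32)[..., None] / 255.0
    bg = Image.open(bg_path).convert("RGB").resize((fg.shape[1], fg.shape[0]))
    bg = np.asarray(bg, dtype=np.float32)
    merged = alpha * fg + (1.0 - alpha) * bg
    return Image.fromarray(merged.astype(np.uint8))

# e.g. pair each foreground with just 2 backgrounds to build the proxy set
# composite("fg/GT04.png", "alpha/GT04.png", "bg/000001.jpg").save("proxy/GT04_0.png")
```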
Here are the results from the latest run. The SAD after 30 epochs is slightly worse than before (
I think most of the training cost is decoding the PNG images. It is probably fine to store them as BMP instead since natural images don't compress well anyway. I'll try proxy tasks now, maybe I can find something useful.
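For example, a one-off conversion pass over the composited training images could look like this (the directory layout is just a guess):

```python
import glob
import os
from PIL import Image

# Re-encode PNGs as uncompressed BMPs once, trading disk space for faster
# decoding in the dataloader. The merged training images are RGB.
for png_path in glob.glob("Training_set/merged/*.png"):
    bmp_path = os.path.splitext(png_path)[0] + ".bmp"
    Image.open(png_path).convert("RGB").save(bmp_path)
```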
It helps a lot. Thank you very much for your time.
Hi @983, here are my validation results per epoch. They are quite stable in the last few epochs.
Hi @poppinace, I've noticed issue #11. I was wondering if the performance discrepancy is due to the difference in the number of channels of the second convolutional layer in the index block? Is the performance of SAD 45.8 reported on that model? Thanks a lot!
Hi @hejm37, the performance is NOT reported on the model with the doubled number of channels. It was just a mistake when I computed the number of parameters.
Thanks for your response @poppinace! I've also tried to train the network, but the result I got is similar to what 983 got. The best SAD I got so far is 46.96 (trained three times). Maybe it is just because of a different random seed.
@hejm37 I see. Maybe it is about the hardware platform. The model was trained on a supercomputer, which uses a different system. I have had an experience where the same code (not deep learning) running on Windows and Mac produced different results. I think such numerical differences should be normal, especially for deep learning. Your reproduced results look good to me.
I found a solution. NumPy produces the same "random" values in every worker and every epoch because all dataloader workers start from the same NumPy random state. The fix is to seed the RNG differently in each worker using a `worker_init_fn`. I get MSE 0.01286 and SAD 43.8 after just 23 epochs.
@983 Awesome! I'll fix this.
I think the fix could still be improved. Currently, only `indexnet_matting/scripts/hldataset.py` line 109 (commit 4beb06a) ...

In addition, `indexnet_matting/scripts/hltrainval.py` line 161 (commit 4beb06a) ...
Seeding with `np.random.get_state()[1][0] + worker_id` still produces the same values in every epoch:

```python
import numpy as np
import torch
import torch.utils.data

torch.manual_seed(0)

class MyDataset(torch.utils.data.Dataset):
    def __getitem__(self, index):
        return np.random.randint(1000)

    def __len__(self):
        return 4

def worker_init_fn(worker_id):
    np.random.seed(np.random.get_state()[1][0] + worker_id)

dataset = MyDataset()

dataloader = torch.utils.data.DataLoader(
    dataset,
    batch_size=1,
    num_workers=2,
    worker_init_fn=worker_init_fn)

for epoch in range(3):
    print("Epoch", epoch)
    for batch in dataloader:
        print(batch)
    print()
```

The output is the same for each epoch.
The PyTorch documentation recommends the following `worker_init_fn`:

```python
import random

import numpy as np
import torch

def worker_init_fn(worker_id):
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)
```
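Roughly how it would be attached to the training DataLoader (a sketch reusing `MyDataset` from the example above; the fixed `generator` follows the same PyTorch reproducibility recipe):

```python
# Reuses MyDataset and the recommended worker_init_fn defined above.
g = torch.Generator()
g.manual_seed(0)

dataloader = torch.utils.data.DataLoader(
    MyDataset(),
    batch_size=1,
    num_workers=2,
    worker_init_fn=worker_init_fn,
    generator=g)
```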
Hi, I appreciate your rigor. Can you submit a pull request?
I tried to reproduce the results on the Adobe 1k Dataset and got exactly the same numbers when using the pretrained model. Very good job with that :)
I also tried to train the model from scratch, but did not succeed yet. Do you have any tips?
What I got so far:
What it should look like:
As you can see, your model produces much sharper results.
My training procedure:
Model:
I've also tried:

- `pretrained=False`
- `freeze_bn=True`
I am not sure about first cropping and then resizing, as described in Deep Image Matting, because in every batch it produces a few trimaps that are 100% unknown region. Also, it is impossible to crop a 640x640 region from some alpha mattes because they don't have unknown pixels to center the cropped region on.
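One possible workaround (my own idea, not something from the paper) would be to fall back to a smaller crop size when no unknown pixel can serve as a valid crop center:

```python
import numpy as np

def pick_crop_size(trimap, sizes=(640, 480, 320)):
    # Try the largest crop first and fall back when no unknown pixel (value 128)
    # is far enough from the border to be a valid crop center.
    h, w = trimap.shape[:2]
    ys, xs = np.where(trimap == 128)
    for size in sizes:
        half = size // 2
        valid = (ys >= half) & (ys < h - half) & (xs >= half) & (xs < w - half)
        if valid.any():
            return size
    return None  # no usable unknown pixel at any size
```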