Cityscapes experiment #2
We will update the code in a few days with
|
Thanks. Will the updated code run with pytorch 1.0? I am running into a few problems since some features are deprecated (e.g. volatile variables, |
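For anyone else porting the code from PyTorch 0.3.x, here is a minimal sketch of the standard replacement for the deprecated volatile flag (the model below is just a placeholder, not the one from this repository):

```python
import torch

# PyTorch 0.3.x inference code often used Variable(x, volatile=True).
# Since 0.4/1.0 `volatile` is gone; the equivalent is a no-grad context.
model = torch.nn.Linear(10, 2)   # stand-in model, only for illustration
x = torch.randn(4, 10)

with torch.no_grad():            # replaces volatile=True
    out = model(x)

# Related 0.3.x -> 1.0 changes: tensor.data -> tensor.detach(),
# loss.data[0] -> loss.item().
```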
I am also having an error with the FW step:
What exactly should it contain? Edit: the solution is to replace it with |
I can confirm that those changes worked for me to get the code running with pytorch 1.0. |
@r0456230 Can you tell me what exactly you are trying to reproduce? The config files I put in a comment should give the exact results of MGDA with and without the approximation. Please note that we report the disparity metric in the paper but compute the depth metric in the code; the depth map is converted to disparity separately as a post-processing step. If the issue is depth, this should explain it. mIoU should match exactly what is reported in the code and the paper. We used the parameters I posted in a comment. |
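For reference, a minimal sketch of the kind of depth-to-disparity post-processing described above, using the standard stereo relation; the focal length and baseline defaults are placeholders (take the real values from the Cityscapes camera calibration files), and this is not the exact script used for the paper:

```python
import numpy as np

def depth_to_disparity(depth, focal_length_px=2262.52, baseline_m=0.22):
    """Convert a depth map (meters) to disparity (pixels) via d = f * B / depth.

    The focal length and baseline here are placeholder values; read the real
    ones from the Cityscapes camera calibration files for your split.
    """
    disparity = np.zeros_like(depth, dtype=np.float32)
    valid = depth > 0                               # avoid division by zero
    disparity[valid] = focal_length_px * baseline_m / depth[valid]
    return disparity
```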
@maria8899 Although we are planning to support pytorch 1.0, I am not sure when that will be. I will also update the ReadMe with the exact versions of each Python module we used. PyTorch was 0.3.1. |
@ozansener I was able to reproduce the results from the paper for the single-task models using your code (depth, instance segmentation and semantic segmentation on Cityscapes). |
I think I have managed to make it work with pytorch 1.0, but I still need to check the results and train it fully. |
@maria8899 @r0456230 could you please tell me how to solve this? I tried the code below, but I'm not sure if it's correct:
depth_mean = np.mean(depth[depth != 0])
depth[depth != 0] = (depth[depth != 0] - depth_mean) / self.DEPTH_STD
# depth[depth != 0] = (depth[depth != 0] - self.DEPTH_MEAN[depth != 0]) / self.DEPTH_STD |
@YoungGer you need to compute a mean image (per pixel) using all training images (or just a few to get an approximation) of the Cityscapes disparity dataset.
I know, thank you for your help. |
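A minimal sketch of computing such a per-pixel mean image, assuming the precomputed Cityscapes disparity PNGs are on disk; the directory layout, file pattern, and the choice to average raw disparity values directly are assumptions, not the repository's actual preprocessing:

```python
import glob
import numpy as np
from PIL import Image

def compute_depth_mean(disparity_dir, pattern="*/*_disparity.png", limit=None):
    """Per-pixel mean over training disparity maps, ignoring invalid (0) pixels."""
    paths = sorted(glob.glob(f"{disparity_dir}/{pattern}"))[:limit]
    total, count = None, None
    for p in paths:
        d = np.asarray(Image.open(p), dtype=np.float64)
        if total is None:
            total, count = np.zeros_like(d), np.zeros_like(d)
        valid = d > 0                      # 0 marks invalid pixels in Cityscapes
        total[valid] += d[valid]
        count[valid] += 1
    return np.where(count > 0, total / np.maximum(count, 1), 0.0).astype(np.float32)

# e.g. np.save("depth_mean.npy", compute_depth_mean("cityscapes/disparity/train"))
```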
@YoungGer Have you noticed that the find_min_norm_element method actually uses projected gradient descent? Only find_min_norm_element_FW is the Frank-Wolfe algorithm discussed in the paper. The two are only guaranteed to be equivalent when the number of tasks is 2. |
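To make the two-task case concrete: for exactly two gradients the min-norm point in their convex hull has a closed-form solution, which is why the two solvers coincide there. A small NumPy sketch (not the repository's implementation):

```python
import numpy as np

def min_norm_two_tasks(g1, g2):
    """Closed-form min ||a*g1 + (1-a)*g2||^2 over a in [0, 1].

    Setting the derivative to zero gives a* = <g2 - g1, g2> / ||g1 - g2||^2,
    clipped to [0, 1]. For more than two tasks there is no such closed form,
    hence the iterative Frank-Wolfe / projected-gradient solvers.
    """
    g1, g2 = g1.ravel(), g2.ravel()
    diff = g1 - g2
    denom = diff.dot(diff)
    if denom == 0.0:                      # identical gradients
        return 0.5, g1
    alpha = float(np.clip((g2 - g1).dot(g2) / denom, 0.0, 1.0))
    return alpha, alpha * g1 + (1.0 - alpha) * g2
```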
EDIT: Obviously I realised right after sending that question 2 is explained by the optimization. Question 1 still remains. Hi @ozansener, I have two questions after reading this answer by @maria8899.
In your code the z variable returned by the 'backbone' of the network is passed to each task. First of all, as maria noted, the gradient is a 4D variable that needs to be reshaped to 1D first in torch 1.0.1.
Thanks a lot! |
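A self-contained toy sketch of the reshaping mentioned above (the backbone and task heads below are placeholders, not the paper's architecture): the per-task gradients of the shared feature map z come back as 4D tensors and can be flattened before being handed to the min-norm solver.

```python
import torch
import torch.nn as nn

# Toy shared backbone + two task heads, only to show the gradient shapes.
backbone = nn.Conv2d(3, 8, 3, padding=1)
head_a = nn.Conv2d(8, 1, 1)
head_b = nn.Conv2d(8, 1, 1)

x = torch.randn(2, 3, 16, 16)
z = backbone(x)                                   # shared representation
losses = [head_a(z).mean(), head_b(z).mean()]     # stand-in task losses

flat_grads = []
for loss in losses:
    g = torch.autograd.grad(loss, z, retain_graph=True)[0]   # shape (2, 8, 16, 16)
    flat_grads.append(g.detach().reshape(-1))                 # 4D -> 1D for the solver
```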
@kilsenp First, let me answer the second question.
|
Hi, @maria8899, @kilsenp, |
Hi, @liyangliu,
@youandeme, I haven't reproduced the results on MultiMNIST. I used the same hyperparameters mentioned by the author in #9, but cannot surpass the uniform scaling baseline. Also, I noticed that in the "Gradient Surgery" paper (supplementary material), other researchers report different results on MultiMNIST from this MOO paper. So I suspect that others also have difficulty reproducing the results on MultiMNIST following this MOO paper. |
@liyangliu @youandeme How are you evaluating MultiMNIST? We did not release any test set; actually, there is no test set. The code generates a random test set every time you run it. For all modules, simply use the hyper-params I posted. Then save every epoch's result and choose the epoch with the best validation accuracy. Then call the MultiMNIST test, which will generate a random test set and evaluate it. If you call the MultiMNIST loader with the test param, it should do the trick. If you evaluate this way, the results will not match exactly since the test set is randomly generated, but the ordering of methods is preserved. |
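A short sketch of the selection step in that protocol; the per-epoch accuracies and the checkpoint/loader calls in the comments are hypothetical placeholders, not this repository's API:

```python
# Pick the epoch with the best validation accuracy, reload that checkpoint,
# and only then evaluate on a freshly generated MultiMNIST test split.
val_accuracy = {0: 0.891, 1: 0.902, 2: 0.917, 3: 0.915}   # saved during training
best_epoch = max(val_accuracy, key=val_accuracy.get)       # best validation epoch

# model.load_state_dict(torch.load(f"checkpoints/epoch_{best_epoch}.pth"))
# test_acc = evaluate(model, multi_mnist_loader(split="test"))  # random test set
print(f"evaluate the epoch-{best_epoch} checkpoint on a freshly generated test set")
```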
Hi @ozansener, as you mentioned, the ordering of the different methods (MGDA-UB vs. uniform scaling) stays the same whatever test set I use. But on the validation set, I cannot see the superiority of MGDA-UB over uniform scaling. Also, on Cityscapes I cannot reproduce the results reported in the paper. Actually, I find that the single-task baseline is better than the reported one (10.28 vs. 11.34 and 64.04 vs. 60.68 on the instance and semantic segmentation tasks, respectively). I obtained these numbers with your provided code, so maybe I made some mistakes? |
@liyangliu For MultiMNIST, I think there are issues since we did not release a test set. Everyone reports slightly different numbers. In hindsight, we should have released the test set, but we did not even save it. So I would say please report whatever numbers you obtained for MultiMNIST. For Cityscapes, though, it is strange, as many people have reproduced the numbers. Please send me an e-mail about Cityscapes so we can discuss. |
Thanks, @ozansener. On Cityscapes I re-ran your code on the instance & semantic segmentation tasks and got the following results for MGDA-UB and the single-task baseline, respectively:
It seems that the performance of instance segmentation is a bit strange. |
@liyangliu The instance segmentation result looks strange. Are you using the hyper-params I posted for both single-task and multi-task? Also, are the tasks uniformly scaled, or are you doing any search? Let me know the setup. |
@ozansener, I use exactly the hyper-params you posted for single- and multi-task training. I use 1/0 and 0/1 scales for single-task training (instance and semantic segmentation) and didn't do any grid search. |
Sorry for an off-topic question, but I have trouble even running the training on Cityscapes: for a 256x512 input I get a 32x64 output, while the target is 256x512. The smaller output makes sense to me because of the non-dilated earlier layers and max pooling.
Hi,
Thanks for open sourcing the code, this is great!
Could you share your json parameter file for cityscapes?
Also, I think it is missing the file depth_mean.npy, which is needed to run it. Thanks.