Testing results #14

Open
RafiqueA03 opened this issue Jan 14, 2024 · 3 comments

@RafiqueA03
Hi Mahmoud,
For my work, one of my tasks is to reproduce the results reported in your paper. To that end, I am testing the provided pre-trained model on the INTEL-TAU dataset (7,022 images) with m=7. Since the images are already black-level subtracted, as stated on the dataset website, I am passing resized PNG images (384×256) directly to the model, along with the corresponding .json files (illuminant information). I am also using cross-validation, but without the G multiplier for testing. The results I obtained are: Mean: 2.61, Median: 1.77, Best25: 0.57, Worst25: 1.44, Worst05: 2.16, Tri: 1.95, and Max: 28.39.
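For reference, here is roughly what my preprocessing looks like (a minimal sketch; the paths and the `illuminant` JSON key are just illustrative, not necessarily the dataset's exact field names):

```python
# Rough sketch of my preprocessing for one INTEL-TAU sample; the JSON key
# name ('illuminant') is assumed for illustration and may differ in the files.
import cv2
import json
import numpy as np

def load_sample(png_path, json_path):
    # Images are already black-level subtracted, so read them as-is.
    img = cv2.imread(png_path, cv2.IMREAD_UNCHANGED)  # keeps original bit depth
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32)
    img /= img.max()                                  # scale to [0, 1]
    img = cv2.resize(img, (384, 256), interpolation=cv2.INTER_AREA)  # W x H
    with open(json_path) as f:
        gt_ill = np.asarray(json.load(f)['illuminant'], dtype=np.float32)
    gt_ill /= np.linalg.norm(gt_ill)  # unit-norm ground-truth RGB illuminant
    return img, gt_ill
```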

The results vary only slightly from those reported in the paper, except for Worst25, which differs considerably. As far as I understand, one reason could be the random sample selection in cross-validation. Is that the case, or is there another important step I am missing?
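For context, this is how I aggregate the error statistics -- a standard recovery-angular-error summary, which may of course differ from your evaluation script:

```python
# Recovery angular error and the usual summary statistics; this sketch is my
# own aggregation, not necessarily the repo's evaluation script.
import numpy as np

def angular_error_deg(pred, gt):
    """Angle in degrees between predicted and ground-truth illuminants."""
    pred = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=-1, keepdims=True)
    cos = np.clip(np.sum(pred * gt, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

def summarize(errors):
    e = np.sort(np.asarray(errors))
    n = len(e)
    q1, q2, q3 = np.percentile(e, [25, 50, 75])
    return {
        'Mean': e.mean(),
        'Median': q2,
        'Tri': (q1 + 2 * q2 + q3) / 4,           # trimean
        'Best25': e[: n // 4].mean(),            # mean of the best 25%
        'Worst25': e[-(n // 4):].mean(),         # mean of the worst 25%
        'Worst05': e[-max(1, n // 20):].mean(),  # mean of the worst 5%
        'Max': e.max(),
    }
```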

One more thing worth mentioning: during testing I did not mask out the color checker present in the scenes, which you mention in the paper. Could you provide details on how you did that? I assume masking requires knowing the coordinates of the color checker in each scene.

@mahmoudnafifi
Owner

Hi, there was a mistake in the evaluation script provided in this repo -- it does not affect the reported results, as the script used for evaluation in the paper does not have this bug. I fixed this bug in the public evaluation code here. Please try again with the updated evaluation script, and let me know if you still encounter any issues with 'Worst25'.

Regarding the second point: yes, we masked out the color-checker pixels in datasets that include color charts in the scenes. In those datasets, the color-chart coordinates are provided by the dataset authors. In the INTEL-TAU case, there are no color charts present in the images, so this step is unnecessary.
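Roughly speaking, the masking amounts to something like the sketch below (illustrative only -- the coordinate format varies per dataset, and here it is assumed to be an axis-aligned box):

```python
# Illustrative sketch: exclude color-chart pixels before pooling statistics.
# The chart coordinates are assumed to be an axis-aligned (x0, y0, x1, y1)
# box; the real per-dataset format may differ.
import numpy as np

def chart_mask(img_shape, box):
    x0, y0, x1, y1 = box
    mask = np.ones(img_shape[:2], dtype=bool)
    mask[y0:y1, x0:x1] = False  # drop chart pixels
    return mask                 # apply as img[mask] when building histograms
```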

@RafiqueA03
Author

RafiqueA03 commented Jan 18, 2024

Hi,
Thanks for the update. With the fixed script, the results I obtain are now almost identical.

However, I have another question. The provided pre-trained model is trained with data_num (m) = 7. Does this mean that data_num must also be exactly 7 for testing? If I set data_num to a value less than or greater than 7, I get the error "RuntimeError: Error(s) in loading state_dict for network:".

Could you please clarify: should the model be trained and tested with the same value of data_num? For example, in the training section of the paper, for m=9 it says: "then randomly select eight additional input images for each query image from the training set for use as additional input images".

Also, how are the encoder blocks generated? From the visualization in the paper, it appears that for m images there will be m encoder blocks.

@mahmoudnafifi
Owner

Yes, testing should be performed with the same m, which refers to the number of additional histograms randomly selected plus the input histogram. m=7 means we use 6 additional histograms. The number of encoders is equal to m.
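To see why the checkpoint is tied to m, here is a toy sketch (not the actual network): the model builds one encoder per histogram, so a checkpoint saved with m=7 contains weights for exactly seven encoders and cannot be loaded into a model built with a different m.

```python
# Toy sketch, not the actual network: one encoder branch per histogram means
# the state_dict keys depend on m, so train/test m must match.
import torch.nn as nn

class MultiHistNet(nn.Module):
    def __init__(self, m=7):
        super().__init__()
        # one small encoder per histogram (the input one + m-1 additional)
        self.encoders = nn.ModuleList(
            nn.Sequential(nn.Conv2d(2, 8, 3, padding=1), nn.ReLU())
            for _ in range(m)
        )

state = MultiHistNet(m=7).state_dict()  # stands in for the m=7 checkpoint
try:
    MultiHistNet(m=5).load_state_dict(state)
except RuntimeError as err:
    print(err)  # Error(s) in loading state_dict for MultiHistNet: unexpected keys
```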
