
Poorly generated novel views for image pairs other than source id 0 and 1 #75

Open
pengcanon opened this issue Dec 13, 2024 · 6 comments

Comments

@pengcanon

Hello,

I tried changing the source IDs to, e.g., [0, 2] or [0, 3]..., anything other than the default [0, 1], based on the provided dataset, and the rendered result is quite poor; see below. Is this anticipated? I suspect the cause is inaccurate depth estimation for views separated by more than 22.5 degrees. Any thoughts?
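Concretely, the change was along these lines (the exact variable name in the config is approximate):

```python
# Illustrative only: the exact key/variable name in the repo's config differs.
source_ids = [0, 2]  # default is [0, 1]; adjacent views are 22.5 degrees apart
```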
[Image: 0001_novel00, the poorly rendered novel view]

@ShunyuanZheng
Collaborator

Hi @pengcanon, since the rendered training data is under the 22.5° camera setup, the model trained on this data only works in this setting. If you want to generalize to more diverse input settings, you should make your training data cover the target camera setups.

@pengcanon
Author

> Hi @pengcanon, since the rendered training data is under the 22.5° camera setup, the model trained on this data only works in this setting. If you want to generalize to more diverse input settings, you should make your training data cover the target camera setups.

Thanks for the reply. I suspected that would be the case too. But I was wondering whether the failure is largely due to inaccurate depth estimation, to the subsequent feed-forward splat generation, or both. What are your thoughts?

Also, I extracted the disparity map at the intermediate step, and it doesn't look very accurate even for image pairs at the 22.5° separation; see below. That makes me wonder how much a more accurate depth estimation would improve the overall result. My understanding is that your work bears some similarity to pixelSplat (CVPR 2024), which claims that the only bottleneck is depth estimation. Is my understanding correct?

@pengcanon
Author

pengcanon commented Dec 16, 2024

Disparity map below:
[Image: disparity map]
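For reference, I dumped it roughly like this (the helper name and normalization are my own, not code from the repo):

```python
import numpy as np
from PIL import Image

def save_disparity_png(disp, path="disp.png"):
    """Normalize a disparity map (H, W) to [0, 255] and save it for inspection."""
    d = np.asarray(disp, dtype=np.float32)
    d = (d - d.min()) / max(float(d.max() - d.min()), 1e-8)
    Image.fromarray((d * 255).astype(np.uint8)).save(path)
```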

@ShunyuanZheng
Collaborator

The depth estimates are crucial to the rendering quality. The depth (or disparity) under the 22.5° setup is reasonably precise; you can save the intermediate point clouds for a more intuitive visualization. You can also check the stage 1 validation results, which are generated directly via depth warping.
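Something along these lines works for the point-cloud dump (an illustrative helper, not code from this repo), assuming you have the predicted depth map and the pinhole intrinsics K as numpy arrays:

```python
import numpy as np

def save_depth_as_ply(depth, K, path="cloud.ply"):
    """Back-project a depth map (H, W) through pinhole intrinsics K (3x3)
    and write the points as ASCII PLY for viewing in MeshLab/CloudCompare."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float32)
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    pts = pts[pts[:, 2] > 0]  # keep valid (positive-depth) points only
    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n"
                f"element vertex {len(pts)}\n"
                "property float x\nproperty float y\nproperty float z\n"
                "end_header\n")
        for p in pts:
            f.write(f"{p[0]:.4f} {p[1]:.4f} {p[2]:.4f}\n")
```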

@pengcanon
Author

Thank you for your quick reply. Where can I find the stage 1 validation results? From the paper, it looks like the depth module and the splat generation module are trained at the same time, but based on the instructions, it seems that stage 1 and stage 2 training are conducted separately. Could you clarify?

@ShunyuanZheng
Collaborator

The validation results are saved during training. As described in Sec. 5.1 of our paper, we pre-train the depth estimation module for 40k iterations, which corresponds to stage 1 of the training code. Then we jointly train the two modules (depth estimation and Gaussian parameter regression) in stage 2.
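Roughly, the schedule looks like this (toy stand-ins for the modules and losses, not our actual training code):

```python
import torch
import torch.nn as nn

# Toy stand-ins for the two modules; the real architectures are much larger.
depth_net = nn.Conv2d(6, 1, 3, padding=1)    # stereo pair (2 x RGB) -> depth
gauss_head = nn.Conv2d(7, 11, 3, padding=1)  # RGB + depth + RGB -> Gaussian params

pair = torch.randn(1, 6, 64, 64)             # placeholder stereo input

# Stage 1: pre-train the depth estimation module alone (40k iterations in the paper).
opt = torch.optim.Adam(depth_net.parameters(), lr=2e-4)
for _ in range(3):                           # 40_000 in the real schedule
    loss = depth_net(pair).abs().mean()      # placeholder for the depth loss
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: jointly train depth estimation and Gaussian parameter regression.
opt = torch.optim.Adam([*depth_net.parameters(), *gauss_head.parameters()], lr=1e-4)
for _ in range(3):
    depth = depth_net(pair)
    feats = torch.cat([pair[:, :3], depth, pair[:, 3:]], dim=1)
    loss = gauss_head(feats).abs().mean()    # placeholder for the rendering loss
    opt.zero_grad(); loss.backward(); opt.step()
```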
