
artifacts #64

Open
bbakpil opened this issue Sep 19, 2024 · 3 comments

Comments

@bbakpil

bbakpil commented Sep 19, 2024

Hello, thanks for your awesome work. I have one question.

Currently, I am using your codebase with single-view input, and I am not jointly training with depth; instead, I compute the depth separately and feed it in (while keeping the depth encoder unchanged).

Looking at Figure 4 of your paper, it seems that without the depth encoder the Gaussians become spike-shaped (perhaps indicating that two of the three scale axes are close to zero?), and I'm experiencing a similar issue. I'm still using the depth encoder as is, but do you think this would require more accurate depth estimation?

If the question is unclear, I can rephrase it.
Thank you in advance!

@ShunyuanZheng
Collaborator

Thank you for your interest!

For a single-view input scenario, I once trained with ground-truth depth, or with depth from sensors, as input (used directly as the position map) and used novel-view images for supervision. The viewpoint change in my case was around 20°, and the results under this setup were reasonable. However, when substituting the ground-truth depth maps with estimated ones, the task becomes much more challenging, since both viewpoint extrapolation and single-view depth estimation are underdetermined.

You could use isotropic Gaussians to check whether the problem is alleviated: set the rotation of each primitive to the identity quaternion [1, 0, 0, 0], predict a 1-dimensional scaling, and repeat it across the three dimensions. This trick is used in a monocular Gaussian avatar work.
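In case it helps, here is a minimal PyTorch sketch of that isotropic-Gaussian trick (the module and layer names are hypothetical, not from this codebase): the rotation is fixed to the identity quaternion and a single predicted scale is broadcast to all three axes, so no primitive can degenerate into a spike.

```python
import torch
import torch.nn as nn

class IsotropicGaussianHead(nn.Module):
    """Hypothetical prediction head illustrating the isotropic-Gaussian trick:
    rotation is fixed to the identity quaternion and one predicted scale is
    broadcast to all three axes, so spike-shaped primitives cannot form."""

    def __init__(self, feat_dim: int):
        super().__init__()
        self.scale_head = nn.Linear(feat_dim, 1)  # predict a single scale per Gaussian

    def forward(self, feats: torch.Tensor):
        # feats: (N, feat_dim) per-Gaussian features
        n = feats.shape[0]
        # identity quaternion [1, 0, 0, 0] for every primitive
        rotation = torch.zeros(n, 4, device=feats.device, dtype=feats.dtype)
        rotation[:, 0] = 1.0
        # positive 1-D scale, repeated to 3 dimensions -> isotropic Gaussian
        scale_1d = torch.exp(self.scale_head(feats))  # (N, 1)
        scaling = scale_1d.repeat(1, 3)               # (N, 3)
        return rotation, scaling

# usage example
head = IsotropicGaussianHead(feat_dim=32)
rot, scale = head(torch.randn(1024, 32))  # rot: (1024, 4), scale: (1024, 3)
```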

@bbakpil
Author

bbakpil commented Oct 7, 2024

Hello, thanks for your kind reply!

I have an additional question about the viewpoint change (novel-view) setting.
Currently, I'm feeding a single image to the pipeline and training the model to reconstruct the source view. To control the viewpoint, how should I set up the pipeline? Should I just input the extrinsic parameters of the desired novel view during testing?

Thank you in advance!

@ShunyuanZheng
Collaborator

Yes, you should prepare the extrinsic parameters of the desired novel view and pass them in at test time.
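For example, something along these lines at test time (a rough sketch; the commented render call and all variable names are placeholders, not the actual API of this repository):

```python
import numpy as np

# Build the world-to-camera extrinsic of the desired novel view; at test time it
# replaces the source-view extrinsic passed to the renderer. R / t below are
# placeholder values, and the commented render() call is hypothetical.
R_novel = np.eye(3)                   # 3x3 rotation of the target viewpoint
t_novel = np.array([0.0, 0.0, 2.5])   # translation of the target viewpoint

extrinsic_novel = np.eye(4)
extrinsic_novel[:3, :3] = R_novel
extrinsic_novel[:3, 3] = t_novel

# novel_image = render(gaussians, intrinsic, extrinsic_novel)
print(extrinsic_novel)
```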
