Question about extend_scales. #11
If the projection works better without aligning with the rest pose I provided, I would say it is alright. One thing to pay attention to is that the 3D keypoints should have joint locations within [-1.5, 1.5]. This prevents the wrap-around issue of positional encoding.
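For intuition, here is a minimal sketch of why that range matters, assuming a standard NeRF-style positional encoding (this is illustrative, not the repository's code; the 24-joint toy skeleton is a stand-in):

```python
import numpy as np

def positional_encoding(x, num_freqs=10):
    """NeRF-style encoding: sin/cos of the input at octave frequencies."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi   # pi, 2*pi, 4*pi, ...
    angles = x[..., None] * freqs                   # broadcast over bands
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

# Every band sin/cos(2^k * pi * x) has a period that divides 2, so two
# coordinates that differ by exactly 2 produce identical encodings:
a = positional_encoding(np.array([-0.5]))
b = positional_encoding(np.array([1.5]))  # -0.5 + 2.0
print(np.allclose(a, b))  # True: the network cannot tell them apart

# Hence the sanity check on the joints (toy skeleton as a stand-in):
kp3d = np.random.default_rng(0).uniform(-1.2, 1.2, size=(24, 3))
assert np.abs(kp3d).max() <= 1.5, "rescale the skeleton before training"
```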
Thanks for your answer, that helps a lot. But I'm still a little confused: what is the purpose of aligning to the provided rest_pose? Will this cause the SMPL model and the image to become misaligned?
Aligning the scale to the provided rest_pose ensures that all subjects we train on have roughly the same range, i.e., the 3D keypoints lie within a range that is known to work. This saved us some trouble while developing the approach. We didn't really encounter the alignment problem you mentioned, though, because we scale the camera correspondingly, so it usually gives us essentially the same projection before and after alignment.
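A minimal sketch of this invariance under a standard pinhole model (the `project` helper, toy `K`, `c2w`, and factor `s` are illustrative assumptions, not the repository's API):

```python
import numpy as np

def project(kp3d_world, c2w, K):
    """Project world-space 3D keypoints to 2D pixels (pinhole model)."""
    w2c = np.linalg.inv(c2w)
    kp_h = np.concatenate([kp3d_world, np.ones((len(kp3d_world), 1))], axis=1)
    kp_cam = (w2c @ kp_h.T).T[:, :3]
    uv = (K @ kp_cam.T).T
    return uv[:, :2] / uv[:, 2:3]  # perspective divide

rng = np.random.default_rng(0)
kp3d = rng.normal(scale=0.5, size=(24, 3))   # toy skeleton near the origin
K = np.array([[500.0, 0, 128], [0, 500.0, 128], [0, 0, 1.0]])
c2w = np.eye(4)
c2w[:3, 3] = [0.2, -0.1, -5.0]               # camera placed behind the subject

s = 0.7                  # stand-in for an extend_scale-like factor
c2w_scaled = c2w.copy()
c2w_scaled[:3, 3] *= s   # scale the camera translation by the same factor

# Scaling skeleton and camera together leaves the 2D projection unchanged.
np.testing.assert_allclose(project(kp3d, c2w, K),
                           project(kp3d * s, c2w_scaled, K))
```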
No problem at all! And yes, this is a known property of A-NeRF: the model does not rely on any pre-defined surface, so it can try to explain everything (even the background) using the skeleton pose. Please see #8 for further discussion.
Thanks a lot for your reply, that helps a lot. I read the discussion in #8, and I have one more question. The input images are masked during pre-processing, so I think no background images are used. Why do shadows related to the background occur? The rays are sampled from the masked images, so I really don't understand why there is still background information.
The rays are not always sampled within the mask. In our case, we dilate (expand) the mask a little, so rays are sampled in both the foreground and the background. This enables A-NeRF to somewhat learn to predict 0 density in the background.
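A rough sketch of this idea, assuming OpenCV for the dilation (the toy mask, kernel size, and sample count are made-up values, not the repository's settings):

```python
import numpy as np
import cv2

# Toy binary foreground mask (H, W): 1 = subject, 0 = background.
mask = np.zeros((256, 256), dtype=np.uint8)
cv2.circle(mask, (128, 128), 60, 1, -1)

# Dilate the mask so the sampling region extends past the subject boundary.
kernel = np.ones((21, 21), dtype=np.uint8)
sampling_region = cv2.dilate(mask, kernel)

# Sample ray pixels from the dilated region: mostly foreground, plus a band
# of true background pixels where the model is pushed toward 0 density.
ys, xs = np.nonzero(sampling_region)
idx = np.random.default_rng(0).choice(len(ys), size=1024, replace=False)
is_background = mask[ys[idx], xs[idx]] == 0
print(f"{is_background.mean():.0%} of sampled rays fall on background")
```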
Hello, when I made my own video dataset, I found that there is a parameter 'extend_scale' used to scale the estimated SMPL model and align it to the 'rest_pose' you provided. When I draw the 2D keypoints (computed from the aligned kp3d and c2w, as in the sketch below) on the picture, there are some deviations, and I am a little confused about this.
Can I directly use the SPIN-estimated result?
Is there anything to pay attention to when selecting this parameter?
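A minimal sketch of the overlay check described above, assuming the same pinhole convention as the earlier snippet (all file names are placeholders for your own data):

```python
import numpy as np
import cv2

def project(kp3d, c2w, K):
    """Project world-space keypoints to pixels (same convention as above)."""
    kp_h = np.concatenate([kp3d, np.ones((len(kp3d), 1))], axis=1)
    kp_cam = (np.linalg.inv(c2w) @ kp_h.T).T[:, :3]
    uv = (K @ kp_cam.T).T
    return uv[:, :2] / uv[:, 2:3]

# Placeholders: load your own aligned keypoints, camera, and frame here.
kp3d = np.load("kp3d_aligned.npy")   # (N_joints, 3), after extend_scale
c2w = np.load("c2w.npy")             # (4, 4) camera-to-world matrix
K = np.load("K.npy")                 # (3, 3) intrinsics
img = cv2.imread("frame_000.png")

for u, v in np.round(project(kp3d, c2w, K)).astype(int):
    cv2.circle(img, (int(u), int(v)), 3, (0, 255, 0), -1)
cv2.imwrite("overlay.png", img)      # deviations show up as offset dots
```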