camera_utils: add "vertical" orientation and "focus" center methods #1543
Conversation
Force-pushed from aecca66 to 54b04e2
Thanks for adding these other options. The "centering" has been ignored despite it being a very important factor in final reconstruction quality.
There are two centering methods:
    poses: The poses are centered around the origin.
    focus: The origin is set to the focus of attention of all cameras (the
What are your thoughts on having an "auto" option that first tries "focus" but switches to "poses" if some percentage of the cameras aren't "active", or some other heuristic, like the focus_pt being far from the average ray?
Easy to do, so let's make this part of this PR. Should that be a fixed fraction, or an additional parameter?
If it's a fixed fraction (say 0.5), then we could add the "auto" mode which handles that (and reuse the same code).
If it's an additional parameter, then we don't even need a separate mode: "focus" would simply fall back to "poses" if there aren't enough active cameras.
Do you have any intuition on how robust a fixed fraction would be?
Adding another parameter would be fine with me. It might also make sense to make this the default option if we're confident that it will improve results most of the time.
So I gave it a second thought, and I think that a real "auto" centering method should not work from the camera poses, but should use the sparse 3D feature points from SfM. I'm thinking of the most trivial failure case, which is a camera array configuration à la LLFF: there's no way to guess the scene center from the poses, and only the sparse reconstruction can give you that notion. I'll take a look at how COLMAP does centering. I already know that COLMAP's up direction is very similar to the one already implemented in nerfstudio, and inferior to the one proposed here.
COLMAP's model_aligner with alignment_type=plane centers the coordinate system on the centroid of the reconstruction.
I see, that makes sense. I guess a solution (not for this PR) is to store a centroid in the transforms.json (for the nerfstudio_dataparser) that can override this logic.
Force-pushed from 54b04e2 to 026e7b0
Updated to handle degeneracy in the "vertical" orientation method.
Note: given that the degeneracy is now detected and fixed, "vertical" could be the default orientation method.
Working on splitting auto_orient_and_center_poses to pass pylint (done).
Force-pushed from b915c50 to b6cb614
LGTM, thanks for adding these additional parameters.
Force-pushed from 9c0528c to f1a0d16
I had to resolve conflicts (were tests removed? I couldn't find them).
Yes, they were included in a previous PR that was reverted since it was buggy.
- See nerfstudio-project/nerfstudio#1543
- Also default to "vertical" which works better than "up" (nerfstudio-project/nerfstudio#1765)
- Also rename train_split_percentage to train_split_fraction (nerfstudio-project/nerfstudio#1497)
camera_utils: add "vertical" orientation and "focus" center methods (nerfstudio-project#1543)
* camera_utils: add "vertical" orientation and "focus" center methods
* camera_utils: -Z is the look-at direction
* Update nerfstudio_dataparser.py
* missing import
* fix pylint issues
* camera_utils: handle degenerate case for "vertical" orientation method
* camera_utils: split focus_of_attention computation
Co-authored-by: Frédéric Devernay <f-dy@users.noreply.github.com>
This PR adds a better vertical direction estimation (named "vertical"), which should work in most cases, even when cameras are in a parallel (e.g. LLFF) configuration (except that in this configuration, cameras should be looking horizontally).
It also adds a different centering method, more appropriate for inward-looking configurations, called "focus". The previous unnamed centering method is now called "poses".
Additionally, we fixed the inconsistent naming of auto_scale_poses, which was named scale_poses in a couple of DataParserConfigs.
Note: the authors of this PR are @f-dy (previously known as @devernay, but I had to create another account for work-related contributions) and @gilureta.
"vertical" orientation method
The existing orientation method, called "up", works by averaging the up vectors (Y) of all cameras. It has issues, e.g. when cameras are mostly looking down and unevenly distributed. However, it's still very useful when camera poses are random, for example when scanning a full room rather than a single object.
COLMAP has a slightly different approach, but it also works from the cameras' Y directions in 3D, so we expect the same failure cases as the current "up" solution.
This "vertical" orientation method works by finding the 3D direction that is most orthogonal to the X direction of all cameras. This means that this 3D direction projects close to the Y axis of all cameras. This also means that the vanishing point corresponding to the vertical direction should project close to the Y axis. This gives us a vector that may point down instead of up, so we use the previous "up" computation to disambiguate that situation.
This works in all cases where photos are correctly oriented (also a requirement for the "up" method), but it also requires some rotation across the camera poses; otherwise the vertical may be any direction in the sagittal plane (the plane spanned by the cameras' Z and Y axes), which is then shared by all cameras.
We detect that degeneracy: since the vertical is given as the right singular vector corresponding to the smallest singular value of an n×3 matrix, we also check the second smallest singular value, detect the degeneracy, and resolve the ambiguity by choosing the direction within that "vertical plane" that best aligns with the previous "up" vector. This is obtained by projecting that "up" vector onto the plane. See comments in the code.
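For illustration, here is a minimal numpy sketch of the idea. It assumes an (n, 3, 4) array of camera-to-world poses with columns [x, y, z, t] (so column 0 is the camera X axis and column 1 the camera Y axis); the function name and the degeneracy threshold eps are illustrative, not the exact code merged in this PR:

```python
import numpy as np

def estimate_vertical(poses: np.ndarray, eps: float = 1e-2) -> np.ndarray:
    """Estimate the world vertical from (n, 3, 4) camera-to-world poses."""
    x_axes = poses[:, :3, 0]                  # (n, 3) camera X directions
    up_mean = poses[:, :3, 1].mean(axis=0)    # the previous "up" estimate
    up_mean /= np.linalg.norm(up_mean)
    # The vertical is the direction most orthogonal to all camera X axes:
    # the right singular vector with the smallest singular value.
    _, s, vh = np.linalg.svd(x_axes)          # s is sorted in decreasing order
    vertical = vh[2, :]
    if s[1] < eps * s[0]:
        # Degenerate: the two smallest singular values are both ~0, so any
        # direction in the plane spanned by vh[1] and vh[2] is a candidate.
        # Resolve the ambiguity by projecting the mean up onto that plane.
        plane = vh[1:3, :]                    # (2, 3) orthonormal basis
        vertical = plane.T @ (plane @ up_mean)
        vertical /= np.linalg.norm(vertical)
    # The SVD leaves the sign ambiguous: make the vertical point "up".
    if vertical @ up_mean < 0:
        vertical = -vertical
    return vertical
```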
"focus" centering method
The "focus" centering method works better than "poses" when cameras are turning around something. It works by finding the 3D intersection of the cameras optical axes. Of course 3D lines have no chance to intersect, so it finds the nearest point to these axes, as described in this Wikipedia page. The solution is$\Theta(n)$ .
This method is used in Multinerf (source) and the original NeRF (source).
instant-ngp (source) has a slightly different solution (non-optimal, but maybe more robust): it averages the pairwise intersections of all optical axes (there are $n(n-1)$ in total, making it a $\Theta(n^2)$ solution).
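A sketch of that pairwise variant, taking the midpoint of the mutual perpendicular between each pair of axes (instant-ngp's actual implementation differs in its details, e.g. weighting; this is only the basic idea):

```python
import numpy as np

def pairwise_focus(origins: np.ndarray, directions: np.ndarray) -> np.ndarray:
    """Average of pairwise 'intersections' of all optical axes (O(n^2))."""
    pts = []
    n = len(origins)
    for i in range(n):
        for j in range(i + 1, n):
            o1, d1 = origins[i], directions[i]   # d1, d2 assumed unit-length
            o2, d2 = origins[j], directions[j]
            b = o2 - o1
            c = d1 @ d2
            denom = 1.0 - c * c
            if denom < 1e-8:                     # near-parallel axes: skip pair
                continue
            # Closest points o1 + t1*d1 and o2 + t2*d2 on the two lines.
            t1 = (d1 @ b - c * (d2 @ b)) / denom
            t2 = (c * (d1 @ b) - (d2 @ b)) / denom
            pts.append(0.5 * (o1 + t1 * d1 + o2 + t2 * d2))
    return np.mean(np.stack(pts), axis=0)
```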
For better robustness, we added an additional check: we only consider the cameras that "see" the focus of attention. We simply check that the focus of attention is not behind the camera. We loop until the set of "active" cameras doesn't change. That way, if most cameras are inward-looking but a few are outward-looking, the latter will be excluded from the computation. Because we start from the "poses" solution (which is the centroid of the camera origins), it usually does only one iteration, so the solution is $\Theta(n)$ on average, but I'm pretty sure someone can design a camera configuration that leads to the worst-case $\Theta(n^2)$ complexity (one camera de-activated at each iteration).
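A sketch of that iteration, reusing the focus_of_attention helper from the sketch above (the exact stopping test in the PR may differ):

```python
import numpy as np

def focus_with_active_cameras(origins: np.ndarray, directions: np.ndarray,
                              max_iters: int = 100) -> np.ndarray:
    """Iterate the least-squares focus, excluding cameras that don't see it."""
    focus = origins.mean(axis=0)          # start from the "poses" center
    active = None
    for _ in range(max_iters):
        # A camera "sees" the focus if it lies in front of the camera, i.e.
        # (focus - origin) has a positive component along the look-at axis.
        new_active = ((focus - origins) * directions).sum(axis=1) > 0
        if not new_active.any():
            return focus                  # no inward-looking camera: keep centroid
        if active is not None and np.array_equal(new_active, active):
            break                         # active set is stable: done
        active = new_active
        focus = focus_of_attention(origins[active], directions[active])
    return focus
```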
A truly generic method would work from the sparse SfM reconstruction or the provided depth maps, simply computing their centroid, or a robust centroid (start from the centroid, then do IRLS with Huber weights), as sketched below.
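For instance, a robust centroid via IRLS with Huber weights could look like this sketch (the threshold delta is illustrative and would have to be scaled to the scene size):

```python
import numpy as np

def robust_centroid(points: np.ndarray, delta: float = 1.0,
                    iters: int = 20) -> np.ndarray:
    """Robust centroid of (n, 3) points: IRLS with Huber weights."""
    center = points.mean(axis=0)          # initialize at the plain centroid
    for _ in range(iters):
        r = np.linalg.norm(points - center, axis=1)
        # Huber weights: 1 inside the delta band, delta / r beyond it,
        # so distant outliers contribute less to the weighted mean.
        w = np.where(r <= delta, 1.0, delta / np.maximum(r, 1e-12))
        center = (w[:, None] * points).sum(axis=0) / w.sum()
    return center
```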
COLMAP's model_aligner with alignment_type=plane centers the coordinate system on the centroid of the reconstruction.
Results with previous method
Center is above the real object center, up direction is off.
Results with proposed method
Center is at the object center, up direction is OK (only the last image was produced with the fully trained model)
Here's the camera path, which explains the tilted frame in the previous version: