Accelerate Instant-NGP inference #197

Linyou · 2023-04-07T06:23:32Z

This PR enhances Nerfacc's Instant-NGP inference performance by implementing the following API changes:

The traverse_grids function has been modified to support both train and test modes.
A new function called mark_invisible_cells has been added to the occ_grid module in order to prevent rendering artifacts in unseen spaces.

liruilong940607 · 2023-04-07T07:13:00Z

nerfacc/cuda/csrc/grid.cu

+        num_steps += tid;
+        continuous_resume += tid;
+        t_starts += tid * N_samples;
+        t_ends += tid * N_samples;
+        valid_mask += tid * N_samples;
+        ray_indices += tid * N_samples;


We can't use += on these pointers anymore because of the for loop:

for (int32_t tid = blockIdx.x * blockDim.x + threadIdx.x; tid < n_rays; tid += blockDim.x * gridDim.x)

It would be wrong if += is executed more than once within the for loop, because each pass adds a new offset on top of the previous one.

Use something like valid_mask[tid * N_samples] to read/write the data instead.
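
To illustrate the point, here is a small Python simulation of the grid-stride loop. It is only a sketch, not the actual kernel: the thread count, ray count, and sample count are made-up values, and the helper names are hypothetical. It tracks the element offset a single thread would apply under each scheme.

```python
def offsets_with_increment(first_tid, stride, n_rays, n_samples):
    """Offsets seen by one thread if `ptr += tid * n_samples` runs inside the loop."""
    offset = 0  # models the pointer, which persists across loop iterations
    seen = []
    for tid in range(first_tid, n_rays, stride):
        offset += tid * n_samples  # accumulates on top of the previous iteration
        seen.append(offset)
    return seen


def offsets_with_indexing(first_tid, stride, n_rays, n_samples):
    """Offsets seen by one thread if it indexes `base[tid * n_samples]` instead."""
    return [tid * n_samples for tid in range(first_tid, n_rays, stride)]


# Hypothetical launch: 4 threads in total, 10 rays, 8 samples per ray.
bad = offsets_with_increment(first_tid=1, stride=4, n_rays=10, n_samples=8)
good = offsets_with_indexing(first_tid=1, stride=4, n_rays=10, n_samples=8)
print(bad)   # [8, 48, 120]
print(good)  # [8, 40, 72]
```

The first iteration agrees in both versions, which is why the bug only shows up once a thread handles more than one ray.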

liruilong940607 · 2023-04-10T19:40:51Z

Thanks for implementing this! It is pretty nice to have that support

liruilong940607 · 2023-04-10T20:05:29Z

At a high level, the two ways of ray marching are pretty similar to each other: the "train" way of marching is to take N rays and march all the steps for each ray. The "test" way of marching is to take all rays and march N steps for each ray, and to repeat that iteratively.

I feel it should not be that hard to unify the API (as well as the implementation) for the two.

To be more concrete, the implementation differences between the two are:

The "test" way needs to take an argument "max_per_ray_samples", which the "train" way could simply set to "inf".
The "test" way would want to pre-compute "{t_sorted, t_indices, hits}" so that they are not computed multiple times. As you have already done, we should do this in Python instead of C, so that we can make these three optional arguments in the API, which allows passing in precomputed values.
The "test" way needs to pass in a mask with shape (n_rays,) to skip rays (e.g., the alive_indices you were using). The "train" way can use an all-ones mask.

So maybe we can unify them into the same "traverse_grid" function, with extra arguments (max_per_ray_samples=inf, masks=None, t_sorted=None, t_indices=None, hits=None). And for "traverse_grid_test", you can just call that function with an updated "near_planes" at every iteration of marching.

In this case, I think it makes sense to let the CUDA kernel return an extra tensor of shape (n_rays,) that indicates the termination distance during grid traversal, which is essentially the "near_planes" for the next iteration of "traverse_grid_test". (The near_planes you are returning has a confusing name; I think it should be termination_planes or something like that.)
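
A toy version of this idea in plain Python may help make it concrete. This is only a sketch under simplifying assumptions: `traverse_toy` is a hypothetical stand-in that marches uniform steps along 1D rays, with no occupancy grid and no precomputed `t_sorted`, `t_indices`, or `hits`. It shows how a single function with `max_per_ray_samples` and a mask can serve both ways of marching, with the returned termination distances fed back in as the next near planes.

```python
import math


def traverse_toy(near, far, step, max_per_ray_samples=math.inf, mask=None):
    """Toy stand-in for a unified traversal: uniform steps along each ray.

    Returns (intervals, termination), where intervals[i] is a list of
    (t_start, t_end) pairs for ray i and termination[i] is the distance
    at which marching stopped for that ray.
    """
    n_rays = len(near)
    if mask is None:
        mask = [True] * n_rays  # "train" way: every ray is active
    intervals = [[] for _ in range(n_rays)]
    termination = list(near)  # skipped rays keep termination == near
    for i in range(n_rays):
        if not mask[i]:
            continue
        t = near[i]
        while t < far[i] and len(intervals[i]) < max_per_ray_samples:
            t_end = min(t + step, far[i])
            intervals[i].append((t, t_end))
            t = t_end
        termination[i] = t
    return intervals, termination


near, far = [0.0, 0.0], [2.0, 1.0]

# "train" way: one call marches every ray to the end.
full, _ = traverse_toy(near, far, step=0.5)

# "test" way: at most 2 samples per ray per iteration; the termination
# distances become the near planes of the next iteration.
chunks = [[] for _ in near]
cur_near = list(near)
alive = [True] * len(near)
while any(alive):
    part, cur_near = traverse_toy(
        cur_near, far, step=0.5, max_per_ray_samples=2, mask=alive
    )
    for i, segs in enumerate(part):
        chunks[i].extend(segs)
    alive = [t < f for t, f in zip(cur_near, far)]

print(chunks == full)  # True
```

In a real renderer the alive mask would presumably also drop rays that have become opaque, which is where the speedup at inference time would come from; the toy only stops rays at the far plane.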

Linyou · 2023-04-11T12:01:21Z

Sounds good! I think we could unify the API using extra arguments "(max_per_ray_samples=inf, masks=None, t_sorted=None, t_indices=None, hits=None)". Nice idea, BTW!

We also need to unify the return values. I suggest using the data structure defined in "data_spect.h" to store t_start and t_end, instead of creating torch::Tensor directly as I am currently doing. It may be helpful to add new methods for allocating memory in "data_spect.h" specifically for t_start and t_end, since they are pre-allocated in the "test" way. What do you think?

As for near_planes, I already tweaked the code so we don't need to return it; we can just update it inside the "traverse_grid" kernel.

liruilong940607 · 2023-04-11T18:30:51Z

> We also need to unify the return values. I suggest using the data structure defined in "data_spect.h" to store t_start and t_end, instead of creating torch::Tensor directly as I am currently doing. It may be helpful to add new methods for allocating memory in "data_spect.h" specifically for t_start and t_end, since they are pre-allocated in the "test" way. What do you think?

I think you can use the RaySegmentsSpec just like what is being used in the traverse_grid function. And you can get t_starts and t_ends by:

https://github.com/KAIR-BAIR/nerfacc/blob/8340e19daad4bafe24125150a8c56161838086fa/tests/test_grid.py#L60-L61

> As for near_planes, I already tweaked the code so we don't need to return it, we can just update it inside the "traverse_grid" kernel.

Do you mean that you change its value in place? I would suggest against in-place modification, as it is not very user-friendly.

Linyou · 2023-04-13T10:42:27Z

I have unified the "traverse_grid" API, and now both "train" and "test" can use the same Python function. On the low level, we still need to call separate C functions to launch the CUDA kernel.

Note that the "traverse_grid" function now returns three objects (intervals, samples, termination_planes), and "termination_planes" will be just None when "ray_mask_id" is not provided.
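
As a rough illustration of that return convention, a caller might unpack the result as in the sketch below. `traverse_grid_stub` is a hypothetical stand-in that only mimics the described return shape; it is not the nerfacc function. Treating `ray_mask_id` as a list of active ray indices is also an assumption.

```python
from typing import List, Optional, Tuple


def traverse_grid_stub(
    near_planes: List[float],
    ray_mask_id: Optional[List[int]] = None,
) -> Tuple[list, list, Optional[List[float]]]:
    """Hypothetical stand-in that only mimics the described return shape."""
    intervals, samples = [], []  # placeholders for the real outputs
    if ray_mask_id is None:
        # "train" way: nothing to resume from, so no termination planes.
        return intervals, samples, None
    # "test" way: one termination distance per active ray (dummy values here).
    termination_planes = [near_planes[i] for i in ray_mask_id]
    return intervals, samples, termination_planes


_, _, term_train = traverse_grid_stub([0.1, 0.2, 0.3])
_, _, term_test = traverse_grid_stub([0.1, 0.2, 0.3], ray_mask_id=[0, 2])
print(term_train)  # None
print(term_test)   # [0.1, 0.3]
```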

liruilong940607 · 2023-04-29T02:42:00Z

@Linyou The latest commit should resolve the memory concerns we had before. The test is also updated to match the actual use case. Let me know what you think.

Linyou · 2023-04-30T18:21:36Z

Thanks! I believe that the current API design is now highly usable for test mode rendering, thanks to the latest commit.

BTW, after this PR is merged, I will create a new one for ngp test mode rendering in the examples.

liruilong940607 · 2023-05-02T07:04:16Z

@Linyou I also did some cleanups for mark_invisible_cells() and changed the API a tiny bit (the K). Now I'm happy to merge it if you think it's all good.

nerfacc/estimators/occ_grid.py

liruilong940607 · 2023-05-03T18:04:26Z

Comments addressed. Ready to Go? @Linyou

Linyou · 2023-05-03T18:28:36Z

@liruilong940607 Yeah! All good!

liruilong940607 · 2023-05-03T18:41:45Z

Thanks for the patience!! Shipped!

liruilong940607 reviewed Apr 7, 2023

Linyou force-pushed the master_dev branch from 658df40 to b6f14c7 on April 10, 2023 02:36

Linyou force-pushed the master_dev branch from 2f05297 to b7f0559 on April 13, 2023 10:40

Linyou changed the title from "Adding Instant-NGP inference rendering" to "Accelerate Instant-NGP Rendering & Add GUI in Nerfacc" on Apr 13, 2023

Linyou changed the title from "Accelerate Instant-NGP Rendering & Add GUI in Nerfacc" to "Accelerate Instant-NGP Rendering & Add GUI" on Apr 13, 2023

Linyou changed the title from "Accelerate Instant-NGP Rendering & Add GUI" to "Accelerate Instant-NGP inference" on Apr 19, 2023

Linyou force-pushed the master_dev branch from b465eee to fe01a14 on April 19, 2023 16:33

add mark_invisible_cells in occ_grid

064379f

add test mode for traverse_grids

Linyou force-pushed the master_dev branch from fe01a14 to 064379f on April 19, 2023 16:51

Linyou added 3 commits April 20, 2023 20:52

add data type to mark_invisible_cells

d07cbf8

add test for mark_invisible_cells & test mode traverse_grids

ba900e5

upd comments

f62b72d

Linyou force-pushed the master_dev branch from a08ba94 to f62b72d on April 21, 2023 16:04

Linyou requested a review from liruilong940607 April 21, 2023 16:30

liruilong940607 added 2 commits April 25, 2023 04:48

ndr trial

cac18ed

Merge branch 'master' into linyou

286cad5

liruilong940607 reviewed Apr 25, 2023

liruilong940607 added 8 commits April 25, 2023 08:36

merge traverse_grids with traverse_grids_test in C

705c300

fix format

6233fc4

Revert "fix format"

6667f2d

This reverts commit 6233fc4.

revert benchmarks changes

73ce8c8

remove MLP updates

c37d199

Revert "remove MLP updates"

b9aa998

This reverts commit c37d199.

revert benchmarks changes

4581d95

add assert in traverse_grids

c93eaad

liruilong940607 added 2 commits April 25, 2023 09:17

Revert "add assert in traverse_grids"

56cb51f

This reverts commit c93eaad.

revert benchmarks changes

71017d3

Linyou commented Apr 25, 2023

reduce mem for traverse grid with over_allocate=True

76b3016

liruilong940607 added 4 commits May 1, 2023 19:50

cleanup doc

4206b96

final cleanup with mark_invisible_cells

8734d3b

ndr trial

39af54b

merge from master

eb7c877

Linyou commented May 3, 2023

fix occ grid invisible cell filtering

86580ba

liruilong940607 merged commit 1031504 into nerfstudio-project:master May 3, 2023

Yosshi999 mentioned this pull request Sep 21, 2023

Inconsistent packed_info returned by traverse_grids(over_allocate=True) #256

Closed
