Significant memory reduction through TCNN's new memory arena #128

Tom94 · 2022-02-10T18:01:14Z

The fox scene now requires only 2.3 GB on an RTX 3090 w/ FullyFusedMLP.

Even when using FP32 + CutlassMLP (the worst case for memory usage), both fox and lego/transforms_train.json fit into 8 GB.

Also fixes temporarily broken input gradients.

GPU Memory Arena

tiny-cuda-nn now has a memory arena that can be used for efficient per-stream allocation of temporary memory. As a consequence, much of the memory that previously had to be pre-allocated can now be allocated and released on the fly, which effectively reuses it across sequential computations. This reduces memory usage by over 50% in many cases. See also #36.
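To illustrate why releasing temporaries between sequential computations lowers peak usage, here is a minimal sketch of a stack-style arena in Python. It is a toy model and not tiny-cuda-nn's actual implementation or API: `StackArena`, `peak_preallocated`, `peak_with_arena`, and the step sizes are all hypothetical, and only byte offsets are tracked (no GPU memory is touched). A per-stream variant would presumably keep one such arena per CUDA stream.

```python
class StackArena:
    """Toy bump allocator (hypothetical): tracks offsets only, no real memory."""

    def __init__(self):
        self.top = 0   # bytes currently in use
        self.peak = 0  # high-water mark over the arena's lifetime

    def allocate(self, n_bytes):
        """Reserve n_bytes at the current top and return the starting offset."""
        offset = self.top
        self.top += n_bytes
        self.peak = max(self.peak, self.top)
        return offset

    def release_to(self, offset):
        """Free everything allocated at or above the given offset."""
        self.top = offset


def peak_preallocated(step_sizes):
    """Peak usage when every temporary of every step has its own permanent buffer."""
    return sum(sum(sizes) for sizes in step_sizes)


def peak_with_arena(step_sizes):
    """Peak usage when each step's temporaries are released before the next step."""
    arena = StackArena()
    for sizes in step_sizes:
        mark = arena.top          # remember where this step's temporaries begin
        for n_bytes in sizes:
            arena.allocate(n_bytes)
        arena.release_to(mark)    # later steps can reuse the same range
    return arena.peak


# Hypothetical temporary-buffer sizes (in MiB) for three sequential steps.
steps = [[256, 128], [512], [64, 64]]
print(peak_preallocated(steps))  # sum over all steps
print(peak_with_arena(steps))    # size of the largest single step
```

With pre-allocation the peak is the sum of all temporaries, whereas with the arena it is bounded by the most demanding single step, which is where the savings in this toy model come from.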


Significant memory reduction through TCNN's new arena

299afdf

Tom94 changed the title ~~Significant memory reduction through TCNN's new arena~~ Significant memory reduction through TCNN's new memory arena Feb 10, 2022

Tom94 merged commit 3c5e523 into master Feb 10, 2022

Tom94 deleted the memory-arena branch February 22, 2022 19:07

fnysalehi pushed a commit to fnysalehi/instant-ngp-rendering that referenced this pull request May 14, 2024

Merge pull request NVlabs#128 from NVlabs/memory-arena

6d23723

Significant memory reduction through TCNN's new memory arena
