Significant memory reduction through TCNN's new memory arena #128

Merged (1 commit) on Feb 10, 2022

@Tom94 (Collaborator) commented on Feb 10, 2022

The fox scene now requires only 2.3 GB of memory on an RTX 3090 with FullyFusedMLP.

Even with FP32 + CutlassMLP (the worst case for memory usage), both fox and lego/transforms_train.json fit into 8 GB.

Also fixes temporarily broken input gradients.

GPU Memory Arena

tiny-cuda-nn now has a memory arena that can be used for efficient per-stream allocation of temporary memory. As a consequence, much of the memory that previously needed to be pre-allocated can now be allocated and released on the fly (effectively re-using it across sequential computations). This ends up reducing memory usage by over 50% in many cases. See also #36
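To illustrate the re-use described above, here is a minimal host-side sketch in C++. It is not tiny-cuda-nn's implementation: `ToyArena`, `peak_with_arena`, and `peak_preallocated` are hypothetical names, the arena hands out offsets into a single pool rather than real GPU pointers, it models only one stream, and it ignores alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Toy host-side model of a memory arena (hypothetical, for illustration only).
// It returns offsets into one growable pool instead of GPU pointers.
class ToyArena {
public:
    // Returns the offset of a block of `bytes` bytes. Reuses the first freed
    // interval that is large enough; otherwise grows the pool at its end.
    std::size_t allocate(std::size_t bytes) {
        for (auto it = free_.begin(); it != free_.end(); ++it) {
            if (it->second >= bytes) {
                std::size_t offset = it->first;
                std::size_t remaining = it->second - bytes;
                free_.erase(it);
                if (remaining > 0) free_[offset + bytes] = remaining;
                return offset;
            }
        }
        std::size_t offset = pool_size_;
        pool_size_ += bytes;
        return offset;
    }

    // Returns a block to the free list and merges it with adjacent free intervals.
    void release(std::size_t offset, std::size_t bytes) {
        auto it = free_.emplace(offset, bytes).first;
        auto next = std::next(it);
        if (next != free_.end() && it->first + it->second == next->first) {
            it->second += next->second;
            free_.erase(next);
        }
        if (it != free_.begin()) {
            auto prev = std::prev(it);
            if (prev->first + prev->second == it->first) {
                prev->second += it->second;
                free_.erase(it);
            }
        }
    }

    // Total bytes the pool has ever had to span (its high-water mark).
    std::size_t pool_size() const { return pool_size_; }

private:
    std::map<std::size_t, std::size_t> free_;  // offset -> length of free interval
    std::size_t pool_size_ = 0;
};

// Pool size when each of `steps` sequential computations allocates a temporary
// buffer, uses it, and releases it before the next step starts.
std::size_t peak_with_arena(std::size_t steps, std::size_t bytes_per_step) {
    ToyArena arena;
    for (std::size_t i = 0; i < steps; ++i) {
        std::size_t offset = arena.allocate(bytes_per_step);
        // ... a kernel would read and write the temporary buffer here ...
        arena.release(offset, bytes_per_step);
    }
    return arena.pool_size();
}

// Memory needed when every step's buffer is pre-allocated up front instead.
std::size_t peak_preallocated(std::size_t steps, std::size_t bytes_per_step) {
    return steps * bytes_per_step;
}
```

Under these assumptions, four sequential steps that each need a 1024-byte temporary span 1024 bytes with the toy arena, versus 4096 bytes when each buffer is pre-allocated.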

@Tom94 changed the title from "Significant memory reduction through TCNN's new arena" to "Significant memory reduction through TCNN's new memory arena" on Feb 10, 2022
@Tom94 merged commit 3c5e523 into master on Feb 10, 2022
@Tom94 deleted the memory-arena branch on Feb 22, 2022 at 19:07
fnysalehi pushed a commit to fnysalehi/instant-ngp-rendering that referenced this pull request on May 14, 2024: "Significant memory reduction through TCNN's new memory arena"