Added the target data dtype as a command-line argument: --data_dtype float16 or --data_dtype float32.
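As a rough illustration of how the two flags described in this PR could be declared, here is a minimal sketch using plain argparse. The flag names come from this description; the float32 default, the help texts, and the `build_parser` helper are assumptions for illustration and do not reproduce the repository's own argument classes.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing the two options described in this PR."""
    parser = argparse.ArgumentParser(description="Image storage options (sketch)")
    parser.add_argument(
        "--data_dtype",
        choices=["float16", "float32"],
        default="float32",  # assumption: float32 remains the default
        help="dtype used to store the original images in memory",
    )
    parser.add_argument(
        "--store_images_as_uint8",
        action="store_true",
        help="keep images as uint8 and convert to --data_dtype on demand",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--data_dtype", "float16", "--store_images_as_uint8"])
print(args.data_dtype, args.store_images_as_uint8)
```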
This dtype affects only how the original images are stored in memory, since they account for most of the memory used. All other objects related to an image are very small by comparison (fewer than 20 numbers per object, about 12 objects in total).
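To make the size difference concrete, the per-image footprint can be estimated from the element count and the bytes per element. The 1920x1080 RGB resolution below is a hypothetical example, not a figure from this PR.

```python
# Bytes used by one element of each dtype.
BYTES_PER_ELEMENT = {"uint8": 1, "float16": 2, "float32": 4}


def image_bytes(height: int, width: int, channels: int, dtype: str) -> int:
    """Memory needed to store one dense image of the given shape and dtype."""
    return height * width * channels * BYTES_PER_ELEMENT[dtype]


# Hypothetical image size, chosen only for illustration.
height, width, channels = 1080, 1920, 3
for dtype in ("float32", "float16", "uint8"):
    mib = image_bytes(height, width, channels, dtype) / 2**20
    print(f"{dtype}: {mib:.1f} MiB per image")
```

By contrast, about 12 objects of fewer than 20 float32 numbers each add up to less than 1 KiB per image, which is why only the image buffer matters here.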
The Gaussian rasterizer supports only float32, so converting these small objects to any other data type would reduce the memory they occupy but add operations, because they would have to be converted back to float32 when rasterizing. Because these objects are so small, this change ignores them.
The original image is used only to compute the loss in the optimization step. Experimentally, I found that computing the loss between two float16 images is a lot faster than computing it between two float32 images, but it may increase the number of iterations needed, since the loss is then of type float16.
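One way to see why a float16 loss might need more iterations is its coarse resolution: near 1.0, neighbouring float16 values are about 0.001 apart, so smaller changes are rounded away. The sketch below uses the standard library's half-precision struct format to show this; `to_float16` is a hypothetical helper, and the link to convergence speed is a plausible reading rather than something measured in this PR.

```python
import struct


def to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# A change of 0.0004 near 1.0 is below float16 resolution and is lost.
print(to_float16(1.0 + 0.0004))  # 1.0
# A change of 0.001 survives, snapped to the next representable value.
print(to_float16(1.0 + 0.001))   # 1.0009765625
```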
Lastly, to further reduce memory usage, I added a new command-line argument, --store_images_as_uint8. If set, all original images are kept in memory as uint8 and converted to the target data type on demand. This adds a few operations, since an image is accessed in the target data type more than once, but it saves memory because all but one image are stored as uint8. Also, when transferring images to the GPU, we transfer 1 byte per value for uint8 instead of 4 bytes for float32, so this can be a speedup when a user sets data_device=cpu.
TLDR:
--data_dtype float16 - stores the original images as float16, halving memory and reducing runtime compared to float32
--store_images_as_uint8 - keeps images as uint8 and converts them to data_dtype on demand, so memory use is minimal
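The on-demand conversion behind --store_images_as_uint8 can be sketched as follows. This is a minimal illustration using NumPy arrays in place of the tensors the training code would use; the `LazyImage` class, its method names, and the normalisation to [0, 1] by dividing by 255 are assumptions, not the code in this PR.

```python
import numpy as np


class LazyImage:
    """Hypothetical holder that stores pixels as uint8 and converts on request."""

    def __init__(self, pixels_uint8: np.ndarray, data_dtype: str = "float16"):
        if pixels_uint8.dtype != np.uint8:
            raise TypeError("expected uint8 pixel data")
        self._pixels = pixels_uint8
        self._dtype = np.dtype(data_dtype)

    @property
    def nbytes_stored(self) -> int:
        """Bytes held while the image is idle (1 byte per value)."""
        return self._pixels.nbytes

    def as_target(self) -> np.ndarray:
        """Convert to the target dtype, scaled to [0, 1] (assumed convention)."""
        scaled = self._pixels.astype(np.float32) / 255.0
        return scaled.astype(self._dtype)


# Small synthetic image: the stored copy stays uint8, the working copy is float16.
image = LazyImage(np.full((4, 4, 3), 255, dtype=np.uint8), data_dtype="float16")
working = image.as_target()
print(image.nbytes_stored, working.dtype, working.nbytes)
```

The converted copy is created each time it is requested, which is the extra work mentioned above; only the image currently in use occupies memory at the target dtype.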