option to make training more deterministic #143
#114 #140
I have been trying to make training more deterministic. Here is what I have found so far.

1. Follow the PyTorch reproducibility notes; see https://pytorch.org/docs/stable/notes/randomness.html#cuda-convolution-benchmarking for more details.
2. If an error is raised after doing 1, you may need to run the program with the environment variable `CUBLAS_WORKSPACE_CONFIG=:4096:8` set; check here for details.
3. Replace the bilinear `F.interpolate` here with the implementation below, as it is non-deterministic; check here for details.

However, the training is still non-deterministic.
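For reference, steps 1 to 3 could look roughly like the sketch below. It is not the code from this PR: `enable_determinism` and `bilinear_resize_matmul` are hypothetical names, the seed value is arbitrary, and the resize is one possible substitute that expresses bilinear interpolation as two matrix multiplications, so that the backward pass goes through matmul instead of the CUDA bilinear upsampling backward kernel that PyTorch lists as non-deterministic. It assumes a 4D `(N, C, H, W)` input and an explicit output `size`.

```python
import os
import random

import torch
import torch.nn.functional as F


def enable_determinism(seed: int = 0) -> None:
    """Hypothetical helper: apply the settings from the PyTorch randomness notes."""
    # Step 2: cuBLAS reads this when it is first used, so set it early.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    # Step 1: seed the RNGs and disable non-deterministic algorithm selection.
    random.seed(seed)
    torch.manual_seed(seed)  # seeds the CPU and all CUDA devices
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    torch.use_deterministic_algorithms(True)


def _interp_matrix(in_size, out_size, align_corners, dtype, device):
    """Build an (out_size, in_size) matrix of 1D linear interpolation weights."""
    dst = torch.arange(out_size, dtype=torch.float64, device=device)
    if align_corners:
        scale = (in_size - 1) / (out_size - 1) if out_size > 1 else 0.0
        src = dst * scale
    else:
        # Half-pixel centers, with negative source coordinates clamped to 0.
        src = ((dst + 0.5) * (in_size / out_size) - 0.5).clamp(min=0.0)
    i0 = src.floor().long().clamp(max=in_size - 1)   # left neighbour
    i1 = (i0 + 1).clamp(max=in_size - 1)             # right neighbour
    w1 = (src - i0.to(src.dtype)).clamp(0.0, 1.0)    # weight of right neighbour
    w0 = 1.0 - w1
    rows = torch.arange(out_size, device=device)
    mat = torch.zeros(out_size, in_size, dtype=torch.float64, device=device)
    mat[rows, i0] += w0
    mat[rows, i1] += w1  # when i1 == i0 (border), the row still sums to 1
    return mat.to(dtype)


def bilinear_resize_matmul(x, size, align_corners=False):
    """Step 3 (sketch): bilinear resize of an (N, C, H, W) tensor via matmul."""
    _, _, h, w = x.shape
    out_h, out_w = size
    wh = _interp_matrix(h, out_h, align_corners, x.dtype, x.device)  # (out_h, h)
    ww = _interp_matrix(w, out_w, align_corners, x.dtype, x.device)  # (out_w, w)
    # Interpolate along height, then along width.
    return wh @ x @ ww.t()
```

Calling `enable_determinism()` once at the start of the training script, before any CUDA work, and swapping `F.interpolate(x, size=size, mode="bilinear", align_corners=False)` for `bilinear_resize_matmul(x, size)` should give matching forward values up to floating-point error. The dense weight matrices trade some memory and speed for reproducibility.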
The remaining non-determinism comes from `raymarching_train` here: the `rays` output by the function are non-deterministic. I am not familiar with CUDA extensions, so I don't know how to solve it; you might want to look at it.

Here I provide a bash script, `test.sh`, to run deterministic experiments for debugging. Run `bash test.sh 0` and `bash test.sh 1` to run on GPU 0 and GPU 1, then compare the outputs in `deterministic_run_0.txt` and `deterministic_run_1.txt`.
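To compare the two output files, a small helper along these lines may be handy. It is only a sketch: `first_difference` is a hypothetical name, and it assumes the two runs write plain-text logs with one record per line, which may not match what `test.sh` actually produces.

```python
from itertools import zip_longest
from pathlib import Path


def first_difference(path_a, path_b):
    """Return (line_number, line_a, line_b) for the first differing line, or None."""
    lines_a = Path(path_a).read_text().splitlines()
    lines_b = Path(path_b).read_text().splitlines()
    # zip_longest pads the shorter file with None, so a length mismatch
    # is reported as a difference too.
    for lineno, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1):
        if a != b:
            return lineno, a, b
    return None


# Example use after both runs have finished:
#   diff = first_difference("deterministic_run_0.txt", "deterministic_run_1.txt")
#   print("runs match" if diff is None else f"first mismatch at line {diff[0]}")
```

Knowing the first line where the runs diverge helps narrow down which step introduces the non-determinism.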