Avoid appending to external data when running onnx_save
#320
When running `onnx_save(model)` and the model is >2GB, its initializers and parameters are saved to one external data file, `model.data`. However, when running `onnx_save(...)` multiple times, by default we are not strictly overwriting the old `model.data` file (the expected behavior); instead, we overwrite the previously seen parameters and append the unseen ones to the file.

This is why, when exporting a QAT model, `model.data` keeps growing very large. It eventually contains tensors from multiple `save_onnx` calls:
- the export pathway calls `save_onnx` as an intermediate step: https://github.com/neuralmagic/sparseml/blob/main/src/sparseml/pytorch/utils/exporter.py#L574
- `save_onnx` then appends the quant/sparse external tensors to `model.data`: https://github.com/neuralmagic/sparseml/blob/main/src/sparseml/pytorch/utils/exporter.py#L587

The problem of exploding `model.data` does not concern us in the context of an FP32 model, because `onnx_save` is called only once.
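For context, here is a minimal sketch of the overwrite-before-save pattern this change is about. The helper name, signature, and location handling are illustrative assumptions, not the exact sparseml implementation; it only shows that the stale external-data file must be removed before saving, since ONNX appends external tensor bytes to an existing file.

```python
import os

import onnx


def save_onnx_overwrite(
    model: onnx.ModelProto,
    model_path: str,
    external_data_name: str = "model.data",
) -> None:
    # Sketch only (assumed helper, not the sparseml `save_onnx`): ONNX writes
    # external tensors by appending to the target file, so a `model.data`
    # left over from a previous save must be deleted first to get a true
    # overwrite instead of an ever-growing file.
    external_data_path = os.path.join(os.path.dirname(model_path), external_data_name)
    if os.path.exists(external_data_path):
        os.remove(external_data_path)

    onnx.save_model(
        model,
        model_path,
        save_as_external_data=True,
        all_tensors_to_one_file=True,
        location=external_data_name,  # file name relative to model_path
    )
```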
Testing:

This has also been successfully tested with the `sparseml.transformers.export` pathway.