All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
This project adheres to Semantic Versioning, with the exception that v0.X updates may include backwards-incompatible API changes. From v1.0.0 onward, the project will adhere strictly to Semantic Versioning.
- Updated on-GPU model benchmarking with best practices on `cuda.Event` and `cuda.synchronize`.
- FLOPs measurement error on CUDA.
- Repo DOI
- Add missing memory to results.
- Memory measurement for bs=1.
- Warm up batch size.
### Removed
- `try_custom_warmup`.
- `warm_up_fn` overload option.
- Support for FLOPs count in torch.nn.Module with input other than Tensor.
- Memory measurement for each batch size.
- Repeated energy measurement.
- Number formatting to use u instead of µ.
- Option to redirect info prints.
- Added missing with torch.no_grad
- Overloads for benchmark parameters and functions to allow benchmark of custom classes.
- GPU compatibility.
- Carbon-tracker energy measurement. Library is still too immature at this point.
- Initial version.