From 413f7b2894658af74f4b238054870897b10c79aa Mon Sep 17 00:00:00 2001
From: Elad Segal
Date: Wed, 18 Aug 2021 10:56:16 +0300
Subject: [PATCH] fix batch auto scaling when `init_val` causes OOM (#8954)

* fix batch auto scaling when `init_val` causes OOM

* Update CHANGELOG.md

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
---
 CHANGELOG.md                                  | 2 ++
 pytorch_lightning/tuner/batch_size_scaling.py | 1 +
 2 files changed, 3 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 133ebb53d025a..9ff44b7d30770 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -197,6 +197,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 
 - Fixed bug where data-loading functions where not getting the correct running stage passed ([#8858](https://github.com/PyTorchLightning/pytorch-lightning/pull/8858))
 
+- Fixed a bug in the binary search mode of auto batch size scaling where exception was thrown if the first trainer run resulted in OOM ([#8954](https://github.com/PyTorchLightning/pytorch-lightning/pull/8954))
+
 
 ## [1.4.0] - 2021-07-27
 
diff --git a/pytorch_lightning/tuner/batch_size_scaling.py b/pytorch_lightning/tuner/batch_size_scaling.py
index 1eda93cd831b3..c048ce0a42dd9 100644
--- a/pytorch_lightning/tuner/batch_size_scaling.py
+++ b/pytorch_lightning/tuner/batch_size_scaling.py
@@ -174,6 +174,7 @@ def _run_binsearch_scaling(
     """Batch scaling mode where the size is initially is doubled at each iteration until an OOM error is
     encountered. Hereafter, the batch size is further refined using a binary search"""
+    low = 1
     high = None
     count = 0
     while True:
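The one-line fix (`low = 1`) matters because the tuner's exception handler reads `low` to bisect toward a working batch size; before this patch, `low` was only assigned after the first successful trial, so an OOM on the very first run (an oversized `init_val`) reached the handler before `low` existed. Below is a minimal standalone sketch of the doubling-then-binary-search loop — `binsearch_scale` and `run_trial` are hypothetical stand-ins for the tuner internals, not the library's API — showing how the pre-seeded lower bound keeps the first-trial OOM path well-defined:

```python
def binsearch_scale(run_trial, init_val, max_trials=25):
    """Double the batch size until OOM, then binary-search between the
    last working size (low) and the first failing size (high)."""
    new_size = init_val
    low = 1       # the fix: pre-seed the lower bound so an OOM on the
    high = None   # very first trial does not read `low` before assignment
    count = 0
    while True:
        try:
            run_trial(new_size)  # stands in for one fit/validate trial
            count += 1
            if count >= max_trials:
                return new_size
            low = new_size
            if high is not None:
                if high - low <= 1:
                    return low
                new_size = (low + high) // 2  # refine within the bracket
            else:
                new_size *= 2  # no OOM seen yet: keep doubling
        except MemoryError:  # stands in for the framework's OOM handling
            high = new_size
            new_size = (low + high) // 2  # pre-fix: crashed here when
            if high - low <= 1:           # `low` was never assigned
                return low


# Example: pretend any batch larger than 12 samples runs out of memory.
def run_trial(size):
    if size > 12:
        raise MemoryError


print(binsearch_scale(run_trial, init_val=16))  # OOMs immediately, still converges to 12
```

With `init_val=16` the first trial raises, but the handler can now bisect between `low = 1` and `high = 16` instead of crashing with `UnboundLocalError`, and the search still converges to the largest working size.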