Improve training performance when using augmentations / crop and minor model improvements #1050
Conversation
… result can be cached, too.
…ing so only the augmented images are getting cached - and they are uint8.
…, y_transform, y_translate:
* Combined the transform / translate functions into the transform function, because we can operate in a pipeline on the dictionaries that are returned by translate directly
* Enable switching off image caching by re-introducing CACHE_IMAGES optionally in the config
* Updated test_train, as the fixture scope was a bit messed up w.r.t. the usage of the tub data and config file. Also replaced namedtuple with dataclass, which is a bit more modern (see the sketch below).
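As an illustration of the namedtuple-to-dataclass change mentioned in the last bullet, here is a hedged sketch; the class and field names are hypothetical, not the actual fixture code:

```python
from dataclasses import dataclass

# Old style was roughly: collections.namedtuple('TrainSetup', [...])
# A dataclass gives typed fields with defaults, which is easier to
# extend and read in test fixtures.
@dataclass
class TrainSetup:
    tub_dir: str
    config_path: str
    cache_images: bool = True  # mirrors the optional CACHE_IMAGES setting
```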
This branch improved my epoch times from 5 minutes to 2 minutes. Total training time on my RTX-2060 went from 2.5 hours to 45 minutes for 21 epochs.
I have not tried this on a car yet. However, if I try to run tubplot I get an error:
Thanks @Ezward - there was indeed an error in …
…o cache_augmented_images # Conflicts: # donkeycar/management/base.py
…call to open the matplotlib window. Use this argument in the test.
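The commit above refers to an argument that controls opening the matplotlib window; a hypothetical sketch of that pattern (the function name and signature are illustrative, not the actual donkeycar code):

```python
import matplotlib.pyplot as plt

def plot_predictions(angles, show=True):
    # Build the plot as usual.
    fig, ax = plt.subplots()
    ax.plot(angles)
    if show:  # tests pass show=False so CI stays headless
        plt.show()
    return fig
```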
This looks great. Really good speed improvement. Thanks.
…n see what's going on in the CI for OSX.
… when run in the shell subprocess. Needs further investigation, but it works locally. For now, just run under Linux.
# Conflicts: # donkeycar/tests/test_web_socket.py
Improve training and simplify model interfaces
Caching
Extended the `TubRecord` class to accept the image processor and cache the processed image instead of the raw image, to make up for the performance loss explained above. Image caching within the `TubRecord` is switched on by default, as it was before. It can now be switched off by setting `CACHE_IMAGES` to `False` in the config file. The augmented images are `uint8` images; in training these are the cached objects. The normalisation to [0, 1] `float64` data still happens on the fly, as it is fast, and storing the 8-times smaller `uint8` images results in much smaller memory consumption.
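A minimal sketch of the caching idea described above, assuming simplified names (`CachedRecord`, `load_image`, and `process` are illustrative, not the actual `TubRecord` API):

```python
import numpy as np

class CachedRecord:
    """Illustrative only: cache the augmented uint8 image, not the raw one."""

    def __init__(self, load_image, process, cache_images=True):
        self._load_image = load_image      # reads the raw image from disk
        self._process = process            # augmentation / crop pipeline -> uint8
        self._cache_images = cache_images  # mirrors the CACHE_IMAGES config flag
        self._cached = None

    def image(self):
        if self._cached is not None:
            return self._cached
        img = self._process(self._load_image())
        if self._cache_images:
            self._cached = img             # store the 8-times smaller uint8 array
        return img

    def x(self):
        # Normalisation to [0, 1] float64 happens on the fly at training time;
        # only the uint8 image stays in memory.
        return self.image().astype(np.float64) / 255.0
```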
KerasPilot
Simplified the model (`KerasPilot`) interfaces to no longer distinguish between `x_transform_and_process` and `x_translate`, but to merge both functionalities into the single `x_transform` call. The same applies to `y_transform` and `y_translate`. The original idea of returning `numpy` arrays from the `transform` functions and converting them into dictionaries in the `translate` functions did not prove to be of any advantage, because pipeline transformations can be performed on the dictionaries in the same way as on `numpy` data.
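A hedged before/after sketch of the interface merge; only the function names `x_transform`, `x_transform_and_process`, and `x_translate` come from the PR text, and the bodies are illustrative:

```python
import numpy as np

# Before: transform produced a numpy array, translate wrapped it in a dict.
def x_transform_and_process_old(record) -> np.ndarray:
    return record.image()                  # illustrative body

def x_translate_old(x: np.ndarray) -> dict:
    return {'img_in': x}

# After: a single x_transform returns the dictionary directly, so any
# further pipeline transformations can operate on the dict as well.
def x_transform(record) -> dict:
    return {'img_in': record.image()}
```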
Other
Fixed the `pytorch` tests and changed the `metrics` code to support newer versions of `pytorch`.
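One common way to support several `pytorch` versions in metrics or test code is a version guard; this is a generic sketch, not the actual change from this PR:

```python
import torch

# Parse the major/minor version, ignoring local suffixes such as '+cu118'.
_major, _minor = (int(p) for p in torch.__version__.split('+')[0].split('.')[:2])

if (_major, _minor) >= (2, 0):
    # take the code path for the newer pytorch API here
    backend = "torch>=2.0"
else:
    # fall back to the legacy behaviour here
    backend = "torch<2.0"
```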