Add EfficientNet example using automatic augmentations with DALI #4678
Conversation
CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.
Can you make this PR have a first commit with a copy of what's in DeepLearningExamples, so that I can see what you actually changed?
It is already done this way; look at the PR description for details about the individual commits.
```python
else:
    output = images

output = fn.crop_mirror_normalize(output, dtype=types.FLOAT, output_layout=types.NCHW,
```
Suggested change:
```diff
-output = fn.crop_mirror_normalize(output, dtype=types.FLOAT, output_layout=types.NCHW,
+output = fn.crop_mirror_normalize(output, dtype=types.FLOAT, output_layout="CHW",
```
`types.NCHW` was deprecated years ago.
I just realized that we should have NHWC as a parameter here, and I wonder how it even got a good training result.
So we are doing an unnecessary double transposition; I wonder if, and by how much, we can get faster without it.
We can get around the transposition. I am not sure it gives any benefit: the benchmarks give me slightly more samples/s, but that might just be noise.
Either way, I now use "CHW" and "HWC" and produce the memory in the target layout for the NHWC case.
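The layout point above can be illustrated without DALI. The sketch below (plain Python, illustrative helpers only, not the PR's code) shows why emitting the tensor directly in the target layout avoids work: asking for NCHW and then converting back to NHWC costs two transpositions instead of zero.

```python
# Minimal sketch of HWC<->CHW layout conversion on a nested-list "tensor".
def transpose_hwc_to_chw(img):
    # img is indexed [h][w][c]; the result is indexed [c][h][w].
    h, w, c = len(img), len(img[0]), len(img[0][0])
    return [[[img[y][x][ch] for x in range(w)] for y in range(h)] for ch in range(c)]

def transpose_chw_to_hwc(img):
    # img is indexed [c][h][w]; the result is indexed [h][w][c].
    c, h, w = len(img), len(img[0]), len(img[0][0])
    return [[[img[ch][y][x] for ch in range(c)] for x in range(w)] for y in range(h)]

# A 2x2 RGB image in HWC (interleaved) layout.
hwc = [[[1, 2, 3], [4, 5, 6]],
       [[7, 8, 9], [10, 11, 12]]]

chw = transpose_hwc_to_chw(hwc)          # one copy: fine if CHW is the target
assert transpose_chw_to_hwc(chw) == hwc  # this round trip is the wasted work
```

Producing the data in the target layout in the first place (here, via `output_layout` in the normalize step) skips both copies.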
```python
images = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)

images = fn.resize(images, resize_shorter=image_size, interp_type=interpolation,
                   antialias=False)
```
Do you really want to disable antialiasing? Just asking.
I took it from the original pipeline.
LGTM, minor comments only
Put some comments in the readme. I still have the `sh` file left to review.
docs/examples/use_cases/pytorch/efficientnet/image_classification/models/__init__.py
This example shows how DALI's implementation of automatic augmentations - most notably `AutoAugment <https://arxiv.org/abs/1805.09501>`_ and `TrivialAugment <https://arxiv.org/abs/2103.10158>`_ - can be used in training. It shows the training of EfficientNet, an image classification model first described in `EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks <https://arxiv.org/abs/1905.11946>`_.

The code is based on `NVIDIA Deep Learning Examples <https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/efficientnet>`_ - it has been extended with a DALI pipeline supporting automatic augmentations, which can be found :fileref:`here <docs/examples/use_cases/pytorch/efficientnet/image_classification/dali.py>`.
The `fileref` thingy does not render properly.
It does in Sphinx; I took it from the SSD example. I will try to please both Sphinx and GH, but I don't know if it is possible.
* ``--data-backend`` parameter was changed to accept ``dali``, ``pytorch``, or ``synthetic``. It is set to ``dali`` by default.
* ``--dali-device`` was added to control placement of some of the DALI operators.
* ``--augmentation`` was replaced with ``--automatic-augmentation``, now supporting ``disabled``, ``autoaugment``, and ``trivialaugment`` values.
* ``--workers`` defaults were halved to accommodate DALI. The value is automatically doubled when the ``pytorch`` data loader is used.
It would be nice to explain why ``--workers`` needed to be halved to accommodate DALI.
I added a bit about fitting both loaders with a good default, but I am not sure I really want to dive deep into how it works.
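The ``--workers`` behavior described above can be sketched as follows. This is a hedged approximation (the function name and the halved default value are assumptions, not the PR's exact code): one default that suits DALI, doubled back automatically when the ``pytorch`` data backend is selected.

```python
DEFAULT_WORKERS = 5  # assumed halved default, suitable for DALI

def effective_workers(data_backend: str, workers: int = DEFAULT_WORKERS) -> int:
    # The pytorch loader does decoding and augmentation on CPU worker
    # processes, so it gets the pre-DALI count back; the dali backend runs
    # its own threads and keeps the smaller number.
    return workers * 2 if data_backend == "pytorch" else workers

assert effective_workers("dali") == 5
assert effective_workers("pytorch") == 10
```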
* For inference:

  * Scale to target image size + 32
I believe it is not really clear what ``+32`` means in this context. Could you explain it?
It's just 224 + 32 - the definition was taken from the original model, as was (partially) this description. I will reword it a bit.
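The arithmetic above can be made concrete with a small sketch. The helper name and the center-crop step are assumptions drawn from the discussion, not the example's actual code:

```python
def inference_sizes(image_size: int = 224) -> tuple[int, int]:
    # "Scale to target image size + 32": the shorter edge is resized to
    # image_size + 32 (e.g. 224 + 32 = 256), then the image is
    # center-cropped back to image_size for the model input.
    resize_shorter = image_size + 32
    crop = image_size
    return resize_shorter, crop

assert inference_sizes(224) == (256, 224)
```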
```python
# TODO(klecki): Move it back again
import torchvision.datasets as datasets
import torchvision.transforms as transforms
```
I don't know if this TODO is a leftover or something that should stay. If it should stay, I believe it should describe better what needs to be done ;)
It doesn't matter much, but it breaks in my local setup: torchvision clashes with my local build of DALI, so I need to import DALI first.
docs/examples/use_cases/pytorch/efficientnet/image_classification/dali.py
```python
# of Pipeline definition, this `if` statement relies on static scalar parameter, so it is
# evaluated exactly once during build - we either include automatic augmentations or not.
if automatic_augmentation == "autoaugment":
    shapes = fn.peek_image_shape(jpegs)
```
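The build-time branching noted in the comment above can be sketched in plain Python (operator names here are illustrative strings, not DALI calls): the pipeline definition function runs once at build, so a branch on a static argument bakes exactly one variant into the graph.

```python
def build_ops(automatic_augmentation: str) -> list[str]:
    # Runs once at graph-build time; the branch selects which operators
    # end up in the pipeline, rather than being evaluated per sample.
    ops = ["decode", "resize"]
    if automatic_augmentation == "autoaugment":  # evaluated exactly once
        ops.append("auto_augment")
    ops.append("crop_mirror_normalize")
    return ops

assert "auto_augment" in build_ops("autoaugment")
assert "auto_augment" not in build_ops("disabled")
```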
If the image sizes are uniform (and they are, thanks to the `resize`), we can skip the shapes and just use the absolute version with `max_translation_abs=250` or `224`.
I think it is better to show the more flexible version in this example if it doesn't cause perf issues. Let me check.
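The reviewer's point above reduces to a small equivalence, sketched here with a hypothetical helper (not DALI API): a translation magnitude given relative to the image extent equals a fixed absolute value once all images share the same size after resize.

```python
def max_translation_px(image_extent: int, max_translation_rel: float) -> int:
    # Relative translation scales with the image extent.
    return int(image_extent * max_translation_rel)

# With uniform 224-pixel images, the per-sample shape lookup
# (peek_image_shape) could be replaced by one constant absolute value.
assert max_translation_px(224, 1.0) == 224
assert all(max_translation_px(e, 0.5) == 112 for e in (224, 224, 224))
```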
Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
Adjust some configuration options to accommodate it. Remove the obsolete pipeline.
The test runs one epoch fewer than the readme to make it a bit shorter.
CI MESSAGE: [7580814]: BUILD STARTED
CI MESSAGE: [7582436]: BUILD STARTED
CI MESSAGE: [7580814]: BUILD FAILED
CI MESSAGE: [7582436]: BUILD PASSED
Category: New feature, Other
Description:
This example ports the EfficientNet sample from the DeepLearningExamples repository.
The example is limited to the efficientnet-b0 variant for simplicity. The DALI pipeline is updated to use the `fn` API and the new automatic augmentations, adding options to select both AutoAugment and TrivialAugment. The main.py is adjusted so that the defaults are suitable for EfficientNet training out of the box (previously they were the defaults for RN50 training), and launch.py is no longer needed - the original example was started via launch.py, which looked up default values for a specific network in a .yml config and passed them to main.py. This way we can use main.py directly, without the layers of intermediate scripts.
The benchmarks from the readme are used to implement the L3 test.
The automatic augmentations come from: #4648. They can already be reviewed, as the API is basically a one-line invocation within the pipeline definition.
Additional information:
Affected modules and functionalities:
Docs/examples PR with L3 test.
Key points relevant for the review:
❗ Please check if the defaults in main.py match the ones in https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/Classification/ConvNets/configs.yml for DGX-1V, efficientnet-b0.
How to review this PR
I suggest checking out the code and running a diff tool like `meld` to compare the directories:
* `docs/examples/use_cases/pytorch/efficientnet/` from this PR
* `PyTorch/Classification/ConvNets/` from DeepLearningExamples
That way you can see that most of this PR is just files copied over.
Individual commits description:
I tried my best to split the PR into commits that are easier to review. Those are the steps (and specific commits):
PyTorch/Classification/ConvNets/
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: 3194