Enable async_io for squad. #617

Merged 2 commits on Jul 31, 2020


@trentlo (Contributor) commented Jul 23, 2020

This reduces Horovod overhead when XLA is enabled.

A ~7-9% speedup is observed for BERT SQuAD fp32 and a ~3-4% speedup for fp16.

@@ -940,6 +940,9 @@ def main(_):
else:
os.environ["TF_XLA_FLAGS"] = "--tf_xla_enable_lazy_compilation=false"

# Enable async_io to speed up multi-gpu training with XLA and Horovod.
os.environ["TF_XLA_FLAGS"] += "--tf_xla_async_io_level=1"


no " " separator?
same on line 939

@trentlo (Contributor, Author) replied:

You are right. I was misled by line 939. I wonder why it worked?

@trentlo (Contributor, Author) replied:

Did some tests. So line 939 never worked...

I updated the PR. Thanks @bas-aarts for the catch.
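
For readers following the thread, here is a minimal sketch of the separator issue, assuming `TF_XLA_FLAGS` is read as a whitespace-separated list of flags, as the review comment implies. The helper `append_xla_flag` is hypothetical and is not the code that landed in this PR. It appends to a plain dict by default in the demo so the real environment is left untouched.

```python
import os


def append_xla_flag(flag, env=os.environ):
    """Hypothetical helper: append `flag` to TF_XLA_FLAGS with a space separator."""
    current = env.get("TF_XLA_FLAGS", "").strip()
    # strip() keeps the value clean when TF_XLA_FLAGS was empty or unset.
    env["TF_XLA_FLAGS"] = f"{current} {flag}".strip()
    return env["TF_XLA_FLAGS"]


# Demo on a plain dict instead of the real process environment.
demo_env = {"TF_XLA_FLAGS": "--tf_xla_enable_lazy_compilation=false"}

# What the reviewed diff does: += without a leading space fuses both flags
# into a single token.
fused = demo_env["TF_XLA_FLAGS"] + "--tf_xla_async_io_level=1"

# With the separator, the two flags stay distinct.
joined = append_xla_flag("--tf_xla_async_io_level=1", env=demo_env)

print(fused.split())
print(joined.split())
```

Calling `append_xla_flag("--tf_xla_async_io_level=1")` with no `env` argument would modify `os.environ` directly, which is what a training script would need before TensorFlow reads the variable.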


looks good


@swethmandava swethmandava merged commit 21a77af into NVIDIA:master Jul 31, 2020
PeganovAnton pushed a commit to PeganovAnton/DeepLearningExamples that referenced this pull request Sep 8, 2020
3 participants