[Re-land] [CUDA graphs] Clear autocast amp cache #81896
@pytorchbot merge
@pytorchbot successfully started a merge and created land time checks.
Re-lands #81558, which was reverted due to failing tests. The failure was caused by a test that I designed poorly. [The loop here](https://github.com/pytorch/pytorch/pull/81558/files#diff-893b1eea27352f336f4cd832919e48d721e4e90186e63400b8596db6b82e7450R3837) runs first with `cache_enabled=False` and then with `cache_enabled=True`, so the graph from the previous iteration (the `False` case) conflicts with the next one (the `True` case). I redesigned the test so that it no longer loops. The new test makes separate function calls with different argument values.
Pull Request resolved: #81896
Approved by: https://github.com/ngimel
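The redesign described above can be sketched roughly as follows. This is not the test from the PR: the helper `check_graphed_forward`, the class `GraphAutocastCases`, and the small `Linear` module are hypothetical names chosen for illustration, and the sketch captures with `torch.cuda.graph` directly, which is an assumption about how such a check could be written. The point is only the structure: each configuration builds its own module, inputs, and graph inside a separate call, so nothing captured under one `cache_enabled` value is carried into the next.

```python
import unittest

try:
    import torch
except ImportError:  # lets the file import on machines without PyTorch
    torch = None

HAS_CUDA = torch is not None and torch.cuda.is_available()


def check_graphed_forward(with_amp, cache_enabled):
    """Hypothetical helper: capture one forward pass, replay it, compare with eager."""
    # Module, input, and graph are all created inside this call, so no state
    # is shared between configurations.
    model = torch.nn.Linear(8, 8).cuda()
    static_in = torch.randn(4, 8, device="cuda")

    def forward(x):
        # The autocast context is entered and exited around every forward pass.
        with torch.autocast("cuda", enabled=with_amp, cache_enabled=cache_enabled):
            return model(x)

    # Warm up on a side stream before capturing.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side), torch.no_grad():
        for _ in range(3):
            forward(static_in)
    torch.cuda.current_stream().wait_stream(side)

    graph = torch.cuda.CUDAGraph()
    with torch.no_grad(), torch.cuda.graph(graph):
        static_out = forward(static_in)

    # Replay on fresh data and compare with an eager run of the same module.
    static_in.copy_(torch.randn(4, 8, device="cuda"))
    graph.replay()
    with torch.no_grad():
        expected = forward(static_in)
    torch.testing.assert_close(static_out, expected)


@unittest.skipUnless(HAS_CUDA, "requires PyTorch with a CUDA device")
class GraphAutocastCases(unittest.TestCase):
    # One method per configuration instead of one loop over configurations.
    def test_no_amp(self):
        check_graphed_forward(with_amp=False, cache_enabled=False)

    def test_amp_cache_disabled(self):
        check_graphed_forward(with_amp=True, cache_enabled=False)

    def test_amp_cache_enabled(self):
        check_graphed_forward(with_amp=True, cache_enabled=True)
```

Saved to a file, the cases can be run with `python -m unittest <file>`. On a machine without CUDA they are reported as skipped rather than failed.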
Merge failed: some land checks failed: pull, pull / win-vs2019-cpu-py3 / test (default, 1, 2, windows.4xlarge)
@ngimel, it looks like the test failures are unrelated to AMP and CUDA graphs. Should we try to force-merge it?
@pytorchbot merge -f "test failures are unrelated"
@pytorchbot successfully started a merge job.
Hey @Aidyn-A.
Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/da0a3fe058de386d569b9fd621bd845d40e0cc39
Reviewed By: kit1980
Differential Revision: D38394874
fbshipit-source-id: e8aeecaa4cff30379b20d852cbf00460983a8615