[Autodiff] Deterministic gradient compute #7321
Conversation
dag = tvm.auto_scheduler.ComputeDAG(grads)
repeat = 100
for i in range(repeat):
    grads = te.gradient(R, [X], head=ones)
    new_dag = tvm.auto_scheduler.ComputeDAG(grads)
    assert str(dag) == str(new_dag)
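For context, a self-contained version of this check might look as follows; the forward compute, the shapes, and the names `X`, `W`, `R`, and `ones` are assumptions for illustration, not taken from the actual test file.

```python
import tvm
from tvm import te, topi

# Hypothetical forward compute: a small dense layer (shapes chosen arbitrarily).
X = te.placeholder((32, 64), name="X")
W = te.placeholder((16, 64), name="W")
R = topi.nn.dense(X, W)

# Head gradient of all ones, matching R's shape.
ones = topi.full_like(R, 1.0)

# Reference backward compute and its auto_scheduler DAG.
grads = te.gradient(R, [X], head=ones)
dag = tvm.auto_scheduler.ComputeDAG(grads)

# Regenerate the backward repeatedly; the printed DAG must never change.
for _ in range(100):
    grads = te.gradient(R, [X], head=ones)
    new_dag = tvm.auto_scheduler.ComputeDAG(grads)
    assert str(dag) == str(new_dag)
```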
- Since auto_scheduler guarantees that the same compute always produces the same DAG, you don't need to involve auto_scheduler in this test.
- I'm not even sure we need this test, because it may not expose the real problem. IIUC, the non-deterministic behavior comes from the use of unordered_set, so the test may still pass by luck even when something is broken. If that happens, this test becomes flaky. But I'd like to hear opinions from others. cc @yzhliu
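To illustrate the kind of non-determinism being referred to, here is a small Python analogue of C++ unordered_set iteration (not the actual TVM code):

```python
# Iterating a hash set of strings can yield a different order on each
# interpreter run (hash randomization), so any output assembled from that
# order is unstable unless it is sorted first.
terms = {"x0", "x1", "y0", "y1"}

unstable_order = list(terms)   # depends on the hash seed
stable_order = sorted(terms)   # deterministic regardless of hashing

print(unstable_order)
print(stable_order)
```

Sorting before emitting anything order-sensitive is the general shape of the fix this PR applies (see the commits sorting linear equations and inequalities below).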
> Since auto_scheduler guarantees that the same compute always produces the same DAG, you don't need to involve auto_scheduler in this test.

Yes, I agree. I use auto_scheduler here only because it provides a hash key for TE-level IR. Any ideas about how to compare two Tensor objects?

> I'm not even sure we need this test, because it may not expose the real problem. IIUC, the non-deterministic behavior comes from the use of unordered_set, so the test may still pass by luck even when something is broken. If that happens, this test becomes flaky. But I'd like to hear opinions from others.

Agree. I'm fine with removing the test.
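One possible way to compare two backward computes without involving auto_scheduler is sketched below; this is an assumption on my part (the helper name and the lowering-based comparison are not something settled in this thread): lower each set of gradient tensors to TIR with a default schedule and compare the printed modules.

```python
import tvm
from tvm import te

def lowered_str(grad_tensors, inputs):
    # Build a default schedule over the gradient tensors and print the lowered TIR.
    s = te.create_schedule([t.op for t in grad_tensors])
    return str(tvm.lower(s, inputs + list(grad_tensors), simple_mode=True))

# Hypothetical usage, reusing the names from the snippet above:
# assert lowered_str(grads, [X]) == lowered_str(new_grads, [X])
```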
The changes LGTM, but I'm not that familiar with this module, so I'd let @yzhliu approve this PR.
pls retrigger the CI, seems to be a flaky failure.
* fix unstable compute
* fix
* fix
* lint
* sort linear equation
* sort inequalities
* fix
* fix find
* lint
* fix find
* lint
`te.gradient` may generate different (but equivalent) backward computes for the same forward compute, which may result in Ansor cache misses. This PR makes sure `te.gradient` always generates the same backward compute. cc @yzhliu @comaniac