[fix]: fix tanh grad grad, add the third derivative test #10237

Merged
merged 10 commits into from
May 11, 2023


lucky9-cyou
Contributor

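Fixes the higher-order differentiation rule for the tanh op that was updated earlier. tanh_grad is now used to compute the grad_require_grad part of tanh_grad_grad. A third-order derivative test is added, because the second-order derivative test never exercises the grad_require_grad part.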
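The rule described above can be sketched in plain Python on scalars. This is a minimal illustration, not the OneFlow implementation: it assumes that tanh_grad(x, dy) means dy * (1 - tanh(x)^2), and the helper names tanh_grad_backward and central_diff are hypothetical. Under that assumption, the gradient flowing back into dy is just tanh_grad applied to the upstream gradient, which matches the reuse the description mentions.

```python
import math


def tanh_grad(x, dy):
    """Assumed first-order backward of tanh: dy * (1 - tanh(x)^2)."""
    t = math.tanh(x)
    return dy * (1.0 - t * t)


def tanh_grad_grad(x, dydx):
    """dydx * d/dx (1 - tanh(x)^2) = dydx * (-2 * tanh(x) * (1 - tanh(x)^2))."""
    t = math.tanh(x)
    return dydx * (-2.0 * t * (1.0 - t * t))


def tanh_grad_backward(x, dy, out_grad):
    """Hypothetical backward of z = tanh_grad(x, dy) for upstream gradient out_grad.

    Returns (gradient w.r.t. x, gradient w.r.t. dy). The second entry is
    only needed when dy itself requires grad.
    """
    grad_x = tanh_grad_grad(x, dy * out_grad)
    grad_dy = tanh_grad(x, out_grad)  # reuse of tanh_grad, as the PR describes
    return grad_x, grad_dy


def central_diff(f, v, eps=1e-6):
    """Central finite difference, used only to check the closed forms."""
    return (f(v + eps) - f(v - eps)) / (2.0 * eps)


# Arbitrary sample values for the check.
x, dy, out_grad = 0.3, 1.7, -0.8
grad_x, grad_dy = tanh_grad_backward(x, dy, out_grad)

num_grad_x = out_grad * central_diff(lambda v: tanh_grad(v, dy), x)
num_grad_dy = out_grad * central_diff(lambda v: tanh_grad(x, v), dy)

assert math.isclose(grad_x, num_grad_x, rel_tol=1e-6, abs_tol=1e-8)
assert math.isclose(grad_dy, num_grad_dy, rel_tol=1e-6, abs_tol=1e-8)
```

The finite-difference comparison only confirms that the two closed forms agree with the assumed definition of tanh_grad; it says nothing about how the OneFlow kernels are organized.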

Performance change on the cylinder2d example in OneScience:
origin: 0.29/epoch
now: 0.26/epoch

@lucky9-cyou lucky9-cyou added WIP work in progress ci eager op labels May 8, 2023
@lucky9-cyou lucky9-cyou requested review from hjchen2 and levi131 May 8, 2023 08:26
@lucky9-cyou lucky9-cyou requested review from BBuf and daquexian as code owners May 8, 2023 08:26
@lucky9-cyou lucky9-cyou enabled auto-merge (squash) May 8, 2023 08:27
@lucky9-cyou
Contributor Author

The two added tests expose different problems in the abs and square operators:

❯ python test_math_op_higher_derivative.py       
E.......................F..
======================================================================
ERROR: test_abs_grad_grad (__main__.TestMathOpHigherDerivative)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test_utils/automated_test_util/torch_flow_dual_object.py", line 668, in get_pytorch_oneflow_res
    pytorch_res = pytorch(*pytorch_args, **pytorch_kwargs)
  File "/data/home/zhangyanglin/miniconda3/envs/oneflow-compile/lib/python3.9/site-packages/torch/autograd/__init__.py", line 300, in grad
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test/modules/test_math_op_higher_derivative.py", line 156, in test_abs_grad_grad
    _test_math_op_grad_grad_impl(test_case, "abs")
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test/modules/test_math_op_higher_derivative.py", line 52, in _test_math_op_grad_grad_impl
    x_grad_grad_grad = torch.autograd.grad(x_grad_grad, x, init_grad, retain_graph=True)[0]
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test_utils/automated_test_util/torch_flow_dual_object.py", line 830, in dual_method
    pytorch_res, oneflow_res = get_pytorch_oneflow_res(
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test_utils/automated_test_util/torch_flow_dual_object.py", line 733, in get_pytorch_oneflow_res
    raise PyTorchDoesNotSupportError(e)
oneflow.test_utils.automated_test_util.torch_flow_dual_object.PyTorchDoesNotSupportError: PyTorch error: element 0 of tensors does not require grad and does not have a grad_fn

======================================================================
FAIL: test_square_grad_grad (__main__.TestMathOpHigherDerivative)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test/modules/test_math_op_higher_derivative.py", line 150, in test_square_grad_grad
    _test_math_op_grad_grad_impl(test_case, "square")
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test/modules/test_math_op_higher_derivative.py", line 53, in _test_math_op_grad_grad_impl
    test_case.assertTrue(
AssertionError: False is not true
❯ python test_global_math_op_higher_derivative.py
F.......................F..
======================================================================
FAIL: test_global_abs_grad_grad (__main__.TestGlobalMathOpHigherDerivative)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test_utils/automated_test_util/torch_flow_dual_object.py", line 1370, in new_f
    return f(*args, **kwargs)
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test/modules/test_global_math_op_higher_derivative.py", line 242, in test_global_abs_grad_grad
    _global_math_op_grad_grad_impl(test_case, "abs", placement, sbp)
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test/modules/test_global_math_op_higher_derivative.py", line 45, in _global_math_op_grad_grad_impl
    test_case.assertTrue(
AssertionError: False is not true

======================================================================
FAIL: test_global_square_grad_grad (__main__.TestGlobalMathOpHigherDerivative)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test_utils/automated_test_util/torch_flow_dual_object.py", line 1370, in new_f
    return f(*args, **kwargs)
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test/modules/test_global_math_op_higher_derivative.py", line 230, in test_global_square_grad_grad
    _global_math_op_grad_grad_impl(test_case, "square", placement, sbp)
  File "/data/home/zhangyanglin/Code/oneflow-master/python/oneflow/test/modules/test_global_math_op_higher_derivative.py", line 56, in _global_math_op_grad_grad_impl
    test_case.assertTrue(
AssertionError: False is not true

@lucky9-cyou
Contributor Author

lucky9-cyou commented May 8, 2023

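In the third-order derivative tests in both files, OneFlow's square operator produces nan for the third derivative, while torch's square operator returns all zeros.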

@lucky9-cyou
Contributor Author

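In the non-global test, PyTorch's abs operator raises the error above when computing the third derivative. In the global test, PyTorch's second derivative of abs is wrong (it is nonzero, while OneFlow's is 0), and the third derivative raises the same error.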
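For reference, the analytic values these comparisons are judged against are simple: the third derivative of square is 0 everywhere, and the second and third derivatives of abs are 0 away from the origin. The sketch below checks this with finite differences in plain Python; it uses no framework, and the helper names second_diff and third_diff and the sample points are illustrative assumptions.

```python
def second_diff(f, x, h=1e-2):
    """Central finite-difference estimate of the second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def third_diff(f, x, h=1e-2):
    """Central finite-difference estimate of the third derivative."""
    return (
        f(x + 2 * h) - 2.0 * f(x + h) + 2.0 * f(x - h) - f(x - 2 * h)
    ) / (2.0 * h ** 3)


def square(v):
    return v * v


# Sample points chosen away from 0, where abs is not differentiable.
points = [-1.5, -0.5, 0.5, 1.5]

square_third = [third_diff(square, p) for p in points]
abs_second = [second_diff(abs, p) for p in points]
abs_third = [third_diff(abs, p) for p in points]

# All three should vanish up to floating-point round-off.
assert all(abs(v) < 1e-6 for v in square_third)
assert all(abs(v) < 1e-6 for v in abs_second)
assert all(abs(v) < 1e-6 for v in abs_third)
```

Measured against these values, a nan third derivative for square and a nonzero second derivative for abs are the results that stand out.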

@lucky9-cyou lucky9-cyou force-pushed the ylin/tanh_grad_grad branch from 9ebc928 to ecd828d Compare May 9, 2023 05:36
@lucky9-cyou lucky9-cyou removed the WIP work in progress label May 9, 2023
@lucky9-cyou
Copy link
Contributor Author

The abs and square bugs are tracked in this issue: #10243

Contributor

@levi131 levi131 left a comment

LGTM

@lucky9-cyou lucky9-cyou force-pushed the ylin/tanh_grad_grad branch from 6e106ac to 51209e8 Compare May 9, 2023 06:25
@hjchen2
Contributor

hjchen2 commented May 9, 2023

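> In the third-order derivative tests in both files, OneFlow's square operator produces nan for the third derivative, while torch's square operator returns all zeros.

On my side, the third derivative of torch's square operator is None. Which PyTorch version are you using?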

@lucky9-cyou lucky9-cyou disabled auto-merge May 11, 2023 05:47
@lucky9-cyou lucky9-cyou enabled auto-merge (squash) May 11, 2023 05:47
@levi131 levi131 requested a review from oneflow-ci-bot May 11, 2023 08:06
@lucky9-cyou lucky9-cyou merged commit ff4abbe into Oneflow-Inc:master May 11, 2023
4 participants