
add DeepRecommender #10

Merged · 1 commit · Jan 2, 2024

Conversation

@GreatV (Contributor) commented on Dec 30, 2023

PyTorch exports normally, but Paddle fails to export:

    File "/home/greatx/repos/DeepRecommender.paddle/reco_encoder/model/model.py", line 139, in encode
        def encode(self, x):
            for ind, w in enumerate(self.encode_w):
                x = activation(
                ~~~~~~~~~~~~~~~ <--- HERE
                    input=paddle.nn.functional.linear(
                        weight=w.T, bias=self.encode_b[ind], x=x

    File "/home/greatx/repos/DeepRecommender.paddle/venv/lib/python3.10/site-packages/paddle/base/dygraph/math_op_patch.py", line 173, in _T_
        out = _C_ops.transpose(var, perm)

    ValueError: (InvalidArgument) transpose(): argument (position 1) must be OpResult, but got EagerParamBase (at ../paddle/fluid/pybind/eager_utils.cc:2108)
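
For context, a minimal reproduction sketch of the pattern that hits this traceback: a layer that transposes a parameter inside keyword arguments to paddle.nn.functional.linear and is then exported via dynamic-to-static. This is not code from the PR; the TinyEncoder name, the shapes, and the use of relu in place of the model's activation(...) helper are assumptions for illustration.

    # Minimal reproduction sketch (assumed, not from this PR).
    import paddle

    class TinyEncoder(paddle.nn.Layer):  # hypothetical layer name
        def __init__(self, in_dim=8, hidden=4):
            super().__init__()
            # Parameter stored as (in_dim, hidden); forward uses its transpose,
            # mirroring encode_w / w.T in the model from the traceback.
            self.w = self.create_parameter(shape=[in_dim, hidden])
            self.b = self.create_parameter(shape=[in_dim], is_bias=True)

        def forward(self, x):
            # Keyword-argument call pattern matching the encode() snippet above.
            return paddle.nn.functional.relu(
                x=paddle.nn.functional.linear(weight=self.w.T, bias=self.b, x=x)
            )

    layer = TinyEncoder()
    static_layer = paddle.jit.to_static(
        layer, input_spec=[paddle.static.InputSpec([None, 4], "float32")]
    )
    # On affected Paddle versions this export step raised the transpose() error above.
    paddle.jit.save(static_layer, "./tiny_encoder")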

@SigureMo (Contributor) commented on Jan 2, 2024

ValueError: (InvalidArgument) transpose(): argument (position 1) must be OpResult, but got EagerParamBase (at ../paddle/fluid/pybind/eager_utils.cc:2108)

This error is strange. With PIR not enabled at this stage, in principle this error should not appear at all. Let me look into how the code reaches this path.

@SigureMo (Contributor) commented on Jan 2, 2024

The problem has been found: the current AST transcription (the `visit_Call` pass that inserts `_jst.Ld`) does not cover keyword arguments.

Original code:

            x = activation(
                input=paddle.nn.functional.linear(
                    weight=w.T, bias=self.encode_b[ind], x=x
                ),
                kind=self._nl_type,
            )

Before the fix it was transcribed as:

x = _jst.Call(_jst.Ld(activation))(input=paddle.nn.functional.linear(weight=w.T, bias=self.encode_b[ind], x=x), kind=self._nl_type)

Clearly the keyword arguments were not transcribed.

After the fix it is transcribed as:

x = _jst.Call(_jst.Ld(activation))(input=_jst.Ld(_jst.Ld(_jst.Ld(_jst.Ld(paddle).nn).functional).linear)(weight=_jst.Ld(_jst.Ld(w).T), bias=_jst.Ld(_jst.Ld(_jst.Ld(self).encode_b)[_jst.Ld(ind)]), x=_jst.Ld(x)), kind=_jst.Ld(_jst.Ld(self)._nl_type))

With the transcription now complete, `w` is converted from a Tensor to a Variable after passing through `_jst.Ld`, so it no longer wrongly takes the dynamic-graph branch. Under the static graph (dynamic-to-static program construction), going through the dynamic-graph `_C_ops` is treated as PIR program construction, but a Tensor is passed in, so it crashes.
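
To make the keyword-coverage gap concrete, here is a generic `ast.NodeTransformer` sketch. It is not Paddle's real dy2static transformer; `LoadWrapper` and its traversal are assumptions for illustration. It shows how a `visit_Call` that only rewrites positional `args` would leave keyword values untouched, and how also visiting `node.keywords` restores full coverage.

    # Generic ast sketch (assumed; not Paddle's actual transformer).
    import ast

    class LoadWrapper(ast.NodeTransformer):  # hypothetical name
        """Wrap every loaded Name in a _jst.Ld(...) call, as in the transcriptions above."""

        def visit_Name(self, node):
            if isinstance(node.ctx, ast.Load):
                ld = ast.Attribute(value=ast.Name(id="_jst", ctx=ast.Load()),
                                   attr="Ld", ctx=ast.Load())
                return ast.Call(func=ld, args=[node], keywords=[])
            return node

        def visit_Call(self, node):
            node.func = self.visit(node.func)
            node.args = [self.visit(a) for a in node.args]
            # The gap: without the following rewrite, values passed as keyword
            # arguments (input=..., kind=...) are never transcribed.
            node.keywords = [ast.keyword(arg=kw.arg, value=self.visit(kw.value))
                             for kw in node.keywords]
            return node

    tree = ast.parse("activation(input=linear(weight=w.T, x=x), kind=self._nl_type)")
    # Keyword values (w.T, x, self._nl_type) now also pass through _jst.Ld.
    print(ast.unparse(LoadWrapper().visit(tree)))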

I'll open a PR later to fix that problem, but it is unrelated to this PR, so this PR can be merged first.

@SigureMo (Contributor) left a comment:

LGTMeow

Labels
HappyOpenSource (Happy Open Source event issues and PRs)
2 participants