【Hackathon 6th No.1】Add AdaptiveLogSoftmaxWithLoss API to Paddle -part #63302
Conversation
Your PR was submitted successfully. Thank you for contributing to the open source project!
@luotao1 @GGBond8488
from paddle.nn import functional as F

class TestNNAdaptiveLogSoftmaxWithLossAPI(unittest.TestCase):
Suggestion for the tests: add an application scenario for AdaptiveLogSoftmax. Build a small but complete network around it, optimizer included, then check both the output of AdaptiveLogSoftmax and whether its own weights get updated.
No problem. Is there anything else that needs to be added?
It has been added back. Please review.
What I am hoping for is a reasonably complete model-composition snippet whose forward pass uses AdaptiveLogSoftmax. Actually run that model, then verify that the resulting data is correct.
Let me make sure I understand.
> What I am hoping for is a reasonably complete model-composition snippet whose forward pass uses AdaptiveLogSoftmax. Actually run that model, then verify that the resulting data is correct.

Under that approach the result does not change: the input to AdaptiveLogSoftmaxWithLoss simply passes through other layers first, but the final result stays the same. I am confused about this part.
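To make the point of disagreement concrete, here is a minimal NumPy sketch of the adaptive log-softmax computation (my own reference implementation, not the PR's code; it assumes the internal cutoffs convention where `n_classes` is appended as the last element). A full-model test would feed the output of earlier layers in as `x` and assert the same invariants, regardless of which layers produced `x`:

```python
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def adaptive_log_softmax(x, y, head_w, tail_ws, cutoffs):
    """Reference forward pass; `cutoffs` includes n_classes, e.g. [2, 5, 8]."""
    shortlist = cutoffs[0]
    head_lp = log_softmax(x @ head_w)          # [N, shortlist + n_clusters]
    out = np.empty(len(y))
    for i, t in enumerate(y):
        if t < shortlist:
            out[i] = head_lp[i, t]             # frequent class: straight from the head
        else:
            # locate the tail cluster containing class t
            j = next(k for k in range(len(cutoffs) - 1)
                     if cutoffs[k] <= t < cutoffs[k + 1])
            w1, w2 = tail_ws[j]                # two-layer projection for cluster j
            clus_lp = log_softmax(x[i] @ w1 @ w2)
            out[i] = head_lp[i, shortlist + j] + clus_lp[t - cutoffs[j]]
    return out, -out.mean()                    # per-sample log-probs, NLL loss
```

Summing `exp(log_prob)` over all classes must give 1; that is the kind of end-to-end invariant a full-model test can assert.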
the index `n_classes - 1`. To compute log-probabilities for all classes, the ``log_prob`` method can be used.
"""

def __init__(
If weight is not passed in here, how can the initialization be specified explicitly?
This has been fixed.
If this is changed, remember to update the RFC design document as well, so the parameters stay consistent on both sides.
@luotao1 It looks like the inference CI needs someone's approval? Is that what it means?
It needs an approve. Once the reviewer for this task has passed the review, everything will be handled together at the end.
So that CI mainly means an approve is required?
Overall there are no remaining issues; just polish a few details and it's OK.
y = paddle.to_tensor([0, 5, 10])
model(x, y)

def test_forwadr(self):
Please fix this typo.
Sure, I'll push the fix tomorrow.
for before, after in zip(
    tail_weights_before_training, tail_weights_after_training
):
    assert not np.any(before != after)
This check is not very convincing: it only verifies that an update happened, not that the update is correct. Please see whether you can add a scenario that verifies the gradients.
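One common pattern for such a check is to compare an analytic gradient against central finite differences. A plain-NumPy sketch, independent of Paddle (a real test would exercise the layer's backward pass rather than this stand-in loss):

```python
import numpy as np

def nll_log_softmax(w, x, t):
    # stand-in loss: linear layer + log-softmax + NLL for target class t
    z = x @ w
    z = z - z.max()
    lp = z - np.log(np.exp(z).sum())
    return -lp[t]

def numerical_grad(f, w, eps=1e-6):
    # central finite differences, perturbing one parameter at a time
    g = np.zeros_like(w)
    it = np.nditer(w, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        orig = w[idx]
        w[idx] = orig + eps
        fp = f(w)
        w[idx] = orig - eps
        fm = f(w)
        w[idx] = orig
        g[idx] = (fp - fm) / (2 * eps)
    return g
```

For this loss the analytic gradient is `outer(x, softmax(z) - onehot(t))`; agreement to roughly 1e-5 between the two is the kind of evidence the review is asking for.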
Sure, I'll revise it tonight. I'll think about how to verify the gradients.
Quick question: do any of the other tests contain an example of gradient checking?
The RFC has been updated: PaddlePaddle/community#885
> This check is not very convincing: it only verifies that an update happened, not that the update is correct. Please see whether you can add a scenario that verifies the gradients.

Fixed.
All done.
None of these reported errors are in the part I wrote.
Is the environment broken? @luotao1
PR-CI-Codestyle-Check means there is a problem with the submitted code.
You can rerun the CI later, or merge develop to re-trigger it.
Doesn't Paddle support multiplying bool data with float32 data?
It's fixed now @jeff41404
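For reference, NumPy follows the promotion rule being discussed here (this is a NumPy check only, not a statement about Paddle's kernels): multiplying a bool mask by a float32 array promotes the bool and keeps the result in float32.

```python
import numpy as np

mask = np.array([True, False, True])               # bool mask
vals = np.array([1.5, 2.5, 3.5], dtype=np.float32)
out = mask * vals                                  # bool promotes; result stays float32
print(out.dtype, out)
```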
LGTM
python/paddle/nn/functional/loss.py
Outdated
):
r"""Compute adaptive logsoftmax result and negative log likelihood between ``input`` and ``label``.
Parameter ``head``, ``tail_weights``, ``cutoffs`` are inner members of AdaptiveLogSoftmaxWithLoss
Please refer to :ref:`_cn_api_paddle_nn_AdaptiveLogSoftmaxWithLoss`.
Suggested change:
- Please refer to :ref:`_cn_api_paddle_nn_AdaptiveLogSoftmaxWithLoss`.
+ Please refer to :ref:`api_paddle_nn_AdaptiveLogSoftmaxWithLoss`.
python/paddle/nn/functional/loss.py
Outdated
output (Tensor): The tensor sotring adaptive logsoftmax result, the shape of output is [N]
loss (Tensor): The tensor variable storing the adaptive_log_softmax_loss of input and label.
Suggested change:
- output (Tensor): The tensor sotring adaptive logsoftmax result, the shape of output is [N]
- loss (Tensor): The tensor variable storing the adaptive_log_softmax_loss of input and label.
+ - output (Tensor). The tensor storing adaptive logsoftmax result, the shape of output is [N]
+ - loss (Tensor). The tensor variable storing the adaptive_log_softmax_loss of input and label.
python/paddle/nn/functional/loss.py
Outdated
output (Tensor): The tensor sotring adaptive logsoftmax result, the shape of output is [N]
loss (Tensor): The tensor variable storing the adaptive_log_softmax_loss of input and label.

Examples::
Suggested change:
- Examples::
+ Examples:
python/paddle/nn/functional/loss.py
Outdated
Args:
input (Tensor): Input tensor, the data type should be float32 or float64.
label (Tensor): Label tensor, the data type should be float32 or float64.
head_weight (Tensor): weight tensor for linear computation, the data type should be float32 or float64, the shape should be [input.shape[1], shortlist_size + n_clusters], where shortlist_size is the first element in the cutoffs list, and n_clusters is the length of the cutoffs list minus 1.
Suggested change:
- head_weight (Tensor): weight tensor for linear computation, the data type should be float32 or float64, the shape should be [input.shape[1], shortlist_size + n_clusters], where shortlist_size is the first element in the cutoffs list, and n_clusters is the length of the cutoffs list minus 1.
+ head_weight (Tensor): weight tensor for linear computation, the data type should be float32 or float64, the shape should be ``[input.shape[1], shortlist_size + n_clusters]``, where ``shortlist_size`` is the first element in the cutoffs list, and ``n_clusters`` is the length of the cutoffs list minus 1.

Try to make this display a bit more nicely on the official site; right now everything runs together.
python/paddle/nn/functional/loss.py
Outdated
input (Tensor): Input tensor, the data type should be float32 or float64.
label (Tensor): Label tensor, the data type should be float32 or float64.
head_weight (Tensor): weight tensor for linear computation, the data type should be float32 or float64, the shape should be [input.shape[1], shortlist_size + n_clusters], where shortlist_size is the first element in the cutoffs list, and n_clusters is the length of the cutoffs list minus 1.
tail_weights (list[Tensor]): weight tensor list for linear computation, the data type should be float32 or float64. The number of elements in the tail_weights depends on the value of the n_clusters, and each element contains the weights of two linear layers, their dimensions are [input.shape[1], hsz] and [hsz, osz], where hsz is the number of input features in_features divided by div_value to the power (i + 1), where i is the cyclic variable, from 0 to n_clusters - 1, and osz is the (i + 1) The difference between the cutoff and the ith cutoff.
Suggested change:
- tail_weights (list[Tensor]): weight tensor list for linear computation, the data type should be float32 or float64. The number of elements in the tail_weights depends on the value of the n_clusters, and each element contains the weights of two linear layers, their dimensions are [input.shape[1], hsz] and [hsz, osz], where hsz is the number of input features in_features divided by div_value to the power (i + 1), where i is the cyclic variable, from 0 to n_clusters - 1, and osz is the (i + 1) The difference between the cutoff and the ith cutoff.
+ tail_weights (list[Tensor]): weight tensor list for linear computation, the data type should be float32 or float64. The number of elements in the tail_weights depends on the value of the n_clusters, and each element contains the weights of two linear layers, their dimensions are ``[input.shape[1], hsz]`` and ``[hsz, osz]``, where ``hsz`` is the number of input features in_features divided by div_value to the power (i + 1), where i is the cyclic variable, from 0 to n_clusters - 1, and ``osz`` is the difference between the (i + 1)th cutoff and the ith cutoff.
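The shape rules in the docstring above can be spelled out in a few lines. This is a hypothetical helper, not part of the PR; it assumes the convention that `cutoffs` is passed without `n_classes` (which is appended internally) and that the first tail cluster uses exponent 1 with the default `div_value=4.0`:

```python
def adaptive_shapes(in_features, cutoffs, n_classes, div_value=4.0):
    # full cutoff list with n_classes appended, e.g. [10, 100, 1000]
    full = list(cutoffs) + [n_classes]
    shortlist, n_clusters = full[0], len(full) - 1
    # head projects onto shortlist classes plus one logit per tail cluster
    head_shape = (in_features, shortlist + n_clusters)
    tail_shapes = []
    for i in range(n_clusters):
        hsz = int(in_features // (div_value ** (i + 1)))  # shrinking hidden size
        osz = full[i + 1] - full[i]                       # classes in cluster i
        tail_shapes.append(((in_features, hsz), (hsz, osz)))
    return head_shape, tail_shapes
```

For example, `adaptive_shapes(64, [10, 100], 1000)` gives a head of shape `(64, 12)` and tail projections `(64, 16)->(16, 90)` and `(64, 4)->(4, 900)`, matching the hsz/osz description above.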
python/paddle/nn/layer/loss.py
Outdated
class AdaptiveLogSoftmaxWithLoss(Layer):
r"""Adaptive softmax is an approximate strategy for training models with large output spaces. It is most effective when
the label distribution is highly imbalanced, for example in natural language modelling, where the word frequency
distribution approximately follows the ``Zipf's law``.
Since this follows the pytorch documentation, copy it in full; attach a link for Zipf's law.

Suggested change:
- distribution approximately follows the ``Zipf's law``.
+ distribution approximately follows the `Zipf's law <https://en.wikipedia.org/wiki/Zipf%27s_law>`_ .
python/paddle/nn/layer/loss.py
Outdated
weight_attr (ParamAttr, optional): The attribute for the learnable
weight of this layer. The default value is None. If the Initializer of the
param_attr is not set, the parameter is initialized with Xavier.
For detailed information, please refer to paddle.ParamAttr.
Suggested change:
- For detailed information, please refer to paddle.ParamAttr.
+ For detailed information, please refer to :ref:`api_paddle_ParamAttr`
python/paddle/nn/layer/loss.py
Outdated
of this layer. If it is set to False, no bias will be added to the output.
If it is set to None or one kind of ParamAttr, a bias parameter will
be created according to ParamAttr. For detailed information, please refer
to paddle.ParamAttr. The default value is None and the bias will be
Suggested change:
- to paddle.ParamAttr. The default value is None and the bias will be
+ to :ref:`api_paddle_ParamAttr`. The default value is None and the bias will be
python/paddle/nn/layer/loss.py
Outdated
- input (Tensor): The input tensor. The shapes is [N, in_features]. N is batch size.
- label (Tensor): target. The shapes is `[N]`
- output1 (Tensor): The shape is `[N]`
- output2 (Scalar):
Suggested change:
- - output2 (Scalar):
+ - output2 (Scalar).
python/paddle/nn/layer/loss.py
Outdated
Returns:
A callable object of AdaptiveLogSoftmaxWithLoss.

Examples::
Suggested change:
- Examples::
+ Examples:
All of them have been fixed.
The PR is already implemented. Could it not be closed for now? @luotao1
Close what? This PR has not been closed.
OK
LGTM
PaddlePaddle#63302) * Add AdaptiveLogSoftmaxWithLoss API * update codestyle * update loss * test * update test * add weight_attr * update forward * update forward * update * update * update * update test_gard * update * update information * update * update * codestyle * update * update * update * update
PR Category
Others
PR Types
New features
Description
Add AdaptiveLogSoftmaxWithLoss API
Link
Rfc PR: PaddlePaddle/community#856