
paddle/nn fix formula bugs #34643

Merged 17 commits on Aug 6, 2021

Changes from all commits (17 commits)
8b36b85
fix paddle.optimizer test=document_fix
sunzhongkai588 Aug 2, 2021
08e86aa
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
sunzhongkai588 Aug 2, 2021
1463dcf
fix paddle.optimizer test=document_fix
sunzhongkai588 Aug 2, 2021
f5da6fc
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
sunzhongkai588 Aug 2, 2021
0d3717d
fix bugs in paddle.nn.functional document test=document_fix
sunzhongkai588 Aug 3, 2021
c52ab54
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
sunzhongkai588 Aug 3, 2021
c601f90
fix bugs in paddle.nn.functional document test=document_fix
sunzhongkai588 Aug 3, 2021
6256990
fix bugs in paddle.nn.functional document test=document_fix
sunzhongkai588 Aug 3, 2021
9e4039c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
sunzhongkai588 Aug 3, 2021
25c7b91
fix bugs in paddle.nn.functional document test=document_fix
sunzhongkai588 Aug 3, 2021
8b9722e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
sunzhongkai588 Aug 3, 2021
89fad59
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
sunzhongkai588 Aug 5, 2021
a764c5b
fix nn formula bugs test=document_fix
sunzhongkai588 Aug 5, 2021
0c6c188
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
sunzhongkai588 Aug 5, 2021
54813fd
fix nn formula bugs test=document_fix
sunzhongkai588 Aug 5, 2021
00c5f7e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
sunzhongkai588 Aug 5, 2021
bf45aaf
fix nn formula bugs test=document_fix
sunzhongkai588 Aug 5, 2021
16 changes: 8 additions & 8 deletions python/paddle/fluid/clip.py
@@ -286,18 +286,18 @@ class ClipGradByNorm(ClipGradBase):

  .. math::
      Out =
-         \\left \{
-         \\begin{aligned}
-         & X & & if (norm(X) \\leq clip\_norm) \\\\
-         & \\frac{clip\_norm*X}{norm(X)} & & if (norm(X) > clip\_norm) \\\\
-         \\end{aligned}
-         \\right.
+         \left\{
+         \begin{array}{ccl}
+         X & & if (norm(X) \leq clip\_norm) \\
+         \frac{clip\_norm*X}{norm(X)} & & if (norm(X) > clip\_norm) \\
+         \end{array}
+         \right.


where :math:`norm(X)` represents the L2 norm of :math:`X`.

.. math::
-         norm(X) = ( \\sum_{i=1}^{n}|x\_i|^2)^{ \\frac{1}{2}}
+         norm(X) = ( \sum_{i=1}^{n}|x\_i|^2)^{ \frac{1}{2}}
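As a sanity check on the formula, here is a minimal NumPy sketch of the same math (an illustration of the documented behavior, not Paddle's implementation):

```python
import numpy as np

def clip_by_norm(x, clip_norm):
    """Return x unchanged if norm(x) <= clip_norm, else rescale it to clip_norm."""
    norm = np.sqrt(np.sum(x ** 2))  # L2 norm of x
    if norm <= clip_norm:
        return x
    return clip_norm * x / norm

g = np.array([3.0, 4.0])        # norm(g) = 5
clipped = clip_by_norm(g, 1.0)  # rescaled so that norm(clipped) = 1
```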

Note:
``need_clip`` of ``ClipGradByNorm`` HAS BEEN DEPRECATED since 2.0.
@@ -389,7 +389,7 @@ class ClipGradByGlobalNorm(ClipGradBase):

.. math::

-         t\_list[i] = t\_list[i] * \\frac{clip\_norm}{\max(global\_norm, clip\_norm)}
+         t\_list[i] = t\_list[i] * \frac{clip\_norm}{\max(global\_norm, clip\_norm)}

where:

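A minimal NumPy sketch of the global-norm formula (illustrative only; here `global_norm` is taken as the L2 norm over all tensors in the list combined):

```python
import numpy as np

def clip_by_global_norm(t_list, clip_norm):
    """Scale every tensor by clip_norm / max(global_norm, clip_norm)."""
    global_norm = np.sqrt(sum(np.sum(t ** 2) for t in t_list))
    scale = clip_norm / max(global_norm, clip_norm)
    return [t * scale for t in t_list], global_norm

grads = [np.array([3.0, 0.0]), np.array([0.0, 4.0])]  # global norm = 5
clipped, gn = clip_by_global_norm(grads, 1.0)
```

Note that when `global_norm` is already below `clip_norm`, the scale factor is exactly 1, so the tensors are left untouched.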
34 changes: 16 additions & 18 deletions python/paddle/fluid/dygraph/nn.py
@@ -1151,9 +1151,6 @@ def forward(self, input):

class BatchNorm(layers.Layer):
r"""
-     :alias_main: paddle.nn.BatchNorm
-     :alias: paddle.nn.BatchNorm,paddle.nn.layer.BatchNorm,paddle.nn.layer.norm.BatchNorm
-     :old_api: paddle.fluid.dygraph.BatchNorm

This interface is used to construct a callable object of the ``BatchNorm`` class.
For more details, refer to code examples.
@@ -1164,16 +1161,16 @@ class BatchNorm(layers.Layer):
Internal Covariate Shift <https://arxiv.org/pdf/1502.03167.pdf>`_
for more details.

-     When use_global_stats = False, the :math:`\\mu_{\\beta}`
-     and :math:`\\sigma_{\\beta}^{2}` are the statistics of one mini-batch.
+     When use_global_stats = False, the :math:`\mu_{\beta}`
+     and :math:`\sigma_{\beta}^{2}` are the statistics of one mini-batch.
Calculated as follows:

.. math::

-         \\mu_{\\beta} &\\gets \\frac{1}{m} \\sum_{i=1}^{m} x_i \\qquad &//\\
-         \ mini-batch\ mean \\\\
-         \\sigma_{\\beta}^{2} &\\gets \\frac{1}{m} \\sum_{i=1}^{m}(x_i - \\
-         \\mu_{\\beta})^2 \\qquad &//\ mini-batch\ variance \\\\
+         \mu_{\beta} &\gets \frac{1}{m} \sum_{i=1}^{m} x_i \qquad &
+         //\ mini-batch\ mean \\
+         \sigma_{\beta}^{2} &\gets \frac{1}{m} \sum_{i=1}^{m}(x_i - \mu_{\beta})^2 \qquad &
+         //\ mini-batch\ variance \\

- :math:`x` : mini-batch data
- :math:`m` : the size of the mini-batch data
@@ -1191,13 +1188,14 @@ class BatchNorm(layers.Layer):

.. math::

-         \\hat{x_i} &\\gets \\frac{x_i - \\mu_\\beta} {\\sqrt{\\
-         \\sigma_{\\beta}^{2} + \\epsilon}} \\qquad &//\ normalize \\\\
-         y_i &\\gets \\gamma \\hat{x_i} + \\beta \\qquad &//\ scale\ and\ shift
+         \hat{x_i} &\gets \frac{x_i - \mu_\beta} {\sqrt{\
+         \sigma_{\beta}^{2} + \epsilon}} \qquad &//\ normalize \\
+         y_i &\gets \gamma \hat{x_i} + \beta \qquad &//\ scale\ and\ shift


-     - :math:`\\epsilon` : add a smaller value to the variance to prevent division by zero
-     - :math:`\\gamma` : trainable proportional parameter
-     - :math:`\\beta` : trainable deviation parameter
+     - :math:`\epsilon` : add a smaller value to the variance to prevent division by zero
+     - :math:`\gamma` : trainable proportional parameter
+     - :math:`\beta` : trainable deviation parameter
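The mini-batch statistics and the normalize/scale-and-shift steps combine into a short training-mode sketch (NumPy, per-channel statistics over axis 0; illustrative only, not the Paddle kernel):

```python
import numpy as np

def batch_norm_train(x, gamma, beta, epsilon=1e-5):
    """Training-mode batch norm: normalize x with its own mini-batch statistics."""
    mu = x.mean(axis=0)                        # mini-batch mean
    var = x.var(axis=0)                        # mini-batch (biased) variance
    x_hat = (x - mu) / np.sqrt(var + epsilon)  # normalize
    return gamma * x_hat + beta                # scale and shift

x = np.random.default_rng(0).standard_normal((8, 3))  # m=8 samples, 3 channels
y = batch_norm_train(x, gamma=np.ones(3), beta=np.zeros(3))
```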

Parameters:
num_channels(int): Indicate the number of channels of the input ``Tensor``.
@@ -3011,9 +3009,9 @@ class SpectralNorm(layers.Layer):

.. math::

-         \mathbf{v} := \\frac{\mathbf{W}^{T} \mathbf{u}}{\|\mathbf{W}^{T} \mathbf{u}\|_2}
+         \mathbf{v} := \frac{\mathbf{W}^{T} \mathbf{u}}{\|\mathbf{W}^{T} \mathbf{u}\|_2}

-         \mathbf{u} := \\frac{\mathbf{W}^{T} \mathbf{v}}{\|\mathbf{W}^{T} \mathbf{v}\|_2}
+         \mathbf{u} := \frac{\mathbf{W}^{T} \mathbf{v}}{\|\mathbf{W}^{T} \mathbf{v}\|_2}

Step 3:
Calculate :math:`\sigma(\mathbf{W})` and normalize weight values.
@@ -3022,7 +3020,7 @@ class SpectralNorm(layers.Layer):

\sigma(\mathbf{W}) = \mathbf{u}^{T} \mathbf{W} \mathbf{v}

-         \mathbf{W} = \\frac{\mathbf{W}}{\sigma(\mathbf{W})}
+         \mathbf{W} = \frac{\mathbf{W}}{\sigma(\mathbf{W})}


Refer to `Spectral Normalization <https://arxiv.org/abs/1802.05957>`_ .
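The steps above amount to power iteration for the largest singular value sigma(W). A NumPy sketch of the conventional iteration (v <- W^T u / ||W^T u||_2, then u <- W v / ||W v||_2; illustrative only, not the Paddle kernel):

```python
import numpy as np

def spectral_norm(w, power_iters=20, seed=0):
    """Estimate sigma(w) by power iteration and return w / sigma(w)."""
    u = np.random.default_rng(seed).standard_normal(w.shape[0])
    for _ in range(power_iters):
        v = w.T @ u
        v /= np.linalg.norm(v)   # v <- W^T u / ||W^T u||_2
        u = w @ v
        u /= np.linalg.norm(u)   # u <- W v / ||W v||_2
    sigma = u @ w @ v            # estimate of the largest singular value
    return w / sigma, sigma

W = np.array([[2.0, 0.0],
              [0.0, 1.0]])       # singular values: 2 and 1
W_sn, sigma = spectral_norm(W)
```

After normalization the largest singular value of `W_sn` is 1, which is the point of spectral normalization.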
4 changes: 2 additions & 2 deletions python/paddle/nn/initializer/kaiming.py
@@ -33,7 +33,7 @@ class KaimingNormal(MSRAInitializer):

.. math::

-         \sqrt{\\frac{2.0}{fan\_in}}
+         \sqrt{\frac{2.0}{fan\_in}}

Args:
fan_in (float32|None): fan_in for Kaiming normal Initializer. If None, it is\
@@ -75,7 +75,7 @@ class KaimingUniform(MSRAInitializer):

.. math::

-         x = \sqrt{\\frac{6.0}{fan\_in}}
+         x = \sqrt{\frac{6.0}{fan\_in}}
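Both Kaiming variants reduce to drawing samples with a spread set by fan_in; a NumPy sketch of the two sampling rules (fan_in = 256 and fan_out = 128 are arbitrary example sizes, not values from this PR):

```python
import numpy as np

fan_in, fan_out = 256, 128
rng = np.random.default_rng(0)

# KaimingNormal: N(0, std^2) with std = sqrt(2.0 / fan_in)
std = np.sqrt(2.0 / fan_in)
w_normal = rng.normal(0.0, std, size=(fan_in, fan_out))

# KaimingUniform: U(-x, x) with x = sqrt(6.0 / fan_in)
bound = np.sqrt(6.0 / fan_in)
w_uniform = rng.uniform(-bound, bound, size=(fan_in, fan_out))
```

The uniform bound is chosen so that Var(U(-x, x)) = x^2 / 3 = 2 / fan_in, matching the variance of the normal variant.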

Args:
fan_in (float32|None): fan_in for Kaiming uniform Initializer. If None, it is\
4 changes: 2 additions & 2 deletions python/paddle/nn/initializer/xavier.py
@@ -28,7 +28,7 @@ class XavierNormal(XavierInitializer):

.. math::

-         \sqrt{\\frac{2.0}{fan\_in + fan\_out}}
+         \sqrt{\frac{2.0}{fan\_in + fan\_out}}


Args:
@@ -83,7 +83,7 @@ class XavierUniform(XavierInitializer):

.. math::

-         x = \sqrt{\\frac{6.0}{fan\_in + fan\_out}}
+         x = \sqrt{\frac{6.0}{fan\_in + fan\_out}}
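The Xavier formulas are the same idea with fan_in + fan_out in the denominator; a NumPy sketch of the two sampling rules (fan sizes are arbitrary example values):

```python
import numpy as np

fan_in, fan_out = 256, 128
rng = np.random.default_rng(0)

# XavierNormal: N(0, std^2) with std = sqrt(2.0 / (fan_in + fan_out))
std = np.sqrt(2.0 / (fan_in + fan_out))
w_normal = rng.normal(0.0, std, size=(fan_in, fan_out))

# XavierUniform: U(-x, x) with x = sqrt(6.0 / (fan_in + fan_out))
bound = np.sqrt(6.0 / (fan_in + fan_out))
w_uniform = rng.uniform(-bound, bound, size=(fan_in, fan_out))
```

Here too the uniform bound satisfies x^2 / 3 = std^2, so both variants produce weights with equal variance.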

Args:
fan_in (float, optional): fan_in for Xavier initialization, it is