[Hackathon 4th No.11] Add Geometric Distribution API to paddle #51224

Merged
merged 42 commits on Apr 28, 2023
Changes from 35 commits
Commits
fa3b3fd
first commit api and test_class
dasenCoding Mar 6, 2023
921cefa
Merge branch 'api_geometric' of http://github.com/dasenCoding/Paddle …
dasenCoding Mar 6, 2023
9b5fc87
fix CIcheck: code style
dasenCoding Mar 6, 2023
f73282c
fix CI check: code style
dasenCoding Mar 6, 2023
738032f
fix CI check: code style
dasenCoding Mar 6, 2023
0fa3635
fix CI check: code style
dasenCoding Mar 6, 2023
7c3161b
fix CI check: code style
dasenCoding Mar 6, 2023
3bde0b4
fix CI check: build failed for rsample tests
dasenCoding Mar 6, 2023
ee02925
Merge branch 'PaddlePaddle:develop' into api_geometric
dasenCoding Mar 7, 2023
5c9f43d
fix CI check: build failed
dasenCoding Mar 7, 2023
31a40cf
Merge branch 'api_geometric' of http://github.com/dasenCoding/Paddle …
dasenCoding Mar 7, 2023
48a6414
fix CI check: code converage
dasenCoding Mar 7, 2023
1eaf47a
fix CI check: code style
dasenCoding Mar 7, 2023
4b15204
fix CI check: code style
dasenCoding Mar 7, 2023
1b5299a
Merge branch 'PaddlePaddle:develop' into api_geometric
dasenCoding Mar 7, 2023
a339855
fix CI check: kl test failed
dasenCoding Mar 7, 2023
1dc341d
Merge branch 'api_geometric' of http://github.com/dasenCoding/Paddle …
dasenCoding Mar 7, 2023
21a8f94
Merge branch 'PaddlePaddle:develop' into api_geometric
dasenCoding Mar 8, 2023
4b95595
fix CI check: Coverage
dasenCoding Mar 8, 2023
416efb6
Merge branch 'api_geometric' of http://github.com/dasenCoding/Paddle …
dasenCoding Mar 8, 2023
ff02b3d
fix CI check: Py3
dasenCoding Mar 9, 2023
539abe8
fix CI check: sample code
dasenCoding Mar 10, 2023
4aae74f
fix CI check: static check
dasenCoding Mar 10, 2023
931b12e
Merge branch 'PaddlePaddle:develop' into api_geometric
dasenCoding Mar 20, 2023
029e3ce
Merge branch 'PaddlePaddle:develop' into api_geometric
dasenCoding Apr 8, 2023
be8c276
Merge branch 'PaddlePaddle:develop' into api_geometric
dasenCoding Apr 16, 2023
b2de934
update geometric/ test/ static_test/
dasenCoding Apr 16, 2023
06d05d5
Merge branch 'api_geometric' of http://github.com/dasenCoding/Paddle …
dasenCoding Apr 16, 2023
16b02f3
fix code style CI check
dasenCoding Apr 16, 2023
449b6cf
fix code style CI check
dasenCoding Apr 16, 2023
d05fe29
fix code style CI check
dasenCoding Apr 16, 2023
9f1ac44
add Examples for every method
dasenCoding Apr 16, 2023
7439580
fix 0D tensor
dasenCoding Apr 17, 2023
c3cdd69
delete probs' default value/ fix sample/rsample's param
dasenCoding Apr 21, 2023
8c9dfc6
Merge branch 'PaddlePaddle:develop' into api_geometric
dasenCoding Apr 23, 2023
c8137f2
Merge branch 'PaddlePaddle:develop' into api_geometric
dasenCoding Apr 25, 2023
0482646
change geometric's test path
dasenCoding Apr 25, 2023
5b81edb
Merge branch 'api_geometric' of http://github.com/dasenCoding/Paddle …
dasenCoding Apr 25, 2023
d431496
Merge branch 'PaddlePaddle:develop' into api_geometric
dasenCoding Apr 26, 2023
8198dbf
add class returns/ code indentation
dasenCoding Apr 27, 2023
439d401
Merge branch 'api_geometric' of http://github.com/dasenCoding/Paddle …
dasenCoding Apr 27, 2023
08f570a
Merge branch 'PaddlePaddle:develop' into api_geometric
dasenCoding Apr 27, 2023
2 changes: 2 additions & 0 deletions python/paddle/distribution/__init__.py
@@ -29,6 +29,7 @@
from paddle.distribution.transformed_distribution import TransformedDistribution
from paddle.distribution.uniform import Uniform
from paddle.distribution.laplace import Laplace
from paddle.distribution.geometric import Geometric

__all__ = [ # noqa
'Bernoulli',
@@ -47,6 +48,7 @@
'Laplace',
'LogNormal',
'Gumbel',
'Geometric',
]

__all__.extend(transform.__all__)
340 changes: 340 additions & 0 deletions python/paddle/distribution/geometric.py
@@ -0,0 +1,340 @@
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import numbers

import numpy as np

import paddle
from paddle.distribution import distribution, uniform
from paddle.fluid import framework


class Geometric(distribution.Distribution):
r"""
Geometric distribution parameterized by probs.

In probability theory and statistics, the geometric distribution is a discrete
probability distribution parameterized by a single success probability, denoted by probs.
It describes the number of independent Bernoulli trials k needed to obtain the first success,
that is, the probability that the first k-1 trials fail and the k-th trial succeeds.
The geometric distribution is the special case of the Pascal distribution with r=1.

The probability mass function (pmf) is

.. math::
Pr(Y=k)=(1-p)^{k-1}p

where k is the number of trials performed, p is the probability of success on each trial, k=1,2,3,..., and p lies in (0,1].

Args:
probs (Real|Tensor): Probability parameter.
The value of probs must lie in (0, 1]. When the parameter is a tensor, each element of probs is the probability of success on each trial.

Contributor: Please add a description of the return value (Returns).

Contributor Author: Added!

Examples:

.. code-block:: python

import paddle
from paddle.distribution import Geometric

geom = Geometric(0.5)

geom.mean
# Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
# [2.])

geom.variance
# Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
# [2.])

geom.stddev
# Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
# [1.41421354])
"""

def __init__(self, probs):
if isinstance(probs, (numbers.Real, paddle.Tensor, framework.Variable)):
if isinstance(probs, numbers.Real):
probs = paddle.full(
shape=(), fill_value=probs, dtype=paddle.float32
)

Contributor: For a scalar, full should create a zero-dimensional Tensor here: paddle.full(shape=(), fill_value=probs, dtype=paddle.float32)

Contributor Author: Fixed!

all_ones = paddle.full(
shape=probs.shape, fill_value=1, dtype=probs.dtype
)
all_zeros = paddle.full(
shape=probs.shape, fill_value=0, dtype=probs.dtype
)
all_false = paddle.full(
shape=probs.shape, fill_value=False, dtype=bool
)

lessthen_0 = probs <= all_zeros
morethen_1 = probs > all_ones

else:
raise TypeError(
f"Expected type of probs is Number.Real|Tensor|framework.Variable, but got {type(probs)}"
)

if paddle.equal_all(lessthen_0, all_false) and paddle.equal_all(
morethen_1, all_false
):
batch_shape = tuple(probs.shape)
else:
raise ValueError(
"Expected parameter probs of distribution Geometric to satisfy the "
"constraint Interval(lower_bound=0.0, upper_bound=1.0)"
)

self.probs = probs
super().__init__(batch_shape)

@property
def mean(self):
"""Mean of geometric distribution."""
return 1.0 / self.probs

@property
def variance(self):
"""Variance of geometric distribution."""
return paddle.to_tensor(
(1.0 / self.probs - 1.0) / self.probs,
dtype=self.probs.dtype,
)

@property
def stddev(self):
"""Standard deviation of Geometric distribution."""
return paddle.sqrt(self.variance)

def pmf(self, k):
r"""Probability mass function evaluated at k.

.. math::

P(X=k) = (1-p)^{k-1} p, \quad k=1,2,3,\ldots

Args:
k (int): Value to be evaluated.

Returns:
Tensor: Probability.

Examples:

.. code-block:: python

import paddle
Contributor: Indent the code section one more level throughout; otherwise the code block cannot be parsed.

Contributor Author: Fixed!


from paddle.distribution import Geometric

geom = Geometric(0.5)
geom.pmf(2)
# Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
# [0.25000000])
"""
if isinstance(k, (numbers.Integral, framework.Variable)):
return paddle.pow((1.0 - self.probs), k - 1.0) * self.probs
else:
raise TypeError(
f"Expected type of k is numbers.Integral|framework.Variable, but got {type(k)}"
)
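
The pmf and the `mean` and `variance` properties above can be sanity-checked without Paddle. The following is a minimal plain-Python sketch, assuming the support k = 1, 2, 3, ... used by the `pmf` method; the helper name `geometric_pmf` is hypothetical and not part of the PR.

```python
import math


def geometric_pmf(k: int, p: float) -> float:
    """P(X = k) = (1 - p)**(k - 1) * p for k = 1, 2, 3, ..."""
    if k < 1:
        return 0.0
    return (1.0 - p) ** (k - 1) * p


p = 0.5
ks = range(1, 200)  # the tail beyond k = 199 is negligible for p = 0.5

total = sum(geometric_pmf(k, p) for k in ks)
mean = sum(k * geometric_pmf(k, p) for k in ks)
variance = sum((k - mean) ** 2 * geometric_pmf(k, p) for k in ks)

print(geometric_pmf(2, p))  # 0.25, the value shown for geom.pmf(2)
print(total, mean, variance)  # close to 1, 1/p and (1 - p)/p**2
```

For p = 0.5 the sums land on 1, 2 and 2, which agrees with the `mean` and `variance` values in the class docstring example.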

def log_pmf(self, k):
r"""Log probability mass function evaluated at k.

.. math::
\log P(X = k) = (k-1) \log(1-p) + \log p

Args:
k (int): Value to be evaluated.

Returns:
Tensor: Log probability.

Examples:

.. code-block:: python

import paddle
from paddle.distribution import Geometric

geom = Geometric(0.5)
geom.log_pmf(2)
# Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
# [-1.38629436])
"""
if isinstance(k, (numbers.Integral, framework.Variable)):
return paddle.log(self.pmf(k))
else:
raise TypeError(
f"Expected type of k is numbers.Integral|framework.Variable, but got {type(k)}"
)
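
`log_pmf` in the diff takes the logarithm of `pmf(k)`. A possible alternative, shown here only as a plain-Python sketch and not as what the PR does, is to evaluate the log directly, which stays finite when the pmf itself underflows. The function names are hypothetical, and 0 < p < 1 is assumed.

```python
import math


def log_pmf_via_pmf(k: int, p: float) -> float:
    """Log of the pmf, computed by forming the pmf first."""
    return math.log((1.0 - p) ** (k - 1) * p)


def log_pmf_direct(k: int, p: float) -> float:
    """(k - 1) * log(1 - p) + log(p), evaluated without forming the pmf."""
    return (k - 1) * math.log1p(-p) + math.log(p)


# For small k both routes agree.
small = (log_pmf_via_pmf(2, 0.5), log_pmf_direct(2, 0.5))

# For large k the pmf underflows to 0.0, so its log is no longer usable
# (math.log raises ValueError here; a tensor library would typically give -inf).
try:
    log_pmf_via_pmf(2000, 0.5)
    underflowed = False
except ValueError:
    underflowed = True

large = log_pmf_direct(2000, 0.5)  # finite, roughly 2000 * log(0.5)
print(small, underflowed, large)
```

Whether the extra range matters depends on how large k gets in practice; for the small k in the docstring example both forms give the same value.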

def sample(self, shape=()):
"""Sample from Geometric distribution with sample shape.

Args:
shape (tuple(int)): Sample shape.

Returns:
Tensor: Sampled data with shape `sample_shape` + `batch_shape` + `event_shape`.

Examples:

.. code-block:: python

import paddle
from paddle.distribution import Geometric

geom = Geometric(0.5)
geom.sample((2,2))
# Tensor(shape=[2, 2, 1], dtype=float32, place=Place(cpu), stop_gradient=True,
# [[[4.28128004],
# [0.53546447]],
# [[0.88012987],
# [0.54070371]]])
"""
with paddle.no_grad():
return self.rsample(shape)

def rsample(self, shape=()):
"""Generate samples of the specified shape.

Args:
shape(tuple(int)): The shape of generated samples.

Returns:
Tensor: A sample tensor that fits the Geometric distribution.

Examples:

.. code-block:: python

import paddle
from paddle.distribution import Geometric
geom = Geometric(0.5)
geom.rsample((2,2))
# Tensor(shape=[2, 2, 1], dtype=float32, place=Place(cpu), stop_gradient=True,
# [[[2.90974379],
# [1.28049409]],
# [[4.60141420],
# [2.98836184]]])

"""
shape = distribution.Distribution._extend_shape(
self, sample_shape=shape
)
tiny = np.finfo(dtype='float32').tiny

sample_uniform = uniform.Uniform(low=float(tiny), high=float(1))

new_t = sample_uniform.sample(list(shape))
return paddle.log(new_t) / paddle.log1p(-(self.probs))
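
`rsample` above uses the inverse-transform idea: it draws a uniform value and returns `log(u) / log(1 - p)`, which is a continuous quantity. The sketch below shows, in plain Python, one common way to turn that quantity into an integer trial count k >= 1. The flooring step and the name `sample_geometric` are assumptions added for illustration; they are not in the PR.

```python
import math
import random


def sample_geometric(p: float, rng: random.Random) -> int:
    """Draw one integer k >= 1 with P(k) = (1 - p)**(k - 1) * p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p == 1.0:
        return 1  # success is certain on the first trial
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    # log(u) / log(1 - p) is exponentially distributed; flooring it and
    # adding 1 maps it onto the index of the first successful trial.
    return int(math.floor(math.log(u) / math.log1p(-p))) + 1


rng = random.Random(0)
p = 0.5
draws = [sample_geometric(p, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
print(min(draws), sample_mean)  # smallest draw is at least 1; mean is near 1/p
```

Note that flooring is not differentiable, so a discretized variant like this would suit `sample` rather than a reparameterized `rsample`.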

def entropy(self):
r"""Entropy of geometric distribution.

.. math::

H(X) = -\frac{(1-p) \log (1-p) + p \log p}{p}

Returns:
Tensor: Entropy.

Examples:

.. code-block:: python

import paddle
from paddle.distribution import Geometric

geom = Geometric(0.5)
geom.entropy()
# Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
# [1.38629436])
"""
x = (1.0 - self.probs) * paddle.log(1.0 - self.probs)
y = self.probs * paddle.log(self.probs)

return -(x + y) / self.probs
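
The value returned by `entropy` above, `-((1 - p) * log(1 - p) + p * log(p)) / p`, can be compared against a direct summation of `-P(k) * log P(k)`. This is a plain-Python sketch under the assumption that the support is k = 1, 2, 3, ... and 0 < p < 1; the function names are hypothetical.

```python
import math


def entropy_closed_form(p: float) -> float:
    """-[(1 - p) * log(1 - p) + p * log(p)] / p, for 0 < p < 1."""
    return -((1.0 - p) * math.log1p(-p) + p * math.log(p)) / p


def entropy_by_summation(p: float, terms: int = 2000) -> float:
    """-sum of P(k) * log P(k) over k = 1..terms, P(k) = (1 - p)**(k - 1) * p."""
    total = 0.0
    for k in range(1, terms + 1):
        log_pk = (k - 1) * math.log1p(-p) + math.log(p)
        total -= math.exp(log_pk) * log_pk
    return total


for p in (0.1, 0.5, 0.9):
    print(p, entropy_closed_form(p), entropy_by_summation(p))
```

For p = 0.5 both give 2 * log(2), about 1.386, which matches the value in the docstring example.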

def cdf(self, k):
r"""Cumulative distribution function (CDF) of geometric distribution.

.. math::

F(k) = P(X \leq k) = 1 - (1-p)^k, \quad k=0,1,2,\ldots

Args:
k (int): The number of trials performed.

Returns:
Tensor: Cumulative probability.

Examples:

.. code-block:: python

import paddle
from paddle.distribution import Geometric

geom = Geometric(0.5)
geom.cdf(4)
# Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
# [0.93750000])
"""
if isinstance(k, (numbers.Integral, framework.Variable)):
return 1.0 - paddle.pow((1.0 - self.probs), k)
else:
raise TypeError(
f"Expected type of k is numbers.Integral|framework.Variable, but got {type(k)}"
)
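
The closed form used by `cdf` above is the partial sum of the pmf over the first k trials. A short plain-Python sketch of that relationship follows; the helper names are hypothetical.

```python
import math


def geometric_cdf(k: int, p: float) -> float:
    """P(X <= k) = 1 - (1 - p)**k for k = 0, 1, 2, ..."""
    return 1.0 - (1.0 - p) ** k


def cdf_by_summation(k: int, p: float) -> float:
    """Sum of (1 - p)**(j - 1) * p over j = 1..k."""
    return sum((1.0 - p) ** (j - 1) * p for j in range(1, k + 1))


print(geometric_cdf(4, 0.5))  # 0.9375, the value shown for geom.cdf(4)
print(cdf_by_summation(4, 0.5))  # 0.5 + 0.25 + 0.125 + 0.0625 = 0.9375
```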

Contributor: Same as above.

def kl_divergence(self, other):
r"""Calculate the KL divergence KL(self || other) with two Geometric instances.

.. math::

KL(P \| Q) = p \log \frac{p}{q} + (1-p) \log \frac{1-p}{1-q}

Args:
other (Geometric): An instance of Geometric.

Returns:
Tensor: The kl-divergence between two geometric distributions.

Examples:

.. code-block:: python

import paddle
from paddle.distribution import Geometric

geom_p = Geometric(0.5)
geom_q = Geometric(0.1)
geom_p.kl_divergence(geom_q)
# Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
# [0.51082563])
"""
if isinstance(other, Geometric):
p, q = self.probs, other.probs
return p * paddle.log(p / q) + (1.0 - p) * paddle.log(
(1.0 - p) / (1.0 - q)
)
else:
raise TypeError(
f"Expected type of other is geometric.Geometric, but got {type(other)}"
)
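
As a reviewer-style check, not part of the PR, the expression returned by `kl_divergence` above can be compared with a brute-force summation of P(k) * (log P(k) - log Q(k)). The plain-Python sketch below assumes the support k = 1, 2, 3, ... from the `pmf` method; all function names are hypothetical.

```python
import math


def log_pmf(k: int, p: float) -> float:
    """log P(X = k) = (k - 1) * log(1 - p) + log(p)."""
    return (k - 1) * math.log1p(-p) + math.log(p)


def kl_by_summation(p: float, q: float, terms: int = 5000) -> float:
    """Sum of P(k) * (log P(k) - log Q(k)) over k = 1..terms."""
    total = 0.0
    for k in range(1, terms + 1):
        lp = log_pmf(k, p)
        total += math.exp(lp) * (lp - log_pmf(k, q))
    return total


def kl_series_closed_form(p: float, q: float) -> float:
    """log(p / q) + (1 - p) / p * log((1 - p) / (1 - q))."""
    return math.log(p / q) + (1.0 - p) / p * math.log((1.0 - p) / (1.0 - q))


def kl_as_in_diff(p: float, q: float) -> float:
    """The expression returned by kl_divergence in the diff."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


p, q = 0.5, 0.1
print(kl_by_summation(p, q), kl_series_closed_form(p, q), kl_as_in_diff(p, q))
```

Under this assumption the summation agrees with `log(p/q) + (1-p)/p * log((1-p)/(1-q))`, while the expression in the diff is p times that value, which is the per-trial (Bernoulli) divergence. It reproduces the 0.5108 in the docstring example; which of the two is intended would be worth confirming against the unit tests.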
6 changes: 6 additions & 0 deletions python/paddle/distribution/kl.py
@@ -21,6 +21,7 @@
from paddle.distribution.dirichlet import Dirichlet
from paddle.distribution.distribution import Distribution
from paddle.distribution.exponential_family import ExponentialFamily
from paddle.distribution.geometric import Geometric
from paddle.distribution.laplace import Laplace
from paddle.distribution.lognormal import LogNormal
from paddle.distribution.normal import Normal
@@ -200,6 +201,11 @@ def _kl_laplace_laplace(p, q):
return p.kl_divergence(q)


@register_kl(Geometric, Geometric)
def _kl_geometric_geometric(p, q):
return p.kl_divergence(q)


@register_kl(ExponentialFamily, ExponentialFamily)
def _kl_expfamily_expfamily(p, q):
"""Compute kl-divergence using `Bregman divergences <https://www.lix.polytechnique.fr/~nielsen/EntropyEF-ICIP2010.pdf>`_"""