This is a custom probability density function I created to solve a particular problem I had at work; it may also be useful to others.
This distribution aims to fit exponentially ascending data on a continuous interval [lower_bound, upper_bound]. Unlike the regular exponential distribution, its support does not have to start at zero: it can be defined on any bounded interval.
This module aims to mimic the API that scipy provides for its distributions (although not everything is implemented).
Much of the backend relies on scipy, which (at the time of writing) is licensed under BSD-3.
- When fitting your data, the lower and upper bounds are set to the min/max of the data, so very low or very high extremes will severely distort the results. It's recommended to clean your data first and make sure the sample you're fitting has a "natural"-looking exponentially ascending shape.
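As a rough illustration of the note above, extremes can be trimmed by percentile before fitting. This is a minimal sketch using only the standard library; trim_extremes and its default cutoffs are hypothetical and not part of invexpo.

```python
import statistics


def trim_extremes(data, lower_pct=1, upper_pct=99):
    """Return a plain list without the values outside the given percentiles.

    Hypothetical helper, not part of invexpo. lower_pct and upper_pct
    must be integers between 1 and 99.
    """
    if len(data) < 2:
        return list(data)
    # With n=100, quantiles() returns the 99 cut points P1..P99.
    cuts = statistics.quantiles(data, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    # Keep the original order so the result can go straight into fit().
    return [x for x in data if lo <= x <= hi]


clean = [600, 625, 650, 675, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800]
raw = clean + [5, 5000]  # two artificial extremes
trimmed = trim_extremes(raw)
```

Percentile trimming always discards something at both ends, even when the sample has no real outliers, so the cutoffs are a judgement call for your own data.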
This is an example of exponentially ascending data on the interval [300, 900]
[Figure: screenshot of the example data]
If you're interested in the actual derivation, see the whitepaper directory in this repo (each revision of the paper is listed there as a separate PDF).
Install with pip3 install invexpo
or download from the releases and install locally.
Initialize an "empty" distribution:
from invexpo.inverse_exponential import InverseExponential
invex = InverseExponential()
You can either fit the distribution to data or create a theoretical distribution.
Note that fit() only accepts Python lists as of now:
# sample data that mimics an "exponentially increasing" function bounded by [600, 800]
data = [600, 625, 650, 675, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800]
invex = InverseExponential()
# fit to data
invex.fit(data)
# fit() also takes a maxiter parameter in case the optimizer fails to converge;
# if that happens, though, it usually points to a more serious problem
invex.fit(data, maxiter=12345)
# create theoretical distribution
# NOTE: the 'a' (shape) parameter is usually very small; large values can cause overflows.
# Larger values of 'a' create sharper peaks, smaller values transition more smoothly
# over the interval.
invex.create(a=0.007, lower_bound=300, upper_bound=900)
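Because fit() currently accepts only Python lists, data held in another container has to be converted first. The sketch below shows one way to do that with the standard library; to_plain_list is a hypothetical helper, and array.array simply stands in for whatever array-like object you have (for a NumPy array, its tolist() method does the same job).

```python
from array import array


def to_plain_list(values):
    """Convert any iterable of numbers into a plain list of floats.

    Hypothetical helper, not part of invexpo.
    """
    return [float(v) for v in values]


# Stand-in for data that arrives as something other than a list.
samples = array("d", [600.0, 650.0, 700.0, 750.0, 800.0])
data = to_plain_list(samples)
# data is now a plain list and can be passed to invex.fit(data)
```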
After fitting/creating a distribution you can use the following methods:
- get_parameter() -> float
  - Returns the value of a, the shape parameter
- pdf(x: float) -> float
  - Evaluates the probability density function at x (i.e., f(x))
- cdf(x: float) -> float
  - Evaluates the cumulative distribution function at x (i.e., P(X <= x))
- icdf(p: float) -> float
  - Evaluates the inverse CDF at probability p, for 0 <= p <= 1, returning the corresponding percentile
- ppf(p: float) -> float
  - Same as icdf, just under a different name (percent point function)
- integrate(lower_bound: float, upper_bound: float) -> float
  - Integrates the pdf over the interval [lower_bound, upper_bound]
- rvs(size: int = 1) -> list[float]
  - Generates size random variates from the distribution
- moment(n: int) -> float
  - Returns the n-th moment of the distribution
- mean() -> float
  - Returns the mean of the distribution (equivalent to moment(1))
- median() -> float
  - Returns the median of the distribution
- var() -> float
  - Returns the variance of the distribution (equivalent to moment(2) - moment(1)**2)
- std() -> float
  - Returns the standard deviation of the distribution (equivalent to np.sqrt(var()))
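To make the relationships between these methods concrete, here is a small pure-Python stand-in. It is only a sketch under a loud assumption: the density used, f(x) = a*exp(a*(x - lb)) / (exp(a*(ub - lb)) - 1) on [lb, ub], is a generic ascending exponential chosen for illustration and is not taken from the whitepaper, so invexpo's actual formula may differ. The class name AscendingExpSketch, the rng argument and the midpoint-rule moment() are likewise hypothetical.

```python
import math
import random


class AscendingExpSketch:
    """Hypothetical stand-in, NOT invexpo's implementation.

    Assumed density: a*exp(a*(x - lb)) / (exp(a*(ub - lb)) - 1) on [lb, ub].
    """

    def __init__(self, a, lower_bound, upper_bound):
        self.a, self.lb, self.ub = a, lower_bound, upper_bound
        # Normalising constant; expm1 keeps precision when a is very small.
        self._norm = math.expm1(a * (upper_bound - lower_bound))

    def pdf(self, x):
        if not self.lb <= x <= self.ub:
            return 0.0
        return self.a * math.exp(self.a * (x - self.lb)) / self._norm

    def cdf(self, x):
        if x <= self.lb:
            return 0.0
        if x >= self.ub:
            return 1.0
        return math.expm1(self.a * (x - self.lb)) / self._norm

    def icdf(self, p):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must be in [0, 1]")
        # Closed-form inverse of cdf() for the assumed density.
        return self.lb + math.log1p(p * self._norm) / self.a

    ppf = icdf  # same function under a second name

    def integrate(self, lower_bound, upper_bound):
        # Integrating the pdf is the same as differencing the CDF.
        return self.cdf(upper_bound) - self.cdf(lower_bound)

    def rvs(self, size=1, rng=random):
        # Inverse transform sampling: push uniform draws through icdf().
        return [self.icdf(rng.random()) for _ in range(size)]

    def moment(self, n, steps=20000):
        # Raw n-th moment, approximated with the midpoint rule.
        h = (self.ub - self.lb) / steps
        mids = (self.lb + (i + 0.5) * h for i in range(steps))
        return sum((x ** n) * self.pdf(x) for x in mids) * h

    def mean(self):
        return self.moment(1)

    def median(self):
        return self.icdf(0.5)

    def var(self):
        return self.moment(2) - self.moment(1) ** 2

    def std(self):
        return math.sqrt(self.var())


dist = AscendingExpSketch(a=0.007, lower_bound=300, upper_bound=900)
draws = dist.rvs(size=5, rng=random.Random(0))
```

Working relative to the lower bound, with exp(a*(x - lb)) rather than exp(a*x), keeps the exponent small, which is one way to sidestep the overflow issue mentioned for large values of a.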