Refactor priors (#329)
Introduces `Prior` and `Distribution` classes for handling PEtab-specific prior distributions, and (PEtab-version-invariant) univariate probability distributions. Supports sampling from them, and evaluating negative log-priors (#312). Later on, this can be extended to noise models for measurements and computing log-likelihoods.
This also adds a notebook demonstrating the various prior options, which are a common source of confusion.

Closes #311.

:eyes: notebook: https://petab--329.org.readthedocs.build/projects/libpetab-python/en/329/example/distributions.html
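
To make "evaluating negative log-priors" concrete, here is a minimal sketch for a normal prior, written with the standard library only. The helper name `neg_log_normal_prior` is hypothetical and is not part of the `Prior` or `Distribution` API introduced by this commit; it only illustrates the quantity that would be added to an objective function.

```python
import math


def neg_log_normal_prior(x: float, mean: float, std: float) -> float:
    """Negative log-density of Normal(mean, std) at x (hypothetical helper)."""
    return 0.5 * math.log(2 * math.pi * std**2) + (x - mean) ** 2 / (2 * std**2)


# Values further from the prior mean receive a larger penalty.
at_mean = neg_log_normal_prior(0.0, mean=0.0, std=1.0)
one_sigma = neg_log_normal_prior(1.0, mean=0.0, std=1.0)
print(at_mean, one_sigma)
```

For a standard normal, moving one standard deviation away from the mean increases the negative log-prior by exactly 0.5.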
dweindl authored Dec 11, 2024
1 parent d7f7e3a commit 8456635
Showing 10 changed files with 851 additions and 111 deletions.
1 change: 1 addition & 0 deletions doc/example.rst
@@ -10,6 +10,7 @@ The following examples should help to get a better idea of how to use the PEtab

example/example_petablint.ipynb
example/example_visualization.ipynb
example/distributions.ipynb

Examples of systems biology parameter estimation problems specified in PEtab
can be found in the `systems biology benchmark model collection <https://github.com/Benchmarking-Initiative/Benchmark-Models-PEtab>`_.
208 changes: 208 additions & 0 deletions doc/example/distributions.ipynb
@@ -0,0 +1,208 @@
{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": [
"# Prior distributions in PEtab\n",
"\n",
"This notebook gives a brief overview of the prior distributions in PEtab and how they are represented in the PEtab library.\n",
"\n",
"Prior distributions are used to specify the prior knowledge about the parameters.\n",
"Parameter priors are specified in the parameter table. A prior is defined by its type and its parameters.\n",
"Each prior type has a specific set of parameters. For example, the normal distribution has two parameters: the mean and the standard deviation.\n",
"\n",
"There are two types of priors in PEtab - objective priors and initialization priors:\n",
"\n",
"* *Objective priors* are used to specify the prior knowledge about the parameters that are to be estimated. They will enter the objective function of the optimization problem. They are specified in the `objectivePriorType` and `objectivePriorParameters` columns of the parameter table.\n",
"* *Initialization priors* can be used as a hint for the optimization algorithm. They will not enter the objective function. They are specified in the `initializationPriorType` and `initializationPriorParameters` columns of the parameter table.\n",
"\n",
"\n"
],
"id": "372289411a2aa7b3"
},
{
"metadata": {
"collapsed": true
},
"cell_type": "code",
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import seaborn as sns\n",
"\n",
"from petab.v1.C import *\n",
"from petab.v1.priors import Prior\n",
"\n",
"sns.set_style(None)\n",
"\n",
"\n",
"def plot(prior: Prior, ax=None):\n",
"    \"\"\"Visualize a distribution.\"\"\"\n",
"    if ax is None:\n",
"        fig, ax = plt.subplots()\n",
"\n",
"    sample = prior.sample(10000)\n",
"\n",
"    # pdf\n",
"    xmin = min(sample.min(), prior.lb_scaled if prior.bounds is not None else sample.min())\n",
"    xmax = max(sample.max(), prior.ub_scaled if prior.bounds is not None else sample.max())\n",
"    x = np.linspace(xmin, xmax, 500)\n",
"    y = prior.pdf(x)\n",
"    ax.plot(x, y, color='red', label='pdf')\n",
"\n",
"    sns.histplot(sample, stat='density', ax=ax, label=\"sample\")\n",
"\n",
"    # bounds\n",
"    if prior.bounds is not None:\n",
"        for bound in (prior.lb_scaled, prior.ub_scaled):\n",
"            if bound is not None and np.isfinite(bound):\n",
"                ax.axvline(bound, color='black', linestyle='--', label='bound')\n",
"\n",
"    ax.set_title(str(prior))\n",
"    ax.set_xlabel('Parameter value on the parameter scale')\n",
"    ax.grid(False)\n",
"    handles, labels = ax.get_legend_handles_labels()\n",
"    unique_labels = dict(zip(labels, handles))\n",
"    ax.legend(unique_labels.values(), unique_labels.keys())\n",
"    plt.show()"
],
"id": "initial_id",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "markdown",
"source": "The basic distributions are the uniform, normal, Laplace, log-normal, and log-Laplace distributions:\n",
"id": "db36a4a93622ccb8"
},
{
"metadata": {},
"cell_type": "code",
"source": [
"plot(Prior(UNIFORM, (0, 1)))\n",
"plot(Prior(NORMAL, (0, 1)))\n",
"plot(Prior(LAPLACE, (0, 1)))\n",
"plot(Prior(LOG_NORMAL, (0, 1)))\n",
"plot(Prior(LOG_LAPLACE, (1, 0.5)))"
],
"id": "4f09e50a3db06d9f",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "markdown",
"source": "If a parameter scale is specified (`parameterScale=lin|log|log10`) and the prior is not a `parameterScale*`-type distribution, the sample is transformed accordingly (but the distribution parameters are not):\n",
"id": "dab4b2d1e0f312d8"
},
{
"metadata": {},
"cell_type": "code",
"source": [
"plot(Prior(NORMAL, (10, 2), transformation=LIN))\n",
"plot(Prior(NORMAL, (10, 2), transformation=LOG))\n",
"\n",
"# Note that the log-normal distribution is different from a log-transformed normal distribution:\n",
"plot(Prior(LOG_NORMAL, (10, 2), transformation=LIN))"
],
"id": "f6192c226f179ef9",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "markdown",
"source": "On the log-transformed parameter scale, `Log*` and `parameterScale*` distributions are equivalent:",
"id": "4281ed48859e6431"
},
{
"metadata": {},
"cell_type": "code",
"source": [
"plot(Prior(LOG_NORMAL, (10, 2), transformation=LOG))\n",
"plot(Prior(PARAMETER_SCALE_NORMAL, (10, 2)))"
],
"id": "34c95268e8921070",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Prior distributions can also be defined on the parameter scale by using the types `parameterScaleUniform`, `parameterScaleNormal` or `parameterScaleLaplace`. In these cases, 1) the distribution parameters are interpreted on the transformed parameter scale, and 2) a sample from the given distribution is used directly, without applying any transformation according to `parameterScale` (this implies that for `parameterScale=lin`, there is no difference between `parameterScaleUniform` and `uniform`):",
"id": "263c9fd31156a4d5"
},
{
"metadata": {},
"cell_type": "code",
"source": [
"plot(Prior(UNIFORM, (0.01, 2), transformation=LOG10))\n",
"plot(Prior(PARAMETER_SCALE_UNIFORM, (0.01, 2), transformation=LOG10))\n",
"\n",
"plot(Prior(UNIFORM, (0.01, 2), transformation=LIN))\n",
"plot(Prior(PARAMETER_SCALE_UNIFORM, (0.01, 2), transformation=LIN))\n"
],
"id": "5ca940bc24312fc6",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "markdown",
"source": "To prevent sampled parameters from exceeding the bounds defined in the parameter table, they are clipped to those bounds. Note that the current implementation does not support sampling from a truncated distribution; clipping instead places all out-of-bounds samples exactly on the bounds. This may introduce unwanted bias and should therefore be used with caution (i.e., the bounds should be chosen wide enough):",
"id": "b1a8b17d765db826"
},
{
"metadata": {},
"cell_type": "code",
"source": [
"plot(Prior(NORMAL, (0, 1), bounds=(-4, 4))) # negligible clipping-bias at 4 sigma\n",
"plot(Prior(UNIFORM, (0, 1), bounds=(0.1, 0.9))) # significant clipping-bias"
],
"id": "4ac42b1eed759bdd",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Further distribution examples:",
"id": "45ffce1341483f24"
},
{
"metadata": {},
"cell_type": "code",
"source": [
"plot(Prior(NORMAL, (10, 1), bounds=(6, 14), transformation=\"log10\"))\n",
"plot(Prior(PARAMETER_SCALE_NORMAL, (10, 1), bounds=(10**6, 10**14), transformation=\"log10\"))\n",
"plot(Prior(LAPLACE, (10, 2), bounds=(6, 14)))"
],
"id": "581e1ac431860419",
"outputs": [],
"execution_count": null
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
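
The clipping bias mentioned in the notebook can be illustrated without PEtab at all. The following is a minimal sketch using only the standard library; it assumes that clipping behaves like a plain `min`/`max` against the bounds and does not call the `Prior` class, so it says nothing about the library's internals. It mirrors the notebook's `Uniform(0, 1)` example with bounds `(0.1, 0.9)`.

```python
import random

random.seed(0)

lb, ub = 0.1, 0.9
n = 100_000

# Draw from Uniform(0, 1), then clip to [lb, ub] (clipping, not truncation).
clipped = [min(max(random.random(), lb), ub) for _ in range(n)]

# Out-of-bounds draws end up exactly on the bounds.
frac_at_lb = sum(v == lb for v in clipped) / n
frac_at_ub = sum(v == ub for v in clipped) / n
print(f"fraction at lower bound: {frac_at_lb:.3f}")
print(f"fraction at upper bound: {frac_at_ub:.3f}")
```

Roughly a tenth of the samples land on each bound, whereas a truncated distribution would spread that probability mass over the interior of the interval. This is why the notebook recommends choosing the bounds wide enough.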
1 change: 1 addition & 0 deletions doc/modules.rst
@@ -14,6 +14,7 @@ API Reference
petab.v1.composite_problem
petab.v1.conditions
petab.v1.core
petab.v1.distributions
petab.v1.lint
petab.v1.measurements
petab.v1.models
3 changes: 2 additions & 1 deletion petab/v1/C.py
@@ -173,7 +173,8 @@
LOG10 = "log10"
#: Supported observable transformations
OBSERVABLE_TRANSFORMATIONS = [LIN, LOG, LOG10]

#: Supported parameter transformations
PARAMETER_SCALES = [LIN, LOG, LOG10]

# NOISE MODELS
