theislab · aGuyLearning · Dec 6, 2024 · Dec 11, 2024 · Dec 11, 2024 · Dec 11, 2024
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
@@ -4,10 +4,10 @@
 
 <!-- Please fill in the appropriate checklist below (delete whatever is not relevant). These are the most common things requested on pull requests (PRs). -->
 
--   [ ] This comment contains a description of changes (with reason)
--   [ ] Referenced issue is linked
--   [ ] If you've fixed a bug or added code that should be tested, add tests!
--   [ ] Documentation in `docs` is updated
+- [ ] This comment contains a description of changes (with reason)
+- [ ] Referenced issue is linked
+- [ ] If you've fixed a bug or added code that should be tested, add tests!
+- [ ] Documentation in `docs` is updated
 
 **Description of changes**
 

diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
@@ -14,23 +14,23 @@ religion, or sexual identity and orientation.
 Examples of behavior that contributes to creating a positive environment
 include:
 
--   Using welcoming and inclusive language
--   Being respectful of differing viewpoints and experiences
--   Gracefully accepting constructive criticism
--   Focusing on what is best for the community
--   Showing empathy towards other community members
+- Using welcoming and inclusive language
+- Being respectful of differing viewpoints and experiences
+- Gracefully accepting constructive criticism
+- Focusing on what is best for the community
+- Showing empathy towards other community members
 
 Examples of unacceptable behavior by participants include:
 
--   The use of sexualized language or imagery and unwelcome sexual
-    attention or advances
--   Trolling, insulting/derogatory comments, and personal or political
-    attacks
--   Public or private harassment
--   Publishing others’ private information, such as a physical or
-    electronic address, without explicit permission
--   Other conduct which could reasonably be considered inappropriate in a
-    professional setting
+- The use of sexualized language or imagery and unwelcome sexual
+  attention or advances
+- Trolling, insulting/derogatory comments, and personal or political
+  attacks
+- Public or private harassment
+- Publishing others’ private information, such as a physical or
+  electronic address, without explicit permission
+- Other conduct which could reasonably be considered inappropriate in a
+  professional setting
 
 ## Our Responsibilities
 

diff --git a/README.md b/README.md
@@ -16,10 +16,10 @@
 
 ## Features
 
--   Exploratory and targeted analysis of Electronic Health Records
--   Quality control & preprocessing
--   Visualization & Exploration
--   Clustering & trajectory inference
+- Exploratory and targeted analysis of Electronic Health Records
+- Quality control & preprocessing
+- Visualization & Exploration
+- Clustering & trajectory inference
 
 ## Installation
 

diff --git a/docs/_static/docstring_previews/coxph_forestplot.png b/docs/_static/docstring_previews/coxph_forestplot.png
diff --git a/docs/contributing.md b/docs/contributing.md
@@ -126,11 +126,11 @@ in the cookiecutter-scverse template.
 
 Please write documentation for new or changed features and use-cases. This project uses [sphinx][] with the following features:
 
--   the [myst][] extension allows to write documentation in markdown/Markedly Structured Text
--   Google-style docstrings
--   Jupyter notebooks as tutorials through [myst-nb][] (See [Tutorials with myst-nb](#tutorials-with-myst-nb-and-jupyter-notebooks))
--   [Sphinx autodoc typehints][], to automatically reference annotated input and output types
--   Citations (like {cite:p}`Virshup_2023`) can be included with [sphinxcontrib-bibtex](https://sphinxcontrib-bibtex.readthedocs.io/)
+- the [myst][] extension allows to write documentation in markdown/Markedly Structured Text
+- Google-style docstrings
+- Jupyter notebooks as tutorials through [myst-nb][] (See [Tutorials with myst-nb](#tutorials-with-myst-nb-and-jupyter-notebooks))
+- [Sphinx autodoc typehints][], to automatically reference annotated input and output types
+- Citations (like {cite:p}`Virshup_2023`) can be included with [sphinxcontrib-bibtex](https://sphinxcontrib-bibtex.readthedocs.io/)
 
 See the [scanpy developer docs](https://scanpy.readthedocs.io/en/latest/dev/documentation.html) for more information
 on how to write documentation.
@@ -144,10 +144,10 @@ These notebooks come from [pert-tutorials](https://github.com/theislab/ehrapy-tu
 
 #### Hints
 
--   If you refer to objects from other packages, please add an entry to `intersphinx_mapping` in `docs/conf.py`. Only
-    if you do so can sphinx automatically create a link to the external documentation.
--   If building the documentation fails because of a missing link that is outside your control, you can add an entry to
-    the `nitpick_ignore` list in `docs/conf.py`
+- If you refer to objects from other packages, please add an entry to `intersphinx_mapping` in `docs/conf.py`. Only
+  if you do so can sphinx automatically create a link to the external documentation.
+- If building the documentation fails because of a missing link that is outside your control, you can add an entry to
+  the `nitpick_ignore` list in `docs/conf.py`
 
 #### Building the docs locally
 

diff --git a/docs/index.md b/docs/index.md
@@ -61,8 +61,8 @@ medRxiv 2023.12.11.23299816; doi: https://doi.org/10.1101/2023.12.11.23299816 ](
 
 # Indices and tables
 
--   {ref}`genindex`
--   {ref}`modindex`
--   {ref}`search`
+- {ref}`genindex`
+- {ref}`modindex`
+- {ref}`search`
 
 [scanpy genome biology (2018)]: https://doi.org/10.1186/s13059-017-1382-0
diff --git a/docs/installation.md b/docs/installation.md
@@ -51,10 +51,10 @@ pip install ehrapy[medcat]
 
 Available language models are
 
--   en_core_web_md (python -m spacy download en_core_web_md)
--   en-core-sci-sm (pip install <https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_core_sci_sm-0.4.0.tar.gz>)
--   en-core-sci-md (pip install <https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_core_sci_md-0.4.0.tar.gz>)
--   en-core-sci-lg (pip install <https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_core_sci_lg-0.4.0.tar.gz>)
+- en_core_web_md (python -m spacy download en_core_web_md)
+- en-core-sci-sm (pip install <https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_core_sci_sm-0.4.0.tar.gz>)
+- en-core-sci-md (pip install <https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_core_sci_md-0.4.0.tar.gz>)
+- en-core-sci-lg (pip install <https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.4.0/en_core_sci_lg-0.4.0.tar.gz>)
 
 [github repo]: https://github.com/theislab/ehrapy
 [pip]: https://pip.pypa.io

diff --git a/docs/tutorials/notebooks b/docs/tutorials/notebooks
diff --git a/ehrapy/plot/__init__.py b/ehrapy/plot/__init__.py
@@ -2,6 +2,6 @@
 from ehrapy.plot._colormaps import *  # noqa: F403
 from ehrapy.plot._missingno_pl_api import *  # noqa: F403
 from ehrapy.plot._scanpy_pl_api import *  # noqa: F403
-from ehrapy.plot._survival_analysis import kaplan_meier, kmf, ols
+from ehrapy.plot._survival_analysis import coxph_forestplot, kaplan_meier, ols
 from ehrapy.plot.causal_inference._dowhy import causal_effect
 from ehrapy.plot.feature_ranking._feature_importances import rank_features_supervised
diff --git a/ehrapy/plot/_survival_analysis.py b/ehrapy/plot/_survival_analysis.py
@@ -4,7 +4,10 @@
 from typing import TYPE_CHECKING
 
 import matplotlib.pyplot as plt
+import matplotlib.ticker as ticker
 import numpy as np
+import pandas as pd
+from matplotlib import gridspec
 from numpy import ndarray
 
 from ehrapy.plot import scatter
@@ -14,7 +17,7 @@
     from xmlrpc.client import Boolean
 
     from anndata import AnnData
-    from lifelines import KaplanMeierFitter
+    from lifelines import CoxPHFitter, KaplanMeierFitter
     from matplotlib.axes import Axes
     from statsmodels.regression.linear_model import RegressionResults
 
@@ -293,5 +296,133 @@ def kaplan_meier(
 
     if not show:
         return ax
+
     else:
         return None
+
+
+def coxph_forestplot(
+    coxph: CoxPHFitter,
+    labels: list[str] | None = None,
+    fig_size: tuple = (10, 10),
+    t_adjuster: float = 0.1,
+    ecolor: str = "dimgray",
+    size: int = 3,
+    marker: str = "o",
+    decimal: int = 2,
+    text_size: int = 12,
+    color: str = "k",
+):
+    """Plots a forest plot of the Cox Proportional Hazard model.
+
+    Inspired by the effect measure plots in the zEpid package: https://zepid.readthedocs.io/en/latest/Graphics.html#effect-measure-plots
+
+    Args:
+        coxph: Fitted CoxPHFitter object.
+        labels: List of labels for each coefficient, default uses the index of the coxph.summary.
+        fig_size: Width, height in inches.
+        t_adjuster: Vertical offset of the table (bottom of its bounding box, in axes coordinates), used to align the rows with the plot.
+        ecolor: Color of the error bars.
+        size: Size of the markers.
+        marker: Marker style.
+        decimal: Number of decimal places to display.
+        text_size: Font size of the text.
+        color: Color of the markers.
+
+    Examples:
+        >>> import ehrapy as ep
+        >>> adata = ep.dt.mimic_2(encoded=False)
+        >>> adata_subset = adata[:, ["mort_day_censored", "censor_flg", "gender_num", "afib_flg", "day_icu_intime_num"]]
+        >>> coxph = ep.tl.cox_ph(adata_subset, event_col="censor_flg", duration_col="mort_day_censored")
+        >>> ep.pl.coxph_forestplot(coxph)
+
+        .. image:: /_static/docstring_previews/coxph_forestplot.png
+
+    """
+
+    data = coxph.summary
+    auc_col = "coef"
+
+    if labels is None:
+        labels = data.index
+    tval = []
+    ytick = []
+    for i in range(len(data)):
+        if not np.isnan(data[auc_col].iloc[i]):
+            if (
+                isinstance(data[auc_col].iloc[i], float)
+                and isinstance(data["coef lower 95%"].iloc[i], float)
+                and isinstance(data["coef upper 95%"].iloc[i], float)
+            ):
+                tval.append(
+                    [
+                        round(data[auc_col].iloc[i], decimal),
+                        (
+                            "("
+                            + str(round(data["coef lower 95%"].iloc[i], decimal))
+                            + ", "
+                            + str(round(data["coef upper 95%"].iloc[i], decimal))
+                            + ")"
+                        ),
+                    ]
+                )
+            else:
+                tval.append(
+                    [
+                        data[auc_col].iloc[i],
+                        ("(" + str(data["coef lower 95%"].iloc[i]) + ", " + str(data["coef upper 95%"].iloc[i]) + ")"),
+                    ]
+                )
+            ytick.append(i)
+        else:
+            tval.append([" ", " "])
+            ytick.append(i)
+
+    maxi = round(((pd.to_numeric(data["coef upper 95%"])).max() + 0.1), 2)  # setting x-axis maximum
+
+    mini = round(((pd.to_numeric(data["coef lower 95%"])).min() - 0.1), 1)  # setting x-axis minimum
+
+    fig = plt.figure(figsize=fig_size)
+    gspec = gridspec.GridSpec(1, 6)  # sets up grid
+    plot = plt.subplot(gspec[0, 0:4])  # plot of data
+    tabl = plt.subplot(gspec[0, 4:])  # table
+    plot.set_ylim(-1, (len(data)))  # spacing out y-axis properly
+
+    plot.axvline(1, color="gray", zorder=1)
+    lower_diff = data[auc_col] - data["coef lower 95%"]
+    upper_diff = data["coef upper 95%"] - data[auc_col]
+    plot.errorbar(
+        data[auc_col],
+        data.index,
+        xerr=[lower_diff, upper_diff],
+        marker="None",
+        zorder=2,
+        ecolor=ecolor,
+        linewidth=0,
+        elinewidth=1,
+    )
+    plot.scatter(data[auc_col], data.index, c=color, s=(size * 25), marker=marker, zorder=3, edgecolors="None")
+    plot.xaxis.set_ticks_position("bottom")
+    plot.yaxis.set_ticks_position("left")
+    plot.get_xaxis().set_major_formatter(ticker.ScalarFormatter())
+    plot.get_xaxis().set_minor_formatter(ticker.NullFormatter())
+    plot.set_yticks(ytick)
+    plot.set_xlim([mini, maxi])
+    plot.set_xticks([mini, 1, maxi])
+    plot.set_xticklabels([mini, 1, maxi])
+    plot.set_yticklabels(labels)
+    plot.tick_params(axis="y", labelsize=text_size)
+    plot.yaxis.set_ticks_position("none")
+    plot.invert_yaxis()  # invert y-axis to align values properly with table
+    tb = tabl.table(
+        cellText=tval, cellLoc="center", loc="right", colLabels=[auc_col, "95% CI"], bbox=[0, t_adjuster, 1, 1]
+    )
+    tabl.axis("off")
+    tb.auto_set_font_size(False)
+    tb.set_fontsize(text_size)
+    for cell in tb.get_celld().values():
+        cell.set_linewidth(0)
+    plot.spines["top"].set_visible(False)
+    plot.spines["right"].set_visible(False)
+    plot.spines["left"].set_visible(False)
+    return fig, plot
diff --git a/tests/_scripts/coxph_forestplot_create_expected.ipynb b/tests/_scripts/coxph_forestplot_create_expected.ipynb
@@ -0,0 +1,86 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%load_ext autoreload\n",
+    "%autoreload 2"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import matplotlib.pyplot as plt\n",
+    "\n",
+    "import ehrapy as ep"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "current_notebook_dir = %pwd\n",
+    "_TEST_IMAGE_PATH = f\"{current_notebook_dir}/../plot/_images\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "adata = ep.dt.mimic_2(encoded=False)\n",
+    "adata_subset = adata[:, [\"mort_day_censored\", \"censor_flg\", \"gender_num\", \"afib_flg\", \"day_icu_intime_num\"]]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "genderafib_coxph = ep.tl.cox_ph(adata_subset, duration_col=\"mort_day_censored\", event_col=\"censor_flg\")\n",
+    "\n",
+    "fig, ax = ep.pl.coxph_forestplot(genderafib_coxph, fig_size=(12, 3), t_adjuster=0.15, marker=\"o\", size=2, text_size=14)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig.savefig(f\"{_TEST_IMAGE_PATH}/coxph_forestplot_expected.png\", dpi=80)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.10"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/tests/conftest.py b/tests/conftest.py
@@ -29,6 +29,12 @@ def rng():
     return np.random.default_rng(seed=42)
 
 
+@pytest.fixture
+def mimic_2():
+    adata = ep.dt.mimic_2()
+    return adata
+
+
 @pytest.fixture
 def mimic_2_encoded():
     adata = ep.dt.mimic_2(encoded=True)

diff --git a/tests/plot/_images/coxph_forestplot_expected.png b/tests/plot/_images/coxph_forestplot_expected.png
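
The table next to the forest plot in `coxph_forestplot` is built row by row from the coefficient and its 95% confidence bounds, with a blank row wherever the coefficient is NaN. The following is a minimal standalone sketch of that row-building step only. It is not the code from this PR: the helper name `format_forest_rows` and the tuple-based input are assumptions made for illustration, and it uses only the standard library instead of `coxph.summary`.

```python
import math


def format_forest_rows(estimates, decimal=2):
    """Build [estimate, "(lower, upper)"] rows for a forest plot table.

    `estimates` is an iterable of (coef, lower, upper) float tuples
    (a hypothetical stand-in for the columns of `coxph.summary`).
    """
    rows = []
    for coef, lower, upper in estimates:
        if math.isnan(coef):
            # Keep an empty row so the table stays aligned with the y-ticks.
            rows.append([" ", " "])
        else:
            interval = f"({round(lower, decimal)}, {round(upper, decimal)})"
            rows.append([round(coef, decimal), interval])
    return rows


example = [(0.123, -0.456, 0.702), (float("nan"), float("nan"), float("nan"))]
rows = format_forest_rows(example)
print(rows)
```

Keeping a placeholder row for NaN entries means the number of table rows always matches the number of y-ticks, which is what lets the table line up with the plotted points.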