v0.3.11 release (#74)
v0.3.11 release

Features:
* Traffic light overview (#62)

Documentation:
* Downloads badge readme
* List talks and articles in readme (#66)
* Add image to README.rst (#64)

Other improvements:
* Change notebook testing to pytest-notebook (previously these tests were skipped in CI). Add try-except ImportError for pyspark code. (#67)
* Fix a few typos
* Suppress the verbose "matplotlib backend" warning
* Clicking on "popmon report" now also scrolls to the top
* Update HTML reports using GitHub Actions (#63)
* Bugfix in hist.py that broke the advanced tutorial.

Notebooks:
* Add %%capture to pip install inside of notebooks.
* Make package install in notebooks work with paths with spaces.
* Pickling the report doesn't work in the notebook tests (and is not really a popmon-specific feature anyway). The pickle code in the notebook is now commented out and kept for reference.

* Version bump
sbrugman authored Dec 7, 2020
1 parent dd707f4 commit b8d94a0
Showing 17 changed files with 489 additions and 137 deletions.
36 changes: 35 additions & 1 deletion README.rst
@@ -2,7 +2,7 @@
Population Shift Monitoring
===========================

-|build| |docs| |release| |release_date|
+|build| |docs| |release| |release_date| |downloads|

|logo|

@@ -128,6 +128,37 @@ These examples also work with spark dataframes.
You can see the output of such example notebook code `here <https://crclz.com/popmon/reports/test_data_report.html>`_.
For all available examples, please see the `tutorials <https://popmon.readthedocs.io/en/latest/tutorials.html>`_ at read-the-docs.

Resources
=========

Presentations
-------------

+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+
| Title | Host | Date | Speaker |
+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+
| Popmon - population monitoring made easy | `Data Lunch @ Eneco <https://www.eneco.nl/>`_ | October 29, 2020 | Max Baak, Simon Brugman |
+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+
| Popmon - population monitoring made easy | `Data Science Summit 2020 <https://dssconf.pl/en/>`_ | October 16, 2020 | Max Baak |
+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+
| `Population Shift Monitoring Made Easy: the popmon package <https://youtu.be/PgaQpxzT_0g>`_ | `Online Data Science Meetup @ ING WBAA <https://www.meetup.com/nl-NL/Tech-Meetups-ING/events/>`_ | July 8 2020 | Tomas Sostak |
+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+
| `Popmon: Population Shift Monitoring Made Easy <https://www.youtube.com/watch?v=HE-3YeVYqPY>`_ | `PyData Fest Amsterdam 2020 <https://amsterdam.pydata.org/>`_ | June 16, 2020 | Tomas Sostak |
+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+
| Popmon: Population Shift Monitoring Made Easy | `Amundsen Community Meetup <https://github.com/amundsen-io/amundsen>`_ | June 4, 2020 | Max Baak |
+------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+------------------+-------------------------+


Articles
--------

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------+----------------+
| Title | Date | Author |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------+----------------+
| `Popmon Open Source Package — Population Shift Monitoring Made Easy <https://medium.com/wbaa/population-monitoring-open-source-1ce3139d8c3a>`_ | May 20, 2020 | Nicole Mpozika |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------+----------------+


Project contributors
====================

@@ -171,3 +202,6 @@ Copyright ING WBAA. `popmon` is completely free, open-source and licensed under
.. |notebook_incremental_data_colab| image:: https://colab.research.google.com/assets/colab-badge.svg
:alt: Open in Colab
:target: https://colab.research.google.com/github/ing-bank/popmon/blob/master/popmon/notebooks/popmon_tutorial_incremental_data.ipynb
+.. |downloads| image:: https://pepy.tech/badge/popmon
+:alt: PyPi downloads
+:target: https://pepy.tech/project/popmon
8 changes: 4 additions & 4 deletions popmon/alerting/alerts_summary.py
@@ -44,7 +44,7 @@ def __init__(
:param str read_key: key of input data to read from datastore.
:param str store_key: key of output data to store in datastore (optional).
-:param str combined_variable: name of artifical variable that combines all alerts. default is '_AGGREGATE_'.
+:param str combined_variable: name of artificial variable that combines all alerts. default is '_AGGREGATE_'.
:param list features: features of data frames to pick up from input data (optional).
:param list ignore_features: list of features to ignore (optional).
"""
@@ -77,7 +77,7 @@ def transform(self, datastore):
df = (self.get_datastore_object(data, feature, dtype=pd.DataFrame)).copy(
deep=False
)
-df.columns = [feature + "_" + c for c in df.columns]
+df.columns = [f"{feature}_{c}" for c in df.columns]
df_list.append(df)

# the different features could technically have different indices.
@@ -99,8 +99,8 @@ def transform(self, datastore):
dfc["worst"] = tlv[cols].values.max(axis=1) if len(cols) else 0
# colors of traffic lights
for color in ["green", "yellow", "red"]:
-cols = fnmatch.filter(tlv.columns, "*_n_{}".format(color))
-dfc["n_{}".format(color)] = tlv[cols].values.sum(axis=1) if len(cols) else 0
+cols = fnmatch.filter(tlv.columns, f"*_n_{color}")
+dfc[f"n_{color}"] = tlv[cols].values.sum(axis=1) if len(cols) else 0

# store combination of traffic alerts
data[self.combined_variable] = dfc
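
Review note on the aggregation touched in this hunk: `fnmatch.filter` picks out the per-feature `*_n_<color>` columns, and their values are summed into a single `n_<color>` total. The following is a minimal, pandas-free sketch of that pattern. The `row` dictionary, its values, and the `combine_counts` helper are made up for illustration and are not part of popmon.

```python
import fnmatch

# Hypothetical traffic light counts for two features in one time slot
row = {
    "age_n_green": 5,
    "age_n_yellow": 1,
    "age_n_red": 0,
    "income_n_green": 3,
    "income_n_yellow": 0,
    "income_n_red": 2,
    "age_worst": 1,  # does not match any "*_n_<color>" pattern
}


def combine_counts(row):
    """Sum the per-feature counts of each traffic light color."""
    combined = {}
    for color in ["green", "yellow", "red"]:
        # select all keys ending in _n_<color>, regardless of the feature prefix
        cols = fnmatch.filter(row.keys(), f"*_n_{color}")
        combined[f"n_{color}"] = sum(row[c] for c in cols) if cols else 0
    return combined


print(combine_counts(row))  # {'n_green': 8, 'n_yellow': 1, 'n_red': 2}
```

The f-string patterns behave the same as the earlier `"*_n_{}".format(color)` calls; the change in this hunk is purely stylistic.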
2 changes: 1 addition & 1 deletion popmon/hist/histogram.py
@@ -211,7 +211,7 @@ def __repr__(self):
return f"HistogramContainer(dtype={self.npdtype}, n_dims={self.n_dim})"

def __str__(self):
-return str(self)
+return repr(self)

def _edit_name(self, axis_name, xname, yname, convert_time_index, short_keys):
if convert_time_index and self.is_ts:
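
Review note on the `__str__` change in this hunk: the old version returned `str(self)`, which calls `__str__` again and ends in a `RecursionError`; delegating to `repr(self)` avoids that. The toy classes below are hypothetical stand-ins (not the real `HistogramContainer`) that sketch the difference.

```python
class BrokenContainer:
    """Mimics the old behaviour: __str__ calls itself via str(self)."""

    def __repr__(self):
        return "BrokenContainer()"

    def __str__(self):
        return str(self)  # re-enters __str__ until the recursion limit is hit


class FixedContainer:
    """Mimics the new behaviour: __str__ delegates to __repr__."""

    def __repr__(self):
        return "FixedContainer()"

    def __str__(self):
        return repr(self)


def str_recurses(obj):
    """Return True if converting obj to a string raises RecursionError."""
    try:
        str(obj)
    except RecursionError:
        return True
    return False


print(str_recurses(BrokenContainer()))  # True
print(str(FixedContainer()))  # FixedContainer()
```

Since `object.__str__` already falls back to `__repr__`, dropping the `__str__` override altogether would presumably give the same result.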
60 changes: 33 additions & 27 deletions popmon/notebooks/popmon_tutorial_advanced.ipynb
@@ -4,7 +4,6 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
-"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
@@ -26,10 +25,11 @@
"metadata": {},
"outputs": [],
"source": [
+"%%capture\n",
"# install popmon (if not installed yet)\n",
"import sys\n",
"\n",
-"!{sys.executable} -m pip install popmon"
+"!\"{sys.executable}\" -m pip install popmon"
]
},
{
@@ -145,11 +145,13 @@
"outputs": [],
"source": [
"# download histogrammar jar files if not already installed, used for histogramming of spark dataframe\n",
-"from pyspark.sql import SparkSession\n",
+"try:\n",
+" from pyspark.sql import SparkSession\n",
"\n",
-"spark = SparkSession.builder.config(\n",
-" \"spark.jars.packages\", \"org.diana-hep:histogrammar-sparksql_2.11:1.0.4\"\n",
-").getOrCreate()"
+" pyspark_installed = True\n",
+"except ImportError:\n",
+" print(\"pyspark needs to be installed for this example\")\n",
+" pyspark_installed = False"
]
},
{
@@ -158,18 +160,19 @@
"metadata": {},
"outputs": [],
"source": [
-"sdf = spark.createDataFrame(df)"
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"sdf.pm_stability_report(\n",
-" time_axis=\"DATE\", time_width=\"1w\", time_offset=\"2015-07-02\", extended_report=False\n",
-")"
+"if pyspark_installed:\n",
+" spark = SparkSession.builder.config(\n",
+" \"spark.jars.packages\", \"org.diana-hep:histogrammar-sparksql_2.11:1.0.4\"\n",
+" ).getOrCreate()\n",
+"\n",
+" sdf = spark.createDataFrame(df)\n",
+"\n",
+" sdf.pm_stability_report(\n",
+" time_axis=\"DATE\",\n",
+" time_width=\"1w\",\n",
+" time_offset=\"2015-07-02\",\n",
+" extended_report=False,\n",
+" )"
]
},
{
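
Review note on the pyspark guard introduced in the two hunks above: the notebook wraps the import in `try`/`except ImportError` and records the outcome in `pyspark_installed`, so later cells can skip the Spark example instead of failing. A minimal sketch of the same optional-dependency pattern as a reusable function follows; the `optional_import` helper is hypothetical and not part of popmon.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Same flag as in the notebook, derived from the helper
pyspark_sql = optional_import("pyspark.sql")
pyspark_installed = pyspark_sql is not None

if pyspark_installed:
    print("pyspark found, the Spark example can run")
else:
    print("pyspark needs to be installed for this example")
```

Keeping the flag in a module-level variable, as the notebook does, lets each later cell decide on its own whether to run.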
@@ -287,7 +290,7 @@
"outputs": [],
"source": [
"split_hist = split_hists.query(\"date == '2015-07-05 12:00:00'\")\n",
-"split_hist.histogram[0].hist.plot.matplotlib();"
+"split_hist.histogram[0].hist.plot.matplotlib()"
]
},
{
@@ -303,7 +306,7 @@
"metadata": {},
"outputs": [],
"source": [
-"split_hist.histogram_ref[0].hist.plot.matplotlib();"
+"split_hist.histogram_ref[0].hist.plot.matplotlib()"
]
},
{
@@ -320,11 +323,14 @@
"metadata": {},
"outputs": [],
"source": [
-"import pickle\n",
+"# As HTML report\n",
+"report.to_file(\"report.html\")\n",
"\n",
-"with open(\"report.pkl\", \"wb\") as f:\n",
-" pickle.dump(report, f)\n",
-"report.to_file(\"report.html\")"
+"# Alternatively, as serialized Python object\n",
+"# import pickle\n",
+"\n",
+"# with open(\"report.pkl\", \"wb\") as f:\n",
+"# pickle.dump(report, f)"
]
},
{
@@ -473,18 +479,18 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.7.7"
+"version": "3.8.6"
},
"nteract": {
"version": "0.15.0"
},
"pycharm": {
"stem_cell": {
"cell_type": "raw",
-"source": [],
"metadata": {
"collapsed": false
-}
+},
+"source": []
}
}
},
4 changes: 2 additions & 2 deletions popmon/notebooks/popmon_tutorial_basic.ipynb
@@ -4,7 +4,6 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
-"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
@@ -36,10 +35,11 @@
"metadata": {},
"outputs": [],
"source": [
+"%%capture\n",
"# install popmon (if not installed yet)\n",
"import sys\n",
"\n",
-"!{sys.executable} -m pip install popmon"
+"!\"{sys.executable}\" -m pip install popmon"
]
},
{
3 changes: 2 additions & 1 deletion popmon/notebooks/popmon_tutorial_incremental_data.ipynb
@@ -28,10 +28,11 @@
"metadata": {},
"outputs": [],
"source": [
+"%%capture\n",
"# install popmon (if not installed yet)\n",
"import sys\n",
"\n",
-"!{sys.executable} -m pip install popmon"
+"!\"{sys.executable}\" -m pip install popmon"
]
},
{
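
Review note on the quoting change made to the install cell in all three notebooks: without quotes, an interpreter path that contains a space is split into two words by the shell, so the command fails. The sketch below uses `shlex.split` to mimic POSIX shell word splitting, with a made-up interpreter path (the notebooks use the real `sys.executable`).

```python
import shlex

# Hypothetical interpreter location containing a space
executable = "/opt/my env/bin/python"

unquoted = f"{executable} -m pip install popmon"
quoted = f'"{executable}" -m pip install popmon'

# shlex.split follows POSIX shell rules for splitting a command line into words
unquoted_args = shlex.split(unquoted)
quoted_args = shlex.split(quoted)

print(unquoted_args[0])  # /opt/my
print(quoted_args[0])  # /opt/my env/bin/python
```

Outside a notebook, passing the arguments as a list to `subprocess.run` sidesteps shell quoting entirely.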
47 changes: 24 additions & 23 deletions popmon/pipeline/report_pipelines.py
@@ -18,7 +18,7 @@
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


-from pathlib import PosixPath
+from pathlib import Path

from ..base import Pipeline
from ..config import config
@@ -30,6 +30,7 @@
metrics_self_reference,
)
from ..visualization import (
+AlertSectionGenerator,
HistogramSection,
ReportGenerator,
SectionGenerator,
@@ -46,7 +47,7 @@ def self_reference(
features=None,
skip_empty_plots=True,
last_n=0,
-plot_hist_n=2,
+plot_hist_n=6,
report_filepath=None,
show_stats=None,
**kwargs,
@@ -160,7 +161,7 @@ def rolling_reference(
features=None,
skip_empty_plots=True,
last_n=0,
-plot_hist_n=2,
+plot_hist_n=6,
report_filepath=None,
show_stats=None,
**kwargs,
@@ -218,7 +219,7 @@ def expanding_reference(
features=None,
skip_empty_plots=True,
last_n=0,
-plot_hist_n=2,
+plot_hist_n=6,
report_filepath=None,
show_stats=None,
**kwargs,
@@ -284,7 +285,7 @@ def __init__(
last_n=0,
skip_first_n=0,
skip_last_n=0,
-plot_hist_n=2,
+plot_hist_n=6,
):
"""Initialize an instance of Report.
@@ -329,35 +330,35 @@ def sg_kws(read_key):
# - a section showing all traffic light alerts of monitored statistics
# - a section with a summary of traffic light alerts
# --- o generate report
-SectionGenerator(
-dynamic_bounds="dynamic_bounds",
-section_name=profiles_section,
-static_bounds="static_bounds",
-ignore_stat_endswith=["_mean", "_std", "_pull"],
-**sg_kws("profiles"),
+HistogramSection(
+read_key="split_hists",
+store_key=sections_key,
+section_name=histograms_section,
+hist_name_starts_with="histogram",
+last_n=plot_hist_n,
+description=descs.get("histograms", ""),
),
+TrafficLightSectionGenerator(
+section_name=traffic_lights_section, **sg_kws("traffic_lights")
+),
+AlertSectionGenerator(section_name=alerts_section, **sg_kws("alerts")),
SectionGenerator(
dynamic_bounds="dynamic_bounds_comparisons",
static_bounds="static_bounds_comparisons",
section_name=comparisons_section,
ignore_stat_endswith=["_mean", "_std", "_pull"],
**sg_kws("comparisons"),
),
-TrafficLightSectionGenerator(
-section_name=traffic_lights_section, **sg_kws("traffic_lights")
-),
-SectionGenerator(section_name=alerts_section, **sg_kws("alerts")),
-HistogramSection(
-read_key="split_hists",
-store_key=sections_key,
-section_name=histograms_section,
-hist_name_starts_with="histogram",
-last_n=plot_hist_n,
-description=descs.get("histograms", ""),
+SectionGenerator(
+dynamic_bounds="dynamic_bounds",
+section_name=profiles_section,
+static_bounds="static_bounds",
+ignore_stat_endswith=["_mean", "_std", "_pull"],
+**sg_kws("profiles"),
),
ReportGenerator(read_key=sections_key, store_key=store_key),
]
-if isinstance(report_filepath, (str, PosixPath)) and len(report_filepath) > 0:
+if isinstance(report_filepath, (str, Path)) and len(report_filepath) > 0:
self.modules.append(FileWriter(store_key, file_path=report_filepath))

def transform(self, datastore):
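
Review note on the `PosixPath` to `Path` change in this hunk: `Path` is the common base class of both `PosixPath` and `WindowsPath`, so the new check also accepts paths created on Windows. As far as I can tell, `len()` is not defined for `pathlib` objects, so the `len(report_filepath) > 0` part of the condition would raise a `TypeError` for a `Path` argument and only works for strings. The sketch below shows one possible way to handle both cases; `is_usable_filepath` is a hypothetical helper, not popmon code.

```python
from pathlib import Path, PosixPath, PurePath, WindowsPath


def is_usable_filepath(report_filepath):
    """Hypothetical check: True for any pathlib path or a non-empty string."""
    if isinstance(report_filepath, PurePath):
        # pathlib objects have no len(); accept them as they are
        return True
    return isinstance(report_filepath, str) and len(report_filepath) > 0


# Path covers both concrete flavours, PosixPath does not
print(issubclass(WindowsPath, PosixPath))  # False
print(issubclass(WindowsPath, Path))  # True
print(issubclass(PosixPath, Path))  # True

print(is_usable_filepath(Path("report.html")))  # True
print(is_usable_filepath(""))  # False
```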
4 changes: 2 additions & 2 deletions popmon/version.py
@@ -1,6 +1,6 @@
"""THIS FILE IS AUTO-GENERATED BY SETUP.PY."""

name = "popmon"
-version = "0.3.10"
-full_version = "0.3.10"
+version = "0.3.11"
+full_version = "0.3.11"
release = True
4 changes: 3 additions & 1 deletion popmon/visualization/__init__.py
@@ -20,14 +20,15 @@

# flake8: noqa

+from popmon.visualization.alert_section_generator import AlertSectionGenerator
from popmon.visualization.histogram_section import HistogramSection
from popmon.visualization.report_generator import ReportGenerator
from popmon.visualization.section_generator import SectionGenerator
from popmon.visualization.traffic_light_section_generator import (
TrafficLightSectionGenerator,
)

-# set matplotlib backend to batchmode when running in shell
+# set matplotlib backend to batch mode when running in shell
# need to do this *before* matplotlib.pyplot gets imported
from ..visualization.backend import set_matplotlib_backend

@@ -39,4 +40,5 @@
"HistogramSection",
"ReportGenerator",
"TrafficLightSectionGenerator",
+"AlertSectionGenerator",
]