Merlion dashboard app #129

Merged
merged 61 commits into from
Nov 8, 2022
61 commits
0ee3a1a
Merlion dashboard app
yangwenzhuo08 Oct 25, 2022
e260b59
Merge branch 'main' into dashboard
aadyotb Oct 26, 2022
501198d
Restructure the dashboard module
yangwenzhuo08 Oct 27, 2022
ae4324f
Restructure the dashboard module
yangwenzhuo08 Oct 27, 2022
c8abce9
Restructure the dashboard module
yangwenzhuo08 Oct 27, 2022
1e42e17
Add AutoETS and AutoProphet
yangwenzhuo08 Oct 27, 2022
95d078a
Remove MSES from dashboard
yangwenzhuo08 Oct 27, 2022
14f96f1
Restructure the dashboard module
yangwenzhuo08 Oct 27, 2022
bc83667
Restructure the dashboard module
yangwenzhuo08 Oct 27, 2022
7e2fc9a
Updates to dashboard directory structure.
aadyotb Oct 27, 2022
d82576f
Add Java to the dashboard Dockerfile.
aadyotb Oct 27, 2022
c2caa1d
Update version to 2.0.0
aadyotb Oct 27, 2022
4c501c3
Add explicit diskcache requirement.
aadyotb Oct 27, 2022
cfb05bb
Remove merlion/dashboard/app.py
aadyotb Oct 27, 2022
5e66484
Make empty figure actually be empty.
aadyotb Oct 27, 2022
f11a128
Use post-rule on train anomaly scores.
aadyotb Oct 27, 2022
6512fc5
Add exogenous regressor support to dashboard.
aadyotb Oct 28, 2022
7bbd74c
Add "max_lag" parameter for AutoETS and AutoProphet
yangwenzhuo08 Oct 28, 2022
9ecb98b
add type for input arguments with default values
chenghaoliu89 Oct 28, 2022
c38c103
Merge remote-tracking branch 'origin/dashboard' into dashboard
chenghaoliu89 Oct 28, 2022
21bde1d
Update type annotations.
aadyotb Oct 28, 2022
cd21dda
Plot ground truth anomalies if given.
aadyotb Oct 28, 2022
57c46d2
Fix a bug in plot_anoms_plotly
yangwenzhuo08 Oct 31, 2022
f3c78c7
Revise the control panel
yangwenzhuo08 Oct 31, 2022
aa8fc09
Revise the control panel
yangwenzhuo08 Oct 31, 2022
422872e
Fix bug in plot_anoms
aadyotb Oct 31, 2022
edbc10a
Update exogenous regressor notebook.
aadyotb Oct 31, 2022
ecc0e39
More systematic handling of exogenous regressors.
aadyotb Oct 31, 2022
9396606
Make anomaly dash more similar to forecast.
aadyotb Nov 1, 2022
424cc33
Update radio button names.
aadyotb Nov 1, 2022
14432af
Fix a bug in click_train_test
yangwenzhuo08 Nov 1, 2022
5082f8d
Change the default values of collapses
yangwenzhuo08 Nov 1, 2022
d673c63
Change the location of the progress bar
yangwenzhuo08 Nov 1, 2022
133672e
Allow default maxlags & add type annotations.
aadyotb Nov 1, 2022
7ed92c1
Add the docs for the Merlion dashboard
yangwenzhuo08 Nov 3, 2022
3e02e57
Add the docs for the Merlion dashboard
yangwenzhuo08 Nov 3, 2022
28a992f
Add the docs for the Merlion dashboard
yangwenzhuo08 Nov 3, 2022
78ffdce
Add dashboard info to main docs.
aadyotb Nov 3, 2022
9310743
Update dashboard screenshots.
aadyotb Nov 3, 2022
a408414
Fix the datetime format issue
yangwenzhuo08 Nov 4, 2022
5ae1800
Fix the datetime format issue
yangwenzhuo08 Nov 4, 2022
6989739
Fix the json stats format issue
yangwenzhuo08 Nov 4, 2022
9d08723
Fix docs build error.
aadyotb Nov 4, 2022
2d0fef8
Add exogenous regressors to VectorAR.
aadyotb Nov 4, 2022
baf8c28
Fix deprecation warning in GH Actions.
aadyotb Nov 4, 2022
23b54fc
Exclude dashboard from tests.
aadyotb Nov 4, 2022
0b378a4
Minor bugfix.
aadyotb Nov 4, 2022
0d36ba9
Use future exog values, not current ones.
aadyotb Nov 4, 2022
7d205b4
More flexible MV forecasting with the dashboard.
aadyotb Nov 4, 2022
f070be5
Reorder forecast & anomaly in the dashboard.
aadyotb Nov 4, 2022
ad47cb2
Fix the bug in load_data
yangwenzhuo08 Nov 7, 2022
c9f7311
Allow specifying enum parameters in dashboard.
aadyotb Nov 7, 2022
4f18b6c
Merge branch 'dashboard' of https://github.com/salesforce/Merlion int…
aadyotb Nov 7, 2022
07f713d
Add info-level logging to dashboard.
aadyotb Nov 7, 2022
e39fd74
Create static method for seasonality detection.
aadyotb Nov 7, 2022
7875e4f
Subtract trend before doing seasonality detection.
aadyotb Nov 7, 2022
cea4bd2
Use seasonality for default maxlags.
aadyotb Nov 7, 2022
a3760a6
Reorder forecasting models on dashboard.
aadyotb Nov 7, 2022
c81bce6
Fix bugs: prevent init call for several callbacks
yangwenzhuo08 Nov 8, 2022
035f056
Compute seas for both regular & de-trended data.
aadyotb Nov 8, 2022
a25ad79
Merge branch 'dashboard' of https://github.com/salesforce/Merlion int…
aadyotb Nov 8, 2022
8 changes: 4 additions & 4 deletions .github/workflows/tests.yml
@@ -42,15 +42,15 @@ jobs:
timeout_minutes: 40
command: |
# Get a comma-separated list of the directories of all python source files
-source_files=$(for f in $(find merlion -iname "*.py"); do echo -n ",$f"; done)
-script="import os; print(','.join({os.path.dirname(f) for f in '$source_files'.split(',') if f}))"
+files=$(for f in $(find merlion -iname "*.py"); do echo -n ",$f"; done)
+script="import os; print(','.join({os.path.dirname(f) for f in '$files'.split(',') if f and 'dashboard' not in f}))"
source_modules=$(python -c "$script")

# Run tests & obtain code coverage from coverage report.
coverage run --source=${source_modules} -L -m pytest -v -s
coverage report && coverage xml -o .github/badges/coverage.xml
COVERAGE=`coverage report | grep "TOTAL" | grep -Eo "[0-9\.]+%"`
-echo "##[set-output name=coverage;]${COVERAGE}"
+echo "coverage=${COVERAGE}" >> $GITHUB_OUTPUT

# Choose a color based on code coverage
COVERAGE=${COVERAGE/\%/}
@@ -65,7 +65,7 @@ jobs:
else
COLOR=red
fi
-echo "##[set-output name=color;]${COLOR}"
+echo "color=${COLOR}" >> $GITHUB_OUTPUT

- name: Create coverage badge
if: ${{ github.ref == 'refs/heads/main' && matrix.python-version == '3.10' }}
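The source-module computation in the hunk above is packed into a one-line `python -c` script. Unrolled, the filtering logic looks roughly like the sketch below. The function name `source_dirs` and the example paths are ours for illustration and are not part of the workflow.

```python
from __future__ import annotations

import os


def source_dirs(files_csv: str) -> set[str]:
    """Return the directories of all listed files, skipping dashboard code.

    `files_csv` mirrors the shell variable `$files`: a comma-separated list
    with a leading comma, e.g. ",merlion/a/x.py,merlion/dashboard/y.py".
    """
    return {
        os.path.dirname(f)
        for f in files_csv.split(",")
        # `if f` drops the empty entry produced by the leading comma; the
        # substring check excludes any path that contains "dashboard".
        if f and "dashboard" not in f
    }


if __name__ == "__main__":
    # Hypothetical file list, shaped like the output of the `find` loop.
    files = ",merlion/utils/misc.py,merlion/dashboard/server.py,merlion/models/base.py"
    print(",".join(sorted(source_dirs(files))))
```

Note that the check is a plain substring match, so it would also exclude any non-dashboard file whose path happens to contain the word "dashboard".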
25 changes: 20 additions & 5 deletions README.md
@@ -77,8 +77,9 @@ time series as ``pandas.DataFrame`` s with accompanying metadata.
You can install `merlion` from PyPI by calling ``pip install salesforce-merlion``. You may install from source by
cloning this repo and calling ``pip install Merlion/``, or ``pip install -e Merlion/`` to install in editable mode.
You may install additional dependencies via ``pip install salesforce-merlion[all]``, or by calling
-``pip install "Merlion/[all]"`` if installing from source. Individually, the optional dependencies include ``plot``
-for interactive plots and ``deep-learning`` for all deep learning models.
+``pip install "Merlion/[all]"`` if installing from source.
+Individually, the optional dependencies include ``dashboard`` for a GUI dashboard,
+``spark`` for a distributed computation backend with PySpark, and ``deep-learning`` for all deep learning models.

To install the data loading package `ts_datasets`, clone this repo and call ``pip install -e Merlion/ts_datasets/``.
This package must be installed in editable mode (i.e. with the ``-e`` flag) if you don't want to manually specify the
@@ -107,10 +108,23 @@ and presents experimental results on time series anomaly detection & forecasting
time series.

## Getting Started
-Here, we provide some minimal examples using Merlion default models,
-to help you get started with both anomaly detection and forecasting.
+The easiest way to get started is to use the GUI web-based
+[dashboard](https://opensource.salesforce.com/Merlion/merlion.dashboard.html).
+This dashboard provides a great way to quickly experiment with many models on your own custom datasets.
+To use it, install Merlion with the optional ``dashboard`` dependency (i.e.
+``pip install salesforce-merlion[dashboard]``), and call ``python -m merlion.dashboard`` from the command line.
+You can view the dashboard at http://localhost:8050.
+Below, we show some screenshots of the dashboard for both anomaly detection and forecasting.
+
+![anomaly dashboard](https://github.com/salesforce/Merlion/raw/main/docs/source/_static/dashboard_anomaly.png)
+
+![forecast dashboard](https://github.com/salesforce/Merlion/raw/main/docs/source/_static/dashboard_forecast.png)
+
+To help you get started with using Merlion in your own code, we provide below some minimal examples using Merlion
+default models for both anomaly detection and forecasting.

### Anomaly Detection
+Here, we show the code to replicate the results from the anomaly detection dashboard above.
We begin by importing Merlion’s `TimeSeries` class and the data loader for the Numenta Anomaly Benchmark `NAB`.
We can then divide a specific time series from this dataset into training and testing splits.

@@ -164,6 +178,7 @@ Precision: 0.6667, Recall: 0.6667, F1: 0.6667
Mean Time To Detect: 1 days 10:30:00
```
### Forecasting
+Here, we show the code to replicate the results from the forecasting dashboard above.
We begin by importing Merlion’s `TimeSeries` class and the data loader for the `M4` dataset. We can then divide a
specific time series from this dataset into training and testing splits.

@@ -215,7 +230,7 @@ msis = ForecastMetric.MSIS.value(ground_truth=test_data, predict=test_pred,
print(f"sMAPE: {smape:.4f}, MSIS: {msis:.4f}")
```
```
-sMAPE: 6.2855, MSIS: 19.1584
+sMAPE: 4.1944, MSIS: 18.9331
```
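For orientation, sMAPE in the output above is a scale-free percentage error. The sketch below assumes the common definition on a 0 to 200 scale, i.e. 200/n times the sum of |y - ŷ| / (|y| + |ŷ|). The helper `smape` is hypothetical and written only for illustration; Merlion's own `ForecastMetric.sMAPE` remains the reference for the numbers printed above.

```python
def smape(ground_truth, predict):
    """Symmetric mean absolute percentage error on a 0-200 scale (assumed definition)."""
    if len(ground_truth) != len(predict) or not ground_truth:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, y_hat in zip(ground_truth, predict):
        denom = abs(y) + abs(y_hat)
        # A 0/0 term is counted as zero error; this is a choice made for
        # this sketch, not something taken from the README.
        total += abs(y - y_hat) / denom if denom else 0.0
    return 200.0 * total / len(ground_truth)


if __name__ == "__main__":
    # Made-up numbers, unrelated to the M4 results shown in the README.
    print(f"sMAPE: {smape([100.0, 200.0], [110.0, 180.0]):.4f}")
```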

## Evaluation and Benchmarking
15 changes: 15 additions & 0 deletions docker/dashboard/Dockerfile
@@ -0,0 +1,15 @@
FROM python:3.9-slim
WORKDIR /opt/Merlion
# Install Java
RUN rm -rf /var/lib/apt/lists/* && \
apt-get clean && \
apt-get update && \
apt-get upgrade -y && \
apt-get install -y --no-install-recommends openjdk-11-jre-headless && \
rm -rf /var/lib/apt/lists/*
# Install Merlion from source & set up a gunicorn server
COPY *.md ./
COPY setup.py ./
COPY merlion merlion
RUN pip install gunicorn "./[dashboard]"
CMD gunicorn -b 0.0.0.0:80 merlion.dashboard.server:server
2 changes: 0 additions & 2 deletions Dockerfile → docker/spark-on-k8s/Dockerfile
@@ -15,5 +15,3 @@ RUN pip install pyarrow "./"
COPY apps /opt/spark/apps
RUN chmod g+w /opt/spark/apps
USER ${spark_uid}
-COPY emissions.csv emissions.csv
-COPY emissions.json emissions.json
Binary file added docs/source/_static/dashboard_anomaly.png
Binary file added docs/source/_static/dashboard_file.png
Binary file added docs/source/_static/dashboard_forecast.png
1 change: 0 additions & 1 deletion docs/source/examples

This file was deleted.

12 changes: 9 additions & 3 deletions docs/source/index.rst
@@ -30,8 +30,8 @@ You can install ``merlion`` from PyPI by calling ``pip install salesforce-merlio
cloning the Merlion `repo <https://github.com/salesforce/Merlion>`__ and calling ``pip install Merlion/``, or
``pip install -e Merlion/`` to install in editable mode. You may install additional optional dependencies via
``pip install salesforce-merlion[all]``, or by calling ``pip install "Merlion/[all]"`` if installing from source.
-Individually, the optional dependencies include ``plot`` for interactive plots
-and ``deep-learning`` for all deep learning models.
+Individually, the optional dependencies include ``dashboard`` for a GUI dashboard,
+``spark`` for a distributed computation backend with PySpark, and ``deep-learning`` for all deep learning models.

To install the data loading package ``ts_datasets``, clone the Merlion
`repo <https://github.com/salesforce/Merlion>`__ and call ``pip install -e Merlion/ts_datasets/``. This package must be
@@ -59,7 +59,13 @@ Note the following external dependencies:

Getting Started
---------------
-To get started, we recommend the linked tutorials on `anomaly detection <tutorials/anomaly/0_AnomalyIntro>`
+The easiest way to get started is to use the GUI web-based `dashboard <merlion.dashboard>`.
+This dashboard provides a great way to quickly experiment with many models on your own custom datasets.
+To use it, install Merlion with the optional ``dashboard`` dependency (i.e.
+``pip install salesforce-merlion[dashboard]``), and call ``python -m merlion.dashboard`` from the command line.
+You can view the dashboard at http://localhost:8050.
+
+For code resources, we recommend the linked tutorials on `anomaly detection <tutorials/anomaly/0_AnomalyIntro>`
and `forecasting <tutorials/forecast/0_ForecastIntro>`. After that, you should read in more detail about Merlion's
main data structure for representing time series `here <tutorials/TimeSeries>`.

78 changes: 78 additions & 0 deletions docs/source/merlion.dashboard.rst
@@ -0,0 +1,78 @@
merlion.dashboard package
=========================

This package includes a GUI dashboard app for Merlion, providing a convenient way to train
and test a time series forecasting or anomaly detection model supported in Merlion. To launch
the dashboard app, type the following command: ``python -m merlion.dashboard``.

It will launch a Dash app on http://localhost:8050/ by default. After opening the link, the app
will create a folder ``merlion`` in your home directory. This folder includes the datasets you want to
analyze or train a model with (in the ``data`` folder), and the trained models for time series
forecasting or anomaly detection (in the ``models`` folder).

The app has three tabs. The first one, "file manager", lets you upload your datasets
(which are stored in ``~/merlion/data``), check basic statistics of the datasets, visualize
the time series data, or download a particular trained model:

.. image:: _static/dashboard_file.png

You can click "Drag & Drop" to upload a file to the ``merlion`` folder (the app is designed to support
Docker deployment, so it does not allow opening a local file directly). If you use the app on a local
machine, you can also copy the data to ``~/merlion/data`` directly. Data files must be in
CSV format, where the first column is either integer Unix timestamps in milliseconds or datetimes in a
string format (e.g., "1970-01-01 00:00:00"). The other columns are the features/variables.
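As an illustration of the expected file layout, the following standard-library sketch reads a CSV whose first column holds either millisecond Unix timestamps or datetime strings. It is not the dashboard's own loader: the helpers `parse_timestamp` and `read_series`, the column names, and the choice to treat datetime strings as UTC are all assumptions made for this example.

```python
import csv
import io
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a first-column entry as ms Unix time or 'YYYY-MM-DD HH:MM:SS'."""
    try:
        millis = int(value)
    except ValueError:
        # Not an integer, so assume a datetime string (treated as UTC here).
        parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
        return parsed.replace(tzinfo=timezone.utc)
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


def read_series(text: str):
    """Return (timestamps, columns); columns maps each variable name to floats."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    names = header[1:]  # every column after the first is a feature/variable
    timestamps = []
    columns = {name: [] for name in names}
    for row in reader:
        timestamps.append(parse_timestamp(row[0]))
        for name, cell in zip(names, row[1:]):
            columns[name].append(float(cell))
    return timestamps, columns


if __name__ == "__main__":
    sample = "timestamp,x,y\n0,1.0,2.0\n60000,3.0,4.0\n"
    stamps, cols = read_series(sample)
    print(stamps[0].isoformat(), cols)
```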

Clicking the load button will load the dataset and show the time series figure on the right hand side.
It will also show some basic statistics, e.g., time series length, mean/std for each variable.
If you have already trained a model using the dashboard, you can select the model you want to download
and click the download button. The model and its configuration file will be compressed into a zip file.
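The download step amounts to bundling a model folder into a single archive. A minimal sketch with the standard library is shown below; `zip_model_dir` is a hypothetical helper, and the file names in the usage example are placeholders rather than the names the dashboard actually writes.

```python
import zipfile
from pathlib import Path


def zip_model_dir(model_dir, zip_path):
    """Compress every file under `model_dir` into `zip_path`; return archived names."""
    model_dir = Path(model_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(model_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the model folder so the archive
                # unpacks the same way on any machine.
                archive.write(path, path.relative_to(model_dir).as_posix())
        return archive.namelist()


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp) / "IsolationForest"
        folder.mkdir()
        (folder / "config.json").write_text("{}")  # placeholder file name
        (folder / "model.pkl").write_bytes(b"\x00")  # placeholder file name
        print(zip_model_dir(folder, Path(tmp) / "model.zip"))
```

The archive should be written outside the model folder; otherwise the directory walk would pick up the partially written zip file itself.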

The second tab is used to train a time series anomaly detection model:

.. image:: _static/dashboard_anomaly.png

The app provides full support for these models, where you can choose different algorithms and set particular parameters
according to your needs. To train a model, you need to:

- **Select the dataset**: If you have no separate test dataset, select a single dataset and choose
a train/test split fraction, which splits it into training and test datasets for evaluation.
If you have a separate test dataset, choose "Separate train/test files" and select the test dataset;
the model is then trained on the training dataset and evaluated on the separate test dataset.
The screenshot above uses a single data file, where we use the first 15% for training and the last 85% for testing.
- **Set the feature columns**: Merlion supports both univariate and multivariate time series anomaly detection,
so you can choose one or more features on which to train an anomaly detection model.
- **Set the label column**: If the dataset has a label column, you can set it for evaluation. Otherwise,
ignore this setting.
- **Select an anomaly detection algorithm**: You need to choose an anomaly detection algorithm such as
IsolationForest. You may modify the model's hyperparameters if the default values do not work well.
- **Set threshold parameters**: You can also test different settings for the detection threshold to
determine which value is better for your specific application. Note that updating the threshold will
not re-train the entire model; it will simply change the post-processing applied by the trained model.
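The single-file split and the threshold step described in the list above can be sketched in a few lines. This is an illustration under stated assumptions, not the dashboard's code: `split_by_fraction` and `apply_threshold` are hypothetical helpers, the split is assumed to be chronological (the first rows are used for training), and thresholding is assumed to zero out scores whose magnitude falls below the threshold.

```python
def split_by_fraction(values, train_fraction):
    """Split a chronologically ordered sequence into (train, test) parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    n_train = int(len(values) * train_fraction)
    return values[:n_train], values[n_train:]


def apply_threshold(scores, threshold):
    """Keep scores with magnitude >= threshold and set all others to 0."""
    return [s if abs(s) >= threshold else 0.0 for s in scores]


if __name__ == "__main__":
    series = list(range(40))
    train, test = split_by_fraction(series, 0.15)  # first 15% for training
    print(len(train), len(test))

    # Pretend these anomaly scores came from an already trained model.
    raw_scores = [0.5, -3.2, 4.0, 2.9]
    for threshold in (2.0, 3.0):
        # Only the post-processing is repeated; nothing is retrained.
        print(threshold, apply_threshold(raw_scores, threshold))
```

Because `apply_threshold` only touches the scores, trying a new threshold reuses the scores the trained model has already produced, which is why no retraining is needed.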

The training procedure begins after clicking the train button, and the trained model is saved in the
folder ``~/merlion/models/algorithm_name``. The figure on the right hand side shows the detection results
on the test dataset, and the tables show the training and testing performance metrics if you set the
label column.

The third tab is used to train a time series forecasting model supported in Merlion:

.. image:: _static/dashboard_forecast.png

The app provides full support for these models, where you can choose different algorithms and set particular parameters
according to your needs. To train a model, you need to:

- **Select the dataset**: If you have no separate test dataset, select a single dataset and choose
a train/test split fraction, which splits it into training and test datasets for evaluation.
If you have a separate test dataset, choose "Separate train/test files" and select the test dataset;
the model is then trained on the training dataset and evaluated on the separate test dataset.
The screenshot above uses separate train/test files.
- **Set the target column**: You need to set the target column whose value you wish to forecast (required),
any additional features to use for `multivariate forecasting <tutorials/forecast/2_ForecastMultivariate>` (optional),
and the `exogenous variables <tutorials/forecast/3_ForecastExogenous>` whose values are known a priori (optional).
- **Select a forecasting algorithm**: Finally, you need to choose a forecasting algorithm such as
Arima or AutoETS. You may modify the model's hyperparameters if the default values do not work well.
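To make the role of exogenous variables in the list above concrete: their values are known in advance, so the values for the forecast period can be handed to the model even though the target is not yet observed there. The sketch below uses plain Python data and a hypothetical helper `forecast_inputs`; the column names `sales` and `holiday` are invented for the example, and none of this is the dashboard's code.

```python
def forecast_inputs(rows, target, exog, n_train):
    """Return (target history, future exogenous values) for a forecast.

    `rows` is a chronologically ordered list of dicts. The first `n_train`
    rows are history; the remaining rows form the forecast horizon.
    """
    history = [row[target] for row in rows[:n_train]]
    # Exogenous values are known a priori, so they are read from the
    # *future* rows, where the target itself is still unknown.
    future_exog = [{name: row[name] for name in exog} for row in rows[n_train:]]
    return history, future_exog


if __name__ == "__main__":
    rows = [
        {"sales": 10, "holiday": 0},
        {"sales": 12, "holiday": 0},
        {"sales": None, "holiday": 1},  # target unknown, holiday flag known
    ]
    print(forecast_inputs(rows, target="sales", exog=["holiday"], n_train=2))
```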

The training procedure begins after clicking the train button. It may take some time to finish model
training. After the model is trained, the model files will be saved in the folder ``~/merlion/models/algorithm_name``.
The figure on the right hand side shows the forecasting results on the test dataset, and the tables
show the training and testing performance metrics.
8 changes: 8 additions & 0 deletions docs/source/merlion.rst
@@ -17,6 +17,9 @@ each associated with its own sub-package:
detection and forecasting.
- :py:mod:`merlion.models.automl`: AutoML layers for various models

+- :py:mod:`merlion.dashboard`: A GUI dashboard app for Merlion, which can be started with
+``python -m merlion.dashboard``. This dashboard provides a good way to quickly experiment with many models on a new
+time series.
- :py:mod:`merlion.spark`: APIs to integrate Merlion with PySpark for using distributed computing to run training
and inference on multiple time series in parallel.
- :py:mod:`merlion.transform`: Data pre-processing layer which implements many standard data transformations used in
@@ -55,6 +58,11 @@ Subpackages
:maxdepth: 4

merlion.models
+
+.. toctree::
+:maxdepth: 2
+
+merlion.dashboard
merlion.spark
merlion.transform
merlion.post_process