Merlion dashboard app #129
Thanks for the contribution, Wenzhuo! Here are some of my initial comments:
- Can you update the PR description to outline what you've done, how the code is organized, and what the major files are for?
- Why is the dashboard a separate folder, rather than a part of the Merlion package? Specifically, I'm wondering if it's possible to do something like `python -m merlion.dashboard` instead of `python app.py` to launch the app server. You could include a subprocess call in `merlion/dashboard/__main__.py` like this StackOverflow answer. Note that you could list `dashboard` as an optional dependency in Merlion's `setup.py` and throw an import error in `merlion/dashboard/__init__.py` if the dashboard dependencies are not installed. If you do this, please also change all relative import paths to absolute paths.
- Can you install pre-commit and make sure the formatting & copyright headers are applied to all Python files? See here.
- Can you provide an overview of this dashboard in the repo's main `README.md`? I'm thinking you can add this as a new section before "Getting Started". And you can reproduce the same information in `docs/source/index.rst`.
- What is the purpose of `test_anomaly.py` and `test_forecast.py`? They seem redundant with existing test coverage. `test_models.py` makes sense, though, since it's testing your new model classes.
- Can you move the new tests to the main `tests` folder instead of `dashboard/tests`? This also follows from point (2).
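A minimal sketch of the `python -m merlion.dashboard` suggestion above, assuming the Dash app module lives at `merlion.dashboard.app` (a hypothetical path for illustration, not Merlion's actual layout):

```python
# Hypothetical sketch of merlion/dashboard/__main__.py, following the
# reviewer's suggestion: re-invoke the interpreter on the app module via a
# subprocess, so that `python -m merlion.dashboard` launches the server.
import subprocess
import sys

def dashboard_command():
    # Build the command equivalent to running `python app.py` directly.
    return [sys.executable, "-m", "merlion.dashboard.app"]

def main():
    # check_call blocks until the server process exits and raises on failure.
    subprocess.check_call(dashboard_command())

# A real __main__.py would call main() unconditionally when executed.
```

Pairing this with an import-error check in `merlion/dashboard/__init__.py` means users who installed Merlion without the `dashboard` extra get a clear message rather than a raw `ModuleNotFoundError`.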
Move merlion/dashboard/dashboard to merlion/dashboard, change all relative imports to absolute imports, move Dockerfiles to a separate folder.
@yangwenzhuo08 thanks for your changes! This looks great. I've finished what you started in terms of restructuring the module. Now, in terms of my original comments, can you add the documentation I requested previously? Besides this, I have a couple of new requests.
@aadyotb Thanks for the revision. For the forecasting tab, we can split the train file and test file as the anomaly tab does. To combine these two UIs (upload two files, or upload a single file with a split fraction), I'm not sure what layout is better. Do you have a suggestion on the UI design for this part? For forecasting, it may be straightforward: e.g., we have two dropdown lists, one for the train file and the other for the test file, and then a slider to set the split fraction, which is used to split the training data into "train" and "validation". But for anomaly detection, such a split has a problem when the number of labels is small, i.e., it is possible that the split validation dataset has no anomalies.
@yangwenzhuo08 I envision something like the following: you can have a radio box which selects either "use same file for train/test" or "use separate test file". If you select "use same file for train/test", you get a slider where you specify the train/test fraction. If you select "use separate test file", you get a prompt to choose the test file, and the module should throw an error if the test data is not given. What do you think? In terms of anomaly detection, it's a well-known issue that the labels are sparse. The evaluation metrics are implemented in such a way that they have reliable fallback options if there are no true positives present in the data. Maybe you can use the
So the layout is like this:
Yes, this sounds good.
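The agreed-upon selection logic could be sketched roughly as follows; the function name, arguments, and defaults are assumptions for illustration, not the dashboard's actual code:

```python
def resolve_train_test(train_data, test_data=None, use_separate_test=False, train_frac=0.8):
    """Mirror the proposed radio-box behavior: either split one file, or require a second file."""
    if use_separate_test:
        if test_data is None:
            # "use separate test file" was selected but no test file was chosen
            raise ValueError("A test file must be provided when 'use separate test file' is selected")
        return train_data, test_data
    # "use same file for train/test": split by the slider's fraction
    n = int(len(train_data) * train_frac)
    return train_data[:n], train_data[n:]
```

The error branch matches the reviewer's request that the module fail loudly when "use separate test file" is selected without a file.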
For endogenous variables X and exogenous variables Y, the old implementation of sklearn_base predicted X_t = f(X_{t-1}, Y_{t-1}). Now, we predict X_t = f(X_{t-1}, Y_t), i.e. we actually use the future value of the exogenous regressors.
Now, the user can manually select which features they want to use for multivariate forecasting (instead of just using all non-exogenous features by default).
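The exogenous-regressor change above can be illustrated with a small sketch (a hypothetical helper, not `sklearn_base`'s actual code): the old behavior paired X_{t-1} with the past exogenous value Y_{t-1}, while the new behavior pairs X_{t-1} with the future value Y_t.

```python
import numpy as np

def make_features(X, Y, use_future_exog=True):
    """Build one-step-ahead (features, targets) pairs from endogenous X and exogenous Y.

    use_future_exog=True  -> new behavior: X_t = f(X_{t-1}, Y_t)
    use_future_exog=False -> old behavior: X_t = f(X_{t-1}, Y_{t-1})
    X and Y are 2-D arrays whose rows are time steps.
    """
    exog = Y[1:] if use_future_exog else Y[:-1]
    features = np.concatenate([X[:-1], exog], axis=1)
    targets = X[1:]
    return features, targets
```

Using Y_t is sensible when the exogenous regressors are known in advance (e.g. calendar features or planned events), since their future values are available at prediction time.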
This PR implements a web-based visualization dashboard for Merlion. Users can set it up by installing Merlion with the optional `dashboard` dependency, i.e. `pip install salesforce-merlion[dashboard]`. Then, they can start it with `python -m merlion.dashboard`, which launches the dashboard on port 8050. The dashboard has 3 tabs: a file manager where users can upload CSV files & visualize time series; a forecasting tab where users can try different forecasting algorithms on different datasets; and an anomaly detection tab where users can try different anomaly detection algorithms on different datasets. This dashboard thus provides a no-code interface for users to rapidly experiment with different algorithms on their own data, and to examine performance both qualitatively (through visualizations) and quantitatively (through evaluation metrics).

We also provide a Dockerfile which runs the dashboard as a microservice on port 80. The Docker image can be built with `docker build . -t merlion-dash -f docker/dashboard/Dockerfile` from the Merlion root directory. It can be deployed with `docker run -dp 80:80 merlion-dash`.