Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added anomaly display cell and comments #1493

Conversation

piaz97
Copy link
Contributor

@piaz97 piaz97 commented Jan 18, 2023

Summary

Review of the AD example notebook.

Details

  • I've added a cell allowing the user to select and visualize a particular anomaly in the taxi dataset: Screenshot 2023-01-18 at 12 10 12
    In this way, we can help the user understand the anomalies in the dataset we are showcasing.

Added some comments in the cells (in red). In general, I would suggest to:

  • Make the train test split clearer to the user. Right now, it might be confusing to get an overview of which split of the initial data is used for what. If you have time, would be nice to attach an image of the split, explaining what is used for model training, scorer training, test, etc. Otherwise, I would group the train test split in the same cell at least.
  • If we choose to keep the cell visualizing a particular anomaly, it could be nice to add a second cell visualizing, for each anomaly, the ground truth signal, the model predictions, ground truth anomaly labels, anomaly score, and detected anomalies.
  • I don't see the usage of the detectors in the notebook. Could we add them, at least in the first example?
  • When printing AUC ROC/PR with the multiple scorers in the first example, would be nice to keep the same printing format in the cells presenting the 'automatic' approach and the 'manual' approach, so that it's easier for the user to compare the results.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@julien12234 julien12234 merged commit 8a96091 into example/anomaly_detection_example Jan 26, 2023
@julien12234 julien12234 deleted the example/anomaly_detection_example_comments branch January 26, 2023 11:35
dennisbader added a commit that referenced this pull request May 10, 2024
* Small fix in utils.py

* Factorize tests

* Correct format

* Dataset taxiNY

* jupyter notebook addition, XX-anomaly-detection.ipynb

* relocated  XX-anomaly-detection.ipynb

* Fix NormScorer proba input, and show_anomaly function

* Fix NormScorer proba input, and show_anomaly function

* Refactor window of Wasserstein, Kmeans and PyOD Scorers

* Refactor test

* Added anomaly display cell and comments (#1493)

* Added anomaly display cell and comments

* Added samuele comments

Co-authored-by: julien12234 <julien.adda@gmail.com>

* Added images, and Julien's recommendation

* Added parameter window_transform, git statusChange the default windowing methodgit status

* fix: solve error due to merge conflict and apply linting

* round of Julien_H's comment

* with images

* states to values

* Committing old local changes

* Small fix

* fix: reduced code redundancy between the two detectors, renamed the method eval_accuracy to eval_metric

* refactor: simplified class hierarchy, added a bit of type hinting, fixed bug in predict

* feat: migrated tests from unittest to pytest framework for the aggregators

* feat: parametrized tests to reduce code repetition

* fix: added docstring, increased test granularity

* fix: bug in fittableaggreg predict sanity check

* refactor: renamed eval_accuracy to eval_metric, removed NonFittableScorer class

* fix: changed tests after eval_accuracy function name change

* refactor: changed scorers tests from unittest to pytest framework

* fix: all non fittable anomaly scorer are tested

* refactor: renamed eval_accuracy eval_metric

* refactor: changed framework from unittest to pytest

* fix: typo

* refactor: changed test framework from unittest to pytest

* refactor: reduced code redundancy by using pytest.mark.parametrize

* refactor: reduced redundant code in kmeans, pyod and wasserstein scorers

* fix: logging

* fix: ad module use series2seq instead of its own util method

* refactor: single show_anomalies method across anomaly model classes

* fix: modularized scorer training, fixed logging

* fix: indentation error

* feat: parallelize training of scorers

* feat: parallelize scorer score method for component-wise multivariate

* feat: parallelize and/or aggregators predict_core method

* feat: simplified aggregation of anomaly scorer, added corresponding tests

* Apply suggestions from code review

Co-authored-by: Samuele Giuliano Piazzetta <samuele.piazzetta@gmail.com>

* fix lint

* update docs init and utils

* refactor ad utils

* refactor aggregators

* refactor aggregators

* refactor aggregators

* refactor detectors

* refactor module docs

* refactor anomaly models

* refactor anomaly models

* refactor scorers

* update diff_fn for scorers

* refactor WindowAnomalyScorer

* refactor tabularization for scorers

* improve scorer docs

* refactor score_from_prediction

* use slice_intersect_values in ad evaluation

* further code clean up

* further code clean up

* make API consistent

* refactor show anomalies api

* refactor eval_metric api for anomaly models

* refactor eval_metric api for anomaly scorers

* refactor eval_metric api for anomaly detectors

* refactor eval_metric api for anomaly aggregator

* enfore GlobalForecastingModel for AnomalyModel

* update changelog

* remove  prefix in AD API to keep unified  and covariates parameter names

* apply suggestions from PR review

* final updates

* improve docs

* revert changes

* prepare example notebook

* update changelog

* add taxi dataset test

---------

Co-authored-by: Samuele Giuliano Piazzetta <samuele.piazzetta@gmail.com>
Co-authored-by: madtoinou <32447896+madtoinou@users.noreply.github.com>
Co-authored-by: Antoine Madrona <antoine.madrona@epfl.ch>
Co-authored-by: Julien Sven Adda <julien.adda@epfl.ch>
Co-authored-by: madtoinou <antoine.madrona@unit8.co>
Co-authored-by: dennisbader <dennis.bader@gmx.ch>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants