This is the implementation of the ICDM 2020 paper *Meta-AAD: Active Anomaly Detection with Deep Reinforcement Learning*. We propose to learn a meta-policy with deep reinforcement learning to optimize the performance of active anomaly detection. Please refer to the paper for more details.
📢 Do you want to learn more about data labeling? Please check out our data-centric AI survey and data-centric AI resources!
If you find this project helpful, please cite
```bibtex
@inproceedings{DBLP:conf/icdm/ZhaLWH20,
  author    = {Daochen Zha and
               Kwei{-}Herng Lai and
               Mingyang Wan and
               Xia Hu},
  editor    = {Claudia Plant and
               Haixun Wang and
               Alfredo Cuzzocrea and
               Carlo Zaniolo and
               Xindong Wu},
  title     = {Meta-AAD: Active Anomaly Detection with Deep Reinforcement Learning},
  booktitle = {20th {IEEE} International Conference on Data Mining, {ICDM} 2020,
               Sorrento, Italy, November 17-20, 2020},
  pages     = {771--780},
  publisher = {{IEEE}},
  year      = {2020},
  url       = {https://doi.org/10.1109/ICDM50108.2020.00086},
  doi       = {10.1109/ICDM50108.2020.00086},
  timestamp = {Wed, 17 Feb 2021 11:24:58 +0100},
  biburl    = {https://dblp.org/rec/conf/icdm/ZhaLWH20.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
Make sure you have Python 3.5+ installed.

```bash
git clone https://github.com/daochenzha/Meta-AAD.git
cd Meta-AAD
pip install -r requirements.txt
pip install -e .
```
Train a meta-policy with `train.py`. The important arguments are as follows (an example invocation is sketched after the list).

*   `--train`: the datasets used for training, separated by commas.
*   `--test`: the datasets used for testing, separated by commas.
*   `--num_timesteps`: the number of training steps of the reinforcement learning agent.
*   `--log`: where the log and models will be output.
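For instance, a training run might look like the sketch below. This is only an illustrative example: the dataset names (`toy`, `thyroid`, `annthyroid`) and the timestep count are placeholders, not values prescribed by the repo.

```bash
# Illustrative only: dataset names and the timestep count are placeholders,
# not values prescribed by the repo.
python train.py --train toy,thyroid --test annthyroid --num_timesteps 100000 --log log/
```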
By default, the reinforcement learning training log will be saved in `log/`, the anomaly discovery curves will be saved in `log/anomaly_curves/`, and the trained model will be saved in `log/`.
You may evaluate a trained model with `evaluate.py`. The important arguments are as follows (an example command follows the list).

*   `--load`: the path to the `model.zip` file.
*   `--test`: the datasets used for testing, separated by commas.
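For example, an evaluation run might look like the sketch below; the model path and dataset name are placeholders, assuming the trained model was saved to `log/` as described above.

```bash
# Illustrative only: the model path and dataset name are placeholders.
python evaluate.py --load log/model.zip --test annthyroid
```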
We provide two baselines in this repo for comparison: a random query strategy and an IForest query strategy. They are available in `evaluate_baselines.py`. For other baselines, please refer to the following repos.