This repository contains the code used for the paper DAEMA: Denoising Autoencoder with Mask Attention. The code documentation, generated with Sphinx, is available here.
Please cite as:
@article{tihon2021daema,
  title={DAEMA: Denoising Autoencoder with Mask Attention},
  author={Tihon, Simon and Javaid, Muhammad Usama and Fourure, Damien and Posocco, Nicolas and Peel, Thomas},
  journal={arXiv preprint arXiv:2106.16057},
  year={2021}
}
Create and activate a conda environment with Python 3.8.2:
conda create --name <env-name> python=3.8.2
conda activate <env-name>
Install the libraries listed in requirements.txt:
pip install -r requirements.txt
Run the code:
cd src
python run.py
The repository also contains a Dockerfile to run the code:
docker build -t <image_name>:<tag> .
docker run -t --name <container-name> <image_name> <experiment-to-run>
Example:
docker build -t daema:latest .
docker run -t --name daema_container daema:latest python run.py
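Any of the experiment commands listed below can be passed to the container as <experiment-to-run>; for instance, with an illustrative container name and the DAE command from the list that follows:
docker run -t --name daema_dae daema:latest python run.py --daema_attention_mode no --daema_ways 1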
You can test your installation by running:
PYTHONPATH=src/ pytest tests
To reproduce the experiments reported in the paper, use the following commands.
- DAEMA:
python run.py
- DAE:
python run.py --daema_attention_mode no --daema_ways 1
- AimNet:
python run.py --model Holoclean --batch_size 0 --lr 0.05 --metric_steps 18 19 20 21 22
- MIDA:
python run.py --model MIDA --batch_size -1 --metric_steps 492 494 496 498 500 --scaler MinMax
- MissForest:
python run.py --model MissForest --metric_steps 0 --scaler MinMax
- Mean:
python run.py --model Mean --metric_steps 0
- Real:
python run.py --model Real --metric_steps 0
- MNAR setting: same commands as above, with the additional argument:
--ms_setting mnar
- Different missingness proportions: same commands as above, with the additional argument (e.g. for 10% missingness):
--ms_prop 0.1
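For instance, the Mean baseline with 10% missingness combines the commands above as:
python run.py --model Mean --metric_steps 0 --ms_prop 0.1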
Ablation variants of DAEMA (attention mode, loss and artificial missingness):
- Full:
python run.py
- Classic:
python run.py --daema_attention_mode classic
- Sep.:
python run.py --daema_attention_mode sep
- DAEMA:
python run.py
- Reduced loss:
python run.py --daema_loss_type dropout_only
- Full loss:
python run.py --daema_loss_type full
- No art. miss.:
python run.py --daema_pre_drop 0
To test the code on a local dataset:
- put the dataset in files/data/<name>.csv;
- update the src/pipeline/datasets/DATASETS variable to add your dataset (see the sketch after this list);
- run the tests;
- use the --datasets argument to select it for the experiments (e.g. python run.py --datasets <name>).
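Purely as an illustration, assuming DATASETS is a dictionary mapping the name used with --datasets to a loading function that returns a pandas DataFrame (check src/pipeline/datasets for the actual structure and mirror an existing entry), a new entry could look roughly like this:
```python
# Hypothetical sketch only: mirror the existing entries of DATASETS in
# src/pipeline/datasets, which may use a different structure than shown here.
import pandas as pd

def load_my_dataset():
    # Load the CSV placed in files/data/<name>.csv (here, "my_dataset.csv").
    return pd.read_csv("files/data/my_dataset.csv")

DATASETS = {
    # ... existing entries ...
    "my_dataset": load_my_dataset,  # selected with: python run.py --datasets my_dataset
}
```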
To test the code on a custom model:
- implement the model following the expected interface (see src/models/baseline_imputations/Identity for the basic structure, and the sketch after this list);
- update the src/models/__init__/MODELS variable to add your model;
- run the tests;
- use the --model argument to select it for the experiments (e.g. python run.py --model <Name>).
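As a rough illustration only (the actual interface is defined by the existing models such as Identity; the method names and (data, mask) arguments below are assumptions to be adapted to the codebase), a minimal imputation model could look like this:
```python
# Hypothetical sketch: adapt the class to the interface actually expected by the
# pipeline (see src/models/baseline_imputations/Identity); the fit/transform
# methods and the (data, mask) arguments used here are assumptions.
import numpy as np

class MeanPerFeature:
    """Toy imputer that replaces missing values with per-feature means."""

    def fit(self, data, mask):
        # Assumed convention: mask is 1 where the value is observed, 0 where missing.
        observed = np.where(mask.astype(bool), data, np.nan)
        self.means = np.nanmean(observed, axis=0)
        return self

    def transform(self, data, mask):
        # Fill the missing entries with the means learned during fit.
        return np.where(mask.astype(bool), data, self.means)
```
Once added to MODELS (mirroring the existing entries), such a model would be selected with python run.py --model MeanPerFeature.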