The official implementation of: "Kolmogorov–Smirnov GAN"

by Maciej Falkiewicz and Naoya Takeishi and Alexandros Kalousis

The arXiv preprint can be found under this link.

Project structure

The experimental pipeline

The project is governed with Data Version Control (DVC) with the pipeline defined in dvc.yaml and parameterized with a global config file: experiments/configs/global.yaml. There are two types of datasets: synthetic datasets and real-world datasets which have independent pipelines, both defined in dvc.yaml.

Synthetic datasets pipeline

flowchart TD
	node1["evaluate@{model_id}-{dataset_name}-{simulation_budget}-{model_seed}"]
	node2["generate@{dataset_name}"]
	node3["preprocess@{dataset_name}"]
	node4["train@{model_id}-{dataset_name}-{simulation_budget}-{model_seed}"]
	node5["visualize-data@{dataset_name}"]
	node6["visualize-evaluation@{model_id}-{dataset_name}-{simulation_budget}-{model_seed}"]
	node7["visualize-training@{model_id}-{dataset_name}-{simulation_budget}-{model_seed}"]
	node1-->node6
	node2-->node3
	node3-->node1
	node3-->node4
	node3-->node5
	node4-->node1
	node5-->node1
	node4-->node7

The possible values of the variables are:

{model_id}: "model1" (GAN), "model2" (WGAN), "model3" (KSGAN)
{dataset_name}: "swissroll", "circles", "rings", "moons", "8gaussians", "pinwheel", "2spirals", "checkerboard"
{simulation_budget}: "512", "1024", "2048", "16384", "65536"
{model_seed}: "0", "1", "2", "3", "4"

Real-world datasets pipeline

flowchart TD
	node1["evaluate-real@{model_id}-{dataset_name}-{model_seed}"]
	node2["preprocess-real@{dataset_name}"]
	node3["train-real@{model_id}-{dataset_name}-{model_seed}"]
	node4["visualize-training-real@{model_id}-{dataset_name}-{model_seed}"]
	node2-->node1
	node2-->node3
	node3-->node1
	node3-->node4

The possible values of the variables are:

{model_id}: "model1" (GAN), "model2" (WGAN), "model3" (KSGAN)
{dataset_name}: "mnist", "cifar10"
{model_seed}: "0", "1", "2", "3", "4"

Execution on a slurm orchestrated cluster

To execute the synthetic datasets pipeline you can use experiments/scripts/pipeline.py script
To execute the real-world datasets pipeline you can use experiments/scripts/pipeline-real.py script
- The evaluate-real@{model_id}-{dataset_name}-{model_seed} stage is not triggerred by the script

Attention: the scripts assume that preprocess@{dataset_name} (preprocess-real@{dataset_name}) stages have been already executed!

Please mind that this way of execution bypasses DVC, and thus requires commiting the changes in order to control versions.

In CPU_PARTITIONS and GPU_PARTITIONS environmental variables you should specify the available CPU and GPU partitions.

Source code

Implementations of the synthetic datasets simulators: src/simulators/
Implementations of the generative models:
- GAN: src/models/gan.py
- Wasserstein GAN: src/models/wgan.py
- Kolmogorov–Smirnov GAN: src/models/ksgan.py (the proposed method)
Scripts used to run the experiments: experiments/scripts
Training utilities: src/training/utils.py
Evaluation utilities: src/evaluation/utils.py

Environment

All the python dependencies are listed in requirements.txt file.

Data

The DVC cache for the project can be downloaded from here.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.dvc		.dvc
data		data
experiments		experiments
notebooks		notebooks
scripts		scripts
src		src
.dvcignore		.dvcignore
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
dvc.lock		dvc.lock
dvc.yaml		dvc.yaml
hypothesis.mplstyle		hypothesis.mplstyle
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

The official implementation of: "Kolmogorov–Smirnov GAN"

Project structure

The experimental pipeline

Synthetic datasets pipeline

Real-world datasets pipeline

Execution on a slurm orchestrated cluster

Source code

Environment

Data

About

Releases

Packages

Languages

License

DMML-Geneva/ksgan

Folders and files

Latest commit

History

Repository files navigation

The official implementation of: "Kolmogorov–Smirnov GAN"

Project structure

The experimental pipeline

Synthetic datasets pipeline

Real-world datasets pipeline

Execution on a slurm orchestrated cluster

Source code

Environment

Data

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages