FluxML · ToucheSir · Sep 28, 2021 · Jun 12, 2021
diff --git a/vision/cdcgan_mnist/README.md b/vision/cdcgan_mnist/README.md
@@ -0,0 +1,43 @@
+# Conditional DCGAN
+
+<img src="../cdcgan_mnist/output/img_for_readme.png" width="440"/>
+
+## Model Info
+
+Generative Adversarial Networks (GANs) consist of two models, a _Generator G(z)_ and a _Discriminator D(x)_, trained in competition with each other. G tries to capture the distribution of the training data, while D estimates the probability that a sample came from the training data rather than from G. During training, the Generator learns a mapping from a _prior distribution p(z)_ to the _data space G(z)_, and D(x) outputs the probability that a given x comes from the actual training data.
+This model can be extended with an additional input y on which both networks are conditioned. y can be any kind of auxiliary information, for example class labels. _The conditioning is achieved by feeding y to both the Generator, G(z|y), and the Discriminator, D(x|y)_.
+
+## Training
+
+```shell
+cd vision/cdcgan_mnist
+julia --project cGAN_mnist.jl
+```
+
+## Results
+
+1000 training steps
+
+![1000 training steps](../cdcgan_mnist/output/cgan_steps_001000.png)
+
+3000 training steps
+
+![3000 training steps](../cdcgan_mnist/output/cgan_steps_003000.png)
+
+5000 training steps
+
+![5000 training steps](../cdcgan_mnist/output/cgan_steps_005000.png)
+
+10000 training steps
+
+![10000 training steps](../cdcgan_mnist/output/cgan_steps_010000.png)
+
+11725 training steps
+
+![11725 training steps](../cdcgan_mnist/output/cgan_steps_011725.png)
+
+## References
+
+[Conditional Generative Adversarial Nets by Mehdi Mirza et al.](https://arxiv.org/pdf/1411.1784.pdf)
+
+[Training a Conditional DC-GAN on CIFAR-10 (Medium)](https://medium.com/@utk.is.here/training-a-conditional-dc-gan-on-cifar-10-fce88395d610)
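
To make the conditioning described in the Model Info section above concrete, here is a minimal sketch in plain Julia of one common way to feed y to both networks: concatenating a one-hot label onto each network's input. The helper names (`onehot`, `generator_input`, `discriminator_input`), the latent size of 100, and the random stand-in image are assumptions for illustration; `cGAN_mnist.jl` may combine labels and inputs differently.

```julia
using Random

# One-hot encode a digit label in 0:9 as a 10-element Float32 vector.
onehot(label::Integer, nclasses::Integer = 10) =
    Float32[i == label + 1 ? 1 : 0 for i in 1:nclasses]

# Input for G(z|y): the latent noise z stacked on the one-hot label y.
generator_input(z::AbstractVector, label::Integer) = vcat(z, onehot(label))

# Input for D(x|y): the flattened image x stacked on the same one-hot label.
discriminator_input(x::AbstractMatrix, label::Integer) = vcat(vec(x), onehot(label))

rng = MersenneTwister(0)
z = randn(rng, Float32, 100)      # latent vector drawn from the prior p(z) (size assumed)
x = rand(rng, Float32, 28, 28)    # random stand-in for one 28×28 MNIST image

g_in = generator_input(z, 7)
d_in = discriminator_input(x, 7)

println(length(g_in))   # 110 = 100 latent dims + 10 label dims
println(length(d_in))   # 794 = 784 pixels + 10 label dims
```

Because both networks see the same label, the generator can be asked for a specific digit at sampling time by fixing y and varying only z.
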
diff --git a/vision/cdcgan_mnist/cGAN_mnist.jl b/vision/cdcgan_mnist/cGAN_mnist.jl
@@ -185,5 +185,8 @@ function train(; kws...)
  return Flux.onecold.(cpu(fixed_labels))
 end 
 
-cd(@__DIR__)
-fixed_labels = train()
+if abspath(PROGRAM_FILE) == @__FILE__
+ train()
+end
+
+
diff --git a/vision/cdcgan_mnist/output/img_for_readme.png b/vision/cdcgan_mnist/output/img_for_readme.png
diff --git a/vision/conv_mnist/README.md b/vision/conv_mnist/README.md
@@ -0,0 +1,24 @@
+# LeNet-5
+
+![LeNet-5](../conv_mnist/docs/LeNet-5.png)
+
+## Model Info
+
+At a high level, LeNet-5 consists of two parts:
+(i) _a convolutional encoder consisting of two convolutional layers_;
+(ii) _a dense block consisting of three fully-connected layers_.
+
+Each convolutional block consists of a convolutional layer with a 5×5 kernel, a sigmoid activation function, and a subsequent average pooling operation. The convolutional layers map spatially arranged inputs to a number of two-dimensional feature maps, typically increasing the number of channels: the first convolutional layer has 6 output channels and the second has 16. Each 2×2 pooling operation (stride 2) reduces the spatial dimensionality by a factor of 4 via downsampling. The convolutional block emits an output with shape (batch size, number of channels, height, width); note that Flux stores the same data in the order (width, height, number of channels, batch size).
+
+## Training
+
+```shell
+cd vision/conv_mnist
+julia --project conv_mnist.jl
+```
+
+## References
+
+[Gradient-Based Learning Applied to Document Recognition by Yann LeCun et al.](http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf)
+
+[d2l.ai](https://d2l.ai/chapter_convolutional-neural-networks/lenet.html)
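
The layer arithmetic in the Model Info section above can be checked with a short shape trace. This is a sketch in plain Julia under stated assumptions: a 28×28 MNIST input and a padding of 2 on the first convolution (as in the linked d2l.ai chapter), neither of which the README specifies. `out_size` and `lenet_feature_shapes` are hypothetical helper names, and `conv_mnist.jl` itself may use different settings.

```julia
# Output size of a convolution or pooling window along one spatial dimension.
out_size(n, k; pad = 0, stride = 1) = (n + 2 * pad - k) ÷ stride + 1

# Trace the (width, height, channels) shape after each layer of the encoder.
function lenet_feature_shapes(n = 28)
    shapes = Tuple{Int,Int,Int}[]
    n = out_size(n, 5; pad = 2)        # conv 1: 5×5 kernel, 6 output channels
    push!(shapes, (n, n, 6))
    n = out_size(n, 2; stride = 2)     # 2×2 average pooling, stride 2
    push!(shapes, (n, n, 6))
    n = out_size(n, 5)                 # conv 2: 5×5 kernel, 16 output channels
    push!(shapes, (n, n, 16))
    n = out_size(n, 2; stride = 2)     # 2×2 average pooling, stride 2
    push!(shapes, (n, n, 16))
    return shapes
end

shapes = lenet_feature_shapes()
println(shapes)                # [(28, 28, 6), (14, 14, 6), (10, 10, 16), (5, 5, 16)]
println(prod(last(shapes)))    # 400 features enter the dense block
```

Each pooling step halves both width and height, which is the factor-of-4 reduction mentioned in the text.
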
diff --git a/vision/conv_mnist/conv_mnist.jl b/vision/conv_mnist/conv_mnist.jl
@@ -157,4 +157,7 @@ function train(; kws...)
  end
 end
 
-train()
+if abspath(PROGRAM_FILE) == @__FILE__
+ train()
+end
+
diff --git a/vision/conv_mnist/docs/LeNet-5.png b/vision/conv_mnist/docs/LeNet-5.png
diff --git a/vision/dcgan_mnist/README.md b/vision/dcgan_mnist/README.md
@@ -0,0 +1,38 @@
+# Deep Convolutional GAN
+
+![dcgan_gen_disc](../dcgan_mnist/output/dcgan_generator_discriminator.png)
+
+## Model Info
+
+A DCGAN is a direct extension of the GAN that explicitly uses convolutional layers in the discriminator and convolutional-transpose layers in the generator. _The discriminator is made up of strided convolution layers, batch norm layers, and LeakyReLU activations_. In the reference architecture, its input is a 3x64x64 image and its output is a scalar probability that the input comes from the real data distribution. _The generator is composed of convolutional-transpose layers, batch norm layers, and ReLU activations_. Its input is a latent vector, _z_, drawn from a standard normal distribution, and its output is a 3x64x64 RGB image (MNIST digits, by contrast, are 28x28 grayscale images). The strided conv-transpose layers transform the latent vector into a volume with the same shape as an image.
+
+## Training
+
+```shell
+cd vision/dcgan_mnist
+julia --project dcgan_mnist.jl
+```
+
+## Results
+
+2000 training steps
+
+![2000 training steps](../dcgan_mnist/output/dcgan_steps_002000.png)
+
+5000 training steps
+
+![5000 training steps](../dcgan_mnist/output/dcgan_steps_005000.png)
+
+8000 training steps
+
+![8000 training steps](../dcgan_mnist/output/dcgan_steps_008000.png)
+
+9380 training steps
+
+![9380 training steps](../dcgan_mnist/output/dcgan_steps_009380.png)
+
+## References
+
+[Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks by Alec Radford et al.](https://arxiv.org/pdf/1511.06434v2.pdf)
+
+[pytorch.org/tutorials/beginner/dcgan_faces_tutorial](https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html)
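
The last sentence of the Model Info section above (strided conv-transpose layers growing a latent vector into an image-sized volume) can be illustrated with a size trace. This is a sketch in plain Julia; the (kernel, stride, pad) settings are assumed from the linked PyTorch tutorial's 64x64 generator, and `convtranspose_out` and `trace_sizes` are hypothetical helper names. The layers in `dcgan_mnist.jl` may differ.

```julia
# Output size of a transposed convolution along one spatial dimension (dilation 1).
convtranspose_out(n, k; stride = 1, pad = 0) = (n - 1) * stride - 2 * pad + k

# Assumed (kernel, stride, pad) per layer, following the linked PyTorch tutorial.
layers = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]

# Start from a 1×1 latent "image" and record the spatial size after each layer.
function trace_sizes(layers; n = 1)
    sizes = [n]
    for (k, s, p) in layers
        n = convtranspose_out(n, k; stride = s, pad = p)
        push!(sizes, n)
    end
    return sizes
end

println(trace_sizes(layers))   # [1, 4, 8, 16, 32, 64]
```

With kernel 4, stride 2 and padding 1, every layer after the first exactly doubles the spatial size, which is why four such layers take a 4×4 volume to 64×64.
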
diff --git a/vision/dcgan_mnist/dcgan_mnist.jl b/vision/dcgan_mnist/dcgan_mnist.jl
@@ -144,5 +144,7 @@ function train(; kws...)
  save(@sprintf("output/dcgan_steps_%06d.png", train_steps), output_image)
 end
 
-cd(@__DIR__)
-train()
+if abspath(PROGRAM_FILE) == @__FILE__
+ train()
+end
+
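
The guard introduced in the hunk above makes `train()` run only when the file is executed as a script, not when it is loaded with `include`. A minimal sketch of the same check, with a hypothetical helper `is_entry_point` and a stand-in `train` that are not part of the PR:

```julia
# Hypothetical helper mirroring the guard in the diff: true only when
# `this_file` is the script that was passed to `julia` on the command line.
is_entry_point(program_file, this_file) = abspath(program_file) == abspath(this_file)

# Stand-in for the real training function.
train() = println("training...")

# PROGRAM_FILE is empty in the REPL and names a different file when this one
# is `include`d, so train() only runs for `julia --project this_file.jl`.
if is_entry_point(PROGRAM_FILE, @__FILE__)
    train()
end
```

Since the hunk also drops `cd(@__DIR__)`, the relative path in `save(@sprintf("output/dcgan_steps_%06d.png", train_steps), output_image)` now resolves against the directory Julia is launched from, which is presumably why the README's training commands `cd` into the example directory first.
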
diff --git a/vision/dcgan_mnist/output/dcgan_generator_discriminator.png b/vision/dcgan_mnist/output/dcgan_generator_discriminator.png
diff --git a/vision/mlp_mnist/README.md b/vision/mlp_mnist/README.md
@@ -0,0 +1,18 @@
+# Multilayer Perceptron (MLP)
+
+![mlp](../mlp_mnist/docs/mlp.svg)
+
+## Model Info
+
+An MLP consists of at least three layers of nodes: an input layer, a hidden layer, and an output layer. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. An MLP is trained with a supervised learning technique called backpropagation. Its multiple layers and nonlinear activations distinguish an MLP from a linear perceptron and allow it to separate data that is not linearly separable.
+
+## Training
+
+```shell
+cd vision/mlp_mnist
+julia --project mlp_mnist.jl
+```
+
+## Reference
+
+[d2l.ai](http://d2l.ai/chapter_multilayer-perceptrons/mlp.html)
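
The claim in the Model Info section above that an MLP can separate data that is not linearly separable can be seen on XOR, the classic example. This is a sketch in plain Julia with hand-picked (not trained) weights for a hypothetical 2-2-1 network; it is unrelated to the model defined in `mlp_mnist.jl`.

```julia
relu(x) = max(x, zero(x))

# Hand-picked weights for a network with 2 inputs, 2 hidden units, 1 output.
W1 = [1.0 1.0; 1.0 1.0]    # hidden-layer weights
b1 = [0.0, -1.0]           # hidden-layer biases
W2 = [1.0 -2.0]            # output-layer weights (1×2 matrix)
b2 = [0.0]                 # output-layer bias

# Forward pass: affine map, nonlinearity, then a second affine map.
mlp(x) = (W2 * relu.(W1 * x .+ b1) .+ b2)[1]

inputs = ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0])
for x in inputs
    println(x, " => ", mlp(x))   # outputs 0.0, 1.0, 1.0, 0.0 in this order
end
```

Without the `relu` in the hidden layer the two affine maps would collapse into a single linear map, and no choice of weights could reproduce XOR.
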