
Joyful Deep Learning - I

This repository is a collection of my useful Deep Learning resources in TensorFlow, organized into 5 sections:

  1. Machine Learning and Deep Learning Basics in Math and Numpy
  2. Deep Learning Basics in Math, Numpy and Scikit-Learn
  3. Deep Learning Basics in TensorFlow
  4. Deep Learning Advanced in TensorFlow - CNN and Tensorboard
  5. Deep Learning Advanced in TensorFlow - RNN, LSTM and RBM

I would be really grateful if you contribute to or clone this repository; commercial use is not welcome. Thanks to Prof. Brian Kulis, Prof. Kate Saenko, and the TFs of CS591-S2 (Deep Learning) at Boston University for their help, and of course thanks to Google's open-source TensorFlow!

All results in the Jupyter notebooks were trained on a GTX 1070; training on CPUs may take much more time.

This README file is rendered with readme2tex





  • Coding requirements:
# Python 3.5+
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import cosine
import matplotlib.cm as cm
  • Closed-Form Maximum Likelihood mathematical derivation:

    • $P(x \ | \ \theta) = \theta e^{-\theta x}$ for $x \geq 0$

    • $P(x \ | \ \theta) = \frac{1}{\theta}$ for $ 0 \leq x \leq \theta$
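
    As a quick sanity check of these closed-form results (a sketch added here, not the notebook's code): for the exponential density above the MLE is $\hat\theta = 1/\bar x$, and for the uniform density on $[0, \theta]$ it is $\hat\theta = \max_i x_i$.

import numpy as np

# Synthetic data, only to illustrate the two closed-form estimators.
x_exp = np.random.exponential(scale=1.0 / 2.5, size=10000)  # exponential with true theta = 2.5
x_uni = np.random.uniform(low=0.0, high=4.0, size=10000)    # uniform with true theta = 4.0

theta_exp_mle = 1.0 / np.mean(x_exp)  # exponential: theta_hat = 1 / sample mean
theta_uni_mle = np.max(x_uni)         # uniform on [0, theta]: theta_hat = max of the sample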

  • Gradient for Maximum Likelihood Estimation mathematical derivation:

    • Gradients for log-likelihood of the following model:

      • we have $X \in \mathbf R^{n \times k}$ - constant data matrix, $\mathbf x_i$ - vector corresponding to a single data point

      • $\theta$ is a $k$-dimensional (unknown) weight vector

      • $\varepsilon \sim \text{Student}(v)$ is an $n$-dimensional (unknown) noise vector

      • and we observe vector $\mathbf y = X\theta + \varepsilon$

      • $$ P(y_i \ | \ \mathbf x_i, \theta, v) = \frac{1}{Z(v)} \Big(1 + \frac{(\theta^T \mathbf x_i - y_i) ^2}{v}\Big)^{-\frac{v+1}{2}}$$

    • Stochastic Gradient Descent Implementation
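
    A minimal numpy sketch of one SGD step for this model (variable names and the learning rate are illustrative, not the notebook's code): differentiating the log of the density above with respect to $\theta$ gives $\nabla_\theta \log P(y_i \mid \mathbf x_i, \theta, v) = -\frac{(v+1)(\theta^T \mathbf x_i - y_i)}{v + (\theta^T \mathbf x_i - y_i)^2} \mathbf x_i$.

import numpy as np

def log_likelihood_grad(theta, x_i, y_i, v):
    # gradient of log P(y_i | x_i, theta, v) with respect to theta
    r = np.dot(theta, x_i) - y_i
    return -(v + 1.0) * r * x_i / (v + r ** 2)

def sgd_step(theta, x_i, y_i, v, lr=1e-3):
    # ascend the log-likelihood of a single data point
    return theta + lr * log_likelihood_grad(theta, x_i, y_i, v)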

  • Matrix Derivatives mathematical derivation:

    • Multivariate Gaussian:

      • $ \frac{\partial \mathcal L(\Sigma)}{\partial \Sigma} = -\frac12 \left( \frac{1}{|\Sigma|} |\Sigma| \Sigma^{-T}  - \Sigma^{-T} (x- \bar \mu)(x-\bar \mu)^T\Sigma^{-T} \right) =  -\frac12 \left(\Sigma^{-T}  - \Sigma^{-T} (x- \bar \mu)(x-\bar \mu)^T\Sigma^{-T} \right)$
    • Multi-target Linear Regression model:

      • we have $X \in \mathbf R^{n \times k}$, a constant data matrix

      • $\theta$ is a $k \times m$-dimensional weight matrix

      • $\varepsilon_{ij} \sim \mathcal N(0, \sigma_\epsilon)$ is a normal noise ($i \in [0; n], j \in [0;m]$)

      • and we observe a matrix $Y = X\theta + \varepsilon \in \mathbf R^{n \times m}$

      • $$\varepsilon = Y - X\theta \sim \mathcal N_n(0, \sigma_\epsilon I)$$

      • $$\mathcal L(\theta) = \log P(Y - X\theta \ | \ \theta) = \log \mathcal N_n(Y - X\theta \ | \ 0, \sigma_\epsilon I)$$

      • $$\theta_{MLE} = \arg \max_{\theta} \mathcal L(\theta) = \arg \min_{\theta} \text{loss}(\theta) = \arg \min_{\theta} \big( ||Y-X\theta||^2_F \big)$$

      • Derivation: $\frac{\partial\text{loss}(\theta)}{\partial \theta} = -2X^T (Y-X\theta)$

      • Derivation: $\theta_{MLE} = (X^T X)^{-1} X^T Y$
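
    A small numpy sketch checking the two derivations above on synthetic data (shapes and names are illustrative, not the notebook's):

import numpy as np

n, k, m = 200, 5, 3
X = np.random.randn(n, k)
theta_true = np.random.randn(k, m)
Y = X.dot(theta_true) + 0.1 * np.random.randn(n, m)

# closed-form MLE from the derivation above: theta = (X^T X)^{-1} X^T Y
theta_mle = np.linalg.solve(X.T.dot(X), X.T.dot(Y))

# the gradient -2 X^T (Y - X theta) should be numerically zero at the MLE
grad_at_mle = -2.0 * X.T.dot(Y - X.dot(theta_mle))
assert np.allclose(grad_at_mle, 0.0, atol=1e-6)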

  • Logistic Regression mathematical derivation

  • Logistic Regression implementation
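
    A compact numpy sketch of the gradient step used in such an implementation (assuming binary labels in {0, 1}; the names and learning rate are illustrative, not the notebook's code):

import numpy as np

def sigmoid(z):
    # numerically stable logistic function
    return np.exp(-np.logaddexp(0, -z))

def logistic_grad(theta, X, y):
    # gradient of the negative log-likelihood for binary labels y in {0, 1}
    return X.T.dot(sigmoid(X.dot(theta)) - y)

def fit_logistic(X, y, lr=0.1, n_iters=1000):
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        theta -= lr * logistic_grad(theta, X, y) / len(y)
    return theta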

  • Coding requirements:
# Python 3.5+
import numpy as np
import matplotlib.pyplot as plt
from scipy.misc import imread
from sklearn.datasets import fetch_mldata
  • Cross-Entropy and Softmax mathematical derivation:

    • Minimizing the multiclass cross-entropy loss function to obtain the maximum likelihood estimate of the parameters $\theta$:

      • $L(\theta)= - \frac{1}{N}\sum_{i=1}^{N} \sum_{k=1}^{K} y_{ik} \log(h_k(x_i,\theta))$ where $N$ is the number of examples $\{x_i,y_i\}$
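
    A numerically stable numpy sketch of the softmax output $h$ and this loss (added here for illustration; the notebook's own implementation may differ in details):

import numpy as np

def softmax(z):
    # subtract the row-wise max so np.exp never overflows
    z = z - np.max(z, axis=1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=1, keepdims=True)

def cross_entropy_loss(probs, y_onehot):
    # L = -(1/N) * sum_i sum_k y_ik * log(h_k(x_i, theta))
    return -np.mean(np.sum(y_onehot * np.log(probs + 1e-12), axis=1))
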
  • Simple Regularization Methods:

    • L2 regularization

    • L1 regularization
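
    Both penalties are simple additive terms on top of the data loss; a minimal sketch (the weight lam is an illustrative choice):

import numpy as np

def l2_penalty(theta, lam=1e-3):
    # lam * ||theta||_2^2, shrinks all weights smoothly toward zero
    return lam * np.sum(theta ** 2)

def l1_penalty(theta, lam=1e-3):
    # lam * ||theta||_1, tends to drive many weights exactly to zero
    return lam * np.sum(np.abs(theta))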

  • Backprop in a simple MLP - Multi-layer perceptron's mathematical derivation:

  • XOR problem - A neural network to solve the XOR problem (this is a really good example for understanding the essence of neural networks)
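
    One way to see why a hidden layer is needed: XOR is not linearly separable, but XOR(a, b) = AND(OR(a, b), NOT AND(a, b)), which a two-unit hidden layer can express. A hand-crafted sketch of such a network (not necessarily the notebook's trained one):

import numpy as np

def step(z):
    return (z > 0).astype(float)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hidden layer: unit 1 computes OR(a, b), unit 2 computes AND(a, b).
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])

# Output: fires when OR is on and AND is off, i.e. exactly XOR.
W2 = np.array([1.0, -1.0])
b2 = -0.5

h = step(X.dot(W1) + b1)
y = step(h.dot(W2) + b2)   # -> [0, 1, 1, 0]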

  • Implementing a simple MLP - Implement an MLP by hand in numpy and scipy

    • Implementations of common useful activation functions that avoid numerical accuracy problems:

      • softplus function:

          import numpy as np

          def softplus(x):
              return np.logaddexp(0, x)

          def derivative_softplus(x):
              # the derivative of softplus is the sigmoid, written in a numerically stable form
              return np.exp(-np.logaddexp(0, -x))
        
      • sigmoid function:

          import numpy as np

          def sigmoid(x):
              # numerically stable sigmoid: 1 / (1 + exp(-x))
              return np.exp(-np.logaddexp(0, -x))

          def derivative_sigmoid(x):
              # sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x))
              return np.multiply(np.exp(-np.logaddexp(0, -x)), (1. - np.exp(-np.logaddexp(0, -x))))
        
      • relu function:

          import numpy as np

          def relu(x):
              return np.maximum(0, x)

          def derivative_relu(x):
              # subgradient of relu: 1 where x > 0, 0 elsewhere (modifies x in place)
              for i in range(0, len(x)):
                  for k in range(len(x[i])):
                      if x[i][k] > 0:
                          x[i][k] = 1
                      else:
                          x[i][k] = 0
              return x
        
    • Forward pass implementation

    • Backward pass implementation (a minimal numpy sketch of both passes follows this list)

    • Test MLP on MNIST dataset and its visualization
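
    A minimal sketch of the forward and backward passes for a one-hidden-layer MLP with sigmoid activations and a squared-error loss (the notebook's network is more general; the names here are illustrative):

import numpy as np

def sigmoid(z):
    return np.exp(-np.logaddexp(0, -z))

def forward(X, W1, b1, W2, b2):
    # forward pass: input -> hidden -> output
    h = sigmoid(X.dot(W1) + b1)
    y_hat = sigmoid(h.dot(W2) + b2)
    return h, y_hat

def backward(X, y, h, y_hat, W2):
    # backward pass for a 0.5 * sum((y_hat - y)^2) loss, applying the chain rule layer by layer
    delta_out = (y_hat - y) * y_hat * (1.0 - y_hat)
    grad_W2 = h.T.dot(delta_out)
    grad_b2 = delta_out.sum(axis=0)
    delta_hidden = delta_out.dot(W2.T) * h * (1.0 - h)
    grad_W1 = X.T.dot(delta_hidden)
    grad_b1 = delta_hidden.sum(axis=0)
    return grad_W1, grad_b1, grad_W2, grad_b2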

  • Coding requirements:
# Python 3.5+
import numpy as np

# tensorflow-gpu==1.0.1 or tensorflow==1.0.1
import tensorflow as tf

from matplotlib import pyplot as plt

# Scikit-learn's TSNE is relatively slow, use BHTSNE as a faster alternative:
# https://github.com/dominiek/python-bhtsne
from sklearn.manifold import TSNE
  • MNIST Softmax Classifier Demo in TensorFlow
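
    The demo boils down to a single affine layer trained with softmax cross-entropy; a rough sketch of that pattern in TensorFlow 1.x (hyperparameters and names are illustrative, not the notebook's exact code):

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
logits = tf.matmul(x, W) + b

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        batch_x, batch_y = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_x, y: batch_y})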

  • Building Neural Networks with the power of Variable Scope - MLP in TensorFlow:

    With the power of variable scope, we can implement a very flexible MLP in TensorFlow without hard-coding the layers and weights:

def mlp(x, hidden_sizes, activation_fn=tf.nn.relu):
    
    '''
    Inputs:
        x: an input tensor of the images in the current batch [batch_size, 28x28]
        hidden_sizes: a list of the number of hidden units per layer. For example: [5,2] means 5 hidden units in the first layer, and 2 hidden units in the second (output) layer. (Note: for MNIST, we need hidden_sizes[-1]==10 since it has 10 classes.)
        activation_fn: the activation function to be applied

    Output:
        a tensor of shape [batch_size, hidden_sizes[-1]].
    '''
    if not isinstance(hidden_sizes, (list, tuple)):
        raise ValueError("hidden_sizes must be a list or a tuple")
        
    # Number of layers
    L = len(hidden_sizes)

    for l in range(L):

        with tf.variable_scope("layer"+str(l)):
            
            # Create variable named "weights".
            if l == 0:
                weights = tf.get_variable("weights", shape= [x.shape[1], hidden_sizes[l]], dtype=tf.float32, initializer=None)
            else:
                weights = tf.get_variable("weights", shape= [hidden_sizes[l-1], hidden_sizes[l]], dtype=tf.float32, initializer=None)

            # Create variable named "biases".
            biases = tf.get_variable("biases", shape=[hidden_sizes[l]], dtype=tf.float32, initializer=None)

            # Pre-Activation Layer
            if l == 0:
                pre_activation = tf.add(tf.matmul(x, weights), biases)
            else:
                pre_activation = tf.add(tf.matmul(activated_layer, weights), biases)

            # Activated Layer
            if l == L-1:
                activated_layer = pre_activation
            else:
                activated_layer = activation_fn(pre_activation)
    return activated_layer
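
    For example, it could be wired up for MNIST like this (assuming the imports from the coding requirements above; the hidden sizes are just an illustration):

x = tf.placeholder(tf.float32, shape=[None, 784])
logits = mlp(x, hidden_sizes=[256, 10], activation_fn=tf.nn.relu)
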
  • Siamese Network in TensorFlow

  • Visualize learned features of Siamese Network with T-SNE
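
    A Siamese network feeds two inputs through the same weight-shared sub-network and compares the resulting features; one common choice is the contrastive loss, sketched below (the notebook's exact loss and variable names may differ):

import tensorflow as tf

def contrastive_loss(features_a, features_b, same_label, margin=1.0):
    # same_label is 1.0 for pairs from the same class, 0.0 otherwise
    d = tf.reduce_sum(tf.square(features_a - features_b), axis=1)   # squared distance
    d_sqrt = tf.sqrt(d + 1e-8)
    loss = same_label * d + (1.0 - same_label) * tf.square(tf.maximum(margin - d_sqrt, 0.0))
    return tf.reduce_mean(loss)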

  • Coding requirements:
# Python 3.5+
import numpy as np
import scipy
import scipy.io

# tensorflow-gpu==1.0.1 or tensorflow==1.0.1
import tensorflow as tf

from matplotlib import pyplot as plt

# Scikit-learn's TSNE is relatively slow, use BHTSNE as a faster alternative:
# https://github.com/dominiek/python-bhtsne
from sklearn.manifold import TSNE
  • Building and training a convolutional network in Tensorflow with tf.layers/tf.contrib

  • Building and training a convolutional network by hand in Tensorflow with tf.nn
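
    With tf.layers (the first of the two approaches above), the whole network can be written as a short stack of calls; a rough sketch for 28x28 grayscale inputs (filter counts and sizes are illustrative, not the notebook's exact architecture):

import tensorflow as tf

def conv_net(x, n_classes=10):
    # x: [batch, 28, 28, 1]
    conv1 = tf.layers.conv2d(x, filters=32, kernel_size=5, padding='same', activation=tf.nn.relu)
    pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2)
    conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=5, padding='same', activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2)
    flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
    dense = tf.layers.dense(flat, units=1024, activation=tf.nn.relu)
    return tf.layers.dense(dense, units=n_classes)

x = tf.placeholder(tf.float32, [None, 28, 28, 1])
logits = conv_net(x)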

  • Saving and Reloading Model Weights in Tensorflow
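
    Saving and restoring is handled by tf.train.Saver; a minimal sketch (the variable and the checkpoint path are illustrative):

import tensorflow as tf

v = tf.Variable(tf.zeros([10]), name="example_weights")
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    save_path = saver.save(sess, "./model.ckpt")   # write all variables to disk

with tf.Session() as sess:
    saver.restore(sess, "./model.ckpt")            # reload them without re-initializing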

  • Fine-tuning a pre-trained network

  • Visualizations using Tensorboard:

    • Visualize Filters/Kernels

    • Visualize Loss

    • Visualize Accuracy
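
    All three visualizations go through tf.summary ops written by a FileWriter; a minimal, self-contained sketch (the placeholders and log directory are illustrative stand-ins for the notebook's real loss, accuracy and kernel tensors):

import tensorflow as tf

# Stand-ins for the real graph tensors; in the notebooks these come from the model itself.
loss = tf.placeholder(tf.float32, name="loss_value")
accuracy = tf.placeholder(tf.float32, name="accuracy_value")

tf.summary.scalar("loss", loss)
tf.summary.scalar("accuracy", accuracy)
# Filters/kernels can be written the same way with tf.summary.image on the first-layer weights.
merged = tf.summary.merge_all()

writer = tf.summary.FileWriter("./logs")
with tf.Session() as sess:
    summary = sess.run(merged, feed_dict={loss: 0.5, accuracy: 0.9})
    writer.add_summary(summary, global_step=0)
writer.close()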
