diff --git a/notebooks/Python XOR Neural Network.ipynb b/notebooks/Python XOR Neural Network.ipynb
new file mode 100644
index 0000000..8a14411
--- /dev/null
+++ b/notebooks/Python XOR Neural Network.ipynb
@@ -0,0 +1,142 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Python XOR Neural Network"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**(C) 2018 by Oren Baldinger**"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**License:** [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/) ([CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Perceptrons"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "A perceptron, or single-layer neural network with a linear activation function, cannot model functions that are not linearly separable. Minsky and Papert (1969) demonstrated that the XOR function in particular cannot be modeled this way."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Neural Networks"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Goodfellow et al. (2016) showed that a neural network with two layers and the non-linear ReLU activation function can model XOR."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The basic idea is that the first layer maps the inputs into a new space in which the XOR outputs are linearly separable, so the output layer can successfully classify them (as 0 or 1)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This network can be implemented as follows:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "def relu(z):\n",
+    "    return max(0, z)\n",
+    "\n",
+    "def xor(x1, x2):\n",
+    "    activation = np.vectorize(relu)\n",
+    "\n",
+    "    # hidden layer\n",
+    "    W = np.array([[1, 1], [1, 1]])\n",
+    "    b = np.array([0, -1])\n",
+    "    x = np.array([x1, x2])\n",
+    "    h = activation(W.dot(x) + b)\n",
+    "\n",
+    "    # output layer\n",
+    "    return relu(np.array([1, -2]).dot(h))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "0 XOR 0: 0\n",
+      "0 XOR 1: 1\n",
+      "1 XOR 0: 1\n",
+      "1 XOR 1: 0\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(\"0 XOR 0:\", xor(0, 0))\n",
+    "print(\"0 XOR 1:\", xor(0, 1))\n",
+    "print(\"1 XOR 0:\", xor(1, 0))\n",
+    "print(\"1 XOR 1:\", xor(1, 1))"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.5.2"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
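
The notebook's claim that the hidden layer makes XOR linearly separable can be checked with a short sketch. It reuses the weights from the notebook's `xor()` function; the names `X`, `H`, `w_out`, and `y` are illustrative and do not appear in the notebook, and `np.maximum` stands in for the vectorized `relu`.

```python
import numpy as np

# Weights copied from the notebook's xor() function.
W = np.array([[1, 1], [1, 1]])   # hidden-layer weights
b = np.array([0, -1])            # hidden-layer biases
w_out = np.array([1, -2])        # output-layer weights (illustrative name)

# All four XOR inputs, one per row.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hidden representation: np.maximum applies ReLU element-wise.
H = np.maximum(0, X @ W.T + b)

# Output layer: a linear function of the hidden representation.
y = H @ w_out

for x, h, out in zip(X, H, y):
    print(x, "->", h, "->", out)
```

In the hidden space, the two inputs whose XOR is 1 both map to the same point `(1, 0)`, while `(0, 0)` and `(1, 1)` map to `(0, 0)` and `(2, 1)`. A single linear function, `h1 - 2*h2`, then yields 1 for the first group and 0 for the second. Since these outputs are never negative, the final `relu` in the notebook leaves them unchanged.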