%-----------------------------------------------------------------------------
%
% Template for sigplanconf LaTeX Class
%
% Name: sigplanconf-template.tex
%
% Purpose: A template for sigplanconf.cls, which is a LaTeX 2e class
% file for SIGPLAN conference proceedings.
%
% Guide: Refer to "Author's Guide to the ACM SIGPLAN Class,"
% sigplanconf-guide.pdf
%
% Author: Paul C. Anagnostopoulos
% Windfall Software
% 978 371-2316
% paul@windfall.com
%
% Created: 15 February 2005
%
%-----------------------------------------------------------------------------
\documentclass{sigplanconf}
% The following \documentclass options may be useful:
% preprint Remove this option only once the paper is in final form.
% 10pt To set in 10-point type instead of 9-point.
% 11pt To set in 11-point type instead of 9-point.
% numbers To obtain numeric citation style instead of author/year.
\usepackage{amsmath}
\newcommand{\cL}{{\cal L}}
\begin{document}
\special{papersize=8.5in,11in}
\setlength{\pdfpageheight}{\paperheight}
\setlength{\pdfpagewidth}{\paperwidth}
%\conferenceinfo{CONF 'yy}{Month d--d, 20yy, City, ST, Country}
%\copyrightyear{20yy}
%\copyrightdata{978-1-nnnn-nnnn-n/yy/mm}
%\copyrightdoi{nnnnnnn.nnnnnnn}
% Uncomment the publication rights you want to use.
%\publicationrights{transferred}
%\publicationrights{licensed} % this is the default
%\publicationrights{author-pays}
\titlebanner{CSC 395: Modern Programming Principles} % These are ignored unless
\preprintfooter{Final project for CSC 395} % 'preprint' option specified.
\title{Solving NP-Hard Problems with Neural Networks}
\subtitle{Taking the ``Artificial'' out of Artificial Intelligence}
\authorinfo{Zebediah Figura}
{Grinnell College}
{figuraze@grinnell.edu}
\authorinfo{Theodoros Kalfas}
{Grinnell College}
{kalfasth@grinnell.edu}
\authorinfo{Daniel Nanetti-Palacios}
{Grinnell College}
{nanettip@grinnell.edu}
\maketitle
\begin{abstract}
We implement a feed-forward neural network from scratch in Haskell, a strongly typed functional language, and train it to play the video game \textit{Snake}, using both gradient descent with back-propagation and a genetic algorithm. We describe the implementation of the network, the game, and the genetic algorithm; report how performance improves with further training; and reflect on the suitability of a functional language for this kind of program.
\end{abstract}
\category{CR-number}{subcategory}{third-level}
\keywords
Neural Networks, Machine Learning, Haskell
\section{Introduction}
The concept of neural networks, while dating back to the 1940s, has attracted renewed popularity and interest in recent years, driven in part by breakthroughs in fast GPU-based implementation and by the development of architectures such as recurrent neural networks. In this paper we implement a neural network in Haskell, a strongly typed functional language, with the intent of demonstrating how a neural network can solve NP-hard problems. To best demonstrate the process and the success of a neural network, we take as the problem to be solved the well-known video game \textit{Snake}. Solving a problem such as this allows us to observe elements of the network's learned heuristic, as well as the speed and manner in which it learns to solve the problem.

A neural network functions by accepting a set of inputs and feeding them through a series of neurons. Each neuron receives some number of numerical inputs, taken either directly from the inputs to the network or from the outputs of other neurons, multiplies them by a set of weights, and passes the sum (i.e., the dot product) on to the next neurons. The outputs of the final layer of neurons may then be interpreted. In the case of \textit{Snake}, the set of inputs is simply the set of all cells, with (say) $-1$ representing the presence of a barrier, $0$ an empty cell, and $1$ a fruit. The network then gives four outputs, representing (approximately) the probabilities with which the player should turn left, right, up, and down, respectively.

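For concreteness, the computation performed by a single neuron can be sketched in a few lines of Haskell; the sigmoid activation and the pairing of a weight list with a bias are illustrative choices here, matching the representation described in the System section below.

\begin{verbatim}
-- The output of one neuron: the sigmoid of the
-- dot product of weights and inputs, plus a bias.
sigmoid :: Double -> Double
sigmoid x = 1 / (1 + exp (negate x))

neuronOutput :: ([Double], Double) -> [Double] -> Double
neuronOutput (weights, bias) inputs =
  sigmoid (sum (zipWith (*) weights inputs) + bias)
\end{verbatim}
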
The key idea that allows the program to solve the problem itself, rather than having all weights programmed manually, is \textit{machine learning}. To find the configuration of weights that lets the network solve the problem most effectively, one might arbitrarily modify weights and test whether the network's output becomes more or less accurate. However, since a network may contain millions or even billions of weights, systematically finding the best combination in this manner is infeasible. A better solution uses calculus to determine the partial derivative of the output's error (as judged by some \textit{cost function}) with respect to each weight, and then adjusts every weight according to the sign and magnitude of this derivative. The most efficient way of doing this, popularized in 1986 by Rumelhart et~al.~\cite{rumelhart86}, is the \textit{back-propagation} algorithm, which uses the chain rule to calculate all the needed derivatives in only two ``passes'' through the network.
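Schematically, gradient descent updates each weight $w$ by
\begin{equation*}
w \leftarrow w - \eta \frac{\partial C}{\partial w},
\end{equation*}
where $C$ is the cost function and the learning rate $\eta$ controls the step size; back-propagation computes all of the required partial derivatives in a single backward pass through the network.
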
\section{Prior Work}
A tremendous amount of work has been conducted in the area of game AI using neural networks, ranging from games as simple as Snake to games as complex as Go, with the recent victory of Google's AlphaGo program over the renowned Go champion Lee Sedol. The game we focus on, Snake, has certainly been solved by simple means: find the snake's head, look at the cell it is about to occupy in the next frame, and, if that cell is already occupied by the snake's body, turn, perhaps with some adjustment to choose the best direction. However, we seek to implement a solution using a neural network, so other relatively simple neural networks developed to play games are useful points of reference.

Several programs that beat arbitrary Mario-style levels have been written, including the open-source MarI/O by SethBling. A Mario AI Championship was held for several years in which programmers submitted competing code, so many people have tried, and succeeded at, beating platformer levels. Guides to neural networks for game AI are also available, such as \textit{How to build Neural Networks for Games} by Penny Sweetser and \textit{Promising Game AI Techniques} by Steve Rabin.

As a general introduction to neural networks, Michael Nielsen's online textbook \textit{Neural Networks and Deep Learning} \cite{nielsen15} is an excellent resource that can give anyone a jump-start on the research that has happened since the conception of neural networks. The core concept is the perceptron, the analogue of the neuron, which takes some weighted inputs and produces an output. Chained together, perceptrons form a net composed of layers, each depending on the outputs of the previous layer, so that later, more complex decisions are built from earlier, simpler ones. Finally, the basis on which a neural network improves is adjusting the weights of each neuron so that its results become more accurate. It can be shown that, for suitable neurons, small changes in the weights produce small changes in the output; so if, during training, there is a small discrepancy between the expected result and the actual one, small changes in the weights yield a better network. These changes can be made arbitrarily until the network is consistent, or they can be computed.
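In symbols, for a neuron with a smooth activation function, a small change in the weights and bias perturbs the output by approximately
\begin{equation*}
\Delta \mathrm{output} \approx \sum_j \frac{\partial\, \mathrm{output}}{\partial w_j}\, \Delta w_j + \frac{\partial\, \mathrm{output}}{\partial b}\, \Delta b,
\end{equation*}
which is what makes incremental weight adjustment workable; a step-function perceptron, whose output can flip discontinuously, does not have this property.
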
The general approach to building a neural network that plays a platformer intelligently does not differ much from classic neural-network tasks such as character recognition. The inputs are either screenshots of the running game or the game's memory, translated so that only the important features, such as platforms, enemies, and power-ups, remain, and the outputs are button presses, which are fed back into the game. The approach SethBling takes is to randomly generate an initial population of networks and let each one run on an emulation of Mario. The networks that perform best are kept and built upon, with new neurons and connections introduced, and tested in turn. That is, the group of generated networks ends up selecting the best one by competition, rather than being fed labeled training data. Mario is a much more complex game than Snake, so implementing a similar network for Snake should be feasible within the timeline of this project.

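In outline, one generation of such an evolutionary process can be sketched in Haskell as follows; the names and types here are illustrative rather than SethBling's actual code, and a real implementation would thread a random-number generator through the mutation step.

\begin{verbatim}
import Data.List (sortBy)
import Data.Ord  (comparing)

-- One generation of neuroevolution: score every candidate
-- net, keep the better half, and refill the population with
-- mutated copies of the survivors.  Mutation is simplified
-- here as a pure function.
evolveGeneration :: (net -> Double)  -- fitness, e.g. score
                 -> (net -> net)     -- mutation
                 -> [net] -> [net]
evolveGeneration fitness mutate population =
    survivors ++ map mutate survivors
  where
    ranked    = sortBy (comparing (negate . fitness)) population
    survivors = take (length population `div` 2) ranked
\end{verbatim}
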
\section{Proposed Work}
For this project we plan to produce a program that runs a game and makes the best available choice at each frame of gameplay. The program will use a neural network that outputs the best move to make in the current state of the game in order to get closer to the goal. We will develop the neural network that handles this task and have it learn from a variety of training data. This will be the biggest task of the project, as we will need to build a neural network ourselves in Haskell that can learn through repeated trials. We will also need to work out how the program reads each frame of the game and interprets that data as input to the neural network.
\section{Timeline}
By the first checkpoint, we hope to have completed at least a basic structure for the neural network that we will use for the rest of the project. We also plan to work out how to get data from the emulator the game runs on and use it as the input to the program. By the second checkpoint, the neural network should reach the goal in most, if not all, of the tests given, though not necessarily optimally. By the third checkpoint, we will either have turned the network into a recurrent neural network or have optimized and adjusted the existing network so that the player reaches the goal as efficiently as possible.
\section{Examples}
Within our NN.hs file, we provide a \textit{demo} function that gives a brief example of how our neural nets work. The function \textit{demo} takes the number of trials the neural net should go through and runs another function, \textit{dem0}, which holds a prebuilt neural net with two inputs, two hidden neurons, and two output neurons, with inputs $(0.05, 0.1)$ and target outputs $(0.01, 0.99)$. Running the demo with just one trial returns [0.7313031537607506,0.7784159101449939], which is quite far from the $(0.01, 0.99)$ target. Increasing the trials to 25 gives [0.3018611668094906,0.8550426427218847], which is much better. Increasing the count to a very large number such as 10,000 gives [9.998833536323741e-2,0.9881957376009745], which is extremely close to the target and shows that the more trials we run, the better the net becomes.

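Assuming \textit{demo} is invoked from GHCi, a session reproducing the figures above looks like this:

\begin{verbatim}
ghci> demo 1
[0.7313031537607506,0.7784159101449939]
ghci> demo 25
[0.3018611668094906,0.8550426427218847]
ghci> demo 10000
[9.998833536323741e-2,0.9881957376009745]
\end{verbatim}
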
While no functional demo is included to run the Snake game coded in Snake.hs, we do provide a way to see how well our neural nets play the game. Our snnakeA.exe program takes two parameters from the user: a number of trials and a number of generations. The program runs the genetic algorithm in our code for the given number of generations, repeats this for the given number of trials, and returns the average score of each trial. The scores may differ from run to run because of how the nets play the game, but increasing the number of generations and trials does show the nets improving with experience. With one trial and one generation, the score is usually 3.0 or 4.0; running 10 trials of 5 generations gives scores ranging from 4.3 to 7.8. Increasing the number of generations much further makes the program take a long time to run, so these results are satisfactory for now.
\section{System}
The implementation of our program consists of three Haskell files.
\begin{itemize}
\item The first file, NN.hs (short for ``neural net''), contains an entire neural net implementation. This consists of type synonyms for the net components: a Neuron is a list of input weight values coupled with a bias value, a Layer is a list of neurons, and a Net is a list of layers; a schematic sketch of these types and of \textit{compute} appears after this list. All weights and biases are double-precision floating-point values. A simple net with one hidden layer and two nodes per layer, as well as a simple neuron (which happens to compute the XOR of two binary inputs), are provided for testing purposes, and a handful of test functions demonstrate their correctness. Functions are provided to create empty and random nets given a list of layer sizes, and to convert a net to and from a one-dimensional list format that is often easier to manipulate. The rest of the file implements the two functions that provide the essential functionality of a neural net: \textit{compute} and \textit{descend}. The former, \textit{compute}, takes a neural net and a set of input data and computes the output generated by the net. The latter, \textit{descend}, implements the gradient descent algorithm. It works by first calling a function \textit{backprop}, an implementation of the back-propagation algorithm for efficiently finding the error value of each neuron. The quadratic cost function and an eta value of 0.5 are hard-coded, as we did not intend our network to be a general-purpose implementation; indeed, much better implementations, such as \texttt{hnn} and \texttt{neural}, already exist, and we avoided them only because we intended to write our own as an educational exercise. \textit{descend} then descends through the network, applying the error transformation to each neuron. While this implementation requires two separate passes through the network rather than one, it is much easier to read. Finally, a simple demonstration function repeatedly applies \textit{descend} to a set of values; its behavior is described in the Examples section above.
\item The second file, Snake.hs, contains a partial implementation of the Snake game. It consists primarily of a function \textit{generateFrame}, which accepts a direction of movement and a game state and returns either an updated game state or Nothing if the movement caused the game to be lost. The file also provides a (constant) initial game state and a function to convert a game state into a one-dimensional array.
\item The third file, Genetic.hs, contains an implementation of the genetic algorithm, complete with random crossover and random mutation. Constants specifying the population size, mutation rate, and net sizes are defined here. Since this file is treated as an end rather than a means, the interface with Snake is hard-coded in it. This includes a \textit{score} function, which simply counts the number of turns a given agent survives. The agent itself is a neural net that takes as input the entire game board (an array of integer values signifying what occupies each square) and returns as output a set of confidences for each of the four directions; the direction with the maximum confidence is then taken. Since it is possible for an agent to survive forever by literally looping around the game board, we added the somewhat inelegant condition that an agent that runs for more than 100 turns without dying (an impressive feat even for humans) is treated as looping infinitely and terminated with a score of 0. The remainder of the file implements the genetic algorithm on a set of randomly generated neural nets, along with a function to run the algorithm for a given number of generations. A fourth file, snnakeA.hs, executes this function a given number of times and returns the average score of the fully evolved nets.
\end{itemize}
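Schematically, the representation and the feed-forward step just described amount to something like the following. This is a simplified sketch reconstructed from the descriptions above, not an excerpt from the source files, and the commented signatures are likewise approximate.

\begin{verbatim}
-- Sketch of the net representation in NN.hs.
type Neuron = ([Double], Double)  -- weights and a bias
type Layer  = [Neuron]
type Net    = [Layer]

-- Feed-forward evaluation: each layer maps the previous
-- layer's outputs through every one of its neurons.
compute :: Net -> [Double] -> [Double]
compute net inputs = foldl feedLayer inputs net
  where
    feedLayer xs layer = map (fire xs) layer
    fire xs (ws, b)    = sigmoid (sum (zipWith (*) ws xs) + b)
    sigmoid x          = 1 / (1 + exp (negate x))

-- Other key interfaces (approximate):
-- descend       :: Net -> [Double] -> [Double] -> Net
-- generateFrame :: Direction -> GameState -> Maybe GameState
-- score         :: Net -> Double
\end{verbatim}
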
\section{Reflection}
The implementation of the neural net itself, being composed entirely of simple transformations of linear data types, was very natural to express in a functional style, perhaps more so than it would have been imperatively. The strong type system made it easy to check that the right nesting of lists was being used; the only feature one could further desire on this point is a way of asserting the correctness of vector lengths in the types. The implementation of Snake was fairly easy as well, since any given step of the game is straightforwardly a function of the previous step. One transformation of data types, namely from a list of coordinates to a static array of binary values, would have been awkward to implement functionally, but fortunately a library (\texttt{Vector}) exists to provide this transformation.

The genetic algorithm, however, was much less pleasant to implement. Since the algorithm is based not on a mathematical model but on a real-world process that unfolds over time, it is much harder to view from a functional perspective. Adding to the difficulty, the many random choices the genetic algorithm requires forced a layer of monadic plumbing through the code.
% We recommend abbrvnat bibliography style.
\bibliographystyle{abbrvnat}
% The bibliography should be embedded for final submission.
\begin{thebibliography}{}
\softraggedright
\bibitem[Nielsen(2015)]{nielsen15}
M.~A. Nielsen. \emph{Neural Networks and Deep Learning}. Determination Press, 2015.

\bibitem[Rumelhart et~al.(1986)Rumelhart, Hinton, and Williams]{rumelhart86}
D.~E. Rumelhart, G.~E. Hinton, and R.~J. Williams. Learning representations by back-propagating errors. \emph{Nature}, 323:533--536, 1986.
\end{thebibliography}
\end{document}