For the written portion of the assignment, please include a PDF in your repo.
I've provided some starter code for this assignment; you're welcome to modify or extend it as needed to suit your purposes.
- (20 points) Fill in the following table. (You're more than welcome to recreate it in a separate document.)
| Algorithm | Time Complexity | Space Complexity | Complete? | Optimal? |
|---|---|---|---|---|
| BFS | | | | |
| UCS | | | | |
| DFS | | | | |
| DLS | | | | |
| IDS | | | | |
| A* | | | | |
- (15 points) Search. I've provided an example class for solving the Fox and Chickens problem. Using FoxAndChickensState as an example, create an EightPuzzleState with isGoal, successor, and any other helper functions you think are necessary. You may choose how to represent the grid. Test it with the included BFS implementation.
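  Here's a minimal sketch of one possible representation, using a tuple of tuples for the grid. The isGoal and successor names come from the assignment; everything else (the goal layout, the constructor, the equality/hash helpers) is an illustrative design choice, not the provided starter code.

  ```python
  class EightPuzzleState:
      GOAL = ((1, 2, 3), (4, 5, 6), (7, 8, 0))  # 0 marks the blank

      def __init__(self, grid):
          self.grid = tuple(tuple(row) for row in grid)

      def isGoal(self):
          return self.grid == EightPuzzleState.GOAL

      def findBlank(self):
          # Helper: locate the (row, col) of the blank tile.
          for r in range(3):
              for c in range(3):
                  if self.grid[r][c] == 0:
                      return r, c

      def successor(self):
          # Generate every state reachable by sliding a tile into the blank.
          r, c = self.findBlank()
          children = []
          for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
              nr, nc = r + dr, c + dc
              if 0 <= nr < 3 and 0 <= nc < 3:
                  rows = [list(row) for row in self.grid]
                  rows[r][c], rows[nr][nc] = rows[nr][nc], rows[r][c]
                  children.append(EightPuzzleState(rows))
          return children

      # Equality and hashing let states go straight into a closed-list
      # dictionary or set later in the assignment.
      def __eq__(self, other):
          return self.grid == other.grid

      def __hash__(self):
          return hash(self.grid)
  ```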
- (10 points) Add a closed list to the BFS function. (Consider using a dictionary.)
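  A minimal sketch of the idea, assuming the isGoal/successor interface sketched above; the provided BFS may do its frontier and path bookkeeping differently. Storing each state's parent in the closed dictionary gives you duplicate detection and path reconstruction in one structure.

  ```python
  from collections import deque

  def bfs(start):
      closed = {start: None}   # state -> parent; doubles as the closed list
      frontier = deque([start])
      while frontier:
          state = frontier.popleft()
          if state.isGoal():
              # Walk parent pointers back to the start to recover the path.
              path = []
              while state is not None:
                  path.append(state)
                  state = closed[state]
              return list(reversed(path))
          for child in state.successor():
              if child not in closed:  # skip anything already seen
                  closed[child] = state
                  frontier.append(child)
      return None  # exhausted the space without finding a goal
  ```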
- (10 points) Using the BFS function as a template, implement DFS, depth-limited search (DLS), and iterative deepening depth-first search (IDS).
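  One possible sketch of all three, under the same assumed interface. It checks for cycles only along the current path, since a full closed list can wrongly prune a state that is reachable at a shallower depth by another route. The recursion is kept for brevity; an iterative version would avoid Python's recursion limit on very deep DFS runs.

  ```python
  def dls(state, limit, path=None):
      # Depth-limited DFS; returns the goal path or None.
      if path is None:
          path = [state]
      if state.isGoal():
          return path
      if limit == 0:
          return None
      for child in state.successor():
          if child not in path:  # cycle check along the current path only
              result = dls(child, limit - 1, path + [child])
              if result is not None:
                  return result
      return None

  def dfs(start):
      # Plain DFS is just DLS with an unbounded depth limit.
      return dls(start, float("inf"))

  def ids(start, max_depth=50):
      # Iterative deepening: rerun DLS with limits 0, 1, 2, ...
      for limit in range(max_depth + 1):
          result = dls(start, limit)
          if result is not None:
              return result
      return None
  ```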
- (15 points) Implement A* for the 8-puzzle, using Manhattan distance as a heuristic. You will probably want to use the heapq module for your priority queue.
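  A sketch of the heuristic and the heapq-based loop, assuming the goal layout from the EightPuzzleState sketch above. The counter is a tie-breaker so heapq never has to compare two states directly; this version returns only the solution depth, so add parent tracking if you want the path itself.

  ```python
  import heapq
  from itertools import count

  def manhattan(state):
      # Sum over all tiles (blank excluded) of the distance from each
      # tile's position to its position in the goal layout.
      total = 0
      for r in range(3):
          for c in range(3):
              tile = state.grid[r][c]
              if tile != 0:
                  gr, gc = divmod(tile - 1, 3)
                  total += abs(r - gr) + abs(c - gc)
      return total

  def astar(start):
      tie = count()
      frontier = [(manhattan(start), next(tie), 0, start)]  # (f, tie, g, state)
      best_g = {start: 0}
      while frontier:
          f, _, g, state = heapq.heappop(frontier)
          if state.isGoal():
              return g  # number of moves in the solution
          for child in state.successor():
              ng = g + 1  # every slide costs 1
              if ng < best_g.get(child, float("inf")):
                  best_g[child] = ng
                  heapq.heappush(frontier,
                                 (ng + manhattan(child), next(tie), ng, child))
      return None
  ```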
- (10 points) Run each of your functions on the 8-puzzle, and measure the number of states generated. Prepare a table showing your results.
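  One lightweight way to count generated states without editing each search function is to wrap the successor method; count_generated and its generated attribute are illustrative names, not part of the starter code.

  ```python
  def count_generated(successor_fn):
      # Wrap a successor method so every child it produces is tallied.
      def wrapped(self):
          children = successor_fn(self)
          wrapped.generated += len(children)
          return children
      wrapped.generated = 0
      return wrapped

  # Usage: patch the class once, run a search, then read the tally.
  # EightPuzzleState.successor = count_generated(EightPuzzleState.successor)
  # bfs(start)
  # print(EightPuzzleState.successor.generated)
  ```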
- (20 points) I've provided a TicTacToeState class. Using this, complete the implementation of the minimax algorithm for two-player search. Add a wrapper around this that allows a person to play tic-tac-toe against the computer.
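  A sketch of plain minimax. The method names isTerminal, utility, and successor (here assumed to yield (move, child) pairs) are guesses at the TicTacToeState interface, so adapt them to the actual starter code. The play wrapper then just alternates between reading a human move and calling minimax for the computer's reply.

  ```python
  def minimax(state, maximizing):
      # Returns (value, best_move) for the player to move.
      if state.isTerminal():
          return state.utility(), None
      best_move = None
      if maximizing:
          best_value = float("-inf")
          for move, child in state.successor():
              value, _ = minimax(child, False)
              if value > best_value:
                  best_value, best_move = value, move
      else:
          best_value = float("inf")
          for move, child in state.successor():
              value, _ = minimax(child, True)
              if value < best_value:
                  best_value, best_move = value, move
      return best_value, best_move
  ```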
- (686 students only) In the late 1990s, Deep Blue shocked the world by becoming the first computer to defeat a reigning world chess champion, Garry Kasparov, in a match. This paper describes how Deep Blue was constructed: it took advantage of specialized hardware, along with hand-crafted heuristics and many optimizations of the alpha-beta pruning technique we've learned about.
Twenty years later, Google's DeepMind team revolutionized game search again with the development of AlphaZero, which is described here.
AlphaZero takes a very different approach: a deep neural network learns heuristic functions through self-play. (We'll look at reinforcement learning later in the semester.) This lets the program learn to play any game, as long as it knows the state space, the legal actions, and what counts as a win.
These articles are both pretty dense, and I don't expect you to grasp every nuance.
a) What were the engineering advances that led to Deep Blue's success? Which of them can be transferred to other problems, and which are specific to chess?
b) AlphaZero is compared to a number of modern game-playing programs, such as Stockfish, which work similarly to Deep Blue. The Science paper shows that AlphaZero is able to defeat Stockfish even when given only 1/100 of the computing time. Why is that? Please frame your answer in terms of search and the number of nodes evaluated.