Skip to content

Nicola-Bini/2048_solver

Repository files navigation



Screen Shot 2021-09-25 at 1 21 14 PM



The Game

First released in 2014 on Github by Gabriele Cirulli, 2048 is a game where the player has to slide numbers on a 4x4 grid to merge them into bigger numbers. Every time a move is completed, the game add a new number to the grid. The objective is to obtain a cell with a value of 2048 in it, although it is possible to reach a higher number of 2048.

Additional rules:

  • There are 4 possible moves (directions) the player can choose from: up, down, right, left.
  • A move has to make a change in the board in order to be valid (i.e. the grid before and after the move must be different).
  • A new number is added to the grid after a valid move is completed.
  • After a valid move, a new number is inserted to the grid. it is '2' with a probability of 0.9 and a '4' with a probability of 0.1.
  • The player loses when the are no valid moves left

State space

Calculating the whole state space for 2048 is not an easy task, in fact the complete enumeration of states still remains an open problem. In 2017, John Lees-Miller tried to calculate it using a brute force approach, and he managed to compute all the possible states up to the tile 64. It took his decent computer (OVH HG-120 instance with 32 cores at 3.1GHz and 120GB RAM) more than a month and he could only conclude that the state space is much greater than 1.3 trillion states (≫ 1.3 trillion).

Actions

The set of actions is defined as Ai ⊆ M = {up, down, left, right}. Where Ai is the set of allowed moves a player can choose from when the game is in the state Gi. For example, in Figure 1 the player can choose between the whole set of moves M = {up, down, left, right}, in this case Ai = M. In Figure 2, the player can only play 2 actions Ai = {up, left}.






State Transitions

In 2048 the transition between states is stochastic, meaning that given a state of the game and a certain move, there are multiple possible resulting game states, depending on where the new number is spawned and whether is a '2' or a '4'. Take as example the following game state:




Say that the player chooses the move 'left'. There are 28 possible game states Gi that can result from the move, because there are 14 empty spots where the new number can appear and the new number is taken from the set N = {2, 4}.





Score

In the original game, the score starts from 0 and everytime a player merges two tiles, the sum of the 2 numbers is added to the score. For example, merging together two tiles of '2' would increase the score by 4, merging together two tiles of '16' would increase the score by +\32.
However, in this version of the game the score of a state of the game the score Si is given by summing up the values of all the tiles. (see examples below)





Random walk

A random walk is a random process made of a sequence of random steps. In 2048, the evolution of the score follows a random walk. If there is at least one valid move (i.e. it is not game over), the score of the State Si+1 is given by the score of the previous State Si +2 with a probability of 0.9 or +4 with a probability of 0.1. If it is game over, the score will remain the same.



Screen Shot 2021-09-25 at 6 05 30 PM



Monte Carlo

Monte Carlo methods are a category of algorithms that rely on random sampling to approximate a result that cannot be directly calculated. Sampling a large number of scenarios makes possible to estimate boundary conditions that are difficult or impossible to define a priori.

For example, Nicoguaro used Monte Carlo to estimate the value of π by drawing points uniformly in a quadrant. Then, by calculating the distance of each point from the origin he determined if such point is within the boundary of the circle or not. Finally, he computed the percentage of points within the boundary over the total points, and he multiplied it by 4 to obtain an estimate of π.

image source

Markov Chain Monte Carlo (MCMC)

Monte Carlo in 2048 can help us determine what is the best move to make next. Since there is only a maxmimum of 4 moves, we can use Monte Carlo to estimate what it will be the average score of the after m moves, if the player decides to make a certain move on the next step.
So, for each valid move Ai ⊆ M = {up, down, left, right}, the algorithm will generate n ∈ ℕ simulations, and in each simulation will take make m ∈ A ⊆ M random moves. Then it will calculate the average for score for each valid move. (See image)



Screen Shot 2021-09-25 at 6 05 30 PM


Screen Shot 2021-09-25 at 6 05 30 PM



Thanks to the Central Limit Thereom, we know that the average score calculated for each valid move is normally distributed, and its standard deviation decreases as the sample size (number of simulations) increases. In fact, the score does not always follow a normal distribution.

Example of a sample distribution that does not follow a normal distribution:


Screen Shot 2021-09-25 at 6 05 30 PM



Comparing sample distributions

Increasing the number of samples will increase the confidence of the model's decision. The picture below shows the difference in the sample distribution when increasing the sample size from n=100 to n=10000. (with s = 1000, A = {up, down, left, right}, and m = 16)

Screen Shot 2021-09-25 at 6 28 22 PM

final_graph2



Final result

Even though, the distribution is quite similar, the confidence of the model would increase a lot when increasing the sample size from 100 to 10000. Here there are two simulations, one made with n=100 and the other with n=1024.

n = 100 n = 1024
n.100.mov
n.1024.mov


Sources

https://en.wikipedia.org/wiki/2048_(video_game)
https://jdlm.info/articles/2017/12/10/counting-states-enumeration-2048.html

About

Solving 2048 using Markov Chain Monte Carlo

Resources

License

Stars

Watchers

Forks

Languages