In this project, we develop an AI (online demo here) which can clear lines indefinitely in a single game of Tetris (I had to stop it because my laptop was overheating).
Definition of the game
Perhaps the most interesting observation I made in the course of developing the AI was this: despite the existence of the “standard” Tetris guideline, there are still many variants of the game. If you were to play a game of Tetris online, chances are that it won’t be the “standard” one.
For this AI, I’ve stuck to the “official” guideline as closely as possible. For the sake of reducing ambiguity, here are the rules I’ve adhered to:
- The Tetris grid is 10 cells wide and 22 cells tall, with the top 2 rows hidden.
- All Tetrominoes (Tetris pieces) will start in the middle of the top 2 rows.
- There are 7 Tetrominoes: “I”, “O”, “J”, “L”, “S”, “Z” and “T”.
- The Super Rotation System is used for all rotations.
- The “7-bag” random generator is used to randomize the next pieces: all 7 Tetrominoes are dealt out in random order before the bag is reshuffled.
- One lookahead piece is allowed (the player knows what the next piece will be).
The table below shows all the possible pieces under these rules.
The best possible move
Our goal is to clear as many lines as possible, and therefore, to make as many moves as possible.
To meet this goal, our AI will decide the best move for a given Tetris piece by trying out all the possible moves (rotation and position). It computes a score for each possible move (together with the lookahead piece), and selects the one with the best score as its next move.
The score for each move is computed by assessing the grid the move would result in. This assessment is based on four heuristics: aggregate height, complete lines, holes, and bumpiness, each of which the AI will try to either minimize or maximize.
Aggregate height
This heuristic tells us how “high” a grid is. To compute the aggregate height, we take the sum of the heights of all the columns (the height of a column being the distance from its highest tile to the bottom of the grid).
We’ll want to minimize this value, because a lower aggregate height means that we can drop more pieces into the grid before hitting the top of the grid.
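As a concrete sketch (assuming the grid is stored as a list of rows, top row first, with 0 for empty cells; the function names are my own, not from the project):

```python
def column_height(grid, col):
    """Distance from the highest filled cell in `col` to the bottom of the grid."""
    for row in range(len(grid)):
        if grid[row][col]:
            return len(grid) - row
    return 0  # an empty column contributes no height

def aggregate_height(grid):
    """Sum of the heights of every column."""
    return sum(column_height(grid, c) for c in range(len(grid[0])))
```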
Complete lines
This is probably the most intuitive heuristic of the four. It is simply the number of complete lines in a grid.
We’ll want to maximize the number of complete lines, because clearing lines is the goal of the AI, and clearing lines will give us more space for more pieces.
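Under the same assumed grid representation (rows of 0/1 values, top row first), this heuristic is a one-liner:

```python
def complete_lines(grid):
    """Number of rows in which every cell is filled."""
    return sum(1 for row in grid if all(row))
```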
Holes
A hole is defined as an empty space with at least one tile above it in the same column.
A hole is harder to clear, because we have to clear all the lines above it before we can reach the hole and fill it up. We therefore want to minimize the number of holes.
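A sketch of the hole count under the same assumed grid layout: scan each column from the top down and count empty cells that appear below the first tile.

```python
def holes(grid):
    """Count empty cells that have at least one tile above them in the same column."""
    count = 0
    for col in range(len(grid[0])):
        tile_seen = False
        for row in range(len(grid)):
            if grid[row][col]:
                tile_seen = True
            elif tile_seen:
                count += 1  # empty cell covered by a tile somewhere above
    return count
```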
Bumpiness
Consider a case where a deep “well” forms in the grid:
The presence of such wells indicates that lines which could be cleared easily are being left uncleared. If a well were to be covered, every row the well spans would become hard to clear. To generalize the idea of a “well”, we define a heuristic which I shall name “bumpiness”.
The bumpiness of a grid measures the variation in its column heights. It is computed by summing the absolute differences in height between every pair of adjacent columns.
To keep the top of the grid as even as possible, the AI will try to minimize this value.
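A sketch of the bumpiness computation, reusing the same assumed row-based grid layout (column heights measured from the bottom of the grid):

```python
def bumpiness(grid):
    """Sum of absolute height differences between adjacent columns."""
    heights = []
    for col in range(len(grid[0])):
        height = 0
        for row in range(len(grid)):
            if grid[row][col]:
                height = len(grid) - row  # highest tile found in this column
                break
        heights.append(height)
    return sum(abs(a - b) for a, b in zip(heights, heights[1:]))
```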
Putting the heuristics together
We now compute the score of a grid by taking a linear combination of our four heuristics. It is given by the following score function:

score = a · (aggregate height) + b · (complete lines) + c · (holes) + d · (bumpiness)

where a, b, c and d are constant parameters.
We want to minimize the aggregate height, the number of holes and the bumpiness, so we can expect a, c and d to be negative. Similarly, we want to maximize the number of complete lines, so we can expect b to be positive.
I used a Genetic Algorithm (GA) (explained in full detail below) to produce the following optimal set of parameters:

a = −0.510066, b = 0.760666, c = −0.35663, d = −0.184483
By using this set of parameters in the score formula, the AI picks the best possible move by exhausting all the possible moves (including those of the lookahead piece) and selecting the one that yields the highest score.
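The selection step can be sketched as follows. Here `moves` is assumed to be a list of (move, heuristics) pairs, where `heuristics` is the 4-tuple (aggregate height, complete lines, holes, bumpiness) of the grid the move produces; the weights below are illustrative placeholders with the signs discussed above, not the tuned values.

```python
WEIGHTS = (-0.5, 0.76, -0.36, -0.18)  # placeholder values for (a, b, c, d)

def score(heuristics, weights=WEIGHTS):
    """Linear combination of the four heuristic values."""
    return sum(w * h for w, h in zip(weights, heuristics))

def best_move(moves, weights=WEIGHTS):
    """Pick the move whose resulting grid scores highest."""
    return max(moves, key=lambda mh: score(mh[1], weights))[0]
```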
The Genetic Algorithm
Parameter sets, 4D vectors and the unit 3-sphere
Each possible set of parameters (a, b, c, d) can be represented as a vector p = (a, b, c, d) in ℝ⁴.
A standard Genetic Algorithm for real-valued genes would search the entire solution space (ℝ⁴) for an optimal set of parameters. In this project, however, we need only consider points on the unit 3-sphere (the unit sphere in 4-dimensional space). This is because the score function defined above is a linear functional, and the comparison outcomes it produces are invariant under scaling.
To understand this from a more mathematical point of view, first rewrite the score function in vector form as

score = p · h

where p = (a, b, c, d) is the parameter vector and h is the vector of the four heuristic values of the resulting grid.

Suppose we want to compare two moves, move 1 and move 2, to decide which is “better” (i.e. gives a higher score). Suppose also that moves 1 and 2 produce the heuristic vectors h₁ and h₂ respectively. To check whether move 1 is better than move 2, we check whether

p · h₁ − p · h₂ = p · (h₁ − h₂) > 0.

If p · (h₁ − h₂) is positive, move 1 is the better move; if it is negative or zero, move 2 is at least as good. Here we can see how the comparison outcome is invariant under scaling of the parameters: multiplying p by any positive real constant k leaves the sign of k · p · (h₁ − h₂) unchanged (only the magnitude differs, which is irrelevant), so the decision does not change.
All parameter vectors pointing in the same direction produce equivalent results. As such, it is sufficient to consider a single vector for each direction, which justifies restricting the search to points on the surface of the unit 3-sphere: the surface of the sphere covers all possible directions.
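The argument is easy to check numerically; the sketch below verifies, for random vectors, that scaling the parameter vector by a positive constant never flips the outcome of a comparison (all names here are my own):

```python
import random

def dot(p, h):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(p, h))

random.seed(0)
p = [random.uniform(-1, 1) for _ in range(4)]   # parameter vector
h1 = [random.uniform(0, 10) for _ in range(4)]  # heuristics of move 1
h2 = [random.uniform(0, 10) for _ in range(4)]  # heuristics of move 2
for k in (0.5, 2.0, 100.0):  # positive scalings only
    kp = [k * x for x in p]
    assert (dot(p, h1) > dot(p, h2)) == (dot(kp, h1) > dot(kp, h2))
```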
We’ll now move on to define the fitness function, selection procedure, crossover operator, mutation operator and recombination procedure used for tuning the AI.
The population is first initialized with 1000 unit parameter vectors.
For each parameter vector, we run the AI through 100 games, each with at most 500 Tetromino pieces (i.e. at most 500 moves). We let the fitness of a parameter vector be the total number of lines cleared across these games. The worst possible fitness is 0, in which case the AI clears no lines in any of the games.
Parent individuals are selected for reproduction using tournament selection. We select 10% (= 100) of the population at random, and the two fittest individuals in this subpool undergo crossover to produce a new offspring. This process is repeated until the number of new offspring reaches 30% of the population size (= 300).
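A minimal sketch of this selection step, assuming the population is a list of (vector, fitness) pairs (function and parameter names are illustrative):

```python
import random

def tournament_select(population, fraction=0.1):
    """Sample a fraction of the population and return its two fittest members."""
    pool_size = max(2, int(len(population) * fraction))
    pool = random.sample(population, pool_size)
    pool.sort(key=lambda individual: individual[1], reverse=True)
    return pool[0], pool[1]
```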
Weighted average crossover
For crossover, the two parent vectors are combined using a fitness-weighted average. If the parents are p₁ and p₂ with fitnesses f₁ and f₂, the offspring vector is

offspring = f₁ · p₁ + f₂ · p₂

which is then normalized to a unit vector.
In essence, the offspring vector lies in between the two parent vectors on the unit 3-sphere, but leans towards the fitter parent by an amount that increases with the difference between the fitness of the two parents. This reflects the favoritism of the crossover operator towards the fitter parent.
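A sketch of this crossover under the same (vector, fitness) representation; the fitness-weighted sum follows the description above, though the exact weighting used in the project is an assumption on my part:

```python
import math

def crossover(parent1, fitness1, parent2, fitness2):
    """Fitness-weighted sum of two parent vectors, renormalized to unit length."""
    child = [fitness1 * a + fitness2 * b for a, b in zip(parent1, parent2)]
    norm = math.sqrt(sum(x * x for x in child))
    return [x / norm for x in child]
```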
Each newly produced offspring is given a small chance (5%) to mutate. If it does mutate, a random component of the offspring vector is adjusted by a random amount of up to ±0.2. The vector is then normalized back to unit length.
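A sketch of the mutation operator (the 5% rate comes from the text; the ±0.2 bound and all names are my own assumptions):

```python
import math
import random

def mutate(vector, rate=0.05, amount=0.2):
    """With probability `rate`, nudge one random component, then renormalize."""
    vector = list(vector)
    if random.random() < rate:
        index = random.randrange(len(vector))
        vector[index] += random.uniform(-amount, amount)
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]
```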
Since we are working with unit vectors, I would have preferred to rotate the offspring vectors by a small random angle. However, unlike rotations in 3D, rotations in 4D are a lot more perplexing, so I opted for the simpler mutation operator described above.
After the number of offspring produced reaches 30% (= 300) of the original population size, the weakest 30% of the population are deleted and replaced by the new offspring, forming the next generation. The population size remains the same (1000). The entire process is then repeated until the population is fit enough, after which the algorithm terminates.
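The replacement step can be sketched as follows, again treating individuals as (vector, fitness) pairs and assuming the offspring have already been evaluated:

```python
def next_generation(population, offspring):
    """Drop the weakest individuals and append the new offspring, keeping size fixed."""
    survivors = sorted(population, key=lambda individual: individual[1], reverse=True)
    survivors = survivors[:len(population) - len(offspring)]
    return survivors + list(offspring)
```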
Over the course of this project I tried out some other heuristics, such as the highest column height and the number of consecutive holes.
It turned out that some of them were redundant, and some were replaced by heuristics that produced better results (for example, the highest column height was replaced by the aggregate height).
I would also like to stress that this AI was tuned for a specific set of rules (I defined them in the first few sections of this article). The results may or may not be the same for a different set of rules.
For example, using a naive random piece generator instead may produce an obscenely long run of “S” or “Z” pieces, which makes the game much harder for the AI. Allowing more lookahead pieces would let the AI plan more complex moves, in which case it would probably perform better (if properly tuned).
As such, when comparing the results of different Tetris AIs, close attention should be paid to the set of rules used, so that the comparison is fair.