If you want to experiment with your own Tetris AI, or just understand how one works, this post is for you.
First, I want to point out that I designed my AI to be as universal as possible:
- it works for all playfield dimensions
- any type of pieces could be added or removed (not only tetrominos)
For example, here you can see the AI playing on a 16x16 playfield with some unusual pieces:
So how does it work? The main structure of the algorithm acts as a BFS. The idea is the following: given the current playfield state and the tetrominos to come, we explore the tree of possible moves, evaluate their relevance and keep the best one to play.
But what is a move in Tetris? Is a tetromino rotation or shift considered a move? The answer is no. A move in Tetris is the action of locking a tetromino in a stable position on the playfield. It is relevant to think of a move this way, because it is only when a tetromino is locked that important events occur (lines collapse, game over), and as a result we can evaluate whether the situation is good for us or not.
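To make the idea concrete, here is a minimal, self-contained sketch of move enumeration (not my exact code). It assumes the playfield is a list of rows of 0/1 cells and a piece is a set of (row, col) offsets; a "move" is then every locked placement reachable by choosing a rotation and a column and dropping the piece until it rests:

```python
def normalize(piece):
    """Shift a piece so its offsets start at (0, 0)."""
    rmin = min(r for r, _ in piece)
    cmin = min(c for _, c in piece)
    return frozenset((r - rmin, c - cmin) for r, c in piece)

def rotations(piece):
    """All distinct 90-degree rotations of a piece, normalized to the origin."""
    seen, cur = [], piece
    for _ in range(4):
        norm = normalize(cur)
        if norm not in seen:
            seen.append(norm)
        cur = {(c, -r) for r, c in cur}  # rotate 90 degrees clockwise
    return seen

def fits(field, piece, top, left):
    """True if the piece can occupy the grid at this offset without overlap."""
    for r, c in piece:
        rr, cc = top + r, left + c
        if rr < 0 or rr >= len(field) or cc < 0 or cc >= len(field[0]):
            return False
        if field[rr][cc]:
            return False
    return True

def enumerate_moves(field, piece):
    """Yield every playfield state reachable by locking `piece`."""
    width = len(field[0])
    for rot in rotations(piece):
        piece_width = max(c for _, c in rot) + 1
        for left in range(width - piece_width + 1):
            if not fits(field, rot, 0, left):
                continue  # this column is blocked at the top
            top = 0
            while fits(field, rot, top + 1, left):
                top += 1  # drop until the piece rests
            new = [row[:] for row in field]
            for r, c in rot:
                new[top + r][left + c] = 1
            yield new
```

Note that nothing here is tetromino-specific: any set of cell offsets works as a piece, which is exactly what makes the approach universal.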
Then, we need to set a limit on the depth of search, both because a Tetris game can simply be infinite and because we want to limit the time the machine spends on this task. For practical considerations, we will assume here that the depth of search is at most the number of tetrominos that are previewed. Sometimes, the depth can simply be 1 (we consider only the current tetromino).
An example of a partial game tree of depth 2, starting from an empty playfield
Last clarification: there is no perfect situation evaluation, simply because our information about the future is limited (by the arbitrary depth of search). That's why we will talk about heuristics for situation evaluation.
Now that we have the relevant concepts, we can look at the process more precisely. Given a playfield state (empty at the start), the current tetromino to play and the next tetrominos in preview, the algorithm does the following:
1: construction of the game tree of the chosen depth:
Depth 0: the current playfield state.
Depth 1: the generation of all the playfield states given by all possible locked positions of the current tetromino, starting from the current playfield.
Depth 2: the generation of all the playfield states given by all possible locked positions of the next tetromino, starting from each playfield built at depth 1.
Consider that for a playfield of width 10, your tree contains on average about 30^n nodes, where n is your chosen depth of search.
2: use of an appropriate heuristic to score all the terminal nodes of the game tree previously generated.
3: selection of the node with the best score, then going up through that branch of the game tree until the node of depth 1 is reached. This gives the move for the current tetromino that could potentially lead to the best playfield state.
4: generation of the relevant inputs (rotation, shifting...) in order to play the current tetromino correctly according to the move previously found.
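Steps 1 to 3 can be sketched as a depth-limited recursive search. In this hypothetical version, `moves(state, piece)` yields every state reachable by locking `piece` on `state`, and `score(state)` is the heuristic (lower is better); both are passed in as plain functions, so the search itself stays independent of the playfield representation:

```python
def best_first_move(state, pieces, moves, score):
    """Return (best terminal score, depth-1 state to reach it).

    `pieces` is the current tetromino followed by the previewed ones,
    so its length is the depth of search.
    """
    best = (float("inf"), None)
    for child in moves(state, pieces[0]):
        if len(pieces) == 1:
            s = score(child)  # terminal node: evaluate the heuristic
        else:
            s, _ = best_first_move(child, pieces[1:], moves, score)
        if s < best[0]:
            best = (s, child)  # keep the branch holding the best leaf
    return best
```

Returning the depth-1 child rather than the terminal node is exactly step 3: we only need to know which first move leads toward the best evaluated leaf.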
A few relevant remarks at this point. First, this process is very time-consuming, so if the AI has to play in real time, it is reasonable to keep your depth of search at 2 at most. Second, the input generation is an interesting part of the algorithm; by tweaking it you can add some personality to your virtual player in versus games. For example, by allowing or disallowing the quick fall and the hard drop, you can give the illusion of playing against a beginner or an expert.
Now let's talk about the most interesting part: the heuristic. This is where every developer can arrive at different results. As previously said, there is no perfect heuristic. What you need is to translate your own experience of playing the game into an algorithm. Here is the list of components I used to build mine:
- h0 = the number of holes in the playfield
- h1 = the altitude of the highest full cell in the playfield
- h2 = the number of full cells in the playfield
- h3 = the value of the steepest slope in the playfield
- h4 = the roughness
- h5 = the number of full cells in the playfield weighted by their altitude
Notice that for each component, the higher the value, the worse the playfield state. Then we just need to combine them in order to finely express the balance between them. The simplest way to do it is to use a linear form:
score = c0*h0 + c1*h1 + c2*h2 + c3*h3 + c4*h4 + c5*h5
where c0, c1, c2, c3, c4, c5 are fixed numeric values
Of course it is possible to use more general polynomial formulas, depending on the strategy you want to express. In practice I use a very simple version, where I make holes in the playfield the worst thing by applying a coefficient of 20:
score = 20*h0 + h1 + h2 + h3 + h4 + h5
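Here is a sketch of how such a heuristic can be computed on a 0/1 grid (row 0 at the top). To keep it short, only h0, h1 and h2 are shown; the slope, roughness and altitude-weighted terms follow the same column-scanning pattern:

```python
def score(field):
    """Heuristic score of a playfield: 20*h0 + h1 + h2 (lower is better)."""
    height = len(field)
    holes = full_cells = top = 0
    for col in range(len(field[0])):
        seen_full = False
        for row in range(height):
            if field[row][col]:
                seen_full = True
                full_cells += 1
                top = max(top, height - row)  # altitude of this cell
            elif seen_full:
                holes += 1  # empty cell with a full cell above it
    return 20 * holes + top + full_cells
```

Counting holes and heights in the same pass over each column keeps the evaluation cheap, which matters since it runs on every terminal node of the tree.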
And surprisingly, it works very well even with a depth of search of 1.
Hoping it can inspire you.