Plan du site  
pixel
pixel

Articles - Étudiants SUPINFO

Adversarial Research in Artificial Intelligence

Par Seifeddine AMMAR Publié le 27/10/2019 à 11:48:36 Noter cet article:
(0 votes)
Avis favorable du comité de lecture

Introduction

Have you ever wondered how a computer can play games like chess, X/O, Go, etc. intelligently?

In this article we will discuss how to make board games using one of the artificial intelligence algorithms Min-Max.

What is Adversarial Search ?

Adversarial search is a search method to find the best move in a game made up of two players.This method is used in games where one player's lead over the other is measurable, and a player's win is a loss to the other and thus the sum is 0 hence the appellation zero-sum.This method is usually used in board games such as X/O, chess, etc.

The Min-Max algorithm is common in this field and an improved version of it is the called Alpha-Beta Pruning.

Let's start by defining some of the terms used in this algorithm, and then move on to understanding how the algorithm works:

Initial State:It is the state of the game before the implementation of the algorithm.

Successor Function:It is the function responsible for generating possibles moves (Legal Moves) in a particular state of the game.

Terminal Test:It is a test that determines whether the game reached it's end by one player's win or a draw.

Evaluation Function:A function that gives a numeric value to an assessment that identifies the advanced player in a particular state of the game.

In our application of the algorithm, the main problems that we must solve are:

  • Finding a way to represent the game board.

  • Generate all the Legal Moves.

  • Evaluate a particular situation and determine the leading player and assess his lead.

How does the algorithm works ?

The algorithm is based on two principles:

  1. When it's my turn to play (the computer), I will try to choose a movement that increases my assessment and my chance of winning.

  2. When it's my opponent's turn to play, he will try to choose a movement that reducesmy assessment and chance of winning.

These two principles are what's behind the naming Min-Max, Max is where I'm trying to maximize my assessment and Min is where the opponent is trying to minimize it.

In order to choose the move, the algorithm tests all the available moves using the Successor Function, and for each of those moves, the next available moves are tested and so on, until we reach a certain depth previously defined. When this depth is reached the game state is evaluated through the Evaluation Function.The best move is the one that led to the game's state with the highest assessment.

To facilitate the topic, we can represent the process with a Game Tree, so that:

  • Nodes will represent game states

  • Edges will represent the available moves that takesthe game from one state to another.

  • The Root node will represent the Initial State.

    Initial state means it's the computer's time to play, so it has to chose the move that will gives it the highest assessment that's why the node in this level is a Max node.

    The level that preceeds the root node will be the opponent's move, so we will assume that the opponent will choose his best move, which will decrease the computer's assessment, so the nodes in this level are called Min nodes.

    This cycle continues until the desired depth is reached or the game ends.

  • Leaves or Terminal Nodes will be the nodes where we will evaluate the game state.

We will try to illustrate more with pictures, let's take the following game tree from an in progress X/O game:

In this game, the computer plays the role of X.

As we said earlier, the Root represents the current state of the game and the state in which the computer has to make a move.The algorithm will try all the available moves to choose the move that gives it the best value, and for each of these moves it will try the available moves that follow and assume that the opponent will choose his best move which will be the move that gives the computer the least value and so on ..

In the picture, the labels on the left Max and Min at each level shows the nature of the move to be chosen at that state, in the first level "Root", the move that gives the computer the best rating will be chosen, so the lable is Max, and the label at the second level, where it's the opponent turn the algorithm will try to choose the move that gives the least rating to the computer, is Min. The roles are swapped until we reach the desired Depth or the game ends.

When we get to the leaf, node 4 for example, the game state is evaluated and because the computer won it gets infinity rating, and since node 3 is Max will choose the highest and only value coming from node 4.

Take node 2 as another example, because Min node will choose the lowest value between 3 and 5, however choosing either one of them makes no diffrence since they both have an infinity value that expresses the computer's victory, meaning the opponent's defeat is inevitable.

As for node 7, and because it is a Min node, the lowest value assigned between nodes 8 and 9 will be chosen, which is 8 with a negative value of infinity which expresses the computer's loss and announces the winning of the opponent.

In node 11, which is a Min Node the algorithm will choose the lowest value assigned between nodes 12 and 13, which is 12 because it is also negative infinity.

Let's now go back to the Root Node, where the computer has to choose it's next move. There are 3 options available which are infinity (from node 2) and negative infinity (from nodes 7 and 11). And because Root Node is Max Node, the algorithm will choose the highest value which is positive infinity coming from node 2.

The algorith decided that the best move for the computer is to put the X in the middle like node 2 as shown in the figure below.

Now that we understand how the algorithm works, let's take a look at Pseudo-code and understand it:

 function minimax(node, depth, maximizingPlayer) is 
    
    if depth = 0 or node is a terminal node then    
        return the heuristic value of node
        
    if maximizingPlayer then 
        value := -∞ 
        for each child of node do 
            value := max(value, minimax(child, depth - 1, FALSE))
        return value
        
    else (* minimizing player *)
        value := +∞ 
        for each child of node do 
            value := min(value, minimax(child, depth - 1, TRUE))
        return value
                

The algorithm receives as input the current state of the game (represented by a node) and the required depth.

The maximizingPlayer variable determines whether the current Node is Max Node or Min Node.

In line 2, we check if we reach the desired Depth or if the game has reached an ending with a player's win or a draw, if so we will return the node rating (using the Evaluation Function as mentioned).

At line 4 the coding part of the Max Node begins. The algorithm will go through all the available moves (Legal Moves) from the current node, and in each move we call the same function with the change of the value of maximizingPlayer to FALSE because the next turn will be the opponent's move which is a Min Node.

At each iteration, the value of Depth has to be decreased by 1. After all the moves has received an evaluation, the move that gives the computer the best evaluation will be chosen.

As for Min node, it's not very different from the previous. The algorithm has to see through all the available moves from the current node, and in each move we call the same function with the change of the value of maximizingPlayer to TRUE since the next turn will be for the computer (Max Node).

At each iteration the value of Depth is decreased by 1. After all the moves has received an evaluation, we will assume that the opponent will play his best move, the move that gives the computer the worst evaluationit will be chosen.

Conclusion

The difficulty of the game can be conrolled by selecting the depth. The highest the value of depth, the more difficult the game will be and vice versa. In X/O, there are very few legal moves, it's called Branching Factor in the game tree representation, so it's possible to set the depth to 9 so that the algorithm can always reach the end of the game without noticing any slowing with the game or the computer unlike complex games where the number of moves is way higher, like chess for example.

That's why there is an optimization of the Min-Max Algorithm that ignores the calculations of the unnecessary nodes. This algorithm is called Alpha-Beta Pruning.

A propos de SUPINFO | Contacts & adresses | Enseigner à SUPINFO | Presse | Conditions d'utilisation & Copyright | Respect de la vie privée | Investir
Logo de la société Cisco, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société IBM, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Sun-Oracle, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Apple, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Sybase, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Novell, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Intel, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Accenture, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société SAP, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Prometric, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo de la société Toeic, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management Logo du IT Academy Program par Microsoft, partenaire pédagogique de SUPINFO, la Grande École de l'informatique, du numérique et du management

SUPINFO International University
Ecole d'Informatique - IT School
École Supérieure d'Informatique de Paris, leader en France
La Grande Ecole de l'informatique, du numérique et du management
Fondée en 1965, reconnue par l'État. Titre Bac+5 certifié au niveau I.
SUPINFO International University is globally operated by EDUCINVEST Belgium - Avenue Louise, 534 - 1050 Brussels