Alpha Beta Pruning in Minimax Algorithm

Reading time: minutes | Coding time: minutes

Like we discussed in earlier article, a hard coded AI can be used to create an intelligent opponent which you can challenge to a game of classical tic-tac-toe!

But, that algorithm could still be optimised further. But how do we achieve that? The Alpha Beta Pruning is a search algorithm that tries to diminish the quantity of hubs that are assessed by the minimax algorithm in its search tree. It is an antagonistic search algorithm utilized usually for machine playing of two-player recreations (Tic-tac-toe, Chess, Go, and so forth.).

It stops totally assessing a move when no less than one probability has been observed that ends up being more regrettable than a formerly analyzed move. Such moves require not be assessed further. At the point when connected to a standard minimax tree, it restores an indistinguishable move from minimax would, however prunes away branches that can't in any way, shape or form impact an official conclusion!


You must be thinking what does this figure really signifies. Well, it shows how the alogorithm ignores the sub trees which are not really desireable in our game moves. The algorithm keeps up two qualities, alpha and beta, which speak to the base score that the expanding player is guaranteed of and the greatest score that the limiting player is guaranteed of separately. At first alpha is negative infinity and beta is sure infinity, i.e. the two players begin with their most exceedingly terrible conceivable score. At whatever point the most extreme score that the limiting player(beta) is guaranteed of turns out to be not as much as the base score that the expanding player(alpha) is guaranteed of (i.e. beta <= alpha), the augmenting player require not consider the relatives of this hub as they will never be come to in genuine play.


Algorithm


Note :

  • Alpha Beta Pruning is not a new algorithm but actually an optimization!

  • Alpha is the best value that the maximizer currently can guarantee at that level or above.

  • Beta is the best value that the minimizer currently can guarantee at that level or above.

And ofcourse, we are using Tic Tac Toe as our reference example!

def minimax(depth, player, alpha, beta)
  if gameover or depth == 0
    return calculated_score
  end
  children = all legal moves for player
  if player is AI (maximizing player)
    for each child
      score = minimax(depth - 1, opponent, alpha, beta)
      if (score > alpha)
        alpha = score
      end
      if alpha >= beta
        break
      end
      return alpha
    end
  else #player is minimizing player
    best_score = +infinity
    for each child
      score = minimax(depth - 1, player, alpha, beta)
      if (score < beta)
        beta = score
      end
      if alpha >= beta
        break
      end
      return beta
    end
  end
end

#then you would call it like
minimax(2, computer, -inf, +inf)

Complexity


  • Worst case Performance : O(|E|) = O(b^(d/2))
  • Worst case space complexity : O(|V|) = O(b * d)

Implementations


  • Next Best Move Guesser- Python

Interactive console game- Python


import numpy as np
import sys
from copy import copy

rows = 3
cols = 3

board = np.zeros((rows,cols))

inf = float("inf")
neg_inf = float("-inf")

def printBoard():
    for i in range(rows):
        for j in range(cols):
            if board[i, j] == 0:
                sys.stdout.write(' _ ')
            elif board[i, j] == 1:
                sys.stdout.write(' X ')
            else:
                sys.stdout.write(' O ')
        print ''

# The heuristic function which will be evaluating board's position for each of the winning POS
heuristicTable = np.zeros((rows+1, cols+1))
numberOfWinningPositions = rows + cols + 2

for index in range(rows+1):
    heuristicTable[index, 0] = 10**index
    heuristicTable[0, index] =-10**index

winningArray = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [0, 3, 6], [1, 4, 7], [2, 5, 8], [0, 4, 8], [2, 4, 6]])


def utilityOfState(state):
    stateCopy = copy(state.ravel())
    heuristic = 0

    for i in range(numberOfWinningPositions):
        maxp = 0
        minp = 0
        for j in range(rows):
            if stateCopy[winningArray[i,j]] == 2:
                maxp += 1
            elif stateCopy[winningArray[i,j]] == 1:
                minp += 1
        heuristic += heuristicTable[maxp][minp]

    return heuristic

def minimax(state,alpha,beta,maximizing,depth,maxp,minp):

    if depth==0:
        return utilityOfState(state),state

    rowsLeft,columnsLeft=np.where(state==0)
    returnState=copy(state)
    if rowsLeft.shape[0]==0:
        return utilityOfState(state),returnState

    if maximizing==True:
        utility=neg_inf
        for i in range(0,rowsLeft.shape[0]):
            nextState=copy(state)
            nextState[rowsLeft[i],columnsLeft[i]]=maxp
            #print 'in max currently the Nextstate is ',nextState,'\n\n'
            Nutility,Nstate=minimax(nextState,alpha,beta,False,depth-1,maxp,minp)
            if Nutility > utility:
                utility=Nutility
                returnState=copy(nextState)
            if utility>alpha:
                alpha=utility
            if alpha >=beta :
                #print 'pruned'
                break;

        #print 'for max the best move is with utility ',utility,' n state ',returnState
        return utility,returnState

    else:
        utility=inf
        for i in range(0,rowsLeft.shape[0]):
            nextState=copy(state)
            nextState[rowsLeft[i],columnsLeft[i]]=minp
            #print 'in min currently the Nextstate is ',nextState,'\n\n'
            Nutility,Nstate=minimax(nextState,alpha,beta,True,depth-1,maxp,minp)
            if Nutility < utility:
                utility=Nutility
                returnState=copy(nextState)
            if utility< beta:
                beta=utility
            if alpha >=beta :
                #print 'pruned'
                break;
        return utility,returnState

def checkGameOver(state):
    stateCopy=copy(state)
    value=utilityOfState(stateCopy)
    if value >=1000:
        return 1
    return -1


def main():
    num=input('enter player num (1st or 2nd) ')
    value=0
    global board
    for turn in range(0,rows*cols):
        if (turn+num)%2==1: #make the player go first, and make the user player as 'X'
            r,c=[int(x) for x in raw_input('Enter your move ').split(' ')]

            board[r-1,c-1]=1
            printBoard()
            value=checkGameOver(board)
            if value==1:
                print 'U win.Game Over'
                sys.exit()
            print '\n'
        else: #its the computer's turn make the PC always put a circle'
            #right now we know the state if the board was filled by the other player 
            state=copy(board)
            value,nextState=minimax(state,neg_inf,inf,True,2,2,1)
            board=copy(nextState)
            printBoard()
            print '\n'
            value=checkGameOver(board)
            if value==1:
                print 'PC wins.Game Over'
                sys.exit()

    print 'Its a draw'

if __name__ == "__main__":
    main()


Applications


Its major applications are in game of chess. Stockfish (chess) is a C++ open source chess program that implements the Alpha Beta pruning algorithm. Stockfish is a mature implementation that is rated as one of the strongest chess engines available today as evidenced by it winning the Top Chess Engine Championship in 2016 and 2017.

One more interesting thing to notice is that implementations of alpha-beta pruning can often be delineated by whether they are "fail-soft," or "fail-hard." The pseudo-code illustrates the fail-soft variation. With fail-soft alpha-beta, the alphabeta function may return values (v) that exceed (v < α or v > β) the α and β bounds set by its function call arguments. In comparison, fail-hard alpha-beta limits its function return value into the inclusive range of α and β.

Questions

What is the depth of a standard tic-tac-tow game?

9
4
2^9
3

Further Reading