Like we discussed in earlier article, a hard coded AI can be used to create an intelligent opponent which you can challenge to a game of classical tic-tac-toe!
But, that algorithm could still be optimised further. But how do we achieve that? The Alpha Beta Pruning is a search algorithm that tries to diminish the quantity of hubs that are assessed by the minimax algorithm in its search tree. It is an antagonistic search algorithm utilized usually for machine playing of two-player recreations (Tic-tac-toe, Chess, Go, and so forth.).
It stops totally assessing a move when no less than one probability has been observed that ends up being more regrettable than a formerly analyzed move. Such moves require not be assessed further. At the point when connected to a standard minimax tree, it restores an indistinguishable move from minimax would, however prunes away branches that can't in any way, shape or form impact an official conclusion!
You must be thinking what does this figure really signifies. Well, it shows how the alogorithm ignores the sub trees which are not really desireable in our game moves. The algorithm keeps up two qualities, alpha and beta, which speak to the base score that the expanding player is guaranteed of and the greatest score that the limiting player is guaranteed of separately. At first alpha is negative infinity and beta is sure infinity, i.e. the two players begin with their most exceedingly terrible conceivable score. At whatever point the most extreme score that the limiting player(beta) is guaranteed of turns out to be not as much as the base score that the expanding player(alpha) is guaranteed of (i.e. beta <= alpha), the augmenting player require not consider the relatives of this hub as they will never be come to in genuine play.
Alpha Beta Pruning is not a new algorithm but actually an optimization!
Alpha is the best value that the maximizer currently can guarantee at that level or above.
Beta is the best value that the minimizer currently can guarantee at that level or above.
And ofcourse, we are using Tic Tac Toe as our reference example!
def minimax(depth, player, alpha, beta) if gameover or depth == 0 return calculated_score end children = all legal moves for player if player is AI (maximizing player) for each child score = minimax(depth - 1, opponent, alpha, beta) if (score > alpha) alpha = score end if alpha >= beta break end return alpha end else #player is minimizing player best_score = +infinity for each child score = minimax(depth - 1, player, alpha, beta) if (score < beta) beta = score end if alpha >= beta break end return beta end end end #then you would call it like minimax(2, computer, -inf, +inf)
- Worst case Performance :
O(|E|) = O(b^(d/2))
- Worst case space complexity :
O(|V|) = O(b * d)
- Next Best Move Guesser- Python
Interactive console game- Python
import numpy as np import sys from copy import copy rows = 3 cols = 3 board = np.zeros((rows,cols)) inf = float("inf") neg_inf = float("-inf") def printBoard(): for i in range(rows): for j in range(cols): if board[i, j] == 0: sys.stdout.write(' _ ') elif board[i, j] == 1: sys.stdout.write(' X ') else: sys.stdout.write(' O ') print '' # The heuristic function which will be evaluating board's position for each of the winning POS heuristicTable = np.zeros((rows+1, cols+1)) numberOfWinningPositions = rows + cols + 2 for index in range(rows+1): heuristicTable[index, 0] = 10**index heuristicTable[0, index] =-10**index winningArray = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [0, 3, 6], [1, 4, 7], [2, 5, 8], [0, 4, 8], [2, 4, 6]]) def utilityOfState(state): stateCopy = copy(state.ravel()) heuristic = 0 for i in range(numberOfWinningPositions): maxp = 0 minp = 0 for j in range(rows): if stateCopy[winningArray[i,j]] == 2: maxp += 1 elif stateCopy[winningArray[i,j]] == 1: minp += 1 heuristic += heuristicTable[maxp][minp] return heuristic def minimax(state,alpha,beta,maximizing,depth,maxp,minp): if depth==0: return utilityOfState(state),state rowsLeft,columnsLeft=np.where(state==0) returnState=copy(state) if rowsLeft.shape==0: return utilityOfState(state),returnState if maximizing==True: utility=neg_inf for i in range(0,rowsLeft.shape): nextState=copy(state) nextState[rowsLeft[i],columnsLeft[i]]=maxp #print 'in max currently the Nextstate is ',nextState,'\n\n' Nutility,Nstate=minimax(nextState,alpha,beta,False,depth-1,maxp,minp) if Nutility > utility: utility=Nutility returnState=copy(nextState) if utility>alpha: alpha=utility if alpha >=beta : #print 'pruned' break; #print 'for max the best move is with utility ',utility,' n state ',returnState return utility,returnState else: utility=inf for i in range(0,rowsLeft.shape): nextState=copy(state) nextState[rowsLeft[i],columnsLeft[i]]=minp #print 'in min currently the Nextstate is ',nextState,'\n\n' Nutility,Nstate=minimax(nextState,alpha,beta,True,depth-1,maxp,minp) if Nutility < utility: utility=Nutility returnState=copy(nextState) if utility< beta: beta=utility if alpha >=beta : #print 'pruned' break; return utility,returnState def checkGameOver(state): stateCopy=copy(state) value=utilityOfState(stateCopy) if value >=1000: return 1 return -1 def main(): num=input('enter player num (1st or 2nd) ') value=0 global board for turn in range(0,rows*cols): if (turn+num)%2==1: #make the player go first, and make the user player as 'X' r,c=[int(x) for x in raw_input('Enter your move ').split(' ')] board[r-1,c-1]=1 printBoard() value=checkGameOver(board) if value==1: print 'U win.Game Over' sys.exit() print '\n' else: #its the computer's turn make the PC always put a circle' #right now we know the state if the board was filled by the other player state=copy(board) value,nextState=minimax(state,neg_inf,inf,True,2,2,1) board=copy(nextState) printBoard() print '\n' value=checkGameOver(board) if value==1: print 'PC wins.Game Over' sys.exit() print 'Its a draw' if __name__ == "__main__": main()
Its major applications are in game of chess. Stockfish (chess) is a C++ open source chess program that implements the Alpha Beta pruning algorithm. Stockfish is a mature implementation that is rated as one of the strongest chess engines available today as evidenced by it winning the Top Chess Engine Championship in 2016 and 2017.
One more interesting thing to notice is that implementations of alpha-beta pruning can often be delineated by whether they are "fail-soft," or "fail-hard." The pseudo-code illustrates the fail-soft variation. With fail-soft alpha-beta, the alphabeta function may return values (v) that exceed (v < α or v > β) the α and β bounds set by its function call arguments. In comparison, fail-hard alpha-beta limits its function return value into the inclusive range of α and β.