Tic-tac-toe OOP Minimax Algorithm

Question

I wrote a module that I intend to use in my next tic-tac-toe project. All it is is a class that stores the boardstate along with some methods/properties. The biggest thing is the minimax alg. As shown in the line profiler results below, the check_win attribute takes by far the most time out of anything in the minimax. The algorithm runs pretty quickly already (taking around a second to calculate an empty board and practically instantly for a board with any move on it already), but it would be nice to be able to speed it up a bit. Improving the check_win speed is the only specific thing I am looking for, other than just general improvements.

Edit: I forgot to mention at first, the way I implemented the minimax, instead of having one player minimize and one player maximize, both players maximize and then return the negative of their value. Also, is there any concern with creating multiple instances of the Board class in terms of Board.board being mutable?

Edit 2: Added an example usage to the code

tictactoe_framework.py:

"(tictactoe_framework.py) Module for representing a tictactoe board" from math import cos from random import randint from typing import Self from line_profiler import profile class Board: """ Represents the state of a tictactoe board. Attributes: - self.board (list[int]): Represents the state of the board in a 9-element flat array where moves by X are represented by 1, moves by O are represented by -1, and empty spaces are represented by 0. """ def __init__(self, board: list[int] | None = None) -> None: if board is None: board = [0] * 9 self.board: list[int] = board @property @profile def check_win(self) -> int | None: """ (Property) Checks all possible winning combinations of board spaces and returns based on whether the current board state has a winner, is a draw, or is ongoing. Returns (int | None): Int: - 1 if X has won - -1 if O has won - 0 if game is a draw None: - None if game is ongoing """ win_indices: list[list[int]] = [ [0, 1, 2], [3, 4, 5], [6, 7, 8], # Rows [0, 3, 6], [1, 4, 7], [2, 5, 8], # Columns [0, 4, 8], [2, 4, 6], # Diagonals ] for win_condition in win_indices: line_sum = (self.board[win_condition[0]] + self.board[win_condition[1]] + self.board[win_condition[2]]) if abs(line_sum) == 3: return line_sum // 3 if 0 in self.board: return None return 0 @property def player_to_move(self) -> int: """ (Property) Determines which player's turn it is to move by finding the parity of the total number of moves. Returns (int): - 1 if player X is to move - -1 if player O is to move """ num_of_moves: int = sum(abs(space) for space in self.board) if num_of_moves % 2 == 0: return 1 return -1 @property def ideal_move(self) -> int | None: """ (Property) Determines the ideal move for the current player using the minimax algorithm. Returns (int | None): Int: - The index of the ideal move if the game has not ended yet None: - None if the game has already ended """ @profile def minimax(board: Board = self, depth: int = 0) -> int | None: win_state: int | None = board.check_win if (not depth) and (win_state is not None): return None if (depth) and (win_state is not None): return -1 * win_state * board.player_to_move if win_state is None: move_list: list[int] = [] for idx, state in enumerate(board.board): if not state: move_list.append(idx) value_list: list[int | None] = [0 for _ in move_list] for idx, move in enumerate(move_list): board.move(move) value_list[idx] = minimax(board, depth + 1) board.board[move] = 0 if depth: return -1 * max(i for i in value_list if i is not None) max_value: int = max(i for i in value_list if i is not None) max_indices: list[int] = [ idx for idx, val in enumerate(value_list) if max_value == val ] board_state_int: int = sum( (2**j) * abs(board.board[j]) + 4 for j in range(len(value_list)) ) board_state_seed: int = ( int(100 * abs(cos(board_state_int)) % len(max_indices)) - 1 ) return move_list[max_indices[board_state_seed]] return depth return minimax() def move(self, space: int | None) -> None: """ Plays a move in a space on the board. Raises an error if the move is not valid. Returns: None """ if space is not None: try: assert self.board[space] == 0 self.board[space] = self.player_to_move except AssertionError: print( "ERROR: Invalid Move\n" f"Boardstate: {self.board}\n" f"Attempted move: {space}" ) def __str__(self) -> str: disp_dict: dict[int, str] = {0: " ", 1: "X", -1: "O"} disp_board: list[str] = [disp_dict[space] for space in self.board] disp_string: str = ( f"-----\n" f"{disp_board[0]} {disp_board[1]} {disp_board[2]}\n" f"{disp_board[3]} {disp_board[4]} {disp_board[5]}\n" f"{disp_board[6]} {disp_board[7]} {disp_board[8]}\n" f"-----" ) return disp_string @classmethod def random(cls) -> Self: """ Creates an instance of the class with board state with a random number of moves made by both players. Returns (Self): - An instance of Board representing a random boardstate """ num_of_moves: int = randint(0, 8) random_board: Self = cls() for _ in range(num_of_moves): while True: move: int = randint(0, 8) if random_board.board[move] == 0: random_board.board[move] = random_board.player_to_move if random_board.check_win is None: break return random_board if __name__ == "__main__": x = Board() print(x) while x.check_win is None: x.move(x.ideal_move) print(x)

Line profiler results for check_win and minimax:

Timer unit: 1e-06 s Total time: 0.0001395 s File: tictactoe_framework.py Function: minimax at line 85 Line # Hits Time Per Hit % Time Line Contents ============================================================== 85 @profile 86 def minimax(board: Board = self, depth: int = 0) -> int | None: 87 2 81.7 40.9 58.6 win_state: int | None = board.check_win 88 89 2 0.6 0.3 0.4 if (not depth) and (win_state is not None): 90 return None 91 2 0.6 0.3 0.4 if (depth) and (win_state is not None): 92 1 9.5 9.5 6.8 return -1 * win_state * board.player_to_move 93 1 0.2 0.2 0.1 if win_state is None: 94 1 0.3 0.3 0.2 move_list: list[int] = [] 95 10 5.4 0.5 3.9 for idx, state in enumerate(board.board): 96 9 3.0 0.3 2.2 if not state: 97 1 0.9 0.9 0.6 move_list.append(idx) 98 99 2 1.2 0.6 0.9 value_list: list[int | None] = [0 for _ in move_list] 100 2 1.2 0.6 0.9 for idx, move in enumerate(move_list): 101 1 15.4 15.4 11.0 board.move(move) 102 1 2.0 2.0 1.4 value_list[idx] = minimax(board, depth + 1) 103 1 0.6 0.6 0.4 board.board[move] = 0 104 105 1 0.4 0.4 0.3 if depth: 106 return -1 * max(i for i in value_list if i is not None) 107 1 3.3 3.3 2.4 max_value: int = max(i for i in value_list if i is not None) 108 2 0.8 0.4 0.6 max_indices: list[int] = [ 109 2 1.3 0.7 0.9 idx for idx, val in enumerate(value_list) if max_value == val 110 ] 111 2 4.2 2.1 3.0 board_state_int: int = sum( 112 1 1.1 1.1 0.8 (2**j) * abs(board.board[j]) + 4 for j in range(len(value_list)) 113 ) 114 1 0.3 0.3 0.2 board_state_seed: int = ( 115 1 4.2 4.2 3.0 int(100 * abs(cos(board_state_int)) % len(max_indices)) - 1 116 ) 117 1 1.3 1.3 0.9 return move_list[max_indices[board_state_seed]] 118 119 return depth Total time: 10.0331 s File: tictactoe_framework.py Function: check_win at line 26 Line # Hits Time Per Hit % Time Line Contents ============================================================== 26 @property 27 @profile 28 def check_win(self) -> int | None: 29 """ 30 (Property) Checks all possible winning combinations of board spaces and returns 31 based on whether the current board state has a winner, is a draw, or 32 is ongoing. 33 34 Returns (int | None): 35 Int: 36 - 1 if X has won 37 - -1 if O has won 38 - 0 if game is a draw 39 None: 40 - None if game is ongoing 41 """ 42 618208 181935.5 0.3 1.8 win_indices: list[list[int]] = [ 43 618208 245636.1 0.4 2.4 [0, 1, 2], [3, 4, 5], [6, 7, 8], # Rows 44 618208 208033.1 0.3 2.1 [0, 3, 6], [1, 4, 7], [2, 5, 8], # Columns 45 618208 178003.1 0.3 1.8 [0, 4, 8], [2, 4, 6], # Diagonals 46 ] 47 4570613 1416790.5 0.3 14.1 for win_condition in win_indices: 48 12561585 3763096.5 0.3 37.5 line_sum = (self.board[win_condition[0]] 49 4187195 1087555.1 0.3 10.8 + self.board[win_condition[1]] 50 4187195 1090959.8 0.3 10.9 + self.board[win_condition[2]]) 51 4187195 1401266.0 0.3 14.0 if abs(line_sum) == 3: 52 234790 129333.6 0.6 1.3 return line_sum // 3 53 383418 132961.0 0.3 1.3 if 0 in self.board: 54 331273 170638.7 0.5 1.7 return None 55 52145 26844.3 0.5 0.3 return 0 0.00 seconds - tictactoe_framework.py:85 - minimax 10.03 seconds - tictactoe_framework.py:26 - check_win

To improve the speed, use alpha-beta pruning. I'd also suggest showing an example main so we can run this. — ggorlen, CommentedApr 23, 2024 at 0:09

J_H · Accepted Answer · 2024-04-23 04:12:54Z

it would be nice to be able to speed it up a bit.

To properly study that, I recommend you scale up from \$3 × 3\$ to \$n × n\$.

mutable arg idiom

 if board is None: board = [0] * 9

Prefer the usual idiom here:

 board = board or [0] * 9

Or rather, throw that expression into the self.board assignment.

Also, I'm not crazy about repeating the type of self.board in the docstring. It's easy to see that the Optional aspect of the signature has been dealt with by the time we assign the attribute.

enum

 def check_win(self) -> int | None: """ ... Returns (int | None): Int: - 1 if X has won - -1 if O has won - 0 if game is a draw None: - None if game is ongoing

Use an enum, please. It could do what disp_dict currently does.

Also, even if we don't use an enum, the returned values seem inconvenient for the caller, as both 0 and None are boolean False, yet they have very different meanings. So caller cannot e.g. say while not board.check_win:

Maybe we want a separate .ongoing property?

speed

Choosing a list-of-lists representation forces you to chase lots of 64-bit object pointers. Prefer an array, or an ndarray, or even an int bit-vector.

There's only \$3^9\$ possible standard game positions (some of which can't be reached from the starting position). You can easily malloc() that many positions, and then a quick dict lookup gives you the desired result.

map

Prefer map() here:

 num_of_moves: int = sum(abs(space) for space in self.board)

 num_of_moves: int = sum(map(abs, self.board))

AssertionError

 try: assert ... except AssertionError:

No, please don't do that. Use raise ValueError(...), or invent your own MoveError.

When we write assert c, we're saying that a bug-free program would never make condition c true. Unless you're a pytest maintainer, you should never be catching assertion errors.

Also, I imagine that once the docstring was true. But the current code never re-raises, so it won't report an exception to the caller.

top-level function

This is just odd:

 @classmethod def random(cls) -> Self: ... random_board: Self = cls()

Creating lots of mutable Boards is just fine.

But please have some function up at module-level invoke the Board() ctor to create them. We don't organize Board code by putting it all within the Board class.

I have addressed pretty much everything you mentioned. Regarding the Board.random() method, are you recommending removing it from the class entirely and putting it in its own function outside of the class, or are you recommending having a function outside of the class that calls the random method? — flakpm, CommentedApr 24, 2024 at 5:01
The former: remove it from the class entirely and put it in its own function outside of the class. Think about that beautiful docstring you wrote: "Represents the state of a tictactoe board." We write down such things so we can later decide "does this belong in here?", "does that belong?" I claim that def random() is a Kitchen Sink method, unrelated to the Single Responsibility of the class. Rather, it is a very nice consumer of what the class offers. We could add this, that, and the kitchen sink, but then we'd have a big mess, right? Keep it organized, and it will be maintainable. — J_H, CommentedApr 24, 2024 at 6:30
mutable args - I strongly disagree with your "usual" advice. While yours is working correctly here, I strongly advice to use the ìs None test which works for empty lists and strings. I claim ìs None is standard way. — stefan, CommentedMay 4, 2024 at 9:48
@stefan, a param or some_param_value() idiom is easily read, and is a direct translation of the much older Common Lisp (setf x (or param (some_param_value))) expression using generalized predicates. I confess that in a python context the idiom requires the Author to devote a moment's thought about truthiness, in order to spare the Reader a bit of effort. Occasionally passing in a zero-length dict vs list is of interest. Here it is immediately clear that a passed in vector full of zeros will definitely be True. — J_H, CommentedMay 4, 2024 at 16:41

Cris Luengo · Accepted Answer · 2024-04-24 06:56:23Z

if 0 in self.board: return None

There are board positions where a win is no longer possible but there still are open spaces. Continuing playing at this point is awkward, the game should stop. Therefore your class should be reporting such a board state as a draw.

One of the many such positions is this one:

|x|o|x| | | | | |o|x|o|

No matter what the players do at this point, there's no possible outcome other than draw.

The other answer already mentioned NumPy arrays. Using such an array for the board would make the code both faster and easier to read, because the indexing is that much easier. For example,

line_sum = (self.board[win_condition[0]] + self.board[win_condition[1]] + self.board[win_condition[2]])

would become

line_sum = np.sum(self.board[win_condition])

And actually you’d be able to get rid of the loop over the win condition array entirely.

I would speed up the code by writing a routine that iterates over all board positions, and generates a table that you can then include as a hard-coded table in your program. 3⁹ is only 19,683 possible board positions. Taking symmetry into account this is a much smaller list. Now evaluating the board and finding the optimal move during the game is just a simple table lookup. This might not be necessary, it is likely that your code is fast enough for interactive play, but it’s an option.

I don’t know what code the @property decorator adds, whether it adds to the cost of calling this function or not. I would profile your code again without this decorator.

I personally find it unnecessary to syntactically turn a function call into referencing a variable, it doesn’t seem to improve readability and just hides the fact that a function is being called. But this is just an opinion, it might be meaningful to you.

Regarding your first point, I don't know how I would implement that without making the check_win method even slower. Also, can you edit and give an example of one of those board states? I can't think of one other than at the very end of the game where it doesn't matter (unless you assume perfect play by both sides in which case pretty much every state starting from the beginning will end in a draw) Regarding your last point, I did some testing in a different script and I couldn't tell a difference in speed between having and not having the @property. (cont...) — flakpm, CommentedApr 24, 2024 at 5:07
(...cont) I like having the decorator since the function really just returns a property about the board. It might be better to change the name of check_win to something like game_state since check_win does sound like the name of a function. For that reason I personally think it improves readability. — flakpm, CommentedApr 24, 2024 at 5:13
@flakpm I've added an example, but there's many such positions. I imagine ideal_move would tell you what the best and worst possible outcome is given a board position. If a draw is both the best and the worst possible outcome, then the game has ended. — Cris Luengo, CommentedApr 24, 2024 at 7:03
If I was only using this to simulate thousands of games of the minimax against itself I would implement some game-stopping like that, but my goal is eventually to train a tictactoe neural network and use the ideal_move property to evaluate the networks move. I need to be able to return the ideal move for all positions even if a draw is inevitable. — flakpm, CommentedApr 24, 2024 at 13:42
@flakpm Why would you train a neural network for this? The ideal move can be encoded in five or six lines of code, the logic is trivial. But consider that, when the board is empty and you call ideal_move, you are evaluating all possible board positions recursively. If you store the tree of all those positions then there’s nothing left to compute at this point. Basically, your program is computing, for each move, stuff it already computed for the previous move. — Cris Luengo, CommentedApr 24, 2024 at 13:59

Stack Exchange Network

Tic-tac-toe OOP Minimax Algorithm

2 Answers 2

mutable arg idiom

enum

speed

map

AssertionError

top-level function

Hot Network Questions

Tic-tac-toe OOP Minimax Algorithm

2 Answers 2

mutable arg idiom

enum

speed

map

AssertionError

top-level function

Related

Hot Network Questions