I’ll be attempting not only to develop an AI which can beat me at Connect 4, but also to study how NN size and design affect performance (both final ability and learning speed).
Each round will begin with a round-robin tournament among the AI players. The top-ranking AIs will then play a large number of games to generate a data set, which will be used to train the AIs for the next round.
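The tournament step could be sketched roughly as follows. This is only an illustration, not the project's actual code: `play_game` is a hypothetical callback, and both the scoring (1 point per win, half a point each for a draw) and the choice to have every pair meet twice, once with each player moving first, are my assumptions.

```python
from itertools import permutations


def round_robin(players, play_game):
    """Rank players after every ordered pair has played one game.

    players: list of hashable player identifiers.
    play_game(first, second): hypothetical callback that returns the
        winner's identifier, or None for a draw.
    """
    scores = {p: 0.0 for p in players}
    # permutations() yields both (a, b) and (b, a), so each player
    # gets to move first against every opponent (an assumption).
    for first, second in permutations(players, 2):
        winner = play_game(first, second)
        if winner is None:
            scores[first] += 0.5
            scores[second] += 0.5
        else:
            scores[winner] += 1.0
    ranking = sorted(players, key=lambda p: scores[p], reverse=True)
    return ranking, scores
```

The top few entries of `ranking` would then be the players used to generate the next training set.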
I had originally planned to play against each AI to verify progress and success, but I’m not playing several dozen games each round.
Instead, each round I will play a game against the top-scoring AI. If it beats me, I will then play up to 6 more games against it, essentially a best-of-seven match. When an AI wins 4 games against me out of 7, I will consider the project a success.
For this project, I’ll be working only with feed-forward networks. Input format will be standardized with two values representing each space on the board: 10 will represent a black piece, 01 will represent a red piece, and 00 will represent an empty space, resulting in a total of 84 input values. Only the current board will be evaluated, rather than looking at several moves combined; while this could possibly limit long-term strategic planning, I’m confident that a network of sufficient complexity should still be able to master the game.
Since part of this project is to evaluate how NN design affects final performance, I will be evaluating a number of topologies, though none of them will be exotic or unusual; they will vary only in the size and number of intermediate layers.
Ideally, each layer would be one of 100, 250, 500, 1000, 2500, 5000, or 10000 nodes wide, and the number of layers would vary between 1 and 10.
Memory and time constraints limit what I can actually test, though. The NNs will have between 1 and 5 layers, and each layer will be one of 100, 250, 500, 1000, 2500, or 5000 nodes wide.
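To get a feel for the memory cost, the candidate topologies can be enumerated and their parameters counted. The sketch below assumes that every intermediate layer in a given network has the same width, that layers are fully connected, and that each node has a bias term; none of that is fixed by the plan above. The 84 inputs and 7 outputs come from the board encoding and the 7 columns.

```python
from itertools import product

INPUTS, OUTPUTS = 84, 7
WIDTHS = (100, 250, 500, 1000, 2500, 5000)
DEPTHS = range(1, 6)  # 1 to 5 intermediate layers


def parameter_count(depth, width):
    """Weights plus biases of a fully connected feed-forward network."""
    sizes = [INPUTS] + [width] * depth + [OUTPUTS]
    # Each layer transition contributes a*b weights and b biases.
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


# Every (depth, width) combination, assuming uniform layer widths.
topologies = list(product(DEPTHS, WIDTHS))

if __name__ == "__main__":
    for depth, width in topologies:
        print(f"{depth} x {width}: {parameter_count(depth, width):,} parameters")
```

Under these assumptions there are 30 networks, ranging from 9,207 parameters (one layer of 100 nodes) to 100,480,007 parameters (five layers of 5000 nodes).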
Although I expect the smaller networks to perform poorly, seeing how they perform against their more complex brethren will still be of interest.
Non-AI players:
Two “random” players will also be used in the project.
The first will be truly random, and will be a decent quality check. Every NN I train should, in theory, be able to defeat this player given sufficient training.
The second will mostly play randomly, except when it sees a winning move, which it will make immediately. This player should provide an early boost in quality to the training data; even though it won’t play optimally, it will cut short a large number of games.
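Both random players are simple enough to sketch in a few lines. The board representation here is an assumption made for illustration: a list of 6 rows by 7 columns, row 0 at the bottom, with 0 for an empty space and 1 or 2 for each player's pieces. All function names are hypothetical.

```python
import random

ROWS, COLS = 6, 7
DIRECTIONS = ((0, 1), (1, 0), (1, 1), (1, -1))  # horizontal, vertical, two diagonals


def legal_moves(board):
    """Columns whose top space (row ROWS - 1) is still empty."""
    return [c for c in range(COLS) if board[ROWS - 1][c] == 0]


def drop(board, col, player):
    """Return a copy of the board with player's piece dropped into col."""
    new = [row[:] for row in board]
    for r in range(ROWS):
        if new[r][col] == 0:
            new[r][col] = player
            return new
    raise ValueError("column is full")


def has_won(board, player):
    """True if player has four in a row in any direction."""
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in DIRECTIONS:
                cells = [(r + i * dr, c + i * dc) for i in range(4)]
                if all(0 <= rr < ROWS and 0 <= cc < COLS
                       and board[rr][cc] == player for rr, cc in cells):
                    return True
    return False


def random_player(board, player, rng=random):
    """Truly random: any legal column."""
    return rng.choice(legal_moves(board))


def opportunistic_player(board, player, rng=random):
    """Random, except that an immediately winning move is always taken."""
    for col in legal_moves(board):
        if has_won(drop(board, col, player), player):
            return col
    return rng.choice(legal_moves(board))
```

Note that the second player only looks for its own wins; it makes no attempt to block the opponent, which matches the description above.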
Input will be standardized (in a future project I’ll look at how varying the input format affects the ability of neural networks to learn). Only the current state of the board will be considered; although looking at previous board states could allow for some long-term strategic thinking, I suspect it will not be necessary. That possibility will be investigated in the future.
Every space on the board will be represented by two inputs, each of which will have a value of either 0 or 1. 00 will represent an empty space, 10 your piece, and 01 an opposing piece.
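As a concrete illustration of this encoding, here is a minimal sketch. It assumes the board is a list of 6 rows of 7 cells holding 0 (empty), 1 or 2 (the two players), and that spaces are flattened row by row; the ordering and the function name are my choices for the example, not something fixed above.

```python
def encode_board(board, player):
    """Flatten a 6x7 board into 84 inputs from player's point of view.

    00 = empty space, 10 = player's own piece, 01 = opposing piece.
    """
    inputs = []
    for row in board:
        for cell in row:
            if cell == 0:
                inputs += [0, 0]
            elif cell == player:
                inputs += [1, 0]
            else:
                inputs += [0, 1]
    return inputs
```

Because the encoding is relative to the player about to move, the same network can play either color without retraining.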
Output will consist of an array of 7 values, one for each of the 7 possible columns that could be chosen to play in. The highest-rated column in which a legal move may be made will be considered the choice of the AI.
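Picking the move from the network's output might look like the sketch below. The function name is hypothetical, and the list of legal columns is assumed to be computed elsewhere (any column that is not yet full).

```python
def choose_move(outputs, legal_columns):
    """Return the legal column with the highest network rating.

    outputs: the 7 values produced by the network, indexed by column.
    legal_columns: columns that still have room for a piece.
    """
    if not legal_columns:
        raise ValueError("no legal moves: the board is full")
    # Full columns are simply ignored, however highly they are rated.
    return max(legal_columns, key=lambda col: outputs[col])
```

For example, with outputs `[0.1, 0.9, 0.3, 0.2, 0.0, 0.0, 0.0]` and column 1 already full, the chosen column is 2.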
Each training example will then consist of 84 values, representing the current state of the board, and a single value representing the expected move. A single game will then be converted to up to 21 training examples, representing the moves made by the winning player (game length is capped at 42 moves by the size of the board, so each player makes at most 21 of them, and only the moves from the winning player will be considered).
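Extracting the winner's moves from a finished game is a simple filter. In this sketch a game is assumed to be recorded as a list of `(player, inputs, column)` tuples in move order, where `inputs` is the 84-value board state seen before the move; that record format and the function name are illustrative assumptions. Drawn games are assumed to contribute nothing, since they have no winning player.

```python
def winner_examples(history, winner):
    """Keep only the (board inputs, chosen column) pairs of the winner.

    history: list of (player, inputs, column) tuples in move order.
    winner: identifier of the winning player, or None for a draw.
    """
    if winner is None:
        return []  # assumption: drawn games are discarded
    return [(inputs, column)
            for player, inputs, column in history
            if player == winner]
```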
The top-performing players of each round will be used to generate a set of 1 million games, which will be used as the current training set.
The data set will consist of moves made by the winning player in each of 20,000 games.
This data set will then be used to train all the networks I am evaluating, after which another round will commence.