So, I set up my initial CNTK configuration file, and ran it on some test data I’d generated. Unfortunately, my GPU ran out of memory pretty quickly. Well, it IS a laptop.
I’ll actually be working with networks of up to 5 layers of 5000 nodes each. My GPU can handle up to 6 layers at that size, but I don’t think one additional layer will be that critical.
A network this size (5x5k) completes one epoch of training on a dataset of 1,000 games in just under 11 seconds; training a single epoch on a dataset of 1M games would then take about 3 hours. That’s too long, especially as I’d prefer to do several epochs of training each round.
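The scaling above can be sketched as a quick back-of-envelope calculation. The 11-second figure is the measurement from the text; the linear scaling of epoch time with dataset size is an assumption.

```python
# Measured above: one epoch over 1,000 games takes just under 11 seconds.
SECONDS_PER_EPOCH_1K = 11

def epoch_seconds(num_games):
    """Estimate seconds per epoch, assuming time scales linearly with games."""
    return SECONDS_PER_EPOCH_1K * num_games / 1_000

hours = epoch_seconds(1_000_000) / 3600
print(f"{hours:.1f} hours per epoch for 1M games")  # about 3 hours
```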
So I set up configuration files for all 30 networks (widths of 100, 250, 500, 1000, 2500, and 5000, each with 1-5 layers), and ran a training session against the set of 1,000 games for 25 epochs. This completed in just under 20 minutes. If I want the training session to finish in eight hours (overnight), that means I can have data sets of up to 25,000 games. Of course, sometimes I only get 6 hours sleep, so that would cut it down to 18,000 games. I’ll split the difference and set it at 20,000 games each.
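Here is a small sketch of that budgeting arithmetic: enumerating the 30 width-depth combinations and converting an overnight time budget into a games budget. The 20-minutes-per-1,000-games figure is the measurement above; rounding it to an even 20 minutes gives slightly conservative numbers ("just under 20 minutes" is what nudges the eight-hour figure up toward 25,000).

```python
# The 30 network configurations: 6 widths x 5 depths.
WIDTHS = [100, 250, 500, 1000, 2500, 5000]
DEPTHS = [1, 2, 3, 4, 5]
configs = [(w, d) for w in WIDTHS for d in DEPTHS]
assert len(configs) == 30

def games_budget(hours, minutes_per_session=20):
    """Games trainable in `hours`, assuming each 25-epoch session
    covers 1,000 games and takes ~20 minutes for all 30 networks."""
    return int(hours * 60 / minutes_per_session) * 1_000

print(games_budget(8))  # ~24,000 games for an eight-hour run
print(games_budget(6))  # 18,000 games for a six-hour run
```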
So, I now have my networks created, and I’ve decided on 20,000 games of training data each round. My next step is to write the tournament program which will grade the networks against each other.
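The tournament program isn't written yet, so as a hypothetical sketch of its skeleton: a single round-robin over the 30 networks, with each network identified by its (width, depth) pair. The actual game-playing and grading logic is stubbed out here.

```python
import itertools

# The 30 networks from the training setup above, keyed by (width, depth).
WIDTHS = [100, 250, 500, 1000, 2500, 5000]
DEPTHS = [1, 2, 3, 4, 5]
networks = [(w, d) for w in WIDTHS for d in DEPTHS]

def round_robin(players):
    """Yield every unordered pair of players exactly once."""
    return itertools.combinations(players, 2)

# Each pairing would play one or more games; the grading logic goes here.
pairings = list(round_robin(networks))
print(len(pairings))  # C(30, 2) = 435 matchups per round
```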