So that’s a surprise… I estimated that training would take eight hours, and it took three and a half. Just goes to show, you have to be careful extrapolating from small data sets.
So I started over and ran the first round again. Here are the results:

None of the AIs won fewer than 20 games or more than 40. Nor did any of them win against the random player.