Building to Win: Picking the Right AI Agent to Take 1st Place.

Ian Gore
Oct 21
5 min read

"AI is like a bag of magic tricks. You fill up your bag over time and pull out the best one for the job at hand."

That was how my professor summarized his specialty. I didn't realize it then but that simple idea would become the foundation for how I approach engineering AI systems. When you're creating an AI agent nothing matters more than the algorithm behind it, so nothing helps more than a big bag of tricks and a clear understanding of the problem you are trying to solve.

I approach AI problems through a four-step iterative process that keeps development efficient and effective. In this article, I will walk you through my process for creating a winning AI agent for the game of Gothello.

Know your job Know your resources Know your tools Know your result

Know your job

The hardest and most important part of the process is understanding your problem from a technical perspective. Without a precise understanding you are unlikely to be able to design an agent that performs better than random.

While the factors vary widely based on the project, the factors I needed to consider were fairly straightforward. I had to create an AI agent that played a 2D board game called Gothello, which is a cross between Go and Othello that is played on a 5x5 grid.

The rules of Gothello can be summarized as follows

Each player takes a turn by placing a black or white stone on the board and captures an opposing group if they completely surround them.
You may not place a stone in an area that would cause you to be surrounded or form a group with no places to move.

From that alone I could identify several important factors.

The best choice would likely be an algorithm that handles tree searches well
Since the board size is fixed the best possible runtime would be O(1) with a board state lookup, but this solution could never scale and wouldn't be truly intelligent

From the game rules we can estimate the complexity of the game.

Board size	5 * 5 = 25 positions
Average game length	24 moves
State-space complexity	log10(3^25) = 11.9
Average branching factor	13
Game-tree complexity	at least log10(13^24) or 26.7

Now we know that our target game is about as complex as Connect Four with a branching factor similar to Othello. With a clear understanding of the game we need to make an AI player for, we can look at the resources we have to address it with.

Know your resources

Next I looked at the game of Gothello and the resources it gives us

The game ships with a formal specification of its rules
A game is guaranteed to finish in a relatively short amount of time (we calculated an average game as 24 moves)
The package also includes about 500 simulated games
The game has an AI opponent called Grossthello

I also considered my own development resources. Compared to other games, Gothello is not complex so it is unlikely that we will need a great deal of compute to implement an efficient solution. Any decent computing setup will likely be good enough and we can always refactor the implementation for efficiency with parallelization in the unlikely event that becomes an issue.

Now that we have a good understanding of the game and our resources we can put those two together and look at what tools are best for the job.

Know the tools

We know that we want to use an AI algorithm to address this game but before choosing, we should revisit our ultimate goal of building the best possible AI agent as efficiently as possible.

Our provided AI opponent is a depth-limited Negamax agent called Grossthello.
We don't know what algorithms we may face in the future but we can assume that they will outperform Grossthello.

From this we can infer that we need an agent that consistently outperforms Grossthello.

Broadly speaking, we could choose an expert system or a machine learning algorithm. We don't want to spend a great deal of time creating the perfect evaluation function for an expert system so that leaves us with choosing a machine learning algorithm suited to the task.

There are many machine learning techniques that can be applied to games. Instead of exhaustively searching through every technique we can narrow our search down to a select few best candidates with the clues we have so far.

Gothello is first and foremost, a formally specified board game. That tells us that we have specified game logic that we can connect our AI agent to, a model. Furthermore, games are relatively short. When we have an environment (here a zero sum game) we can interact with we should first consider reinforcement learning and since we have a model (formally specified game logic) we want to pay special attention to algorithms that use that model to great effect.

From here things get a bit trickier. We could choose to use a form of Q-Learning to approach this problem but that would not take advantage of the model we are given. On the other hand we could choose to use the well-known AlphaZero algorithm which would take advantage of the model we are given.

Using Q-Learning would discard a significant resource, but using AlphaZero would be like putting up a picture hanger with a sledgehammer. Since we want to make the most efficient use of all of our development resources, including time, we want to pick a solution that will be powerful enough to take first place but not so complex that we risk overengineering our solution.

Recall that Gothello is not a complex or lengthy game, so we can safely choose less powerful solutions than we might need for a game like Go or Chess. If we take a deeper look at AlphaZero we see that it is composed of two important components, a Monte Carlo Tree Search (MCTS) planning algorithm to balance exploration and exploitation and a deep neural network to speed up and reduce variance in the simulation phase where MCTS does playouts of the game from the current state to see which routes are most promising. For complex games we can see that Monte Carlo Tree Search by itself would be quite slow at the simulation phase for a game like Go with a staggering branching factor of 250 or even Chess with a branching factor of 35. However, Gothello has a branching factor of 13 and quick games so we probably won't need the neural network or need to be concerned about Monte Carlo method's episodic learning.

If we remove the deep neural network, then we are left with a classic Monte Carlo Tree Search algorithm, which isn't even traditionally viewed as true machine learning despite sharing many characteristics with reinforcement learning.

Know the result

So far we have taken a deep look at Gothello and examined the tools that suit it. Additionally, we have considered the resources we have to implement a solution. We found an algorithm that balances power with simplicity. We know that we have enough compute resources to actually implement the solution and reason to expect that it will perform well. At this point any further questions would be experimental, meaning that they can only be answered by doing an experiment.

So now we move to actually writing code and setting up experiments. Fast forward to testing and we have results.

In addition to strong performance over a few games we can see that the machine learner, which I decided to call OwlZero, does well against Grossthello under all testing conditions, achieving win rates of at least 90% over 100 games.

	OwlZero Win Rate	Win : Loss : Draw
100 Rollouts	93%	93 : 6 : 1
200 Rollouts	90%	90 : 9 : 1
300 Rollouts	96%	96 : 3 : 1

I was quite satisfied with the win rates in this first iteration. We win almost all of the games we play and we did not overengineer the solution for this particular game. All in all, the perfect trick for first place on performance and efficiency.