How I Trained AI To Play Games
Goals
I have been interested in computer AI for years, and followed the Turing Tournament for a number of them. When I heard that Google had released TensorFlow.js, I knew I wanted to give it a shot. After reading through the TensorFlow documentation, I realized that I needed to learn a lot more about neural networks before I could come up with a project plan. I read as many scholarly articles and other resources as I could find on the subject, and eventually developed an understanding of how they work. Neural networks, including convolutional ones, rely heavily on matrix calculations, and graphics cards happen to be well suited to this type of calculation. WebGL, which is now part of most web browsers, lets a page use the user's graphics card when rendering content. TensorFlow.js takes advantage of this, which is why we're now able to do these calculations in the browser. What's amazing is that you can train the network offline using TensorFlow and Keras, then upload it to the web, where TensorFlow.js will use it to make predictions.
After learning about neural networks, I was finally ready to put together a game plan. Since I wasn't really working with a data set, I figured the best option for training the network was a genetic algorithm. This allows the AI to learn patterns based on inputs from the environment around it. Below are the goals that I set out:
- Determine Input Layers
- Determine number and size of hidden layers
- Determine how to mutate & crossbreed neural network weights
- Implement Genetic Algorithm
Input & Output Layers
Deciding on the output layer was simple. The player has only one available action: to jump or not. This means we only need a single node on our output layer. The output layer returns a floating-point value between 0 and 1. You can think of this as how certain the neural network is that it should jump. We can then set a threshold, requiring the certainty to be over a set value such as 95% before we execute the jump command.
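The thresholding step can be sketched in a few lines of plain JavaScript. This is only an illustration: the `sigmoid` helper is my assumption about how the output ends up between 0 and 1, and the names `JUMP_THRESHOLD` and `shouldJump` are hypothetical rather than taken from the game's code.

```javascript
// Assumption: a sigmoid squashes the raw activation of the output node into the 0..1 range.
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Hypothetical constant: only jump when the network is more than 95% certain.
const JUMP_THRESHOLD = 0.95;

// Returns true when the single output node's value clears the threshold.
function shouldJump(outputValue, threshold = JUMP_THRESHOLD) {
  return outputValue > threshold;
}

console.log(shouldJump(sigmoid(4))); // sigmoid(4) is about 0.982, so this prints true
console.log(shouldJump(sigmoid(1))); // sigmoid(1) is about 0.731, so this prints false
```

A high threshold makes the player conservative: it only jumps when the network is very sure, at the cost of occasionally jumping too late.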
The input layer is a little more difficult in our case. We need to give the AI the ability to perceive the environment around it, which requires us to digitize those elements in a way that the neural network can understand. For example, the platform game has a double jump feature. The AI needs to know whether or not it has already double jumped, so I dedicated a single node on the input layer to be either 0 or 1 based on whether that AI has double jumped yet. Eventually I settled on eight input nodes: the player's Y position, whether the player has double jumped, the distance to each of the next two platforms, the length of each of the next two platforms, and the height of each of the next two platforms. This gives the AI enough information to decide whether to jump.
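To make the eight inputs concrete, here is a rough sketch of how the game state might be turned into an input array. The object shapes (`player.x`, `platform.width`, and so on), the screen dimensions, and the choice to scale every value to roughly 0..1 are all assumptions for illustration; the actual game may encode things differently.

```javascript
// Hypothetical screen dimensions, used only to scale values to roughly 0..1.
const SCREEN_WIDTH = 800;
const SCREEN_HEIGHT = 600;

// Builds the eight-value input array described above.
// Assumes `platforms` lists the upcoming platforms, nearest first, with at least two entries.
function buildInputs(player, platforms) {
  const [first, second] = platforms;
  return [
    player.y / SCREEN_HEIGHT,             // player's Y position
    player.hasDoubleJumped ? 1 : 0,       // 1 if the double jump is already used
    (first.x - player.x) / SCREEN_WIDTH,  // distance to the next platform
    (second.x - player.x) / SCREEN_WIDTH, // distance to the platform after that
    first.width / SCREEN_WIDTH,           // length of the next platform
    second.width / SCREEN_WIDTH,          // length of the platform after that
    first.y / SCREEN_HEIGHT,              // height of the next platform
    second.y / SCREEN_HEIGHT,             // height of the platform after that
  ];
}

// Made-up game state, purely for illustration.
const inputs = buildInputs(
  { x: 100, y: 300, hasDoubleJumped: false },
  [{ x: 300, y: 450, width: 200 }, { x: 600, y: 360, width: 160 }]
);
console.log(inputs.length); // 8
```

Keeping the inputs in a fixed order matters, because each position in the array always feeds the same input node.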
Training
Training neural networks with a genetic algorithm is a well-established approach, particularly when there is no labeled data set to learn from. Genetic algorithms are made up of three components: selection, crossover, and mutation. Selection is driven by fitness, a measure of how much survivability a member of the population has. In our case, the fitness of a player can be measured by how long it stays alive. Crossover is how we create new offspring, by taking traits from each parent and passing them on. The final part is mutation, which is how we introduce genetic diversity into our population. Every time we breed two parents, we introduce some random noise into the genome.
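The three components can be sketched as small functions. This is a minimal illustration under my own assumptions, not the exact code from the project: each population member is assumed to look like `{ genome: number[], fitness: number }`, crossover picks each gene from one parent at random (uniform crossover), and the defaults for `keepCount`, `rate`, and `amount` are arbitrary.

```javascript
// Selection: keep the fittest members as parents.
// Fitness is assumed to be how long the player stayed alive.
function selectParents(population, keepCount) {
  return [...population].sort((a, b) => b.fitness - a.fitness).slice(0, keepCount);
}

// Crossover: copy each gene from one parent or the other with equal probability.
function crossover(genomeA, genomeB, random = Math.random) {
  return genomeA.map((gene, i) => (random() < 0.5 ? gene : genomeB[i]));
}

// Mutation: with probability `rate`, nudge a gene by noise in [-amount, amount).
function mutate(genome, rate, amount, random = Math.random) {
  return genome.map((gene) =>
    random() < rate ? gene + (random() * 2 - 1) * amount : gene
  );
}

// Builds a new generation of the same size by breeding randomly chosen parents.
// Note that the same parent may be picked twice in this simple version.
function nextGeneration(population, options = {}, random = Math.random) {
  const { keepCount = 2, rate = 0.1, amount = 0.5 } = options;
  const parents = selectParents(population, keepCount);
  const children = [];
  while (children.length < population.length) {
    const a = parents[Math.floor(random() * parents.length)];
    const b = parents[Math.floor(random() * parents.length)];
    const genome = mutate(crossover(a.genome, b.genome, random), rate, amount, random);
    children.push({ genome, fitness: 0 });
  }
  return children;
}
```

Passing `random` as a parameter is a design choice for this sketch: it makes the functions easy to test with a fixed sequence instead of `Math.random`.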
I found mutation and crossover to be the most difficult part of this project. What I ended up doing was getting the weights for each hidden layer and using those as the genetic code for the crossover and mutation algorithms. I also added some tracking data so that I could review each generation: I kept track of the best and worst players from each generation as well as the average. Luckily, TensorFlow.js had built-in methods for saving neural network weights, so I was able to leverage a lot of pre-built functionality. Once the genetic algorithm was complete, the only remaining feature was the prediction system, which tells each player AI when to jump.
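One way to treat layer weights as genetic code is to flatten them into a single array of numbers and restore the per-layer shape afterwards. The sketch below uses plain JavaScript arrays as stand-ins for the real weight tensors, and the names `toGenome` and `fromGenome` are hypothetical; it shows the idea rather than the project's actual implementation.

```javascript
// Flattens a list of per-layer weight arrays into one gene array,
// remembering how many weights each layer contributed.
// Assumes each layer is a plain (non-nested) array of numbers.
function toGenome(layerWeights) {
  return {
    genes: layerWeights.flat(),
    sizes: layerWeights.map((layer) => layer.length),
  };
}

// Splits a flat gene array back into per-layer arrays using the recorded sizes.
function fromGenome(genes, sizes) {
  const layers = [];
  let offset = 0;
  for (const size of sizes) {
    layers.push(genes.slice(offset, offset + size));
    offset += size;
  }
  return layers;
}

// Made-up weights for two tiny layers, purely for illustration.
const { genes, sizes } = toGenome([[0.1, -0.2, 0.3], [0.5, -0.5]]);
console.log(genes.length); // 5
console.log(fromGenome(genes, sizes).length); // 2
```

Once the weights are in a flat array, crossover and mutation can operate on it without knowing anything about the network's layer structure.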
Making Progress
At this point I had a small population of player AIs learning how to play my game, and I needed a way to analyze the progress made in each generation. I implemented a simple tracking system that logs the best and worst scores as well as the average score for each generation. This let me see whether the general population improved with each generation, or whether I was getting sporadic outliers that failed to pass on their genes. After some tweaking of the genetic algorithm weights, I was able to see improvement with each generation. I have noticed that it tends to evolve in spurts: often a new trait is introduced randomly and sends the top score skyrocketing. There are calculations that can be made to determine optimal weights, but the formulas tend to be rather complicated, and I have not yet figured out how to implement them.
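A tracking system along these lines can be quite small. The following is a sketch with a hypothetical `summarizeGeneration` helper, and the scores are invented numbers used only to show the output shape, not results from the actual game.

```javascript
// Summarizes one generation's scores as best, worst, and average.
function summarizeGeneration(generation, scores) {
  if (scores.length === 0) {
    throw new Error("a generation needs at least one score");
  }
  const best = Math.max(...scores);
  const worst = Math.min(...scores);
  const average = scores.reduce((sum, score) => sum + score, 0) / scores.length;
  return { generation, best, worst, average };
}

// Invented scores for two generations, purely for illustration.
const history = [];
history.push(summarizeGeneration(1, [12, 40, 7, 21]));
history.push(summarizeGeneration(2, [30, 55, 18, 41]));

// One row per generation makes trends easy to scan.
console.table(history);

// Comparing averages separates real population-wide improvement from a single lucky outlier.
const improved = history[1].average > history[0].average;
console.log(improved); // true for the invented scores above
```

Watching the average alongside the best score is what reveals the difference between a population that is genuinely improving and one with a few outliers that fail to pass on their genes.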