Let’s make an AI that destroys video games: Crash Course AI #13

Jabril: John-Green-bot, are you serious?!
I made this game and you beat my high score?
John-Green-bot: Pizza!
Jabril: So John-Green-bot is pretty good at Pizza Jump, but what about this new game we made, TrashBlaster?
John-Green-bot: Hey, that’s me!
Jabril: Yeah, let’s see what you’ve got.
John-Green-bot: That’s not fair, Jabril!!
Jabril: It’s okay, John-Green-bot, we’ve got you covered.
Today we’re gonna design and build an AI
program to help you play this game like a pro.
Hey, I’m Jabril, and welcome to Crash Course AI!
Last time, we talked about some of the ways
that AI systems learn to play games.
I’ve been playing video games for as long
as I can remember.
They’re fun, challenging, and tell interesting
stories where the player gets to jump on goombas
or build cities or cross the road or flap
a bird.
But games are also a great way to test AI
techniques because they usually involve simpler
worlds than the one we live in.
Plus, games involve things that humans are
often pretty good at like strategy, planning,
coordination, deception, reflexes, and intuition.
Recently, AIs have become good at some tough
games, like Go or Starcraft II.
So our goal today is to build an AI to play
a video game that our writing team and friends
at Thought Cafe designed called TrashBlaster!
The player’s goal in TrashBlaster is to
swim through the ocean as a little virtual
John-Green-bot, and destroy pieces of trash.
But we have to be careful, because if John-Green-bot
touches a piece of trash, then he loses and
the game restarts.
Like in previous labs, we’ll be writing
all of our code using a language called Python
in a tool called Google Colaboratory.
And as you watch this video, you can follow
along with the code in your browser from the
link we put in the description.
In these Colaboratory files, there’s some
regular text explaining what we’re trying
to do, and pieces of code that you can run
by pushing the play button.
These pieces of code build on each other,
so keep in mind that we have to run them in
order from top to bottom, otherwise we might
get an error.
To actually run the code and experiment with
changing it, you’ll have to either click
“open in playground” at the top of the
page or open the File menu and click “Save
a Copy to Drive”.
And just an FYI: you’ll need a Google account
for this.
So to create this game-playing AI system,
first, we need to build the game and set up
everything like the rules and graphics.
Second, we’ll need to think about how to
create a TrashBlaster AI model that can play
the game and learn to get better.
And third, we’ll need to train the model
and evaluate how well it works.
Without a game, we can’t do anything.
So we’ve got to start by generating all
the pieces of one.
To start, we’re going to need to fill up
our toolbox by importing some helpful libraries,
such as PyGame.
Steps 1.1 and 1.2 load the libraries,
and step 1.3 saves the game so we can watch
it later.
This might take a second to download.
The basic building blocks of any game are
different objects that interact with each other.
There’s usually something or someone the
player controls and enemies that you battle
— all these objects and their interactions
with one another need to be defined in the code.
So to make TrashBlaster, we need to define
three objects and what they do: a blaster,
a hero, and trash to destroy.
The blaster is what actually destroys the
trash, so we’re going to load an image that
looks like a laser-ball and set
some properties.
How far does it go, what direction does it
fly, and what happens to the blast when it
hits a piece of trash?
Our hero is John-Green-bot, so now we’ve
got to load his image, and define
properties like how fast he can swim and how a blast appears when he uses his blaster.
And we need to load an image for the trash pieces, and then code how they
move and what happens if they get hit by a
blast, like, for example, total destruction
or splitting into 2 smaller pieces.
Finally, all these objects are floating in
the ocean, so we need a piece of code to generate
the background.
The shape of this game’s ocean is toroidal,
which means it wraps around, and if any object
flies off the screen to the right, then it
will immediately appear on the far left side.
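A toroidal field like this can be sketched with a tiny bit of modular arithmetic. This is a minimal illustration of the wrap-around idea, assuming a hypothetical 800-by-600 field; the lab's actual screen size and position code may differ.

```python
# Illustrative field size -- the real game's dimensions may differ.
SCREEN_W, SCREEN_H = 800, 600

def wrap_position(x, y):
    """Wrap a coordinate so an object leaving one edge of the toroidal
    field immediately reappears on the opposite edge."""
    return x % SCREEN_W, y % SCREEN_H
```

For example, an object that flies 10 pixels past the right edge reappears 10 pixels in from the left edge, because Python's `%` operator always returns a value in the range of the screen size.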
Every game needs some way to track how the player’s doing, so we’ll show the score too.
Now that we have all the pieces in place,
we can actually build the game and decide
how everything interacts.
The key to how everything fits together is
the run function.
It’s a loop of checking whether the game
is over; moving all the objects; updating
the game; checking whether our hero is okay;
and making new trash.
As long as our hero hasn’t bumped into any
trash, the game continues.
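The run loop described above can be sketched like this. The class and method names here are stand-ins, not the lab's actual API; `TinyGame` is a deliberately stripped-down stub just to make the loop's shape runnable.

```python
# A stub "game" whose hero survives a fixed number of ticks, so the
# loop below has something to run against. Purely illustrative.
class TinyGame:
    def __init__(self, lifetime):
        self.over = False
        self.ticks = 0
        self.lifetime = lifetime  # tick at which the hero gets hit

    def move_objects(self):
        self.ticks += 1           # stands in for moving hero/blasts/trash

    def update(self):
        pass                      # collisions and score updates go here

    def hero_hit_trash(self):
        return self.ticks >= self.lifetime

    def spawn_trash(self):
        pass                      # new trash would appear here

def run(game):
    # The loop from the episode: move objects, update the game, check
    # whether the hero is okay, and make new trash -- until the hero
    # bumps into a piece of trash.
    while not game.over:
        game.move_objects()
        game.update()
        if game.hero_hit_trash():
            game.over = True
        game.spawn_trash()
    return game.ticks
```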
That’s pretty much it for the game mechanics.
We’ve created a hero, a blaster, trash,
and a scoreboard, and code that controls their interactions.
Step 2 is modeling the AI’s brain so John-Green-bot
can play!
And for that, we can turn back to our old
friend the neural network.
When I play games, I try to watch for the
biggest threat because I don’t want to lose.
So let’s program John-Green-bot to use a
similar strategy.
For his neural network’s input layer, let’s
consider the 5 pieces of trash that are closest
to his avatar.
(And remember, the closest trash might actually
be on the other side of the screen!)
Really, we want John-Green-bot to pay attention
to where the trash is and where it’s going.
So we want the X and Y positions relative
to the hero, the X and Y velocities relative
to the hero, and the size of each piece of trash.
That’s 5 inputs for 5 pieces of trash, so
our input layer is going to have 25 nodes.
For the hidden layers, let’s start small
and create 2 layers with 15 nodes each.
This is just a guess, so we can change it
later if we want.
Because the output of this neural network
is gameplay, we want the output nodes to be
connected to the movement of the hero and
shooting blasts.
So there will be 5 nodes total: an X and Y
for movement, an X and Y direction for aiming
the blaster, and whether or not to fire the blaster.
To start, the weights of the neural network
are initialized to 0, so the first time John-Green-bot
plays he basically sits there and does nothing.
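The network shape described above can be sketched with NumPy: 25 inputs (5 features for each of the 5 nearest trash pieces), two hidden layers of 15 nodes, and 5 outputs. With all weights starting at zero, every output is zero too, which is why John-Green-bot just sits there at first. The `tanh` activation here is an assumption for illustration; the lab's real network may use a different activation or structure.

```python
import numpy as np

# Layer sizes from the episode: 25 inputs, two hidden layers of 15,
# and 5 outputs (move x/y, aim x/y, fire or not).
LAYER_SIZES = [25, 15, 15, 5]

# Weights are initialized to zero, so the untrained bot does nothing.
weights = [np.zeros((a, b)) for a, b in zip(LAYER_SIZES, LAYER_SIZES[1:])]

def forward(inputs, weights):
    """Feed the 25 trash features through each layer in turn."""
    x = np.asarray(inputs, dtype=float)
    for w in weights:
        x = np.tanh(x @ w)  # illustrative activation choice
    return x  # [move_x, move_y, aim_x, aim_y, fire]
```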
To train his brain with regular supervised
learning, we’d normally say what the best
action is at each timestep.
But because losing TrashBlaster depends on
lots of collective actions and mistakes, not
just one key moment, supervised learning might
not be the right approach for us.
Instead, we’ll use reinforcement learning
strategies to train John-Green-bot based on
all the moves he makes from the beginning
to the end of a game, and we’ll evolve
a better AI using a genetic algorithm, which
is commonly referred to as a GA.
To start, we’ll create some number of John-Green-bots
with empty brains
(let’s say 200), and we’ll have them play the game.
They’re all pretty terrible, but because
of luck,
some will probably be a little bit less terrible.
In biological evolution, parents pass on most
of their characteristics to their offspring
when they reproduce.
But the new generation may have some small
differences, or mutations.
To replicate this, we’ll use code to take
the 100 highest-scoring John-Green-bots and
clone each of them as our reproduction step.
Then, we’ll slightly and randomly change
the weights in those 100 cloned neural networks,
which is our mutation step.
Right now, we’ll program a 5% chance that
any given weight will be mutated, and randomly
choose how much that weight mutates (so it
could be barely any change or a huge one).
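The mutation step described above can be sketched like this: each weight independently has a 5% chance of being nudged by a random amount, which could be tiny or huge. The normal distribution used for the nudge size is an illustrative assumption; the lab's code may draw the change differently.

```python
import numpy as np

MUTATION_RATE = 0.05  # 5% chance that any given weight mutates

def mutate(weights, rng=None):
    """Return a mutated copy of a list of weight matrices, leaving
    the original (the unmutated parent) untouched."""
    rng = rng or np.random.default_rng()
    new_weights = []
    for w in weights:
        mask = rng.random(w.shape) < MUTATION_RATE  # which weights change
        change = rng.normal(0.0, 1.0, w.shape)      # illustrative nudge size
        new_weights.append(w + mask * change)
    return new_weights
```

Because the function returns new arrays, the original parent's brain survives unchanged, which matters later when the mutants are combined with the unmutated originals.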
And you could experiment with this if you want.
Mutation affects how much the AI changes overall,
so it’s a little bit like the learning rate
that we talked about in previous episodes.
We have to try and balance steadily improving
each generation with making big changes that
might be really helpful (or harmful).
After we’ve created these 100 mutant John-Green-bots,
we’ll combine them with the 100 unmutated
original models (just in case the mutations
were harmful) and have them all play the game.
Then we evaluate, clone, and mutate them over
and over again.
Over time, the genetic algorithm usually makes
AI that are gradually better at whatever they’re
being asked to do, like play TrashBlaster.
This is because models with better mutations
will be more likely to score high and reproduce
in the future.
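One generation of this evaluate-clone-mutate cycle can be sketched as a short function. `play_game` and `mutate` are stand-ins for the lab's real game-scoring and mutation code; here they're passed in as parameters so the loop's structure stands on its own.

```python
def next_generation(population, play_game, mutate):
    """One GA generation: score everyone, keep the top-scoring half,
    and refill the population with mutated clones of the survivors."""
    scored = sorted(population, key=play_game, reverse=True)
    survivors = scored[:len(scored) // 2]        # e.g. the top 100 of 200
    mutants = [mutate(brain) for brain in survivors]
    return survivors + mutants                    # originals + mutated clones
```

Running this repeatedly is the whole training loop: models with better mutations score higher, survive, and seed the next generation.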
ALL of this stuff, from building John-Green-bot’s
neural network to defining mutation for our
genetic algorithm, is in this section of the code.
After setting up all that, we have to write
code to carefully define what doing “better”
at the game means.
Destroying a bunch of trash?
Staying alive for a long time?
Avoiding off-target blaster shots?
Together, these decisions about what “better”
means define an AI model’s fitness.
Programming this function is pretty much the
most important part of this lab, because how
we define fitness will affect how John-Green-bot’s
AI will evolve.
If we don’t carefully balance our fitness
function, his AI could end up doing some pretty
weird things.
For example, we could just define fitness
as how long the player stays alive, but then
John-Green-bot’s AI might play TrashAvoider
and dodge trash instead of TrashBlaster and
destroy trash.
But if we define the fitness to only be related
to how many trash pieces are destroyed, we
might get a wild hero that’s constantly blasting.
So, for now, I’m going to try a fitness
function that keeps the player alive and blasts trash.
We’ll define the fitness as +1 for every
second that John-Green-bot stays alive, and
+10 for every piece of trash that is zapped.
But it’s not as fun if the AI just blasts
everywhere, so let’s also add a penalty
of -2 for every blast he fires.
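Using the values from the episode, this fitness function is just a weighted sum and can be written in one line:

```python
def fitness(seconds_alive, trash_destroyed, blasts_fired):
    """Fitness from the episode: +1 per second survived, +10 per piece
    of trash destroyed, and a -2 penalty per blast fired."""
    return 1 * seconds_alive + 10 * trash_destroyed - 2 * blasts_fired
```

For instance, a bot that survives 30 seconds, destroys 8 pieces of trash, and fires 12 blasts scores 30 + 80 - 24 = 86. Changing these three weights is the easiest way to make John-Green-bot's AI evolve differently.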
The fitness for each John-Green-bot AI will
be updated continuously as he plays the game,
and it’ll be shown on the scoreboard we
created earlier.
You can take some time to play around with
this fitness function and watch how John-Green-bot’s
AI can learn and evolve differently.
Finally, we can move onto Step 3 and actually train John-Green-bot’s AI to blast some trash!
So first, we need to start up our game.
And to kick off the genetic algorithm, we
have to define how many randomly-wired John-Green-bot
models we want in our starting population.
Let’s stick with 200 for now.
If we waited for each John-Green-bot model
to start, play, and lose the game… this
training process could take DAYS.
But because our computer can multitask, we
can use a multiprocessing package to make
all 200 AI models play separate games at the
same time, which will be MUCH faster.
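Python's `multiprocessing` package makes this kind of parallel evaluation straightforward. This is a minimal sketch of the idea; `play_game` here is a dummy stand-in for the lab's real function that runs a full TrashBlaster game and returns a model's fitness.

```python
from multiprocessing import Pool

def play_game(brain_id):
    # Stand-in: the real lab code would run a full game for this model
    # and return its fitness. Here we just return a dummy score.
    return brain_id * 2

def evaluate_all(brain_ids):
    """Score many models at once by running their games in parallel
    worker processes instead of one after another."""
    with Pool() as pool:
        return pool.map(play_game, brain_ids)
```

`Pool.map` hands each model to a worker process and collects the scores in order, so 200 games take roughly as long as 200 divided by the number of CPU cores, rather than 200 games played back to back.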
And this is all part of the training.
This is where we’ll code in the details
of the genetic algorithm, like sorting John-Green-bots
by their fitness and choosing which ones will reproduce.
Now that we have the 100 John-Green-bots that
we want to reproduce, this code will clone
and mutate them so we have that combined group
of 100 old and 100 mutant AI models.
Then, we can run 200 more games for these
200 John-Green-bots.
It just takes a few seconds to go through
them all thanks to that last chunk of code.
And we can see how well they do!
The average score of the AI models that we
picked to reproduce is almost twice as high
as the overall average.
Which is good!
It means that John-Green-bot is learning!
We can even watch a replay of the best AI.
Uh… even the best isn’t very exciting
right now.
We can see the fitness function changing as
time passes, but the hero’s just sitting
there not getting hit and shooting forward
– we want John-Green-bot to actually play,
not just sit still and get lucky.
We can also see a visual representation of this
specific neural network, where redder connections
represent higher weights.
It’s tough to interpret what exactly this
diagram means, but we can keep it in mind
as we keep training John-Green-bot.
Genetic algorithms take time to evolve a good AI.
So let’s change the number of iterations
in the loop in STEP 3.3, and run the training
step 10 times to repeatedly copy, mutate, and test the fitness
of these AI models.
Okay, now I’ve trained 10 more iterations.
And if I view a replay of the last game, we
can see that John-Green-bot is doing a little better.
He’s moving around a little and actually
sort of aiming.
If we keep training, one model might get lucky,
destroy a bunch of trash, earn a high fitness,
and get copied and mutated to make future
generations even better.
But John-Green-bot needs lots of iterations
to get really good at TrashBlaster.
You might consider changing the number of
iterations to 50 or 100 times per click…
which might take a while.
Now here’s an example of a game after 15,600
training iterations. Just look at John-Green-bot
swimming and blasting trash like a pro.
And all this was done using a genetic algorithm, raw luck, and a carefully crafted fitness function.
Genetic algorithms tend to work pretty well
on small problems like getting good at TrashBlaster.
When the problems get bigger, the random mutations
of genetic algorithms are sometimes… well,
too random to create consistently good results.
So part of the reason this works so well is
because John-Green-bot’s neural network
is pretty tiny compared to many AIs created
for industrial-sized problems.
But still, it’s fun to experiment with AI
and games like TrashBlaster.
For example, you can try to change the values
of the fitness function and see how John-Green-bot’s
AI evolves differently.
Or you could change how the neural network
gets mutated, like by messing with the structure
instead of the weights.
Or you could change how much the run function
loops per second, from 5 times a second to
10 or 20, and give John-Green-bot superhuman reflexes.
You can download the clip of your AI playing
TrashBlaster by looking for game_animation.gif
in the file browser on the left-hand side
of the Colaboratory file.
You can also download source code from Github
to run on your own computer if you want to
experiment (we’ll leave a link in the description).
And next time, we’ll start shifting away
from games and learn about other ways that
humans and AI can work together in teams. See ya then.
Crash Course AI is produced in association
with PBS Digital Studios.
If you want to help keep Crash Course free
for everyone, forever, you can join our community
on Patreon.
And if you want to learn more about genetics
and evolution, check out Crash Course Biology.
