May 24, 2024

Snapdragon, Introduction

Part 1 of many

Introduction

It’s been a while since I’ve written up anything on this blog, mostly because it’s a lot easier to do side projects and write them up when I have lots of free time. Luckily, I was laid off recently, so we’re back in business.

After the layoff, I applied for a job at Second Dinner, the company that makes Marvel Snap. Although my last job was at a mobile game company, I don’t have a ton of professional experience working on games, so I decided to put together a bit of a demonstration project. (Spoiler alert: didn’t get the job, or even an interview, but I have no regrets.)

The demonstration project, which I have dubbed “Snapdragon”, is an AI to play, and build decks for, Marvel Snap. In this post I will give an overview of the general design and discuss a couple of technical points, and subsequent posts will talk about some practical challenges and my progress on them.

The current version of the code can be found here, and as of publication time I am still actively developing it. As such it’s a little messier than normal, and also anything I’ve written here is potentially in flux, although I will do my best to come back and make changes to these posts as appropriate.

A Brief(ish) Overview of Marvel Snap

If you’ve already played Marvel Snap or are familiar with the mechanics of it, you can skip this part. Or you could just go download and play it (it’s free-to-play and a single game takes a few minutes, max). But just for the sake of anybody else:

Marvel Snap is a digital collectible card game (like Magic the Gathering or Pokémon). Each game consists of six turns in which players play cards into three different locations. Each location can have up to four cards per player. Each card has a cost (in “energy”), and a power value. On each turn, players get an amount of energy equal to the turn count (1 on turn 1, 2 on turn 2, 6 on turn 6, etc.) which they can spent to play cards (and which doesn’t carry over to subsequent turns - you use it or lose it). At the end of the game, whoever has the most power at each location wins that location. Whoever wins two out of three locations wins the game. If nobody wins two locations, whoever has the most overall power wins the game. If this is also a tie, the game is a draw.

And, because it’s a collectible card game, almost every card has some sort of special ability that can modify/move/destroy/etc. other cards, or alter the fundamental rules of the game in some other way. Locations have powers like this as well.¹

Project Structure

Snapdragon consists of three things:

A core Marvel Snap engine
Monte Carlo game search, to play the game
A genetic algorithm, to build decks

These three things all work together and are interconnected. I will break this down more in the following sections, but the Monte Carlo game search logic specifically relies on the implementation of the game rules to function, and the genetic algorithm basically would not function properly without the Monte Carlo game search, although at the end of the day the main “output” of the system, so to speak, is the decks created by the genetic algorithm.

Pure Monte Carlo Game Search

You might have heard of Monte Carlo Tree Search before - it’s a common method to play discrete games (ones where you have a finite number of moves and take individual turns, like many board and card games). Basically it works like this:

Create a “tree” of possible future states of the game (based on the legal moves you can make)
Use some mechanism to determine which are the most promising
Simulate the game being played randomly from those states some number of times (N)
See how often you win from each of those future states
Use that win ratio to update your sense of which of those states is the best

For the purposes of this project, though, I opted to go with the more simple pure Monte Carlo game search, which is basically similar to the above except that you just:

Determine all legal moves from your current state
For each of those, run N random simulations to the end of the game
Choose whichever move produces the highest number of wins (or randomly choose if several are tied)

One important thing to note - and a general advantage of Monte Carlo search algorithms - is that the search algorithm basically doesn’t need to know how the game works at all. It just needs a working game engine that will provide it with a list of possible moves, allow the game to be simulated randomly, and report back on the winner at the end of the game.

Now, why did I go with the more simple game search? Mainly because I didn’t need to worry about the additional complexity. In some games, like Chess or Go, this would be a wildly inefficient approach, because the games can run for arbitrarily long periods of time and you have a huge number of legal moves. However, in Marvel Snap, games run for exactly six turns², and you typically don’t have that many possible moves in a given turn.³ As such, there’s no need to implement a complex heuristic or store a real representation of the tree - every time a player takes a turn, they just figure out all possible moves, simulate each of them, and pick one.

As noted above, this has some implications for the core game code. The biggest one is that I implemented the game state, and everything within it, as immutable record types, so that all operations within a game (playing a card, advancing to the next turn, etc.) are done by creating a new instance of the game and anything in it that has been modified. This makes it trivial to simulate all possible moves, because “applying a move” to the game state doesn’t actually change it, it just creates a new copy with that move applied.

If you want to see the code that performs this, most of it is in the MonteCarloSearchController class.

There is one key issue with the current implementation, which is that the MonteCarloSearchController is given the full state of the game, including information that would normally not be revealed to players (what cards are still in their and their opponent’s decks, what cards are in their opponent’s hand, and what the unrevealed locations⁴ are). This means that the players are basically psychic. I haven’t dealt with this yet because any solution will be fairly complicated, and at the moment players are still effectively on a level playing field, but there are some implications I will discuss later.

Genetic Algorithm

A genetic algorithm is basically an attempt to mimic biological evolution, which I will briefly describe:

In biological evolution, organisms have some kind of gene sequence that determines various traits about them. Some organisms wind up with traits that benefit them more than others, and those organisms are more likely to survive and reproduce. Therefore, the genes that contribute to those traits get passed on more frequently. Over time, the entire population tends to have more positive traits and fewer negative traits. There are some caveats here. First, there’s lots of random chance involved in any single life, so organisms with better genes can still have worse life outcomes. Also “positive” and “negative” can be somewhat relative - being smart but scrawny is a bad tradeoff for a caveman but a pretty good one for a software engineer (I can confidently say), so really traits have to be considered with respect to their environment.

Anyway, getting back to our collectible card game. Players in Marvel Snap play with a 12-card deck that they select in advance (and you can’t have duplicate cards in a deck). For our purposes, we can treat these 12 cards as the “gene sequence” for a deck. A population can be created by just repeatedly picking any 12 cards at random. Once we have a population, we determine the “fitness” of each deck by randomly playing them against each other and counting the number of wins for each deck. Whichever decks win the most get to “reproduce”, which means that we repeatedly pick any two of them, “cross” them with each other to create an “offspring” deck, and then repeat until we have the same number of decks as in the previous generation.

The “crossing” process is pretty basic as well (at least at the moment) - we shuffle the cards in both decks, and then go through each of the 12 positions and randomly pick either the card from the first deck or the card from the second deck.⁵

Putting It All Together

So, our basic idea for the process is as follows:

We generate a bunch of random decks
We play the decks against each other to determine which ones are the best
The best decks reproduce
Over time we should get better decks

This all seems theoretically sound, but Machine Learning algorithms can be tricky, so in the next post I’ll examine some assumptions that are being made here, how those assumptions might be incorrect, and how to test them.

Some of these card/location abilities are pretty dramatic, such as Galactus which can potentially destroy two of the three game locations, or Bar With No Name, which changes the scoring rules so the player with the least power wins that location. ↩︎
Or seven, if one of your locations happens to be Limbo or somebody creates it by playing Magik. ↩︎
Unless somebody plays Cloak, which is the bane of my CPU and will probably be discussed later. ↩︎
The three locations start off unknown to the player, and one gets revealed at the start of each of the first three turns. ↩︎
Because there are no duplicates, sometimes one of the choices is invalid and we automatically pick the other one. Sometimes neither choice is valid, so we effectively just pick a different random card from another slot. This probably has some statistical implications which may be explored in a later post. ↩︎