Friday, January 1, 2010

Tuning With Metrics Redux

A while back I posted about the system I devised for recording player deaths and plotting them on level images to find trouble spots.  Since then I've improved the system a little and written some more tools to crunch the data I am receiving.  Several hundred players have now participated in testing Replica Island, so I have quite a lot of feedback to process.  The goal, of course, is to use anonymous statistics about play (specifically, where players died) to smooth out the difficulty curve and find frustration spikes.

Rendering death locations on level maps is a good way to go about tuning individual levels, but what about the game as a whole?  I now have enough data to look at how players move through the game.  Maybe I can draw some conclusions about how the game is paced and how smoothly the difficulty increases.

The first thing I did was graph the number of deaths for each level.  That graph looks like this:

As you can see, a few levels jump right out: levels 17, 21, 39, 32, and 34 all look like spikes in the death graph.  My theory here is that the game should get harder at almost exactly the same rate that the player gets better, so I would expect the number of deaths per level to stay uniform across the entire game.  And actually, this graph suggests that with the exception of the outliers mentioned above, I'm doing an OK job of keeping the difficulty increasing at a constant rate.  After fixing the very hard levels, I can see that in the middle of the game, from around level 11 to level 24, there are some very easy levels as well.  I should probably go back and mix those up a bit to increase the difficulty.  So far, so good.  Looks like this is a pretty good metric for assessing difficulty.
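For readers who want to try this kind of count themselves, here is a minimal sketch in Python. It is not the actual tool used for Replica Island. The dict layout, the "complete" label, the function names, and the "more than twice the median" rule for calling something a spike are all assumptions made for illustration.

```python
from collections import Counter
from statistics import median


def deaths_per_level(events):
    """Count death events per level.

    `events` is an iterable of dicts with at least "type" and "level" keys
    (a hypothetical in-memory form of the logged data).
    """
    return Counter(e["level"] for e in events if e["type"] == "death")


def find_spikes(counts, factor=2.0):
    """Return the levels whose death count exceeds `factor` times the median.

    The threshold is an arbitrary illustrative choice, not a rule from the post.
    """
    if not counts:
        return []
    typical = median(counts.values())
    return sorted(level for level, n in counts.items() if n > factor * typical)


# Tiny made-up sample, for illustration only.
sample = (
    [{"type": "death", "level": 1}] * 2
    + [{"type": "death", "level": 2}] * 2
    + [{"type": "death", "level": 3}] * 7
    + [{"type": "complete", "level": 1}]  # non-death events are ignored
)

counts = deaths_per_level(sample)
print(dict(counts))
print(find_spikes(counts))
```

Comparing against the median rather than the mean keeps one or two terrible levels from dragging the baseline up and hiding themselves.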

But one thing this data doesn't tell me is how frustrating these levels are.  Just because a player died more than once doesn't mean that the level was frustrating; consider a poorly constructed level in which the player isn't in immediate danger but simply cannot progress.  In a case like that, the level could be extremely frustrating without the player dying a lot, and this graph wouldn't catch it.

Let me tell you more about the data I have.  I collect a few basic fields for each event.  They are:

  • event type - the type of event that occurred.  Usually "death."
  • x, y - the position in the level in which the event occurred.
  • time - the time since the level started, in seconds, until the event.
  • level - the level in which the event occurred.
  • session - a unique session key associated with the user (based on a random number generator).

For the first version of the game that I sent out to testers, only death events were recorded.  In the most recent version I added a new type: level completion events.  When a player completes a level, that event is logged to the server.  With level completion events in place I can now track the total amount of time that a player spent on any given level by summing all of the time entries from death events associated with that user on that level and adding the result to the time recorded in the level completion event.  Do this for every user and average the results and I can get a graph of how long each level takes to complete, including restarts from death.  That graph looks like this:
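The calculation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the real analysis script: the `Event` class, the "death" and "complete" labels, and the function name are hypothetical, and sessions that never finished a level are left out of the average (a choice the post does not spell out).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    type: str      # "death" or "complete" (labels are assumed)
    x: float       # position in the level
    y: float
    time: float    # seconds since the level started
    level: int
    session: str   # random per-user session key


def average_completion_time(events):
    """Average total seconds per level, including time lost to deaths.

    For each (level, session) pair, sum the times of all death events and
    add the time from the completion event. Pairs with no completion event
    are skipped, which is an assumption of this sketch.
    """
    totals = defaultdict(float)
    completed = set()
    for e in events:
        if e.type not in ("death", "complete"):
            continue
        key = (e.level, e.session)
        totals[key] += e.time
        if e.type == "complete":
            completed.add(key)

    per_level = defaultdict(list)
    for level, session in completed:
        per_level[level].append(totals[(level, session)])
    return {level: sum(ts) / len(ts) for level, ts in per_level.items()}


# Made-up sample: session "a" dies twice before finishing, "b" finishes
# on the first try, and "c" dies once and never finishes.
sample = [
    Event("death", 10.0, 5.0, 30.0, 1, "a"),
    Event("death", 20.0, 5.0, 45.0, 1, "a"),
    Event("complete", 99.0, 5.0, 120.0, 1, "a"),
    Event("complete", 99.0, 5.0, 105.0, 1, "b"),
    Event("death", 10.0, -1.0, 10.0, 1, "c"),
]

averages = average_completion_time(sample)
print(averages)
```

One caveat with this approach: if the same session plays a level more than once, all of its attempts are lumped into a single total.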

Now, Replica Island is a game on a phone.  It needs to be playable in short bursts.  So my levels are all designed with the idea that you can finish them in around 5 minutes.  And looking at the graph, it looks like my levels are pretty much in line with that expectation.  Levels 32 and 34, which were outliers on the deaths per level chart, are again outliers here; those levels must really suck.  I mean, it's the end of the game, so the levels are supposed to be pretty hard, but the average player dies 8 times on level 32 and spends a total of almost 20 minutes trying to complete it.  That's probably super frustrating.  That level needs work.

I can also see that there's something very wrong with level 18.  People are completing it in almost no time whatsoever, even though the other graph shows that the average player dies 2.25 times on that level.  And again, with the exception of the very long outliers, I can see that the levels toward the end are sometimes too easy; ideally all the bars on this graph should rarely vary from the 5 minute mark.

Of course, these are all averages.  Some players might spend a very, very long time on a particular level (and actually, with my expanded view of the data, I can see some poor souls really are getting stuck for hours).  But on the other hand, I can't design levels for every level of difficulty; hitting the perfect median that is challenging for novices but not boring for pros is hard, but I think that averages are a good guideline.

So now I know which levels are probably worth investigating.  A cursory look at the levels identified by the death event graph revealed an interesting pattern: a lot of people seem to be dying by falling down pits.  Replica Island, like many other side-scroller games before it, features pits that you must jump across.  If you fall down a pit and off the bottom of the level, it's game over.  It turns out that many of the levels in which people are dying in droves are levels that feature a lot of pits, like this one:

Every little dot on this image is a player who died, and 99% of them are in pits.  So it seems like I might have a problem with pits that affects a lot of different players.  Rather than make that assumption, however, I graphed it:

I can detect a pit death from my data because the y coordinate in my x,y field goes negative only in this case.  And yep, it looks like levels with a lot of pits are deadly.  
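The negative-y test is simple enough to show in a few lines. The sketch below assumes events are available as dicts with "type", "level", and "y" keys; the layout and the function name are hypothetical, and only the "y goes negative" rule comes from the data described above.

```python
from collections import Counter


def pit_death_share(events):
    """Return, per level, the fraction of deaths that were pit deaths.

    A pit death is a death event whose y coordinate is negative.
    """
    deaths = Counter()
    pit_deaths = Counter()
    for e in events:
        if e["type"] != "death":
            continue
        deaths[e["level"]] += 1
        if e["y"] < 0:
            pit_deaths[e["level"]] += 1
    # Levels with no deaths never appear in `deaths`, so no division by zero.
    return {level: pit_deaths[level] / n for level, n in deaths.items()}


# Made-up sample: level 34 is mostly pit deaths, level 32 has none.
sample = [
    {"type": "death", "level": 34, "y": -3.0},
    {"type": "death", "level": 34, "y": -1.5},
    {"type": "death", "level": 34, "y": -0.2},
    {"type": "death", "level": 34, "y": 12.0},
    {"type": "death", "level": 32, "y": 40.0},
    {"type": "death", "level": 32, "y": 8.0},
]

shares = pit_death_share(sample)
print(shares)
```

Reporting a share rather than a raw count makes it easier to compare a level with few deaths against one with many.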

So how pervasive is this problem?  I know by looking at the death event graph and the completion time graph that I've got some outlier levels to deal with, and this latest graph tells me that some of the levels are affected by some specific problem with pits.  So what's the relationship between the pits and the overall level flow that I'm trying to control?  Let's graph it and find out!

This is pretty neat, right?  I can tell that in several cases, in particular level 34, the pit problem is leading to a lot of deaths which is causing the level to take forever to complete.  I can also see that the pit issue isn't my only problem: problematic level 32 has almost no pits, and yet the number of deaths and the time to complete that level is still way out of range compared to the rest of the game.

In case you are interested, the problem with pits is actually pretty simple: my camera moves too much in the vertical direction.  Classic platformers that involve a lot of jumping, like Super Mario Bros., are very careful to limit the amount of vertical movement required by the camera when making jumps.  You can always see the landing point before you leave the ground in the Mario games, which makes jumping over pits a fun challenge rather than a frustrating penalty.  But in Replica Island, the protagonist can fly, which requires the camera to track in the vertical axis a lot more aggressively.  That means that if you are falling you might not actually realize that there isn't going to be any ground below you until it's too late.  The solution to this problem is probably to make the camera smarter about pits, and maybe to throw some visual indicator into the mix as well.  We'll see--I have a couple of solutions in mind but it'll take another round of testing to verify them.

Anyway, based on an extremely simple packet of data from a sample of several hundred players, I can pretty accurately pinpoint levels that need polish and game mechanics that are sources of error.  As I push forward to a public release, my goal should be to first fix the mechanical issue (the pit problem), and then polish levels like #32 that are broken for some other reason.  That's a much better task list than just some gut feeling about the quality of my levels.


  1. That's a great piece of know-how shared right there :)

  2. I am posting this here because this is the closest related blog post, but my comment is really related to The reason so many people die at those spots in level 27 (Memory #34) is that the spikes are not being rendered. I died at least a couple times before figuring out where the invisible spikes were.

  3. Fascinating...
    I am impressed and thought I'd dig out this functionality from the code, until now - that I found out it's been frameworked...
    Turns out "Google Analytics for mobile" could be leveraged to obtain the metrics and work with them.
    Obviously it only provides the machinery - one would still have to decide how to use it.

    Thought it'd be interesting for many developers though I haven't yet tried it myself.

    For an introduction, watch "Analyzing and monetizing your Android & iPhone apps", an excellent IO session, second only to "Writing real-time games for Android redux".
    For details:

  4. Sometimes I wonder how a few of these gameplay mechanics map to "real" work life.
