
Sunday, November 21, 2010

Building a Reflective Object System in C++

Every game engine I've worked with for the last several years has had some sort of reflection system.  Reflection, and its cousin introspection, are features supported by some languages that allow runtime inspection of object structure and types.  My most recent game, Replica Island, is a Java application, and it makes use of Java's Class object as well as handy type-related tests like instanceof.  Many other languages support some sort of reflection, especially modern, VM-based ones like C#.  Reflective objects can provide the answers to all sorts of interesting questions at runtime, like "is this anonymous pointer derived from class X?" or "does this object have a field named Y?"  It's also common for reflective systems to support allocating and constructing objects by string.

These traits make it very easy to serialize and deserialize reflective objects: entire object trees can be written to disk and then read back again without the serialization code having to know anything about the objects it is writing.  Reflection also makes it very easy to write tools; if the tools and the runtime have access to the same type data, the tools can output data in the same format that the runtime will access it in memory (see my previous post about loading object hierarchies).  Reflection systems can also make for some pretty powerful tools: for example, it's easy to build a Property Editor dialog that can edit any object in the code base, and automatically adds support for new classes as they are written over the course of development.

But C++ doesn't support reflection.  There's RTTI, which gives a tiny little sample of the power that a real reflection system brings, but it's hardly worth the overhead on its own.  C++'s powerful template system is often used in places that reflection might be used in other languages (e.g. type independent operators), but that method is also restricted to a small subset of a fully reflective system.  If we want real reflection in C++, we'll need to build it ourselves.

Before we dive into the code, let me take a moment to describe the goals of my reflective object system.  Other reflection systems might choose a different approach based on different needs (compare, for example, Protocol Buffers to the method I'm about to describe).  For me, the goals are:
  1. Runtime type information of pointers.
  2. Ability to iterate over the fields in an object.
  3. Ability to query and set a field in an object.
  4. Ability to allocate and construct an object by string, or based on other meta data.
  5. No extra overhead for normal class use.
The Java Class object is a pretty good model to follow here: each class has a static object that describes the class in detail.  Using classes normally involves no extra overhead, but if we choose we can also load the Class descriptor and work with objects using reflection.  So my goal is to make something like the Java Class object in C++; my approach is to generate a class for each reflected class that holds metadata about the class it describes.  A "meta object."

Once I start generating meta objects for individual classes, servicing the first goal of anonymous pointer typing isn't very hard.  As long as we know that the pointer comes from some base type that implements a virtual accessor for returning the meta object, we can pull a static class definition out and compare it to other classes, or to a registry of meta objects (and, if we're tricky, we can even do this when we can't guarantee that the pointer is derived from some known base).  So let's assume we have a way to generate meta information for each class, wrap it up in a class called a MetaObject, and stuff it as a static field into the class it describes.
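The virtual-accessor approach can be sketched in a few lines.  This is a minimal, hypothetical version of the pattern (the real MetaObject carries far more data than a name); comparing the static meta pointers answers the "what type is this anonymous pointer?" question:

```cpp
#include <cassert>

// Hypothetical sketch: MetaObject, MetaBase, and getMetaObject() follow
// the scheme described above, pared down to the bare minimum.
struct MetaObject {
    const char* name;  // class name, e.g. "Foo"
};

struct MetaBase {
    virtual const MetaObject* getMetaObject() const = 0;
    virtual ~MetaBase() {}
};

struct Foo : public MetaBase {
    static const MetaObject sMeta;
    virtual const MetaObject* getMetaObject() const { return &sMeta; }
};
const MetaObject Foo::sMeta = { "Foo" };

struct Bar : public MetaBase {
    static const MetaObject sMeta;
    virtual const MetaObject* getMetaObject() const { return &sMeta; }
};
const MetaObject Bar::sMeta = { "Bar" };

// Type test on an anonymous pointer: compare meta object addresses.
bool isExactly(const MetaBase* p, const MetaObject* meta) {
    return p->getMetaObject() == meta;
}
```

Because each class has exactly one static MetaObject, a pointer comparison is enough; no string compares or RTTI required.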

The next question is, what does this MetaObject class contain?  Well, per requirement #2, it must at least contain some information about fields.  In order to access fields within a class we'll need to know the offset of the field, its size (and, if it's an array, the size of each element), and probably the name of the field and the name of its type as strings.

Now might be a good time to think about what a C++ object actually looks like in memory.  Say we have the following object:
class Foo : public MetaBase
{
   public:
      static const MetaObject sMeta;  // The object describing Foo
      virtual const MetaObject* getMetaObject() const { return &sMeta; }  // required by the abstract base
 
      void setup() { mBar = 10; mBaz = 267; }
   private:
      int mBar;
      int mBaz;
};

If we allocate an instance of Foo and then pop open our debugger to inspect the raw memory, it probably looks something like this (assuming MetaBase has no fields):

C0 A1 D0 F7     // Pointer to the virtual table; some random address
00 00 00 0A     // Value of mBar
00 00 01 0B     // Value of mBaz

I say "probably" because the actual layout of a class in memory is basically undefined in C++; as long as it behaves the way the programmer expects it to, the compiler can do whatever it wants.  For example, the actual size of this object might very well be 16 bytes (rather than the twelve bytes shown above), with zero padding or junk in the last word; some architectures require such padding to align objects to word boundaries in memory.  Or the vtable pointer might be at the end of the object rather than the beginning (though, to be fair, all the compilers I've worked with have put it at the top).

Anyway, assuming this is what we see in the debugger, it's not hard to pull out the information we want for our meta object.  The value of mBar is at offset 4 (the first four bytes are the address of the virtual table), and the value of mBaz is at offset 8.  We know that sizeof(int) == 4.  And sMeta, because it is static, doesn't actually appear in the class instance at all--it's stored off somewhere else in the data segment.  If we had this information about every field in every class, we'd easily be able to access and modify fields in objects without knowing the type of the object, which satisfies most of the goals above.  And since this data is stored outside the object itself, there shouldn't be any overhead to standard class use.
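Given an offset and a size, reading or writing a field reduces to pointer arithmetic on the object's base address.  A minimal sketch, using a simplified Foo without the virtual base so the example stands alone:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Simplified stand-in for the article's Foo (no vtable, to keep it short).
struct Foo {
    int mBar;
    int mBaz;
};

// Write a field at a known byte offset within an object, without
// knowing the object's static type.
void setField(void* pObject, size_t offset, const void* pData, size_t size) {
    std::memcpy(static_cast<char*>(pObject) + offset, pData, size);
}

// Read a field the same way.
void getField(const void* pObject, size_t offset, void* pOut, size_t size) {
    std::memcpy(pOut, static_cast<const char*>(pObject) + offset, size);
}
```

With the real system, `offset` and `size` come out of the MetaField table instead of being computed at the call site.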

Here's an abbreviated version of the object I use to describe individual fields in classes.  You can see the entire object here.

class MetaField : public Object
{
public:
  enum MetaType
  {
    TYPE_value,
    TYPE_pointer,
  };
  
  MetaField(const MetaType type, const char* pName, 
    const char* pTypeName, int offset, size_t fieldSize)
  : mType(type),
    mpName(pName),
    mpTypeName(pTypeName),
    mOffset(offset),
    mFieldSize(fieldSize)
  {}
  
  const char* getName() const;
  const char* getTypeName() const;
  int getOffset() const;
  size_t getFieldSize() const;
  MetaType getStorageType() const;
  
  virtual void* get(const MetaBase* pObject) const;
  virtual void set(MetaBase* pObject, const void* pData) const;
  
private:
  const MetaType mType;
  const char* mpName;
  const char* mpTypeName;
  const int mOffset;
  const size_t mFieldSize; 
};


A static array of MetaFields is defined for each class and wrapped up in a container MetaObject, which also provides factory methods and some other utility functions.  You can see that object here.  These two objects, MetaField and MetaObject, make up the core of my C++ reflection implementation.
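Looking up a field by string is then just a linear scan over that static array.  A pared-down sketch of the idea (the real MetaObject linked above also carries factory pointers, parent links, and more):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Simplified stand-ins for the real MetaField/MetaObject pair.
struct MetaField {
    const char* mpName;
    int mOffset;
    size_t mFieldSize;
};

struct MetaObject {
    const char* mpName;
    const MetaField* const* mppFields;
    int mFieldCount;

    // Satisfies goal #3: query a field descriptor by string.
    const MetaField* findField(const char* pName) const {
        for (int i = 0; i < mFieldCount; i++) {
            if (std::strcmp(mppFields[i]->mpName, pName) == 0) {
                return mppFields[i];
            }
        }
        return 0;
    }
};
```

A linear scan is fine for the handful of fields most classes have; a hash of the field name could replace it if lookup ever shows up in a profile.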

So we know what information we need, and we have a class structure to describe it.  The hard part is finding a way to generate this information automatically in a compiler-independent way.  We could fill out MetaObjects by hand for every class, but that's error prone.  It might be possible to pull this information out of the debug symbols generated for a debug build, but symbol formats change across compilers and we don't want to compile a debug build for every release build.  We could probably contort C++ macro expansion to generate meta data, but in the interests of sanity let's not do that.  We could write a preprocessor that walks our header files and generates the necessary meta data, but that's actually a sort of annoying problem because of the idiosyncrasies of C++ syntax.

The solution I chose is to use a separate format, an interface definition language, to generate metadata-laden C++ header files using a preprocessor tool.  The idea is that you write your class headers in the IDL, which converts them to C++ and generates the necessary metadata objects in the output as it goes.  I leverage compiler intrinsics like sizeof() and offsetof() to let the compiler provide the appropriate field information (meaning I don't care where the vtable is stored, or what padding might be inserted).  My IDL looks like this:

metaclass PhysicsComponent
{
  base GameComponent

  function void update(const float timeDelta, GameObject* pParentObject) { public }
  function virtual bool runsInPhase(const GameObjectSystem::GameObjectUpdatePhase phase) { public }
  
  field mMass { type float, value 1.0f, private }
  field mStaticFrictionCoeffecient { type float, value 0.5f, private }
  field mDynamicFrictionCoeffecient { type float, value 0.1f, private }
  // mBounciness = coefficient of restitution. 1.0 = super bouncy, 0.0 = no bounce.
  field mBounciness { type float, value 0.0f, private }
  field mInertia { type float, value 0.1f, private }
}

.. and the output of the preprocessor tool looks like this:

class PhysicsComponent : public GameComponent
{
 public:
  void update(const float timeDelta, GameObject* pParentObject);
  virtual bool runsInPhase(const GameObjectSystem::GameObjectUpdatePhase phase);
 private:
  float mMass;
  float mStaticFrictionCoeffecient;
  float mDynamicFrictionCoeffecient;
  // mBounciness = coefficient of restitution. 1.0 = super bouncy, 0.0 = no bounce.
  float mBounciness;
  float mInertia;
 public:
  // AUTO-GENERATED CODE
  static void initialize(PhysicsComponent* pObject);
  static PhysicsComponent* factory(void* pAddress = 0);
  static void* factoryRaw(void* pAddress, bool initializeObject);
  static PhysicsComponent* arrayFactory(int elementCount);
  static const MetaObject* getClassMetaObject();
  virtual const MetaObject* getMetaObject() const;
  static bool registerMetaData();
  static PhysicsComponent* dynamicCast(MetaBase* pObject);
};

You can see that the IDL pretty much just spits C++ out exactly as it was written, but in the process it also records enough information to generate the functions at the bottom of the class.  The most interesting of those is getClassMetaObject(), which is a static method that defines the meta data itself:

inline const MetaObject* PhysicsComponent::getClassMetaObject()
{
  static MetaField field_mMass(MetaField::TYPE_value, "mMass", "float",
    offsetof(PhysicsComponent, mMass), sizeof(float));
  
  static MetaField field_mStaticFrictionCoeffecient(MetaField::TYPE_value, 
    "mStaticFrictionCoeffecient", "float", 
    offsetof(PhysicsComponent, mStaticFrictionCoeffecient), sizeof(float));
  
  static MetaField field_mDynamicFrictionCoeffecient(MetaField::TYPE_value, 
    "mDynamicFrictionCoeffecient", "float", 
    offsetof(PhysicsComponent, mDynamicFrictionCoeffecient), sizeof(float));
  
  static MetaField field_mBounciness(MetaField::TYPE_value, "mBounciness", "float",
    offsetof(PhysicsComponent, mBounciness), sizeof(float));
  
  static MetaField field_mInertia(MetaField::TYPE_value, "mInertia", "float",
    offsetof(PhysicsComponent, mInertia), sizeof(float));
  
  static const MetaField* fields[] =
  {
    &field_mMass,
    &field_mStaticFrictionCoeffecient,
    &field_mDynamicFrictionCoeffecient,
    &field_mBounciness,
    &field_mInertia,
  };
  
  static MetaObject meta(
    "PhysicsComponent", 
    MetaObject::generateTypeIDFromString("PhysicsComponent"),
    MetaObject::generateTypeIDFromString("GameComponent"), 
    sizeof(PhysicsComponent),
    static_cast<int>(sizeof(fields) / sizeof(MetaField*)), 
    fields, 
    GameComponent::getClassMetaObject(), 
    &PhysicsComponent::factoryRaw);
  
  return &meta;
}

*Note that, in more recent versions, I've replaced offsetof() with a macro that does the right thing for compilers that do not support that intrinsic.  The offsetof() macro isn't really kosher in C++, but for my purposes it works fine.  If you want to learn all about why it's rough for "non-POD types," try Stack Overflow.
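For reference, the classic fallback looks something like this.  The macro name here is invented, and as the note says, this trick is technically undefined behavior for non-POD types; it just happens to work on the compilers this kind of system targets:

```cpp
#include <cassert>
#include <cstddef>

// Classic offsetof() fallback: pretend an object lives at address zero
// and measure the member's address. Undefined behavior in the strict
// sense, but widely relied upon in practice.
#define META_OFFSET_OF(type, member) \
    (reinterpret_cast<size_t>(&(reinterpret_cast<type*>(0)->member)))

struct Probe {
    int a;
    int b;
};
```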

With this data, I now have a pretty complete reflection system in C++.  I can iterate over fields in a class, look them up by string, get and set their values given an anonymous pointer.  I can compare object types without knowing the type itself (I can implement my own dynamic_cast by walking up the MetaObject parent hierarchy and comparing MetaObject pointers until I find a match or reach the root).  It's very easy to construct objects from a file, or serialize a whole object tree.  I can, for example, make a memory manager that can output an entire dump of the heap, annotated with type and field information for every block.  And I can compile all of my objects into a DLL and load them into a tool and have full type information outside of the game engine environment.  Sweet!
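That hand-rolled dynamic_cast is just a walk up the parent chain.  A pared-down sketch, keeping only the parent link out of the full MetaObject:

```cpp
#include <cassert>

// Hypothetical sketch: each MetaObject points at its parent class's
// MetaObject, so a cast check is a walk up the chain.
struct MetaObject {
    const char* mpName;
    const MetaObject* mpParent;  // null at the root of the hierarchy
};

// Returns true if 'from' is 'target' or derives from it.
bool isDerivedFrom(const MetaObject* from, const MetaObject* target) {
    for (const MetaObject* m = from; m != 0; m = m->mpParent) {
        if (m == target) {
            return true;
        }
    }
    return false;
}
```

The generated dynamicCast() just runs this check and returns either the typed pointer or null, exactly like the built-in dynamic_cast, but with no RTTI.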

There are, however, many caveats.  This implementation doesn't even attempt to support templates, multiple inheritance, or enums.  If we serialize and deserialize using only this data, some standard programming practices start to get screwed: what happens when we can construct objects without invoking the constructor?  How do we deal with invasive smart pointers or other data that weakly links to objects outside of the immediate pointer tree?  How do we mix objects that have this meta data and objects that do not?  How do we deal with complex types like std::vector?  If object structure is compiler dependent, how can we serialize class contents in a way that is safe across architectures?  These are all solvable problems, but the solutions are all pretty complicated.  They often involve dusty corners of the C++ language that I rarely visit, like placement new.  If you get into this stuff, Stanley Lippman is your new best friend.

But even with those caveats in mind, the power of reflection is absolutely worth the price of admission.  It's the first chunk of code I write or port whenever I start a new project in C++.  It was the first bits of my old game engine that I got running on Android, and is now the core of the engine I am building on that platform.  Reflection is not a simple bit of infrastructure to get up and running, but once you have it it's really hard to go back.

Sunday, November 7, 2010

Leveraging Java and C++ for Hybrid Games

I've been thinking a lot lately about how best to use the resources that Android provides for game development.  A lot of the game developers I know (and I know a lot!) are quick to treat any new platform as a dumb host to their game engines.  Usually developers have a bunch of code, or even entire games, that are written to be aggressively cross-platform, so all they need is a way to compile the source, attach it to input events, and draw to screen.  Any platform that can provide those basics can host their tech, so when evaluating a new platform to support, these developers only look at the most basic level of functionality.

This is certainly true on Android as well.  Lots of developers look at the NDK and see a C++ environment that they can run their code in and decide that supporting the platform only requires gluing their existing code to the hooks that Android exposes.  And that's true--if your only goal is to port an existing game from one platform to another, only the minimal set of common functionality is necessary to get something up and running.

But since I am in a position to write games exclusively for Android, I've been thinking about how to leverage parts of the platform that most game developers ignore: the OS and Java runtime itself.  There's a lot of functionality there, and maybe there are ways that I could leverage it to make better games.

One project I've been working on recently is a little game framework using the NDK.  My friend Gregg and I ported Google's open source browser-based 3D framework, O3D, to Android a while back, and I've been using that to get some dudes running around on the screen.  O3D has a big Javascript component which we've ignored; the rest of it is a C++-based, shader-centric rendering backend.  Gregg did the heavy lifting of getting the thing to run on OpenGL ES 2.0 and I've been hacking in bits and pieces of old game engines on top.  The result is that we have a pretty complete rendering engine running on Android without a whole lot of effort.

It's a lot of code considering that it doesn't really do anything yet--almost 500k lines of C/C++.  But it wasn't hard to port because in the end, Android is really just another Linux OS with hooks into things like OpenGL ES 2.0.  So for this work, we basically did what lots of other game developers do: we ported the code using as little Android-specific stuff as possible and got something up pretty fast.

I've been slowly adding game code to this project, and as of this writing I have an early prototype of a shooting game up and running: you can run a little test character around and shoot placeholder art zombies with dual thumbsticks.  It's not a game, yet, but it's enough to prove out the code.

Not a game, yet.  Placeholder art courtesy of 3drt.com.
If this thing ever gets off the ground, it'll be a game written almost entirely in C++, with just a few hooks back to Java for input, sound, and application life cycle events.  Just like most games that are built to be cross platform, or brought over from other platforms.

But I think there's an opportunity to use Android's unique hybrid application structure to do things that might be difficult or impossible on other platforms.  There are areas where I can get a lot of value out of Java while leaving the game engine and performance-critical code all in C++.

For example, I've hooked up a web server to this game.  It's a very, very simple web server; I found some code on the web that implemented a basic HTTP server in Java, copied and pasted it, and then hacked it up until it did what I needed.  It runs in a separate thread within the main game process, and allows us to connect to the device from a desktop browser while the game is running.  Here's a graphic to illustrate the structure of the code.

The high-level structure of this engine.  Red bits are Android Framework, blue are separate threads, and green is native code.

I'm sure you're reading this and are thinking, why the heck would you want to run a web server inside a game?!  Well, sir, I'll tell you.  With the web server in place, I've opened the door to real-time game editing.  This web server doesn't serve static pages, it reads and writes data directly to and from the native engine.  I can, for example, pipe O3D's scene graph up to the web server and let the user browse its structure from their browser.  I can do that with my game objects too (thanks to the meta system I referenced in the last post, which lets me query the structure of a given object by string).  And perhaps most useful, I implemented a simple interface for editing shader code on the fly; I can write vertex and fragment shaders right in the browser, click a button, and immediately see the rendering change in the running game.

This obviously isn't a full runtime editor, but with just a little bit of effort it's already proved to be pretty powerful.  The whole thing is exceedingly simple: my copy-pasted web server calls down into the native code via JNI and just passes a string payload, which a few hundred lines of runtime code process and then return to the server and thus to the browser.  I'll extend this interface as necessary to other aspects of the game; building a way to do very fast iteration on things like game play physics and shaders is how you turn a mediocre game into a good one.
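The native side of that bridge can be sketched as a string-in, string-out dispatch table; in the real thing JNI marshals the strings across, and the command and handler names below are made up for illustration:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical native dispatch: the Java web server passes one command
// string down (via JNI in the real system) and pipes the returned
// string back up to the browser.
typedef std::string (*CommandHandler)(const std::string& args);

static std::map<std::string, CommandHandler>& handlerRegistry() {
    static std::map<std::string, CommandHandler> registry;
    return registry;
}

void registerHandler(const std::string& name, CommandHandler handler) {
    handlerRegistry()[name] = handler;
}

std::string handleCommand(const std::string& name, const std::string& args) {
    std::map<std::string, CommandHandler>::iterator it =
        handlerRegistry().find(name);
    if (it == handlerRegistry().end()) {
        return "{\"error\":\"unknown command\"}";
    }
    return it->second(args);
}

// Example handler: echo the arguments back as JSON.
std::string echoHandler(const std::string& args) {
    return "{\"echo\":\"" + args + "\"}";
}
```

New editor features then amount to registering another handler; the Java server and the JNI plumbing never change.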

Despite the simplicity of the web server system, I'm not sure it would have been as successful on other platforms.  C++ is great for rendering a 3D shader-based game, but it's actually a bit arduous to use for building a web server.  Java, on the other hand, is a great language to write a web server in--it's actually designed with that kind of application in mind.  Android hybrid apps let you leverage both native code and Java simultaneously, which can lead to some pretty neat combinations.  I think that, if this particular game engine ever becomes a full-fledged game, this kind of language diversity will make it a lot of fun to build.

Update: Oh, internet, you fickle beast.  Every potentially disputable line of text must be disputed!

OK, to be clear: of course it's not very difficult to write a web server in C or C++.  I did not mean to offend your sensitive language fanboyism by suggesting that maybe, perhaps, possibly, some languages are more predisposed to certain types of work than others.  Though I could write a GLES 2.0 game entirely in Java, I would not choose to do so: that language is not the best fit for that problem.  So yes, you may of course write a web server in C++, or in C, or in assembler or any other language.  And it's not that hard.  But in Java, it's so, so easy.  Heck, I even implemented a memory file cache just for the heck of it.  The code generates JSON on the fly based on results coming back from the engine runtime.  Sure, you could do this in C.  Be my guest.  Me, I'm looking for the simplest possible solution to each of my problems, so I can spend most of my time on the part that counts: making the game fun.

I also did not mean to suggest that I am the first to think of piping game data through a web server.  I just thought it was a neat and easy method for this project specifically on Android.  So there.

Wednesday, November 3, 2010

Game Object Construction Rabbit Hole

Today I want to write a little bit about a boring, yet utterly fundamental part of game engine design: spawning game objects.

Garry's Mod makes it look easy.
Say you have a level that contains a player object and an enemy object.  However these objects are represented in memory, you have to allocate them somehow.  And probably register them with some sort of update loop, and maybe load some other data (graphics, sound?) associated with those objects.  There's a little bit of bootstrap to just get the game into a state where the simulation can start.

So, how do you do it?  How do you get that player and enemy up and running?  This is one of those problems that can be as complex as you choose to make it.

Well, I mean, you could write a function that looks like this:

void startUpGame()
{
   spawnPlayer();
   spawnEnemy();
   // done!
}

Sounds good until you get to level 2, which has two enemies.  You could make a spawning function for every single level, I guess.  It would work, but it wouldn't scale, especially if you have lots of objects to spawn.  I think the average Replica Island level has several hundred objects to spawn, so writing one of these for each level would suck hard.

Long, long ago I wrote a game called Fysko's Playhouse for Mac OS 6 computers.  If that fact alone doesn't date me, this will: it starred the mascot of a local BBS of the same name.  Anyway, back then I had no idea what I was doing, and so when faced with this problem of how to spawn game objects for different levels in a way that doesn't suck, I hit the code with a hammer and moved on.  My solution back then was to write a level editor (in HyperCard!!) that would output start up functions like the one above.  I could draw a level in the editor, hit a button, and out would pop a bunch of Pascal code which I could then copy and paste into the game.  Great!

Well, not really.  That kind of solution works only for very simple games and only when the programmer is the level designer.  And even then it's kind of crappy.  Move a guy 2 pixels to the left and you have to recompile your code.

Some years later I wrote a Bomberman clone called Bakudanjin.  This time I was a little smarter.  Instead of hard coding my level information, I made a map file describing every level.  The map file was basically just a bunch of indexes with XY locations.  To start the level, I load the file and then walk through each index.  The index maps to a table of function pointers (or, since this was also Pascal, a big-ass switch statement) that call the appropriate spawn functions.  Actually, Replica Island works this way too.

And if you just clicked on that link, you can see why this method isn't so hot either: Replica Island only has about 50 different object types and that code is still 6500 lines long.  And a lot of it is copy and paste, because many objects turn out to be structurally similar.  And god forbid you accidentally get your map format indexes out of sync with your runtime indexes; arbitrary enums needing to correctly index into separate arrays is a recipe for bugs.

Still, this is a quick and easy method, and it worked fine for Bakudanjin and Replica Island.  All that code gets thrown out and rewritten when I start a new game, though.

The problem here is that all this bootstrap code is basically code describing data.  I draw a line between "code" and "data" as follows: code is literally programming that instructs the game how to operate.  Data is non-code that is input to the code system.  You feed data into your game and out comes a playable simulation.  Things like enemy placement on a map are subject to lots of iteration and change; the code to actually move an enemy to that location is probably pretty stable and static.  Therefore, the placement information is data and shouldn't live in code, while the runtime for consuming that data is code but can be generic and reused across multiple levels and games.

So in order to write better code and to enable faster iteration and reusability, it's probably a good idea to move more of the information for spawning guys into data.  Moving spawn locations into a file wasn't a bad first step, but we can go further.

What does spawnPlayer() do, anyway?  It probably allocates a bunch of objects, ties them together with pointers, and sets a bunch of parameters on those objects.  Hmm, sounds like something we could represent with just some smarter data.

How about this: we'll make all objects in the world basically the same, make a single spawnObject() function, and then expose a bunch of parameters which we can use to customize the objects and differentiate them.  If we can do that, all we need to do is serialize all the parameters and pass them to spawnObject().  Health = 5, Speed = 10, Sprite = "AngryGorilla.png", etc.

GameObject* spawnObject(ObjectParams* params)
{
   GameObject* object = new GameObject;
   object->setLife(params->life);
   object->setSprite(params->sprite);
   if (params->isThePlayer)
   {
      object->setRespondToControllerInput(true);
   }
   // ...
   return object;
}

OK, that works, but now we have a new problem: it's actually pretty hard to make all our objects exactly the same.  Take the code that reacts to controller input, for example.  That belongs on a player object but on nothing else; with this type of system every single angry gorilla is going to carry that code around, probably turned off with a flag that we serialized.  Or consider what happens when an object is hit.  The player probably wants to get damaged, go into a short invincibility mode, and then go back to normal.  The enemies probably want to get damaged and not be invincible.  If the player dies there's probably some code to cause the Game Over screen to pop up.  Not so for an enemy.

We could make this all the same code and control it with flags, but it's going to become ugly fast.  Maybe if we can refactor our game such that all objects are functionally similar we can ease the pain, but that will make it hard to, say, add in little bits of player-specific hacks to improve the feel of game play later.  A better system would be able to generate objects that only contain the code that they need.  If we use the aggregate object model that I often recommend, we could think of this as only inserting relevant components.  In a more traditional object model, we could think about instantiating the object at a particular level of derivation from the base to provide only the necessary set of functionality.  Either way, we're talking about sticking code together with pointers and setting parameters.  Hmm, sounds like data again.
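Under the aggregate model, spawning from data means attaching only the components a given entry asks for.  A toy sketch (component and class names invented for illustration):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal aggregate object: a game object is just a bag of components.
struct Component {
    virtual ~Component() {}
    virtual const char* name() const = 0;
};

struct PlayerInputComponent : public Component {
    virtual const char* name() const { return "PlayerInput"; }
};

struct HealthComponent : public Component {
    int mLife;
    explicit HealthComponent(int life) : mLife(life) {}
    virtual const char* name() const { return "Health"; }
};

struct GameObject {
    std::vector<Component*> mComponents;
    ~GameObject() {
        for (size_t i = 0; i < mComponents.size(); i++) {
            delete mComponents[i];
        }
    }
    bool has(const std::string& name) const {
        for (size_t i = 0; i < mComponents.size(); i++) {
            if (name == mComponents[i]->name()) {
                return true;
            }
        }
        return false;
    }
};

// Only the player entry asks for input handling; a gorilla never
// carries that code around at all.
GameObject* spawnFromData(bool isPlayer, int life) {
    GameObject* object = new GameObject;
    object->mComponents.push_back(new HealthComponent(life));
    if (isPlayer) {
        object->mComponents.push_back(new PlayerInputComponent);
    }
    return object;
}
```

In a real version the `isPlayer` flag would be replaced by a serialized component list per object type, which is exactly the "structure as data" idea developed below.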

One method I've seen but never tried myself is to define a hard-coded list of object types ("player", "angry gorilla", "banana bomb", etc), and then provide a unique struct of static spawn data related to each type.  For example, the player object would be allocated to contain code for movement based on the controller, and it would read at spawn time a PlayerObjectData struct.

GameObject* spawnObject(ObjectType type, void* params)
{
  GameObject* object = NULL;
  switch (type)
  {
    case TYPE_PLAYER:
        object = new PlayerObject();
        object->parseParams(static_cast<PlayerObjectData*>(params));
        break;
    // ...
  }
  return object;
}

The attractive thing about this method is that you get a chance to control object creation on a per-type basis, but you can also serialize different data for each type, thus avoiding the one-size-fits-all problem.  That should let you move almost all information about this object into data, and just leave this one spawn function in code.

But let's go a step further.  Say we don't want to have to enumerate every object type in code.  If objects are really just collections of data and code, why can't we move the entire structure of a game object into data?

In a language that supports reflection, this shouldn't be too hard.  We can encode an entire object hierarchy, say as XML, and then use it to allocate classes, patch pointers, and set fields.  In a language like Java, we can look up class names by string and instantiate them, and poke into fields to finish constructing objects.  We can imagine some XML that looks like this:

<object name="gorilla" type="com.gorillaboom.gameobject">
  <field name="mLife">10</field>
  <field name="mSprite">gorillagraphic</field>
</object>

<object name="gorillagraphic" type="com.gorillaboom.imagefile">AngryGorilla.png</object>

If we write an XML parser to build object trees for us based on this, our entire spawn code can become a single call to that parser.  The parsing code is complicated but generic; it can be reused across objects and levels and games.  And we still have the ability to customize each object because the hierarchy can contain custom classes or be structured differently per type.

GameObject player = (GameObject)constructFromXML("player.xml");

Even cooler, this method isn't even specific to game objects.  We could use it to serialize all sorts of data, even the contents of our main loop.  The code is still code, but once we put the structure in data we have a crazy amount of control.
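In C++, the reflection system from the earlier post enables the same trick via a registry mapping type-name strings to factory functions, so the parser never has to name a concrete class in code.  A minimal, hypothetical sketch:

```cpp
#include <cassert>
#include <map>
#include <string>

// Base class the parser hands back; fields would then be filled in
// through the meta system described in the reflection post.
struct MetaBase {
    virtual ~MetaBase() {}
};

typedef MetaBase* (*FactoryFunc)();

static std::map<std::string, FactoryFunc>& factoryRegistry() {
    static std::map<std::string, FactoryFunc> registry;
    return registry;
}

void registerFactory(const std::string& typeName, FactoryFunc factory) {
    factoryRegistry()[typeName] = factory;
}

// Construct an object by string, exactly goal #4 from the earlier post.
MetaBase* constructByName(const std::string& typeName) {
    std::map<std::string, FactoryFunc>::iterator it =
        factoryRegistry().find(typeName);
    return it == factoryRegistry().end() ? 0 : it->second();
}

// Example type, registered under the name the data files use.
struct Gorilla : public MetaBase {
    int mLife;
    Gorilla() : mLife(10) {}
};

MetaBase* gorillaFactory() { return new Gorilla; }
```

The XML parser just calls constructByName() for each `<object>` tag and then sets fields through the MetaField table.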

But, getting back to just the game object application of this idea, there are two problems.  First, this approach requires us to walk the XML object tree every time we want to spawn an object.  We could try to walk the XML once and then use the resulting object tree as an entity "archetype," but that way leads to a particular brand of hell known as "shallow copy vs deep copy." When trying to copy the tree to create a new game object instance, how do you know which pointers should be recursively copied and which should be copied by value?  People have lost years of their life to that problem.

A less practical but ultimately simpler solution is just to re-parse the XML for every instantiation.  Which brings us to the second problem: reading XML and stuff is slow.  I mean, really slow.  Compared to the simple code functions we started with, it's glacial.  And allocation-heavy.  Not something we really want to do during the runtime of a game.

I know what you're going to say.  Hey, dumbass, just use a binary format instead of XML.  Problem solved.  And that's true, to an extent.  Excuse me for a moment while I go out on a limb to the extreme of this line of thought.

If you're going to go to a binary format, why not just store the entire object tree in its native memory format on disk?  Build the object hierarchy offline, write it as binary data to a file, then load it at runtime and just patch pointers.  Boom, instant object tree.  In C++ you can actually do this by building your own reflection system, making liberal use of placement new, and then walking the objects stored in the file to patch pointers.

I've actually written this system before.  It becomes horrifically complicated, but it does work.  A couple of years ago I wrote an engine that could load a binary blob of C++ objects as a contiguous block, walk the objects therein to patch vtables and other pointers, and then simply start using the objects as regular fully constructed C++ objects.  You lose constructors (objects were constructed before they were written to disk) and destructors (the lifetime of the object is the lifetime of the blob; individual objects within the blob can't be freed), and god help you if you try to manage those objects with reference counts or smart pointers, but the method does work and it's pretty fast.  To make a new instance of the object tree, you can just memcpy the whole block and repatch pointers.  Cool.

It starts to break down when you need to target multiple platforms, or build your data files on an architecture that does not match the runtime architecture.  These problems are also solvable, but probably not in-place; you'll need to read the objects out of the file and then allocate them at runtime to ensure padding and vtable placement is correct.  And if you do that you're back to a lot of runtime allocation and object parsing.  The system is still complicated, but much of its value is lost.

So for a new code base that I'm working on, I'm experimenting with a slightly different approach.  I still want to use the "load code structure from data" approach, but I don't want it to be slow or architecture dependent (or complicated, if I can avoid it).  And I need to be able to spawn objects with this system dynamically at runtime.  So instead of constructing objects directly, I'm constructing factory objects that can generate new instances of my object hierarchy on the fly.

The method is to read in an object hierarchy as XML.  Instead of building the tree right there, I build a bunch of "instruction" objects--basically the minimal list of commands required to recreate the object tree described in XML.  "Create an object of type GameObject," "Set field mLife of object index 5 to 10," "Point field mSprite of object index 22 to object index 25."  Each of these small "patch" objects gets initialized with a single command (representing a delta from the default state of the object upon construction), and the whole list of patches is stored in a factory object I call a "builder."  Reading the XML is still slow, but I only need to do it once before the game starts; at runtime, when I want to spawn a new object tree, I simply execute the appropriate builder.  The runtime speed is similar to what we had way back at the top of this lengthy post: just a bunch of object allocations and field initializations.  Should be pretty fast.

Builder* enemyBuilder = makeBuilderFromXML("angrygorilla.xml");

GameObject* enemy1 = enemyBuilder->build();
GameObject* enemy2 = enemyBuilder->build();
GameObject* enemy3 = enemyBuilder->build();
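To make the shape of a builder concrete, here's a sketch in Java (the real system is C++); the Patch, CreatePatch, and Builder names are illustrative, and real patches would use the reflection layer described below to set fields:

```java
import java.util.ArrayList;
import java.util.List;

// A patch is one recorded command -- a delta from default construction.
interface Patch {
    void apply(List<Object> objects);  // objects indexed by creation order
}

// "Create an object of type X" as a replayable command.
class CreatePatch implements Patch {
    private final Class<?> mType;
    CreatePatch(Class<?> type) { mType = type; }
    public void apply(List<Object> objects) {
        try {
            objects.add(mType.getDeclaredConstructor().newInstance());
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}

// Executing the builder replays the recorded deltas: no XML parsing at
// spawn time, just allocations and field pokes.
class Builder {
    private final List<Patch> mPatches = new ArrayList<>();
    void addPatch(Patch patch) { mPatches.add(patch); }

    List<Object> build() {
        List<Object> objects = new ArrayList<>();
        for (Patch patch : mPatches) {
            patch.apply(objects);
        }
        return objects;
    }
}
```

A real builder would carry set-field and patch-pointer commands alongside the creates, and return the root of the new tree rather than the whole list.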

One nifty aspect of this approach is that I can easily extend the XML format to do more complicated things by building more complex patch object types.  The basic set of patches (int, float, vector, pointer, string, etc) are entirely generic, as is the builder system itself.  But I can add to that engine- or game-specific patches if necessary.  I've already added a patch that knows about this particular engine's asset system, and can use it to schedule other files for loading, thus allowing for references to external assets which may be loaded in the future (and patched at that time appropriately).  A different game might have an entirely different asset system, in which case I can chuck the one patch written for this game and write a new one against that engine; the system should scale without losing its general purpose core.

The actual XML parser and builder system is very simple--only a couple hundred lines of code.  But I should mention that it only works in C++ because my game engine is backed by a (fairly involved) reflection system.  Using an Interface Definition Language, I can create classes that are laden with metadata, much like Java's Class object.  Using that data I can poke into anonymous classes at runtime and set fields, which is how the patches in the builder actually work.  I think this approach could be done without reflection, but it would basically resemble the hard-coded types with unique static data structs method that I mentioned above.  I'll talk more about the reflection system in a future post.

To bring this giant document to a close, I just want to note that the methods I've described here are hardly an exhaustive list.  These are the various approaches that I've tried to spawn objects in games; there are many others and probably a lot of really good ideas that I've never considered.  But when making a game engine, the question of how objects get spawned--and what a game object actually is--is a pretty huge piece.  Though sort of mundane, it's probably worthy of a lot of thought.

Monday, May 3, 2010

Control Configuration and Abstraction



The #1 thing that I've learned since shipping Replica Island is that users want configurable controls.  I mean, I might have guessed that some devices would have one sort of controller and not another, but I didn't anticipate the number of people who prefer a specific control configuration even when others are available.  Users with trackballs and directional pads asked for configurable keyboard settings, and when I added orientation sensor-based movement for devices without other controls (I'm looking at you, Xperia), many users who could already play the game chose to switch to tilt controls too.  I've made four updates so far and all of them have had to do with the input system; in the first three I added more and more configuration options, and in the most recent (v1.3) I rewrote the core input framework to improve non-standard control configurations.

When I started writing Replica Island, the only device available was the G1.  About halfway through development I switched to an HTC Magic, and at the very end of development I switched to a Nexus One. The game was entirely designed around HTC's trackball-on-the-right design.  Fairly late in development, devices sporting directional pads (like the Motorola Cliq, and more importantly, the Droid) started to hit the market, so I added some support for d-pad controls.  I didn't really think anybody was going to use the keyboard to play, so I only added a few key-based controls to support the ODROID.

The input system started out like this:


MotionEvents from touch and trackball motion, as well as KeyEvents, were passed to the InputSystem (via the GameThread, for reasons I'd rather not discuss), which recorded them in some internal structures.  The goal here was to abstract the Android events from the game interface.  The game wants to be able to say things like "is the jump button pressed," or "was the jump button just pressed since the last frame," or "how long has it been since the last time the button was pressed."  It's a query-based interface, rather than the message-based interface that the Android framework provides.  So the initial role of the InputSystem was to record Android events so that they could be queried in the future.
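To make the query model concrete, here's a sketch of what one recorded button might look like; the class and method names are illustrative, not the actual InputSystem code:

```java
// Events are recorded as they arrive; the game polls the stored state
// each frame instead of reacting to messages.
class InputButton {
    private boolean mDown;
    private float mLastPressedTime = -1.0f;

    void press(float currentTime) {
        mDown = true;
        mLastPressedTime = currentTime;
    }

    void release() { mDown = false; }

    // "Is the jump button pressed?"
    boolean getPressed() { return mDown; }

    // "Was the button pressed between these two times (e.g. since the
    // last frame)?"
    boolean getPressedInRange(float start, float end) {
        return mLastPressedTime >= start && mLastPressedTime <= end;
    }

    // "How long has it been since the last press?"
    float getTimeSincePress(float currentTime) {
        return mLastPressedTime < 0.0f ? Float.MAX_VALUE
                                       : currentTime - mLastPressedTime;
    }
}
```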

The trackball was tricky to get right.  I wanted to allow the player to flick the trackball in a direction and have the character get an impulse in that direction scaled by the magnitude of the flick.  But the Android motion events come in at a fixed frequency and fixed magnitude, so in order to build a vector describing recent motion, I needed to maintain some history between motion events.  My first implementation, which survived for the entire course of development, was to cache a history of 10 motion events and calculate the average direction of motion across all of them to find the flick direction and magnitude.  After a specific timeout had passed with no new events, the cache was cleared and averaging would begin again with the next event.
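A sketch of that original cache-and-average approach (history size and timeout are the sorts of values involved, not the actual tuned numbers):

```java
// Cache the last N motion events; the average of the cache is the
// flick vector. A long gap between events clears the cache.
class TrackballAverager {
    private static final int HISTORY = 10;
    private static final float TIMEOUT = 0.25f;  // seconds (assumed value)

    private final float[] mDx = new float[HISTORY];
    private final float[] mDy = new float[HISTORY];
    private int mCount;
    private float mLastEventTime;

    void addEvent(float dx, float dy, float time) {
        if (time - mLastEventTime > TIMEOUT) {
            mCount = 0;  // timed out: start averaging over
        }
        if (mCount < HISTORY) {
            mDx[mCount] = dx;
            mDy[mCount] = dy;
            mCount++;
        }
        mLastEventTime = time;
    }

    float averageX() { return mCount == 0 ? 0.0f : sum(mDx) / mCount; }
    float averageY() { return mCount == 0 ? 0.0f : sum(mDy) / mCount; }

    private float sum(float[] values) {
        float total = 0.0f;
        for (int i = 0; i < mCount; i++) total += values[i];
        return total;
    }
}
```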

This worked ok as a way to calculate a motion vector, but it had problems.  The biggest issue was that there was no way for a user to move slowly; even if the user rolled the ball slowly (thus causing motion events to come less frequently), as long as he rolled fast enough to beat the internal event timeout, the events would get averaged together and would come out looking the same as a fast flick.  So users who tried to move with precision or in small steps often found themselves rocketing across the level.

When I went to add d-pad support, I just treated the pad as a different source of motion events.  I treated each keydown as a roll of a specific magnitude in a specific direction, and fed that into the same cache system I used for motion events.  This worked, sort of: it allowed me to pipe the directional pad through the trackball interface (which connected directly to the game) pretty easily, but it didn't feel good.  The problem with this approach was that directional pad events don't need any averaging; in fact, you want exactly the most recent state to be represented, as the player can release a key at any time (the trackball, unlike other kinds of input, never goes "up", and thus required a history).  So directional pad support in Replica Island, in the first few versions, sucked.

Add in configurable control options and very quickly my simple G1-centric input system grew into a mess that didn't work very well.  So, for the most recent version, I rewrote the whole thing.  Now the structure looks like this:


The main change here is to separate input recording (necessary for querying) from game-specific filtering and control configuration switching.  The InputSystem is now generic; it just records input events from the keyboard, touch panel, orientation sensor, and trackball, and provides an interface for the current state (as defined by the most recently received events) to be queried.  A new system, InputGameInterface, reads the hardware state from InputSystem, applies heuristics and filters, and presents fake buttons for the game to use.  This way the game can ask for the "directional pad" and get input from a trackball, orientation sensor, keyboard, directional pad, or whatever, already filtered and normalized.  I put all of the filtering code for the trackball into this class, and I can now pass directional pad input directly to the game without tying it to the trackball.

Speaking of the trackball, I changed my approach to filtering.  Now I accumulate trackball events that occur within a very short cutoff, and average them after a slightly longer cutoff.  Instead of turning the trackball input "off" after a fixed duration, I make it decay until it reaches zero.  This lets the user make small, precise movements, and still get a big motion from a large flick (as in the latter case, events occur in rapid succession and accumulate).  This method also gave me an obvious spot to stick a sensitivity factor for the trackball, which several users of devices with optical trackpads (HTC Desire, Samsung Moment, etc) had requested.
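The accumulate-and-decay idea can be sketched like this (one axis only; the decay rate is an assumed tuning value, not the shipped number):

```java
// Rapid events pile up into a big value; between events the value
// decays smoothly toward zero instead of being cut off.
class DecayingTrackball {
    private static final float DECAY_PER_SECOND = 4.0f;  // assumed tuning
    private float mX;

    void addEvent(float dx) { mX += dx; }

    // Called once per frame with the elapsed time.
    void update(float dt) {
        float decay = DECAY_PER_SECOND * dt;
        if (mX > decay) mX -= decay;
        else if (mX < -decay) mX += decay;
        else mX = 0.0f;  // don't overshoot past zero
    }

    float getX() { return mX; }
}
```

A sensitivity factor would just scale the value added in addEvent(), which is why this structure gave an obvious place to hang that option.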

The new system probably needs a bit more tuning, but I played the game through with it and it feels pretty good.  The code is about 100x cleaner now, and InputSystem is something that others can easily reuse without any other dependencies.

Wednesday, January 13, 2010

The Elusive Perfect Platformer Camera

I've come to believe that platformers live and die by their camera system. The camera system is the code that decides how the player should be centered in the frame. The camera must somehow track the player as he moves such that the player can see what's coming. That might seem like a simple problem, but it's not. In fact, I'll go out on a limb and say that a bad (or even mediocre) camera system can ruin a 2D scrolling platformer game.

I mentioned in a previous post that the data coming back from my play testers showed them dying in droves in bottomless pits. I guessed that this had to do with the camera system scrolling up and down, and I was right; a review of the camera code revealed a lot of room for improvement, and after some tuning and bug fixing, I think the experience is much improved.

But this experience again drove home the point I made in the intro paragraph: that the camera in a 2D scrolling platformer has the potential to affect the play experience dramatically--it has to be as perfect as possible. I've made a lot of side-scrollers before, and I should know this, but I was still surprised by how much play was improved by a few simple camera tweaks.

If you are ever at a loss about what to do when it comes to 2D platformer design, refer back to Super Mario Bros. It's like the bible of platforming games--every problem that you might encounter has already been solved, and it's probably been solved in a way that works better than whatever you came up with. At least, that's been my experience. Take a look at this video from Super Mario Bros. 3. Pay attention to the amount of vertical scrolling that the game does when the player gets close to the top of the screen.



You can see that the game almost never scrolls vertically. The really interesting case is around 0:56, where the level (which has previously refused to scroll vertically) scrolls up in one very specific point to get the secret 1up. It's like vertical scrolling is only allowed in very specific situations. You can also see this sort of logic at work when Mario grabs the tanuki suit and starts to fly--immediately the game begins to follow him vertically.

Now compare the camera movement in Mario to the video below. This is Frogger Advance: The Great Quest, a GBA game that I worked on all the way back in 2001.



Quite a difference, right? The camera is all over the place, but despite all of the motion it's pretty much impossible to see where you are going. Part of the problem is that Frogger himself is really big; he takes up so much space on the screen that the camera really has to move just to keep him in the frame. This is a leading camera--it's supposed to always show you the direction that you are moving. But in practice the physics are so fast that even if the camera rushes to show what's coming up, the player doesn't have time to react. When we made this game we understood that players were dying because they couldn't see where they would fall after a jump, but we didn't understand what to do about it. If you watch this video, you'll see the player use Frogger's float move to slow his falling motion down; this move was added explicitly to combat fall-into-pit deaths. A better solution would have been to try to reduce the amount of movement of the camera by designing levels that don't need to scroll vertically and reducing the size of the main character.

For Replica Island, my camera algorithm is based on the concept of a "window." I actually thought of it as a sphere when I wrote it, but my good friend and ultra-veteran platformer author gman pointed out that it's more accurate to think of a window. The center of the screen is defined by the center of the window, so when the window moves, the game scrolls. The rule is that the camera's target (the player) must always remain within the window. When the player crosses out of the bounds of the window, the camera must move the window so that it contains the player at his new position. However, as long as the player stays within the window the camera does not move. So the player is able to cause scrolling in a particular direction by pushing up against a side of the window.
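The basic window rule is only a few lines of code. A minimal sketch, with made-up dimensions and names:

```java
// The camera only moves when the target leaves the window's bounds;
// then it shifts just enough to contain the target again.
class CameraWindow {
    private float mCenterX, mCenterY;
    private final float mHalfWidth, mHalfHeight;

    CameraWindow(float halfWidth, float halfHeight) {
        mHalfWidth = halfWidth;
        mHalfHeight = halfHeight;
    }

    void track(float targetX, float targetY) {
        if (targetX > mCenterX + mHalfWidth) {
            mCenterX = targetX - mHalfWidth;
        } else if (targetX < mCenterX - mHalfWidth) {
            mCenterX = targetX + mHalfWidth;
        }

        if (targetY > mCenterY + mHalfHeight) {
            mCenterY = targetY - mHalfHeight;
        } else if (targetY < mCenterY - mHalfHeight) {
            mCenterY = targetY + mHalfHeight;
        }
    }

    // Screen center = window center.
    float getFocusX() { return mCenterX; }
    float getFocusY() { return mCenterY; }
}
```

Making the vertical bounds asymmetric (a tiny window above the player, a hard floor at the screen's midpoint) is how the Y-axis tweaks described below fall out of the same rule.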

To fix the levels in which huge numbers of users were dying, I adjusted the bounds of the window so that almost no scrolling occurs in the Y axis until the player approaches the top of the screen. The camera also does not allow the player to move below the middle of the screen. So now a small jump causes no vertical camera movement, but hopping off a ledge keeps the player right in the center of the display. This makes seeing what's below you a lot easier than before.

But the heuristic wasn't good enough on its own, so I've also added a special object that, when visible, biases the camera in one direction or another. This lets me put camera hints in the map in areas that I know to be particularly problematic.

Finally, on a few levels I squeezed the level size down so that there's almost no vertical scrolling at all. This makes these levels feel a bit more like Mario, as the game almost never scrolls up and down. This makes the jumping puzzles actually fun, rather than one leap of faith after another.

So far I'm pretty happy with the results, but the real test will be to compare this new version of the code and levels with the data that I presented before; if my theory is right, the number of deaths from falls should be dramatically reduced. If I'm wrong, well, it'll be another round of iteration. It's worth it though; bad cameras are the death of 2D scrolling games.

Tuesday, October 13, 2009

The Main Loop as a Graph Traversal

In an earlier post, I talked about treating game objects as a list of game components, with each component implementing a specific feature of the object. That way, I can call GameObject.update() and get a different result for each game object instance depending on which components have been inserted.

In fact, in Replica Island I apply that same idea to the entire game loop. The whole simulation step can be completed with a single call to MainLoop.update(). Though it's not a game object, the MainLoop object is similar to GameObject in that it simply contains a list of things to update. Those things might be GameObjects, or they might be some other kind of object that can be updated--the main loop doesn't have to know. Things that are inserted into that list will be polled every frame; that's all the MainLoop object does.

For example, there's a bit of code that needs to work out the current state of all the hardware inputs--touch screen, trackball, keyboard, d-pad, etc. That code needs to run every frame, but it's not a game object. As long as it has the right base class and implements update(), it just needs to be added to the MainLoop's list to be run every frame. Then there's the renderer; once all the game objects are finished this object needs to send the draw commands for the frame to the rendering thread (also described in a previous post), so it too can be inserted at the end of the MainLoop's list.

And in fact, systems that need to update every frame can themselves manage other systems that need to update every frame.

For example, game objects can be added directly to the MainLoop's list, but with one or two exceptions I never do that. Instead, a special system, called the GameObjectManager, is run by MainLoop every frame, and that object contains a list of GameObjects which it runs in its update() function. The reason that I have inserted this GameObjectManager class is that not all game objects should be updated every frame. Only a subset--those within some radius of the camera--should be run. The rest should remain inactive to save CPU cycles. So the GameObjectManager, when updated by MainLoop, selects a subset of the GameObjects that it controls and updates them based on the position of the camera.
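The activation-radius idea can be sketched like this (names and the radius value are illustrative, and the sketch is one-dimensional for brevity):

```java
import java.util.ArrayList;

// A stand-in for something updatable with a world position.
class ManagedObject {
    float x;
    int updates;
    ManagedObject(float x) { this.x = x; }
    void update() { updates++; }
}

// Run by MainLoop every frame; only objects near the camera get run.
class GameObjectManager {
    private static final float ACTIVATION_RADIUS = 100.0f;  // assumed
    private final ArrayList<ManagedObject> mObjects = new ArrayList<>();

    void add(ManagedObject o) { mObjects.add(o); }

    void update(float cameraX) {
        for (ManagedObject o : mObjects) {
            if (Math.abs(o.x - cameraX) <= ACTIVATION_RADIUS) {
                o.update();  // inside the radius: run it
            }
            // outside: leave it dormant and save the cycles
        }
    }
}
```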

If you hadn't guessed already, the structure I am describing here is a tree. The MainLoop is the root node of this tree, and its children are things like the input system, render system, and GameObjectManager--bits of code that need to run every frame. The GameObjectManager has all of the game objects as its children, but it stipulates that not all of its children will be visited every traversal. The game objects themselves contain game components as their children; the game components are the leaf nodes of this kind of tree. So, to run the simulation step for the current frame, I just traverse the tree.

Actually, to be precise the structure that I am describing is a graph. The reason it must be a graph rather than a tree is that the structure allows for instancing of subgraphs and for cross-level traversals. For example, certain types of GameComponents that don't need to track state from frame to frame can be shared across multiple game objects; in that case, only one instance of the component exists but it is inserted into a number of different game object instances. In graph terms, shared GameComponents are nodes with multiple parents. However, for the general case the structure behaves like a tree, and so it's pretty safe to think about it that way.

I like using a graph to describe all the work that must be done in a frame because it's an extremely flexible way to set up a main loop. The MainLoop object hasn't changed since I originally wrote it; though the number of objects that it contains has increased, the management code itself has remained the same. For the next game, I can rip out the individual systems that I don't need any longer and insert new ones without altering any of the main program architecture.

This type of graph structure can also give you precise control over how your simulation step is run. Say you want to pause the game, but you need key systems (such as the renderer) to continue operating so that you can run the pause UI graphics. With a tree or graph system, you can insert "pausable" nodes into the tree and append to them children that should stop when the game is paused. At runtime these nodes will simply not traverse their children if the game is paused. This kind of control is hard to thread into a game that is already up and running using traditional hard-coded methods; it usually results in a lot of switches on the g_paused variable littered throughout the code base. With a graph, none of the actual simulation code needs to change--only the graph structure is modified to accommodate pausing.

Another advantage is that it's pretty easy to drive this sort of system with data. Though I haven't done this in Replica Island yet, on previous games I've worked with systems in which the entire runtime for the game is loaded from a file in the form of a main loop graph; you can see how such a structure would be pretty easy to describe in XML, and you could even use Java's reflective properties to automatically instantiate the various systems that live in the tree. Once the graph is described in data, you can change it easily from game to game, or even from level to level if necessary, all with general-purpose infrastructure code. I've not done that with Replica Island yet, but I will eventually--probably after the game ships.

Game graphs are not specific to Android, but I use them a lot and I find them a pretty powerful (and generally underrated) pattern for managing real-time update loops. Like the GameComponent system, they leave the door open to future revision by separating data structures from code. This kind of system is also pretty simple to write (my entire graph is based on two core classes, a node and a group node). Of course, for small projects they are probably overkill--it is likely faster and less error prone to just write a traditional main loop and update the code every time you need to change something. But for medium or large projects, or projects based on a codebase that is intended to be reusable across many different titles, game graphs are a pretty neat way to structure your frame.
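To show how little code the core takes, here's a sketch of the node and group-node pair, plus the pausable group described above; the class names are illustrative, not the actual Replica Island classes:

```java
import java.util.ArrayList;

// Leaf nodes override update() with real per-frame work.
class UpdateNode {
    void update(float dt) {}
}

// A group node's update() is just a traversal of its children.
class UpdateGroup extends UpdateNode {
    protected final ArrayList<UpdateNode> mChildren = new ArrayList<>();
    void add(UpdateNode child) { mChildren.add(child); }

    @Override
    void update(float dt) {
        for (UpdateNode child : mChildren) {
            child.update(dt);
        }
    }
}

// While paused, this node simply doesn't visit its children -- no
// g_paused checks scattered through the simulation code.
class PausableGroup extends UpdateGroup {
    boolean paused;

    @Override
    void update(float dt) {
        if (!paused) super.update(dt);
    }
}
```

Running the frame is then a single update() call on the root node, and pausing is a structural property of the graph rather than a flag in the simulation code.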

Friday, October 2, 2009

Rendering With Two Threads

The Replica Island renderer is based heavily on the GLSurfaceView class that ships with the Android SDK. I've made a couple of modifications but the code is pretty similar to the regular version: my GLSurfaceView.Renderer derivative is called to draw each frame, followed by a call to eglSwapBuffers() to actually display the rendered frame.

GLSurfaceView provides a way to run user code in the same thread as the renderer. This makes writing games pretty easy; you can just implement a Runnable, implement a Renderer, stick them both into a GLSurfaceView and get stuff moving around on the screen. Indeed, it's more than sufficient for many applications; my SpriteMethodTest demo works this way just fine.

But for Replica Island I took a different approach. The problem with the single GLSurfaceView thread is that eglSwapBuffers() must block on the hardware until the previous frame finishes drawing. That means that even if you have nothing to draw, a call to eglSwapBuffers() takes 16.67ms to complete. (And of course, if you have a lot to draw, it could take a lot longer).

Now, just in case you are not used to thinking in terms of milliseconds, here's a quick primer. To achieve the magical "60 frames per second" that many games strive for, you need to have a new frame displayed to the user every 16.67 ms. If you go for 30 fps, you have ~32 ms to complete a frame. All your game code, plus all your OpenGL code, plus the actual time it takes to draw the frame must fit within 16.67 ms to achieve 60fps.

In Replica Island, the game code is fairly heavy-weight. I have all that collision to run, plus updates of all the active entities on the screen, plus sound playback and all that jazz. Turns out that it's usually more work to calculate a single simulation step than it is to actually draw the frame. Since this code takes time to execute, the 16 ms block that eglSwapBuffers() incurs makes it really hard to hit 60 fps. What I really want to be able to do is run game code while eglSwapBuffers() is blocking; that way I can pipeline the game updates while the hardware is busy drawing the frame.

So I split the game code off into a separate thread. This makes three threads, by the way: the main UI thread that all Activities have by default, the GLSurfaceView render thread, and this new game thread (actually, there are a few more that are generated by the system for things like orientation sensor updates, but they don't affect the equation much). Now my game code and my renderer can run asynchronously, and I win back some of that time spent in eglSwapBuffers().

Now comes the tricky part.  I have two threads running in parallel that need to sync up once a frame so that the game thread can tell the render thread what to do.  There are a lot of ways to go about synchronizing these two threads, but I went with a double buffer solution.  The game thread fills up a buffer of commands to draw the next frame, and when it is ready it waits for the render thread to begin the next frame.  At that point, the buffer is passed to the renderer, which can then go off and draw the next frame asynchronously.  The buffer that was used to draw the last frame is passed back to the game thread, which fills it up again for the next frame.  So drawing is the process of swapping these two buffers back and forth during a (hopefully short) choke point at which both threads stop and communicate.
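Here's a sketch of just the ownership exchange at that choke point (the real version blocks with wait/notify until both threads arrive; command objects are reduced to strings for illustration):

```java
import java.util.ArrayList;

// Two buffers: one being filled by the game thread, one being drawn by
// the render thread. swap() is the per-frame choke point.
class RenderBufferPair {
    private ArrayList<String> mGameBuffer = new ArrayList<>();
    private ArrayList<String> mRenderBuffer = new ArrayList<>();

    // Game thread: queue a command for the frame under construction.
    void queueCommand(String command) {
        mGameBuffer.add(command);
    }

    // Hand the filled buffer to the renderer and take the drained one
    // back for reuse; the returned list is what gets drawn this frame.
    synchronized ArrayList<String> swap() {
        ArrayList<String> ready = mGameBuffer;
        mRenderBuffer.clear();      // last frame's commands can be recycled
        mGameBuffer = mRenderBuffer;
        mRenderBuffer = ready;
        return ready;
    }
}
```

Because a buffer is only ever owned by one thread at a time, the command objects themselves don't need to be thread safe, which is the property the post describes.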

This solution was attractive to me because it was simple, and so far it seems to be plenty fast. However, another solution might be to have a queue that is shared by both threads, with the game thread pushing commands in one end and the renderer executing commands out of the other. In theory such a solution wouldn't need both threads to ever perfectly align--blocking would only occur when one thread or the other was starved. But I haven't done this yet because it is going to be significantly more complex than the double buffer.

My render commands are objects that are allocated out of pools that the game thread owns, and must be returned to those pools when they have been drawn. In the double buffer system, the queue that is returned from the render thread contains commands that can be safely returned to their pools, but in the shared queue system there's no obvious way for the game thread to know how much has been drawn. I suppose there could be two shared queues, one in each direction, but that would still be a lot more complicated than what I have now. Right now almost no code outside of the buffer swap system knows about other threads; the pool objects and the objects they contain are not thread safe and, as it stands, don't need to be.

Is my solution the best for Android apps? I don't know. It seems to work pretty well and it is uncomplicated, which are two points in its favor. Still, I'd like to give this shared queue idea a shot at some point; my gut tells me that it will be slightly faster than the double buffer (less blocking in the average case) but a lot more complex, which might make it not worth the effort. Programmer guts are, however, extremely unreliable, so I will probably give this method a shot after Replica Island ships.

Tuesday, September 8, 2009

Aggregate Objects via Components

I used to make games like this:

class RenderableMovingCollidableGameObject extends RenderableMovingGameObject {
   public void update() {
      super.update();   // Parent classes implement rendering and movement.
      // implement collision detection here
   }
}

class PlayerObject extends RenderableMovingCollidableGameObject {
   public void update() {
      super.update();  // Run rendering, movement, and collision.
      // update the player
   }
}

...
// Main loop!
while (true) {
   InputSystem.update();  // poll for input.

   for (GameObject gameObject : listOfGameObjects) {
     gameObject.update();
     gameObject.draw();
   }
}

This isn't a bad way to start making games, but it doesn't scale. Making a good game requires flexibility and the ability to iterate quickly, and this approach starts to break down when the game becomes medium-sized. What if we have more than one game mode? What if the behavior of a specific game object needs to change for a short period of time? What if Level 27's version of the player needs to be subtly different than Level 26's?

In a nutshell, the problem with this approach is that the structure of the code producing a single frame of the game is hard-coded into the program. What I want is something that can change its structure between levels, or between game modes, or even on the fly. I need something more dynamic than a hard-coded for loop.

Another Approach

Let's look at game objects as an example. Game objects are entities in the game world like coins, enemies, moving platforms, and the player. They often have similar functionality, so one way to go about implementing a set of similar objects is to use an inheritance tree. We can create a base GameObject class, and from that derive RenderableGameObject, and from that derive RenderableMovingGameObject, and from that derive RenderableMovingCollidableGameObject, etc, etc, etc. Each level of derivation can add some common functionality until the leaves of this tree are specific entities, like the player.

The real problem with inheritance trees is that features that don't need to be inter-related become dependent on each other because of the way the code itself is written. Given the example class structure above, it's not possible to make a GameObject that can move but doesn't need to render (short of littering the code with flags--don't do that). Because of inheritance, a dependency between movement and rendering has been created where none actually needs to exist. What if we could mix and match features on a per-instance basis rather than tying them all together in a class hierarchy? This is where object composition comes in.

Object composition (or "object aggregation" depending on which design patterns book you read) is the idea that an object "has a" feature instead of "is a" feature. Rather than using inheritance or some other code-side method of collecting functionality together in a single object, we make the object manage a list of separate feature objects; the contents of that list can be different per instance.

So for Replica Island, I have a GameObject class that contains a list of GameComponents. A game object with an empty list does nothing; it can't be drawn, or make noise, or hit things, or do anything else to affect the game. GameComponents implement all of those features, and they must be inserted into the GameObject for it to actually be able to act. Here's some pseudo-code of how GameObject and GameComponents work:

class GameObject {
   private ArrayList<GameComponent> mComponents = new ArrayList<GameComponent>();

   public void addComponent(GameComponent component) {
      mComponents.add(component);
   }

   public void update(float time) {
      // Run every component this object contains, in order.
      for (GameComponent component : mComponents) {
         component.update(time, this);
      }
   }
}


class GameComponent {

   public void update(float time, GameObject parent) {
      // ... functionality goes here
   }
}

A GameObject just runs all the GameComponents it contains to produce its output each frame. GameComponents in Replica Island implement things like movement, physics, collision detection, sprite animation, rendering, animation selection, AI, player control, etc. Once a generic GameComponent is written it can be inserted in any number of objects; if a specific object requires special functionality, a new GameComponent can be written just for that object without disturbing anything else in the system.

The beauty of this approach is that individual components can be compartmentalized features, and brand new objects in the game world can be created just by sticking different pieces of pre-existing code together in new ways. The result is also dynamic: unlike an inheritance-based object, Replica Island game objects can change their structure at runtime. For example, one of the Android robot's powers is to possess other robots. When an enemy robot is possessed, his AI component is removed and a new component that lets the player control him is inserted. All of a sudden the robot is being driven by the player; all the rest of his code for animation selection, physics, and collision detection, continues to work without realizing that anything has changed. When the player releases the robot, the original component structure can be restored (actually, in this case the robot blows up, taking out anything else near it, but you get the idea).
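The possession swap can be sketched against the same skeleton. This is a minimal illustration, not the actual Replica Island code; the class and method names (EnemyAIComponent, PlayerControlComponent, removeComponent, and so on) are hypothetical stand-ins:

```java
import java.util.ArrayList;

// Minimal restatement of the GameObject/GameComponent skeleton,
// extended with removal so components can be swapped at runtime.
class GameComponent {
   public void update(float time, GameObject parent) { }
}

class GameObject {
   private final ArrayList<GameComponent> mComponents = new ArrayList<GameComponent>();

   public void addComponent(GameComponent component) { mComponents.add(component); }
   public void removeComponent(GameComponent component) { mComponents.remove(component); }
   public boolean hasComponent(GameComponent component) { return mComponents.contains(component); }

   public void update(float time) {
      for (GameComponent component : mComponents) {
         component.update(time, this);
      }
   }
}

// Hypothetical components standing in for the robot's brain and the
// player-control replacement described above.
class EnemyAIComponent extends GameComponent { }
class PlayerControlComponent extends GameComponent { }

class Possession {
   // Swap the AI brain for player control; physics, animation, and
   // collision components stay in the list and never notice the change.
   static void possess(GameObject robot, GameComponent ai, GameComponent control) {
      robot.removeComponent(ai);
      robot.addComponent(control);
   }
}
```

The important property is that nothing else in the object's component list has to know the swap happened; the update loop just iterates whatever is currently in the list.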

I've made a couple of games now using components and I'm very happy with the result. In Replica Island I took a couple of shortcuts for speed that damage the goal of completely independent components, but I think it was worth it; the sacrifices I've made for frame rate haven't actually proved detrimental to the flexibility or extensibility of the system.

Saturday, August 1, 2009

Detecting and responding to dynamic collisions.

In the last post I discussed how Replica Island uses line segments organized as a 2D regular grid of tiles as the basis for its background collision system. This time I will explain how dynamic collisions (collisions between moving objects) are detected and resolved.

In Replica Island I draw a distinction between collisions that occur with the background geometry (falling on the ground, sliding on a slope, hitting the Android's head on the ceiling) and collisions that occur between game objects (the Android depressing a button, or hitting an enemy, or collecting a coin). While both of those cases are forms of "collision detection," they represent very different types of tests and I have two (entirely separate--mostly) systems for dealing with them.

Moving objects vs the background

Since I alluded to how background collision detection works in the last post, I'll start with that system. If the Android is falling through space and passes into the ground, I need to detect that intersection and then fix it so that he doesn't actually fall through the floor. This is actually a pretty tricky problem because the frame rate of any Android game (and really, any modern game on any platform) can fluctuate as the game is played. The Android moves through space in game units / second, but the game is displayed in frames per second, and depending on the current speed of the game there's no good way to predict how far he'll move in a single frame. So I need a method that can cover a range of space between the last frame and this one so that even a dramatic movement won't allow the player to pass through walls.

The solution is to "sweep" the space in between the character's position at the last frame and his current position, and snap the character back if an intersection is detected. I use rays to do this: rays are cast from a character's previous position to his current position, and the first intersection along the ray (that is, the intersection that is closest to the ray's start point) is considered to be the spot at which the character hit a wall.

I also filter my ray test by the normals of the surfaces I am considering; surfaces that do not oppose the direction of the ray can be ignored (that is, I only care about surfaces whose dot product against the direction of the ray is less than 0). Characters are not well described by a single, thin ray, however, so I do two tests: one filtered against horizontal surfaces and one filtered against vertical surfaces (angled surfaces fall into one bucket or the other depending on their slope).

This is a nice method because it allows me to tune exactly how I test a volume against collision; often I want to allow a small amount of intersection: when the character is standing on a sloped surface, for example, I want to allow his bounding box to intersect with the slope slightly so that it looks like his feet are on the ground. With a couple of simple ray tests this method covers the space between the previous frame and the current frame pretty well without too much processor overhead. I briefly experimented with a final volume test pass to make sure that collisions were not being missed by the ray tests, but in the end such a test wasn't necessary (and actually, despite being more technically correct, the results were a lot less fun).
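The normal filter boils down to a single dot product test. A minimal sketch (the function and parameter names here are mine, not the engine's):

```java
// A surface is only a candidate for intersection if its normal opposes
// the direction the ray is travelling: dot(rayDir, normal) < 0.
// Surfaces facing away from the motion can be skipped entirely.
class NormalFilter {
   public static boolean opposes(float dirX, float dirY, float normalX, float normalY) {
      return dirX * normalX + dirY * normalY < 0.0f;
   }
}
```

For example, a character falling straight down (direction (0, -1)) should test against floors (normal (0, 1)) but ignore ceilings (normal (0, -1)).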

Sometimes I want game objects that act like background collision but are not part of the collision tile map. For example, a moving platform might act in every way like a background element except that it can move. In these cases, I allow entities to submit temporary line segments to the background collision system which will then be used along with all the rest of the background collision line segment data. This way characters in the world can be made to act exactly like solid objects without remaining static, and other characters can come along and react to them without any special code.

Moving objects vs each other

However, the more common case is when two moving, non-solid game objects--a bullet and the player, an enemy and some spikes, etc.--come into contact. In this case we need to detect the intersection and then let code specific to each entity decide what to do about it. Still, we can generalize the system a little bit more: usually in such collisions we can name one of the entities as the "offender" and the other entity as the "victim." When a bullet hits the player, the bullet is the offender and the player is the victim. When a robot runs into some spikes, the spikes are the offender and the robot the victim. In fact, if we consider an animating character, we might want to mark some parts of a given frame of animation as "offensive" and other parts "vulnerable." In a game where a character can punch, we probably want the character's fist to deal damage to other characters but at the same time we'd expect other parts of the character, say his back and head, to be vulnerable to hits. So in order to detect collisions between game entities, I decided to give my entities multiple collision volumes, some associated with offensive areas and others associated with areas that are vulnerable to hits.

Each animation frame in Replica Island can carry a list of "attack" volumes and "vulnerability" volumes. When detecting collisions, I stipulate that collisions can only occur between a single attack volume and a single vulnerability volume. Furthermore, volumes can be set to deal and receive specific types of hits, which allows me to filter the number of actual volume intersection tests I need to perform (for example, the coin is only vulnerable to a "collection" hit, so only collection-hit-dealing attack volumes will be tested against the coin).
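One cheap way to implement this kind of filtering is a bitmask overlap test: an attack volume and a vulnerability volume only need a real intersection test if their hit-type masks share a bit. The flag names below are hypothetical illustrations, not the game's actual hit types:

```java
// Hypothetical hit-type flags; the post mentions a "collection" hit
// for coins, and damage is the obvious other example.
class HitFilter {
   public static final int HIT_DAMAGE  = 1 << 0;
   public static final int HIT_COLLECT = 1 << 1;

   // An attack volume dealing 'dealt' hit types only needs a full
   // volume-vs-volume test against something vulnerable to 'accepted'
   // types if the masks overlap.
   public static boolean shouldTest(int dealt, int accepted) {
      return (dealt & accepted) != 0;
   }
}
```

A bullet dealing HIT_DAMAGE would never be tested against a coin that only accepts HIT_COLLECT, so the expensive intersection math is skipped entirely.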

Each time an animation frame changes a new set of attack and vulnerability volumes may become active. These volumes are unioned together into a sphere that is guaranteed to encompass them called the bounding sphere. The volumes, along with the bounding sphere, are then submitted to the runtime collision detection system. Each frame, the collision detection system sorts all of the bounding spheres that have been submitted and tests them for intersections. The sort is by the left-most point of the sphere, so objects end up sorted along the x-axis of the level. This is a type of sweep and prune algorithm, and it makes it easy to quickly find overlapping bounding spheres because potentially colliding sphere pairs are guaranteed to be grouped together in the sorted list. When a pair of bounding spheres that intersect is found, each of the related entities' attack and vulnerability volumes are tested for intersection. If an intersection between an attack volume and a vulnerability volume is found, we know that these two entities have hit each other and some other code needs to run in order to respond.
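The sweep-and-prune pass described above can be sketched as follows. The Sphere class, the pair format, and the method names are illustrative, assuming a simple 2D world sorted along x:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Broad phase: sort bounding spheres by their left-most point along x,
// then scan; once a sphere starts past the current sphere's right edge,
// no later sphere in the sorted list can overlap it.
class SweepAndPrune {
   public static class Sphere {
      public final float x, y, r;
      public Sphere(float x, float y, float r) { this.x = x; this.y = y; this.r = r; }
   }

   static boolean intersects(Sphere a, Sphere b) {
      float dx = a.x - b.x, dy = a.y - b.y, rs = a.r + b.r;
      return dx * dx + dy * dy <= rs * rs;
   }

   public static List<Sphere[]> findPairs(List<Sphere> spheres) {
      List<Sphere> sorted = new ArrayList<Sphere>(spheres);
      // Sort by left-most point (center minus radius).
      sorted.sort(Comparator.comparingDouble(s -> s.x - s.r));
      List<Sphere[]> pairs = new ArrayList<Sphere[]>();
      for (int i = 0; i < sorted.size(); i++) {
         Sphere a = sorted.get(i);
         for (int j = i + 1; j < sorted.size(); j++) {
            Sphere b = sorted.get(j);
            if (b.x - b.r > a.x + a.r) break; // pruned: no further overlap possible
            if (intersects(a, b)) pairs.add(new Sphere[] { a, b });
         }
      }
      return pairs;
   }
}
```

Each pair that survives this broad phase would then have its attack and vulnerability volumes tested against each other in the narrow phase.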

For a long time the Replica Island engine only supported sphere collision volumes for these kinds of dynamic tests. About half-way through development I added an axis-aligned box collision type as well, but otherwise no complicated collision volume tests have been necessary. I'm very happy with the way that this system turned out: it's reliable, fast, and easy to extend.

Fast and accurate 2D collision detection with line segments.

Replica Island is a tile-based game, which means that the levels are laid out using small (32x32), reusable tiles. I chose this approach for two reasons: it's memory efficient and exceedingly common for this genre. With a tile-based game you define a set of tiles--in the case of Replica Island, a single texture representing each possible tile--and then draw the level by combining those tiles together. Even if you have a lot of levels you don't need to spend a lot of disk space (or runtime memory) on the actual art for the tiles, and the data describing the layout of each level is very small. So tiles are useful for games that have a lot of levels and wish to keep the total size of their application down. Tiles are also the standard way to make side scrolling games. Pre-Sony PlayStation 1 game hardware (and some post-PS1 hardware, like the Nintendo Game Boy Advance) was hardwired to deal with tiles because of these efficient properties, and as a result the vast majority of side-scrolling games are tile-based. So going with a tile-based game engine also made sense because it helps the game feel like a proper side scroller.


So when it came time to write a collision system to represent level geometry for Replica Island, tiles seemed like a natural choice. I already had a tool for editing tiles and tile maps, as well as runtime code for loading tile sets and maps, so using tiles for collision meant I could leverage a lot of existing infrastructure. Also, maintaining a 2D array of collision tiles in memory was appealing because it's very fast to query the contents of any particular tile; while many collision systems organize their data in trees, doing so requires that the tree be traversed for every collision test. A 2D array, on the other hand, can be indexed directly into, which is very fast (the downside is that these arrays are generally sparse, so a lot of runtime memory is wasted; however, for the size of the levels I was considering and the available RAM on Android devices, the memory required of an array to describe the level is trivially small).
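The "direct index" advantage is easy to see in code: two integer divides map a world position to a tile, with no tree traversal at all. This is a sketch under assumed names; only the 32-pixel tile size comes from the post:

```java
// A 2D collision tile map: world coordinates map straight to a tile
// index with two divides, so a lookup is O(1).
class TileMap {
   public static final int TILE_SIZE = 32; // tile size used in Replica Island

   private final int[][] mTiles; // tile indices, addressed as [y][x]

   public TileMap(int widthInTiles, int heightInTiles) {
      mTiles = new int[heightInTiles][widthInTiles];
   }

   public void set(int tileX, int tileY, int tileIndex) {
      mTiles[tileY][tileX] = tileIndex;
   }

   // Direct lookup: which collision tile covers this world position?
   public int tileAtWorld(float worldX, float worldY) {
      int tileX = (int) (worldX / TILE_SIZE);
      int tileY = (int) (worldY / TILE_SIZE);
      return mTiles[tileY][tileX];
   }
}
```

The cost of that speed is the sparseness mentioned above: every cell exists in memory whether or not it contains collision data.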



The collision tile layout for an early test level.


Tile-Based Line Segment Collision? Huh?

The problem with using tiles for collision is that they are basically square. The simplest implementation is to consider the collision world a 2D grid of booleans, set to true in grid cells that are "solid" and false in cells that are "empty." Using this method you can calculate the location of a collision cell in game space and then check to see if it is solid or not before deciding if, for example, your player can move forward. While that's a very simple approach (and I shipped a few games that worked that way back in the day), it's not really going to scale to the types of interesting level designs that modern players expect. At the very least you want to be able to support a sloped surface so you can make hills and valleys, and ideally you should be able to express bumps, spikes, curves, and other shapes that are much more complicated than a simple solid cube.

So, for Replica Island, I decided to implement a collision system based on arbitrary line segments, stored as shapes in tiles and laid out in the world as a tile map. The idea is pretty straightforward: within a 32x32 collision tile I can define any number of line segments, which can have arbitrary angles and normals. I can use my tile editor tools to lay the collision tiles out and at runtime I can leverage the speed of a 2D array (direct index into the current tile) to find a set of potentially-intersecting line segments. The nice thing about line segments is that they can contain both a slope and a normal, which allows for all kinds of interesting physics calculations without a lot of code (for example, to make a ball bounce convincingly off an angled surface you can simply reflect the ball's velocity about the normal of the surface). I've used similar systems (though never stored as tiles) on a lot of other games and, at least for 2D collision worlds, I'm quite happy with the approach.
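The bounce example works out to the standard reflection formula, v' = v - 2(v . n)n. A minimal sketch, assuming the surface normal is unit length (names are mine):

```java
// Reflect a velocity about a unit-length surface normal:
//   v' = v - 2 * dot(v, n) * n
// This is all it takes to make a ball bounce convincingly off an
// angled surface when the segments store their normals.
class Reflect {
   public static float[] reflect(float vx, float vy, float nx, float ny) {
      float d = vx * nx + vy * ny; // dot(v, n)
      return new float[] { vx - 2.0f * d * nx, vy - 2.0f * d * ny };
   }
}
```

A ball falling straight down onto a flat floor (normal (0, 1)) comes back straight up; against a 45-degree slope it bounces sideways, with no per-surface special cases.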

Data Generation

Once I had settled on the line-segments-in-tiles approach, the problem became actually generating the line segment data for each tile. Genki, the artist behind Replica Island, generated a set of collision tiles that would serve as the basic building blocks of our levels, but now I needed a way to represent that same data as line segments. I've had success with edge-tracing algorithms in the past but my experience with them also suggested that they require a bit of tuning to get right. And since collision detection and response is so closely tied into core game play, I wanted a way to hand-modify individual segments. When I worked in games development full time we would just sit down and write an editor tool for this kind of thing, but one of my goals with Replica Island is to see how simply (read: cheaply) I can get it finished, so writing a dedicated tool was a little out of scope. So I decided to piggyback on some existing tool to generate the data that I needed. After thinking about it for a bit, I settled on Photoshop.



The collision tileset used to build levels in Replica Island.


Did you know that you can actually script Photoshop with JavaScript? I did not know this, but it's actually pretty easy to do (if nigh undebuggable). After a day of futzing around with Photoshop's JavaScript API I had a tool that would walk Photoshop paths and generate a list of line segments with normals (calculated by requiring the path be closed and by assuming that all normals point away from the centroid of the path shape). Since I couldn't figure out a way to actually write out a text file from Photoshop, the script opens a new document, creates a text layer, and then dumps its output into that layer. Once the script was written I went over Genki's collision tileset with the path tool and generated unique paths for each tile. I also added a very simple tool to take the text output of my script and pack it as binary data for loading at runtime. It took a total of about three days to go from having no collision data to having a full tileset of data, ready for use at runtime.

Querying Collision at Runtime

So, now I have two different sets of data: I have a single collision tile set, which maps collision tile indices to collections of line segments, and a bunch of collision tile maps (one for each level) that place individual collision tiles in space to describe level geometry. I can load the collision tile set once and keep it around; each level that I load brings with it its own collision tile map. At runtime, to check to see if a region of space in the game world is blocked or not, I can cast a ray through the tile map (using one of my favorite algorithms: the Bresenham line algorithm) and check it against the collision tiles that it touches. The actual line segment vs line segment test is pretty simple (here's a useful reference), and it produces an exact intersection point. As soon as an intersection is detected the ray test can be aborted and the results returned (note that it's necessary to test the ray against all segments in a particular tile, though; you always want the intersection point closest to the origin of the ray, so all segments within a given tile must be tested and only the closest intersection returned). It's also possible to test a collision volume against the world by simply visiting each cell that intersects the volume and doing a volume vs line test, but in Replica Island this sort of test did not end up being necessary. The game play is described using ray casts alone.
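The per-tile step can be sketched with the standard parametric segment-segment test: express the ray as P + tD and each wall as A + uE, solve for t and u with cross products, and keep only the smallest t in the tile. All names below are illustrative, not the engine's actual code:

```java
// Segment-vs-segment intersection plus the "closest hit wins" rule
// described above. The ray is treated as a segment from (px,py) to
// (px+dx, py+dy) so that it only covers this frame's movement.
class SegmentTest {
   // Returns t in [0,1] along the ray at the intersection with segment
   // (ax,ay)->(bx,by), or Float.MAX_VALUE if there is no hit.
   public static float intersect(float px, float py, float dx, float dy,
                                 float ax, float ay, float bx, float by) {
      float ex = bx - ax, ey = by - ay;
      float denom = dx * ey - dy * ex; // cross(D, E)
      if (denom == 0.0f) return Float.MAX_VALUE; // parallel lines
      float t = ((ax - px) * ey - (ay - py) * ex) / denom;
      float u = ((ax - px) * dy - (ay - py) * dx) / denom;
      if (t < 0.0f || t > 1.0f || u < 0.0f || u > 1.0f) return Float.MAX_VALUE;
      return t;
   }

   // Every segment in the tile must be tested; only the intersection
   // closest to the ray origin is returned. Segments are {ax,ay,bx,by}.
   public static float closestHit(float px, float py, float dx, float dy,
                                  float[][] segments) {
      float best = Float.MAX_VALUE;
      for (float[] s : segments) {
         float t = intersect(px, py, dx, dy, s[0], s[1], s[2], s[3]);
         if (t < best) best = t;
      }
      return best;
   }
}
```

Once the closest t is known, the exact hit point is just P + tD, which is what the response code uses to snap the character back out of the wall.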

So, given a collection of collision tiles, each containing a collection of line segments, and a map of those tiles laid out on a 2D grid, I was able to make a pretty expressive 2D collision system without a whole lot of effort. The next tricky bit, which I'll cover in a subsequent post, was what sort of tests, and what sort of response, are actually the best for game play. And this system only deals with intersections between entities and the background; in the future I'll write about the system I used to detect collisions between individual entities.