July 3rd, 2012 | Posted by Andy Korth
Categories: Uncategorized | Tags:
Comments Off on Community News Roundup.. with videos!

We’ve got a handful of small news items that weren’t big enough for their own posts, so now we’ll catch you up on some Chipmunk and Howling Moon news!

Bobo Explores Light.. and wins an Apple Design Award in the process! This is a great presentation/book game that uses Chipmunk for all sorts of interactive elements. It’s aimed towards children and does an amazing job demonstrating how a digital book can offer amazing interactive learning opportunities.

AtPlayMusic Recorder is an educational app we developed for a local client that uses the iPad to teach children to play the recorder, and features some pretty cool interactive tech. The iPad listens to the student play, and the app also features interactive games mixed in with an animated instructional lesson. Although it doesn’t use Chipmunk, we’re pretty proud of our technical contributions, such as an efficient avatar creation system, shaders for recoloring the avatar’s clothing, and microphone capabilities.

Jump Dewds is the latest game to come out of our friends at Graveck. If you’re looking for some free-to-play coin-grabbin’, lava-dodging, fun that is stuffed with a quirky cast of collectable dewds.. check it out.

And in the Chipmunk world, famous cocos2D guy Birkemose has been using the new deformable terrain stuff and making some really amazing demos. I’ll show off the flashy video here:

He explains his drawing technique used in the above video on his blog.  But he’s also got an explanation and a example project for a simpler version.

Finally, The Wind in the Willows is another eBook that uses Chipmunk for many interactive elements.

May 29th, 2012 | Posted by slembcke
Categories: Uncategorized | Tags:

Framerate Independent Interpolation:

So one of my favorite math tricks is simple damping. Say you have a moving object that you want to apply some drag to. The easy way to do it is to multiply the velocity of the object by a number slightly less than 1.0 each frame:

speed = speed*0.99

Despite the simplicity, this is a pretty good approximation of drag. The faster the object is moving, the faster it will slow down. Using a similar method, you can smooth out noisy input data or target positions for an object to follow:

value = lerp(value, targetValue, 0.1)

Each time this is called, value will move closer to targetValue, and the greater the difference, the faster it will converge.

BUT!

Both of these suffer from a major problem. The damping value used is highly dependent on the framerate. If you call it once per frame and the framerate slows way down, so will the movement. That’s not good at all! You can always call it at a fixed rate, but that’s not alway convenient. Fortunately with a little extra math, you can fix this problem completely.

value = lerp(value, targetValue, pow(0.1, dt))

Where ‘dt’ is the time that has passed since you last called it. Before you were simply moving 10% of the way towards the target value each frame, now you are moving 10% per second. This is much better, and won’t get messed up when your framerate fluctuates.

Logarithmic Interpolation:

Another related trick that I really like is logarithmic interpolation. A mistake that I often see in games is when they do something like this:

scale = lerp(scale, targetScale, 0.1)

Scale, zooming, and sound frequencies are not really perceived as linear quantities, and lerping between them will just seem off. The problem gets worse the greater the ratio between scale and targetScale. The correct way to interpolate such values is to do it logarithmicly. Basically, like this:

scale = exp(lerp(log(scale), log(targetScale), 0.1)

Use linear interpolation on the values after converting them to logarithmic values, then use exp() to convert the logarithmic value back. That’s a lot of expensive math functions to call however! If you remember your log and power rules, you can simplify that to a single pow() call instead:

scale = scale*pow(targetScale/scale, 0.1)

You can combine this with the framerate independence trick from above too. Add these tricks to your math library and give them a whirl.

May 17th, 2012 | Posted by Andy Korth
Categories: Uncategorized | Tags:
Comments Off on Collision Detection and Spatial Indexes

This is a reprint of an article we wrote a few months ago for Game Physics Mag.

“Ninety percent of your time is spent in ten percent of your code”, as the old adage goes, so choosing the proper algorithms for that 10% of the code in your game can be key. In the realm of collision detection and other game code, a good broadphase algorithm can be vital for many kinds of games to run in real time if they have a lot of moving objects in the game. A lot of game developers treat this as some sort of voodoo that only middleware can provide for them, but this isn’t the case. A good collision detection broadphase algorithm doesn’t need to be difficult or hopelessly complicated. In some cases, writing your own can be very beneficial when you run into performance problems. If you wrote it and know how it works, you can fix it yourself without needing to guess.

Collision detection broadphase:

Determining if two shapes intersect can become very computationally expensive, increasingly so based on their complexity. Why waste this computation time when they might be on completely opposite sides of the screen anyway? Broadphase algorithms, which cheaply return lists of possible collisions, alleviate this problem. They usually make as few comparisons as possible based on simplified bounding volumes (such as axis-aligned bounding boxes) to eliminate obvious non-collisions. Then you can apply the much more expensive collision check on a significantly smaller list of potential collisions. This article will discuss two simple broadphase algorithms, each with different strengths and weaknesses. Collision detection of specific kinds of shapes (polygons, circles, etc) and collision response is beyond the scope of this article, although many other helpful resources in that area can be found online.

The simplest possible broadphase would be to keep a list of all the shapes in your game and check each one against all others. It’s a naive, brute force algorithm, and it won’t scale to large numbers of shapes, but don’t completely discount it. If you only have a couple dozen interactive objects in your game, a simple list will perform just fine and is very, very easy to implement. In an Asteroids-like game, because only asteroids and bullets and asteroids and the ship collide, you can already skip most of the potential collision pairs. A broadphase wouldn’t help your performance much in that case. Until you profile and identify a performance problem, don’t spend time doing something you don’t need. It should generally be easy to plug a broadphase into your API after the fact, once you determine that it is necessary.

Single Axis Sweep

Single axis sweep is a very simple but effective broadphase algorithm. It is very easy to implement and is well suited for many types of 2D games, especially those where most of the action is spread out across a single axis. A horizontally scrolling motorbike game or a vertically scrolling space shooter are particularly good candidates. A similar algorithm called sweep and prune can efficiently operate on full 2D and 3D data, but is more complicated to implement.

Starting at the blue dot, sweep to the right until you hit the blue dotted line. The red dot lies within this range so blue and red may be colliding. Continuing on to the red dot and sweeping right until the red dotted line you find the yellow dot. Yellow and red should be checked for collisions. When sweeping the yellow and green ranges, you find no more dots, so there are no more potential collisions to report.

Starting at the blue dot, sweep to the right until you hit the blue dotted line. The red dot lies within this range so blue and red may be colliding. Continuing on to the red dot and sweeping right until the red dotted line you find the yellow dot. Yellow and red should be checked for collisions. When sweeping the yellow and green ranges, you find no more dots, so there are no more potential collisions to report

To implement single axis sweep, axis aligned bounding boxes are computed for each element under consideration, and then sorted by their minimum bound on one of the axes. Now with the sorted list of shapes, the collision detection is just a simple set of for loops.

static void
Sweep(List objects, CollisionCallback callback)
{
    // Sort the objects by their minimum bounds on one of the axes
    // Insertion sort is recommended as the list should already be mostly sorted from the last frame.
    SortByXMin(objects);

    int count = objects.count;
    for(int i = 0; i < count; i++){
        Object object = objects[i];
        float max = cell.bounds.max;

        // 'object' overlaps other objects in the list until 'object[j].aabb.xmin' is greater than 'max'
        for(int j = i+1; objects[j].aabb.xmin < max && j < count; j++){
            // 'object' and 'objects[j]' may be colliding.
            // Use the callback to pass the pair of objects back to the game's code for a better check.
            callback(object, objects[j]);
        }
    }
}

Additionally, because the shapes generally don’t move very far from frame to frame, their sorting order doesn’t change much either, if at all. This is known as temporal coherence, and it can really speed up your collision detection if you can take advantage of it. In the case of single axis sweep, all we need to do is select a sorting algorithm such as insertion sort that performs best on nearly sorted lists.

In the average case when you have smoothly moving objects mostly spread out over a single axis, the algorithm runs in O(n) time when using insertion sort, which is very good. However, if your objects aren’t well spread out, or they don’t move smoothly and their position is unrelated to the previous frame’s position, it runs in O(n^2) time which is no better than the naive brute force algorithm.

Another advantage of single axis sort is that it can be trivially parallelized. This works especially well for implementing a GPGPU accelerated broadphase where it’s better to be simple and wasteful instead of complex. The GPU gives you a massive amount of CPU power for simple parallelizable tasks, but works very poorly when given complex tasks that require a lot of synchronization. This approach has been mentioned in a number of GPGPU papers.

However, an issue that you might run into with single axis sweep is that it can’t efficiently be used for positional queries from your game code. Even with the sorted list, you still need to check against every object in the list for a random query when doing a raycast or an explosion in your game. To perform such queries efficiently, you need a spatial index algorithm. It’s also not optimal when you are comparing a small number of moving objects to a large number of stationary objects. You’ll spend a lot of time resorting the stationary objects even though they never move.

Spatial indexing algorithms:

Spatial indexing algorithms focus on finding objects based on positional queries, much like how a database uses an index to quickly find records based on a search query. To use a spatial index as a collision detection broadphase, you run a query for each object to find the other objects around it. A good spatial index has many uses beyond just collision detection though! AI will frequently query for nearby entities or use raycasts for line of sight checks when making behavioral decisions. One time events such as explosions may occur where you need to efficiently find which objects are within the blast range. Another benefit is that you can split up your moving and stationary objects into separate indexes. The index of stationary objects would only need to be updated when one was aded, moved or destroyed. Compare this to a broadphase which can only give you pairs of colliding objects for one large list.

When working with a spatial index, there are a few core operations. Insertion, removal, querying, and reindexing are the most important. Also when using a spatial index for a broadphase, it’s also useful to combine reindexing and querying for all collisions into one operation. An advantage of this reindex/query operation is that you can make it avoid returning duplicate collision pairs. If shapes A and B are colliding, a query using A’s bounding box will return shapes A and B, and a query using B’s bounding box will also return A and B. The list of collision pairs you get would be (A, A), (A, B), (B, A), and (B, B), but you really only want (A, B). Normally you’d have to filter these out afterwards.

Simple Grids

Placing your game entities into a grid is a natural and easy way of indexing them. Visualize drawing a grid over your world that maps onto a 2D or 3D array. Each grid cell stores references to shapes if the AABB of the shape intersects the grid cell. Tile maps can be thought of as a simple grid, although they usually have a single tile per grid cell. In general though, you can have many shapes overlapping a single grid cell, and a single shape can overlap many grid cells. While there are a lot of variations to grid algorithms, we’ll walk through one simple and flexible implementation.

In a game that takes place on a single screen or on a fixed size map, your grid size matches the size of your world. The size of the cells within the grid should match the size of the objects in your game. The broadphase search becomes quite simple, since you’re just checking each cell for multiple objects.

With a grid, you store lists of overlapping shapes into each cell. The purple and orange grid cells show places where multiple shapes have been inserted into a cell. The shapes in these cells may be colliding and should be checked closer.

With a grid, you store lists of overlapping shapes into each cell. The purple and orange grid cells show places where multiple shapes have been inserted into a cell. The shapes in these cells may be colliding and should be checked closer.

Time for some pseudo code! First the simple data structure for the grid.

static void
Sweep(List objects, CollisionCallback callback)
{
    // Sort the objects by their minimum bounds on one of the axes
    // Insertion sort is recommended as the list should already be mostly sorted from the last frame.
    SortByXMin(objects);

    int count = objects.count;
    for(int i = 0; i < count; i++){
        Object object = objects[i];
        float max = cell.bounds.max;

        // 'object' overlaps other objects in the list until 'object[j].aabb.xmin' is greater than 'max'
        for(int j = i+1; objects[j].aabb.xmin < max && j < count; j++){
            // 'object' and 'objects[j]' may be colliding.
            // Use the callback to pass the pair of objects back to the game's code for a better check.
            callback(object, objects[j]);
        }
    }
}

Instead of storing references to the objects directly in the grid’s array, we’ll be creating proxies. We’ll perform simple retain counting on these (in non GC languages) and keep a stamp to avoid duplicate results when querying. The reason for this will be explained later.

GridProxy {
    // The object this proxy is for.
    Object object;

    // Used during query operations to avoid duplicate results.
    int queryStamp;

    // Use retain counting on the handle so you know when it can be released.
    int retainCount;
}

Let’s start with the code for the insert operation.

void SimpleGrid_Insert(SimpleGrid grid, Object object)
{
    AABB aabb = object.aabb;
    GridProxy proxy = GridProxyNew(object);
    HashTableSet(grid.objectsProxies, object, proxy);

    // Convert the object's AABB to integer grid coordinates.
    // Objects outside of the grid are clamped to the edge.
    float cellSize = grid.cellSize;
    int xmin = max(floor(aabb.xmin/cellSize), 0.0);
    int xmax = min(floor(aabb.xmax/cellSize), grid.width - 1);
    int ymin = max(floor(aabb.ymin/cellSize), 0.0);
    int ymax = min(floor(aabb.ymax/cellSize), grid.height - 1);

    // Loop over the cells and insert the proxy into their lists.
    for(int y = ymin; y <= ymax; y++){
        for(int x = xmin; x <= xmax; x++){
            List proxiesInCell = grid.array[x][y];
            List_Append(proxiesInCell, proxy);

            // Retain the proxy for each cell it's inserted into.
            GridProxyRetain(proxy);
        }
    }
}

Removal from a grid can be tricky, since an object can be in several grid cells. There are several approaches, such as iterating over all the grid cells and removing the object from each one it’s present in. Fortunately, with the proxies removal becomes very simple. All you need to do is to invalidate the proxy by setting it’s object to NULL. Don’t forget to release the proxy as well so the next time the grid is reindexed so it won’t be leaked.

void SimpleGrid_Remove(SimpleGrid grid, Object object)
{
    GridProxy proxy = HashTableGet(grid.objectsProxies, object);
    HashTableRemove(grid.objectsProxies, object);

    // Set the object to NULL to mark the proxy as invalid.
    proxy.object = NULL;

    GridProxyRelease(proxy);
}

While there are a number of queries that you might want to implement (point, line, AABB, etc) the AABB query is good starting place. Much of it should look familiar from the SimpleGrid_Insert() function. The first thing you should do is to increment the grid’s query stamp. This is used to uniquely identify the query. All proxies that are visited get marked with this stamp value. It’s very likely that you’ll encounter a proxy several times in the neighboring cells. This way you know to only report each object once for each query.

void SimpleGrid_AABBQuery(SimpleGrid grid, AABB aabb, Function callback)
{
    // Increment the grid's queryStamp so it's unique.
    grid.queryStamp++;

    // Convert the object's AABB to integer grid coordinates.
    float cellSize = grid.cellSize;
    int xmin = max(floor(aabb.xmin/cellSize), 0.0);
    int xmax = min(floor(aabb.xmax/cellSize), grid.width - 1);
    int ymin = max(floor(aabb.ymin/cellSize), 0.0);
    int ymax = min(floor(aabb.ymax/cellSize), grid.height - 1);

    for(int y = ymin; y <= ymax; y++){
        for(int x = xmin; x <= xmax; x++){
            List proxiesInCell = grid.array[x][y];

            for(GridProxy proxy in proxiesInCell){
                Object object = proxy.object;

                // Check that the proxy hasn't already been processed by this query in a neighboring cell.
                // Also check that the proxy's object hasn't been removed.
                if(proxy.queryStamp != hash.queryStamp && object != NULL){
                    // Stamp the proxy so it won't been visited again.
                    proxy.queryStamp = grid.queryStamp;
                    callback(object);
                }
            }
        }
    }
}

Reindexing a grid basically involves temporarily removing all of the objects, clearing the grid, and then reinserting the objects. A good first optimization is to reuse the proxy objects. This was omitted to keep the sample code short however.

void SimpleGrid_Reindex(SimpleGrid grid)
{
    List objects = HastTableKeys(objects);

    // Remove all the objects. We'll readd them later.
    for(Object object in objects){
        SimpleGrid_Remove(grid, object)
    }

    // Clear out all the cells
    // This gets slower the bigger your grid is.
    for(int y = 0; y < grid.height; y++){
        for(int x = 0; x <= grid.width; x++){
            List proxiesInCell = grid.array[x][y];
            for(GridProxy proxy in proxiesInCell){
                GridProxy_Release(proxy);
            }

            // Reset the list for that cell
            grid.array[x][y] = EmptyList;
        }
    }

    // Readd the objects.
    for(Object object in objects){
        SimpleGrid_Insert(grid, object);
    }
}

Lastly, it was mentioned that a reindex/query operation is useful. This is easy enough to implement by performing a query immediately before inserting each object during the reindexing operation. This way you’ll never have an object get itself back from querying it’s own bounding box. Also, because the objects are inserted and queried incrementally, you won’t get both an (A, B) and (B, A) collision pair. You can even combine the insert and query operations together to avoid iterating the same grid cells separately for each.

An advantage of a grid based index is the ease of efficiently running queries. Point queries simply need to check the cell that contains that point, and a raycast implementation only needs to check grid cells the ray travels though, much like Bresenham’s line drawing algorithm. Unlike the single axis sweep, the grid is a full spatial index that can be efficiently queried by the rest of the game code.

Grids do have a couple problems though. Large worlds require a large grid and thus a large 2D or 3D array to store it. This introduces some potential problems with memory usage. Infinitely sized procedural worlds are increasingly popular, and raises the question about what you’d do when your grid’s array size gets unreasonably large. Additionally, a simple grid can be wasteful since they often end up being almost entirely empty. At a minimum, empty grid cells still use at least as much memory as a pointer. Imagine a 1000×1000 grid with 5000 objects inserted into it. If each object was inserted into 4 cells each, you would at most be using 2% of the cells. Fortunately, these concerns are addressed by the spatial hash algorithm.

Spatial Hash:

Spatial hashes are special type of the grid index that uses memory more efficiently. Instead of creating a large multidimensional array, you create a small 1D array, and then semi-randomly map the grid cells onto that array using a hashing function. This has a lot of benefits. You no longer need to think about grid bounds or size. Any grid cell can be mapped onto the hash table, limited only by the size of your integer type. Sparsely populated grids won’t eat up your memory either. If the array is mostly empty, you can just make it smaller. This is also good for when you want to clear the grid as a smaller array can be cleared faster.

There are a couple of minor downsides however. Because many grid cells can be mapped onto the same array location, it’s possible for two objects that are far away to be returned as a false positive by the spatial hash. However this is fairly rare and can be quickly rejected by a bounding box check. Another issue is that it’s possible for a single object to be inserted into the same array index more than once from two separate grid cells. You’ll either need to catch this condition, or make sure that you never assume that the list for a cell contains all unique entries.

What makes a good hashing function? A poor hashing function could result in an uneven distribution of objects into the array, negating some of the benefits. On the other hand, the hashing function may easily be called hundreds of thousands of times per second, so it should be fairly efficient. In Chipmunk, I use the following hash function in conjunction with prime sized hash tables. This has proven to provide a reasonable level of performance.

int HashFunc(int x, int y, int tableSize){
    // multiplying the coordinates by large primes, XORing them together and then
    // modding by a prime table size makes a pretty good hash function
    return (x*1640531513 ^ y*2654435789) % tableSize;
}

Implementing the insert, delete, query, and reindex operations for a spatial hash are almost identical to a spatial hash. You can remove the min/max clamping when calculating the grid coordinates as they are no longer needed to protect against array indexing errors. Instead of accessing the grid’s array as a 2D array, you just need to swap in the hash function.

for(int y = ymin; y <= ymax; y++){
    for(int x = xmin; x <= xmax; x++){
        List proxiesInCell = grid.array[HashFunc(x, y, grid.arraySize)];
        // ...
    }
}

The efficiency of the spatial hash and simple grid depends on the grid cell size roughly matching the sizes of the shapes being put into it. If your grid cells are much smaller than your objects, each object will be in a handful of cells, so you’ll be checking and inserting into a lot more cells than you need to. If your grid cells much larger than your objects, you might end up with a lot of non-colliding objects in a single cell generating a lot of false-positives. Adjust your grid size to be comparable to the size of typical objects. For this reason, grids work best when all your objects are roughly the same size.

The grid cell size is much too big. Too many shapes are inserted into each cell, and each cell requires n^2 collision checks. The grid cells are the grey squares; A darker color means there are more shapes in a cell.
The grid cell size is much too big. Too many shapes are inserted into each cell, and each cell requires n^2 collision checks. The grid cells are the grey squares; A darker color means there are more shapes in a cell.
This grid cell size is much too small. Each shape is inserted into many grid cells. Notice the speckles between shapes. These are grid cells that map to the same hash table index as another grid cell where a shape was inserted.
This grid cell size is much too small. Each shape is inserted into many grid cells. Notice the speckles between shapes. These are grid cells that map to the same hash table index as another grid cell where a shape was inserted.
The size of the grid cells is just right.
The size of the grid cells is just right.

Spatial hashes require an additional tuning parameter compared to a simple grid, the hash table size. When working with a grid, you simply need to worry about making it’s dimensions big enough to fit all your objects in it and and then tuning the cell size. With a spatial hash, you also must tune the size of the hash table itself. If the hash table size is too small, you will get many false positives because too many grid cells that are far away from each other will map onto the same hash table index. If the hash table size is too big, you’ll just be wasting memory and making clearing operations take longer. You’ll have to tune this by hand a little, but a good rule of thumb is that your hash table should be 10x bigger than the number of objects you insert into it.

When simple grids and spatial hashes are properly tuned, they also have an O(n) running time to all the collisions. This is very good and means that they scale very well to huge numbers objects. You could easily expect to collide 10k to 20k shapes at 60 hz on a modern desktop CPU. Despite the linear running time, because of the number of memory operations grids and spatial hashes use to access each grid cell, they have a high constant cost associated with them. For smaller numbers of shapes, or shapes that aren’t the all about the same size, there may be more optimal spatial indexes.

Conclusion:

So now that you’ve seen that spatial index and collision detection broadphase algorithms don’t have to be hard, I hope you’ll try writing some of your own! It’s a very interesting topic with a wide variety of ideas that has filled many books. If you are interested in learning more, bounding volume hierarchies such as quadtrees, k-d trees, and axis aligned bounding box trees are a good next step.

Scott Lembcke and Andy Korth have been developing games for the last four years. Scott wrote the Chipmunk Physics engine, and together they have offered game physics consulting services, iOS development, and Unity development services. They are located in Minnesota, USA.

References:

“Optimized Spatial Hashing for Collision Detection of Deformable Objects”
http://graphics.ethz.ch/Downloads/Publications/Papers/2003/Tes03/Tes03.pdf

May 9th, 2012 | Posted by slembcke
Categories: Development, News | Tags: , , , ,

Chipmunk Logo

It has been a few months since Chipmunk 6.0.3 was released! We’ve had a lot of contracting work to distract us from core Chipmunk development lately. Even so, we’ve been cooking up some new features and now I’m currently working full time putting the final touches on 6.1. So what’s new to get excited about? Here are a lot of words!


Approximate Concave Decomposition (Pro only):


ACD

One of the features that has been missing from Chipmunk Pro’s autogeometry is some form of concave decomposition. A lot of people use a triangularization algorithm to solve this. Unfortunately, while triangularization is great for creating perfectly decomposed triangle meshes for rendering, and are pretty easy to implement, they doesn’t work very well for generating collision detection shapes. It tends to create more shapes than you want or need and can produce very small or thin shapes. This can clog up the spatial index, giving you poor overall performance, and cause other frustrating artifacts in the simulation quality.

Instead of triangularization, I’ve been working on approximate concave decomposition (ACD) for Chipmunk Pro. This means that you ask Chipmunk to decompose a concave polygon with a distance tolerance, and it will give you back one or more convex polygons that match the shape of the original to within the tolerance. The ACD algorithm tries to match the most important features of the input first to create a high quality decomposition. Also, because it can stop once it satisfies the tolerance, the performance can be really good and used at runtime even. If you want a perfect decomposition, you can still always use a tolerance of 0.0. Although this can take more time and produce more shapes than you might want.

The ACD algorithm shipping with 6.1 isn’t perfect though. I’m still improving it’s splitting plane hueristic to improve how it cuts the polygons (I’m not too happy with the ‘H’ in the image above -_- ), and it doesn’t yet have options to detect or process polygons with holes in them (like the ‘P’). Lastly, if you run ACD on a self intersecting polygon, it might cause it to over-simplify it because it detects the polygon as having a negative tolerance.

Because we are still working on the quality of the algorithm, and its missing the ability to work with holes, we’ve marked it as being beta. Its usable and stable, but its performance and quality might not be as good as some people want yet. I didn’t want to hold up the release longer while I did more research for this, but at the same time didn’t want to leave it in an unusable state.


Multithreaded Solver (Pro only):

This has been in the pipeline for a long time! There has been early access to this available through the Chipmunk Pro git repository for some time, but the next release will make it official. This complements the ARM NEON optimized solver and works on any platform that supports pthreads. This is an easy way to take advantage of those extra cores that now ship on everything from desktop to mobile machines. The Chipmunk Pro showcase app on the App Store uses the multi-threaded solver and enables a lot of extra objects to simulate on the iPad 2 and iPhone 4s. Check it out!


Convex Hulls:


Convex Hull

Since the beginning of time (or like 2006… whatever), Chipmunk has required that you carefully provide it with convex polygon data with a specific winding. Every good graphics API does this too, so people know to expect this sort of thing right? Nope. I talked with Erin Catto for a while at GDC and he mentioned that the winding and concavity of polygon shapes was one of his biggest support issues and that he recently added a convex hull algorithm so that it would “just work”. I’ve had assertions to ensure concavity and winding, but Erin’s solution was better. 2D game programmers often aren’t graphics experts, and they shouldn’t need to be. I’ve had a nice QuickHull implementation sitting around for some time, so I decided to make it more generic and include it with Chipmunk. Nobody will ever have to bother with concavity or winding ever again. \o/


Nearest Point Queries:


Nearest Points

The main inspiration for the original point query feature in Chipmunk was selecting shapes with the mouse. Using a single point made great sense for this as mice are very precise devices. With touchscreen devices its not perfect. Using a single point to represent a big fat finger doesn’t work very well. Its very frustrating to try to grab a small shape with your finger only to fail repeatedly. I’ve seen some people doing clever sampling tricks to try and get around it, but we can do better! Nearest point queries are the next logical step. Instead of just a yes/no answer to if a point is inside a shape or not, nearest point queries give you the nearest point on the surface of a shape as well as the distance to it (or the depth the point is inside). Now when you tap a big fat finger on the screen, you can trivially find the closest shape within a certain radius. The ChipmunkMultiGrab class in Chipmunk Pro has already been updated to support this.


Block Iterators:

Almost every language in existence today other than C/C++ has some sort of concept of closures or anonymous functions. One of their uses is to allow you to make some really slick, simple APIs for iterating complex data structures. You just pass a chunk of code to the thing you want to iterate and it will call your code for each object. This is why Chipmunk has so many function based iterators (cpSpaceEachBody(), cpBodyEachShape, cpSpaceSegmentQuery(), etc). Exposing the underlying data structures would be annoying to the user as some of them aren’t simple, and it would mean I can’t change them later if I need to.

Even if you are using Chipmunk from C or C++, most compiler provide some extensions to make your life easier. GCC supports inner functions, and Clang supports blocks. I’ve often encouraged people to use them, as the iterator functions were intended to be used with language features such as these. I started making sample code for a tutorial showing how to use them with Clang blocks, but then decided that I might as well just put that code into the official Chipmunk API instead.

Without the block based iterators, you write code like this:


// Define your callback function somewhere like this:
static void
ScaleIterator(cpBody *body, cpArbiter *arb, cpVect *sum)
{
	(*sum) = cpvadd(*sum, cpArbiterTotalImpulseWithFriction(arb));
}

// ... and then somewhere else use it like this:
cpVect impulseSum = cpvzero;
cpBodyEachArbiter(scaleStaticBody, (cpBodyArbiterIteratorFunc)ScaleIterator, &impulseSum);

With the new block based iterators, its much nicer. You can keep the code all in one place:


__block cpVect impulseSum = cpvzero;
cpBodyEachArbiterBlock(scaleStaticBody, ^(cpArbiter *arb){
	impulseSum = cpvadd(impulseSum, cpArbiterTotalImpulseWithFriction(arb));
});
March 3rd, 2012 | Posted by Andy Korth
Categories: Uncategorized | Tags:
Comments Off on GDC 2012

Scott and I are heading to GDC tomorrow, and if you’d like to meet up sometime, please email us at admin (at) howlingmoonsoftware.com. We’ve got loads of Chipmunk t-shirts that we’re planning on giving out, and we’d love to meet as many people as possible.

You’ll also find me on Twitter as @kortham.

February 7th, 2012 | Posted by Andy Korth
Categories: Uncategorized | Tags:
Comments Off on Meet Yggy, our global game jam game!

Last month, Scott and I participated in the Global Game Jam. Together with Ashley, the extremely skilled artist we found, we made a fun little game that highlights the concepts of the cycle of life.

Ashley did a nice writeup on her blog: http://hardshellart.blogspot.com/2012/02/yggy.html

Download Yggy for Mac and try it out! It’s even cuter when you see it animated. Sorry, we wrote it in Objective-C from scratch, so there’s no Windows port. But you can see a video of the sprite animations on Ashley’s blog, above.

Overall, it was a lot of fun to sit down and make something over 48 hours. I was really pleased with what we threw together, because it feels like a complete game, even if it’s pretty small and short.

If you’re in Minnesota, you can come play it tomorrow (Wednesday, Feb 8th) at the IGDA Twin Cities!

November 15th, 2011 | Posted by slembcke
Categories: Uncategorized | Tags:

So this last weekend I was bored. I could have played any number of video games (including my recently acquired copy of Wario Land 2 😀 ), but instead found myself reading instruction manuals for instructions (CPU instructions that is).

On a number of occasions I’ve found myself looking into ways to better optimize Chipmunk specifically for iOS devices. A few years ago I spent a bunch of time looking into the VFP (the vector math extensions) for the original iPhone and iPod friends. VFP was the SIMD (Single Instruction Multiple Data) instruction set supported by the ARM 6 hardware. The idea is that instead of adding up numbers one at a time, you put them into a vector and add (multiply, divide, etc) them up 2 or more at a time and be 2 or more times faster! Nice idea, not so great implementation. VFP’s big letdown was that it didn’t really make anything 2 or more times faster. It just sort of uses less instructions and is a little bit faster. The really big problem is that in order to get “a little bit faster” you have to write a lot of assembly code, and you have have to do it just right or it will end up being slower than the compiler generated code anyway. Generally people stuck to using VFP for implementing things that were straightforward and well ordered like matrix math or particle effects. Trying to use VFP to optimize Chipmunk seemed like a colossal waste of time.

Now enter the 3GS with it’s shiny new ARM 7 CPU. It has a fancy new vector unit called NEON. Unlike VFP, with NEON you can process a vector of two floats in the same amount of time as applying the same operation to a single float. Also, because NEON uses ordinary instructions without needing to set extra state in the CPU, the compiler provides “intrinsics”. Basically, these are functions that map more or less directly onto the instructions the CPU provides. That way you can let the compiler allocate registers for your variables and move your data into and out of memory, and pack and unpack vectors while you just write the math. There is still one problem though. You can’t just switch back and forth between vectors and scalars whenever it’s convenient. Simply replacing Chipmunks vector operators (cpvadd, cpvsub, etc) with NEON intrinsics (vadd_f32, vsub_f32, etc) works, but the performance is awful (I tried it…). The problem is, despite being a vector instruction set, it doesn’t cover a lot of vector algebra (dot or cross products, etc.) because it’s designed to be simple to implement in silicon. This basically leaves you to do some of the compiler’s work and figure out how to rewrite the code onto the limited vector instruction set.

I’ve actually made several attempts to speed Chipmunk up using NEON now. It’s sort of demoralizing to spend hours rewriting a chunk of code to be more complicated only to have it run slower. Ugh! The first time I simply tried replacing the individual vector functions (cpvadd, cpvmult, cpvdot, etc) and seeing what happened. It was a lot slower. The problem was that all the rest of the code didn’t use NEON vectors, and some operations (such as cpvdot or cpvcross) couldn’t be written to use NEON efficiently. So it ended up spending more time packing and unpacking vectors than it ever saved by processing two numbers at a time. Months later I tried rewriting only cpArbiterApplyImpulse() to be vectorized. In most Chipmunk programs, half or more of the CPU time of everything used by Chipmunk is spent in cpArbiterApplyImpulse() so it’s a good candidate for optimizations and has already been the target of a half dozen passes. After hours of tracking down new bugs that I had made, it just ran slightly slower again. Ugh… The problem was that I hadn’t gone all in. I only tried to vectorize the parts that were easy and still avoided things like dot products and all the fancier operations that I really needed to be able to use. So I was still doing a bunch of unnecessary packing/unpacking. The general trick to get around that is to do something like process two pixels at a time instead of trying to process one pixel twice as fast. Unfortunately, while it’s easy with pixels, I couldn’t really do that when processing contacts because it was just too complicated to set up in a sequential solver.

This time I decided I was going to do the whole function (or just give up). Several of the key vector operations that I needed didn’t map onto the instruction set and processing two contacts at once was too complicated. I needed to do something else and realized that I had a third option, why not just rearrange a bunch of math so that I can perform two unrelated dot products or cross products (or whatever) at the same time? When it’s not possible to pair two operations up, just continue processing as a vector anyway and throw away the unused component at the end instead of unpacking several vectors along the way. So I spent an evening turning the 30 lines of fairly readable code into 90 fairly unreadable ones. The best case scenario is that you can double the speed of a piece of code if it maps perfectly onto the vector instructions. Realistically I was expecting maybe a 10 or 15% speedup due to some inevitable packing/unpacking costs and a few places where I had to waste some of the calculations. Instead I got a 40% speedup. Not bad at all! Running the benchmark demos I have, I found that it was a 19% average speedup *overall*! The demos that had a lot of stacked objects were as much as 30-35% faster. The best part is that unlike multithreading, there is 0 overhead and it works on 95% of the iOS devices out there. You can leave it on all the time for any sort of simulation and it will always make things faster.

This is all very exciting and whatnot, but when will it see the light of day, right? Expect it to be available as part of Chipmunk Pro very soon when I release 6.0.3 (along with properly integrated AutoGeometry). You might also be glad to hear that I’ve made the Git repository for Chipmunk Pro to be readable by our customers. That way when I work on improvements like this or need to fix a bug, you can always pull and use the latest without having to wait for the next release. If you have bought Chipmunk Pro and want a free performance boost _right now_, all you have to do is pull from the repository, build it, and replace ChipmunkSpace with ChipmunkHastySpace (or cpHastySpace if you are using the C API).

To check out Chipmunk Pro from git run this, substituting in your own Chipmunk Pro login name and password.
git clone --recursive http://myLogin@chipmunk-physics.net/release/chipmunkPro/ChipmunkPro.git

From there you can just run the iphonestatic.command script (double click or run on the command line with ruby iphonestatic.command) and copy out the ChipmunkPro-iPhone directory into your project.

October 19th, 2011 | Posted by Andy Korth
Categories: Uncategorized | Tags:
Comments Off on Your Story Postmortem, uDevGames 3rd place winner!

While Andy was working on Leader of Mans, Scott joined forces with Max Williams to create Your Story.

Just another day at the office? Not so! Your career takes a sudden plummet as you get promoted to a position deep underground.

Meet friendly natives and explore their world. Fight off unwanted affection from the local wildlife. Discover abandoned facilities and long-forgotten artifacts.

Your Story is a 2D platformer with a retro style.

Max finished and posted the Your Story postmortem as well. Here’s a playthrough video. Watch out for spoilers, although if you can’t play the game on a Mac, at least you’ll get to see it!

October 17th, 2011 | Posted by Andy Korth
Categories: Uncategorized | Tags:

Leader of Mans Postmortem

 

A big thanks to everyone who voted for Leader of Mans in uDevGames. I was awarded 2nd place in the ‘overall’ category!


Leader of Mans was my uDevGames 2011 entry. I wanted to create something that felt a bit unique- LOM is a construction based rts-like game that is a bit difficult to place squarely in a single genre. Gameplay focuses around developing resources and constructing settlements, but Leader of Mans also directs you to explore your varied surroundings and overcome challenges.

Leader of Mans was sort of an experiment for me; When I started, I didn’t have a really specific picture of how the game would play. The goals of a score based arcade style game have some pretty direct goals, and I wanted to try something a little different. I think a driving force was to make unique challenges on each island, and to make advancing to each island feel like a new experience, while retaining what you’ve lovingly constructed on previous islands. I have a lot more ideas and plot I could add, but I think what I’ve got is pretty satisfying as a complete experience.

 

Original Design

 

One reason I struggled initially with gameplay, is because I wanted to make the game multiplayer. I used my knowledge from my previous uDevGames entry, Reclaimed to write a fully networked game, featuring fancy stuff like client side prediction and latency reduction code. That was working pretty well, until I realized interacting with other players within the context of the gameplay didn’t really make a lot of sense. I was hoping for a cooperative environment, but exploring and unlocking content wasn’t really suited to multiple players sharing a world.

So I figured I’d save some time and I dropped multiplayer support, and instead focused on plot and progression elements.

What Went Right

 

Tool Choice

 

I used tools I was familiar with, and I leveraged my codebase from my previous contest game, using the networking code, the UI library I wrote, and enhancing the rendering engine. I used some neat new techniques to dynamically generate terrain and plugged in code for a non-tile based engine. I was able to work much faster in my own code, even though I am familiar with tools like Unity, which Scott and I use in our day job. It’s nice to use the contest as an opportunity to get away from ‘work’ games.

No menus or crazy UI

 

While I slowly figured out how the game was supposed to work, I constantly examined how the user interface would work for the game. As I considered how constructing a building worked, I was very conscious of implementing it in a way that was fun on it’s own, and unobtrusive. Reclaimed approached this problem with large sets of menus. Constructing an item in Reclaimed meant scrolling through a list of 80 possible items. Adding filtering makes it easier, but further complicates the UI. I wanted all interactions in Leader of Mans to happen without opening a menu or an in-game window.

I think keeping a simple UI kept the game approachable and fun. Working out the gameplay like this surprisingly difficult, but it feel it was a big success. When you see a simple detail implemented in game, it seems obvious and trivial, but it often took a lot of thought and iteration to get that point.

Friends!

 

Since I didn’t really have a clear picture of the game when I started, it was easy to get myself stuck in a rut. At that point, I turned to my friends in iDevGames and asked for some ideas. A huge thanks to Scott, Seth, Alex, Keith, Neil, and everyone else in the channel for sending me ideas! And thanks to those who send feedback to the Leader of Mans forum thread. The unique use of the word ‘mans’ came out of a IRC discussion, and Seth produced the excellent bear artwork. My wife, Beth, was very supportive. On night before submission she stayed up late and drew the dead trees used on the last island. Thanks hun!

Getting feedback and ideas really energized me. Those were the nights I stayed up late, plowing through features without realizing how quickly the time was passing!

What Went Wrong

 

Oops, I forgot to blog!

 

We’ve got a blog at Howling Moon Software… and if you read it, you would think we only did one interesting thing every 3 months! Unfortunately, when we’re doing something exciting, sharing it on our blog isn’t a high priority. When we do think about making a post, we’re usually working on a client project that’s covered under NDA.

What Went Okay

 

No Clear Initial Idea

I’m actually pretty happy with how things worked out despite not having a clear idea of what I wanted. It took extra time, but I’m happy with how iterating the gameplay turned out. I figured this could be a recipe for disaster, but it worked out alright.

My first few weeks of programming all ensured a working multiplayer experience. I dropped this when I realized I had no idea how the game would play in a multiple player environment. Although it did add up to some lost time, I made the decision early enough to still be able to pull together a finished game.

Development time

 

Scott and I actually set aside some time for working on our uDevGames contest entries. Since many of my evenings are scheduled with my wife and our new house, it’s difficult to find time (and energy!) for programming in the evening after a full day of game programming. So I got a week or so of work done early in the contest, but then additional important work showed up from clients. I managed to eek out most of the last week to really throw the game together. It could have been worse, but it’s always hard to schedule time exactly.

Conclusions

 

I learned a lot since my entry in uDevGames 2008. I had a better idea of what would work well in the contest, and I’m very happy to be awarded the second place overall award. I think Leader of Mans pulled together nicely- I did manage to get some of the important polish in, like nice sounds, music, and some graphical effects. I spent a lot of time improving the interface and controls, although there’s still a few more things I would have liked to get to. Leader of Mans was very very close to several other awards, so next time I’ll need to give it a little extra oomph to get it up into third place in each category!

  • Developer: Andy Korth
  • Title: Leader of Mans
  • Team Size: 1
  • Hardware: Mac Pro, Macbook
  • Critical Software: Eclipse IDE, Photoshop, Amadeus Pro, Pedegree particle dohicky, Heiro font thingy
  • SDKS & APIs: Slick library, OpenGL, homegrown code

 

October 13th, 2011 | Posted by Andy Korth
Categories: Uncategorized | Tags:
Comments Off on Andy’s Contest Game: Leader of Mans

“All of creation is at your fingertips. Your mans look to you for guidance. Lead them to greatness.”

Leader of Mans was my entry for the uDevGames contest. uDevGames is the big Mac game development contest. It’s a multi-month contest with very few restrictions, just make a game and share it!

Leader of Mans was sort of an experiment for me; the game ended up being an interesting exploration based RTS god-game. I had a lot of fun putting it together and figuring out where I wanted to take it. I think a driving force was to make unique challeneges on each island. and to make advancing to each island feel like a new experience, while retaining what you’ve lovingly constructed on previous islands. I have a lot more ideas and plot I could add, but I think what I’ve got is pretty satisfying as a complete experience.

Leader of Mans got a fantastic review that walks you through the game and has some great screenshots.

Download and Play: Leader of Mans