April 24th, 2013 | Posted by Andy Korth

The last few weeks we’ve been working on Chipmunk bindings for Unity! If you’d like to stay up to date on our progress, follow us on Twitter:

http://twitter.com/kortham
http://twitter.com/slembcke

So far we’re finding a lot of challenges, but we’re really trying to go the extra mile to make it feel Unity-ish. And very early performance tests put it simulating over twice as many objects as PhysX, in spite of our need to cross the managed/native barrier each frame for each object!

April 1st, 2013 | Posted by slembcke

We had a great time at GDC last week! We ended up meeting a lot of new and interesting people. It’s always great because we can chat with a lot of people who are and are not using Chipmunk to figure out how we can make it even better.

Farewell Chipmunk Physics. Long Live Chipmunk2D!

Chipmunk isn’t going anywhere. Just rebranding a little bit to make it easier to search for. Google for “Chipmunk” and see how much of it is physics related. ;)

This was a suggestion that came up in a Cocos2D thread a bit before GDC, and it came up a couple more times while we were there.

On that note, I’ve set up a new URL, chipmunk2d.net (currently just a redirect), and a new twitter account @chipmunk2d dedicated to Chipmunk stuff. Follow us and help spread the word!

Unity3D support:

I was never quite sure how to make Chipmunk fit in nicely with Unity, so it’s been an idea that sat on the very back burner. After talking with people at GDC, there definitely seems to be a nice niche market for a good 2D physics engine. It seems a lot of people are frustrated trying to get the built-in 3D physics to do 2D nicely. I think with a bunch of custom editor components we can get something pretty close to the simplicity Unity provides now and then sprinkle in some of our own secret Chipmunk sauce to make it even better.

Chipmunk provides a number of nice features that PhysX does not (or at least are not exposed). Since Chipmunk isn’t doing all that extra work for a dimension that is just being thrown away it’s also *much* faster. So we think we can go simpler, faster, and more flexible. Seems like a win/win/win scenario if you want to do 2D physics games.

Better JS Support:

Last year, Joseph Gentle made a fantastic port of Chipmunk to Javascript. He’s also done a great job of keeping it up to date. I don’t do much Javascript myself, but it’s become so much better recently that I’m excited for the possibilities! I’d love to put some interactive stuff on the Chipmunk website and in the documentation.

Anyway, with Chipmunk-JS becoming a more or less standard part of Cocos2D-JS, I’m going to try and make timely patches for Joseph so he’s not stuck volunteering to port all my changes and fixes.

Simplify Simplify Simplify!

One of my design goals with Chipmunk is to keep the API as simple as it can be without taking away flexibility. I’ve spent a lot of time thinking about how to avoid thresholds, sensitivity to units and other “tuning” factors. I’ve gotten some very nice compliments on its simplicity, so I think I’ve probably done OK in that regard. It could be better though! You shouldn’t need to be a physics engine master to make great games using one. Sometimes I lose sight of what it’s like to be a beginner. I’ve got two particular cases in mind: simplified object creation and a collision-only mode.

Creating a physics object in Unity or Box2D is different than in Chipmunk. You lay out your collision shapes and they guess the mass, center of gravity and moment of inertia for you. This takes away some flexibility, but it’s also vastly simpler for beginners. I have utility functions in Chipmunk to help out with that, but sometimes people don’t care what a moment of inertia is, and they shouldn’t need to know to make a good game. It’s just a barrier that gets in their way.
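For reference, the guessing those engines do boils down to standard formulas. Here’s a sketch of the box and circle cases (illustrative names; Chipmunk ships similar helpers such as cpMomentForBox() and cpMomentForCircle()):

```c
// Moment of inertia for a solid box rotating about its center of gravity.
double MomentForBox(double mass, double width, double height){
    return mass*(width*width + height*height)/12.0;
}

// Moment of inertia for a circle or ring centered on the center of gravity.
// Pass innerRadius = 0 for a solid circle.
double MomentForCircle(double mass, double innerRadius, double outerRadius){
    return mass*(innerRadius*innerRadius + outerRadius*outerRadius)/2.0;
}
```

With formulas like these, an engine can sum the moments of a body’s shapes and spare beginners the term entirely.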

A lot of people want to make a game that needs collision detection, but don’t really care about physics. It’s perfectly possible to do that with Chipmunk now, but it’s certainly not obvious how for a beginner. I could totally build a simpler API on top of Chipmunk to help with this.

Continuous Collision Detection (CCD):

This is related to the previous point. I’ve tried to make Chipmunk’s API as simple as possible and make it so that things “just work” without having to fiddle with thresholds and scales and such. One thing that doesn’t “just work” in Chipmunk is continuous collision detection. I usually sort of brush it off as being unnecessary or usually very easy to work around, which is true, but that misses my own point. Collisions should “just work” and not be something that you need to worry about as a designer. Erin Catto made that point in his GDC talk on collision detection, and it really struck a chord. The 6.2 beta branch is already a big step in the right direction.

I’ve been sitting on some ideas for mixing pseudo-velocities and speculative contacts for some time. I think it could work really well and solve some of the things I didn’t initially like about speculative contacts. If it works well, it will be awesome. :D If not… then I’ll have to go with something more traditional. :(

More Tutorial Content:

Documentation, examples, tutorials… There can never be enough! I try to make as much as I can, but it’s very time consuming to do. I met up with Ray Wenderlich and Rod Strougo (They wrote the book “Learning Cocos2D”) at GDC and they sounded pretty interested in working with us to make more of it and get it out where people can see it. They are pretty active in the books and documentation realms, so I think that can be a great asset.

Great. Can you have it done by Friday?

Oof. So clearly that is a lot of stuff to chew on. Our company mostly supports itself by doing contracting work, but we’ve scheduled a big block of time for enhancing Chipmunk for a while. If there’s something you’re really dying to have in Chipmunk, let me know. It’s good to have some external input when prioritizing.

March 11th, 2013 | Posted by slembcke

There are some notable limitations in Chipmunk’s current code that a lot of people run into. Deep poly/poly or poly/segment collisions could produce a lot of contact points with wacky positions and normals which were difficult to filter sensibly. This was preventing me from implementing the smoothed terrain collision feature that I wanted (to fix the issue where shapes can catch on the “cracks” between the endpoints of segment shapes). Also, until recently line segment shapes couldn’t collide against other line segment shapes even if they had a nice fat beveling radius. They can now thanks to a patch from LegoCylon, but they have the same issues as above, generating too many points with wacky normals. I also had no way to calculate the closest points between two shapes, preventing the implementation of polygon shapes with a beveling radius. Yet another issue is that the collision depth for all of the contacts is assigned the minimum separating distance determined by SAT. This can cause contacts to pop when driven together by a large force. Lastly, calculating the closest points is a stepping stone on the way to proper swept collision support in Chipmunk in the future.

For some time now, I’ve been quietly toiling away on a Chipmunk branch to improve the collision detection and fix all of these issues. I’ve implemented the GJK/EPA collision detection algorithms as well as a completely new contact point generation algorithm. After struggling on and off for a few months with a number of issues (getting stuck in infinite recursion/loops, issues with caching, issues with floating point precision, suboptimal performance, etc… ugh), it finally seems to be working to my expectations! The performance is about the same as my old SAT based collision code, maybe 10% faster or slower in some cases. Segment to segment collisions work perfectly as do beveled polygons. Smoothed terrain collisions are even working, although the API to define them is a little awkward right now.

All collision types.
Beveled line segments colliding with other shapes!

GJK:

The aptly named Gilbert–Johnson–Keerthi algorithm is what calculates the closest points between two convex shapes. My implementation preserves the winding of the vertexes it returns, which helps avoid precision issues when calculating the separating axis of shapes that are very close together or touching exactly. With the correct winding, you can assume the edge you are given lies along a contour in the gradient of the distance field of the Minkowski difference. I’ve also modified the bounding box tree to cache the GJK solution so that the next frame can use it as the starting point. It’s a classic trick that makes GJK generally require only a single iteration per pair of colliding objects.

GJK example
GJK is rather abstract, but it calculates the closest points between two shapes by finding the distance between the Minkowski difference of the two shapes (the red polygon) and the origin (the big red dot). If you look closely, the red shape is a flipped version of the pentagon with the little triangle tacked on to all its corners. It’s one of the coolest and most bizarre algorithms I’ve ever seen. :D I’ll probably make a blog entry about my implementation eventually too.
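The one building block GJK needs from each shape is a support function. Here’s a minimal sketch for convex polygons (illustrative names, not Chipmunk’s actual API): the support point of the Minkowski difference A − B along direction d is the farthest point of A along d minus the farthest point of B along −d.

```c
typedef struct { double x, y; } Vec2;

static double Dot(Vec2 a, Vec2 b){ return a.x*b.x + a.y*b.y; }

// Farthest vertex of a convex polygon along direction d.
static Vec2 Support(const Vec2 *verts, int count, Vec2 d){
    Vec2 best = verts[0];
    for(int i = 1; i < count; i++){
        if(Dot(verts[i], d) > Dot(best, d)) best = verts[i];
    }
    return best;
}

// Support point of the Minkowski difference A - B along d.
static Vec2 MinkowskiSupport(const Vec2 *a, int na, const Vec2 *b, int nb, Vec2 d){
    Vec2 pa = Support(a, na, d);
    Vec2 pb = Support(b, nb, (Vec2){-d.x, -d.y});
    return (Vec2){pa.x - pb.x, pa.y - pb.y};
}
```

GJK never needs the full Minkowski difference polygon, just this one query, which is why it works on any convex shape that can answer it.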

EPA:

EPA stands for Erik-Peterson-Anders… Nah, it stands for Expanding Polytope Algorithm and is a very close cousin to GJK. While GJK can detect the distance between two shapes, EPA is what you can use to find the minimum separating axis when they are overlapping. It’s sort of the opposite of the closest points. It gives you the distance and direction to slide the two shapes to bring them apart (as well as which points on the surface will be touching). It’s not quite as efficient as GJK, and it’s an extra step to run, which has the interesting effect of making collision detection of beveled polygons more efficient than regular hard-edged ones. This is one thing I’m not completely happy with. Polygon-heavy simulations will generally run slower than with 6.1 unless you enable some beveling. It’s not a lot slower, but I don’t like taking steps backwards. On the other hand, it will be much easier to apply SIMD to the hotspots shared by the GJK/EPA code than to my previous SAT based code.

Contact Points:

Having new collision detection algorithms is neat and all, but that didn’t solve the biggest (and hardest) issue: my contact point generation algorithm sucked! Actually, it worked pretty well. It has stayed mostly unchanged for 6 years now, but it has also accumulated some strange workarounds for rare conditions that it didn’t handle well. The workarounds sometimes produced strange results (like too many contact points or weird normals…). They also made it practically impossible to add the smoothed terrain collision feature.

In the new collision detection code, polygon to polygon and polygon to segment collisions are treated exactly the same. They pass through the same GJK/EPA steps and end up being passed to the same contact handling code as a pair of colliding edges. It handles collisions with endcaps nicely, always generates either 1 or 2 contact points for the pair, and uses the minimum separating axis for the normals. It’s all very predictable and made the smoothed terrain collisions go pretty easily. It only took me about 10 iterations of ideas for how to calculate the contact points before I got something I was happy with. -_- It really ended up being much, much harder than I expected for something that seems so simple in concept.

The implementation works somewhat similarly to Erin Catto’s contact clipping algorithm, although adding support for beveling (without causing popping contacts) and contact IDs made it quite a bit harder. There are some pretty significant differences now. Perhaps that is another good post for another day.

Coming to a Branch Near You!

The branch is on GitHub here if you want to take a peek and poke around. It’s been a pretty big change, and hasn’t been extensively tested yet. I wouldn’t recommend releasing anything based on it quite yet, but it should be plenty stable for development work, and should even fix a number of existing issues. I’d certainly appreciate feedback!

January 9th, 2013 | Posted by slembcke

I finally joined the 21st century after buying a smartphone, an iPhone 5. (Woo I guess?) Anyway. One of the only interesting things that changed with the iPhone 5 was the CPU. It’s supposed to be much faster and uses a new instruction set with extra registers for the NEON vector unit. I was curious to see how much faster it was compared to our iPad 2 at running Chipmunk, so I ran the benchmark app and crunched the numbers. It’s certainly a nice leap in performance!

These aren’t conclusive results, but they do paint a nice picture. Normally I run each set a few times and pick the lowest time for each benchmark to remove outliers due to being scheduled on a multitasking OS.

Comparing the iPhone 5 to the iPad 2:

speedup: 2.72x (benchmark - SimpleTerrainCircles_1000)
speedup: 2.60x (benchmark - SimpleTerrainCircles_500)
speedup: 2.50x (benchmark - SimpleTerrainCircles_100)
speedup: 2.68x (benchmark - SimpleTerrainBoxes_1000)
speedup: 2.57x (benchmark - SimpleTerrainBoxes_500)
speedup: 2.55x (benchmark - SimpleTerrainBoxes_100)
speedup: 2.64x (benchmark - SimpleTerrainHexagons_1000)
speedup: 2.63x (benchmark - SimpleTerrainHexagons_500)
speedup: 2.53x (benchmark - SimpleTerrainHexagons_100)
speedup: 2.56x (benchmark - SimpleTerrainVCircles_200)
speedup: 2.62x (benchmark - SimpleTerrainVBoxes_200)
speedup: 2.62x (benchmark - SimpleTerrainVHexagons_200)
speedup: 2.60x (benchmark - ComplexTerrainCircles_1000)
speedup: 2.62x (benchmark - ComplexTerrainHexagons_1000)
speedup: 1.95x (benchmark - BouncyTerrainCircles_500)
speedup: 2.10x (benchmark - BouncyTerrainHexagons_500)
speedup: 1.77x (benchmark - NoCollide)

If you want an idea of what these benchmarks do, the Chipmunk Pro page has little animations of them.

I was also curious how much the extended NEON register set improved performance. It’s been a while, but from what I remember of looking at the disassembled output, the NEON solver in Chipmunk Pro did run out of registers and push values to the stack. So it was possible that the extra registers would be able to speed it up more. It was initially a pain to support armv7s as it was sprung on everybody unexpectedly. All projects were automagically upgraded to build for armv7s when Apple released the new SDK which was annoying for library developers. Having had bad experiences with compiler bugs in Apple’s Clang, I wasn’t willing to release an armv7s build without being able to test it. Anyway, it turns out it was worth the hassle with a decent little performance boost.

The speedups for armv7s vs. armv7 are as follows:

armv7 -> armv7s Speedup:

speedup: 1.23x (benchmark - SimpleTerrainCircles_1000)
speedup: 1.32x (benchmark - SimpleTerrainCircles_500)
speedup: 1.24x (benchmark - SimpleTerrainCircles_100)
speedup: 1.17x (benchmark - SimpleTerrainBoxes_1000)
speedup: 1.17x (benchmark - SimpleTerrainBoxes_500)
speedup: 1.18x (benchmark - SimpleTerrainBoxes_100)
speedup: 1.13x (benchmark - SimpleTerrainHexagons_1000)
speedup: 1.14x (benchmark - SimpleTerrainHexagons_500)
speedup: 1.15x (benchmark - SimpleTerrainHexagons_100)
speedup: 1.23x (benchmark - SimpleTerrainVCircles_200)
speedup: 1.18x (benchmark - SimpleTerrainVBoxes_200)
speedup: 1.15x (benchmark - SimpleTerrainVHexagons_200)
speedup: 1.18x (benchmark - ComplexTerrainCircles_1000)
speedup: 1.13x (benchmark - ComplexTerrainHexagons_1000)
speedup: 1.03x (benchmark - BouncyTerrainCircles_500)
speedup: 1.03x (benchmark - BouncyTerrainHexagons_500)
speedup: 1.00x (benchmark - NoCollide)

That’s about what I expected to see. It was able to speed up a lot of the benchmarks that were using the solver heavily while the last 3 benchmarks that attempted to stress the collision detection performance instead were mostly unaffected. A 10-30% speedup is pretty nice for something as simple as a recompile.

I wonder how this would compare to performance on similar Android devices like the Galaxy S III or Nexus 7. I’ll have to get my hands on one of them and try it.

December 31st, 2012 | Posted by Andy Korth

I promise we’re not dead, I’ve just been working on the New Year App Blowout. This year’s sale went great. I had all the spots filled by mid-December, and we got some really great high-profile apps.



The Fifth Annual New Year App Blowout
We’ve already had some press mentions, we’re looking forward to more!

July 3rd, 2012 | Posted by Andy Korth

We’ve got a handful of small news items that weren’t big enough for their own posts, so now we’ll catch you up on some Chipmunk and Howling Moon news!

Bobo Explores Light… and wins an Apple Design Award in the process! This is a great presentation/book game that uses Chipmunk for all sorts of interactive elements. It’s aimed towards children and does an amazing job demonstrating how a digital book can offer rich interactive learning opportunities.

AtPlayMusic Recorder is an educational app we developed for a local client that uses the iPad to teach children to play the recorder, and features some pretty cool interactive tech. The iPad listens to the student play, and the app also features interactive games mixed in with an animated instructional lesson. Although it doesn’t use Chipmunk, we’re pretty proud of our technical contributions, such as an efficient avatar creation system, shaders for recoloring the avatar’s clothing, and microphone capabilities.

Jump Dewds is the latest game to come out of our friends at Graveck. If you’re looking for some free-to-play coin-grabbin’, lava-dodging fun stuffed with a quirky cast of collectible dewds… check it out.

And in the Chipmunk world, famous Cocos2D guy Birkemose has been using the new deformable terrain stuff and making some really amazing demos. I’ll show off the flashy video here:

He explains the drawing technique used in the above video on his blog. But he’s also got an explanation and an example project for a simpler version.

Finally, The Wind in the Willows is another eBook that uses Chipmunk for many interactive elements.

May 29th, 2012 | Posted by slembcke

Framerate Independent Interpolation:

So one of my favorite math tricks is simple damping. Say you have a moving object that you want to apply some drag to. The easy way to do it is to multiply the velocity of the object by a number slightly less than 1.0 each frame:

speed = speed*0.99

Despite the simplicity, this is a pretty good approximation of drag. The faster the object is moving, the faster it will slow down. Using a similar method, you can smooth out noisy input data or target positions for an object to follow:

value = lerp(value, targetValue, 0.1)

Each time this is called, value will move closer to targetValue, and the greater the difference, the faster it will converge.

BUT!

Both of these suffer from a major problem. The damping value used is highly dependent on the framerate. If you call it once per frame and the framerate slows way down, so will the movement. That’s not good at all! You can always call it at a fixed rate, but that’s not always convenient. Fortunately, with a little extra math you can fix this problem completely.

value = lerp(value, targetValue, 1.0 - pow(0.9, dt))

Where ‘dt’ is the time that has passed since you last called it, and 0.9 is the fraction of the remaining distance left after one second. Before you were simply moving 10% of the way towards the target value each frame; now the gap to the target shrinks by 10% per second no matter how often you call it, so it won’t get messed up when your framerate fluctuates.

Logarithmic Interpolation:

Another related trick that I really like is logarithmic interpolation. A mistake that I often see in games is when they do something like this:

scale = lerp(scale, targetScale, 0.1)

Scale, zooming, and sound frequencies are not really perceived as linear quantities, and lerping between them will just seem off. The problem gets worse the greater the ratio between scale and targetScale. The correct way to interpolate such values is to do it logarithmically. Basically, like this:

scale = exp(lerp(log(scale), log(targetScale), 0.1))

Use linear interpolation on the values after converting them to logarithmic values, then use exp() to convert the logarithmic value back. That’s a lot of expensive math functions to call however! If you remember your log and power rules, you can simplify that to a single pow() call instead:

scale = scale*pow(targetScale/scale, 0.1)

You can combine this with the framerate independence trick from above too. Add these tricks to your math library and give them a whirl.

May 17th, 2012 | Posted by Andy Korth

This is a reprint of an article we wrote a few months ago for Game Physics Mag.

“Ninety percent of your time is spent in ten percent of your code”, as the old adage goes, so choosing the proper algorithms for that 10% of your game’s code can be key. In the realm of collision detection, a good broadphase algorithm can be vital for keeping a game with a lot of moving objects running in real time. A lot of game developers treat this as some sort of voodoo that only middleware can provide for them, but this isn’t the case. A good collision detection broadphase algorithm doesn’t need to be difficult or hopelessly complicated. In some cases, writing your own can be very beneficial when you run into performance problems. If you wrote it and know how it works, you can fix it yourself without needing to guess.

Collision detection broadphase:

Determining if two shapes intersect can become very computationally expensive, increasingly so based on their complexity. Why waste this computation time when they might be on completely opposite sides of the screen anyway? Broadphase algorithms, which cheaply return lists of possible collisions, alleviate this problem. They usually make as few comparisons as possible based on simplified bounding volumes (such as axis-aligned bounding boxes) to eliminate obvious non-collisions. Then you can apply the much more expensive collision check on a significantly smaller list of potential collisions. This article will discuss two simple broadphase algorithms, each with different strengths and weaknesses. Collision detection of specific kinds of shapes (polygons, circles, etc) and collision response is beyond the scope of this article, although many other helpful resources in that area can be found online.

The simplest possible broadphase is to keep a list of all the shapes in your game and check each one against all the others. It’s a naive, brute force algorithm, and it won’t scale to large numbers of shapes, but don’t completely discount it. If you only have a couple dozen interactive objects in your game, a simple list will perform just fine and is very, very easy to implement. In an Asteroids-like game, because only the bullets collide with the asteroids and the asteroids with the ship, you can already skip most of the potential collision pairs. A broadphase wouldn’t help your performance much in that case. Until you profile and identify a performance problem, don’t spend time doing something you don’t need. It should generally be easy to plug a broadphase into your API after the fact, once you determine that it is necessary.
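To make the baseline concrete, the whole brute force broadphase fits in a few lines (a sketch; the names are illustrative):

```c
typedef struct { double xmin, ymin, xmax, ymax; } AABB;

static int AABBOverlap(AABB a, AABB b){
    return a.xmin <= b.xmax && b.xmin <= a.xmax
        && a.ymin <= b.ymax && b.ymin <= a.ymax;
}

typedef void (*CollisionCallback)(int i, int j);

// O(n^2) comparisons, but perfectly fine for a couple dozen objects.
static void BruteForce(const AABB *boxes, int count, CollisionCallback callback){
    for(int i = 0; i < count; i++){
        for(int j = i + 1; j < count; j++){
            if(AABBOverlap(boxes[i], boxes[j])) callback(i, j);
        }
    }
}
```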

Single Axis Sweep

Single axis sweep is a very simple but effective broadphase algorithm. It is very easy to implement and is well suited for many types of 2D games, especially those where most of the action is spread out across a single axis. A horizontally scrolling motorbike game or a vertically scrolling space shooter are particularly good candidates. A similar algorithm called sweep and prune can efficiently operate on full 2D and 3D data, but is more complicated to implement.

Starting at the blue dot, sweep to the right until you hit the blue dotted line. The red dot lies within this range so blue and red may be colliding. Continuing on to the red dot and sweeping right until the red dotted line you find the yellow dot. Yellow and red should be checked for collisions. When sweeping the yellow and green ranges, you find no more dots, so there are no more potential collisions to report.

To implement single axis sweep, axis aligned bounding boxes are computed for each element under consideration, and then sorted by their minimum bound on one of the axes. Now with the sorted list of shapes, the collision detection is just a simple set of for loops.

static void
Sweep(List objects, CollisionCallback callback)
{
    // Sort the objects by their minimum bounds on one of the axes
    // Insertion sort is recommended as the list should already be mostly sorted from the last frame.
    SortByXMin(objects);

    int count = objects.count;
    for(int i = 0; i < count; i++){
        Object object = objects[i];
        float max = object.aabb.xmax;

        // 'object' overlaps later objects in the list until 'objects[j].aabb.xmin' is greater than 'max'
        for(int j = i+1; j < count && objects[j].aabb.xmin < max; j++){
            // 'object' and 'objects[j]' may be colliding.
            // Use the callback to pass the pair of objects back to the game's code for a better check.
            callback(object, objects[j]);
        }
    }
}

Additionally, because the shapes generally don’t move very far from frame to frame, their sorting order doesn’t change much either, if at all. This is known as temporal coherence, and it can really speed up your collision detection if you can take advantage of it. In the case of single axis sweep, all we need to do is select a sorting algorithm such as insertion sort that performs best on nearly sorted lists.
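An insertion sort that takes advantage of this is only a few lines. It’s sketched here on bare xmin values for clarity; a real version would sort the object list by each object’s aabb.xmin:

```c
// Insertion sort: O(n) on a nearly sorted list, which is exactly what
// temporal coherence gives you from one frame to the next.
static void SortByXMin(double *xmins, int count){
    for(int i = 1; i < count; i++){
        double key = xmins[i];
        int j = i - 1;
        // Shift larger entries right until 'key' is in place.
        while(j >= 0 && xmins[j] > key){
            xmins[j + 1] = xmins[j];
            j--;
        }
        xmins[j + 1] = key;
    }
}
```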

In the average case when you have smoothly moving objects mostly spread out over a single axis, the algorithm runs in O(n) time when using insertion sort, which is very good. However, if your objects aren’t well spread out, or they don’t move smoothly and their position is unrelated to the previous frame’s position, it runs in O(n^2) time which is no better than the naive brute force algorithm.

Another advantage of single axis sweep is that it can be trivially parallelized. This works especially well for implementing a GPGPU accelerated broadphase, where it’s better to be simple and wasteful instead of complex. The GPU gives you a massive amount of processing power for simple parallelizable tasks, but works very poorly when given complex tasks that require a lot of synchronization. This approach has been mentioned in a number of GPGPU papers.

However, an issue that you might run into with single axis sweep is that it can’t efficiently be used for positional queries from your game code. Even with the sorted list, you still need to check against every object in the list for a random query when doing a raycast or an explosion in your game. To perform such queries efficiently, you need a spatial index algorithm. It’s also not optimal when you are comparing a small number of moving objects to a large number of stationary objects. You’ll spend a lot of time resorting the stationary objects even though they never move.

Spatial indexing algorithms:

Spatial indexing algorithms focus on finding objects based on positional queries, much like how a database uses an index to quickly find records based on a search query. To use a spatial index as a collision detection broadphase, you run a query for each object to find the other objects around it. A good spatial index has many uses beyond just collision detection though! AI will frequently query for nearby entities or use raycasts for line of sight checks when making behavioral decisions. One time events such as explosions may occur where you need to efficiently find which objects are within the blast range. Another benefit is that you can split up your moving and stationary objects into separate indexes. The index of stationary objects would only need to be updated when one was added, moved or destroyed. Compare this to a broadphase which can only give you pairs of colliding objects for one large list.

When working with a spatial index, there are a few core operations: insertion, removal, querying, and reindexing. When using a spatial index as a broadphase, it’s also useful to combine reindexing and querying for all collisions into one operation. An advantage of this reindex/query operation is that you can make it avoid returning duplicate collision pairs. If shapes A and B are colliding, a query using A’s bounding box will return shapes A and B, and a query using B’s bounding box will also return A and B. The list of collision pairs you get would be (A, A), (A, B), (B, A), and (B, B), but you really only want (A, B). Normally you’d have to filter these out afterwards.

Simple Grids

Placing your game entities into a grid is a natural and easy way of indexing them. Visualize drawing a grid over your world that maps onto a 2D or 3D array. Each grid cell stores references to shapes if the AABB of the shape intersects the grid cell. Tile maps can be thought of as a simple grid, although they usually have a single tile per grid cell. In general though, you can have many shapes overlapping a single grid cell, and a single shape can overlap many grid cells. While there are a lot of variations to grid algorithms, we’ll walk through one simple and flexible implementation.

In a game that takes place on a single screen or on a fixed size map, your grid size matches the size of your world. The size of the cells within the grid should match the size of the objects in your game. The broadphase search becomes quite simple, since you’re just checking each cell for multiple objects.

With a grid, you store lists of overlapping shapes into each cell. The purple and orange grid cells show places where multiple shapes have been inserted into a cell. The shapes in these cells may be colliding and should be checked closer.

Time for some pseudo code! First the simple data structure for the grid.

SimpleGrid {
    // Size (width and height) of each square grid cell.
    float cellSize;

    // Dimensions of the grid in cells.
    int width, height;

    // 2D array mapping each cell to the list of proxies overlapping it.
    List array[][];

    // Hash table mapping objects to their proxies.
    HashTable objectsProxies;
}

Instead of storing references to the objects directly in the grid’s array, we’ll create proxies. We’ll perform simple retain counting on them (in non-GC languages) and keep a stamp to avoid duplicate results when querying. The reason for this will be explained later.

GridProxy {
    // The object this proxy is for.
    Object object;

    // Used during query operations to avoid duplicate results.
    int queryStamp;

    // Use retain counting on the handle so you know when it can be released.
    int retainCount;
}
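The retain/release helpers used in the samples below aren't spelled out in the article; a minimal sketch in the same pseudocode style might look like this (assuming a new proxy starts with one reference, owned by the grid's object-to-proxy hash table):

```
GridProxy GridProxyNew(Object object)
{
    GridProxy proxy = alloc(GridProxy);
    proxy.object = object;
    proxy.queryStamp = 0;

    // One reference owned by the grid's object -> proxy hash table.
    proxy.retainCount = 1;
    return proxy;
}

void GridProxyRetain(GridProxy proxy)
{
    proxy.retainCount++;
}

void GridProxyRelease(GridProxy proxy)
{
    proxy.retainCount--;

    // Freed only once the hash table and every cell have released it.
    if(proxy.retainCount == 0) free(proxy);
}
```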

Let’s start with the code for the insert operation.

void SimpleGrid_Insert(SimpleGrid grid, Object object)
{
    AABB aabb = object.aabb;
    GridProxy proxy = GridProxyNew(object);
    HashTableSet(grid.objectsProxies, object, proxy);

    // Convert the object's AABB to integer grid coordinates.
    // Objects outside of the grid are clamped to the edge.
    float cellSize = grid.cellSize;
    int xmin = max(floor(aabb.xmin/cellSize), 0);
    int xmax = min(floor(aabb.xmax/cellSize), grid.width - 1);
    int ymin = max(floor(aabb.ymin/cellSize), 0);
    int ymax = min(floor(aabb.ymax/cellSize), grid.height - 1);

    // Loop over the cells and insert the proxy into their lists.
    for(int y = ymin; y <= ymax; y++){
        for(int x = xmin; x <= xmax; x++){
            List proxiesInCell = grid.array[x][y];
            List_Append(proxiesInCell, proxy);

            // Retain the proxy for each cell it's inserted into.
            GridProxyRetain(proxy);
        }
    }
}

Removal from a grid can be tricky, since an object can be in several grid cells. There are several approaches, such as iterating over all the grid cells and removing the object from each one it’s present in. Fortunately, with the proxies, removal becomes very simple. All you need to do is invalidate the proxy by setting its object to NULL. Don’t forget to release the proxy as well, so that it won’t be leaked the next time the grid is reindexed.

void SimpleGrid_Remove(SimpleGrid grid, Object object)
{
    GridProxy proxy = HashTableGet(grid.objectsProxies, object);
    HashTableRemove(grid.objectsProxies, object);

    // Set the object to NULL to mark the proxy as invalid.
    proxy.object = NULL;

    GridProxyRelease(proxy);
}

While there are a number of queries that you might want to implement (point, line, AABB, etc.), the AABB query is a good starting place. Much of it should look familiar from the SimpleGrid_Insert() function. The first thing to do is increment the grid’s query stamp, which uniquely identifies the query. All proxies that are visited get marked with this stamp value. It’s very likely that you’ll encounter the same proxy several times in neighboring cells; the stamp is how you know to report each object only once per query.

void SimpleGrid_AABBQuery(SimpleGrid grid, AABB aabb, Function callback)
{
    // Increment the grid's queryStamp so it's unique.
    grid.queryStamp++;

    // Convert the query's AABB to integer grid coordinates.
    // Coordinates outside of the grid are clamped to the edge.
    float cellSize = grid.cellSize;
    int xmin = max(floor(aabb.xmin/cellSize), 0);
    int xmax = min(floor(aabb.xmax/cellSize), grid.width - 1);
    int ymin = max(floor(aabb.ymin/cellSize), 0);
    int ymax = min(floor(aabb.ymax/cellSize), grid.height - 1);

    for(int y = ymin; y <= ymax; y++){
        for(int x = xmin; x <= xmax; x++){
            List proxiesInCell = grid.array[x][y];

            for(GridProxy proxy in proxiesInCell){
                Object object = proxy.object;

                // Check that the proxy hasn't already been processed by this query in a neighboring cell.
                // Also check that the proxy's object hasn't been removed.
                if(proxy.queryStamp != grid.queryStamp && object != NULL){
                    // Stamp the proxy so it won't be visited again.
                    proxy.queryStamp = grid.queryStamp;
                    callback(object);
                }
            }
        }
    }
}

Reindexing a grid basically involves temporarily removing all of the objects, clearing the grid, and then reinserting the objects. A good first optimization is to reuse the proxy objects, though that was omitted here to keep the sample code short.

void SimpleGrid_Reindex(SimpleGrid grid)
{
    List objects = HashTableKeys(grid.objectsProxies);

    // Remove all the objects. We'll re-add them later.
    for(Object object in objects){
        SimpleGrid_Remove(grid, object);
    }

    // Clear out all the cells
    // This gets slower the bigger your grid is.
    for(int y = 0; y < grid.height; y++){
        for(int x = 0; x < grid.width; x++){
            List proxiesInCell = grid.array[x][y];
            for(GridProxy proxy in proxiesInCell){
                GridProxy_Release(proxy);
            }

            // Reset the list for that cell
            grid.array[x][y] = EmptyList;
        }
    }

    // Re-add the objects.
    for(Object object in objects){
        SimpleGrid_Insert(grid, object);
    }
}

Lastly, it was mentioned that a combined reindex/query operation is useful. This is easy enough to implement by performing a query immediately before inserting each object during the reindexing operation. This way an object never gets itself back from querying its own bounding box. Also, because the objects are inserted and queried incrementally, you won’t get both an (A, B) and a (B, A) collision pair. You can even combine the insert and query operations together to avoid iterating over the same grid cells separately for each.
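As a rough sketch reusing the functions above (with the pair callback written as an anonymous function for brevity), the combined operation might look like:

```
void SimpleGrid_ReindexQuery(SimpleGrid grid, CollisionCallback callback)
{
    List objects = HashTableKeys(grid.objectsProxies);

    // Remove all the objects and clear the cells, as in SimpleGrid_Reindex().
    // ...

    for(Object object in objects){
        // Query *before* inserting: 'object' can never return itself, and
        // each pair is reported once, as (A, B) but never again as (B, A).
        SimpleGrid_AABBQuery(grid, object.aabb, function(Object other){
            callback(object, other);
        });

        SimpleGrid_Insert(grid, object);
    }
}
```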

An advantage of a grid based index is the ease of efficiently running queries. Point queries simply need to check the cell that contains that point, and a raycast implementation only needs to check grid cells the ray travels through, much like Bresenham’s line drawing algorithm. Unlike the single axis sweep, the grid is a full spatial index that can be efficiently queried by the rest of the game code.
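For example, a point query against the simple grid only has to scan the single cell containing the point (AABBContainsPoint() is an assumed helper for the bounding box check, not shown above):

```
void SimpleGrid_PointQuery(SimpleGrid grid, Point point, Function callback)
{
    // Only the single cell containing the point needs to be checked,
    // so no query stamp is needed to filter out duplicates.
    float cellSize = grid.cellSize;
    int x = min(max(floor(point.x/cellSize), 0), grid.width - 1);
    int y = min(max(floor(point.y/cellSize), 0), grid.height - 1);

    for(GridProxy proxy in grid.array[x][y]){
        Object object = proxy.object;

        // Skip proxies whose object has been removed.
        if(object != NULL && AABBContainsPoint(object.aabb, point)){
            callback(object);
        }
    }
}
```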

Grids do have a couple of problems though. Large worlds require a large grid, and thus a large 2D or 3D array to store it, which introduces potential memory problems. Infinitely sized procedural worlds are increasingly popular, and they raise the question of what to do when the grid’s array grows unreasonably large. Additionally, a simple grid can be wasteful, since it often ends up almost entirely empty. At a minimum, each empty grid cell still uses at least as much memory as a pointer. Imagine a 1000×1000 grid with 5000 objects inserted into it: even if each object were inserted into 4 cells, at most 2% of the cells would be used. Fortunately, these concerns are addressed by the spatial hash algorithm.

Spatial Hash:

Spatial hashes are a special type of grid index that uses memory more efficiently. Instead of creating a large multidimensional array, you create a small 1D array and semi-randomly map the grid cells onto it using a hashing function. This has a lot of benefits. You no longer need to think about grid bounds or size: any grid cell can be mapped onto the hash table, limited only by the size of your integer type. Sparsely populated grids won’t eat up your memory either; if the array is mostly empty, you can just make it smaller. This also helps when you want to clear the grid, as a smaller array can be cleared faster.

There are a couple of minor downsides however. Because many grid cells can be mapped onto the same array location, it’s possible for two objects that are far apart to be returned as a false positive by the spatial hash. However, this is fairly rare and can be quickly rejected by a bounding box check. Another issue is that it’s possible for a single object to be inserted into the same array index more than once from two separate grid cells. You’ll either need to catch this condition, or make sure that you never assume that the list for a cell contains only unique entries.

What makes a good hashing function? A poor hashing function could result in an uneven distribution of objects into the array, negating some of the benefits. On the other hand, the hashing function may easily be called hundreds of thousands of times per second, so it should be fairly efficient. In Chipmunk, I use the following hash function in conjunction with prime sized hash tables. This has proven to provide a reasonable level of performance.

int HashFunc(int x, int y, int tableSize){
    // Multiplying the coordinates by large primes, XORing them together, and then
    // modding by a prime table size makes a pretty good hash function.
    // The arithmetic is done as unsigned so that negative cell coordinates
    // can't produce a negative remainder.
    return (int)(((unsigned)x*1640531513u ^ (unsigned)y*2654435789u) % (unsigned)tableSize);
}

Implementing the insert, remove, query, and reindex operations for a spatial hash is almost identical to the simple grid versions. You can remove the min/max clamping when calculating the grid coordinates, as it’s no longer needed to protect against array indexing errors. Instead of accessing the grid’s array as a 2D array, you just swap in the hash function.

for(int y = ymin; y <= ymax; y++){
    for(int x = xmin; x <= xmax; x++){
        List proxiesInCell = grid.array[HashFunc(x, y, grid.arraySize)];
        // ...
    }
}

The efficiency of the spatial hash and simple grid depends on the grid cell size roughly matching the sizes of the shapes being put into it. If your grid cells are much smaller than your objects, each object will overlap many cells, so you’ll be checking and inserting into a lot more cells than you need to. If your grid cells are much larger than your objects, you might end up with a lot of non-colliding objects in a single cell, generating a lot of false positives. Adjust your cell size to be comparable to the size of a typical object. For this reason, grids work best when all your objects are roughly the same size.
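One rough heuristic for picking a starting cell size (our own suggestion, not something from Chipmunk) is to average the dimensions of the shapes being indexed and tune from there:

```
// Pick a starting cell size comparable to the average object size.
float SuggestedCellSize(List objects)
{
    float sum = 0.0;
    for(Object object in objects){
        AABB aabb = object.aabb;
        sum += max(aabb.xmax - aabb.xmin, aabb.ymax - aabb.ymin);
    }
    return sum/objects.count;
}
```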

The grid cell size is much too big. Too many shapes are inserted into each cell, and each cell requires n^2 collision checks. The grid cells are the grey squares; a darker color means there are more shapes in a cell.
This grid cell size is much too small. Each shape is inserted into many grid cells. Notice the speckles between shapes. These are grid cells that map to the same hash table index as another grid cell where a shape was inserted.
The size of the grid cells is just right.

Spatial hashes require an additional tuning parameter compared to a simple grid: the hash table size. When working with a grid, you simply need to worry about making its dimensions big enough to fit all your objects and then tuning the cell size. With a spatial hash, you must also tune the size of the hash table itself. If the hash table size is too small, you will get many false positives because too many grid cells that are far apart will map onto the same hash table index. If the hash table size is too big, you’ll just be wasting memory and making clearing operations take longer. You’ll have to tune this by hand a little, but a good rule of thumb is that your hash table should be about 10x bigger than the number of objects you insert into it.

When simple grids and spatial hashes are properly tuned, they have an O(n) running time to find all the collisions. This is very good, and means that they scale very well to huge numbers of objects. You could easily expect to collide 10k to 20k shapes at 60 Hz on a modern desktop CPU. Despite the linear running time, grids and spatial hashes have a high constant cost because of the number of memory operations needed to access each grid cell. For smaller numbers of shapes, or shapes that aren’t all about the same size, there may be better-suited spatial indexes.

Conclusion:

So now that you’ve seen that spatial index and collision detection broadphase algorithms don’t have to be hard, I hope you’ll try writing some of your own! It’s a very interesting topic with a wide variety of ideas that has filled many books. If you are interested in learning more, tree-based indexes such as quadtrees, k-d trees, and axis-aligned bounding box trees are a good next step.

Scott Lembcke and Andy Korth have been developing games for the last four years. Scott wrote the Chipmunk Physics engine, and together they have offered game physics consulting services, iOS development, and Unity development services. They are located in Minnesota, USA.

References:

“Optimized Spatial Hashing for Collision Detection of Deformable Objects”

http://graphics.ethz.ch/Downloads/Publications/Papers/2003/Tes03/Tes03.pdf

May 9th, 2012 | Posted by slembcke
Categories: Development, News | Tags: , , , ,


It has been a few months since Chipmunk 6.0.3 was released! We’ve had a lot of contracting work to distract us from core Chipmunk development lately. Even so, we’ve been cooking up some new features and now I’m currently working full time putting the final touches on 6.1. So what’s new to get excited about? Here are a lot of words!


Approximate Concave Decomposition (Pro only):



One of the features that has been missing from Chipmunk Pro’s autogeometry is some form of concave decomposition. A lot of people use a triangulation algorithm to solve this. Unfortunately, while triangulation is great for creating perfectly decomposed triangle meshes for rendering, and is pretty easy to implement, it doesn’t work very well for generating collision detection shapes. It tends to create more shapes than you want or need, and can produce very small or thin shapes. This can clog up the spatial index, giving you poor overall performance, and cause other frustrating artifacts in the simulation quality.

Instead of triangulation, I’ve been working on approximate concave decomposition (ACD) for Chipmunk Pro. This means that you ask Chipmunk to decompose a concave polygon with a distance tolerance, and it will give you back one or more convex polygons that match the shape of the original to within the tolerance. The ACD algorithm tries to match the most important features of the input first to create a high quality decomposition. Also, because it can stop once it satisfies the tolerance, the performance can be good enough to use even at runtime. If you want a perfect decomposition, you can still use a tolerance of 0.0, although this can take more time and produce more shapes than you might want.

The ACD algorithm shipping with 6.1 isn’t perfect though. I’m still improving its splitting plane heuristic to improve how it cuts the polygons (I’m not too happy with the ‘H’ in the image above -_- ), and it doesn’t yet have options to detect or process polygons with holes in them (like the ‘P’). Lastly, if you run ACD on a self-intersecting polygon, it might over-simplify it because it detects the polygon as having a negative tolerance.

Because we are still working on the quality of the algorithm, and it’s missing the ability to work with holes, we’ve marked it as beta. It’s usable and stable, but its performance and quality might not yet be as good as some people want. I didn’t want to hold up the release longer while I did more research for this, but at the same time I didn’t want to leave it in an unusable state.


Multithreaded Solver (Pro only):

This has been in the pipeline for a long time! There has been early access to this available through the Chipmunk Pro git repository for some time, but the next release will make it official. This complements the ARM NEON optimized solver and works on any platform that supports pthreads. This is an easy way to take advantage of those extra cores that now ship on everything from desktop to mobile machines. The Chipmunk Pro showcase app on the App Store uses the multi-threaded solver and enables a lot of extra objects to simulate on the iPad 2 and iPhone 4s. Check it out!


Convex Hulls:



Since the beginning of time (or like 2006… whatever), Chipmunk has required that you carefully provide it with convex polygon data with a specific winding. Every good graphics API does this too, so people know to expect this sort of thing, right? Nope. I talked with Erin Catto for a while at GDC and he mentioned that the winding and concavity of polygon shapes was one of his biggest support issues, and that he recently added a convex hull algorithm so that it would “just work”. I’ve had assertions to ensure convexity and winding, but Erin’s solution was better. 2D game programmers often aren’t graphics experts, and they shouldn’t need to be. I’ve had a nice QuickHull implementation sitting around for some time, so I decided to make it more generic and include it with Chipmunk. Nobody will ever have to bother with concavity or winding ever again. \o/


Nearest Point Queries:



The main inspiration for the original point query feature in Chipmunk was selecting shapes with the mouse. Using a single point made great sense for this, as mice are very precise devices. On touchscreen devices it’s not perfect: using a single point to represent a big fat finger doesn’t work very well. It’s very frustrating to try to grab a small shape with your finger only to fail repeatedly. I’ve seen some people doing clever sampling tricks to try and get around it, but we can do better! Nearest point queries are the next logical step. Instead of just a yes/no answer to whether a point is inside a shape, nearest point queries give you the nearest point on the surface of a shape as well as the distance to it (or the depth, if the point is inside). Now when you tap a big fat finger on the screen, you can trivially find the closest shape within a certain radius. The ChipmunkMultiGrab class in Chipmunk Pro has already been updated to support this.


Block Iterators:

Almost every language in existence today other than C/C++ has some sort of concept of closures or anonymous functions. One of their uses is to allow you to make some really slick, simple APIs for iterating complex data structures. You just pass a chunk of code to the thing you want to iterate and it will call your code for each object. This is why Chipmunk has so many function based iterators (cpSpaceEachBody(), cpBodyEachShape(), cpSpaceSegmentQuery(), etc.). Exposing the underlying data structures would be annoying for the user, as some of them aren’t simple, and it would mean I couldn’t change them later if I needed to.

Even if you are using Chipmunk from C or C++, most compilers provide some extensions to make your life easier. GCC supports nested functions, and Clang supports blocks. I’ve often encouraged people to use them, as the iterator functions were intended to be used with language features like these. I started making sample code for a tutorial showing how to use them with Clang blocks, but then decided that I might as well just put that code into the official Chipmunk API instead.

Without the block based iterators, you write code like this:


// Define your callback function somewhere like this:
static void
ScaleIterator(cpBody *body, cpArbiter *arb, cpVect *sum)
{
	(*sum) = cpvadd(*sum, cpArbiterTotalImpulseWithFriction(arb));
}

// ... and then somewhere else use it like this:
cpVect impulseSum = cpvzero;
cpBodyEachArbiter(scaleStaticBody, (cpBodyArbiterIteratorFunc)ScaleIterator, &impulseSum);

With the new block based iterators, it’s much nicer. You can keep the code all in one place:


__block cpVect impulseSum = cpvzero;
cpBodyEachArbiterBlock(scaleStaticBody, ^(cpArbiter *arb){
	impulseSum = cpvadd(impulseSum, cpArbiterTotalImpulseWithFriction(arb));
});