Chipmunk Pro performance on the iPhone 5

January 9th, 2013| Posted by slembcke
Categories: Random | Tags: , ,

I finally joined the 21st century after buying a smartphone, an iPhone 5. (Woo I guess?) Anyway. One of the only interesting things that changed with the iPhone 5 was the CPU. It’s supposed to be much faster and uses an new instruction set with extra registers for the NEON vector unit. I was curious to see how much faster it was compared to our iPad 2 at running Chipmunk so I ran the benchmark app and crunched the numbers. It’s certainly a nice leap in performance!

These aren’t conclusive results, but they do paint a nice picture. Normally I run each set a few times and pick the lowest time for each benchmark to remove outliers due to being scheduled on a multitasking OS.

Comparing the iPhone 5 to the iPad 2:

speedup: 2.72x (benchmark - SimpleTerrainCircles_1000)
speedup: 2.60x (benchmark - SimpleTerrainCircles_500)
speedup: 2.50x (benchmark - SimpleTerrainCircles_100)
speedup: 2.68x (benchmark - SimpleTerrainBoxes_1000)
speedup: 2.57x (benchmark - SimpleTerrainBoxes_500)
speedup: 2.55x (benchmark - SimpleTerrainBoxes_100)
speedup: 2.64x (benchmark - SimpleTerrainHexagons_1000)
speedup: 2.63x (benchmark - SimpleTerrainHexagons_500)
speedup: 2.53x (benchmark - SimpleTerrainHexagons_100)
speedup: 2.56x (benchmark - SimpleTerrainVCircles_200)
speedup: 2.62x (benchmark - SimpleTerrainVBoxes_200)
speedup: 2.62x (benchmark - SimpleTerrainVHexagons_200)
speedup: 2.60x (benchmark - ComplexTerrainCircles_1000)
speedup: 2.62x (benchmark - ComplexTerrainHexagons_1000)
speedup: 1.95x (benchmark - BouncyTerrainCircles_500)
speedup: 2.10x (benchmark - BouncyTerrainHexagons_500)
speedup: 1.77x (benchmark - NoCollide)

If you want an idea of what these benchmarks do, the Chipmunk Pro page has little animations of them.

I was also curious how much the extended NEON register set improved performance. It’s been a while, but from what I remember of looking at the disassembled output, the NEON solver in Chipmunk Pro did run out of registers and push values to the stack. So it was possible that the extra registers would be able to speed it up more. It was initially a pain to support armv7s as it was sprung on everybody unexpectedly. All projects were automagically upgraded to build for armv7s when Apple released the new SDK which was annoying for library developers. Having had bad experiences with compiler bugs in Apple’s Clang, I wasn’t willing to release an armv7s build without being able to test it. Anyway, it turns out it was worth the hassle with a decent little performance boost.

The speedups for armv7s vs. armv7 are as follows:

armv7 -> armv7s Speedup:

speedup: 1.23x (benchmark - SimpleTerrainCircles_1000)
speedup: 1.32x (benchmark - SimpleTerrainCircles_500)
speedup: 1.24x (benchmark - SimpleTerrainCircles_100)
speedup: 1.17x (benchmark - SimpleTerrainBoxes_1000)
speedup: 1.17x (benchmark - SimpleTerrainBoxes_500)
speedup: 1.18x (benchmark - SimpleTerrainBoxes_100)
speedup: 1.13x (benchmark - SimpleTerrainHexagons_1000)
speedup: 1.14x (benchmark - SimpleTerrainHexagons_500)
speedup: 1.15x (benchmark - SimpleTerrainHexagons_100)
speedup: 1.23x (benchmark - SimpleTerrainVCircles_200)
speedup: 1.18x (benchmark - SimpleTerrainVBoxes_200)
speedup: 1.15x (benchmark - SimpleTerrainVHexagons_200)
speedup: 1.18x (benchmark - ComplexTerrainCircles_1000)
speedup: 1.13x (benchmark - ComplexTerrainHexagons_1000)
speedup: 1.03x (benchmark - BouncyTerrainCircles_500)
speedup: 1.03x (benchmark - BouncyTerrainHexagons_500)
speedup: 1.00x (benchmark - NoCollide)

That’s about what I expected to see. It was able to speed up a lot of the benchmarks that were using the solver heavily while the last 3 benchmarks that attempted to stress the collision detection performance instead were mostly unaffected. A 10-30% speedup is pretty nice for something as simple as a recompile.

I wonder how this would compare to performance on similar Android devices like the Galaxy S III or Nexus 7. I’ll have to get my hands on one of them and try it.

Comments are closed.