Performance notes

 from Red Blob Games
12 Jan 2021

Table of Contents

I normally don’t want to worry about performance but I did take a detour to look at this problem. The main reason is that the performance affects the set of questions I will ask. If I can simulate something in 1 minute, I might be willing to simulate hundreds of scenarios, but non-interactively. If i can simulate something in 1 second, I could either simulate tens of thousands of scenarios non-interactively, or maybe one scenario interactively. If I can simulate something in 1 millisecond, I might simulate hundreds or maybe thousands of scenarios interactively. It’s nice to be able to set the parameters interactively and see the results, but that shows one at a time. To understand the space of possible solutions, I want to see lots of results simultaneously. See Bret Victor’s page about the Ladder of Abstraction[1]. So although I’m not optimizing, I am wanting to get a sense of what is possible to simulate in a reasonable amount of time.

I started with a baseline: Odex.js[2] is a differential equation solving library. Mike Bostock’s predator-prey uses it. I assumed it is accurate (see the accuracy page) and then wanted to compare different approximations to get a feel for the accuracy vs speed tradeoff.

 1  Algorithms#

The first code I wrote was the naive straightforward “add up the values” integration: x += dx. I think everyone who does numerical integration knows this doesn’t work well, but it’s been so long that I had forgotten how bad Euler’s method was.

But it’s fast.

On the accuracy page you can see that it’s not stable at all.

The next step up is to use Heun’s method. This is fairly stable at least on this problem. And … it’s fast. I tried on my old (m0) and new (m1) machines in Chrome:

Algorithm First run, m1 40 runs, m1 First run, m0 40 runs, m0
Odex 112ms 3014ms 308ms 6726ms
Euler 5.7ms 88ms 8.2ms 117ms
Heun 9.3ms 211ms 15.8ms 281ms

So the dumb approach is 20-30X the speed of Odex, and even switching to a more accurate method, it’s still 10-15X the speed of Odex on this problem.

 2  Implementations#

I also wrote the Heun code in C++ and Rust, and compiled both natively and to wasm. The Rust wasm results were very similar to the C++ ones. I ran these both on my old Macbook Pro (“m0”) and my new Macbook Pro (“m1”).

Language Environment Time, m1 Time, m0
C++ -O3 native 210ms 250ms
Rust -O native 210ms 250ms
JS Node 210ms 250ms
wasm Node 210ms 240ms
JS Chrome 210ms 250ms
wasm Chrome 210ms 240ms
JS Safari 210ms 250ms
wasm Safari 210ms 250ms
JS Firefox 260ms 260ms
wasm Firefox 330ms 330ms

If you want to try yourself:

Notes:

In Chrome 87 and Firefox 84, having web developer tools open slows down wasm a lot but doesn’t slow down js. In Chrome, it slowed from 211ms to 255ms. In Firefox, it slowed from 330ms to 807ms(!). This inconsistency puzzled me greatly until I found a reddit thread[3] that pointed to the developer tools as the culprit. One comment says it might be due to “wat” files.

Wasm doesn’t seem much faster than Javascript for this type of code. I found the same was true in Mapgen4. I think the Javascript optimizers seem to do reasonably well with numerical code. I’m going to stick to Javascript for this project.

I didn’t show it on this page but there’s a significant speedup as it runs the code the first ~10 times. Part of it is that browsers optimize code if it’s run many times, but on my ARM machine I see the same effect in C++ and Rust!

I thought it might be limited by the memory access, so I turned off writing of the data to an array, but it didn’t speed it up.

 2.1 Stupid quirks

Email me , or tweet @redblobgames, or comment: