I spent a good chunk of the past year working on an internal training class and a short book about performance measurement and optimization. You can download it here. Below is an excerpt.
“Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time; premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”
Donald Knuth, Structured Programming With go to Statements
The trickiest part of speeding up a program is not doing it, but deciding whether it’s worth doing at all. There are few clear principles, only rules of thumb.
Part of the problem is that optimization is hard to do well. It’s frighteningly easy to devolve into superstitious ritual and rationalization. Then again, there can be big payoffs hidden in surprising places. That’s why expert advice about performance tends to have a gnomic, self-contradictory flavor: “If you don’t know what you are doing, don’t do it! You’ll know if you know what you are doing. And remember to design your programs for performance.” The experts are acutely worried about encouraging more folly, yet can’t quite bring themselves to ignore the possible gains.
Knuth’s famous quote about premature optimization was never meant to be a stick to beat people over the head with. It’s a witty remark he tossed off in the middle of a keen observation about leverage, which itself is embedded in a nuanced, evenhanded passage about, of all things, using gotos for fast and readable code. The final irony is that the whole paper was an earnest attempt to caution against taking Edsger Dijkstra’s infamous remark about gotos too seriously. It’s a wonder we risk saying anything at all about this stuff.
Structured Programming With go to Statements does make two general points about performance that have proven extremely valuable. Optimizing without measurement to guide you is foolish. So is trying to optimize everything. The biggest wins tend to be concentrated in a small portion of the code, “that critical 3%,” which can be found via careful measurement.
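To make "found via careful measurement" a little more concrete, here is a minimal sketch in Python. It is my own toy illustration, not something from Knuth's paper: the three stages (`load`, `score`, `render`) and the `timed` helper are hypothetical names, and `score` is deliberately quadratic so that one stage dominates. The idea is only to time each stage and see what share of the total it takes before touching anything.

```python
import time


def load(n):
    # Hypothetical cheap stage: build the input.
    return list(range(n))


def score(items):
    # Hypothetical hot spot: deliberately quadratic in len(items).
    total = 0
    for a in items:
        for b in items:
            total += (a * b) % 7
    return total


def render(total):
    # Hypothetical cheap stage: format the result.
    return f"score={total}"


def timed(fn, *args):
    # Run fn(*args) and return (result, elapsed seconds).
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def profile_pipeline(n=400):
    # Time each stage and return its share of the total measured time.
    timings = {}
    items, timings["load"] = timed(load, n)
    total, timings["score"] = timed(score, items)
    _, timings["render"] = timed(render, total)
    elapsed = sum(timings.values())
    return {name: t / elapsed for name, t in timings.items()}


shares = profile_pipeline()
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:8s}{share:6.1%}")
```

On a real program you would more likely reach for a proper profiler than hand-rolled timers, but the question is the same: which small part of the code is eating most of the time?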
A proper measurement regime can tell you when and where optimization is likely to succeed, but says little about whether doing it is worthwhile. Knuth ultimately shoves that responsibility onto a hypothetical “good” and “wise” programmer, who is able to look past the witty remarks and dire warnings and decide on the merits. Great, but how?
I don’t know either. Performance optimization is, or should be, a cost/benefit decision. It’s made in the same way you decide just how much effort to put into other cross-cutting aspects of a system like security and testing. There is such a thing as too much testing, too much refactoring, too much of anything good.
Unlike testing or bug-fixing, performance work can often be deferred until just before or even after the program has shipped. In my experience, it makes most sense on mature systems whose architectures have settled down. New code is almost by definition slow code, but it’s also likely to be ripped out and replaced as a young program slouches towards beta-test. Unless your optimizations are going to stick around long enough to pay for the time you spend making them, plus the opportunity cost of not doing something else, it’s a net loss.
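The arithmetic behind that last sentence can be sketched in a few lines of Python. Every name and number here is a made-up assumption for illustration (the hourly rate, the opportunity cost, the monthly savings); the point is only that the optimization has to stay in service long enough for its savings to cover both what it cost to build and what you gave up to build it.

```python
import math


def total_cost(dev_hours, hourly_rate, opportunity_cost):
    # Time spent optimizing, plus the value of the work not done instead.
    return dev_hours * hourly_rate + opportunity_cost


def net_payoff(dev_hours, hourly_rate, opportunity_cost,
               savings_per_month, months_in_service):
    # Positive means the optimization paid for itself before being replaced.
    cost = total_cost(dev_hours, hourly_rate, opportunity_cost)
    return savings_per_month * months_in_service - cost


def break_even_months(dev_hours, hourly_rate, opportunity_cost,
                      savings_per_month):
    # Whole months the optimized code must survive to recoup its cost.
    cost = total_cost(dev_hours, hourly_rate, opportunity_cost)
    return math.ceil(cost / savings_per_month)


# Hypothetical figures: 40 hours at 100 per hour, 2000 of forgone work,
# and 500 per month saved on servers once the change ships.
print(break_even_months(40, 100, 2000, 500))   # 12
print(net_payoff(40, 100, 2000, 500, 6))       # -3000
print(net_payoff(40, 100, 2000, 500, 24))      # 6000
```

With these invented numbers, code that gets ripped out after six months is a loss of 3000, while the same work on a system that lasts two years comes out 6000 ahead.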
Optimization also makes sense when it’s needed for a program to ship. Performance is a feature when your system has unusually limited resources to play with or when it’s hard to change the software after the fact. This is common in games programming, and is making something of a comeback with the rise of mobile computing.
Even with all that, there are no guarantees. In the early 2000s, I helped build a system for search advertising. We didn’t have a lot of money, so we were constantly tweaking the system for more throughput. The former CTO of one of our competitors, looking over our work, noted that we were handling ten times as much traffic per server as he had. Unfortunately, we had spent so much time worrying about performance that we didn’t pay enough attention to credit card fraud. Fraud and chargebacks got very bad very quickly, and soon after our company went bankrupt. On one hand, we had pulled off a remarkable engineering feat. On the other hand, we were fixing the wrong problem.
The dusty warning signs placed around performance work are there for a reason. That reason may sting a little, because it boils down to “you are probably not wise enough to use these tools correctly.” If you are at peace with that, read on. There are many more pitfalls ahead.
Carlos Bueno, an Engineer at Facebook, doesn’t let data push him around.