As with many modern, complex web applications, much of the code behind Facebook goes through a series of build steps that transform it from PHP source code and assets into the running application. These steps include identifying where source code classes live, mapping web controller paths, spriting images, packaging CSS and JavaScript, and more. Some of these steps are only necessary when we’re preparing for one of our twice-daily release pushes, but the rest are necessary for our engineers to have a functioning development environment.
Facebook scale magnifies just about everything, and we’re at a size where the simplistic approach of walking over the entire tree to figure out what needs to be built is a non-trivial exercise. A few of the build steps need to perform similar traversals, and if you’re unlucky enough to have had your filesystem cache go cold, you could be forced into a coffee break while you wait for the OS to pull everything back into cache.
Our engineering culture places a very high value on being able to move quickly and efficiently, and our build times were just too long. So we set about making builds faster by revising our build steps to be incremental, so that they only need to process what changed since the last run. That, in turn, requires a fast way to figure out what actually changed.
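To make the idea of an incremental step concrete, here is a minimal sketch in Python. It is an illustration only, not how our build steps are implemented: the names `snapshot` and `diff_snapshots` are hypothetical, and comparing modification times is an assumed, simplistic way to detect changes. Note that `snapshot` still walks the whole tree, which is exactly the cost described above; the point here is only that the build step itself needs to touch nothing but the diff.

```python
import os


def snapshot(root):
    """Map each file path under root to its modification time in nanoseconds."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                state[path] = os.stat(path).st_mtime_ns
            except FileNotFoundError:
                # The file vanished between the directory listing and the stat.
                pass
    return state


def diff_snapshots(previous, current):
    """Return (changed, deleted) paths between two snapshots, each sorted.

    A path counts as changed if it is new or its recorded mtime differs.
    """
    changed = sorted(p for p, mtime in current.items() if previous.get(p) != mtime)
    deleted = sorted(p for p in previous if p not in current)
    return changed, deleted
```

An incremental step would keep the snapshot from its previous run, take a new one, and then rebuild only the paths returned in `changed` while discarding outputs for those in `deleted`.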
At the time we were considering this problem, a couple of other projects were also running into the need for a fast list of changed files on the filesystem. We needed a fast and reliable service that we could query to get, near-instantly, the set of files that were changed or deleted since our last query. We wanted this to be a service so we wouldn’t have to repeat our engineering efforts for each of the projects. We also wanted it to maintain its own view of the filesystem so we could query it and trigger scripts in response to observed changes. We looked around to see if there were other projects that met our requirements, and while there are numerous projects that can trigger scripts when files change, we didn’t find anything that had the right set of properties.
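The "changed since our last query" requirement can be sketched as a small in-memory index with a logical clock. This is a hypothetical illustration, not Watchman's implementation or API: the class `ChangeIndex`, its methods, and the integer token are all assumptions made for the example. A real service would feed `record` from filesystem notifications rather than from explicit calls.

```python
class ChangeIndex:
    """Hypothetical in-memory view of a file tree with a logical clock."""

    def __init__(self):
        self._tick = 0
        # path -> (tick of the most recent change, whether the file exists)
        self._last_change = {}

    def record(self, path, exists=True):
        """Note that path was created or modified (or deleted, if exists=False)."""
        self._tick += 1
        self._last_change[path] = (self._tick, exists)

    def since(self, tick):
        """Return (token, changed, deleted) for everything observed after tick.

        The returned token is passed to the next call so that each query
        reports only what happened after the previous one.
        """
        changed, deleted = [], []
        for path, (seen, exists) in self._last_change.items():
            if seen > tick:
                (changed if exists else deleted).append(path)
        return self._tick, sorted(changed), sorted(deleted)


index = ChangeIndex()
index.record("a.php")
index.record("b.css")
token, changed, deleted = index.since(0)

index.record("a.php", exists=False)
token2, changed2, deleted2 = index.since(token)
```

Because the index keeps its own view of the tree, answering a query costs a scan of the index rather than a walk of the filesystem, and each client only has to remember its last token.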
So we chose to build our own and make it open source. We call it “Watchman,” and it runs on Linux, OS X, FreeBSD, and Solaris.
We’re using Watchman to kick off builds while our engineers are editing files. Between this and more-intelligent incremental processing we’ve been able to reduce our interactive-user p50 build times by 60%.
This is a kernel density estimate (KDE) visualization of the interactive-user build time. The x-axis is measured in milliseconds. The solid line shows where we are today and the broken line shows where we were. The shaded area shows the change in the probability distribution. We want to move the p95 to the left of the red goal line. We’ve got some work left to do!
There’s more we want to do in this area, but there is only so much that Watchman can do on its own. It’s not a silver bullet for making builds go fast, but it has helped us make noticeable improvements both here and in some other internal projects.
You can find and follow Watchman at its home on GitHub.