What the research is:
We’ve developed a new static analyzer that catches deadlocks in Java code for Android without ever running the code. What distinguishes our analyzer from past research is its ability to analyze revisions in codebases with hundreds of millions of lines of code.
We have deployed our analyzer in Meta’s continuous integration system, where it scans every commit to the Android app family. In the last two years, developers have actioned over 200 fixes in response to deadlock reports, at a fix rate of around 54 percent.
Our analyzer is open source and forms part of the Infer static analysis framework.
How it works:
We used abstract interpretation techniques to design our analyzer. For each method, the analyzer computes a summary of how the method behaves in terms of lock acquisition and release, as well as whether the method will run on the main thread or on a background thread. This is done in a compositional manner: Each method is summarized once, at most, and its summary is used in the summarization of its callers, ensuring predictable high performance.
The central part of the summary is the set of critical pairs of the method. A critical pair (A,B) records the following fact: The method tries to acquire lock B, and at that time, it’s already holding precisely locks in set A. This data, computed across all methods, is enough to let us answer the question of whether a deadlock is possible between two concurrent methods.
To be fast and efficient, our tool avoids analyzing all source files in an app. Instead, it first processes all methods in the revision’s modified files. Then, based on that data, it applies a heuristic that locates methods outside the revision that could potentially deadlock with one of the methods in the revision. It is this heuristic that enables our analysis to scale.
In our paper, we also prove that our analysis is sound and complete for an abstract programming language that has only non-deterministic control. In other words, in this language, we can find all the deadlocks without false positives.
Why it matters:
A deadlock is typically an unrecoverable error. Deadlocks are also very difficult bugs to diagnose because thread scheduling is essentially non-deterministic and, thus, a test might need to run thousands or millions of times in order to exhibit the problem.
Detecting deadlocks statically, without running or even building the code is, therefore, extremely valuable. Our approach achieves this goal while also making the analyzer sufficiently scalable so that it can be deployed on the massive codebases at Meta.
Read the full paper
Additional thanks to the co-authors of this research, James Brotherston, Paul Brunet, and Max Kanovich, all at University College London (UCL).