Zoncolan: Using static analysis to prevent security issues

Facebook’s web codebase currently contains more than 100 million lines of Hack code, and changes thousands of times per day. To handle the sheer volume of code, we build sophisticated systems that help our security engineers review code. Today, we are sharing the details of one of those tools, called Zoncolan, for the first time. Zoncolan helps security engineers scale their work by using static analysis to automatically examine our code and detect potentially dangerous security or privacy issues.

As with any system of this type, Zoncolan cannot find every possible issue. But it does allow us to find classes of issue that lend themselves well to detection via static analysis. It also allows us to scale. With it, we can analyze our entire codebase in less than 30 minutes from a cold start — a task that could take months or years if attempted manually. Since it’s been in production, Zoncolan has prevented thousands of potential security issues.

How Zoncolan works

At a high level, Zoncolan lets us quickly create a rule and find similar code patterns. This helps us scale the work in three ways:

A security engineer can verify findings quickly without going through untold hours of drudgery reading code manually.
We can use a codified rule to prevent future occurrences of the same issue. Zoncolan runs on thousands of code changes per day before the code ships to production, and it alerts us to issues. Our repertoire of rules has grown over time, and we can add to or refine it as needed.
The rules themselves serve as documentation of issue classes. Once an issue class is captured as a Zoncolan rule, it is easier for security engineers to examine the nuances of the rule itself and the issues we’ve found as a result. This process helps educate engineers on what to look for in the future. It can also inform decisions to build better libraries and frameworks, to eliminate classes of problems entirely.

Zoncolan is the result of a close collaboration between our security engineers and a team of static analysis experts. The project began a few years ago, when we manually reviewed reports of past security vulnerabilities, including bug reports, root causes, and corresponding code fixes. Our Bug Bounty program also proved to be a useful source of data. Based on hundreds of these reports, we determined which classes of issue were amenable to static analysis and designed a system to catch them.

Abstract interpretation

Zoncolan uses a technique called abstract interpretation to track user-controlled input through the codebase. As it parses code, it builds data structures that represent the behavior of functions in the code (the control-flow graph) and how those functions interact (the call graph). It then creates a summary of the behavior of each function. Instead of actually running each line of code, the way an interpreter or HHVM would, Zoncolan only records properties that are relevant to potentially dangerous flows of information.

The abstract execution follows all control paths in the program. Whenever Zoncolan encounters a branch (e.g., an “if” statement), it analyzes both branches individually and then merges the results. This is a key difference between testing and static analysis: Whereas a single test represents a single path through the program, a static analysis system such as Zoncolan evaluates all possible paths at once.
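The branch-merging step can be illustrated with a toy model. The sketch below is not Zoncolan's implementation; it assumes a deliberately tiny abstract state (the set of variables that may hold user-controlled data) and a made-up two-statement language (`Assign` and `If` are hypothetical names). It shows only the idea of analyzing both branches from the same incoming state and merging the results.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Union


@dataclass
class Assign:
    target: str
    reads: List[str]            # variables the right-hand side reads
    from_source: bool = False   # True if the right-hand side is user input


@dataclass
class If:
    then_branch: List["Stmt"]
    else_branch: List["Stmt"]


Stmt = Union[Assign, If]
State = FrozenSet[str]  # abstract state: names of possibly tainted variables


def analyze(block: List[Stmt], state: State) -> State:
    """Abstractly execute a block and return the resulting taint state."""
    for stmt in block:
        if isinstance(stmt, Assign):
            tainted = stmt.from_source or any(v in state for v in stmt.reads)
            # The target's previous taint is replaced by the new one.
            state = (state | {stmt.target}) if tainted else (state - {stmt.target})
        else:
            # Analyze both branches from the same incoming state, then merge.
            then_state = analyze(stmt.then_branch, state)
            else_state = analyze(stmt.else_branch, state)
            state = then_state | else_state
    return state


program = [
    Assign("name", [], from_source=True),        # name = user input
    If(
        then_branch=[Assign("path", ["name"])],  # path derived from name
        else_branch=[Assign("path", [])],        # path = constant
    ),
    Assign("url", ["path"]),
]

result = analyze(program, frozenset())
print(sorted(result))  # ['name', 'path', 'url']
```

Because the merge is a set union, `path` is reported as possibly tainted even though only one branch taints it. A single test run that happened to take the other branch would miss that flow.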

Static analysis for and by security engineers

Zoncolan rules specify the conditions that portend a potential security issue. The most common type of rule consists of two things:

a point of origin (a source; where information comes from);
a destination (a sink; where the information from the source should not end up).

The system will track data flowing from a source to its corresponding sink through any path, however long. Over time, we have built a corpus of rules, each representing a different type of potential security vulnerability. When the tool finds a flow that matches one of these rules, it emits a warning that contains a complete description of the path from source to sink — i.e., a trace, as derived by the abstract execution.
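As a rough illustration of a source/sink rule and the trace attached to a warning, the sketch below models data flow as a plain directed graph and searches it for a path. The `Rule` class, the node names, and the `find_trace` helper are all hypothetical; the article does not describe Zoncolan's actual rule format or how it stores flows.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    source: str  # where the data comes from
    sink: str    # where that data should not end up


def find_trace(flows: Dict[str, List[str]], rule: Rule) -> Optional[List[str]]:
    """Breadth-first search; returns a shortest source-to-sink path, or None."""
    parents: Dict[str, Optional[str]] = {rule.source: None}
    queue = deque([rule.source])
    while queue:
        node = queue.popleft()
        if node == rule.sink:
            # Walk the parent links back to the source to rebuild the trace.
            trace = []
            current: Optional[str] = node
            while current is not None:
                trace.append(current)
                current = parents[current]
            return trace[::-1]
        for nxt in flows.get(node, []):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


# Each edge means "data may flow from the key to the listed node".
flows = {
    "request_param": ["handler_var", "log_message"],
    "handler_var": ["helper_arg"],
    "helper_arg": ["url"],
    "url": ["form_action"],
}

rule = Rule("UserControlledFormAction", source="request_param", sink="form_action")
trace = find_trace(flows, rule)
print(" -> ".join(trace))
# request_param -> handler_var -> helper_arg -> url -> form_action
```

The path can be arbitrarily long. The warning is only useful to a reviewer because the whole chain is reported, not just its two endpoints.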

When we discover a new class of issue, we evaluate whether static analysis is the best form of detection (compared with other detection approaches, such as fuzzing or Invariant Detector). For each new Zoncolan rule, a security engineer evaluates the initial results to confirm that the rule actually captures the desired scenario and to provide guidance on ways to eliminate false positives. When a rule is precise enough, it is promoted to the main rule list for Zoncolan, which means it will run on every code change in the future.

An example: From Whitehat submission to static analysis

Facebook’s web frameworks protect against CSRF (cross-site request forgery) by automatically embedding a token into POST forms and verifying that token on the server side. There are also protections to make sure that forms with CSRF tokens are POSTed only to facebook.com endpoints.

We received a submission to our Whitehat program about a possible vulnerability in our CSRF protection on a specific endpoint on facebook.com. In the report, the researcher changed the “action” attribute of a CSRF-protected form to another valid facebook.com endpoint, bypassing both CSRF and form action hijacking protections. We tracked down the root cause to code similar to this:

<?hh
// Copyright 2004-present Facebook. All Rights Reserved.

final class AddMemberToGroup extends FacebookEndpoint {
  public function render(): :xhp {
    // User input, untrusted
    $group_id = $this->getRequest('gid');
    $username = $this->getRequest('username');
    $group = $this->get_group($group_id);
    return self::getConfirmationForm($group, $username);
  }

  public static function getConfirmationForm($group, $username): :xhp {
    // $url is partially user controlled. 
    // An attacker can forge the url
    // https://facebook.com/groups/add_member/../../users/delete_user/
    $url = "https://facebook.com/groups/add_member/" . $username;

    // the <fb-form tag is similar to the
    // <form tag, but adds the CSRF token by default
    return
      <fb-form method="post" action={$url}>
        <input name="gid" value={$group->getID()}/>
        <input name="method" value="add"/>
      </fb-form>;
  }
}

If $username has a value similar to “../../users/delete_user/”, the form can be redirected to another endpoint on Facebook. On submission, the form would POST a request to https://facebook.com/groups/add_member/../../users/delete_user/, which would delete the user’s account. The code above was dangerous because the attacker could control part of the form action through the $username variable. The underlying problem was the use of string concatenation to build the URI instead of Facebook’s URI builder abstraction (which would have prevented the path traversal in the URI).
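To make the path traversal concrete, here is a small sketch using only the Python standard library. Facebook's URI builder is not shown in this post, so percent-encoding the user-supplied segment stands in for it here as an assumption. `effective_path` is a hypothetical helper that approximates how ".." segments are collapsed before the request is sent.

```python
import posixpath
from urllib.parse import quote, urlsplit

BASE = "https://facebook.com/groups/add_member/"


def effective_path(url: str) -> str:
    """Approximate the requested path after '..' segments are collapsed."""
    return posixpath.normpath(urlsplit(url).path)


username = "../../users/delete_user/"

# String concatenation, as in the snippet above: the input can climb out
# of /groups/add_member/ and select a different endpoint.
unsafe_url = BASE + username

# Assumed mitigation: encode the input so it stays a single path segment.
escaped_url = BASE + quote(username, safe="")

print(effective_path(unsafe_url))   # /users/delete_user
print(effective_path(escaped_url))  # /groups/add_member/..%2F..%2Fusers%2Fdelete_user%2F
```

With the encoded form, the slashes in the input no longer act as path separators, so the request stays under /groups/add_member/. How a real server treats an encoded slash varies, which is one reason to prefer a dedicated URI abstraction over ad hoc escaping.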

Let’s walk through the steps that a security engineer would perform to manually find such a vulnerability:

$username is user-controlled and hence untrusted.
It is passed as the second argument to getConfirmationForm.
$url is user-controlled through the concatenation of a literal string and some user input.
The attacker controls the action field on the fb-form, because $url is user-controlled.

Following the data flow in this code snippet is relatively simple because it is designed to be a self-contained example. But following it in a large codebase, across many function calls, is much more complex. Zoncolan can perform this task across the entire codebase in less than 30 minutes. By comparison, manual inspection or grepping would take months, at best, so they are not scalable approaches.

Zoncolan automates the steps that security engineers take when tracking down this kind of vulnerability. In our example, that’s the flow from user-controlled data (the source) to the action field on the fb-form (the sink). Zoncolan analyzes each function in the code and computes how it returns possible sources, and how parameters might flow into possible sinks. In our example:

- The function getConfirmationForm contains a sink vulnerable to user-controlled data (fb-form::action). Zoncolan propagates this sink up to the input arguments, and thus discovers that if $username is user-controlled, then fb-form::action is user-controlled as well.
- The function render contains a user-controlled source ($this->getRequest('username')), which Zoncolan propagates to the variable $username. In turn, the value of $username is passed as an argument to getConfirmationForm.

Given the inferred information about these functions, Zoncolan emits an alarm for a flow from the source $this->getRequest('username') to the sink fb-form::action.
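The two inference steps above can be sketched as per-function summaries. The code below is a toy model under stated assumptions, not Zoncolan's algorithm: the tuple-based mini language, the `FUNCTIONS` table, and the `analyze` helper are invented for illustration, and callees are assumed to be analyzed before their callers, with no recursion.

```python
# Tiny IR, one tuple per statement (all names here are illustrative):
#   ("source", var, source_name)   var receives user-controlled data
#   ("assign", var, [read_vars])   var is computed from read_vars
#   ("sink", sink_name, var)       var reaches a sensitive operation
#   ("call", callee, [arg_vars])   call with the given arguments
FUNCTIONS = {
    "getConfirmationForm": {
        "params": ["group", "username"],
        "body": [
            ("assign", "url", ["username"]),
            ("sink", "fb-form::action", "url"),
        ],
    },
    "render": {
        "params": [],
        "body": [
            ("source", "group_id", "getRequest('gid')"),
            ("source", "username", "getRequest('username')"),
            ("assign", "group", ["group_id"]),
            ("call", "getConfirmationForm", ["group", "username"]),
        ],
    },
}


def analyze(name, summaries):
    """Return (summary, alarms): parameter index -> reachable sinks, plus flows found."""
    fn = FUNCTIONS[name]
    # origins[var]: labels whose data may reach var, either a parameter
    # index (int) or the name of a source (str).
    origins = {p: {i} for i, p in enumerate(fn["params"])}
    summary = {i: set() for i in range(len(fn["params"]))}
    alarms = []

    def reach(var, sink):
        for label in sorted(origins.get(var, set()), key=str):
            if isinstance(label, int):
                summary[label].add(sink)      # sink propagated up to a parameter
            else:
                alarms.append((label, sink))  # complete source-to-sink flow

    for stmt in fn["body"]:
        kind = stmt[0]
        if kind == "source":
            origins[stmt[1]] = {stmt[2]}
        elif kind == "assign":
            origins[stmt[1]] = set().union(*(origins.get(v, set()) for v in stmt[2]))
        elif kind == "sink":
            reach(stmt[2], stmt[1])
        elif kind == "call":
            # Reuse the callee's summary instead of re-analyzing its body.
            for i, arg in enumerate(stmt[2]):
                for sink in sorted(summaries[stmt[1]].get(i, set())):
                    reach(arg, sink)
    return summary, alarms


summaries = {}
all_alarms = []
for fn_name in ["getConfirmationForm", "render"]:  # callee first
    summaries[fn_name], found = analyze(fn_name, summaries)
    all_alarms.extend(found)

print(summaries["getConfirmationForm"])  # {0: set(), 1: {'fb-form::action'}}
print(all_alarms)  # [("getRequest('username')", 'fb-form::action')]
```

Only the second parameter of getConfirmationForm carries the sink in its summary, so the gid request parameter, which flows into the first argument, raises no alarm in this model.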

Preventing security bugs

Zoncolan evaluates thousands of code changes per day. We have built extensive infrastructure for running Zoncolan, tracking the results, and providing access to those results. In 2018, Zoncolan helped find and triage more than 1,100 security issues with severity “significant” or higher, indicating they required immediate action. The distribution of those findings is as follows.

- Direct: Zoncolan flagged 46 percent of issues to code authors directly without the involvement of a security engineer; this typically also takes place before the code is landed.
- On-call: The security engineering on-call rotation triaged 33 percent of issues before the relevant code was landed.
- Other: We reported 21 percent of issues to security engineers outside of their on-call rotation; reporting typically occurs after the code has landed, when a security engineer develops a new rule to run against the full codebase.

Zoncolan has become an important part of our application security work at Facebook. It helps individual security engineers find complex issues more efficiently, and it helps our product security team as a whole quickly disseminate knowledge of new bug types through rules. We are actively working to prevent new issues from being introduced into our code and to broaden this kind of static analysis across more programming languages at Facebook. The Pyre type checker for Python is a concrete, open source example of this.

Zoncolan’s success stems from its high signal-to-noise ratio, speed, and extensibility, as well as its low rate of false negatives. It achieves this by focusing on issue classes that lend themselves well to static analysis. It complements other product security efforts such as bug bounty, security reviews, and design reviews. Zoncolan has allowed security engineers to shift their focus from tracing code for low-hanging individual bugs to designing rules that can catch and eliminate entire classes of vulnerabilities. Looking ahead, we are working on open source tooling that uses Zoncolan’s core elements so that engineers outside of Facebook can take advantage of what we’ve built.