We render our Facebook mobile site differently on different devices in order to deliver the best user experience across a wide spectrum of mobile phones. For example, we have http://touch.facebook.com/ for higher-end smartphones and http://m.facebook.com/ for lower-end feature phones.
Improving WURFL’s database
It is challenging for the WURFL project to keep up with the many emerging mobile devices, especially given the increasing rate of innovation in the mobile industry. Given that over 150 million people use Facebook through their mobile devices every month, we wanted to find a way to harness that activity to make WURFL more accurate for everyone. We first launched a mobile device survey to crowdsource information about mobile browsers directly from people; over the past few months, we have collected input from millions of people. We also launched programmatic experiments to automatically detect device capabilities, for example whether a given mobile browser supports AJAX.
Using this crowdsourced data, we not only refined WURFL's existing capability entries (things like screen size and AJAX support) but also added new capabilities, such as whether a mobile browser supports image inlining, CSS spriting, or Unicode characters. Both inlining and spriting reduce the number of HTTP requests, making our mobile sites faster.
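To make the inlining capability concrete: image inlining embeds a small asset directly in the page as a base64 data URI, so the browser never issues a separate HTTP request for it. The sketch below (not our production code) shows the idea in Python, using a 1x1 transparent GIF as a stand-in for a real image asset:

```python
import base64

def inline_image(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Build a data URI so the image ships inside the HTML/CSS payload,
    avoiding an extra HTTP request on browsers that support inlining."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# A 1x1 transparent GIF as a stand-in for a real image asset.
pixel = base64.b64decode("R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7")
uri = inline_image(pixel, "image/gif")
```

The resulting `uri` string can be dropped straight into an `<img src="...">` attribute or a CSS `url(...)` value, which is exactly why we need to know per-device whether the browser will render it.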
Making sense of millions of human responses
While each individual survey response may not be fully accurate, the massive scale of responses we received allows us to derive useful device capability statistics from noisy input. We formulate this as an estimation problem. We first group data by WURFL device ID, and then apply a point estimation algorithm to each group. In the ideal case, the responses within a group are sufficiently consistent to let us derive a capability value with high confidence. Otherwise, inconsistent data may indicate that heterogeneous devices have been clustered into the same group. In that case, we further divide the group into sub-devices and apply the point estimation algorithm to each sub-group. This iterative procedure stops either when the data within each sub-group are consistent or when the group cannot be divided further. The end result is a WURFL patch incorporating capability values derived from our crowdsourced data.
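The iterative procedure above can be sketched as follows. This is an illustrative Python sketch, not our production code: the consistency threshold and the sub-grouping key (here, a reported browser version) are assumptions made for the example.

```python
from collections import Counter

CONSISTENCY_THRESHOLD = 0.9  # assumed value, not the production parameter

def estimate(responses, split_keys=("browser_version",)):
    """Point-estimate a boolean capability for a group of survey responses.

    responses: list of dicts like {"capability": True, "browser_version": "2.0"}
    Returns {group_key: estimated_value} for each (sub-)group that reaches
    consistency, recursing on split_keys when a group is too noisy.
    """
    votes = Counter(r["capability"] for r in responses)
    value, count = votes.most_common(1)[0]
    if count / len(responses) >= CONSISTENCY_THRESHOLD or not split_keys:
        return {(): value}  # consistent enough, or no way to split further
    # Inconsistent: split into sub-devices and estimate each sub-group.
    key, rest = split_keys[0], split_keys[1:]
    groups = {}
    for r in responses:
        groups.setdefault(r.get(key), []).append(r)
    result = {}
    for k, sub in groups.items():
        for subkey, v in estimate(sub, rest).items():
            result[(k,) + subkey] = v
    return result
```

A group that splits cleanly, say all version-1.0 responses reporting "unsupported" and all version-2.0 responses reporting "supported", yields one estimate per sub-device instead of one noisy estimate for the whole group.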
The next challenge is evaluating the point estimation algorithm and choosing its best parameters. It is difficult to do this rigorously due to the lack of comprehensive ground truth data. We solved this by using both small-scale ground truth data and large-scale crowdsourced data. For example, we manually tested the image inlining capability on more than a dozen devices. Because it is expensive to acquire large-scale ground truth data, we also used our crowdsourced data in the evaluation: even though some survey responses are inaccurate, the majority are useful, so we treat the survey responses as an approximation of ground truth. We then apply cross-validation to the crowdsourced data to determine the best parameter values for the point estimation algorithm. In many cases, a false positive from the estimation algorithm would cause a broken page; therefore, our objective function minimizes the false positive rate while maintaining a reasonable recall rate.
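As a concrete illustration of that objective, the sketch below picks a decision threshold that minimizes the false positive rate subject to a minimum recall, using survey responses as approximate ground truth. The recall floor and the idea of sweeping a single score threshold are assumptions for the example, not the exact production procedure.

```python
def pick_threshold(scores, labels, min_recall=0.8):
    """Choose the threshold minimizing false positive rate while keeping
    recall >= min_recall, mirroring an objective that prefers missing a
    capability over shipping a broken page.

    scores: estimated confidence that a device supports the capability
    labels: approximate ground truth from survey responses (True/False)
    Returns (threshold, false_positive_rate), or None if no threshold
    meets the recall floor.
    """
    best = None
    negatives = sum(not l for l in labels)
    for t in sorted(set(scores)):
        pred = [s >= t for s in scores]
        tp = sum(p and l for p, l in zip(pred, labels))
        fp = sum(p and not l for p, l in zip(pred, labels))
        fn = sum((not p) and l for p, l in zip(pred, labels))
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / negatives if negatives else 0.0
        if recall >= min_recall and (best is None or fpr < best[1]):
            best = (t, fpr)
    return best
```

In cross-validation, a parameter setting like this would be scored on held-out folds of the crowdsourced data, and the setting with the lowest held-out false positive rate at acceptable recall wins.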
Contributing back to the mobile community
Over the past few months, we've been working with Luca Passani to contribute all of the new information we've discovered, both via crowdsourcing and automated detection, back into WURFL itself. Further, we've started to more directly support the WURFL project to help streamline the data integration process for others. If you're interested in using WURFL, check out http://wurfl.sourceforge.net/.