Facebook has long supported open standards and interoperability between frameworks and hardware for driving machine learning (ML) innovation with projects like Open Neural Network Exchange (ONNX) and PyTorch. Developing industry-standard ML models and benchmarks will enable researchers and engineers to better evaluate and demonstrate the impact of their work. This is why we are supporting the MLPerf initiative. As part of this support, we are also open-sourcing Mask R-CNN2Go, our leading-edge computer vision model optimized for embedded and mobile devices.
MLPerf is a consortium that aims to build a common set of industry-wide benchmarks for measuring system-level performance of ML software frameworks, hardware accelerators, and cloud platforms. It provides benchmarks for training and inference for the cloud as well as for on-device edge inference. The benchmark suite covers a diverse set of application use cases, such as image classification, object detection, and speech-to-text translation. To deliver a fair, industry-standard ML benchmark suite, dedicated working groups are formed to identify and tackle the challenges that arise in different aspects of benchmark creation.
Facebook AI Infra research scientist Carole-Jean Wu co-chairs the MLPerf Edge Inference working group. Together with other industry and academic organizations, we will provide benchmark reference implementations, trained with open source data sets, for two ML models in the Edge Inference category. With Facebook AI Infra engineer Fei Sun, we will release the benchmark implementations as part of the Facebook AI Performance Evaluation Platform, which provides a simple, common benchmarking harness for ML.
For the image classification use case, we will provide the implementation for the ShuffleNet model. For the pose estimation use case, we will provide the implementation for the state-of-the-art Mask R-CNN2Go model, developed by Facebook’s mobile vision researchers. As more and more machine learning occurs at the edge, it is important that we define representative benchmarks for edge inference use cases. These benchmarks help the community characterize the performance bottlenecks of on-device inference execution, and design and optimize systems for efficient on-device inference.
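To make the idea of a benchmarking harness concrete, here is a minimal sketch of how per-inference latency might be measured. It is not the Facebook AI Performance Evaluation Platform or any MLPerf reference code: the names `benchmark_latency` and `fake_model` are hypothetical, and `fake_model` is a stand-in workload used only so the example runs without an ML framework.

```python
import math
import statistics
import time
from typing import Callable, Dict


def benchmark_latency(run_inference: Callable[[], object],
                      warmup: int = 5,
                      iterations: int = 50) -> Dict[str, float]:
    """Time a zero-argument callable and summarize per-call latency in ms."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")

    # Warm-up calls are discarded so one-time costs (caches, lazy
    # initialization) do not skew the measured samples.
    for _ in range(warmup):
        run_inference()

    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_inference()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    samples_ms.sort()
    # Nearest-rank 90th percentile of the sorted samples.
    p90_index = max(0, math.ceil(0.9 * len(samples_ms)) - 1)
    return {
        "mean_ms": statistics.mean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "p90_ms": samples_ms[p90_index],
        "max_ms": samples_ms[-1],
    }


def fake_model() -> int:
    """Hypothetical stand-in workload; swap in a real model's forward pass."""
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    for name, value in benchmark_latency(fake_model).items():
        print(f"{name}: {value:.3f}")
```

Reporting the median and a tail percentile alongside the mean is a common choice for on-device measurements, where thermal throttling and background activity can produce occasional slow runs that an average alone would hide.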
Open-sourcing Mask R-CNN2Go
As part of our MLPerf contribution, we are open-sourcing Mask R-CNN2Go, our leading-edge computer vision model optimized for embedded and mobile devices. Mask R-CNN2Go forms the basis of a variety of on-device ML use cases, including object detection, classification, person segmentation, and body pose estimation, and it enables accurate, real-time inference. As we shared earlier this year, the main model is based on the broader Mask R-CNN framework. As the name implies, Mask R-CNN2Go is designed and optimized specifically for mobile devices.
Mask R-CNN2Go currently runs on Caffe2, with plans to run on PyTorch 1.0 as that framework continues to add capabilities that give developers a seamless path from research to production. Today, we use Mask R-CNN2Go to create useful and entertaining experiences on mobile devices, such as the hand tracking in the “Control the Rain” augmented reality (AR) effect in Facebook Camera.
We look forward to seeing creative AI-powered mobile experiences that our community creates with Mask R-CNN2Go. And as part of the MLPerf benchmark, Mask R-CNN2Go will help our community design and evaluate mobile and embedded devices for state-of-the-art ML inference.
We’d like to thank Peter Vajda, Peizhao Zhang, and all of those contributing to MLPerf and Mask R-CNN2Go.