What it is:
FioSynth is a benchmark tool that automates the execution of storage workload suites and parses the results. It contains a base set of block-level storage workloads, synthesized from production I/O traces, that simulate a diverse range of Facebook production services. It is useful for predicting how a storage device will perform in realistic production environments and for assisting with performance tuning.
Traditionally, synthetic I/O benchmark tools have been used to push flash drives and hard drives to their limits, where the drive that delivers the highest input/output operations per second (IOPS) or throughput wins. Facebook applications rarely push the peak IOPS or throughput boundaries of our drives, but they are highly sensitive to latency outliers. With this in mind, the majority of FioSynth workload suites are rate limited, based on I/O patterns that are commonly run on Facebook hardware. The drives that maintain the lowest outlier latency (P99, P99.99, P99.9999, etc.) while sustaining the expected I/O rates are considered the best performers, regardless of their peak IOPS and throughput capabilities.
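To make that selection rule concrete, here is a minimal Python sketch. It is not FioSynth's code: the `percentile` and `best_drive` helpers, the result fields (`achieved_iops`, `latencies_us`), and the nearest-rank percentile method are all assumptions chosen for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def best_drive(results, target_iops, pct=99.0):
    """Pick the drive with the lowest tail latency among those that
    sustained the target rate. Peak IOPS is deliberately ignored.

    `results` is a hypothetical structure, not a FioSynth format:
    {drive_name: {"achieved_iops": float, "latencies_us": [float, ...]}}
    """
    tail_latency = {
        name: percentile(r["latencies_us"], pct)
        for name, r in results.items()
        if r["achieved_iops"] >= target_iops  # must sustain the expected rate
    }
    if not tail_latency:
        return None  # no drive kept up with the workload
    return min(tail_latency, key=tail_latency.get)
```

Under these assumptions, a drive that reaches twice the target IOPS but has a few very slow requests loses to a drive that only just sustains the target rate with a tighter latency distribution.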
What it does:
FioSynth uses fio, a flexible I/O workload generator. We designed FioSynth to be easy to deploy, easy to run on a variety of storage devices and configurations, and easy to read results from. FioSynth workloads are scalable, and results can be used to directly compare the relative performance of direct-attached storage and disaggregated storage (iSCSI, AoE, etc.), small-capacity and large-capacity drives, non-RAID and RAID configurations, and so on.
Specific improvements that FioSynth provides include the following:
- It includes dozens of workload suites for flash drives and hard drives, representing the I/O profiles of a diverse range of services.
- Workloads scale with the available storage capacity on a per-terabyte basis. This allows the engineer to directly compare drives that have different capacities.
- FioSynth parses results from each workload suite and summarizes them in a single .csv file for easy viewing.
- In client/server mode, all workload suites can be executed on multiple hosts in parallel. Results from individual servers are stored in their own .csv files, while the combined summary results are stored in a separate .csv file.
- Optionally, the tool can be used to collect device health logs before and after each workload is executed. This is useful for calculating write amplification factor (a value representing the data physically written to flash media in relation to the data written by the host) for various workloads.
- In the workload suite definition files, there is a section for workloads that precondition the drives and a section for workloads used to measure drive performance. By default, the flash drives will be preconditioned before measuring drive performance, but preconditioning can be skipped through a command line option.
- The number of run cycles can be set in the workload suite definition file or overridden on the command line, so that each workload can be repeated to confirm that results are repeatable.
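Two of the items above involve simple arithmetic that is easy to show in code: per-terabyte scaling and write amplification factor. The Python sketch below is illustrative only. The function names and parameters are hypothetical, a terabyte is assumed to be 10^12 bytes, and FioSynth's actual scaling rules and health-log fields may differ.

```python
BYTES_PER_TB = 10**12  # assumption: decimal terabytes


def scale_rate(iops_per_tb, capacity_bytes):
    """Scale a per-terabyte I/O rate to a device of the given capacity,
    so that drives of different sizes see a proportional load."""
    return round(iops_per_tb * capacity_bytes / BYTES_PER_TB)


def write_amplification(host_before, host_after, nand_before, nand_after):
    """Write amplification factor over one workload run: bytes physically
    written to flash divided by bytes written by the host, using counters
    sampled before and after the run (hypothetical counter names)."""
    host_written = host_after - host_before
    nand_written = nand_after - nand_before
    if host_written <= 0:
        raise ValueError("host wrote no data during the run")
    return nand_written / host_written
```

With these definitions, a workload specified at 5,000 IOPS per terabyte would run at 10,000 IOPS on a 2 TB drive and 20,000 IOPS on a 4 TB drive, and a run in which the host wrote 1,000 units while the flash media absorbed 2,500 would have a write amplification factor of 2.5.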
Why it matters:
FioSynth has been used internally at Facebook for years to identify performance inefficiencies in flash devices before they go live in the data center and to reproduce functional issues that were previously observed in production. Some flash device suppliers have used this benchmark tool to optimize performance of their own drives to meet the performance targets specified in the Open Compute Project’s NVMe Cloud SSD Specification.
We’ve made FioSynth available to the open source community to make it easier for storage device manufacturers to optimize their drives to perform well in hyperscale applications, to help standardize flash preconditioning methodology, and to make it easier to run and visualize the results of storage benchmarks.