It’s part of the Facebook Engineering ethos to buy what makes sense and build what we need to. In the last year or so, we’ve seen this play out many times with our switch and software designs.
Last year, our network engineering team added a BMC to the Top-of-Rack (ToR) switch code-named “Wedge” so that the switches could be managed in a manner similar to our compute and storage servers. In March, we open-sourced OpenBMC, our own version of the Baseboard Management Controller (BMC) software stack. Today, I want to share that we’ve added a series of new features to OpenBMC, most notably support for the multi-node 1S server platform code-named “Yosemite.”
In porting OpenBMC to our multi-node server Yosemite, we made some major design choices, including:
- Removing the RMCP+ protocol as an OOB mechanism: This prevents all the known security vulnerabilities of this protocol from being exploited in our infrastructure.
- Adding SSH support: The BMC supports SSH login so that a user can run various utilities for managing and/or debugging the system. This interface is secured in the same way as on our compute and storage servers.
- Adding REST API support: A REST API over an http(s) connection is provided for managing the system, and it uses JSON objects for exchanging information instead of raw bytes.
These choices addressed a few issues. The first involved ipmitool, which we use to manage servers and which lets the user send various IPMI commands in band and/or out of band (OOB). For the standard IPMI commands, ipmitool provides user-friendly, tool-friendly wrappers around the IPMI command bytes; for example, ‘ipmitool mc info’ reads BMC information. But for features implemented using IPMI OEM commands, the only option is ipmitool’s raw capability, which has the user send the complete IPMI request packet as raw bytes and receive the IPMI response packet as raw bytes. Interpreting these raw bytes is nontrivial and prone to mistakes. We wanted to solve this usability problem.
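To make the raw-bytes problem concrete, here is a minimal sketch of what a caller has to do by hand. It uses the standard Get Device ID command (the one behind ‘ipmitool mc info’) because its response layout is public; OEM commands have vendor-specific layouts that are not described here. The function name and the sample bytes are made up for illustration.

```python
def parse_get_device_id(raw: str) -> dict:
    """Decode the data bytes of a Get Device ID response given as hex text.

    The input mimics the space-separated hex that a raw IPMI request
    prints (completion code already stripped).
    """
    b = bytes(int(token, 16) for token in raw.split())
    if len(b) < 11:
        raise ValueError("Get Device ID response needs at least 11 data bytes")
    return {
        "device_id": b[0],
        "device_revision": b[1] & 0x0F,              # low nibble only
        # Major revision is 7 bits binary; minor revision is BCD encoded.
        "firmware_revision": f"{b[2] & 0x7F}.{b[3]:02x}",
        # IPMI version is BCD: low nibble is major, high nibble is minor.
        "ipmi_version": f"{b[4] & 0x0F}.{b[4] >> 4}",
        # Manufacturer ID is 3 bytes, least significant byte first.
        "manufacturer_id": b[6] | (b[7] << 8) | (b[8] << 16),
        # Product ID is 2 bytes, least significant byte first.
        "product_id": b[9] | (b[10] << 8),
    }


# Hypothetical sample bytes, not captured from real hardware.
sample = "20 81 01 05 02 bf 57 01 00 01 00"
info = parse_get_device_id(sample)
print(info["firmware_revision"], info["ipmi_version"])  # 1.05 2.0
```

Every field needs its own masking, BCD, or byte-order rule, which is the kind of detail a JSON response can carry for the caller.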
On the security and authentication front, a typical BMC uses the RMCP/RMCP+ management protocol, and it’s well known that a set of security vulnerabilities is inherent in this protocol. In addition, authentication depends on the BMC’s internal username-password database, which is hard to keep secure and reliably available at scale during periodic password updates.
Want details on the new features? There are two ways to get more information. We’ve listed out some feature details below, and you can see a demo at our Intel Developer Forum (IDF) booth.
IPMI FRUID support
The Yosemite platform contains various Field Replaceable Units (FRUs), such as the Side Plane Board (SPB), the OCP Mezzanine Card, and four 1S server boards. Each of these FRUs has an EEPROM that contains static information like manufacturer name, manufacturing date, part number, serial number, asset tag, and the like. The BMC needs to parse this information to understand the type of FRU that is in place and provide services accordingly. We have added support libraries and a utility in OpenBMC to read and parse the content from all these FRUs.
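As an illustration of the first parsing step, the sketch below decodes the 8-byte common header defined by the IPMI FRU information storage format: a format version, five area offsets in multiples of 8 bytes, a pad byte, and a checksum that makes the header sum to zero modulo 256. This is not the OpenBMC library code; the function name and the sample header are assumptions made for the example.

```python
AREA_NAMES = ["internal_use", "chassis_info", "board_info",
              "product_info", "multirecord"]


def parse_fru_common_header(eeprom: bytes) -> dict:
    """Return the byte offset of each area present in a FRU EEPROM image."""
    header = eeprom[:8]
    if len(header) < 8:
        raise ValueError("FRU common header is 8 bytes long")
    if sum(header) % 256 != 0:
        raise ValueError("FRU common header checksum mismatch")
    if header[0] & 0x0F != 0x01:
        raise ValueError("unsupported FRU format version")
    # Offsets are stored in multiples of 8 bytes; 0 means the area is absent.
    return {
        name: offset * 8
        for name, offset in zip(AREA_NAMES, header[1:6])
        if offset != 0
    }


# Hypothetical header: board area at byte 8, product area at byte 32.
sample_header = bytes([0x01, 0x00, 0x00, 0x01, 0x04, 0x00, 0x00, 0xFA])
areas = parse_fru_common_header(sample_header)
print(areas)  # {'board_info': 8, 'product_info': 32}
```

The fields the post lists (manufacturer name, part number, serial number, and so on) live inside the board and product areas that these offsets point to.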
Yosemite contains analog sensors (like current, power, voltage, temperature) on various FRUs like Side Plane Board, Mezzanine Card, and all four 1S server boards. We added a daemon to monitor these sensors against predefined threshold values and log an error message in case of threshold violation.
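A threshold check of this kind can be sketched as follows. This is only an outline of the idea, not the daemon itself: the sensor names, the limit values, and the message format are invented for the example, and a real monitor would read the hardware in a loop and write to the system log instead of returning strings.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Thresholds:
    lower_critical: Optional[float] = None
    upper_critical: Optional[float] = None


def check_reading(sensor: str, value: float, limits: Thresholds) -> Optional[str]:
    """Return an error message if the reading violates a threshold, else None."""
    if limits.upper_critical is not None and value > limits.upper_critical:
        return f"{sensor}: reading {value} above upper critical {limits.upper_critical}"
    if limits.lower_critical is not None and value < limits.lower_critical:
        return f"{sensor}: reading {value} below lower critical {limits.lower_critical}"
    return None


def poll_once(readings: Dict[str, float], limits: Dict[str, Thresholds]) -> List[str]:
    """Check one snapshot of readings and collect the violation messages."""
    messages = []
    for sensor, value in readings.items():
        message = check_reading(sensor, value, limits[sensor])
        if message is not None:
            messages.append(message)
    return messages


# Hypothetical sensors and limits.
LIMITS = {
    "SPB_INLET_TEMP": Thresholds(upper_critical=40.0),
    "SPB_P12V": Thresholds(lower_critical=11.4, upper_critical=12.6),
}
errors = poll_once({"SPB_INLET_TEMP": 45.0, "SPB_P12V": 12.0}, LIMITS)
```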
A typical BMC supports only one server node (or payload, in IPMI terminology) at a time, following the original IPMI architecture. In the case of Yosemite, OpenBMC has to support four independent server boards/nodes. To support this multi-node model, we updated various software modules in OpenBMC to handle multiple nodes. The in-band IPMI request message now includes the payload number as a parameter so that the IPMI stack can handle the request for a given payload and respond accordingly. Similarly, the IPMB framework uses multiple daemons to handle simultaneous IPMB requests to and from the Bridge-IC interfaces of the four 1S servers. Most of the utilities also now accept a node number as an argument so that the user can initiate an action for a given node, e.g., power on/off a given server board, request Serial-Over-LAN for a given server board, etc.
Ethernet driver enhancements
We have added Network Controller Sideband Interface (NCSI) support to the Ethernet driver so that it can make use of the sideband interface on a multi-host Network Interface Controller (NIC) similar to ConnectX-4 (CX-4).
I2C driver secondary support
We enhanced the I2C driver to support secondary mode on all buses independently, as each 1S server board is connected to the BMC using a dedicated I2C bus. The BMC and the Bridge-IC on a 1S server communicate over the I2C bus using the Intelligent Platform Management Bus (IPMB) protocol. This protocol needs the I2C bus to operate in multi-primary mode, since either the Bridge-IC or the BMC can initiate an IPMB transaction at any time.
We added an IPMB framework to allow the BMC and Bridge-IC management controllers to communicate. The framework provides a library that a BMC application can use to send an IPMB request to the Bridge-IC and get a response. It also has receive-side daemons that handle incoming requests from the Bridge-IC and send responses back.
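For readers unfamiliar with IPMB, the sketch below shows how a request is framed according to the IPMB protocol: a connection header (responder address, NetFn/LUN) protected by its own checksum, followed by the requester address, sequence number, command, and data, protected by a second checksum. It is not the OpenBMC library API; the function names and the addresses in the example are assumptions chosen for illustration.

```python
def ipmb_checksum(data: bytes) -> int:
    """Two's-complement checksum: data plus checksum sums to 0 modulo 256."""
    return (-sum(data)) & 0xFF


def build_ipmb_request(rs_addr: int, netfn: int, cmd: int, rq_addr: int,
                       seq: int, payload: bytes = b"",
                       rs_lun: int = 0, rq_lun: int = 0) -> bytes:
    """Frame an IPMB request with both checksums filled in."""
    header = bytes([rs_addr, (netfn << 2) | rs_lun])
    header += bytes([ipmb_checksum(header)])
    body = bytes([rq_addr, (seq << 2) | rq_lun, cmd]) + payload
    return header + body + bytes([ipmb_checksum(body)])


def verify_ipmb_frame(frame: bytes) -> bool:
    """Check length and both checksums of a received IPMB frame."""
    return (len(frame) >= 7
            and sum(frame[:3]) % 256 == 0
            and sum(frame[3:]) % 256 == 0)


# Get Device ID (NetFn 0x06, command 0x01); both addresses are illustrative.
frame = build_ipmb_request(rs_addr=0x20, netfn=0x06, cmd=0x01,
                           rq_addr=0x10, seq=1)
print(frame.hex())  # 2018c8100401eb
```

Because either side can start a transaction, a receiver would run a check like `verify_ipmb_frame` on every incoming frame before dispatching it.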
IPMI SDR support
We added support libraries for reading and interpreting the Sensor Data Records (SDRs) in OpenBMC. In addition, we provided a caching daemon that reads the SDRs from all four 1S servers and stores them locally in BMC storage. Since the SDRs are static, this caching strategy avoids re-reading them from each server at run time, which provides a large performance benefit.
Discrete signal monitoring
The Bridge-IC on each 1S server provides a CPU-agnostic interface for reading various discrete signals that indicate error conditions on the server board, e.g., Power Good, VRHOT, Thermal Trip, Processor HOT. We added a daemon to monitor any changes in these critical discrete signals and log error messages accordingly for further analysis.
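Change detection on discrete signals can be sketched as a comparison of two snapshots. The bit assignment below is hypothetical (the post does not say how the Bridge-IC reports these signals), and a real daemon would poll the Bridge-IC and write to the log instead of returning a list.

```python
from typing import List

# Hypothetical bit positions for the signals named in the post.
SIGNALS = {
    0: "Power Good",
    1: "VRHOT",
    2: "Thermal Trip",
    3: "Processor HOT",
}


def diff_signals(previous: int, current: int) -> List[str]:
    """Describe every monitored signal whose state changed between snapshots."""
    changed = previous ^ current          # bits that differ
    events = []
    for bit, name in SIGNALS.items():
        mask = 1 << bit
        if changed & mask:
            state = "asserted" if current & mask else "deasserted"
            events.append(f"{name} {state}")
    return events


events = diff_signals(0b0001, 0b0101)
print(events)  # ['Thermal Trip asserted']
```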
One of the user interfaces for OpenBMC is the ability for users to log in to the BMC using secure shell (SSH). After logging in, users can use various utilities we added to manage the platform. These include:
- ‘fruid-util’, which allows the user to view the IPMI FRUID information for the Side Plane Board, the Mezzanine Card, and the four 1S servers.
- ‘sol.sh’, which allows the user to connect to a specific 1S server board’s console port for debugging.
- ‘power-util’, which allows the user to check whether a given 1S server is powered on, and to power on, power off, power cycle, or gracefully shut down a given server.
- ‘sensor-util’, which allows the user to read the current value of the sensors on a given FRU, e.g., the Side Plane Board, the Mezzanine Card, or one of the four 1S server boards.
With this update, OpenBMC now provides a REST API as the programmatic interface for managing the server. The endpoints are designed to be simple, consistent, and discoverable, and to reflect the hardware topology of the Yosemite server platform. Each endpoint provides three attributes, with optional content for each: the ‘Information’ attribute provides a set of key-value pairs that describe a given resource node; the ‘Actions’ attribute provides a list of actions that can be taken on this resource node; and the ‘Resources’ attribute provides a list of nodes discoverable from this resource node. With this approach, the Yosemite platform can be represented as a tree of resource nodes that closely reflects the hardware topology. Example REST API endpoints are /api, /api/spb, /api/spb/bmc, /api/spb/fruid, /api/server1, /api/server2, /api/server3/fruid, /api/server4/sensors, etc.
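The discoverability property can be illustrated with a small in-memory model of such a tree. The three attribute names and the endpoint paths come from the description above; everything else (the key-value contents, the action names, and the helper functions) is invented for the example and does not reproduce actual OpenBMC responses.

```python
import json
from typing import Dict, List


def make_node(information=None, actions=None, resources=None) -> dict:
    """Build a resource node with the three attributes every endpoint exposes."""
    return {
        "Information": information or {},
        "Actions": actions or [],
        "Resources": resources or [],
    }


# Hypothetical, partial model of the Yosemite resource tree.
TREE: Dict[str, dict] = {
    "/api": make_node({"Description": "API entry point"},
                      resources=["spb", "server1"]),
    "/api/spb": make_node({"Description": "Side Plane Board"},
                          resources=["bmc", "fruid"]),
    "/api/spb/bmc": make_node({"Description": "BMC"}),
    "/api/spb/fruid": make_node({"Description": "SPB FRUID"}),
    "/api/server1": make_node({"Description": "1S server 1"},
                              actions=["power-on", "power-off"],
                              resources=["fruid", "sensors"]),
    "/api/server1/fruid": make_node({"Description": "Server 1 FRUID"}),
    "/api/server1/sensors": make_node({"Description": "Server 1 sensors"}),
}


def discover(tree: Dict[str, dict], path: str = "/api") -> List[str]:
    """Walk the 'Resources' lists depth-first to enumerate reachable endpoints."""
    found = [path]
    for child in tree[path]["Resources"]:
        found.extend(discover(tree, f"{path}/{child}"))
    return found


endpoints = discover(TREE)
print(json.dumps(TREE["/api"], indent=2))
```

A client that knows only /api can reach every other endpoint by following ‘Resources’, so it never needs a hard-coded list of paths.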
These changes aren’t the end of the road for OpenBMC. As we make future changes, we’ll be sure to update you in this space. As with any new features we share with the community, we’re excited to hear your feedback. We’ll be gradually making changes to the GitHub repo — look for them here.
In an effort to be more inclusive in our language, we have edited this post to replace the terms “master” and “slave” with “primary” and “secondary”.