Facebook believes in the power of open source: we frequently contribute projects to the community, and we also use good open source software when it fits our needs. At the beginning of 2014, we started migrating our data center DHCP infrastructure from ISC dhcpd to the open source ISC Kea.
Angelo Failla, a production engineer at Facebook in Dublin, Ireland, gave a detailed account of Facebook’s migration. You can read the whole write-up on the ISC blog site, check out a few key parts below, or watch the video recording of the talk he gave at SREConEurope15.
We use DHCP for provisioning servers in our production data centers. We use it both for bare-metal provisioning (to install the operating system) and to assign addresses to the out-of-band management interfaces.
Our old system was based on ISC dhcpd and static configuration files generated from our inventory system. We loaded them into our DHCP servers using a complex, periodic git/rsync-based pipeline, restarting the DHCP servers to pick up the changes.
This took longer than we wanted:
At our scale, parts (both NICs and servers) are being added or replaced all the time. The DHCP servers were spending more time restarting to pick up the changes than serving actual traffic. The reconfiguration pipeline was also slow: changes sometimes took about 3 hours to propagate, slowing down repair times in the data centers.
In short, we wanted a faster way to bootstrap hardware in our data centers after maintenance or expansion.
We liked the fact that ISC Kea is modern software and is designed to be extensible. Kea has hook points where you can add your own logic to parse incoming DHCP packets and to modify outgoing ones as you like right before they leave the server's network interface. We leveraged the hooks feature extensively to customize Kea to meet our requirements.
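To make the hook-point idea concrete, here is a minimal Python sketch of the pattern. It is not Kea's API (Kea hooks are written as C++ shared libraries) and it is not our production code. The names `pkt4_receive` and `pkt4_send` mirror Kea hook point names, but `HookRegistry`, `handle_request`, the dictionary packet format, and the boot file name are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical packet representation: a plain dict of field name -> value.
Packet = Dict[str, str]
Callout = Callable[[Packet], None]


class HookRegistry:
    """Maps a hook point name to the callouts registered on it."""

    def __init__(self) -> None:
        self._callouts: Dict[str, List[Callout]] = defaultdict(list)

    def register(self, hook_point: str, callout: Callout) -> None:
        self._callouts[hook_point].append(callout)

    def run(self, hook_point: str, packet: Packet) -> None:
        # Callouts run in registration order and may modify the packet in place.
        for callout in self._callouts[hook_point]:
            callout(packet)


def normalize_mac(packet: Packet) -> None:
    """Custom logic applied to an incoming packet."""
    packet["mac"] = packet["mac"].lower()


def add_boot_file(packet: Packet) -> None:
    """Custom logic applied to a response just before it is sent."""
    packet["boot_file"] = "pxelinux.0"  # illustrative value only


def handle_request(hooks: HookRegistry, request: Packet) -> Packet:
    """Toy server loop body: run hooks around a trivially built response."""
    hooks.run("pkt4_receive", request)
    response: Packet = {"mac": request["mac"], "type": "OFFER"}
    hooks.run("pkt4_send", response)
    return response
```

Registering `normalize_mac` on `pkt4_receive` and `add_boot_file` on `pkt4_send` changes the server's behavior without touching `handle_request` itself, which is the property that made hooks attractive to us.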
We wanted to centralize as much configuration data as possible and run a stateless DHCP service. We planned to deploy in Tupperware (our Linux container technology, roughly equivalent to Google’s Borg). We didn’t want to package long configuration files with the application, nor did we want to maintain this data in multiple places on the network.
What we have developed is simple and fast to deploy: we just install the Kea binary with a very basic configuration file, and it fetches all the rest of the information dynamically from our inventory system. We maintain the client configuration information, such as host allocations and subnets, centrally in our inventory system. This simplifies DHCP server deployment and ongoing configuration maintenance.
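The stateless design can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our actual implementation: `HostRecord`, `Inventory`, `InMemoryInventory`, and `answer` are hypothetical names, the in-memory dictionary stands in for the real inventory system, and the addresses come from the documentation range 192.0.2.0/24.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class HostRecord:
    """Per-host data kept centrally (hypothetical shape)."""
    ip: str
    subnet: str


class Inventory(Protocol):
    """Anything that can resolve a MAC address to a host record."""

    def lookup(self, mac: str) -> Optional[HostRecord]: ...


class InMemoryInventory:
    """Stand-in for the central inventory system."""

    def __init__(self, hosts: Optional[Dict[str, HostRecord]] = None) -> None:
        self._hosts: Dict[str, HostRecord] = dict(hosts or {})

    def lookup(self, mac: str) -> Optional[HostRecord]:
        return self._hosts.get(mac.lower())

    def upsert(self, mac: str, record: HostRecord) -> None:
        self._hosts[mac.lower()] = record


def answer(inventory: Inventory, mac: str) -> Optional[Dict[str, str]]:
    """Build a reply from inventory data fetched at request time.

    The server keeps no host table of its own, so there is nothing
    to regenerate or reload when the inventory changes.
    """
    record = inventory.lookup(mac)
    if record is None:
        return None  # unknown host: no reply
    return {"mac": mac.lower(), "ip": record.ip, "subnet": record.subnet}
```

Because every request consults the inventory, a host added with `upsert` is served on the very next request, with no configuration push and no server restart.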