Facebook updates the site with new features, product improvements, and bug fixes every workday. This can sometimes be a huge challenge, given that there are hundreds of engineers working on thousands of changes every week, and many of those changes immediately impact the more than 800 million people using Facebook worldwide. But Chuck Rossi, who has worked in release engineering for over 20 years and started as Facebook’s very first release engineer in 2008, helps make it all happen. Read on to learn about the team behind the daily push and the tools and processes they built to make it possible.
Q: How did a daily push become part of Facebook’s engineering culture?
A: It’s something that grew out of the way Mark and the early engineers set things up—a very lightweight process that allows us to iterate quickly to get new features out. While there have certainly been some bumps, I’m very proud of the fact that we’ve built a system that supports so many changes on such a large scale.
Q: How do you keep things moving fast as the company grows?
A: When I came to Facebook in 2008, I was the only release engineer. Now that the company is so much bigger, my small team and I are focused on how we can maintain the culture and the tools that let us operate as quickly and efficiently as possible. Every day, we’re trying to answer the question, “How can we move faster?” This means expanding the role of release engineering, improving and expanding the test automation systems, and increasing push frequency.
We do these things because engineers build really quickly here. I don’t want my systems to be a bottleneck or a hindrance, but at the same time, I want to manage risk so that people using the site have the best experience possible. No one should have to deal with downtime or silly bugs. To assess how risky changes might be at a very meta level, I can look at how big a diff is and how much discussion there has been around it. There are also stars that indicate an engineer’s push karma. Everyone is born with 4 stars, and you can only go down from there. People with low push karma have a higher bar to get into the daily push.
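The risk signals Chuck describes (diff size, amount of discussion, and push karma) could be combined into a rough score. The sketch below is purely illustrative and is not Facebook's actual tooling: the `Change` class, the weights, and the review threshold are all assumptions. Only the 4-star starting karma comes from the interview.

```python
from dataclasses import dataclass

MAX_KARMA = 4  # from the interview: everyone starts with 4 stars


@dataclass
class Change:
    """Hypothetical record for one diff headed for the daily push."""
    author: str
    lines_changed: int           # how big the diff is
    comment_count: int           # how much discussion it generated
    karma_stars: int = MAX_KARMA # karma can only go down from 4


def risk_score(change, size_weight=0.01, discussion_weight=0.5, karma_weight=2.0):
    """Combine the three signals into one number (weights are assumed)."""
    # Clamp karma into the valid 0..4 range before using it.
    stars = max(0, min(change.karma_stars, MAX_KARMA))
    return (
        size_weight * change.lines_changed
        + discussion_weight * change.comment_count
        + karma_weight * (MAX_KARMA - stars)
    )


def needs_extra_review(change, threshold=5.0):
    """Low karma raises the score, which raises the bar for the push."""
    return risk_score(change) >= threshold
```

With these assumed weights, a 20-line diff with one comment from a 4-star engineer stays well under the threshold, while a 400-line diff with six comments from a 2-star engineer would be flagged for a closer look.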
Q: How do you gear up for the daily push?
A: I have a secret weapon to battle the tough commuting conditions here in Silicon Valley, so I’m usually at work by 8:45 a.m. I use this time to get things organized, fight any fires with my infrastructure, or work on any open tasks I have. Some days I give two different onboarding classes for new engineers, which is something I enjoy. I think my onboarding sessions are critical because I really try to instill the company culture of being responsible for your changes from beginning to end. From the time they check their code into our source base until it’s out in front of their moms, our engineers understand that they’re the ones who need to make sure it works.
Starting around 1 p.m., I switch over to “operations mode” and work with my team to get ready to launch the changes that are going out to facebook.com that day. This is the more stressful part of the job and really relies on my team’s judgement and past experience. We work to make sure that everyone who has changes going out is accounted for and is actively testing and supporting their changes. We look for anything that might introduce risk so we can pay more attention to those areas during the release. It’s like playing air traffic controller—a lot of things are in the air and trying to land at once. And they’re all important.
If everything looks good and our test dashboards and canary tests are green, we push the big red button and the entire facebook.com server fleet gets the new code delivered. Within 20 minutes, thousands and thousands of machines are up on new code with no visible impact to the people using the site.
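The "everything is green" check before the push can be pictured as a simple gate. This is a minimal sketch under stated assumptions, not the real release tooling: the function names, the `"green"` status string, and the dictionary shapes are all hypothetical.

```python
def push_blockers(dashboards, canaries):
    """List everything that is not green.

    dashboards: maps a dashboard name to a status string (assumed format).
    canaries:   maps a canary host name to True if its tests passed.
    """
    blockers = [
        f"dashboard:{name}"
        for name, status in sorted(dashboards.items())
        if status != "green"
    ]
    blockers += [
        f"canary:{host}"
        for host, passed in sorted(canaries.items())
        if not passed
    ]
    return blockers


def ready_to_push(dashboards, canaries):
    """Allow the push only when there is data and nothing is blocking."""
    # Missing data is treated as "not ready" rather than "all clear".
    if not dashboards or not canaries:
        return False
    return not push_blockers(dashboards, canaries)
```

Returning the list of blockers, rather than a bare yes or no, gives whoever is running the push something concrete to chase down before trying again.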
Q: What about the dev process at Facebook lets you get stuff done?
A: I live and die by the developer and operations tools we’ve built here. We’ve been really aggressive about automating as much of the dev and deploy environments as possible. Because of that, I’m able to get significant changes out to facebook.com every day.
Q: Why do you come to work in the morning?
A: I can’t get over the impact that Facebook has had on the world. Since I started here, I’ve taken the quality of people’s experiences on the site very personally. I want everyone to experience Facebook as a solid, reliable service. If something doesn’t go as planned, I document what happened and we do a postmortem to figure out how we got into that situation and how we’ll improve to make sure it doesn’t happen again. I’m also compelled by the impact I have as an engineer here, mainly knowing that on the other side of my ‘Enter’ key (on an old IBM keyboard my wife gave me for my birthday 14 years ago), there are thousands of machines and more than 845 million people who will be affected by the actions I take.
Q: What advice do you have for other engineers?
A: Ship early and ship often. But tell the releng team what you’re doing—we hate surprises.
To learn more about Chuck and the Release Engineering team, check out his tech talk here and watch him do a live push here.