At Facebook, my responsibilities include the quality of software design—not what the users see, but what the engineers work with every day. I have been thinking about software design most of my career, and I use specialized vocabulary in an effort to be both precise and concise. I assumed that everyone understood what I was talking about until one of our sharper engineers told me I was talking gibberish. In response, I prepared this glossary. It turned out to be a compact introduction to the important concepts software designers need to master.
An element is the generic term for things in the design. Processes, physical machines, modules, classes, functions, variables, statements, and expressions are all elements.
Elements are composed of smaller elements, and themselves make up larger elements. The generic term "element" is a reminder that the essential principles of design are scale free. "Architecture," "design," and "coding" are not particularly helpful artificial distinctions.
Two elements are coupled if changing one implies changing the other. For example, all the uses of a variable are coupled with respect to name changes because changing the name in one place requires changing it in all others. The type of a variable might create coupling or it might not, depending on the change.
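To make the point about types concrete, here is a minimal sketch in Python. The function names `total` and `add_price` are hypothetical and exist only for illustration. One function merely iterates over its argument, so changing the container type does not touch it. The other calls a list-specific method, so the same type change forces it to change too.

```python
def total(prices):
    # Only iterates over `prices`, so it is not coupled to the container type.
    return sum(prices)


def add_price(prices, price):
    # Calls append(), so it is coupled to `prices` being a list-like object.
    prices.append(price)
    return prices


# Changing the type of `prices` from list to tuple leaves total() untouched...
assert total([3, 4]) == total((3, 4))

# ...but add_price() would have to change, because tuples have no append().
try:
    add_price((3, 4), 5)
except AttributeError:
    print("add_price is coupled to the list type")
```

Whether the two functions count as coupled to the variable's type therefore depends on which change you have in mind.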
Coupling can be subtle (we often see examples of this at Facebook). Site events, where parts of the site stop working for a time, are often caused by nasty bits of coupling that no one expected—changing a configuration in system A causes timeouts in system B, which overloads system C. You could stare at the source code for A all day and never guess that it was coupled to C. Coupling can also be obvious but just not addressed—"if you change this file be sure to change that one over there too."
Coupling is expensive but some coupling is inevitable. The responsive designer eliminates coupling triggered by frequent or likely changes and leaves in coupling that doesn't cause problems in practice.
Let's say we have an element composed of sub-elements. The element is cohesive with respect to a change if all sub-elements have to change at the same time. A class is cohesive if a change to one function requires changes to all the others. Cohesion is the inverse of coupling. The more cohesive your elements, the less likely they are to be coupled to other elements.
Cohesion fanatics, if they have to change two lines in the middle of a function, extract a helper function with just those two lines before changing them. Using this vocabulary, they are creating a cohesive sub-element before making the change instead of making the change to the larger, less cohesive parent element. This practice can seem wasteful, but it makes seeing the impact of the two-line change much easier.
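As a rough illustration of that habit, the sketch below shows a function before and after the extraction. The invoice example and the names `invoice_total_before`, `compute_tax`, and `invoice_total` are invented for this purpose. The behavior is identical in both versions; the only difference is that the two lines about to change now live in their own small element.

```python
def invoice_total_before(items, tax_rate):
    subtotal = 0
    for price, quantity in items:
        subtotal += price * quantity
    # These two lines are the ones about to change.
    tax = subtotal * tax_rate
    tax = round(tax, 2)
    return subtotal + tax


def compute_tax(subtotal, tax_rate):
    # The extracted helper: a small, cohesive element holding just those lines.
    tax = subtotal * tax_rate
    return round(tax, 2)


def invoice_total(items, tax_rate):
    subtotal = 0
    for price, quantity in items:
        subtotal += price * quantity
    return subtotal + compute_tax(subtotal, tax_rate)
```

Once the helper exists, a later change to the tax rule shows up as a diff to `compute_tax` alone, which makes its impact easier to see and to review.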
Design changes are usually most efficiently implemented as a series of safe steps. Succession is the art of taking a single conceptual change, breaking it into safe steps, and then finding an order for those steps that optimizes safety, feedback, and efficiency.
We frequently migrate large amounts of data from one data store to another, to improve performance or reliability. These migrations are an example of succession, because there is no safe way to wave a wand and migrate the data in an instant. The succession we use is:
- Convert data fetching and mutating to a DataType, an abstraction that hides where the data is stored.
- Modify the DataType to begin writing the data to the new store as well as the old store.
- Bulk migrate existing data.
- Modify the DataType to read from both stores, checking that the same data is fetched and logging any differences.
- When the results match closely enough, return data from the new store and eliminate the old store.
You could theoretically do this faster as a single step, but it would never work. There is just too much hidden coupling in our system. Something would go wrong partway through, leading to a potentially disastrous situation of lost or corrupted data.
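The succession above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our production code: plain dictionaries stand in for the two data stores, and the `Phase` enum, the `mismatches` list, and the `bulk_migrate` helper are hypothetical names introduced for the example.

```python
from enum import IntEnum


class Phase(IntEnum):
    OLD_ONLY = 1     # all access goes through the DataType, old store only
    DUAL_WRITE = 2   # writes go to both stores
    SHADOW_READ = 3  # reads hit both stores and differences are recorded
    NEW_ONLY = 4     # the new store serves everything


class DataType:
    """Hypothetical abstraction that hides where the data is stored."""

    def __init__(self, old_store, new_store, phase=Phase.OLD_ONLY):
        self.old_store = old_store
        self.new_store = new_store
        self.phase = phase
        self.mismatches = []  # stands in for real logging

    def write(self, key, value):
        if self.phase < Phase.NEW_ONLY:
            self.old_store[key] = value
        if self.phase >= Phase.DUAL_WRITE:
            self.new_store[key] = value

    def read(self, key):
        if self.phase == Phase.NEW_ONLY:
            return self.new_store.get(key)
        value = self.old_store.get(key)
        if self.phase == Phase.SHADOW_READ:
            shadow = self.new_store.get(key)
            if shadow != value:
                self.mismatches.append((key, value, shadow))
        return value  # the old store stays authoritative until NEW_ONLY


def bulk_migrate(old_store, new_store):
    # Copy existing records without clobbering fresher dual-written values.
    for key, value in old_store.items():
        new_store.setdefault(key, value)
```

Each phase is a small, reversible change to one element. Callers never learn which store answered, so moving from one phase to the next does not ripple out into the rest of the code.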
Design evolution proceeds at the pace the community can absorb change, not the pace any single individual can initiate it. Socialization is a reminder to pay as much attention to social practices as to technical practices.
Latency is the amount of time before first seeing a feature. Experimenting with new features requires low latency because most of the features will last only a few hours or days before being discarded.
Throughput describes the number of features that can be implemented in a given amount of time. Polishing features and implementing simple variations of existing features require high throughput because engineers are in such short supply.
Variance is the unpredictable variation in latency and throughput.
Sometimes these three factors can conflict. You can prioritize throughput over latency by building a framework on the promise of cranking out features once it is done. You can prioritize latency over throughput by throwing each feature together as fast as possible. The responsive designer can shift priorities between latency and throughput, but generally the strategy is to invest continually in reducing variance by keeping the design clean (free of coupling) so as to optimize both latency and throughput.
Put in plainer language, the responsive designer can crank out a feature quickly when time is valuable (when the feature is risky), but knows to shift gears before the code becomes a tangled mess.
Kent Beck is an engineer on the privacy team.