A friend of mine asked for advice about a problem he was having at his job. Onboarding each new customer required significant effort. The process was automated, but it took a long time to execute. The delay was visible to any new customer who had just signed an expensive contract and didn't want to wait days or weeks for their new service.
Before I could offer my friend some advice, I took a minute to explain something about my dishwasher.
Most people use their dishwasher as explained in the manual. Dirty dishes collect in its racks over a span of time. Once it's full, someone adds soap and starts the load. Later, the clean dishes are emptied. Repeat.
I do it a little differently.
I load the soap as soon as the clean dishes are put away. Later, once it's time to start the wash cycle, I need only press the start button.
I prefer this approach for a few reasons. It reduces the chance I'll spill the soap powder since I tend to start the wash cycle late at night just before going to bed when I'm both sleepy and in a rush. That combination tends to make me sloppy. Thus, spilled soap.
With my system, a full soap dispenser is an indicator that the dishes are dirty. Have you ever had a family member or housemate ask, "Are the dishes in the dishwasher clean?" It isn't always obvious. Using the soap dispenser as the signal is both more accurate and convenient than a CLEAN/DIRTY magnet or some other mechanism requiring human intervention.
My soap-loading technique isn't revolutionary, and I don't think I'm going to win the Turing Award for this innovation. But it does demonstrate a point about process design: You can eliminate delays in starting a process by front-loading tasks whenever possible.
Front-loading is interesting because it changes when you do tasks but not their order. The process still involves a loop: load dishes, add soap, press the start button, empty the dishes, repeat. You've only changed your mental model of where the loop starts.
Now that you understand my amazing dishwasher technique, let's see how my friend might be able to add his soap ahead of time. He showed me a diagram of all the steps required to onboard a new customer. Three or four of these steps were generic enough to be done ahead of time.
This would save hours, which was significant and worth the engineering effort. Performing these steps ahead of time would also improve quality. In the old system, when people were rushing to fulfill the customer request, the automation performed only cursory quality-assurance checks. With the new design, any step done ahead of time would benefit from a longer, more rigorous testing cycle. The old system discouraged new tests. The new system encouraged more testing. For example, a new design could be run through disaster-recovery ("failover") tests, which can often take hours.
This then led to another design idea: Why not prebuild many instances and then hand them out as customer contracts are signed? The wait time visible to customers could be reduced from days to minutes.
Books such as The Phoenix Project (Gene Kim, et al.) advocate delaying variation to the end of the process. Auto manufacturers follow this approach. All cars of a particular model start out exactly the same. Variations such as interior colors and audio/entertainment packages are added at the end. Fast-food restaurants follow this approach as well. Burger King advertises that special orders don't upset them, but the only variations they offer are ones that can be accomplished just before the sandwich is wrapped.
Saving variations to the end makes it easier to manage defects. A generic unit with a defect can be moved to the side, repaired, and then put back on the assembly line. In the meantime, another generic item can take its place. Once a bespoke customization for a particular customer has been added, that flexibility is lost. In extreme cases, it's easier to simply throw the burger away.
Realizing this, my friend split the process into two systems: a slow, generic cluster builder and a fast customization engine. The first system focused on creating generic clusters, testing them, and then registering them in an inventory. It built a stockpile of clusters ready to be handed out. There was no need to rush this phase. Quality was more important than speed. We'll call this the "slow phase."
During the slow phase, you can take the time to do extensive testing. When failures are found, you can stop the process and take whatever time is necessary to study the problem, understand the failure, and fix it properly. Major problems can be resolved by deleting the cluster and starting over. Minor problems can be fixed before they become major problems.
This is similar to how the auto industry stops a production line to fix a small problem before it becomes a big problem. This is known as "pulling the Andon cord," referring back to a time when a physical cord was pulled to stop the line.
During the customization phase, meanwhile, my friend's process involved waiting for customer orders, picking a generic cluster from the stockpile, and then customizing it for the customer. Let's call this the "fast phase."
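To make the shape of this concrete, here's a minimal sketch in Python of how the two phases might decouple through nothing more than a shared inventory. (All the names and stub bodies here are my own invention, not my friend's actual system.)

```python
import queue

# The stockpile that connects the two phases.
inventory: queue.Queue = queue.Queue()

def run_extensive_tests(cluster: str) -> None:
    pass  # stand-in for hours of QA and failover testing

def customize(cluster: str, customer: str) -> None:
    pass  # stand-in for per-customer configuration

def slow_phase(count: int) -> None:
    """Build, test, and register generic clusters ahead of demand."""
    for i in range(count):
        cluster = f"cluster{i}"       # generic: no customer identity yet
        run_extensive_tests(cluster)  # no rush; quality over speed
        inventory.put(cluster)        # register in the stockpile

def fast_phase(customer: str) -> str:
    """When a contract is signed, grab a prebuilt cluster and customize it."""
    cluster = inventory.get()         # minutes, not days
    customize(cluster, customer)
    return cluster
```

The inventory is the whole trick: the slow side fills it at whatever pace quality demands, and the fast side drains it at whatever pace sales demands.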
Some customers, for example, require larger capacity than others. Originally, my friend's company believed that no work could start until the sales order was signed because that's when capacity became known. Given this assumption, the entire slow/fast design was not possible. But then someone observed that nearly all customers require the same capacity, with only a few outliers requiring larger capacity. So, the decision was made to use the slow phase to build standard clusters that then could be grown during the fast phase if necessary.
Another potential blocker was that the customer name was deeply embedded (or "tattooed") in the cluster configuration—which is to say the cloud provider had no way to rename clusters once they'd been built. This, too, was believed to be a blocker to the slow/fast design. But then the company decided to build all new clusters with generic names (cluster1, cluster2, cluster3, ...) and then assign customer-specific aliases during the fast phase. The introduction of aliases required only minor changes to downstream processes. For example, some third-party tools do not pay attention to aliases and thus need to be passed the actual name.
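The alias layer is tiny. Continuing the hypothetical names from the sketch above, the mapping is the only new state, and anything that ignores aliases can simply be handed the actual name:

```python
aliases: dict = {}  # customer-facing alias -> actual (tattooed) cluster name

def assign_alias(alias: str, actual: str) -> None:
    aliases[alias] = actual

def resolve(name: str) -> str:
    """Return the actual cluster name, for tools that ignore aliases."""
    return aliases.get(name, name)  # unaliased names pass through unchanged

assign_alias("acme-prod", "cluster1")
print(resolve("acme-prod"))  # -> cluster1
print(resolve("cluster2"))   # -> cluster2 (passes through)
```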
Let's popularize the slow/fast pattern. I've seen it in many deployment and service-delivery systems, from small ones such as VDI (virtual desktop infrastructure) deployments up to systems larger than the one described here. Sadly, what all of these have in common is that the slow/fast design was always part of a second-generation rewrite.
It's a shame we don't think to build systems this way from the start. I suppose this is because first-generation systems are built in haste. There's no time for architectural navel-gazing when you're tasked with automating a process after a flurry of orders has made it impossible to give each one individualized attention.
However, I think the true reason we don't think to use the slow/fast pattern is that it hasn't yet achieved enough popularity to be at the front of our minds. It isn't taught at the university level, it isn't discussed much in online forums, and—even when the pattern is used—it is often hidden from end users.
Which is to say ACM members could play a large role in popularizing this pattern.
The timing of when we do things is not set in stone. It only feels that way.
There's no rule that dishwasher soap must be loaded immediately before you start the wash cycle. But it's such a common practice that people tend to act as if such a rule exists.
The day you gather trash from bins around your house does not need to be the same day you put your trash bins at the curb. I find it easier to collect the trash on the weekends when I'm doing other chores.
A retail store does not start the day by making preparations for customers. The night before is when the facility is cleaned and the new merchandise is put out on display. Ideally, the morning shift simply opens the doors and is ready for normal business.
Notice that if you sleep late, people call you lazy. But if you go to sleep super early, you sleep just as much and yet people call you wise.
Examining the order of steps can even help you realize that something can be postponed until much, much later. In the best case, it might even be postponed long enough that it's never needed at all.
Before paperless billing, I used to fastidiously file away each utility, bank, and credit card statement in a filing cabinet. I had a separate folder for each utility, bank, credit-card company, and so on. Each folder contained past statements lovingly stored in chronological order. It was a lot of work, but I was sure that someday it would prove useful. Maybe I'd win a court case since I'd be able to swiftly calculate the exact amount I'd spent on groceries during the month of July 10 years earlier. I was young, optimistic, and stupid.
One day I realized that all my meticulous filing was eating up a considerable amount of time. In fact, I could reduce the time it took to process my monthly bills by 80 percent by simply not being so fastidious about how I stored old statements. Instead, I just piled the statements into a single folder, starting a new folder once the current one was full. I wouldn't bother to organize the statements until I actually needed some specific information.
This approach is what's referred to as "lazy evaluation" or "call-by-need" in programming languages. The win here is that, if the need never arises, we've saved a lot of time. In my case, the need to go through that folder never arose. And then, eventually, paperless billing eliminated the need for a filing cabinet altogether.
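In code, the same idea looks something like this. Here's a minimal Python sketch of lazy evaluation using my filing analogy (the class and its data are invented for illustration):

```python
from functools import cached_property

class StatementPile:
    """An unsorted pile; organizing is deferred until actually needed."""

    def __init__(self, statements: list):
        self.statements = statements  # filed in arrival order, i.e., not at all

    @cached_property
    def organized(self) -> list:
        # Runs only on first access. If we never look, we never sort.
        return sorted(self.statements)

pile = StatementPile(["2014-07 electric", "2013-02 bank", "2015-11 visa"])
# No sorting has happened yet; it happens only if this line ever runs:
print(pile.organized[0])
```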
There's also a different possible outcome. Sometimes we examine optional tasks only to discover they aren't actually optional. In this case, managing the optionality (yes, I just invented that word) turns out to be wasted work and complexity that can be eliminated.
I encountered this very situation recently when I was preparing to optimize some complex code in an open source project. There was an expensive string operation that the code avoided until it was sure the result would be required.
Avoiding the operation was good, but I thought I could do better: I would memoize (cache) the result, so that—if the value was needed a second time—I wouldn't have to repeat the operation. This would involve some complex cache-invalidation logic, as the language didn't support lazy evaluation. But, as we all know, cache invalidation is one of the two most difficult problems in computer science. I dreaded the bugs this might introduce to the system.
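For the record, the design I was dreading looked roughly like this. (All names are hypothetical; expensive_normalize stands in for the actual string operation.)

```python
def expensive_normalize(s: str) -> str:
    return s.strip().lower()  # stand-in for the expensive string operation

class Record:
    def __init__(self, raw: str):
        self._raw = raw
        self._cached = None  # sentinel: None means "not computed yet"

    @property
    def raw(self) -> str:
        return self._raw

    @raw.setter
    def raw(self, value: str) -> None:
        self._raw = value
        self._cached = None  # invalidate; forget this line and you have a bug

    @property
    def normalized(self) -> str:
        if self._cached is None:
            self._cached = expensive_normalize(self._raw)  # compute on demand
        return self._cached
```

The setter's invalidation line is exactly the kind of thing that gets forgotten when someone later adds a second way to mutate the object.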
This proved to be a good time to stop coding and start doing some analysis. Of all the inputs, what percentage actually required the expensive operation, and how often was the result accessed two or more times?
To my surprise and delight, the result was required for 100 percent of the inputs and was always used at least once. With that discovery, I knew I could simply do the operation for each input string upon arrival and then store both the original and the processed result. I then could also expose both as public attributes—with no need for the complexity of memoization and cache invalidation.
And yes, here was another opportunity to move a task up to an earlier point in a process. Moving the task up meant there was no need to test whether it had been done. The result was less code.
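Contrast the earlier sketch with the eager version (same hypothetical names). The cache, the sentinel value, and the invalidation logic all disappear:

```python
def expensive_normalize(s: str) -> str:
    return s.strip().lower()  # same stand-in as before

class Record:
    """Eager version: compute once at construction, expose both results."""
    def __init__(self, raw: str):
        self.raw = raw
        self.normalized = expensive_normalize(raw)  # always needed, so do it now
```

Less state, fewer code paths, nothing to invalidate.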
Which goes to show that rethinking the order and timing of tasks within a process can actually lead to significant improvements in efficiency and quality.
Whether this means speeding up your morning routine with a simple trick or overhauling a complex business process, the principles remain the same. By front-loading what you can, delaying what isn't critical, dividing work between slow and fast, and reducing complexity by reexamining optional work, you're able not only to optimize tasks but also to pave the way for smoother, more efficient days.
Thomas A. Limoncelli is a senior site reliability engineer at Stack Overflow Inc. He works from his home in New Jersey. His books include The Practice of Cloud Administration (https://the-cloud-book.com), The Practice of System and Network Administration (https://the-sysadmin-book.com), and Time Management for System Administrators (https://TomOnTime.com). He is @YesThatTom on BlueSky and blogs at YesThatBlog.com. He holds a B.A. in computer science from Drew University.
Copyright © 2025 held by owner/author. Publication rights licensed to ACM.
Originally published in Queue vol. 23, no. 1