Everything Sysadmin

The Time I Stole $10,000 from Bell Labs:
Or why DevOps encourages us to celebrate outages

If IT workers fear they will be punished for outages, they will adopt behavior that leads to even larger outages. Instead, we should celebrate our outages: Document them blamelessly, discuss what we've learned from them openly, and spread that knowledge generously. An outage is not an expense. It is an investment in the people who have learned from it. We can maximize that investment through management practices that maximize learning for those involved and by spreading that knowledge across the organization. Managed correctly, every outage makes the organization smarter.

by Thomas A. Limoncelli | November 11, 2020

Topic: Performance


Five Nonobvious Remote Work Techniques:
Emulating the efficiency of in-person conversations

The physical world has social conventions around conversations and communication that we use without even thinking. As we move to a remote-work world, we have to be more intentional to create such conventions. Developing these social norms is an ongoing commitment that outlasts initial technical details of VPN and desktop videoconference software configuration. Companies that previously forbade remote work can no longer deny its benefits. Once the pandemic-related lockdowns are over, many people will continue working remotely. Those who return to the office will need to work in ways that are compatible with their remotely working associates.

by Thomas A. Limoncelli | August 12, 2020


Communicate Using the Numbers 1, 2, 3, and More:
Leveraging expectations for better communication

People often use lists of various sizes when communicating. I might have 2 reasons for supporting the new company strategy. I might tell you my 3 favorite programming languages. I might make a presentation that describes 4 new features. There is 1 vegetable that I like more than any other. The length of the list affects how the audience interprets what is being said. Not aligning with what the human brain expects is like swimming upstream. Given the choice, why would anyone do that?

by Thomas A. Limoncelli | March 11, 2020

Topic: Business/Management


API Practices If You Hate Your Customers:
APIs speak louder than words.

Do you have disdain for your customers? Do you wish they would go away? When you interact with customers, are you silently fantasizing about them switching to your competitor’s product? In short, do you hate your customers? In this article, I document a number of industry best practices designed to show customers how much you hate them. All of them are easy to implement. Heck, your company may be doing many of these already.

by Thomas A. Limoncelli | December 10, 2019

Topic: API Design


Demo Data as Code:
Automation helps collaboration.

A casual request for a demo dataset may seem like a one-time thing that doesn’t need to be automated, but in reality it is a collaborative process requiring multiple iterations and experimentation. There will undoubtedly be requests for revisions big and small, a need to match changing software, and a need to support new and revised demo stories. All of this makes automating the process worthwhile. Modern scripting languages make it easy to create ad hoc functions that act like a little language. A repeatable process helps collaboration, enables delegation, and saves time now and in the future.

by Thomas A. Limoncelli | August 5, 2019

Topic: Business/Management
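
As a rough illustration of "ad hoc functions that act like a little language," here is a minimal Python sketch. Every name in it (`make_demo_data`, `customer`, `orders`, the sample companies) is hypothetical and not taken from the article; the fixed random seed is one assumed way to keep the dataset repeatable between runs.

```python
import random

def make_demo_data(seed=42):
    """Build a small, repeatable demo dataset (all names here are hypothetical)."""
    rng = random.Random(seed)  # fixed seed so every run yields the same data
    data = {"customers": [], "orders": []}

    # Helper functions that read like a tiny domain-specific language.
    def customer(name, plan="free"):
        record = {"id": len(data["customers"]) + 1, "name": name, "plan": plan}
        data["customers"].append(record)
        return record

    def orders(cust, count):
        for _ in range(count):
            data["orders"].append(
                {"customer_id": cust["id"], "amount": rng.randint(5, 500)}
            )

    # The "demo story": revising the demo means editing these few lines.
    acme = customer("Acme Corp", plan="enterprise")
    orders(acme, 3)
    solo = customer("Solo Dev")
    orders(solo, 1)
    return data
```

Because the story lives in a few declarative lines, a requested revision becomes a small edit followed by a rerun rather than hand-editing the data.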


Tom’s Top Ten Things Executives Should Know About Software:
Software acumen is the new norm.

Software is eating the world. To do their jobs well, executives and managers outside of technology will benefit from understanding some fundamentals of software and the software-delivery process.

by Thomas A. Limoncelli | April 14, 2019

Topic: Business/Management


SQL is No Excuse to Avoid DevOps:
Automation and a little discipline allow better testing, shorter release cycles, and reduced business risk.

Using SQL databases is not an impediment to doing DevOps. Automated schema management and a little developer discipline enable more vigorous and repeatable testing, shorter release cycles, and reduced business risk. When you can confidently deploy new releases, you do it more frequently. New features that previously sat unreleased for weeks or months now reach users sooner. Bugs are fixed faster. Security holes are closed sooner. All of this enables the company to provide better value to customers.

by Thomas A. Limoncelli | December 12, 2018

Topic: Testing
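
One common way to automate schema management is to keep numbered migrations and record which ones have been applied. The sketch below uses Python's standard `sqlite3` module purely for illustration; the `MIGRATIONS` list, the `schema_version` table, and the `migrate` function are assumptions made for this example, not the approach the article itself prescribes.

```python
import sqlite3

# Hypothetical ordered migrations; a real project would keep these in versioned files.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE users ADD COLUMN email TEXT"),
]

def migrate(conn):
    """Apply migrations not yet recorded, in version order; return the versions applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    done = {row[0] for row in conn.execute("SELECT version FROM schema_version")}
    applied = []
    for version, sql in MIGRATIONS:
        if version not in done:
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (version,)
            )
            applied.append(version)
    conn.commit()
    return applied
```

Running `migrate` twice against the same database applies each change only once, which is what makes the step safe to include in automated tests and deployments.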


GitOps: A Path to More Self-service IT:
IaC + PR = GitOps

GitOps lowers the bar for creating self-service versions of common IT processes, making it easier to meet the return in the ROI calculation. GitOps not only achieves this, but also encourages desired behaviors in IT systems: better testing, reduction of bus factor, reduced wait time, more infrastructure logic being handled programmatically with IaC, and directing time away from manual toil toward creating and maintaining automation.

by Thomas A. Limoncelli | July 9, 2018

Topic: Development


Manual Work is a Bug:
A.B.A: always be automating

Every IT team should have a culture of constant improvement, or movement along the path toward the goal of automating whatever the team feels confident in automating, in ways that are easy to change as conditions change. As the needle moves to the right, team members learn from each other’s experiences, and the system becomes easier to create and safer to operate. A good team has a structure in place that makes the process frictionless and collaborative.

by Thomas A. Limoncelli | March 14, 2018

Topic: Development


Operational Excellence in April Fools’ Pranks:
Being funny is serious work.

Successful pranks require care and planning. Write a design proposal and a project plan. Involve operations early. If this is a technical change to your website, perform load testing, preferably including a "dark launch" or hidden launch test. Hide the prank behind a feature flag rather than requiring a new software release. Perform a retrospective and publish the results widely. Remember that some of the best pranks require few or no technical changes at all. For example, one could simply summarize the best practices for launching any new feature but write it under the guise of how to launch an April Fools’ prank.

by Thomas A. Limoncelli | December 5, 2017

Topic: Development
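
To make "hide the prank behind a feature flag" concrete, here is a minimal Python sketch. The `PRANK_ENABLED` environment variable, the function names, and the banner text are all invented for illustration; a production system would more likely read flags from a configuration service so they can be flipped without a restart.

```python
import os

def prank_enabled(environ=None):
    """Return True when the hypothetical PRANK_ENABLED flag is switched on."""
    environ = os.environ if environ is None else environ
    return environ.get("PRANK_ENABLED", "").lower() in {"1", "true", "yes"}

def homepage_banner(environ=None):
    """Pick the banner text; the prank path ships dark and defaults to off."""
    if prank_enabled(environ):
        return "Introducing our new telepathic search!"
    return "Welcome back."
```

The prank code can be deployed and load tested days ahead with the flag off, then enabled and disabled without shipping a new release.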


Four Ways to Make CS & IT Curricula More Immersive:
Why the Bell Curve Hasn’t Transformed into a Hockey Stick

Our first experiences cement what becomes normal for us. Students should start off seeing a well-run system, dissect it, learn its parts, and progressively dig down into the details. Don’t let them see what a badly run system looks like until they have experienced one that is well run. A badly run system should then disgust them.

by Thomas A. Limoncelli | August 1, 2017

Topic: Education


Are You Load Balancing Wrong?:
Anyone can use a load balancer. Using them properly is much more difficult.

A reader contacted me recently to ask if it is better to use a load balancer to add capacity or to make a service more resilient to failure. The answer is: both are appropriate uses of a load balancer. The problem, however, is that most people who use load balancers are doing it wrong.

by Thomas A. Limoncelli | December 20, 2016

Topic: Networks


10 Optimizations on Linear Search:
The operations side of the story

System administrators (DevOps engineers or SREs or whatever your title) must deal with the operational aspects of computation, not just the theoretical aspects. Operations is where the rubber hits the road. As a result, operations people see things from a different perspective and can realize opportunities outside of the basic O() analysis. Let’s look at the operational aspects of the problem of trying to improve something that is theoretically optimal already.

by Thomas A. Limoncelli | August 8, 2016

Topic: Search Engines


The Small Batches Principle:
Reducing waste, encouraging experimentation, and making everyone happy

The small batches principle is part of the DevOps methodology. It comes from the lean manufacturing movement, which is often called just-in-time manufacturing. It can be applied to just about any kind of process. It also enables the MVP (minimum viable product) methodology, which involves launching a small version of a service to get early feedback that informs the decisions made later in the project.

by Thomas A. Limoncelli | May 24, 2016

Topic: System Administration


How Sysadmins Devalue Themselves:
And how to track on-call coverage

Q: Dear Tom, How can I devalue my work? Lately I’ve felt like everyone appreciates me, and, in fact, I’m overpaid and underutilized. Could you help me devalue myself at work? A: Dear Reader, Absolutely! I know what a pain it is to lug home those big paychecks. It’s so distracting to have people constantly patting you on the back. Ouch! Plus, popularity leads to dates with famous musicians and movie stars. (Just ask someone like Taylor Swift or Leonardo DiCaprio.) Who wants that kind of distraction when there’s a perfectly good video game to be played?

by Thomas A. Limoncelli | February 8, 2016

Topic: System Administration


Automation Should Be Like Iron Man, Not Ultron:
The "Leftover Principle" Requires Increasingly More Highly-skilled Humans.

A few years ago we automated a major process in our system administration team. Now the system is impossible to debug. Nobody remembers the old manual process and the automation is beyond what any of us can understand. We feel like we’ve painted ourselves into a corner. Is all operations automation doomed to be this way?

by Thomas A. Limoncelli | October 31, 2015

Topic: Development