The Logic of Logging

And the illogic of PDF

George Neville-Neil

Dear KV,

I work in a pretty open environment, and by open I mean that many people have the ability to become the root user on our servers so that they can fix things as they break. When the company started, there were only a few of us to do all the work, and people with different responsibilities had to jump in to help if a server died or a process got away from us. That was several years ago, but there are still many people who have rootly powers, some because of legacy and some because they are deemed too important to restrict. The problem is that one of these legacy users insists on doing almost everything as root and, in fact, uses the sudo command only to execute sudo su -. Every time I need to debug a system this person has worked on, I wind up on a two- to four-hour log-spelunking tour because he also does not take notes on what he has done, and when he's finished he simply reports, "It's fixed." I think you will agree this is maddening behavior.

Routed by Root

Dear Routed,

I would like to tell you that you can do one thing and then say, "It's fixed," but I can't tell you that. I could also tell you to take off the tip of his pinky the next time he does this, but I bet HR frowns on Japanese gangster rituals at work.

What you have is more of a cultural problem than a technical problem, because as you suggest in your letter, there is a technical solution to the problem of allowing users to have auditable, root access to systems. Very few people or organizations wish to run their systems with military-style security levels; and, for the most part, they would be right, as those types of systems involve a lot of overkill—and, as we've seen of late, still fail to work.

In most environments it is sufficient to allow the great majority of staff to have access only to their own files and data, and then give a minority of staff—those whose roles truly require it—the ability to access broader powers. Those in this trusted minority must not be given blanket authority over systems, but, again, should be given limited access, which, when using a system such as sudo, is pretty simple. The rights to run any one program can be whitelisted by user or group, and having a short whitelist is the best way to protect systems.

Now we come to your point about logging, which is really about auditability. Many people dislike logging because they think it's like being watched, and, it turns out, most people don't like to be watched. Anyone in a position of trust ought to understand that trust needs to be verified to be maintained and that logging is one way of maintaining trust among a group of people. Logging is also a way of answering the age-old question, "Who did what to whom and when?" In fact, this is all alluded to in the message that sudo spits out when you first attempt to use it:

We trust you have received the usual lecture from the local System Administrator. It usually boils down to these three things:

#1) Respect the privacy of others.

#2) Think before you type.

#3) With great power comes great responsibility.

Item #3 is a quote from the comic book Spider-Man, but Stan Lee's words are germane in this context. Root users in a Unix system can do just about anything they want, either maliciously or because they aren't thinking sufficiently when they type. Frequently, the only way of bringing a system back into working order is to figure out what was done to it by a user with rootly powers, and the only way of having a chance of doing that is if the actions were logged somewhere.

If I were this person's manager, I would either remove his sudo rights completely until he learned how to play well with others, or I would fire him. No one is so useful to a company that they should be allowed to play god without oversight.

Dear KV,

I was recently implementing new code based on an existing specification. The documentation is 30 pages of tables with names and values. Unfortunately, this documentation was available only as a PDF, which meant that there was no easy, programmatic way of extracting the tables into code. What would have taken five minutes with a few scripts turned into a half day of copy and paste, with all the errors that that implies. I know that many specifications—Internet RFCs (Requests for Comments), for example—are still published in plain ASCII text, so I don't understand why anyone would publish a spec destined to be code in something as programmatically hard to work with as PDF.

Copied, Pasted, Spindled, Mutilated

Dear Mutilated,

Clearly you do not appreciate the beauty and majesty implicit in modern desktop publishing. The use of fonts—many, many fonts—and bold and underline, increase the clarity of the written words as surely as if they had been etched in stone by a hand of flame.

You are either facing a problem of overwhelming self-importance, as when certain people—often in the marketing departments of large companies—start sending e-mail with bold, underline, and colors, because they think that will make us pay attention, or you're dealing with someone who writes standards but never implements them. In either case this brings up a point about not only documentation, but also code.

Other than in mythology, standards and code are not written in stone. Any document that must change should be trackable in a change-tracking system such as a source-code control system, and this is as true for documents as it is for code that will be compiled. The advantage of keeping any document in a trackable and diffable format is that it is also possible to extract information from it using standard text-manipulation programs such as sed, cut, and grep, as well as with scripting languages such as Perl and Python. While it's true that other computer languages can also handle text manipulation, I find that many people doing what you suggest are doing it with Perl and Python.

I, too, would like to live in a world where—when someone updated a software specification—I could run a set of programs over the document, extract the values, and compare them with those that are extant in my code. It would both reduce errors and increase the speed with which updates could be made. The differences might still need to be checked visually, and I would check mine visually out of basic paranoia and mistrust, but this would be far easier than either printing a PDF file and going over it with a pen or using a PDF reader to do the same job electronically. While there are programs that will tear down PDFs for you, none is very good; the biggest problem is that if the authors had thought for 30 seconds before opening Word to write their spec, none would be necessary.

There are many things one can complain about with respect to the IETF (Internet Engineering Task Force), but the commitment to continue to publish its protocols in text format is not one of them. Alas, not everyone had Jon Postel to guide them through their first couple of thousand documents, but they can still learn from the example.

LOVE IT, HATE IT? LET US KNOW

[email protected]

KODE VICIOUS, known to mere mortals as George V. Neville-Neil, works on networking and operating system code for fun and profit. He also teaches courses on various subjects related to programming. His areas of interest are code spelunking, operating systems, and rewriting your bad code (OK, maybe not that last one). He earned his bachelor's degree in computer science at Northeastern University in Boston, Massachusetts, and is a member of ACM, the Usenix Association, and IEEE. He is an avid bicyclist and traveler who currently lives in New York City.

Originally published in Queue vol. 12, no. 2—
Comment on this article in the ACM Digital Library

More related articles:

Justin Sheehy, Jonathan Reed - Moving Faster by Not Breaking Things
An engineering team that can move without fear, knowing that they have made themselves safe to do so, can ship more often and more quickly and make more dramatic changes without hesitation. This feels great to individual engineers and enables those engineers to be more effective for the business they work in. A bit of investment in safety pays huge dividends in speed as well as by reducing the frequency and severity of change-triggered incidents.

Dennis Roellke - String Matching at Scale
String matching can't be that difficult. But what are we matching on? What is the intrinsic identity of a software component? Does it change when developers copy and paste the source code instead of fetching it from a package manager? Is every package-manager request fetching the same artifact from the same upstream repository mirror? Can we trust that the source code published along with the artifact is indeed what's built into the release executable? Is the tool chain kosher?

Catherine Hayes, David Malone - Questioning the Criteria for Evaluating Non-cryptographic Hash Functions
Although cryptographic and non-cryptographic hash functions are everywhere, there seems to be a gap in how they are designed. Lots of criteria exist for cryptographic hashes motivated by various security requirements, but on the non-cryptographic side there is a certain amount of folklore that, despite the long history of hash functions, has not been fully explored. While targeting a uniform distribution makes a lot of sense for real-world datasets, it can be a challenge when confronted by a dataset with particular patterns.

Nicole Forsgren, Eirini Kalliamvakou, Abi Noda, Michaela Greiler, Brian Houck, Margaret-Anne Storey - DevEx in Action
DevEx (developer experience) is garnering increased attention at many software organizations as leaders seek to optimize software delivery amid the backdrop of fiscal tightening and transformational technologies such as AI. Intuitively, there is acceptance among technical leaders that good developer experience enables more effective software delivery and developer happiness. Yet, at many organizations, proposed initiatives and investments to improve DevEx struggle to get buy-in as business stakeholders question the value proposition of improvements.