An argument recently broke out between two factions of our systems administration team concerning the naming of our next set of hosts. One faction wants to name machines after services, with each host having a numeric suffix, and the other wants to continue our current scheme of each host having a unique name, without a numeric string. We now have so many hosts that any unique name is getting quite long—and is annoying to type. A compromise was recently suggested whereby each host could have two names in our internal DNS (Domain Name System), but this seems overly complicated. How do you decide on a host-naming scheme?
I refer you to T.S. Eliot, who pointed out—sort of:
"The Naming of Cats" (not Hosts) is a poem in T. S. Eliot's poetry book, Old Possum's Book of Practical Cats, and its stage adaptation is Andrew Lloyd Webber's popular musical Cats. The poem describes to humans how cats get their names. I took some liberties with Eliot's wording—as others have done before me—and extended the analogy to describe the naming of hosts. But given that T. S. Eliot died just about the time the first minicomputers were being designed, I don't think he had host names in mind when he wrote his poem. And that's a good thing, because if you think two names are bad, three would only be worse!
The naming of hosts is a difficult matter that ranks with coding style, editor choice, and language preference in the pantheon of things computer people fight about that don't matter to anyone else in the whole world. What's even more annoying—or amusing—but actually annoying, is that if you're in the wrong bar at the wrong time, you'll have to hear drunken systems administrators fighting about naming schemes and crying in their beers over the names they lovingly gave to hosts at their previous companies. What a way to ruin a good bender!
Giving something a name has a simple purpose: to make it understandable and memorable to a community of people. Naming your variables foo, bar, and baz is amusing in a short example program, but you wouldn't want to maintain 100 lines of code written like that. The same is true of host names. Hosts have names because people need to know how to get to them—either to use their services or to maintain them, or both. If people weren't involved, hosts could simply be identified by their Internet addresses. Unfortunately, host naming is an instance where geeks like to get creative. Even more unfortunately, geeks don't always know the difference between creative and annoying. It's all very well to decide that your hosts should be named after Star Trek, Star Wars, or Tolkien or Twilight characters. With Tolkien you can probably write—and, dear God, someone has probably already done so—a script to generate new names based on his works, just in case The Hobbit, The Lord of the Rings trilogy, and The Silmarillion didn't have enough ridiculous names in them to begin with!
Everyone has a naming horror story. My first was at a university where the hosts were named after rivers. That would have been fine if you could remember how to spell Seine, but once you run out of nice short names, you get to Mississippi and Dnjeper. That's just what I want to do when I remotely log in to a host: think in my head, "M-I crooked letter crooked letter I crooked letter crooked letter I hump back hump back I," which is how I and many other American schoolchildren learned to spell Mississippi. I could go on and on about this, but then I would sound like those folks I mentioned who were ruining my bender. Here, therefore, is a short guide to picking host names.
A name that you're going to use on a daily basis needs to be easy to type. That means no silent letters, such as in Dnjeper, and nothing that's too long, like thisisthehostthatjackbuilt.
It's a good idea to choose names that everyone you work with can pronounce. With globalization, finding pronounceable names has become more difficult, since some people can't pick up L vs. R, or understand whether you just used a double o or a single o, and diphthongs will kill you (no, diphthongs are not a new Brazilian bathing suit). The main point here is to avoid picking a name with a lot of sounds that are difficult to translate into typing. Typing is still faster than using a voice-recognition system; so remember, these names will have to be typed.
If you're going to use services as names, make sure you can replace the systems behind the names without hiccups. It should be obvious that everyone is going to be annoyed if they have to use mail2.yourdomain.com when mail.yourdomain.com goes down. (This point isn't really about naming, because any sysadmin worth his or her paycheck can build a system like this; but I've seen it done the wrong way, so I wanted to state it for the record.)
Avoid at all costs having two different, unrelated names for the same thing. In fact, this is true in code and host names. If you have two similar services and you want two different names, make it completely obvious how to map one name to the other and back. It is maddening to have the kind of back and forth where one person asks,
"Hey, can I reboot fibble?"
And then someone asks,
"Who rebooted mail1?"
"But I didn't know it was mail1; I thought it was fibble."
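One way to make the mapping obvious in both directions is to keep a single table and resolve either name through it. The following is only a minimal sketch: the `ALIASES` table, the `build_lookup` helper, and the host name `wobble` are made up for illustration, and a real site would more likely keep this information in DNS or an inventory system.

```python
# Hypothetical inventory: each service-style name maps to exactly one unique host name.
ALIASES = {
    "mail1": "fibble",
    "mail2": "wobble",
}


def build_lookup(aliases):
    """Return a dict that resolves either name to its (service_name, host_name) pair."""
    lookup = {}
    for service, host in aliases.items():
        for name in (service, host):
            # Refuse ambiguous tables: every name may refer to only one machine.
            if name in lookup:
                raise ValueError(f"name {name!r} is used more than once")
            lookup[name] = (service, host)
    return lookup


LOOKUP = build_lookup(ALIASES)
```

With a table like this, asking about `fibble` or about `mail1` yields the same pair, so nobody has to guess which machine is about to be rebooted.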
Finally, try to avoid being cute. I know that giving this piece of advice is basically tilting at windmills, but I have to say that people who name their mail servers male and female make my normally icy blood boil.
One of my company's front-line engineers—in the group that looks at the live traffic hitting our switches and servers—keeps reporting problems, and then, before anyone can look at the server that's having issues, reboots the system to clear the problem. How do you explain to someone that there is information that needs to be collected when the system is misbehaving that is absolutely vital to finding and solving the problem?
I would start by standing with my foot on this person's chest and yelling, "There is information that needs to be collected when the system is misbehaving that is absolutely vital to finding and solving the problem." But I take it you've tried that already, though perhaps without enough screaming.
True, systems tend to build up state during execution that is not written to some permanent storage often enough. The problem you need to solve isn't preventing the person from insta-booting a misbehaving machine, as much as it is to make sure there is a good, searchable record of what the system is doing when it's running. Most system-monitoring tools on modern servers generate plain text output. It's a simple matter to write scripts that execute periodically to write the output of these tools—such as procstat, netstat, iostat, and the like—into files that will be preserved across reboots.
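As a rough illustration of such a script, the sketch below runs a set of monitoring commands once and appends their timestamped output to one file per tool. The command list, the flags, and the function name `snapshot` are assumptions chosen for the example; substitute whatever tools and options your platform actually provides.

```python
import datetime
import pathlib
import subprocess

# Assumed example commands; the tools and flags available differ between systems.
COMMANDS = {
    "netstat": ["netstat", "-s"],
    "iostat": ["iostat"],
}


def snapshot(commands, log_dir):
    """Run each command once and append its timestamped output to <log_dir>/<name>.log."""
    log_dir = pathlib.Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.datetime.now(datetime.timezone.utc).isoformat(timespec="seconds")
    written = []
    for name, argv in commands.items():
        try:
            result = subprocess.run(argv, capture_output=True, text=True, timeout=30)
            body = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Record the failure instead of silently losing the sample.
            body = f"could not run {argv!r}: {exc}\n"
        path = log_dir / f"{name}.log"
        with path.open("a") as fh:
            fh.write(f"=== {stamp} {name} ===\n{body}\n")
        written.append(path)
    return written
```

Calling `snapshot(COMMANDS, "/var/log/sysstate")` from a periodic job such as cron (the directory is a hypothetical choice) leaves a record on disk that is still there after someone insta-boots the machine.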
For more pernicious problems, you can write your own tools, either scripts or new programs that are executed when the system is shut down or rebooted. In this way, if people are insta-booting your machines before you can get to them, you can make it so that their reboot command does your bidding. You can even go so far as to rig your operating system to produce a kernel core dump on each reboot. This gives you a snapshot of the system as it was when it was broken, which you can go back to later and pick through. I warn you, though, that picking through a kernel core dump is about as much fun as picking fleas off a dog.
The only downside to collecting all this data is analyzing it. Since it's no longer really necessary to delete data, you may wind up spending a good deal of time organizing it into trees of trees of files. I offer a couple of quick suggestions. Do not make the tree scheme too difficult to traverse, either for a person or a program. It can take a very long time to access a ton of files in deep trees, due to the cost of traversing the directory trees. Keep things simple for both yourself and your analysis programs. Before you start, have a plan for what you want to store, where you want to store it, and how you plan to access it. Most people put this kind of thought into their applications, but not enough into how and where they store logs or other runtime information generated by their systems. You should put at least half as much time into the latter as you do into the former.
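To make the "keep it simple" advice concrete, here is one possible layout, sketched under the assumption that logs are grouped by host, then by day, with one file per tool. The helper names `log_path` and `logs_for` are invented for this example; the point is only that a shallow, predictable tree lets both people and programs compute a path instead of searching for it.

```python
import datetime
import pathlib


def log_path(root, host, tool, when):
    """Build <root>/<host>/<YYYY-MM-DD>/<tool>.log: three levels, no deeper."""
    day = when.strftime("%Y-%m-%d")
    return pathlib.Path(root) / host / day / f"{tool}.log"


def logs_for(root, host, day):
    """List one host's logs for one day with a single directory listing, no recursive walk."""
    directory = pathlib.Path(root) / host / day.strftime("%Y-%m-%d")
    if not directory.is_dir():
        return []
    return sorted(directory.glob("*.log"))
```

Because the date is part of the path, expiring or archiving old data becomes a matter of handling whole directories rather than inspecting individual files.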
KODE VICIOUS, known to mere mortals as George V. Neville-Neil, works on networking and operating system code for fun and profit. He also teaches courses on various subjects related to programming. His areas of interest are code spelunking, operating systems, and rewriting your bad code (OK, maybe not that last one). He earned his bachelor's degree in computer science at Northeastern University in Boston, Massachusetts, and is a member of ACM, the Usenix Association, and IEEE. He is an avid bicyclist and traveler who currently lives in New York City.
© 2013 ACM 1542-7730/13/0600 $10.00
Originally published in Queue vol. 11, no. 6.
context: comment on the article "Columns > Kode Vicious - The Naming of Hosts is a Difficult Matter" (by George Neville-Neil on June 1, 2013) (http://queue.acm.org/detail.cfm?id=2493946) on http://queue.acm.org (ACM Queue)
"If people weren't involved, hosts could simply be identified by their Internet addresses"
On the contrary, I know of multiple companies where all the employees refer to the machines they commonly use by IP address, which of course has the advantage that the behemoth that is DNS is not involved. In groups where everyone is in the same subnet, they refer to machines by the last octet of the IP address, e.g., "123" would mean 192.168.0.123. In groups where there are multiple subnets, they may say "5.123" for 192.168.5.123 or "4.123" for 192.168.4.123. Indeed, this sounds silly, but I consider it just as silly as using hostnames.
For all practical purposes within the companies I am not ashamed to be part of, we've been able to use Tor hidden services successfully, which means there is no issue of IP/name conflict, and the service is globally unambiguously identifiable (e.g., 3g2upl4pq6kufc4m.onion). You don't even need to care which machine the service is on. Naming (e.g., a bookmark in the web browser, possibly with tags instead of bothering with a name) is then done on an ad-hoc, as-needed basis; humans have the natural ability to coordinate names for things as they need to. Tor hidden services supersede DNS, TLS, and that sort of thing. Then again, if you're a typical IT admin you're probably stuck with silly things (e.g., email) which are intertwined with DNS, and I sympathize.