The Kollected Kode Vicious

Kode Vicious - @kode_vicious


Kode Vicious Unscripted

Some months, when he’s feeling ambitious, Kode Vicious reads through all of your letters carefully, agonizing for days over which to respond to. Most of the time, though, he takes a less measured approach. This usually involves printing the letters out, throwing them up in the air, and seeing which land face up, repeating the process until only two remain. And occasionally, KV dispenses with reader feedback altogether, as is the case this month. Not to worry though, he still digs reading and responding to your monthly koding kwestions, so keep ‘em coming to [email protected].

Greetings,

You know, sometimes I really don’t need to read my mail to get going on a subject; sometimes I just need to read a little code. I’m sorry I can’t share the code with you, as it is proprietary, but it brings up two more items in the long list of things that Kode Vicious really hates. The more I think about it, the more I realize it is just one big problem with a lot of different faces.

The problem? Computers make it too easy to copy data. Yes, that’s right, I know you were all expecting me to rail on about the poor quality of comments, or documentation, but in reality it’s just that computers are too good at something they’re designed to do. I guess I really shouldn’t blame the computers; I ought to blame the idiots behind the keyboards, but it’s much more acceptable to take out your anger on machines than on people. After all, you’re probably not going to be arrested for chucking a computer out a window, unless it hits someone on the street below, but you can bet that chucking a co-worker out a window, though it might feel good at the time, will have consequences. To shed some light on what I am ranting about, let me tell you a story from my day.

The day started out fine, the birds were singing, the sun was shining, all was... Oh never mind that! Today I was looking over a fix a programmer had made to some code that handled C++ strings and C char* buffers. The code had a bug where the string, when placed in the buffer, was not properly terminated, leading to some data leakage into other parts of the system. All well and good, pointers and strings are difficult beasties and the known source of many program errors. So, loaded with my usual level of midday caffeine—that is to say, just short of grinding my teeth to dust—I decided to check the fixes.

I opened up my editor to one of the files that had been changed and though the code itself offended me, I was here only to look and did not want to get mired in reworking the fix. Having checked the first change, I went on to the next, which looked like a carbon copy of the first one. OK, well, that’s fine, a couple of places isn’t a problem. I moved to the next difference, and it too was the same as the two previous. I think you can see how this goes. I went through more than 10 changes. The same bug had appeared in more than 10 places, in a program with only about 200 files, and why? Because the person who had written the first version of the buggy code had simply copied that bug, over and over again. The code that I’m talking about wasn’t just a single line call to some function; this was 15 lines of code that was handling a known dangerous quantity, a pointer to a buffer, a frequent source of errors.

So, now we come to the first annoyance: the ability to copy and paste code all over the place. I don’t want to say, “Never cut and paste code!” because such strong statements don’t take enough situations into account, but I can say, “Before you cut and paste code, THINK!” You see, many years ago, long before you or I, or many people reading this, were born, some very nice people invented the function call and then in 1951, the library. It would seem that most people think of libraries as being provided by others but not by themselves. As you all know, a function call is a way of simplifying repetitive work. Instead of doing the same thing with 10 or 20 or yes, I have seen this, 100 lines of code copied and pasted all over your software, you simply say, “Oh, look, this code is used again and again, I bet it’s generally useful.” Then you take that code, put it in a function, put that function in a library, and share it with all your co-workers who can thereby benefit from your genius. Just like Mom said when we were kids, “Sharing is good!”
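Since the original code is proprietary, here is only a minimal sketch of what the shared function might have looked like. The name copy_to_buffer and its signature are hypothetical, invented for illustration. It assumes the job was to place a C++ string into a fixed-size char buffer and guarantee termination, so the dangerous pointer handling lives in exactly one place:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not from the original code base): copies `src`
// into the C buffer `dst` of capacity `dst_size`, truncating if the
// string does not fit, and always writes a terminating '\0' when there
// is room for one. Returns the number of characters copied, not
// counting the terminator.
std::size_t copy_to_buffer(const std::string& src, char* dst,
                           std::size_t dst_size) {
    if (dst == nullptr || dst_size == 0) {
        return 0;  // nowhere to write, not even a terminator
    }
    // Reserve one byte for the terminator.
    const std::size_t n = std::min(src.size(), dst_size - 1);
    std::memcpy(dst, src.data(), n);
    dst[n] = '\0';  // the step the copied-and-pasted code got wrong
    return n;
}
```

With something like this in a shared library, each of the 10-plus call sites shrinks to a single line such as copy_to_buffer(name, buf, sizeof buf), and a termination bug gets fixed once rather than once per copy.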

Now, unfortunately, the story doesn’t end here. It is on days like today when I actually feel sorry for the people who sit near me, because although I have learned to control my use of extremely vulgar language in the office, people still find the sound of my head hitting my desk to be somewhat disturbing. It was that sound that brought the usual calls of, “What now?” from my neighbors, who are occasionally amused by my rantings, so long as I stop hitting my head on the desk.

What I had found, completely by accident, was a whole subdirectory of the current product, which contained a subset of the files from the product I was checking. It would seem that in order to make a new product, someone had just copied the old one, and started editing it. Now, there weren’t just the 10-plus bugs in the code I was supposed to check, but the same bugs in the code that someone had copied to make their new product. But wait! There’s more! The new product hadn’t retained all of the old code; no, it had kept many of the APIs, but had subtly changed their underlying meanings, adding a few new constants here, changing a return value there. In a single file, the one that had led me to this dubious discovery, there were 200-plus separate changes—not significant enough that someone linking with the wrong library would get an obvious error; oh no, only enough that the code would break in weird and mysterious ways.

So, now we have two different problems caused by the ease of copying. The first problem is the replication of bugs in dangerous, pointer-handling code that must now be maintained in 10-plus places in one product. The second problem is the copying of a whole product, all bugs included, and two products that are related but subtly different enough that fixing a bug in one requires manually fixing a bug in another.

All I can wonder is, “What were these people thinking?” I mean, yes, for those of us on Unix-like systems it is very easy to type cp -r OldProduct NewProduct and then get right to work. Look at all the time we’ve saved! We are sure to get a raise for our productivity, instead of a kick in the teeth, which is what we deserve. For those with a desktop metaphor in mind, it’s just a point and a click, even less work than typing the 28 characters above (including “Enter”) required for those of us on Unix. That, though, is not the point; the point is that software is made up of libraries and functions for a reason, and when you come across something that you need, you should attempt to make it easily reusable by you and by others. It will save you more time in the long run than you saved by blindly copying code or files. If you work with me, it may also save you from defenestration. Yes, that’s the word for today: defenestration. Look it up!

KODE VICIOUS, known to mere mortals as George V. Neville-Neil, works on networking and operating system code for fun and profit. He also teaches courses on various subjects related to programming. His areas of interest are code spelunking, operating systems, and rewriting your bad code (OK, maybe not that last one). He earned his bachelor’s degree in computer science at Northeastern University in Boston, Massachusetts, and is a member of ACM, the Usenix Association, and IEEE. He is an avid bicyclist and traveler who has made San Francisco his home since 1990.


Originally published in Queue vol. 3, no. 8