A recent conversation about development methodologies turned to the relative value of the various artifacts produced during the development process, and the person I was talking with said that the code has "always been the only artifact that matters. It's just that we're only now coming to recognize that." My reaction, which I did not express at the time, was twofold. First, I got quite a sense of déjà vu, since it harked back to my time as an undergraduate and memories of many heated discussions about whether code was self-documenting. Second, I thought of several instances from recent experience in which the code alone simply was not enough to understand why the system was architected in a particular way.
Contrary to the point the speaker was making, the notion that code is all that matters has a long history in software development. In my experience, really good programmers have always tended towards the idea that code is self-sufficient. In fact, the ability to look at a piece of code and perceive its overall structure and purpose without recourse to comments or design documents has often been the informal way to separate the best from the rest. To paraphrase a theme from my undergraduate days: "If you don't understand my code, it doesn't mean that it needs comments, it means you need to learn more about programming."
And it is true that the best programmers I have worked with do have an amazing ability to "think in code." Six months after they had cranked out 10,000 lines of code, you could go back, point to one of the 10,000 lines of code, and ask them why it was there. And without hesitation, they would tell you. The trouble is that people with that level of ability are rare. I've only encountered a few in my roughly 30 years of software development experience. The reality of software development is that there is a much larger class of programmers who are good, but not that good. And unless you have had the immense good fortune to have a development team composed of nothing but programming ninjas, then your software development processes have to be geared to that broader class of software developers.
Two recent experiences really brought this into focus for me. The first relates to a relatively simple piece of code. The code itself was simple to understand, but what the code could not communicate was why the code existed at all. The second instance involved the value of working with formalisms other than code, in this particular case finite state machines.
The first example came to mind because I was recently looking over the code base for my company's products. I came across a fairly simple piece of code that was part of our base implementation for the factory pattern. Whenever a factory creates an object, it stores it onto a list. There is an entry point into the factory that then traverses this list and calls into our underlying storage system for each object in the list. The code is self-documenting in the sense that it is completely obvious what it is doing. On the other hand, it is completely unclear why it is doing it. I knew that the storage sub-system handles object creation automatically, so simply instantiating the object in the factory should have been sufficient.
I asked the developer who had written the code why the extra traversal of the list was there. At first, even he had trouble recalling why we had added this bit of code. Eventually, between the two of us, we managed to recall that we had originally written the factory in the straightforward manner, and it hadn't worked. Although the storage sub-system did automatically handle the creation of the objects, we had run into a problem with database constraints being violated because of the way our factories work: you create the object first and then initialize its various properties. Unfortunately, the storage sub-system was inserting the objects into the DB immediately upon creation. Since the properties of the object had not been set at that point, the insertion violated certain types of constraints, typically those that required certain properties to be non-null.
So the additional traversal of the list of created objects in the factory was part of a mechanism to delay inserting the objects into the DB until after the properties had been set. What this experience brought home for me is that if you have well-written code, you can easily understand what the code is doing. However, even the best-written code can't reveal why it is doing it. That's because the question of "why" is not centered on the code itself, but on the context it operates in and the design decisions made during the development of the system. The best way to communicate those ideas is not code, but comments and design documents. To me this clearly demonstrated that it is not just code that has value.
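To make the point concrete, here is a minimal Python sketch of a deferred-insertion factory of the kind described above. It is not the actual product code: the names Storage, Widget, WidgetFactory, and flush are hypothetical, and the non-null check on a single name property stands in for whatever constraints the real database enforced. The thing to notice is that the loop in flush is trivial to read, while the reason it exists lives only in the comment.

```python
class Storage:
    """Hypothetical stand-in for the storage sub-system."""

    def __init__(self):
        self.rows = []

    def insert(self, obj):
        # Mimics a NOT NULL database constraint on the `name` property.
        if obj.name is None:
            raise ValueError("constraint violated: name must be non-null")
        self.rows.append(obj)


class Widget:
    """Hypothetical domain object; properties are set after creation."""

    def __init__(self):
        self.name = None


class WidgetFactory:
    def __init__(self, storage):
        self._storage = storage
        self._pending = []  # objects created but not yet inserted

    def create(self):
        # WHY: do not insert here. Callers create the object first and set
        # its properties afterwards, so inserting at creation time would
        # violate non-null constraints. Insertion is deferred to flush().
        obj = Widget()
        self._pending.append(obj)
        return obj

    def flush(self):
        # Insert every pending object now that its properties are set.
        for obj in self._pending:
            self._storage.insert(obj)
        self._pending.clear()


storage = Storage()
factory = WidgetFactory(storage)
widget = factory.create()
widget.name = "example"  # initialize properties after creation
factory.flush()          # only now does the object reach storage
```

Delete the comment in create and the code still works, but the next reader is back to asking why the list is there at all.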
I won't go into the second example in as much detail. Suffice it to say that we had embarked on doing some event-based programming. I knew from previous experience that finite state machines (FSMs) worked well in these types of circumstances, and we created FSMs for various components. While working with the developers, it became immediately clear that the best way to discuss the FSMs with them was not to sit and look at the code implementing them, but to grab a sheet of paper and draw them. While it is true that the code implementing an FSM completely defines how that FSM behaves, looking at the code doesn't really give you an intuitive sense of what the FSM does. A diagram does. The graphic formalism adds value in this case that the code cannot.
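As an illustration only, the sketch below shows one way to write a small FSM in Python as a transition table and to emit the matching diagram as Graphviz DOT text. The states, events, and function names are hypothetical and are not taken from the components discussed here. Generating the picture from the table is one possible way to keep the two in step, not a description of what we did.

```python
# Hypothetical states and events, chosen only for illustration.
TRANSITIONS = {
    ("idle", "connect"): "connecting",
    ("connecting", "connected"): "ready",
    ("connecting", "error"): "idle",
    ("ready", "disconnect"): "idle",
}


def step(state, event):
    """Return the next state, or raise if the event is invalid in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


def run(events, state="idle"):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state


def to_dot(transitions):
    """Render the transition table as Graphviz DOT text for drawing."""
    lines = ["digraph fsm {"]
    for (src, event), dst in transitions.items():
        lines.append(f'    {src} -> {dst} [label="{event}"];')
    lines.append("}")
    return "\n".join(lines)


final_state = run(["connect", "connected", "disconnect"])
diagram = to_dot(TRANSITIONS)
```

Even with only four transitions, the rendered graph is easier to take in at a glance than the dictionary, and the gap widens as the machine grows.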
I've been through enough experiences like this that I simply don't believe the "only code has value" proposition. Clearly code is the core value product of the software development process. But it's not the only thing of value, and if we want to maintain and extend our code, we need to gear our software development processes accordingly.
Originally published in Queue vol. 5, no. 6