Volume 19, Issue 4

Kode Vicious:
Patent Absurdity

A case when ignorance is the best policy

The main reason a lawyer will give for not reading a software patent is that, if you run afoul of the patent and it can be shown that you had knowledge of it, your company will incur triple the damages that it would have incurred had you not had knowledge of the patent. That seems like reason enough to avoid reading them, but there is an even better reason, and that is, as design or technical documents, software patents suck.

Business, Kode Vicious

The Bikeshed:
The Software Industry IS STILL the Problem

  Poul-Henning Kamp

The time is (also) way overdue for IT professional liability

The time is way overdue for IT engineers to be subject to professional liability, like almost every other engineering profession. Before you tell me that is impossible, please study how the very same thing happened with electricity, planes, cranes, trains, ships, automobiles, lifts, food processing, buildings, and, for that matter, driving a car.

Business, Compliance, The Bikeshed

Drill Bits
Crashproofing the Original NoSQL Key-Value Store

  Terence Kelly

An upgrade for the gdbm database

Fortifying software to protect persistent data from crashes can be remarkably easy if a modern file system handles the heavy lifting. This episode of Drill Bits unveils a new crash-tolerance mechanism that vaults the venerable gdbm database into the league of transactional NoSQL data stores. We'll motivate this upgrade by tracing gdbm's history. We'll survey the subtle science of crashproofing, navigating a minefield of traps for the unwary. We'll arrive at a compact and rugged design that leverages modern file-system features, and we'll tour the production-ready implementation of this design and its ergonomic interface. This new approach is quite generic: It can enable a wide range of software to tolerate crashes.

Code, Databases, Development, Drill Bits, Open Source, Software Design

Special issue on Static Analysis

Static Analysis: An Introduction

  Patrick Thomson

The fundamental challenge of software engineering is one of complexity.

Modern static-analysis tools provide powerful and specific insights into codebases. The Linux kernel team, for example, developed Coccinelle, a powerful tool for searching, analyzing, and rewriting C source code; because the Linux kernel contains more than 27 million lines of code, a static-analysis tool is essential both for finding bugs and for making automated changes across its many libraries and modules. Another tool targeted at the C family of languages is Clang scan-build, which comes with many useful analyses and provides an API for programmers to write their own analyses. Like so many things in computer science, the utility of static analysis is self-referential: To write reliable programs, we must also write programs for our programs. But this is no paradox. Static-analysis tools, complex though their theory and practice may be, are what will enable us, and engineers of the future, to overcome this challenge and yield the knowledge and insights that we practitioners deserve.
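
To make "programs for our programs" concrete, here is a minimal sketch of a static check. It is not Coccinelle or Clang scan-build, and it targets Python rather than C: it uses the standard `ast` module to flag bare `except:` clauses without running the code. The function name `bare_excepts` and the sample source are assumptions made for illustration.

```python
import ast
from typing import List


def bare_excepts(source: str) -> List[int]:
    """Return the line numbers of bare `except:` handlers in `source`.

    The source is parsed into a syntax tree and inspected; it is never executed.
    """
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )


# Hypothetical input: one bare handler (line 3) and one typed handler (line 7).
SAMPLE = (
    "try:\n"
    "    risky()\n"
    "except:\n"
    "    pass\n"
    "try:\n"
    "    risky()\n"
    "except ValueError:\n"
    "    pass\n"
)

print(bare_excepts(SAMPLE))  # [3]
```

Real tools add data-flow reasoning, rewriting, and scale on top of this, but the basic shape is the same: parse, walk the tree, report.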

Code, Development, Tools

Static Analysis at GitHub

  Timothy Clem and Patrick Thomson

An experience report

The Semantic Code team at GitHub builds and operates a suite of technologies that power symbolic code navigation on github.com. We learned that scale is about adoption, user behavior, incremental improvement, and utility. Static analysis in particular is difficult to scale with respect to human behavior; we often think of complex analysis tools working to find potentially problematic patterns in code and then trying to convince the humans to fix them. Our approach took a different tack: use basic analysis techniques to quickly put information that augments our ability to understand programs in front of everyone reading code on GitHub with zero configuration required and almost immediate availability after code changes.

Code, Development, Tools

Human-Centered Approach to Static-Analysis-Driven Developer Tools

  Ayman Nadeem

The future depends on good HCI.

Complex and opaque systems do not scale easily. A human-centered approach for evolving tools and practices is essential to ensuring that software is scaled safely and securely. Static analysis can unveil information about program behavior, but the goal of deriving this information should not be to accumulate hairsplitting detail. HCI can help direct static-analysis techniques into developer-facing systems that structure information and embody relationships in representations that closely mirror a programmer's thought. The survival of great software depends on programming languages that support, rather than inhibit, communicating, reasoning, and abstract thinking.

Code, Development, Tools

Designing UIs for Static Analysis Tools

  Daniil Tiganov, Lisa Nguyen Quang Do, Karim Ali

Evaluating tool design guidelines with SWAN

Static-analysis tools suffer from usability issues such as a high rate of false positives, lack of responsiveness, and unclear warning descriptions and classifications. Here, we explore the effect of applying a user-centered approach and design guidelines to SWAN, a security-focused static-analysis tool for the Swift programming language. SWAN is an interesting case study for exploring static-analysis tool usability because of its large target audience, its potential to integrate easily into developers' workflows, and its independence from existing analysis platforms.

Code, Development, Tools


Volume 19, Issue 3

Escaping the Singularity:
Don't Get Stuck in the "Con" Game

  Pat Helland

Consistency, convergence, and confluence are not the same! Eventual consistency and eventual convergence aren't the same as confluence, either.

"Eventual consistency" is a popular phrase with a fuzzy definition. People are even inconsistent in their use of consistency. But two other terms, "convergence" and "confluence", have crisper definitions and are more easily understood.

Data, Databases, Escaping the Singularity

Declarative Machine Learning Systems

  Piero Molino, Christopher Ré

The future of machine learning will depend on it being in the hands of the rest of us.

The people training and using ML models now are typically experienced developers with years of study working within large organizations, but the next wave of ML systems should allow a substantially larger number of people, potentially without any coding skills, to perform the same tasks. These new ML systems will not require users to fully understand all the details of how models are trained and used for obtaining predictions, but will provide them a more abstract interface that is less demanding and more familiar. Declarative interfaces are well-suited for this goal, by hiding complexity and favoring separation of interest, and ultimately leading to increased productivity.


Real-world String Comparison

  Torsten Ullrich

How to handle Unicode sequences correctly

In many languages a string comparison is a pitfall for beginners. With any Unicode string as input, a comparison often causes problems even for advanced users. The semantic equivalence of different characters in Unicode requires a normalization of the strings before comparing them. This article shows how to handle Unicode sequences correctly. The comparison of two strings for equality often raises questions concerning the difference between comparison by value, comparison of object references, strict equality, and loose equality. The most important aspect is semantic equivalence.
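
As a minimal sketch of normalizing before comparing, assuming Python and its standard `unicodedata` module (the article itself may use a different language), the helper below puts both strings into the same normalization form first. The name `equivalent` is hypothetical.

```python
import unicodedata


def equivalent(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same Unicode form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


composed = "\u00e9"      # e-acute as a single code point
decomposed = "e\u0301"   # 'e' followed by a combining acute accent

print(composed == decomposed)            # False: the code points differ
print(equivalent(composed, decomposed))  # True: canonically equivalent

ligature = "\ufb01"      # the single-character 'fi' ligature
print(equivalent(ligature, "fi"))          # False under NFC
print(equivalent(ligature, "fi", "NFKC"))  # True under compatibility normalization
```

Which form to pick depends on the application: NFC and NFD preserve canonical equivalence only, while NFKC and NFKD also fold compatibility characters such as ligatures, which can be too aggressive for some uses.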

Code, Data

Kode Vicious:
Divide and Conquer

The use and limits of bisection

Bisection is of no use if you have a heisenbug that fails only from time to time. These subtle bugs are the hardest to fix and the ones that cause us to think critically about what we are doing. Timing bugs, bugs in distributed systems, and all the difficult problems we face in building increasingly complex software systems can't yet be addressed by simple bisection. It's often the case that it would take longer to write a usable bisection test for a complex problem than it would to analyze the problem whilst at the tip of the tree.
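
For reference, the core of bisection can be sketched in a few lines. This is a simplified illustration, not how any particular version-control tool implements it; `first_bad`, the revision names, and the `is_bad` test are all hypothetical. It assumes the first revision is good, the last is bad, and the test is deterministic, which is exactly the assumption a heisenbug violates.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest revision for which `is_bad` returns True.

    Assumes revisions[0] is good, revisions[-1] is bad, and that every
    revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# Hypothetical history of ten revisions in which the bug appears at r6.
history = [f"r{i}" for i in range(10)]
tested = []


def fails(rev: str) -> bool:
    tested.append(rev)
    return int(rev[1:]) >= 6


print(first_bad(history, fails))  # r6
print(tested)                     # ['r4', 'r6', 'r5']
```

If `is_bad` gives different answers on repeated runs of the same revision, the loop invariant no longer holds and the search can converge on an innocent change.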

Debugging, Kode Vicious

When Curation Becomes Creation

  Liu Leqi, Dylan Hadfield-Menell, and Zachary C. Lipton

Algorithms, microcontent, and the vanishing distinction between platforms and creators

Media platforms today benefit from: (1) discretion to organize content, (2) algorithms for curating user-posted content, and (3) absolution from liability. This favorable regulatory environment results from the current legal framework, which distinguishes between intermediaries and content providers. This distinction is ill-adapted to the modern social media landscape, where platforms deploy powerful data-driven algorithms to play an increasingly active role in shaping what people see, and where users supply disconnected bits of raw content as fodder. Today's platforms have license to monetize whatever content they like, moderate if and when it aligns with their corporate objectives, and curate their content however they wish.

HCI, Opinion, Privacy and Rights

Digging into Big Provenance
(with SPADE)

  Ashish Gehani, Raza Ahmad, Hassaan Irshad, Jianqiao Zhu, and Jignesh Patel

A user interface for querying provenance

Several interfaces exist for querying provenance. Many are not flexible in allowing users to select a database type of their choice. Some provide query functionality in a data model that is different from the graph-oriented one that is natural for provenance. Others have intuitive constructs for finding results but have limited support for efficiently chaining responses, as needed for faceted search. This article presents a user interface for querying provenance that addresses these concerns and is agnostic to the underlying database being used.


The Bikeshed:
What Went Wrong?

  Poul-Henning Kamp

Why we need an IT accident investigation board

Governments should create IT accident investigation boards for the exact same reasons they have done so for ships, railroads, planes, and in many cases, automobiles. Denmark got its Railroad Accident Investigation Board because too many people were maimed and killed by steam trains. The UK's Air Accidents Investigation Branch was created for pretty much the same reasons, but, specifically, because when the airlines investigated themselves, nobody was any the wiser. Does that sound slightly familiar in any way?

Compliance, The Bikeshed


Volume 19, Issue 2

Commit to Memory:
A New Era for Mechanical CAD

  Jessie Frazelle

Time to move forward from decades-old design

The hardware industry is desperate for a modern way to do mechanical design. A new CAD program created for the modern world would lower the barrier to building hardware, decrease the time of development, and usher in a new era of building. The tools used to build with today are supported on the shoulders of giants, but a lot could be done to make them even better. At some point, mechanical CAD lost some of its roots of innovation. Let's dive into a few of the problems with the CAD programs that exist today and see how to make them better.

Commit to Memory, Hardware

Escaping the Singularity:
ACID: My Personal "C" Change

  Pat Helland

How could I miss such a simple thing?

I had a chance recently to chat with my old friend, Andreas Reuter, the inventor of ACID. He and his Ph.D. advisor, Theo Härder, coined the term in their famous 1983 paper, Principles of Transaction-Oriented Database Recovery. I had blinders on after almost four decades of seeing C based on my assumptions. One big lesson for me is to work hard to ALWAYS question your assumptions. Try hard to surround yourself with curious and passionate people, both young and old, who will challenge you and try to dislodge your blinders. Foster a culture that makes them safe as they do so.

Databases, Escaping the Singularity

Kode Vicious:
In Praise of the Disassembler

There's much to be learned from the lower-level details of hardware.

When you're starting out you want to be able to hold the entire program in your head if at all possible. Once you're conversant with your first, simple assembly language and the machine architecture you're working with, it will be completely possible to look at a page or two of your assembly and know not only what it is supposed to do but also what the machine will do for you step by step. When you look at a high-level language, you should be able to understand what you mean it to do, but often you have no idea just how your intent will be translated into action. Assembly and machine code is where the action is.

Development, Kode Vicious

Drill Bits
Schrödinger's Code: Undefined Behavior in Theory and Practice

  Terence Kelly with special guest borers Weiwei Gu and Vladimir Maksimovski

Undefined behavior ranks among the most baffling and perilous aspects of popular programming languages. This installment of Drill Bits clears up widespread misconceptions and presents practical techniques to banish undefined behavior from your own code and pinpoint meaningless operations in any software—techniques that reveal alarming faults in software supporting business-critical applications at Fortune 500 companies.

Code, Databases, Development, Drill Bits, Open Source, Software Design

Case Study: Quantum-safe Trust for Vehicles:
The Race is Already On

A discussion with Michael Gardiner, Alexander Truskovsky, George Neville-Neil, and Atefeh Mashatan

In the automotive industry, cars now coming off assembly lines are sometimes referred to as "rolling data centers" in acknowledgment of all the entertainment and communications capabilities they contain. The fact that autonomous driving systems are also well along in development does nothing to allay concerns about security. Indeed, it would seem the stakes of automobile cybersecurity are about to become immeasurably higher just as some of the underpinnings of contemporary cybersecurity are rendered moot.

Case studies, Privacy and Rights, Security

The Complex Path to Quantum Resistance

  Dr. Atefeh Mashatan and Douglas Heintzman

Is your organization prepared?

Competing quantum-resistant proposals are currently going through academic due diligence and scrutiny by industry leaders. Until the newly minted quantum-resistant standards are finalized, ICT leaders should do their best to plan for a smooth transition. This article provides a series of recommendations for these decision-makers, including what they need to know and do today. It will help them in devising an effective quantum transition plan with a holistic lens that considers the affected assets in people, process, and technology. To do so, the decision-makers first need to comprehend the nature of quantum computing in order to grasp the impact of the impending quantum threat and appreciate its magnitude.

Privacy and Rights, Security

Biases in AI Systems

  Ramya Srinivasan and Ajay Chander

A survey for practitioners

This article provides an organization of various kinds of biases that can occur in the AI pipeline starting from dataset creation and problem formulation to data analysis and evaluation. It highlights the challenges associated with the design of bias-mitigation strategies, and it outlines some best practices suggested by researchers. Finally, a set of guidelines is presented that could aid ML developers in identifying potential sources of bias, as well as avoiding the introduction of unwanted biases. The work is meant to serve as an educational resource for ML developers in handling and addressing issues related to bias in AI systems.

AI, Privacy and Rights


Volume 19, Issue 1

Escaping the Singularity:
Fail-fast Is Failing... Fast!

  Pat Helland

Changes in compute environments are placing pressure on tried-and-true distributed-systems solutions.

For more than 40 years, fail-fast has been the dominant way of achieving fault tolerance. In this approach, some mechanism is responsible for ensuring that each component is up, functioning, and responding to work. As the industry moves to leverage cloud computing, this is getting more challenging. The way we create robust solutions is under pressure as the individual components don't fail fast but instead start running slow, which is far worse. The slow component may be healthy enough to say, "I'm still here!" but slow enough to clog up all the work. This makes fail-fast schemes vulnerable.

Distributed Computing, Distributed Development, Escaping the Singularity, Quality Assurance

Software Development in Disruptive Times

  João Varajão

Creating a software solution with fast decision capability, agile project management, and extreme low-code technology

In this project, the challenge was to "deploy software faster than the coronavirus spread." In a project with such peculiar characteristics, several factors can influence success, but some clearly stand out: top management support, agility, understanding and commitment of the project team, and the technology used. Conventional development approaches and technologies would simply not be able to meet the requirements promptly.


Kode Vicious:
Aversion to Versions

Resolving code-dependency issues

One should never hardcode a version or a path inside the code itself. Code needs to be flexible so that it can be installed anywhere and run anywhere so long as the necessary dependencies can be resolved, either at build time for statically compiled code or at runtime for interpreted code or code with dynamically linked libraries. There are current, good ways to get this right, so it's a shame that so many people continue to get it wrong.
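
One way to avoid a hardcoded path, sketched here under stated assumptions rather than taken from the column: look a dependency up at runtime, letting an environment variable override a search of `PATH`. The helper name `find_tool` and the variable name `DEMO_TOOL` used below are hypothetical; `shutil.which` is the standard-library lookup.

```python
import os
import shutil


def find_tool(name: str, env_var: str) -> str:
    """Locate an executable at runtime instead of hardcoding its path.

    An explicit override in the environment variable `env_var` wins;
    otherwise the directories on PATH are searched.
    """
    override = os.environ.get(env_var)
    if override:
        return override
    found = shutil.which(name)
    if found is None:
        raise FileNotFoundError(f"{name} not found on PATH; set {env_var}")
    return found
```

A caller would write something like `find_tool("tool", "DEMO_TOOL")` and pass the result to `subprocess.run`, so the same code works wherever the dependency happens to be installed.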

Development, Kode Vicious

WebRTC - Realtime Communication for the Open Web Platform

  Niklas Blum, Serge Lachapelle, and Harald Alvestrand, Google

What was once a way to bring audio and video to the web has expanded into more use cases than we could ever imagine.

In this time of pandemic, the world has turned to Internet-based RTC (realtime communication) as never before. The number of RTC products has, over the past decade, exploded in large part because of cheaper high-speed network access and more powerful devices, but also because of an open, royalty-free platform called WebRTC. WebRTC is growing from enabling useful experiences to being essential in allowing billions to continue their work and education, and keep vital human contact during a pandemic. The opportunities and impact that lie ahead for WebRTC are intriguing indeed.

Web Services

Toward Confidential Cloud Computing

  Mark Russinovich, Manuel Costa, Cédric Fournet, David Chisnall, Antoine Delignat-Lavaud, Sylvan Clebsch, Kapil Vaswani, Vikas Bhatia

Extending hardware-enforced cryptographic protection to data while in use

Although largely driven by economies of scale, the development of the modern cloud also enables increased security. Large data centers provide aggregate availability, reliability, and security assurances. The operational cost of ensuring that operating systems, databases, and other services have secure configurations can be amortized among all tenants, allowing the cloud provider to employ experts who are responsible for security; this is often unfeasible for smaller businesses, where the role of systems administrator is often conflated with many others.

Distributed Computing, Privacy, Security

The SPACE of Developer Productivity

  Nicole Forsgren, Margaret-Anne Storey, Chandra Maddila, Thomas Zimmermann, Brian Houck, Jenna Butler

There's more to it than you think.

Developer productivity is about more than an individual's activity levels or the efficiency of the engineering systems relied on to ship software, and it cannot be measured by a single metric or dimension. The SPACE framework captures different dimensions of productivity, and here we demonstrate how this framework can be used to understand productivity in practice and why using it will help teams better understand developer productivity and create better measures to inform their work and teams.

Management, Workflow


