July/August issue of acmqueue


July/August 2017

Research for Practice:
Private Online Communication; Highlights in Systems Verification

  Albert Kwon, James Wilcox

Expert-curated Guides to the Best of CS Research

First, Albert Kwon provides an overview of recent systems for secure and private communication. While messaging protocols such as Signal provide privacy guarantees, Albert's selected research papers illustrate what is possible at the cutting edge: more transparent endpoint authentication, better protection of communication metadata, and anonymous broadcasting. These papers marry state-of-the-art cryptography with practical, privacy-preserving protocols, providing a glimpse of what we might expect from tomorrow's secure messaging systems.

Second, James Wilcox takes us on a tour of recent advances in verified systems design. It's now possible to build end-to-end verified compilers, operating systems, and distributed systems that are provably correct with respect to well-defined specifications, providing high assurance of well-defined, well-behaved code. Because these system components interact with low-level hardware like the instruction set architecture and external networks, each paper introduces new techniques to balance the tension between formal correctness and practical applicability. As programming language techniques advance and more of the modern computing stack continues to crystallize, expect these advances to make their way into production systems.

Research for Practice

Network Applications Are Interactive

  Antony Alappatt

The network era requires new models, with interactions instead of algorithms.

The miniaturization of devices and the prolific interconnectedness of these devices over high-speed wireless networks are completely changing how commerce is conducted. These changes will profoundly alter how enterprises operate. Software is at the heart of this digital world, but the software toolsets and languages were conceived for the host-based era. The issues that already plague software practice will be more profound with such an approach. It is time for software to be made simpler, more secure, and more reliable.


Escaping the Singularity
XML and JSON Are Like Cardboard

  Pat Helland

Cardboard surrounds and protects stuff as it crosses boundaries.

Semi-structured representations of data are not the cheapest format. There's typically a lot of extra stuff like angle brackets contained in it. JSON, XML, and other semi-structured representations allow for wonderful flexibility and dynamic interpretation. The efficiencies and savings gained from flexibility more than make up for the overhead.

Data, Escaping the Singularity

The Soft Side of Software
Breadth and Depth

  Kate Matsudaira

We all wear many hats, but make sure you have one that fits well.

When people ask me the question of where they should focus their time—should I keep learning one technology or spend time learning a new one?—I ask them this question: What is the one thing you could be the best in the world at?

Business and Management, The Soft Side of Software

Cache Me If You Can

  Jacob Loveless

Building a decentralized web-delivery model

The world is more connected than it ever has been before, and with our pocket supercomputers and IoT (Internet of Things) future, the next generation of the web might just be delivered in a peer-to-peer model. It's a giant problem space, but the necessary tools and technology are here today. We just need to define the problem a little better.

Networks, Web Services

Bitcoin's Academic Pedigree

  Arvind Narayanan and Jeremy Clark

The concept of cryptocurrencies is built from forgotten ideas in research literature.

We've seen repeatedly that ideas in the research literature can be gradually forgotten or lie unappreciated, especially if they are ahead of their time, even in popular areas of research. Both practitioners and academics would do well to revisit old ideas to glean insights for present systems. Bitcoin was unusual and successful not because it was on the cutting edge of research on any of its components, but because it combined old ideas from many previously unrelated fields. This is not easy to do, as it requires bridging disparate terminology, assumptions, etc., but it is a valuable blueprint for innovation.

Education, Networks, Security

Kode Vicious:
Cold, Hard Cache

On the implementation and maintenance of caches

Dear KV, Our latest project at work requires a large number of slightly different software stacks to deploy within our cloud infrastructure. With modern hardware, I can test this deployment on a laptop. The problem I keep running up against is that our deployment system seems to secretly cache some of my files and settings and not clear them, even when I repeatedly issue the command to do so. I've resorted to repeatedly using the find command so that I can blow away the offending files. What I've found is that the system caches data in many places so I've started a list. All of which brings me to my question: Who writes this stuff?!

Kode Vicious, Networks

May/June 2017

Research for Practice:
Vigorous Public Debates in Academic Computer Science

  John Regehr

Expert-curated Guides to the Best of CS Research

This installment of Research for Practice features a special curated selection from John Regehr, who takes us on a tour of great debates in academic computer science research. In case you thought flame wars were reserved for Usenet mailing lists and Twitter, think again: the academic literature is full of dramatic, spectacular, and vigorous debates spanning file systems, operating system kernel design, and formal verification.

Research for Practice

Hootsuite: In Pursuit of Reactive Systems

A discussion with Edward Steel, Yanik Berube, Jonas Bonér, Ken Britton, and Terry Coatta

It has become apparent how critical frameworks and standards are for development teams when using microservices. People often mistake the flexibility microservices provide for a requirement to use different technologies for each service. Like all development teams, we still need to keep the number of technologies we use to a minimum so we can easily train new people, maintain our code, support moves between teams, and the like.

Case Studies, Web Services

Everything Sysadmin:
Four Ways to Make CS & IT Curricula More Immersive

  Thomas A. Limoncelli

Why the bell curve hasn't transformed into a hockey stick

Education should seek to normalize best practices from the start. Working outside these best practices should be considered a bug. Students should not struggle to learn best practices after graduation, and they should be shocked if potential new employers do not already have these practices in place.

Education, Everything Sysadmin

Metaphors We Compute By

  Alvaro Videla

Code is a story that explains how to solve a particular problem.

Programmers must be able to tell a story with their code, explaining how they solved a particular problem. Like writers, programmers must know their metaphors. Many metaphors will be able to explain a concept, but you must have enough skill to choose the right one that's able to convey your ideas to future programmers who will read the code.


The Soft Side of Software
10 Ways to Be a Better Interviewer

  Kate Matsudaira

Plan ahead to make the interview a successful one.

As an interviewer, the key to your success is preparation. Planning will help ensure the success of the interview (both in terms of getting the information you need and giving the candidate a good impression).

Business and Management, The Soft Side of Software

Is There a Single Method for the Internet of Things?

  Ivar Jacobson, Ian Spence, Pan-Wei Ng

Essence can keep software development for the IoT from becoming unwieldy.

The Industrial Internet Consortium predicts the IoT (Internet of Things) will become the third technological revolution after the Industrial Revolution and the Internet Revolution. Its impact across all industries and businesses can hardly be imagined. Existing software (business, telecom, aerospace, defense, etc.) is expected to be modified or redesigned, and a huge amount of new software, solving new problems, will have to be developed. As a consequence, the software industry should welcome new and better methods.


Kode Vicious:
IoT: The Internet of Terror

If it seems like the sky is falling, that's because it is.

It is true that many security-focused engineers can sound like Chicken Little, running around announcing that the sky is falling, but, unless you've been living under a rock, you will notice that, indeed, the sky IS falling. Not a day goes by without a significant attack against networked systems making the news, and the Internet of Terror is leading the charge in taking distributed systems down the road to hell—a road that you wish to pave with your good intentions.

Kode Vicious, Networks, Security


March/April 2017

Research for Practice:
- Technology for Underserved Communities
- Personal Fabrication

  Tawanna Dillahunt, Stefanie Mueller and Patrick Baudisch

Expert-curated Guides to the Best of CS Research

This installment of Research for Practice provides curated reading guides to technology for underserved communities and to new developments in personal fabrication. First, Tawanna Dillahunt describes design considerations and technology for underserved and impoverished communities. Designing for the more than 1.6 billion impoverished individuals worldwide requires special consideration of community needs, constraints, and context. Tawanna's selections span protocols for poor-quality communication networks, community-driven content generation, and resource and public service discovery. Second, Stefanie Mueller and Patrick Baudisch provide an overview of recent advances in personal fabrication (e.g., 3D printers). Their selection covers new techniques for fabricating (and emulating) complex materials (e.g., by manipulating the internal structure of an object), for more easily specifying object shape and behavior, and for human-in-the-loop rapid prototyping. Combined, these two guides provide a fascinating deep dive into some of the latest human-centric computer science research results.

Research for Practice

Data Sketching

  Graham Cormode

The approximate approach is often faster and more efficient.

Do you ever feel overwhelmed by an unending stream of information? It can seem like a barrage of new email and text messages demands constant attention, and there are also phone calls to pick up, articles to read, and knocks on the door to answer. Putting these pieces together to keep track of what's important can be a real challenge.

Data, Networks

Escaping the Singularity
Side Effects, Front and Center!

  Pat Helland

One System's Side Effect is Another's Meat and Potatoes.

We think of computation in terms of its consequences. The big MapReduce job returns a large result. Web interactions display information. Enterprise applications update the database and return an answer. These are the reasons we do our work.

What we rarely discuss are the side effects of doing the work we intend. Side effects may be unwanted, or they may actually cause desired behavior at different layers of the system. This column points out some fun patterns to keep in mind as we build and use our systems.

Escaping the Singularity

The Calculus of Service Availability

  Ben Treynor, Mike Dahlin, Vivek Rau, Betsy Beyer

You're only as available as the sum of your dependencies.

Most services offered by Google aim to offer 99.99 percent (sometimes referred to as the "four 9s") availability to users. Some services contractually commit to a lower figure externally but set a 99.99 percent target internally. This more stringent target accounts for situations in which users become unhappy with service performance well before a contract violation occurs, as the number one aim of an SRE team is to keep users happy. For many services, a 99.99 percent internal target represents the sweet spot that balances cost, complexity, and availability. For some services, notably global cloud services, the internal target is 99.999 percent.
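To make the percentages above concrete, here is a minimal sketch of the arithmetic behind an availability target. The function names are hypothetical, and the dependency calculation rests on a simplifying assumption that is not taken from the article: critical dependencies fail independently, so their availabilities multiply.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; assumes a non-leap year


def downtime_budget_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def combined_availability(*dependency_percents: float) -> float:
    """Availability of a service that needs every listed dependency.

    Assumption (illustrative only): dependencies fail independently,
    so the combined availability is the product of the individual ones.
    """
    combined = 1.0
    for percent in dependency_percents:
        combined *= percent / 100
    return combined * 100


if __name__ == "__main__":
    for target in (99.99, 99.999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% allows about {budget:.1f} minutes of downtime per year")
    print(f"three 99.99% dependencies: {combined_availability(99.99, 99.99, 99.99):.4f}%")
```

Under these assumptions, a 99.99 percent target leaves roughly 52.6 minutes of downtime per year and a 99.999 percent target roughly 5.3 minutes, while three independent 99.99 percent dependencies together fall to about 99.97 percent.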

Web Services

The Soft Side of Software
Conversations with Technology Leaders: Erik Meijer

  Kate Matsudaira

Great engineers are able to maximize their mental power.

Whether you are a leader, a programmer, or just someone aspiring to be better, I am sure there are some smart takeaways from our conversation that will help you grow in your role. Oh, and if you read to the end, you can find out what his favorite job interview question is—and see if you would be able to pass his test.

The Soft Side of Software

The IDAR Graph

  Mark A. Overton

An improvement over UML

UML is the de facto standard for representing object-oriented designs. It does a fine job of recording designs, but it has a severe problem: its diagrams don't convey what humans need to know, making them hard to understand. This is why most software developers use UML only when forced to.

To be useful, a graph that portrays software design must communicate in a way that humans understand. An organization of objects in software is analogous to a human organization, and almost without exception, an organization of people is portrayed as a control hierarchy, with the topmost person having the broadest span of control.

Development, Workflow Systems

Kode Vicious: The Observer Effect

Finding the balance between zero and maximum

The problem here is usually a failure to appreciate just what you are asking a system to do when polling it for information. Modern systems contain thousands of values that can be measured and recorded. Blindly retrieving whatever it is that might be exposed by the system is bad enough, but asking for it with a high-frequency poll is much worse for several reasons.

Kode Vicious

January/February 2017

Too Big NOT to Fail

  Pat Helland, Simon Weaver, and Ed Harris

Embrace failure so it doesn't embrace you.

Web-scale infrastructure implies LOTS of servers working together—often tens or hundreds of thousands of servers all working toward the same goal. How can the complexity of these environments be managed? How can commonality and simplicity be introduced?

Failure and Recovery

Research for Practice:
- Tracing and Debugging Distributed Systems;
- Programming by Examples

  Peter Alvaro, Sumit Gulwani

Expert-curated Guides to the Best of CS Research

This installment of Research for Practice covers two exciting topics in distributed systems and programming methodology. First, Peter Alvaro takes us on a tour of recent techniques for debugging some of the largest and most complex systems in the world: modern distributed systems and service-oriented architectures. The techniques Peter surveys can shed light on order amid the chaos of distributed call graphs. Second, Sumit Gulwani illustrates how to program without explicitly writing programs, instead synthesizing programs from examples! The techniques Sumit presents allow systems to "learn" a program representation from illustrative examples, allowing nonprogrammer users to create increasingly nontrivial functions such as spreadsheet macros. Both of these selections are well in line with RfP's goal of accessible, practical research; in fact, both contributors have successfully transferred their own research in each area to production, at Netflix and as part of Microsoft Excel. Readers may also find a use case!

Debugging, Development, Distributed Development, Research for Practice

The Debugging Mindset

  Devon H. O'Dell

Understanding the psychology of learning strategies leads to effective problem-solving skills.

Software developers spend 35-50 percent of their time validating and debugging software. The cost of debugging, testing, and verification is estimated to account for 50-75 percent of the total budget of software development projects, amounting to more than $100 billion annually. While tools, languages, and environments have reduced the time spent on individual debugging tasks, they have not significantly reduced the total time spent debugging, nor the cost of doing so. Therefore, a hyperfocus on elimination of bugs during development is counterproductive; programmers should instead embrace debugging as an exercise in problem solving.


Kode Vicious: Forced Exception-Handling

You can never discount the human element in programming.

Yes, KV also reads "The Morning Paper," although he has to admit that he does not read everything that arrives in his inbox from that list. Of course, the paper you mention piqued my interest, and one of the things you don't point out is that it's actually a study of distributed systems failures. Now, how can we make programming harder? I know! Let's take a problem on a single system and distribute it. Someday I would like to see a paper that tells us if problems in distributed systems increase along with the number of nodes, or the number of interconnections. Being an optimist, I can only imagine that it's N(N + 1) / 2, or worse.

Kode Vicious

MongoDB's JavaScript Fuzzer

  Robert Guo

The fuzzer is for those edge cases that your testing didn't catch.

Fuzzing, or fuzz testing, is a technique for generating randomized, unexpected, and invalid input to a program to trigger untested code paths. Fuzzing was originally developed in the 1980s and has since proven to be effective at ensuring the stability of a wide range of systems, from file systems to distributed clusters to browsers. As people have attempted to make fuzzing more effective, two philosophies have emerged: smart and dumb fuzzing. As the state of the art evolves, the techniques that are used to implement fuzzers are being partitioned into categories, chief among them being generational and mutational. In many popular fuzzing tools, smart fuzzing corresponds to generational techniques, and dumb fuzzing to mutational techniques, but this is not an intrinsic relationship. Indeed, in our case at MongoDB, the situation is precisely reversed.
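The two categories named above can be contrasted with a toy sketch. This is not MongoDB's fuzzer: the seed string, the arithmetic grammar, and the function names are all hypothetical and chosen only to show the difference between corrupting an existing input (mutational) and building a new input from a grammar (generational).

```python
import random


def mutate(seed_input: str, rng: random.Random, n_edits: int = 3) -> str:
    """Mutational fuzzing: randomly corrupt a known-valid input."""
    chars = list(seed_input)
    for _ in range(n_edits):
        if not chars:
            break
        pos = rng.randrange(len(chars))
        # Overwrite one position with a random printable ASCII character.
        chars[pos] = chr(rng.randrange(32, 127))
    return "".join(chars)


def generate(rng: random.Random, depth: int = 0) -> str:
    """Generational fuzzing: build an input from a toy grammar.

    Toy grammar (an assumption for illustration):
        expr := number | "(" expr op expr ")"
    """
    if depth >= 3 or rng.random() < 0.4:
        return str(rng.randrange(100))
    op = rng.choice(["+", "-", "*"])
    return f"({generate(rng, depth + 1)} {op} {generate(rng, depth + 1)})"


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a failing input can be reproduced
    print("mutated:  ", mutate("find({a: 1})", rng))
    print("generated:", generate(rng))
```

The mutated string is usually malformed, which exercises parsing and error handling; the generated string is always well-formed under the toy grammar, which lets the test reach deeper code paths. Which of the two counts as "smart" depends on the tool, as the abstract notes.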

Databases, QA

The Soft Side of Software
Does Anybody Listen to You?

  Kate Matsudaira

How do you step up from mere contributor to real change-maker?

When you are navigating an organization, it pays to know whom to talk to and how to reach them. Here is a simple guide to sending your ideas up the chain and actually making them stick. It takes three elements: the right people, the right time, and the right way.

The Soft Side of Software

Making Money Using Math

  Erik Meijer

Modern applications are increasingly using probabilistic machine-learned models.

Machine learning, or ML, is all the rage today, and there are good reasons for that. Models created by machine-learning algorithms for problems such as spam filtering, speech and image recognition, language translation, and text understanding have many advantages over code written by human developers. Machine learning, however, is not as magical as it sounds at first. In fact, it is rather analogous to how human developers create code using test-driven development.

Artificial Intelligence

November/December 2016

Pervasive, Dynamic Authentication of Physical Items

  Meng-Day (Mandel) Yu, Srinivas Devadas

The use of silicon PUF circuits

Authentication of physical items is an age-old problem. Common approaches include the use of bar codes, QR codes, holograms, and RFID (radio-frequency identification) tags. Traditional RFID tags and bar codes use a public identifier as a means of authenticating. A public identifier, however, is static: it is the same each time it is queried and can be easily copied by an adversary. Holograms can also be viewed as public identifiers: a knowledgeable verifier knows all the attributes to inspect visually. It is difficult to make hologram-based authentication pervasive; a casual verifier does not know all the attributes to look for. Further, to achieve pervasive authentication, it is useful for the authentication modality to be easy to integrate with modern electronic devices (e.g., mobile smartphones) and to be easy for non-experts to use.


Research for Practice:
- Cryptocurrencies, Blockchains, and Smart Contracts;
- Hardware for Deep Learning

  Peter Bailis, Arvind Narayanan, Andrew Miller, and Song Han

Expert-curated Guides to the Best of CS Research

First, Arvind Narayanan and Andrew Miller, co-authors of the increasingly popular open-access Princeton Bitcoin textbook, provide an overview of ongoing research in cryptocurrencies. This is a topic with a long history in the academic literature that has recently come to prominence with the rise of Bitcoin, blockchains, and similar implementations of advanced, decentralized protocols. These developments have captured the public imagination and the eye of the popular press. In the meantime, academics have been busy, delivering new results in maintaining anonymity, ensuring usability, detecting errors, and reasoning about decentralized markets, all through the lens of these modern cryptocurrency systems. It is a pleasure having two academic experts deliver the latest updates from the burgeoning body of academic research on this subject.

Second, Song Han provides an overview of hardware trends related to another long-studied academic problem that has recently seen an explosion in popularity: deep learning. Fueled by large amounts of training data and inexpensive parallel and scale-out compute, deep-learning-model architectures have seen a massive resurgence of interest based on their excellent performance on traditionally difficult tasks such as image recognition. These deep networks are compute-intensive to train and evaluate, and many of the best minds in computer systems (e.g., the team that developed MapReduce) and AI are working to improve them. As a result, Song has provided a fantastic overview of recent advances devoted to using hardware and hardware-aware techniques to compress networks, improve their performance, and reduce their often large amounts of energy consumption.

AI, Networks, Privacy, Research for Practice

Uninitialized Reads

  Robert C. Seacord, NCC Group

Understanding the proposed revisions to the C language

Most developers understand that reading uninitialized variables in C is a defect, but some do it anyway. What happens when you read uninitialized objects is unsettled in the current version of the C standard (C11). Various proposals have been made to resolve these issues in the planned C2X revision of the standard. Consequently, this is a good time to understand existing behaviors as well as proposed revisions to the standard to influence the evolution of the C language. Given that the behavior of uninitialized reads is unsettled in C11, prudence dictates eliminating uninitialized reads from your code.

Programming Languages

Heterogeneous Computing: Here to Stay

  Mohamed Zahran

Hardware and Software Perspectives

Mentions of the buzzword heterogeneous computing have been on the rise in the past few years and will continue to be heard for years to come, because heterogeneous computing is here to stay. What is heterogeneous computing, and why is it becoming the norm? How do we deal with it, from both the software side and the hardware side? This article provides answers to some of these questions and presents different points of view on others.


Time, but Faster

  Theo Schlossnagle

A computing adventure about time through the looking glass

Every once in a while, you find yourself in a rabbit hole, unsure of where you are or what time it might be. This article presents a computing adventure about time through the looking glass.

The first premise was summed up perfectly by the late Douglas Adams in The Hitchhiker's Guide to the Galaxy: "Time is an illusion. Lunchtime doubly so." The concept of time, when colliding with decoupled networks of computers that run at billions of operations per second, is... well, the truth of the matter is that you simply never really know what time it is. That is why Leslie Lamport's seminal paper on Lamport timestamps was so important to the industry, but this article is actually about wall-clock time, or a reasonably useful estimation of it.


Kode Vicious: The Chess Player Who Couldn't Pass the Salt

AI: Soft and hard, weak and strong, narrow and general

The problem inherent in almost all nonspecialist work in AI is that humans actually don't understand intelligence very well in the first place. Now, computer scientists often think they understand intelligence because they have so often been the "smart" kid, but that's got very little to do with understanding what intelligence actually is. In the absence of a clear understanding of how the human brain generates and evaluates ideas, which may or may not be a good basis for the concept of intelligence, we have introduced numerous proxies for intelligence, the first of which is game-playing behavior.

AI, Kode Vicious

Everything Sysadmin:
Are You Load Balancing Wrong?

  Thomas A. Limoncelli

Anyone can use a load balancer. Using them properly is much more difficult.

In today's web-centric, service-centric environments, the use of load balancers is widespread. I assert, however, that most of the time they are used incorrectly. To understand the problem, we first need to discuss a little about load balancers in general. Then we can look at the problem and solutions.

Everything Sysadmin, System Administration
