May/June issue of acmqueue


May/June 2017



Metaphors We Compute By

  Alvaro Videla

Code is a story that explains how to solve a particular problem.

Programmers must be able to tell a story with their code, explaining how they solved a particular problem. Like writers, programmers must know their metaphors. Many metaphors can explain a given concept, but it takes skill to choose the one that will convey your ideas to the future programmers who read the code.

Development


The Soft Side of Software
10 Ways to Be a Better Interviewer


  Kate Matsudaira

Plan ahead to make the interview a successful one.

As an interviewer, the key to your success is preparation. Planning will help ensure the success of the interview (both in terms of getting the information you need and giving the candidate a good impression).

Business and Management, The Soft Side of Software



Is There a Single Method for the Internet of Things?

  Ivar Jacobson, Ian Spence, Pan-Wei Ng

Essence can keep software development for the IoT from becoming unwieldy.

The Industrial Internet Consortium predicts the IoT (Internet of Things) will become the third technological revolution after the Industrial Revolution and the Internet Revolution. Its impact across all industries and businesses can hardly be imagined. Existing software (business, telecom, aerospace, defense, etc.) is expected to be modified or redesigned, and a huge amount of new software, solving new problems, will have to be developed. As a consequence, the software industry should welcome new and better methods.

Development



Kode Vicious:
IoT: The Internet of Terror


If it seems like the sky is falling, that's because it is.

It is true that many security-focused engineers can sound like Chicken Little, running around announcing that the sky is falling, but, unless you've been living under a rock, you will notice that, indeed, the sky IS falling. Not a day goes by without a significant attack against networked systems making the news, and the Internet of Terror is leading the charge in taking distributed systems down the road to hell—a road that you wish to pave with your good intentions.

Kode Vicious, Networks, Security


 


March/April 2017


Research for Practice:
- Technology for Underserved Communities
- Personal Fabrication


  Tawanna Dillahunt, Stefanie Mueller, and Patrick Baudisch

Expert-curated Guides to the Best of CS Research

This installment of Research for Practice provides curated reading guides to technology for underserved communities and to new developments in personal fabrication. First, Tawanna Dillahunt describes design considerations and technology for underserved and impoverished communities. Designing for the more than 1.6 billion impoverished individuals worldwide requires special consideration of community needs, constraints, and context. Tawanna's selections span protocols for poor-quality communication networks, community-driven content generation, and resource and public service discovery. Second, Stefanie Mueller and Patrick Baudisch provide an overview of recent advances in personal fabrication (e.g., 3D printers). Their selection covers new techniques for fabricating (and emulating) complex materials (e.g., by manipulating the internal structure of an object), for more easily specifying object shape and behavior, and for human-in-the-loop rapid prototyping. Combined, these two guides provide a fascinating deep dive into some of the latest human-centric computer science research results.

Research for Practice



Data Sketching

  Graham Cormode

The approximate approach is often faster and more efficient.

Do you ever feel overwhelmed by an unending stream of information? It can seem like a barrage of new email and text messages demands constant attention, and there are also phone calls to pick up, articles to read, and knocks on the door to answer. Putting these pieces together to keep track of what's important can be a real challenge.

Data, Networks


Escaping the Singularity
Side Effects, Front and Center!


  Pat Helland

One system's side effect is another's meat and potatoes.

We think of computation in terms of its consequences. The big MapReduce job returns a large result. Web interactions display information. Enterprise applications update the database and return an answer. These are the reasons we do our work.

What we rarely discuss are the side effects of doing the work we intend. Side effects may be unwanted, or they may actually cause desired behavior at different layers of the system. This column points out some fun patterns to keep in mind as we build and use our systems.

Escaping the Singularity



The Calculus of Service Availability

  Ben Treynor, Mike Dahlin, Vivek Rau, Betsy Beyer

You're only as available as the sum of your dependencies.

Most services offered by Google aim for 99.99 percent (sometimes referred to as the "four 9s") availability to users. Some services contractually commit to a lower figure externally but set a 99.99 percent target internally. This more stringent target accounts for situations in which users become unhappy with service performance well before a contract violation occurs, as the number one aim of an SRE team is to keep users happy. For many services, a 99.99 percent internal target represents the sweet spot that balances cost, complexity, and availability. For some services, notably global cloud services, the internal target is 99.999 percent.
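The dependency arithmetic the title alludes to can be sketched in a few lines. This is the common rule-of-thumb calculation, not code from the article: a service that requires all of its (independent) dependencies to be up can be no more available than the product of their availabilities, which is why a "four 9s" service needs dependencies meaningfully better than four 9s.

```python
def composite_availability(dependency_availabilities):
    """Upper bound on the availability of a service that needs
    every one of its (independent) dependencies to be up."""
    a = 1.0
    for dep in dependency_availabilities:
        a *= dep
    return a

def annual_downtime_minutes(availability):
    """Convert an availability fraction into minutes of downtime per year."""
    return (1.0 - availability) * 365 * 24 * 60

# Three "four 9s" dependencies already consume most of a "four 9s" budget:
print(composite_availability([0.9999, 0.9999, 0.9999]))  # ~0.9997
```

At 99.99 percent, `annual_downtime_minutes` works out to roughly 53 minutes per year, which is the entire error budget those three dependencies are sharing.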

Web Services


The Soft Side of Software
Conversations with Technology Leaders: Erik Meijer


  Kate Matsudaira

Great engineers are able to maximize their mental power.

Whether you are a leader, a programmer, or just someone aspiring to be better, I am sure there are some smart takeaways from our conversation that will help you grow in your role. Oh, and if you read to the end, you can find out what his favorite job interview question is—and see if you would be able to pass his test.

The Soft Side of Software



The IDAR Graph

  Mark A. Overton

An improvement over UML

UML is the de facto standard for representing object-oriented designs. It does a fine job of recording designs, but it has a severe problem: its diagrams don't convey what humans need to know, making them hard to understand. This is why most software developers use UML only when forced to.

To be useful, a graph that portrays software design must communicate in a way that humans understand. An organization of objects in software is analogous to a human organization, and almost without exception, an organization of people is portrayed as a control hierarchy, with the topmost person having the broadest span of control.

Development, Workflow Systems



Kode Vicious: The Observer Effect

Finding the balance between zero and maximum

The problem here is usually a failure to appreciate just what you are asking a system to do when polling it for information. Modern systems contain thousands of values that can be measured and recorded. Blindly retrieving whatever it is that might be exposed by the system is bad enough, but asking for it with a high-frequency poll is much worse for several reasons.

Kode Vicious


January/February 2017


Too Big NOT to Fail

  Pat Helland, Simon Weaver, and Ed Harris

Embrace failure so it doesn't embrace you.

Web-scale infrastructure implies LOTS of servers working together—often tens or hundreds of thousands of servers all working toward the same goal. How can the complexity of these environments be managed? How can commonality and simplicity be introduced?

Failure and Recovery


Research for Practice:
- Tracing and Debugging Distributed Systems;
- Programming by Examples


  Peter Alvaro, Sumit Gulwani

Expert-curated Guides to the Best of CS Research

This installment of Research for Practice covers two exciting topics in distributed systems and programming methodology. First, Peter Alvaro takes us on a tour of recent techniques for debugging some of the largest and most complex systems in the world: modern distributed systems and service-oriented architectures. The techniques Peter surveys can shed light on order amid the chaos of distributed call graphs. Second, Sumit Gulwani illustrates how to program without explicitly writing programs, instead synthesizing programs from examples! The techniques Sumit presents allow systems to "learn" a program representation from illustrative examples, allowing nonprogrammer users to create increasingly nontrivial functions such as spreadsheet macros. Both of these selections are well in line with RfP's goal of accessible, practical research; in fact, both contributors have successfully transferred their own research in each area to production, at Netflix and as part of Microsoft Excel. Readers may also find a use case!

Debugging, Development, Distributed Development, Research for Practice




The Debugging Mindset

  Devon H. O'Dell

Understanding the psychology of learning strategies leads to effective problem-solving skills.

Software developers spend 35-50 percent of their time validating and debugging software. The cost of debugging, testing, and verification is estimated to account for 50-75 percent of the total budget of software development projects, amounting to more than $100 billion annually. While tools, languages, and environments have reduced the time spent on individual debugging tasks, they have not significantly reduced the total time spent debugging, nor the cost of doing so. Therefore, a hyperfocus on elimination of bugs during development is counterproductive; programmers should instead embrace debugging as an exercise in problem solving.

Debugging



Kode Vicious: Forced Exception-Handling

You can never discount the human element in programming.

Yes, KV also reads "The Morning Paper," although he has to admit that he does not read everything that arrives in his inbox from that list. Of course, the paper you mention piqued my interest, and one of the things you don't point out is that it's actually a study of distributed systems failures. Now, how can we make programming harder? I know! Let's take a problem on a single system and distribute it. Someday I would like to see a paper that tells us if problems in distributed systems increase along with the number of nodes, or the number of interconnections. Being an optimist, I can only imagine that it's N(N + 1) / 2, or worse.
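KV's back-of-the-envelope pessimism is easy to make concrete. The snippet below is an illustration, not anything from the column: it tabulates N(N+1)/2, the number of node pairs when each node is also paired with itself, to show how quickly that guess grows.

```python
def kv_problem_estimate(n: int) -> int:
    """KV's pessimistic guess: trouble grows with the number of
    node pairs, counting each node paired with itself: N(N+1)/2."""
    return n * (n + 1) // 2

# Quadratic growth: add a node and you add a node's worth of new pairings.
for n in (2, 10, 100, 1000):
    print(n, kv_problem_estimate(n))
```

Ten nodes give 55 pairings; a thousand give half a million, which is the optimistic case KV is imagining.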

Kode Vicious


MongoDB's JavaScript Fuzzer

  Robert Guo

The fuzzer is for those edge cases that your testing didn't catch.

Fuzzing, or fuzz testing, is a technique for generating randomized, unexpected, and invalid input to a program to trigger untested code paths. Fuzzing was originally developed in the 1980s and has since proven to be effective at ensuring the stability of a wide range of systems, from file systems to distributed clusters to browsers. As people have attempted to make fuzzing more effective, two philosophies have emerged: smart and dumb fuzzing. As the state of the art evolves, the techniques that are used to implement fuzzers are being partitioned into categories, chief among them being generational and mutational. In many popular fuzzing tools, smart fuzzing corresponds to generational techniques, and dumb fuzzing to mutational techniques, but this is not an intrinsic relationship. Indeed, in our case at MongoDB, the situation is precisely reversed.
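The mutational side of that divide can be sketched in a few lines. This is hypothetical illustration code, not MongoDB's fuzzer: a mutational fuzzer starts from a valid seed input, perturbs it at random, and reports inputs that make the target misbehave.

```python
import random

def mutate(seed: bytes, rng: random.Random, n_flips: int = 4) -> bytes:
    """Return a copy of `seed` with a few bytes replaced at random."""
    data = bytearray(seed)
    for _ in range(n_flips):
        data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)

def fuzz(target, seed: bytes, iterations: int = 2000) -> list:
    """Feed mutated inputs to `target`; collect the ones that raise."""
    rng = random.Random(1)  # seeded so failures are reproducible
    failures = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except Exception as exc:
            failures.append((candidate, exc))
    return failures

# A toy "parser" with an untested edge case: it chokes on a 0xFF byte.
def toy_parser(data: bytes) -> None:
    if b"\xff" in data:
        raise ValueError("unhandled byte 0xff")

crashes = fuzz(toy_parser, b"a perfectly ordinary seed input")
print(len(crashes) > 0)
```

The fuzzer knows nothing about the input format ("dumb"), yet random byte flips still reach the edge case; a generational fuzzer would instead build inputs from a grammar of the format.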

Databases, QA


The Soft Side of Software
Does Anybody Listen to You?


  Kate Matsudaira

How do you step up from mere contributor to real change-maker?

When you are navigating an organization, it pays to know whom to talk to and how to reach them. Here is a simple guide to sending your ideas up the chain and actually making them stick. It takes three elements: the right people, the right time, and the right way.

The Soft Side of Software


Making Money Using Math

  Erik Meijer

Modern applications are increasingly using probabilistic machine-learned models.

Machine learning, or ML, is all the rage today, and there are good reasons for that. Models created by machine-learning algorithms for problems such as spam filtering, speech and image recognition, language translation, and text understanding have many advantages over code written by human developers. Machine learning, however, is not as magical as it sounds at first. In fact, it is rather analogous to how human developers create code using test-driven development.

Artificial Intelligence



November/December 2016


Pervasive, Dynamic Authentication of Physical Items

  Meng-Day (Mandel) Yu, Srinivas Devadas

The use of silicon PUF circuits

Authentication of physical items is an age-old problem. Common approaches include the use of bar codes, QR codes, holograms, and RFID (radio-frequency identification) tags. Traditional RFID tags and bar codes use a public identifier as a means of authenticating. A public identifier, however, is static: it is the same each time when queried and can be easily copied by an adversary. Holograms can also be viewed as public identifiers: a knowledgeable verifier knows all the attributes to inspect visually. It is difficult to make hologram-based authentication pervasive; a casual verifier does not know all the attributes to look for. Further, to achieve pervasive authentication, it is useful for the authentication modality to be easy to integrate with modern electronic devices (e.g., mobile smartphones) and to be easy for non-experts to use.

Security



Research for Practice:
- Cryptocurrencies, Blockchains, and Smart Contracts;
- Hardware for Deep Learning


  Peter Bailis, Arvind Narayanan, Andrew Miller, and Song Han

Expert-curated Guides to the Best of CS Research

First, Arvind Narayanan and Andrew Miller, co-authors of the increasingly popular open-access Princeton Bitcoin textbook, provide an overview of ongoing research in cryptocurrencies. This is a topic with a long history in the academic literature that has recently come to prominence with the rise of Bitcoin, blockchains, and similar implementations of advanced, decentralized protocols. These developments have captured the public imagination and the eye of the popular press. In the meantime, academics have been busy, delivering new results in maintaining anonymity, ensuring usability, detecting errors, and reasoning about decentralized markets, all through the lens of these modern cryptocurrency systems. It is a pleasure having two academic experts deliver the latest updates from the burgeoning body of academic research on this subject.

Second, Song Han provides an overview of hardware trends related to another long-studied academic problem that has recently seen an explosion in popularity: deep learning. Fueled by large amounts of training data and inexpensive parallel and scale-out compute, deep-learning-model architectures have seen a massive resurgence of interest based on their excellent performance on traditionally difficult tasks such as image recognition. These deep networks are compute-intensive to train and evaluate, and many of the best minds in computer systems (e.g., the team that developed MapReduce) and AI are working to improve them. As a result, Song has provided a fantastic overview of recent advances devoted to using hardware and hardware-aware techniques to compress networks, improve their performance, and reduce their often large amounts of energy consumption.

AI, Networks, Privacy, Research for Practice



Uninitialized Reads

  Robert C. Seacord, NCC Group

Understanding the proposed revisions to the C language

Most developers understand that reading uninitialized variables in C is a defect, but some do it anyway. What happens when you read uninitialized objects is unsettled in the current version of the C standard (C11). Various proposals have been made to resolve these issues in the planned C2X revision of the standard. Consequently, this is a good time to understand existing behaviors as well as proposed revisions to the standard to influence the evolution of the C language. Given that the behavior of uninitialized reads is unsettled in C11, prudence dictates eliminating uninitialized reads from your code.

Programming Languages



Heterogeneous Computing: Here to Stay

  Mohamed Zahran

Hardware and Software Perspectives

Mentions of the buzzword heterogeneous computing have been on the rise in the past few years and will continue to be heard for years to come, because heterogeneous computing is here to stay. What is heterogeneous computing, and why is it becoming the norm? How do we deal with it, from both the software side and the hardware side? This article provides answers to some of these questions and presents different points of view on others.

Architecture



Time, but Faster

  Theo Schlossnagle

A computing adventure about time through the looking glass

Every once in a while, you find yourself in a rabbit hole, unsure of where you are or what time it might be. This article presents a computing adventure about time through the looking glass.

The first premise was summed up perfectly by the late Douglas Adams in The Hitchhiker's Guide to the Galaxy: "Time is an illusion. Lunchtime doubly so." The concept of time, when colliding with decoupled networks of computers that run at billions of operations per second, is... well, the truth of the matter is that you simply never really know what time it is. That is why Leslie Lamport's seminal paper on Lamport timestamps was so important to the industry, but this article is actually about wall-clock time, or a reasonably useful estimation of it.
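The Lamport-timestamp idea mentioned above fits in a dozen lines; this is a generic sketch of the 1978 scheme, not code from the article. Each process keeps a counter, stamps outgoing messages with it, and fast-forwards past any larger stamp it receives, which yields a consistent "happened-before" ordering without trusting any wall clock.

```python
class LamportClock:
    """Logical clock: orders events without reference to wall-clock time."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """A local event occurred; advance the clock."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Stamp an outgoing message with the current logical time."""
        return self.tick()

    def receive(self, remote_time: int) -> int:
        """On receipt, jump past the sender's stamp."""
        self.time = max(self.time, remote_time) + 1
        return self.time

# Two processes exchange one message: the receive orders after the send.
a, b = LamportClock(), LamportClock()
stamp = a.send()         # a's clock is now 1
print(b.receive(stamp))  # 2: b's clock jumps past the stamp
```

The guarantee is one-directional: if event X happened before event Y, X's timestamp is smaller, but equal or nearby timestamps say nothing about real time, which is exactly why the article turns to wall-clock estimation instead.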

Networks




Kode Vicious: The Chess Player Who Couldn't Pass the Salt

AI: Soft and hard, weak and strong, narrow and general

The problem inherent in almost all nonspecialist work in AI is that humans actually don't understand intelligence very well in the first place. Now, computer scientists often think they understand intelligence because they have so often been the "smart" kid, but that's got very little to do with understanding what intelligence actually is. In the absence of a clear understanding of how the human brain generates and evaluates ideas, which may or may not be a good basis for the concept of intelligence, we have introduced numerous proxies for intelligence, the first of which is game-playing behavior.

AI, Kode Vicious



Everything Sysadmin:
Are You Load Balancing Wrong?


  Thomas A. Limoncelli

Anyone can use a load balancer. Using them properly is much more difficult.

In today's web-centric, service-centric environments the use of load balancers is widespread. I assert, however, that most of the time they are used incorrectly. To understand the problem, we first need to discuss a little about load balancers in general. Then we can look at the problem and solutions.

Everything Sysadmin, System Administration


