Deterministic Record-and-Replay:
Zeroing in only on the nondeterministic actions of the process
This column describes three recent research advances in deterministic record-and-replay, with the goal of showing both classical and emerging use cases. A growing number of systems use a weaker form of deterministic record-and-replay: essentially, they exploit the determinism that exists across many program executions but intentionally allow some nondeterminism for performance reasons. This trend is exemplified by GPUReplay in particular, but also by systems such as ShortCut and Dora.
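The core idea, that one need only record a program's nondeterministic inputs to reproduce its execution, can be sketched in a few lines. This is a minimal illustration, not the mechanism of any of the systems above; the `RecordReplay` class and `workload` function are invented names.

```python
import random
import time

class RecordReplay:
    """Log the results of nondeterministic calls while recording;
    feed them back verbatim while replaying (illustrative sketch)."""

    def __init__(self, mode, log=None):
        self.mode = mode                 # "record" or "replay"
        self.log = log if log is not None else []
        self.pos = 0

    def call(self, fn, *args):
        if self.mode == "record":
            result = fn(*args)           # execute the nondeterministic action
            self.log.append(result)
            return result
        result = self.log[self.pos]      # replay: reuse the recorded result
        self.pos += 1
        return result

def workload(rr):
    # Only the nondeterministic inputs go through the recorder; the
    # rest of the computation re-executes deterministically from them.
    a = rr.call(random.random)
    b = rr.call(time.time)
    return a * 2 + b % 10

rec = RecordReplay("record")
first = workload(rec)
rep = RecordReplay("replay", log=rec.log)
assert workload(rep) == first            # replay reproduces the original run
```

The point the column's systems push on is that the log can be made smaller still: anything that is the same across runs need not be recorded at all.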
Automatically Testing Database Systems:
DBMS testing with test oracles, transaction history, and fuzzing
The automated testing of DBMSs (database management systems) is an exciting, interdisciplinary effort that has seen many innovations in recent years. The examples addressed here represent different perspectives on this topic, reflecting strands of research from the software engineering, (database) systems, and security communities. They give only a glimpse into these research strands, as many additional interesting and effective works have been proposed. Various approaches generate pairs of related tests to find both logic bugs and performance issues in a DBMS, and isolation-level testing approaches have been proposed along similar lines.
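One widely used test-oracle idea, generating partitions of a query whose combined result must agree with the original, can be illustrated against SQLite in a few lines. This is a toy sketch in the spirit of SQLancer-style ternary logic partitioning; the table and predicate are invented for illustration.

```python
import sqlite3

# Ternary logic partitioning as a test oracle: in SQL's three-valued
# logic, every row satisfies exactly one of p, NOT p, or p IS NULL,
# so the three partitions together must equal the unpartitioned result.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (None,), (5,)])

pred = "x > 1"   # a randomly generated predicate in a real test generator
base = conn.execute("SELECT x FROM t").fetchall()
parts = []
for clause in (pred, f"NOT ({pred})", f"({pred}) IS NULL"):
    parts += conn.execute(f"SELECT x FROM t WHERE {clause}").fetchall()

# A multiset mismatch between the two result sets signals a logic bug.
assert sorted(base, key=repr) == sorted(parts, key=repr)
```

The appeal of such oracles is that they need no reference implementation: the DBMS is checked against itself, on queries no human ever wrote.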
OS Scheduling:
Better scheduling policies for modern computing systems
In any system that multiplexes resources, the problem of scheduling what computations run where and when is perhaps the most fundamental. Yet, like many other essential problems in computing (e.g., query optimization in databases), academic research in scheduling moves like a pendulum, with periods of intense activity followed by periods of dormancy when it is considered a "solved" problem. These three papers make significant contributions to an ongoing effort to develop better scheduling policies for modern computing systems.
The Fun in Fuzzing:
The testing technique comes into its own.
Stefan Nagy, an assistant professor in the Kahlert School of Computing at the University of Utah, takes us on a tour of recent research in software fuzzing, or the systematic testing of programs via the generation of novel or unexpected inputs. The first paper he discusses extends the state of the art in coverage-guided fuzzing with the semantic notion of "likely invariants," inferred via techniques from property-based testing. The second explores encoding domain-specific knowledge about certain bug classes into test-case generation.
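The "likely invariant" idea can be caricatured in a few lines: infer simple properties from seed executions, then treat inputs that violate them as interesting, even when they add no new branch coverage. Everything here, including the buggy `program`, is a hypothetical toy, not the surveyed paper's actual technique.

```python
def program(x):
    # Toy target with a hypothetical bug at x == 42.
    return abs(x) * 2 if x != 42 else -1

# Phase 1: infer a "likely invariant" from seed executions, in the
# style of property-based testing / invariant inference. Here the
# invariant is simply the observed output range.
seeds = range(-100, 101, 7)          # deterministic seed corpus
outputs = [program(s) for s in seeds]
lo, hi = min(outputs), max(outputs)

# Phase 2: while fuzzing, flag any input whose output violates the
# inferred invariant as interesting, even without new branch coverage.
# (Violations are leads to investigate, not necessarily bugs.)
interesting = [x for x in range(-200, 200)
               if not (lo <= program(x) <= hi)]
assert 42 in interesting             # the buggy input is surfaced
```

The payoff is a richer feedback signal: two inputs that cover the same branches can still be distinguished if one of them drives the program into a state no passing run has exhibited.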
Crash Consistency:
Keeping data safe in the presence of crashes is a fundamental problem.
Keeping data safe in the presence of crashes is a fundamental problem in storage systems. Although the high-level ideas for crash consistency are relatively well understood, realizing them in practice is surprisingly complex and full of challenges. The systems research community is actively working on this problem, and the papers examined here offer three solutions.
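One of the best-understood of those high-level ideas, writing to a temporary file and atomically renaming it into place, already shows how much care the ordering requires. This is a minimal POSIX-flavored sketch: `atomic_update` is an illustrative name, and the directory-fsync step assumes a Unix-like system.

```python
import os
import tempfile

def atomic_update(path, data):
    """Crash-consistent replacement of a file's contents: after a
    crash, readers see either the old or the new contents, never a
    torn mix of the two."""
    d = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=d)
    try:
        os.write(fd, data)
        os.fsync(fd)          # 1. new data durable before the switch
    finally:
        os.close(fd)
    os.rename(tmp, path)      # 2. atomic switch to the new contents
    dirfd = os.open(d, os.O_RDONLY)
    try:
        os.fsync(dirfd)       # 3. make the rename itself durable
    finally:
        os.close(dirfd)
```

Drop step 1 or step 3 and the update is no longer crash-safe, even though the code appears to work in every crash-free run; that gap between the happy path and the crash path is exactly where these bugs hide.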
Convergence:
Research for Practice reboot
It is with great pride and no small amount of excitement that I announce the reboot of acmqueue's Research for Practice column. For three years, beginning at its inception in 2016, Research for Practice brought both seminal and cutting-edge research, via careful curation by experts in academia, within easy reach for practitioners who are too busy building things to manage the deluge of scholarly publications. We believe the series succeeded in its stated goal of sharing "the joy and utility of reading computer science research" between academics and their counterparts in industry. We know our readers have missed it, and we are delighted to rekindle the flame after a three-year hiatus.
The DevOps Phenomenon:
An executive crash course
Stressful emergency releases are a thing of the past for companies that subscribe to the DevOps method of software development and delivery. New releases are frequent. Bugs are fixed rapidly. New business opportunities are sought with gusto and confidence. New features are released, revised, and improved with rapid iterations. DevOps presents a strategic advantage for organizations compared with traditional software-development methods, and leadership plays an important role during the transformation. At its core, DevOps provides guidelines for faster time to market of new software features and for achieving a higher level of stability. Implementing cross-functional, product-oriented teams helps bridge the gaps between software development and operations.
Troubling Trends in Machine Learning Scholarship:
Some ML papers suffer from flaws that could mislead the public and stymie future research.
Flawed scholarship threatens to mislead the public and stymie future research by compromising ML's intellectual foundations. Indeed, many of these problems have recurred cyclically throughout the history of AI and, more broadly, in scientific research. In 1976, Drew McDermott chastised the AI community for abandoning self-discipline, warning prophetically that "if we can't criticize ourselves, someone else will save us the trouble." The current strength of machine learning owes much to a large body of rigorous research to date, both theoretical and empirical. By promoting clear scientific thinking and communication, our community can sustain the trust and investment it currently enjoys.
Edge Computing:
Scaling resources within multiple administrative domains
Creating edge computing infrastructures and applications encompasses quite a breadth of systems research. Let’s take a look at the academic view of edge computing and a sample of existing research that will be relevant in the coming years.
Security for the Modern Age:
Securely running processes that require the entire syscall interface
Giving operators a usable means of securing the methods they use to deploy and run applications is a win for everyone. Keeping the usability-focused abstractions provided by containers, while finding new ways to automate security and defend against attacks, is a great path forward.
Knowledge Base Construction in the Machine-learning Era:
Three critical design points: Joint-learning, weak supervision, and new representations
More information is accessible today than at any other time in human history. From a software perspective, however, the vast majority of this data is unusable, as it is locked away in text, PDFs, web pages, images, and other unstructured, hard-to-parse formats. The goal of knowledge base construction is to extract structured information automatically from this "dark data," so that it can be used in downstream applications for search, question answering, link prediction, visualization, modeling, and much more.
FPGAs in Data Centers:
FPGAs are slowly leaving the niche space they have occupied for decades.
This installment of Research for Practice features a curated selection from Gustavo Alonso, who provides an overview of recent developments utilizing FPGAs (field-programmable gate arrays) in data centers. As Moore's Law has slowed and the computational overheads of data-center workloads such as model serving and data processing have continued to rise, FPGAs offer an increasingly attractive point in the trade-off between power and performance. Gustavo's selections highlight early successes and practical deployment considerations that inform the ongoing, high-stakes debate about the future of data-center- and cloud-based computation substrates.
Prediction-Serving Systems:
What happens when we wish to actually deploy a machine learning model to production?
This installment of Research for Practice features a curated selection from Dan Crankshaw and Joey Gonzalez, who provide an overview of machine learning serving systems. What happens when we wish to actually deploy a machine learning model to production, and how do we serve predictions with high accuracy and high computational efficiency? Dan and Joey offer a thoughtful selection of cutting-edge techniques spanning database-level integration, video processing, and prediction middleware.
Toward a Network of Connected Things:
A look into the future of IoT deployments and their usability
While the scale of data presents new avenues for improvement, the key challenges for the everyday adoption of IoT systems revolve around managing this data. First, we need to consider where the data is being processed and stored and what the privacy and systems implications of these policies are. Second, we need to develop systems that generate actionable insights from this diverse, hard-to-interpret data for nontechnical users. Solving these challenges will allow IoT systems to deliver maximum value to end users.
Cluster Scheduling for Data Centers:
Expert-curated Guides to the Best of CS Research: Distributed Cluster Scheduling
This installment of Research for Practice features a curated selection from Malte Schwarzkopf, who takes us on a tour of distributed cluster scheduling, from research to practice, and back again. With the rise of elastic compute resources, cluster management has become an increasingly hot topic in systems R&D, and a number of competing cluster managers, including Kubernetes, Mesos, and Docker Swarm, are currently jockeying for the crown in this space.
Private Online Communication; Highlights in Systems Verification:
The importance of private communication will continue to grow. We need techniques to build larger verified systems from verified components.
First, Albert Kwon provides an overview of recent systems for secure and private communication. Second, James Wilcox takes us on a tour of recent advances in verified systems design.
Vigorous Public Debates in Academic Computer Science:
Expert-curated Guides to the Best of CS Research
This installment of Research for Practice features a special curated selection from John Regehr, who takes us on a tour of great debates in academic computer science research. In case you thought flame wars were reserved for Usenet mailing lists and Twitter, think again: the academic literature is full of dramatic, spectacular, and vigorous debates spanning file systems, operating system kernel design, and formal verification.
Research for Practice: Technology for Underserved Communities; Personal Fabrication:
Expert-curated Guides to the Best of CS Research
This installment of Research for Practice provides curated reading guides to technology for underserved communities and to new developments in personal fabrication. First, Tawanna Dillahunt describes design considerations and technology for underserved and impoverished communities. Designing for the more than 1.6 billion impoverished individuals worldwide requires special consideration of community needs, constraints, and context. Tawanna’s selections span protocols for poor-quality communication networks, community-driven content generation, and resource and public service discovery. Second, Stefanie Mueller and Patrick Baudisch provide an overview of recent advances in personal fabrication (e.g., 3D printers). Their selection covers new techniques for fabricating (and emulating) complex materials (e.g., by manipulating the internal structure of an object), for more easily specifying object shape and behavior, and for human-in-the-loop rapid prototyping.
Research for Practice: Tracing and Debugging Distributed Systems; Programming by Examples:
Expert-curated Guides to the Best of CS Research
This installment of Research for Practice covers two exciting topics in distributed systems and programming methodology. First, Peter Alvaro takes us on a tour of recent techniques for debugging some of the largest and most complex systems in the world: modern distributed systems and service-oriented architectures. The techniques Peter surveys can shed light on order amid the chaos of distributed call graphs. Second, Sumit Gulwani illustrates how to program without explicitly writing programs, instead synthesizing programs from examples! The techniques Sumit presents let systems "learn" a program representation from illustrative examples, enabling nonprogrammers to create increasingly nontrivial functions such as spreadsheet macros.
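Synthesis from examples can be caricatured as a search over a small domain-specific language for a program consistent with every given input-output pair. The sketch below is a toy in the FlashFill spirit; the `OPS` table and `synthesize` function are invented for illustration, not Sumit's actual algorithms.

```python
# A tiny enumerative program synthesizer: search a fixed DSL of
# string transformations for one consistent with all examples.
OPS = {
    "upper": str.upper,
    "lower": str.lower,
    "strip": str.strip,
    "first_word": lambda s: s.split()[0],
}

def synthesize(examples):
    """Return the name of a one-step program consistent with every
    (input, output) example, or None if the DSL contains none."""
    for name, fn in OPS.items():
        if all(fn(i) == o for i, o in examples):
            return name
    return None

prog = synthesize([("Ada Lovelace", "Ada"), ("Alan Turing", "Alan")])
assert prog == "first_word"   # the examples uniquely pin down the program
```

Real systems search vastly larger DSLs with composition and ranking, but the contract is the same: the user supplies examples, and the system returns a program that generalizes them.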
Research for Practice: Cryptocurrencies, Blockchains, and Smart Contracts; Hardware for Deep Learning:
Expert-curated Guides to the Best of CS Research
First, Arvind Narayanan and Andrew Miller, co-authors of the increasingly popular open-access Princeton Bitcoin textbook, provide an overview of ongoing research in cryptocurrencies. Second, Song Han provides an overview of hardware trends related to another long-studied academic problem that has recently seen an explosion in popularity: deep learning.
Research for Practice: Distributed Transactions and Networks as Physical Sensors:
Expert-curated Guides to the Best of CS Research
First, Irene Zhang delivers a whirlwind tour of recent developments in distributed concurrency control. If you thought distributed transactions were prohibitively expensive, Irene's selections may prompt you to reconsider: the use of atomic clocks, clever replication protocols, and new means of commit ordering all improve performance at scale. Second, Fadel Adib provides a fascinating look at using computer networks as physical sensors. It turns out that radio waves are subtly modulated as they pass through our environment and our bodies, and those modulations can be sensed.
Research for Practice: Web Security and Mobile Web Computing:
Expert-curated Guides to the Best of CS Research
Our third installment of Research for Practice brings readings spanning programming languages, compilers, privacy, and the mobile web.
Research for Practice: Distributed Consensus and Implications of NVM on Database Management Systems:
Expert-curated Guides to the Best of CS Research
First, how do large-scale distributed systems mediate access to shared resources, coordinate updates to mutable state, and reliably make decisions in the presence of failures? Second, while consensus concerns distributed shared state, how do hardware trends such as NVM reshape single-node shared state?
Introducing Research for Practice:
Expert-curated guides to the best of CS research
Reading a great research paper is a joy. A team of experts deftly guides you, the reader, through the often complicated research landscape, noting the prior art, the current trends, the pressing issues at hand, and then, sometimes artfully, sometimes through seeming sheer force of will, expands the body of knowledge in one fell swoop of 12 or so pages of prose. A great paper contains a puzzle and a solution; these can be useful, enlightening, or both.