Volume 20, Issue 6
Kode Vicious:
All Sliders to the Right
Hardware overkill
There are many reasons why this year's model isn't any better than last year's, and many reasons why performance fails to scale, some of which KV has covered in these pages. It is true that the days of upgrading every year and getting a free performance boost are long gone, as we're not really getting single cores faster than about 4 GHz. One thing that many software developers fail to understand at a sufficiently deep level is the hardware on which their software runs.
Hardware,
Kode Vicious,
Performance
Three-part Harmony for Program Managers
Who Just Don't Get It, Yet
Guenever Aldrich, Danny Tsang, Jason McKenney
Open-source software, open standards, and agile software development
This article examines three tools in the system acquisitions toolbox that can work to expedite development and procurement while mitigating programmatic risk: OSS, open standards, and Agile/Scrum software development processes. All three are powerful additions to the DoD acquisition program management toolbox.
Development,
Systems Administration
Research for Practice:
The Fun in Fuzzing
Stefan Nagy, with Introduction by Peter Alvaro
The debugging technique comes into its own.
Stefan Nagy, an assistant professor in the Kahlert School of Computing at the University of Utah, takes us on a tour of recent research in software fuzzing, or the systematic testing of programs via the generation of novel or unexpected inputs. The first paper he discusses extends the state of the art in coverage-guided fuzzing with the semantic notion of "likely invariants," inferred via techniques from property-based testing. The second explores encoding domain-specific knowledge about certain bug classes into test-case generation. His last selection takes us through the looking glass, randomly generating entire C programs and using differential analysis to compare traces of optimized and unoptimized executions, in order to find bugs in the compilers themselves.
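For readers who want a feel for the differential approach in that last selection, here is a minimal sketch (not code from the paper) of hunting compiler bugs by comparing the behavior of the same randomly generated C program at -O0 and -O2. The generate_random_c_program generator is a hypothetical stand-in for a Csmith-style tool.

```python
import os
import subprocess
import tempfile

def compile_and_run(source_path, opt_flag):
    """Compile the generated C program at one optimization level and return its stdout."""
    exe = source_path + opt_flag  # e.g., prog.c-O0, prog.c-O2
    try:
        subprocess.run(["gcc", opt_flag, source_path, "-o", exe], check=True)
        return subprocess.run([exe], capture_output=True, text=True, timeout=5).stdout
    finally:
        if os.path.exists(exe):
            os.remove(exe)

def differential_test(generate_random_c_program, trials=100):
    """Flag any program whose observable behavior differs between -O0 and -O2."""
    for i in range(trials):
        with tempfile.NamedTemporaryFile("w", suffix=".c", delete=False) as f:
            f.write(generate_random_c_program())  # hypothetical Csmith-style generator
            path = f.name
        try:
            if compile_and_run(path, "-O0") != compile_and_run(path, "-O2"):
                print(f"possible miscompilation: trial {i}, source kept at {path}")
                continue
        except subprocess.SubprocessError:
            pass  # skip programs that fail to build or do not terminate quickly
        os.remove(path)
```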
Research for Practice,
Testing
To PiM or Not to PiM
Gabriel Falcão and João Dinis Ferreira
The case for in-memory inferencing of quantized CNNs at the edge
As artificial intelligence becomes a pervasive tool for the billions of IoT (Internet of things) devices at the edge, the data movement bottleneck imposes severe limitations on the performance and autonomy of these systems.
PiM (processing-in-memory) is emerging as a way of mitigating the data movement bottleneck while satisfying the stringent performance, energy efficiency, and accuracy requirements of edge imaging applications that rely on CNNs (convolutional neural networks).
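As a rough illustration of the workload in question (not the authors' design), the sketch below reduces an int8-quantized convolution to integer multiply-accumulates, which is the kind of operation a PiM substrate would execute inside or next to the memory arrays rather than shipping the operands across the memory bus.

```python
import numpy as np

def quantize(x, scale):
    """Map float activations or weights to int8, the representation edge inference favors."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def conv2d_int8(activations, weights, a_scale, w_scale):
    """'Valid' 2-D convolution done entirely with integer multiply-accumulates.
    In a PiM design these MACs run near the data, so the int8 operands stay in memory."""
    a = quantize(activations, a_scale)
    w = quantize(weights, w_scale)
    kh, kw = w.shape
    out = np.zeros((a.shape[0] - kh + 1, a.shape[1] - kw + 1), dtype=np.int32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(a[i:i+kh, j:j+kw].astype(np.int32) * w.astype(np.int32))
    return out * (a_scale * w_scale)  # dequantize the int32 accumulator back to float
```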
AI,
Data,
Networks,
Performance
Taking Flight with Copilot
Christian Bird, Denae Ford, Thomas Zimmermann, Nicole Forsgren, Eirini Kalliamvakou, Travis Lowdermilk, Idan Gazit
Early insights and opportunities of AI-powered pair-programming tools
Over the next five years, AI-powered tools likely will be helping developers in many diverse tasks. For example, such models may be used to improve code review, directing reviewers to parts of a change where review is most needed or even directly providing feedback on changes. Models such as Codex may suggest fixes for defects in code, build failures, or failing tests. These models are able to write tests automatically, helping to improve code quality and downstream reliability of distributed systems. This study of Copilot shows that developers spend more time reviewing code than actually writing code. As AI-powered tools are integrated into more software development tasks, developer roles will shift so that more time is spent assessing suggestions related to the task than doing the task itself.
AI,
Development
Volume 20, Issue 5
Reinventing Backend Subsetting at Google
Peter Ward and Paul Wankadia with Kavita Guliani
Designing an algorithm with reduced connection churn that could replace deterministic subsetting
Backend subsetting is useful for reducing costs and may even be necessary for operating within system limits. For more than a decade, Google used deterministic subsetting as its default backend subsetting algorithm; although this algorithm balances the number of connections per backend task, it has a high level of connection churn. Our goal at Google was to design an algorithm with reduced connection churn that could replace deterministic subsetting as the default backend subsetting algorithm.
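For context, here is a minimal sketch of classic deterministic subsetting, in the spirit of the published description rather than Google's production code (helper names are illustrative): clients are grouped into rounds, each round shuffles the backend list with a seed derived from the round number, and the shuffled list is carved into fixed-size, disjoint subsets. The churn problem is visible in the code: when the backend list changes, each round's shuffle changes wholesale, so many clients drop and re-establish connections at once.

```python
import random

def deterministic_subset(client_id, backends, subset_size):
    """Classic deterministic subsetting (sketch): every client in the same 'round'
    sees the same shuffled backend order, then picks a disjoint slice of it."""
    subsets_per_round = len(backends) // subset_size  # leftover backends sit out this round
    round_id = client_id // subsets_per_round
    subset_id = client_id % subsets_per_round

    shuffled = list(backends)
    random.Random(round_id).shuffle(shuffled)  # same seed, same order, for the whole round
    start = subset_id * subset_size
    return shuffled[start:start + subset_size]

# Example: 12 backend tasks, subsets of 3; client 42 gets a stable slice of round 10's shuffle.
backends = [f"task-{i}" for i in range(12)]
print(deterministic_subset(client_id=42, backends=backends, subset_size=3))
```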
Performance,
Testing
Kode Vicious:
The Elephant in the Room
It's time to get the POSIX elephant off our necks.
By writing code for the elephant that is POSIX, we lose the chance to take advantage of modern hardware.
Development,
Kode Vicious
OCCAM-v2: Combining Static and Dynamic Analysis for Effective and Efficient Whole-program Specialization
Jorge A. Navas and Ashish Gehani
Leveraging scalable pointer analysis, value analysis, and dynamic analysis
OCCAM-v2 leverages scalable pointer analysis, value analysis, and dynamic analysis to create an effective and efficient tool for specializing LLVM bitcode. The extent of the code-size reduction achieved depends on the specific deployment configuration. Each application that is to be specialized is accompanied by a manifest that specifies concrete arguments that are known a priori, as well as a count of residual arguments that will be provided at runtime. The best case for partial evaluation occurs when the arguments are completely concretely specified. OCCAM-v2 uses a pointer analysis to devirtualize calls, allowing it to eliminate the entire body of functions that are not reachable by any direct calls. The hybrid analysis feature can handle cases that are challenging for static analysis, such as input loops, string processing, and external data (in files, for example). On the suite of evaluated programs, OCCAM-v2 was able to reduce the instruction count by 40.6 percent on average, taking a median of 2.4 seconds.
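The LLVM-level machinery resists summary, but the core idea of specializing against a priori-known arguments can be conveyed with a toy sketch (hypothetical names, not OCCAM's API or manifest format): once some arguments are concrete, the branches and work that depend only on them fold away, leaving a smaller residual program for the runtime arguments.

```python
def grep(pattern, ignore_case, line):
    """Toy 'application': a general-purpose matcher with runtime options."""
    needle = pattern.lower() if ignore_case else pattern
    haystack = line.lower() if ignore_case else line
    return needle in haystack

def specialize_grep(pattern, ignore_case):
    """Partial evaluation by hand: with pattern and ignore_case known a priori
    (as a manifest would declare), the option branches and the per-call lowering
    of the pattern disappear from the residual function."""
    needle = pattern.lower() if ignore_case else pattern
    if ignore_case:
        return lambda line: needle in line.lower()
    return lambda line: needle in line

match = specialize_grep("error", ignore_case=True)  # residual program: only 'line' remains
print(match("An ERROR occurred"))                   # True
```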
Development,
Quality Assurance,
Testing,
Tools
Case Study
OSS Supply-chain Security:
What Will It Take?
A discussion with Maya Kaczorowski, Falcon Momot, George Neville-Neil, and Chris McCubbin
While enterprise security teams naturally tend to turn their focus primarily to direct attacks on their own infrastructure, cybercrime exploits now are increasingly aimed at easier targets upstream. This has led to a perfect storm, since virtually all significant codebase repositories at this point include at least some amount of open-source software, and opportunities abound there for the authors of malware. The broader cybercrime world, meanwhile, has noted that open-source supply chains are generally easy to penetrate. What's being done at this point to address the apparent risks?
Case studies,
Open Source,
Security
Drill Bits
Literate Executables
Terence Kelly
Literate executables redefine the relationship between compiled binaries and source code to be that of chicken and egg, so it's easy to derive either from the other. This episode of Drill Bits provides a general-purpose literacy tool and showcases the advantages of literacy by retrofitting it onto everyone's favorite command-line utility.
Drill Bits,
Code,
Data,
Development
Operations and Life:
Split Your Overwhelmed Teams
Thomas A. Limoncelli
Two teams of five is not the same as one team of ten.
This team's low morale and high stress were a result of the members feeling overwhelmed by too many responsibilities. The 10-by-10 communication structure made it difficult to achieve consensus, there were too many meetings, and everyone was suffering from the high cognitive load. By splitting into two teams, each can be more nimble, which the manager likes, and have a lower cognitive load, which the team likes. There is more opportunity for repetition, which lets people develop skills and demonstrate them. Altogether, this helps reduce stress and improve morale.
Business and Management,
Operations and Life
Volume 20, Issue 4
The Rise of Fully Homomorphic Encryption
Mache Creeger
Often called the Holy Grail of cryptography, commercial FHE is near.
Once commercial FHE is achieved, data access will become completely separated from unrestricted data processing, and provably secure storage and computation on untrusted platforms will become both relatively inexpensive and widely accessible. In ways similar to the impact of the database, cloud computing, PKE, and AI, FHE will bring about a sea change in how confidential information is protected, processed, and shared, and will fundamentally alter the course of computing.
Data,
Privacy and Rights,
Security
Mapping the Privacy Landscape for Central Bank Digital Currencies
Raphael Auer, Rainer Böhme, Jeremy Clark, Didem Demirag
Now is the time to shape what future payment flows will reveal about you.
As central banks all over the world move to digitize cash, the issue of privacy needs to move to the forefront. The path taken may depend on the needs of each stakeholder group: privacy-conscious users, data holders, and law enforcement.
Data,
Privacy and Rights,
Security
From Zero to One Hundred
Matthew Bush, Atefeh Mashatan
Demystifying zero trust and its implications on enterprise people, process, and technology
Changing network landscapes and rising security threats have imparted a sense of urgency for new approaches to security. Zero trust has been proposed as a solution to these problems, but some regard it as a marketing tool to sell existing best practice, while others praise it as a new cybersecurity standard. This article discusses the history and development of zero trust and why the changing threat landscape has led to a new discourse in cybersecurity. Drivers, barriers, and business implications of zero trust provide a backdrop for a brief overview of the key logical components of a zero trust architecture and the challenges of implementing it.
Security
Case Study
The Arrival of Zero Trust:
What Does It Mean?
A discussion with Michael Loftus, Andrew Vezina, Rick Doten, and Atefeh Mashatan
Enterprise cybersecurity used to rely on securing the corporate network, and then: "Trust, but verify."
But now, with cloud computing and most employees working from home at least some of the time, there is no longer any such thing as a single perimeter.
And, with corporate security breaches having become a regular news item over the past two decades, trust has essentially evaporated as well.
John Kindervag, who articulated the zero trust enterprise defense strategy a little over a decade ago, explained: "The concept is framed around the principle that no network, user, packet, interface, or device should be trusted. Some people think zero trust is about making a system trusted, but it really involves eliminating the concept of trust from cybersecurity strategy."
Case studies,
Security
Research for Practice:
Crash Consistency
Peter Alvaro, Ram Alagappan
Keeping data safe in the presence of crashes is a fundamental problem.
For our second article in the Research for Practice reboot, we asked Ram Alagappan, an assistant professor at the University of Illinois Urbana-Champaign, to survey recent research on crash consistency—the guarantee that application data will survive system crashes. Unlike memory consistency, crash consistency is an end-to-end concern, requiring not only that the lower levels of the system (e.g., the file system) are implemented correctly, but also that their interfaces are used correctly by applications.
Alagappan has chosen a collection of papers that reflects this complexity, traversing the stack from applications all the way to hardware.
The first paper focuses on the file system and uses bug-finding techniques to witness violations of interface-level guarantees.
The second moves up the stack, rethinking the interfaces that file systems provide to application programmers to make it easier to write crash-consistent programs.
In the last, the plot thickens with the new challenges that persistent memory brings to crash consistency.
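As a concrete reminder of what "using the interfaces correctly" demands of applications, here is the well-known write/fsync/rename idiom for crash-consistent file updates. This is a generic sketch of the protocol that interface-level bug-finding tools check applications against, not code from the surveyed papers.

```python
import os

def atomic_write(path, data: bytes):
    """Crash-consistent file update: write a temporary file, flush it to stable storage,
    rename it over the target, then flush the containing directory so the rename
    itself survives a crash."""
    tmp = path + ".tmp"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
        os.fsync(fd)             # persist the new contents before the rename
    finally:
        os.close(fd)
    os.rename(tmp, path)         # atomic replacement of the old version on POSIX systems
    dfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dfd)            # persist the directory entry recording the rename
    finally:
        os.close(dfd)
```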
Data,
Failure and Recovery,
Research for Practice
Kode Vicious:
The Four Horsemen of an Ailing Software Project
Don't let the pale rider catch you with an exception.
KV has talked about various measures of software quality in past columns, but falling software quality is perhaps one of the most objective signs that a team is failing. This Pestilence, brought about by the low morale engendered in the team by War and Famine, is a clear sign that something is wrong. In the real world, a diseased animal can be culled so that disease does not spread and become a pestilence over the land. Increasing bug counts, especially in the absence of increased functionality, are a sure sign of a coming project apocalypse.
Development,
Kode Vicious
The Bikeshed:
CSRB's Opus One
Poul-Henning Kamp
Comments on the Cyber Safety Review Board Log4j Event Report
We in FOSS need to become much better at documenting design decisions in a way and a place where the right people will find it, read it, and understand it, before they do something ill-advised or downright stupid with our code.
The Bikeshed,
Open Source,
Privacy and Rights
Volume 20, Issue 3
Research for Practice:
Convergence
Peter Alvaro, Martin Kleppmann
Research for Practice reboot
It is with great pride and no small amount of excitement that I announce the reboot of acmqueue's Research for Practice column. For three years, beginning at its inception in 2016, Research for Practice brought both seminal and cutting-edge research—via careful curation by experts in academia—within easy reach for practitioners who are too busy building things to manage the deluge of scholarly publications. We believe the series succeeded in its stated goal of sharing "the joy and utility of reading computer science research" between academics and their counterparts in industry. We know our readers have missed it, and we are delighted to rekindle the flame after a three-year hiatus.
For this first installment, we invited Dr. Martin Kleppmann, research fellow and affiliated lecturer at the University of Cambridge, to curate a selection of recent research papers in a perennially interesting domain: convergent or "eventually consistent" replicated systems. His expert analysis circles the topic, viewing it through the lens of recent work in four distinct research domains: systems, programming languages, human-computer interaction, and data management. Along the way, readers will be exposed to a variety of data structures, algorithms, proof techniques, and programming models (each described in terms of a distinct formalism), all of which attempt to make programming large-scale distributed systems easier. I hope you enjoy his column as much as I did.
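For readers new to the area, the flavor of a convergent replicated data type can be conveyed by the textbook grow-only counter below (a generic sketch, not one of the constructions from the curated papers): each replica increments only its own slot, and merging takes the elementwise maximum, so every replica reaches the same value no matter how state exchanges are ordered.

```python
class GCounter:
    """Minimal state-based grow-only counter CRDT."""
    def __init__(self, replica_id, n_replicas):
        self.replica_id = replica_id
        self.counts = [0] * n_replicas

    def increment(self):
        self.counts[self.replica_id] += 1   # a replica only ever touches its own slot

    def merge(self, other):
        # Elementwise max is commutative, associative, and idempotent, so merges converge.
        self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]

    def value(self):
        return sum(self.counts)

# Two replicas update independently, then exchange state and agree on the total.
a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(); a.increment(); b.increment()
a.merge(b); b.merge(a)
assert a.value() == b.value() == 3
```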
Distributed Computing,
Research for Practice
Privacy of Personal Information
Sutapa Mondal, Mangesh S. Gharote, and Sachin P. Lodha
Going incog in a goldfish bowl
Each online interaction with an external service creates data about the user that is digitally recorded and stored. These interactions may involve credit card transactions, medical consultations, census data collection, voter registration, and so on. Although the data is ostensibly collected to provide citizens with better services, the privacy of the individual is inevitably put at risk. With the growing reach of the Internet and the volume of data being generated, data protection and, specifically, preserving the privacy of individuals have become particularly important. In this article, we discuss data privacy concepts using two fictitious characters, Swara and Betaal, and their interactions with a fictitious entity, Asha Hospital.
Privacy and Rights
Kode Vicious:
Securing the Company Jewels
GitHub and runbook security
Often the problem with a runbook isn't the runbook itself; it's the runner of the runbook that matters. A runbook, or a checklist, is supposed to be an aid to memory and not a replacement for careful and independent thought. But our industry being what it is, we now see people take these things to their illogical extremes, and I think this is the problem you are running into with your local runbook runner.
Kode Vicious,
Security
Escaping the Singularity:
I'm Probably Less Deterministic Than I Used to Be
Pat Helland
Embracing randomness is necessary in cloud environments.
In my youth, I thought the universe was ruled by cause and effect like a big clock. In this light, computing made sense. Now I see that both life and computing can be a crapshoot, and that has given me a new peace.
Distributed Computing,
Escaping the Singularity
The Challenges of IoT, TLS, and Random Number Generators in the Real World
James P. Hughes, Whitfield Diffie
Bad random numbers are still with us and are proliferating in modern systems.
Many in the cryptographic community scoff at the mistakes made in implementing RNGs. Many cryptographers and members of the IETF resist the call to make TLS more resilient to this class of failures. This article discusses the history, current state, and fragility of the TLS protocol, and it closes with an example of how to improve the protocol. The goal is not to prescribe a solution but to start a dialog about making TLS more resilient by proving that TLS can be secure without the assumption of perfect random numbers.
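The article's concrete protocol changes are its own, but the general hedging idea, deriving values from a mix of the platform RNG, a long-lived secret, and per-use context so that a weak RNG alone is not fatal, can be sketched as follows. This is an illustrative example with hypothetical names, not the authors' proposal.

```python
import hashlib
import hmac
import os
import time

def hedged_value(device_secret: bytes, context: bytes, n_bytes: int = 32) -> bytes:
    """Mix the platform RNG with a provisioned secret and per-use context through HMAC,
    so a repeating or predictable RNG output on its own does not make the result guessable."""
    rng = os.urandom(32)                     # possibly weak on a constrained IoT device
    counter = str(time.time_ns()).encode()   # any monotonic per-use input helps
    return hmac.new(device_secret, rng + context + counter, hashlib.sha256).digest()[:n_bytes]

nonce = hedged_value(device_secret=b"provisioned-device-secret", context=b"tls-client-random")
```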
Business,
Privacy and Rights