Functional at Scale
Applying functional programming principles to distributed computing projects
Modern server software is demanding to develop and operate: it must be available at all times and in all locations; it must reply within milliseconds to user requests; it must respond quickly to capacity demands; it must process a lot of data and even more traffic; it must adapt quickly to changing product needs; and in many cases it must accommodate a large engineering organization, its many engineers the proverbial cooks in a big, messy kitchen.
The Soft Side of Software
Just because you have been doing it the same way doesn't mean you are doing it the right way.
Wouldn't it be great if you were frequently put in a position where you had to grow outside of your comfort zone? Where you had to start fresh?
The Soft Side of Software
A discussion with Pete Hunt, Paul O'Shannessy, Dave Smith and Terry Coatta
Scaling Synchronization in Multicore Programs
Advanced synchronization methods can boost the performance of multicore software.
Designing software for modern multicore processors poses a dilemma. Traditional software designs, in which threads manipulate shared data, have limited scalability because synchronization of updates to shared data serializes threads and limits parallelism. Alternative distributed software designs, in which threads do not share mutable data, eliminate synchronization and offer better scalability. But distributed designs make it challenging to implement features that shared data structures naturally provide, such as dynamic load balancing and strong consistency guarantees, and are simply not a good fit for every program.
Often, however, the performance of shared mutable data structures is limited by the synchronization methods in use today, whether lock-based or lock-free. To help readers make informed design decisions, this article describes advanced (and practical) synchronization methods that can push the performance of designs using shared mutable data to levels that are acceptable to many applications.
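The dilemma the article describes can be sketched in miniature (this example is not from the article; names are illustrative): a counter behind a single mutex serializes every writer, while a sharded counter gives each worker private state and trades away read-time consistency for scalability.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedCounter: every increment takes one global lock, so all
// goroutines serialize on the same mutex and cache line.
type sharedCounter struct {
	mu sync.Mutex
	n  int64
}

func (c *sharedCounter) inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// shardedCounter: each worker owns one slot, so increments need no
// synchronization; a read sums the slots (a weaker consistency model).
type shardedCounter struct {
	shards []int64 // one slot per worker; cache-line padding omitted
}

func (c *shardedCounter) read() int64 {
	var total int64
	for _, v := range c.shards {
		total += v
	}
	return total
}

func main() {
	const workers, perWorker = 4, 100000
	shared := &sharedCounter{}
	sharded := &shardedCounter{shards: make([]int64, workers)}

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				shared.inc()         // contended path
				sharded.shards[id]++ // uncontended path
			}
		}(w)
	}
	wg.Wait()
	fmt.Println(shared.n == sharded.read()) // true: both count 400000
}
```

Both designs count correctly, but only the sharded one avoids serializing the hot path; the advanced methods the article surveys aim to narrow exactly this gap for shared mutable structures.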
10 Optimizations on Linear Search
Thomas A. Limoncelli
The operations side of the story
System administrators (whether your title is DevOps engineer, SRE, or something else) must deal with the operational aspects of computation, not just the theoretical aspects. Operations is where the rubber hits the road. As a result, operations people see things from a different perspective and can spot opportunities outside the basic big-O analysis. Let's look at the operational aspects of trying to improve something that is already theoretically optimal.
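The article's ten optimizations are its own; as one hedged illustration of the genre (not taken from the article), here is the classic move-to-front heuristic sketched in Go. Linear search stays O(n) in theory, but operationally, frequently requested items migrate toward the head and are found after fewer comparisons.

```go
package main

import "fmt"

// linearSearch returns the index of target in xs, or -1 if absent.
func linearSearch(xs []string, target string) int {
	for i, x := range xs {
		if x == target {
			return i
		}
	}
	return -1
}

// searchMTF is linear search with the move-to-front heuristic: a hit
// is shifted to the head of the slice, so items that are requested
// often are found after fewer comparisons on later lookups.
func searchMTF(xs []string, target string) int {
	i := linearSearch(xs, target)
	if i > 0 {
		hit := xs[i]
		copy(xs[1:i+1], xs[0:i]) // shift earlier items right by one
		xs[0] = hit
		return 0
	}
	return i
}

func main() {
	hosts := []string{"web1", "web2", "db1", "cache1"}
	searchMTF(hosts, "db1")
	fmt.Println(hosts) // [db1 web1 web2 cache1]
}
```

The worst case is unchanged, but a skewed request distribution — common in operations — makes the expected cost drop, which is exactly the kind of opportunity invisible to pure O() analysis.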
Escaping the Singularity:
The Singular Success of SQL
SQL has a brilliant future as a major figure in the pantheon of data representations.
SQL has been singularly successful in its impact on the database industry. Nothing has come remotely close to its ubiquity. Its success comes from its high-level use of relational algebra allowing set-oriented operations on data shaped as rows, columns, cells, and tables.
SQL's impact can be seen in two broad areas. First, the programmer can accomplish a lot very easily with set-oriented operations. Second, the high-level expression of the programmer's intent has empowered huge performance gains.
This column discusses how these features depend on SQL's creation of a notion of stillness through transactions, and on a tight group of tables whose schema is fixed at the moment of the transaction. These characteristics are what set SQL apart from the increasingly pervasive distributed systems.
SQL has a brilliant past and a brilliant future. That future is not as the singular and ubiquitous holder of data but rather as a major figure in the pantheon of data representations. What the heck happens when data is not kept in SQL?
The Soft Side of Software:
Bad Software Architecture is a People Problem
When people don't work well together they make bad decisions.
It all started with a bug. Customers were complaining that their information was out of date on the website. They would make an update and for some reason their changes weren't being reflected. Caching seemed like the obvious problem, but once we started diving into the details, we realized it was a much bigger issue.
Dynamics of Change: Why Reactivity Matters
Tame the dynamics of change by centralizing each concern in its own module.
Professional programming is about dealing with software at scale. Everything is trivial when the problem is small and contained: it can be elegantly solved with imperative programming or functional programming or any other paradigm. Real-world challenges arise when programmers have to deal with large amounts of data, network requests, or intertwined entities, as in UI (user interface) programming.
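As a minimal sketch of centralizing one concern's dynamics of change (this example and its `Cell` API are illustrative assumptions, not from the article), a reactive value pushes updates to subscribers instead of letting every caller poll scattered state:

```go
package main

import "fmt"

// Cell centralizes the dynamics of change for one value: every
// interested party subscribes here instead of polling, and every
// update flows out from this single module.
type Cell struct {
	value     int
	listeners []func(int)
}

// Subscribe registers a listener and immediately pushes the current
// value, so new subscribers start out consistent with the cell.
func (c *Cell) Subscribe(f func(int)) {
	c.listeners = append(c.listeners, f)
	f(c.value)
}

// Set updates the value and notifies all listeners; unchanged values
// are suppressed so downstream work happens only on real change.
func (c *Cell) Set(v int) {
	if v == c.value {
		return
	}
	c.value = v
	for _, f := range c.listeners {
		f(v)
	}
}

func main() {
	temp := &Cell{value: 20}
	temp.Subscribe(func(v int) { fmt.Println("display:", v) })
	temp.Set(23)
	temp.Set(23) // no change, no notification
}
```

In a UI with many intertwined entities, each such cell owns one concern; consumers compose them rather than re-deriving state in every handler.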
Research for Practice:
Distributed Consensus and Implications of NVM on Database Management Systems
Peter Bailis, Camille Fournier, Joy Arulraj, Andy Pavlo
Expert-curated Guides to the Best of CS Research
First, how do large-scale distributed systems mediate access to shared resources, coordinate updates to mutable state, and reliably make decisions in the presence of failures? Camille Fournier, a seasoned and experienced distributed-systems builder (and ZooKeeper PMC), has curated a fantastic selection on distributed consensus in practice. The history of consensus echoes many of the goals of RfP: for decades the study and use of consensus protocols were considered notoriously difficult to understand and remained primarily academic concerns. As a result, these protocols were largely ignored by industry. The rise of Internet-scale services and demands for automated solutions to cluster management, failover, and sharding in the 2000s finally led to the widespread practical adoption of these techniques. Adoption proved difficult, however, and the process in turn led to new (and ongoing) research on the subject. The papers in this selection highlight the challenges and the rewards of making the theory of consensus practical, both in theory and in practice.
Second, while consensus concerns distributed shared state, our second selection concerns the impact of hardware trends on single-node shared state. Joy Arulraj and Andy Pavlo provide a whirlwind tour of the implications of NVM (nonvolatile memory) technologies for modern storage systems. NVM promises to overhaul the traditional assumption that stable storage (i.e., storage that persists despite failures) must be block-addressable (i.e., read and written in large chunks). In addition, NVM's performance characteristics lead to entirely different design trade-offs than conventional storage media such as spinning disks.
Cluster-level Logging of Containers with Containers
Logging Challenges of Container-Based Cloud Deployments
Collecting and analyzing log information is an essential aspect of running production systems to ensure their reliability and to provide important auditing information. Many tools, such as Fluentd and Logstash, have been developed to help with the aggregation and collection of logs from specific software components (e.g., an Apache web server) running on specific servers. They are accompanied by tools such as Elasticsearch for ingesting log information into persistent storage and tools such as Kibana for querying it.
This article shows how cluster-level logging infrastructure can be implemented using open source tools and deployed using the very same abstractions that are used to compose and manage the software systems being logged.
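The "same abstractions" idea can be illustrated with a hedged sketch (assuming Kubernetes as the cluster manager; the image tag, namespace, and labels are illustrative, not from the article): a DaemonSet declares that a log collector container run on every node, deployed exactly like the workloads it observes.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd:v1.16   # illustrative tag
        volumeMounts:
        - name: varlog
          mountPath: /var/log         # read container logs on the node
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
```

Because the collector is itself a managed container, it inherits the cluster's scheduling, upgrade, and monitoring machinery for free.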
Kode Vicious: Chilling the Messenger
Keeping ego out of software-design review
Trying to correct someone who has just done a lot of work, even if, ultimately, that work is not the right work, is a daunting task. The person in question no doubt believes that he has worked very hard to produce something of value to the rest of the team, and walking in and spitting on it, literally or metaphorically, probably crosses your "offense" line—at least I think it does. I am a bit surprised, though, that so much code was written in the very first sprint. Shouldn't the software have shown up only after early sprints established what was needed, who the stakeholders were, and so on? Or was this a piece of previously existing code being brought in to solve a new problem? It probably doesn't matter, because the crux of your letter is that you and your team do not understand the software well enough to be comfortable fielding it.
The Hidden Dividends of Microservices
Microservices aren't for every company, and the journey isn't easy.
Microservices are an approach to building distributed systems in which services are exposed only through hardened APIs; the services themselves have a high degree of internal cohesion around a specific and well-bounded context or area of responsibility, and the coupling between them is loose. Such services are typically simple, yet they can be composed into very rich and elaborate applications. The effort required to adopt a microservices-based approach is considerable, particularly in cases that involve migration from more monolithic architectures. The explicit benefits of microservices are well known and numerous, however, and can include increased agility, resilience, scalability, and developer productivity. This article identifies some of the hidden dividends of microservices that implementers should make a conscious effort to reap.